CN102682454B - Method and device for tracking region of interest in video - Google Patents

Method and device for tracking region of interest in video

Info

Publication number
CN102682454B
Authority
CN
China
Prior art keywords
roi
particle
parameter
pixel
macro block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201210132913
Other languages
Chinese (zh)
Other versions
CN102682454A (en
Inventor
刘震
张冬
李厚强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN 201210132913
Publication of CN102682454A
Application granted
Publication of CN102682454B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a method and a device for tracking a region of interest (ROI) in video. The method comprises: obtaining the motion vectors of pixels or macroblocks in the current frame; determining a movement-speed distribution parameter of the ROI from these motion vectors; determining an ROI scaling parameter from the state of the ROI in the reference frame; using the movement-speed distribution parameter and the scaling parameter to apply a state transition to the particles sampled in the current frame; and determining the ROI position and size in the current frame from the particles after the state transition. The method and device use motion vector information that is already present in the compressed bitstream, or is generated during encoding, to guide the particle state transition. They can therefore reduce the number of particles needed for tracking while maintaining tracking quality, which lowers the complexity of the tracking process, and they achieve good tracking results.

Description

Method and device for tracking a region of interest in video
Technical field
The present invention relates to the technical field of video processing, and in particular to a method and a device for tracking a region of interest during video processing.
Background
With the spread and development of communication technology, video services such as mobile TV, video conferencing and video surveillance have developed rapidly. Users access these services through many kinds of terminals and access methods. Given the diversity of user terminals and the complexity of network environments, producing and transmitting video content efficiently has become a major challenge in the design of video service systems.
At present, SVC (scalable video coding) technology can, to a certain extent, transmit video content efficiently. SVC encodes sub-streams with multiple bit rates, resolutions and frame rates into a single bitstream, so that a transmission node can generate a suitably adapted bitstream with a simple extraction operation, according to network conditions and the needs of the user or user equipment. Compared with single-stream technology, SVC provides a bitstream that is scalable in space, time and quality, that is, a number of sub-streams can be extracted from it. These sub-streams can satisfy the network transmission rate and the end user's requirements on the video in terms of space, time and signal-to-noise ratio, so SVC allows a video stream to adapt better to different network environments and user terminals.
In SVC, the coding layer that provides the minimum quality is called the BL (base layer), and a coding layer that enhances spatial resolution, temporal resolution or signal-to-noise ratio is called an EL (enhancement layer). Spatial scalability uses layered coding and exploits inter-layer motion, texture and residual information; temporal scalability uses hierarchical bidirectional prediction (Hierarchical B) coding; for signal-to-noise-ratio scalability, the CGS (coarse-grain quality scalability) and MGS (medium-grain quality scalability) methods can be used.
SVC also supports ROI (region of interest) coding. An ROI is usually a region of a video frame that contains an object with clear high-level semantics for the viewer, such as a person or an object. While a user is watching a video, if the display of the device is small or the available bandwidth drops, the sharpness of the region of interest can be preserved so that the viewing experience is affected as little as possible. For example, when the access bandwidth is insufficient, the non-ROI regions can be dropped to meet the bandwidth constraint with limited impact on subjective video quality. That is, when the bandwidth is not enough to transmit the encoded streams of both the base layer and the enhancement layer, the encoded streams of the base layer and the ROI can be transmitted instead, which makes full use of the bandwidth, preserves video quality to a certain extent and protects the user's subjective experience.
To apply ROI coding in practice, the position and size of the ROI must be determined in every video frame. Video tracking techniques are usually used for this purpose.
The way in which currently used video tracking techniques determine the ROI in each video frame is described below.
One ROI tracking approach in current use is based on the particle filter algorithm. The tracked region is represented as a particle (a rectangle, an ellipse, etc.), and a particle can carry several state components, such as movement speed, direction and region size. During tracking, a certain number of particles are generated in the current frame by importance sampling. According to the correlation between these particles and the particle of the region to be tracked in the reference frame, the stable state of the particle in the current frame is obtained by weighting, which yields the region to be tracked in the current frame.
Accordingly, the region to be tracked in the current frame is obtained with the particle filter algorithm as follows:
(1) Initialization: set $k = 0$ (the initial time) and draw $N$ sample points $x_0^{(i)}$, $i = 1, \dots, N$, from $p(x_0)$, where $p(x_0)$ is the initial probability distribution of the target. In particular, it can be set to a Gaussian distribution whose mean is the position and size of the target in the first frame.
(2) Importance sampling: draw $\tilde{x}_k^{(i)} \sim q(x_k \mid x_{0:k-1}^{(i)}, z_{1:k})$ and let $\tilde{x}_{0:k}^{(i)} = (x_{0:k-1}^{(i)}, \tilde{x}_k^{(i)})$, $i = 1, \dots, N$. Here $x_k^{(i)}$ is the state of the $i$-th particle at time $k$; $x_{0:k}^{(i)}$ is the sequence of states of the particle from time 0 (the initial time) to time $k$; $z_{1:k}$ is the sequence of observations of the target (generally the color histogram of the tracked target) from time 1 (the initial tracking time) to time $k$; and $q(x_k \mid x_{0:k-1}^{(i)}, z_{1:k})$ is the importance function, that is, the estimated distribution of the particle in frame $k$ given the observations of frames 1 to $k$ and the states of the $i$-th particle up to frame $k-1$.
(3) Weight calculation: $\omega_k^{(i)} = \omega_{k-1}^{(i)} \dfrac{p(z_k \mid x_k^{(i)})\, p(x_k^{(i)} \mid x_{k-1}^{(i)})}{q(x_k^{(i)} \mid x_{0:k-1}^{(i)}, z_{1:k})}$. If the one-step transition posterior distribution is used as the importance function, this simplifies to $\omega_k^{(i)} = \omega_{k-1}^{(i)}\, p(z_k \mid x_k^{(i)})$. Here $p(z_k \mid x_k^{(i)})$ is the observation model, that is, the probability that particle $x_k^{(i)}$ is the tracked target, and $p(x_k^{(i)} \mid x_{k-1}^{(i)})$ is the state transition model, that is, the probability distribution model of the target's motion from frame $k-1$ to frame $k$.
(4) Weight normalization: $\tilde{\omega}_k^{(i)} = \omega_k^{(i)} \big/ \sum_{j=1}^{N} \omega_k^{(j)}$.
(5) Resampling: according to the size of its normalized weight $\tilde{\omega}_k^{(i)}$, copy or discard each sample $\tilde{x}_{0:k}^{(i)}$ to obtain $N$ samples $x_{0:k}^{(i)}$ that approximately follow the distribution $p(x_{0:k} \mid z_{1:k})$, and set $\omega_k^{(i)} = \tilde{\omega}_k^{(i)} = 1/N$, $i = 1, \dots, N$.
(6) Output: the output of the algorithm is the particle set $\{x_{0:k}^{(i)} : i = 1, \dots, N\}$, which can be used to approximate the posterior probability and the expectation of $x_{0:k}$:
Posterior probability: $\hat{p}(x_{0:k} \mid z_{1:k}) = \dfrac{1}{N} \sum_{i=1}^{N} \delta_{x_{0:k}^{(i)}}(dx_{0:k})$;
Expectation of $x_{0:k}$: $E(x_{0:k}) = \dfrac{1}{N} \sum_{i=1}^{N} x_{0:k}^{(i)}$.
(7) Set $k = k + 1$ and repeat steps (2) to (6).
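As an illustration of steps (1) to (7), the following minimal Python sketch runs the loop on a one-dimensional state. It assumes that the one-step transition model serves as the importance function, so the weight update takes the simplified form. The function names (`sir_step`, `track`), the Gaussian transition and likelihood, and the multinomial resampling are illustrative assumptions, not details taken from the patent.

```python
import math
import random


def sir_step(particles, weights, transition, likelihood, rng):
    """One pass of steps (2)-(6) with the transition model as importance function."""
    n = len(particles)
    # (2) importance sampling: propagate each particle through the transition model
    proposed = [transition(x, rng) for x in particles]
    # (3) simplified weight update: w_k = w_{k-1} * p(z_k | x_k)
    raw = [w * likelihood(x) for w, x in zip(weights, proposed)]
    # (4) normalisation (fall back to uniform weights if all likelihoods vanish)
    total = sum(raw)
    norm = [w / total for w in raw] if total > 0.0 else [1.0 / n] * n
    # (5) multinomial resampling: copy or discard by weight, then reset to 1/N
    resampled = rng.choices(proposed, weights=norm, k=n)
    new_weights = [1.0 / n] * n
    # (6) output: sample mean approximates the expectation of the state
    estimate = sum(resampled) / n
    return resampled, new_weights, estimate


def track(observations, n=500, seed=0):
    """Track a scalar position; the first observation seeds the initial distribution."""
    rng = random.Random(seed)
    # (1) initialisation: Gaussian around the initial target position (assumed sigma = 1)
    particles = [rng.gauss(observations[0], 1.0) for _ in range(n)]
    weights = [1.0 / n] * n
    estimates = []
    for z in observations[1:]:  # (7) advance k and repeat
        particles, weights, est = sir_step(
            particles,
            weights,
            transition=lambda x, r: x + r.gauss(0.0, 1.0),
            likelihood=lambda x, z=z: math.exp(-0.5 * (x - z) ** 2),
            rng=rng,
        )
        estimates.append(est)
    return estimates
```

Because the transition here is a plain random walk, many particles are needed to keep up with a moving target. This is the cost that the invention aims to reduce.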
In the above implementation, a stable tracking result requires a large number of particles, and the more particles are used, the more computation the tracking requires, so the processing complexity increases greatly.
Summary of the invention
The purpose of the present invention is to provide a method and a device for tracking a region of interest in video that reduce the processing complexity of tracking while maintaining tracking quality.
The purpose of the present invention is achieved through the following technical solutions:
A method for tracking a region of interest in video, comprising:
obtaining the motion vectors of pixels or macroblocks in the current frame, determining a movement-speed distribution parameter of the region of interest (ROI) from these motion vectors, and determining an ROI scaling parameter according to the state information of the ROI in the reference frame;
wherein the step of determining the movement-speed distribution parameter of the ROI comprises: determining, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the corresponding reference pixels or macroblocks; selecting, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose reference pixels or macroblocks lie within the ROI of the reference frame; and determining the movement-speed distribution parameter of the ROI from the selected motion vectors;
the step of determining the ROI scaling parameter comprises: determining the corresponding pixels or macroblocks in the current frame from the selected motion vectors, and determining the ROI scaling parameter from these current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;
applying a state transition to the particles sampled in the current frame, using the movement-speed distribution parameter and the scaling parameter of the ROI, and determining the ROI position and size in the current frame from the particles after the state transition.
Optionally, the movement-speed distribution parameter of the ROI comprises:
$P(MV_{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI}) / |G|$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels or macroblocks lie within the ROI of the reference frame, $MV_i$ is the $i$-th motion vector in $G$, and $MV_{ROI}$ is the movement speed of the ROI from the previous frame to the current frame. The ROI scaling parameter comprises:
[Figure GDA00002719392900042]
where $a$ and $b$ are affine parameters solved by the least squares method with a four-parameter transformation model, and the four-parameter transformation model is established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
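The selection of the set $G$ and the resulting distribution $P(MV_{ROI})$ can be sketched as follows. This is only an illustration under stated assumptions: each block is a tuple `(x, y, mvx, mvy)`, the motion vector is taken as the displacement from the reference frame to the current frame (so the reference position is the current position minus the vector), and the ROI is an axis-aligned rectangle. The function names are hypothetical.

```python
import random
from collections import Counter


def select_roi_motion_vectors(blocks, roi):
    """Collect G: motion vectors of current-frame blocks whose reference lies in the ROI.

    blocks: iterable of (x, y, mvx, mvy) for the current frame.
    roi: (x0, y0, x1, y1) rectangle of the ROI in the reference frame (assumed shape).
    """
    x0, y0, x1, y1 = roi
    selected = []
    for x, y, mvx, mvy in blocks:
        # assumed convention: reference position = current position - motion vector
        rx, ry = x - mvx, y - mvy
        if x0 <= rx <= x1 and y0 <= ry <= y1:
            selected.append((mvx, mvy))
    return selected


def mv_distribution(selected):
    """Empirical distribution P(MV_ROI): each vector in G carries mass 1/|G|."""
    counts = Counter(selected)
    total = len(selected)
    return {mv: c / total for mv, c in counts.items()}


def sample_mv(dist, rng):
    """Draw one motion vector according to P(MV_ROI)."""
    mvs = list(dist)
    return rng.choices(mvs, weights=[dist[m] for m in mvs], k=1)[0]
```

Since every element of $G$ has mass $1/|G|$, drawing uniformly from the list `selected` (duplicates included) is equivalent to sampling from the dictionary returned by `mv_distribution`.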
Optionally, the step of applying the state transition to the particles sampled in the current frame comprises:
determining the particle state-transition velocities $v_1$, $v_2$ from the movement-speed distribution parameter of the ROI, comprising: generating a random number $\mu$ from the uniform distribution on 0 to 1; if $\mu < \alpha$, setting $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$ and the flag variable $T_{k-1}^n = 1$; otherwise, selecting according to the distribution $P(MV_{ROI})$ one element of the motion vector set $G$ gathered in the preceding step, taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and recording $T_{k-1}^n = 0$. Here $\alpha$ is the state-transition parameter; its initial value is preset, and in subsequent particle updates it is updated as $\alpha = \sum_{n=1}^{N} T_k^n / N$; $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
determining the particle scaling parameter $\eta$ from the ROI scaling parameter, comprising: generating a random number $\gamma$ from the uniform distribution on 0 to 1; if $\gamma < \beta$, setting $\eta = \rho$, otherwise setting $\eta = 1$, where $\beta$ is a preset parameter describing how strongly the target size changes and $\rho$ is the scaling parameter;
performing the particle state transition according to the state-transition velocities and the particle scaling parameter. The result of the state transition for particle $n$ in frame $k$ comprises: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the movement speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; and the semi-major axis $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$ and semi-minor axis $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$ of the ellipse corresponding to the particle at that position. Here $\epsilon_x$, $\epsilon_y$ are random variables following $N(0, 0.2)$; $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables following $N(0, 0.25)$; and $\epsilon_a$, $\epsilon_b$ are random variables following $N(0, 0.1)$.
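The state transition described above can be sketched in Python as follows. Several points are assumptions made for illustration: the `Particle` class and function names are hypothetical, the second argument of $N(0, \cdot)$ is read as a variance, the flag is set to 0 in the motion-vector branch, and a uniform draw from the list `mv_set` (duplicates included) stands in for sampling from $P(MV_{ROI})$.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Particle:
    x: float   # centre position
    y: float
    vx: float  # movement speed
    vy: float
    a: float   # semi-major axis of the ellipse
    b: float   # semi-minor axis of the ellipse


def noise(rng, variance):
    """Zero-mean Gaussian noise; N(0, v) is interpreted here as variance v (assumption)."""
    return rng.gauss(0.0, math.sqrt(variance))


def transition(p, mv_set, rho, alpha, beta, rng):
    """Move one particle from frame k-1 to frame k; returns (new particle, flag T)."""
    if rng.random() < alpha:
        # keep the particle's own velocity from the previous frame
        v1, v2, flag = p.vx, p.vy, 1
    else:
        # draw a motion vector from G; uniform choice over the list realises P(MV_ROI)
        v1, v2 = rng.choice(mv_set)
        flag = 0  # assumed value of the flag in this branch
    # apply the ROI scaling parameter with probability beta, otherwise keep the size
    eta = rho if rng.random() < beta else 1.0
    new = Particle(
        x=p.x + v1 + noise(rng, 0.2),
        y=p.y + v2 + noise(rng, 0.2),
        vx=v1 + noise(rng, 0.25),
        vy=v2 + noise(rng, 0.25),
        a=eta * p.a + noise(rng, 0.1),
        b=eta * p.b + noise(rng, 0.1),
    )
    return new, flag


def update_alpha(flags):
    """alpha = (sum of flags over all N particles) / N."""
    return sum(flags) / len(flags)
```

In a full tracker, `transition` would be called once per particle for each frame, and the collected flags would be passed to `update_alpha` to obtain the state-transition parameter for the next update.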
Optionally, the step of determining the ROI of the current frame from the particles after the state transition comprises:
computing color histogram statistics for the region of each particle after the state transition, and updating the particles according to the color histogram statistics of each particle region;
calculating the ROI position and size from the result of the particle update.
A device for tracking a region of interest in video, comprising:
a movement-speed distribution parameter and scaling parameter determination module, configured to obtain the motion vectors of pixels or macroblocks in the current frame, to determine the movement-speed distribution parameter of the region of interest (ROI) from these motion vectors, and to determine the scaling parameter according to the state information of the ROI in the reference frame;
the movement-speed distribution parameter and scaling parameter determination module specifically comprises: a motion vector acquisition module, configured to determine, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the corresponding reference pixels or macroblocks, and to select, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose reference pixels or macroblocks lie within the ROI of the reference frame; and a parameter determination module, configured to determine the movement-speed distribution parameter and the scaling parameter of the ROI from the motion vectors selected by the motion vector acquisition module, where the corresponding pixels or macroblocks in the current frame are determined from the selected motion vectors and the ROI scaling parameter is determined from these current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;
a particle state transition module, configured to apply a state transition to the particles sampled in the current frame, using the movement-speed distribution parameter and the scaling parameter of the ROI determined by the movement-speed distribution parameter and scaling parameter determination module;
an ROI determination module, configured to determine the ROI position and size in the current frame from the particles after the state transition performed by the particle state transition module.
Optionally, the parameter determination module comprises:
an ROI movement-speed distribution parameter estimation module, configured to estimate the movement-speed distribution parameter of the ROI, where the estimated parameter comprises: $P(MV_{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI}) / |G|$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels or macroblocks lie within the ROI of the reference frame, $MV_i$ is the $i$-th motion vector in $G$, and $MV_{ROI}$ is the movement speed of the ROI from the previous frame to the current frame;
an ROI scaling parameter estimation module, configured to estimate the ROI scaling parameter, where the estimated ROI scaling parameter comprises:
[Figure GDA00002719392900053]
where $a$ and $b$ are affine parameters solved by the least squares method with a four-parameter transformation model, and the four-parameter transformation model is established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
Optionally, the particle state transition module comprises:
a particle position and velocity transition module, configured to determine the particle state-transition velocities $v_1$, $v_2$ from the movement-speed distribution parameter of the ROI obtained by the ROI movement-speed distribution parameter estimation module, comprising: generating a random number $\mu$ from the uniform distribution on 0 to 1; if $\mu < \alpha$, setting $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$ and the flag variable $T_{k-1}^n = 1$; otherwise, selecting according to the distribution $P(MV_{ROI})$ one element of the motion vector set $G$ gathered in the preceding module, taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and recording $T_{k-1}^n = 0$. Here $\alpha$ is the state-transition parameter; its initial value is preset, and in subsequent particle updates it is updated as $\alpha = \sum_{n=1}^{N} T_k^n / N$; $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
a particle size transition module, configured to determine the particle scaling parameter $\eta$ from the ROI scaling parameter obtained by the ROI scaling parameter estimation module, comprising: generating a random number $\gamma$ from the uniform distribution on 0 to 1; if $\gamma < \beta$, setting $\eta = \rho$, otherwise setting $\eta = 1$, where $\beta$ is a preset parameter describing how strongly the target size changes and $\rho$ is the ROI scaling parameter;
a particle state transition result determination module, configured to perform the particle state transition according to the state-transition velocities determined by the particle position and velocity transition module and the particle scaling parameter determined by the particle size transition module. The result of the state transition for particle $n$ in frame $k$ comprises: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the movement speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; and the semi-major axis $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$ and semi-minor axis $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$ of the ellipse corresponding to the particle at that position. Here $\epsilon_x$, $\epsilon_y$ are random variables following $N(0, 0.2)$; $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables following $N(0, 0.25)$; and $\epsilon_a$, $\epsilon_b$ are random variables following $N(0, 0.1)$.
Optionally, the ROI determination module comprises:
a particle-region color histogram statistics module, configured to compute color histogram statistics for the region of each particle after the state transition;
a particle update module, configured to update the particles according to the color histogram statistics of each particle region;
an ROI position and size calculation module, configured to calculate the ROI position and size from the result of the particle update.
As can be seen from the technical solutions above, the ROI tracking technique provided by the embodiments of the present invention uses motion vector information that is present in the compressed bitstream, or is generated during encoding, to guide the particle state transition. It can therefore reduce the number of particles needed for tracking while maintaining tracking quality, which lowers the complexity of the tracking process, and it can achieve better tracking results.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the processing flow of the method provided by an embodiment of the present invention;
Fig. 2 is structural diagram 1 of the device provided by an embodiment of the present invention;
Fig. 3 is structural diagram 2 of the device provided by an embodiment of the present invention;
Fig. 4 is structural diagram 3 of the device provided by an embodiment of the present invention;
Fig. 5 is structural diagram 4 of the device provided by an embodiment of the present invention;
Fig. 6 is an MV distribution histogram provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the particle state transition provided by an embodiment of the present invention;
Fig. 8 is structural diagram 5 of the device provided by an embodiment of the present invention;
Fig. 9 is a particle-region color histogram provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of the particle update process provided by an embodiment of the present invention;
Fig. 11 is effect diagram 1 of an embodiment of the present invention;
Fig. 12 is effect diagram 2 of an embodiment of the present invention;
Fig. 13 is application environment diagram 1 of an embodiment of the present invention;
Fig. 14 is application environment diagram 2 of an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
An embodiment of the present invention provides a method for tracking a region of interest in video. As shown in Fig. 1, it may comprise the following steps:
Step 11: obtain the motion vectors of the pixels (or macroblocks) in the current frame, determine the movement-speed distribution parameter of the ROI from these motion vectors, and determine the ROI scaling parameter according to the state information of the ROI in the reference frame;
Here, the frame whose ROI position is to be determined is called the current frame, and the frame that the current frame references in the coding structure is called the reference frame; the ROI position in the reference frame is known;
Specifically, in this step, determining the movement-speed distribution parameter of the ROI may include, but is not limited to, the following:
First, determine, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the corresponding reference pixels or macroblocks. Next, among these, select the reference pixels or macroblocks whose positions lie within the ROI of the reference frame, and obtain the motion vectors of the current-frame pixels or macroblocks that correspond to them. Finally, determine the movement-speed distribution parameter of the ROI from the motion vectors selected in this way;
Determining the ROI scaling parameter according to the state information of the ROI in the reference frame may include, but is not limited to, the following:
First, determine the corresponding pixels or macroblocks in the current frame from the motion vectors selected above. Then determine the ROI scaling parameter from these current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame.
The movement-speed distribution parameter of the ROI and the ROI scaling parameter may be calculated as follows, without being limited to this:
The movement-speed distribution parameter of the ROI is $P(MV_{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI}) / |G|$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels or macroblocks lie within the ROI of the reference frame, $MV_i$ is the $i$-th motion vector in $G$, and $MV_{ROI}$ is the movement speed of the ROI from the previous frame (that is, the reference frame) to the current frame;
The ROI scaling parameter $\rho$ is determined from $a$ and $b$, where $a$ and $b$ are affine parameters solved by the least squares method with a four-parameter transformation model, and the four-parameter transformation model is established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
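A least-squares fit of a four-parameter model can be sketched as follows. The concrete model form is an assumption, because the text does not spell it out: here a reference point $(x, y)$ maps to $(ax - by + c,\ bx + ay + d)$, which is the usual similarity transform, and the scale is taken as $\sqrt{a^2 + b^2}$. Both the model and the scale expression are illustrative guesses rather than the patent's formula, and the function names are hypothetical.

```python
import math


def fit_four_parameter(ref_pts, cur_pts):
    """Least-squares fit of u = a*x - b*y + c, v = b*x + a*y + d (assumed model).

    ref_pts: list of (x, y) block positions in the reference frame.
    cur_pts: list of (u, v) matching block positions in the current frame.
    """
    n = len(ref_pts)
    mx = sum(p[0] for p in ref_pts) / n
    my = sum(p[1] for p in ref_pts) / n
    mu = sum(p[0] for p in cur_pts) / n
    mv = sum(p[1] for p in cur_pts) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(ref_pts, cur_pts):
        # centring removes the translation terms c, d from the normal equations
        xc, yc = x - mx, y - my
        uc, vc = u - mu, v - mv
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        den += xc * xc + yc * yc
    if den == 0.0:
        raise ValueError("reference points are degenerate (all identical)")
    a = num_a / den
    b = num_b / den
    # recover the translation from the centroids
    c = mu - (a * mx - b * my)
    d = mv - (b * mx + a * my)
    return a, b, c, d


def scale_from_affine(a, b):
    """Scale factor of the assumed similarity model: sqrt(a^2 + b^2)."""
    return math.hypot(a, b)
```

For example, four reference points on a unit square that map to a square of side 2 give `a = 2`, `b = 0` and a scale of 2, meaning the tracked region has doubled in size between the two frames.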
Step 12: apply a state transition to the particles sampled in the current frame, using the determined movement-speed distribution parameter of the ROI and the ROI scaling parameter;
This step may be implemented as follows:
First, determine the particle state-transition velocities $v_1$, $v_2$ from the movement-speed distribution parameter of the ROI: generate a random number $\mu$ from the uniform distribution on 0 to 1; if $\mu < \alpha$, set $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$ and the flag variable $T_{k-1}^n = 1$; otherwise, select according to the distribution $P(MV_{ROI})$ one element of the motion vector set $G$ gathered in the preceding step, take its two components $(MV'_x, MV'_y)$ as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and record $T_{k-1}^n = 0$. Here $\alpha$ is the state-transition parameter; its initial value is preset, and in subsequent particle updates it is updated as $\alpha = \sum_{n=1}^{N} T_k^n / N$; $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$; the current frame is frame $k$ and the reference frame is frame $k-1$;
Next, determine the particle scaling parameter $\eta$ from the ROI scaling parameter: generate a random number $\gamma$ from the uniform distribution on 0 to 1; if $\gamma < \beta$, set $\eta = \rho$, otherwise set $\eta = 1$, where $\beta$ is a preset parameter describing how strongly the target size changes and $\rho$ is the scaling parameter;
Finally, perform the particle state transition according to the state-transition velocities and the particle scaling parameter. The result of the state transition for particle $n$ in frame $k$ comprises: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the movement speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; and the semi-major axis $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$ and semi-minor axis $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$ of the ellipse corresponding to the particle at that position. Here $\epsilon_x$, $\epsilon_y$ are random variables following $N(0, 0.2)$; $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables following $N(0, 0.25)$; and $\epsilon_a$, $\epsilon_b$ are random variables following $N(0, 0.1)$.
Step 13: determine the ROI position and size in the current frame from the particles after the state transition, which completes the tracking of the ROI in the video;
Specifically, this step may be implemented as follows:
Compute color histogram statistics for the region of each particle after the state transition, and update the particles according to the color histogram statistics of each particle region;
Calculate the ROI position and size from the result of the particle update.
The embodiment of the present invention mainly uses side information that already exists in the bitstream, or that is produced during encoding, to guide the particle state transition, so that the ROI tracking quality is maintained while the processing complexity of tracking is reduced. Specifically, the invention uses the MV information present in the compressed bitstream or produced during encoding to guide the state transition of the particle filter algorithm. As a result, fewer particles are needed for the same tracking quality, which lowers the complexity of the tracking process; alternatively, with the same number of particles, a more stable tracking result can be obtained. The invention also relies on the stability of particle filtering with respect to noise to make the tracking algorithm robust, which further safeguards the tracking quality of the technical solution provided by the embodiment.
The embodiment of the present invention also provides a device for tracking a region of interest in a video. As shown in Figure 2, its structure may comprise:
(1) A moving speed distribution parameter and scaling parameter determination module 21, configured to obtain the motion vectors of pixels or macroblocks in the current frame, determine the moving speed distribution parameter of the region of interest (ROI) from these motion vectors, and determine the ROI scaling parameter from the state information of the ROI in the reference frame;
Further, as shown in Figure 3, the moving speed distribution parameter and scaling parameter determination module 21 may comprise:
A motion vector acquisition module 211, configured to determine, from the motion vector of each pixel or macroblock in the current frame, the position in the reference frame of the corresponding reference pixel or macroblock, to select the reference pixels or macroblocks whose position lies inside the ROI of the reference frame, and to obtain the motion vectors of the current-frame pixels or macroblocks that correspond to those reference pixels or macroblocks;
A parameter determination module 212, configured to determine the moving speed distribution parameter of the ROI from the motion vectors of the current-frame pixels or macroblocks selected by the motion vector acquisition module 211, and also to identify the corresponding current-frame pixels or macroblocks from the selected motion vectors and determine the ROI scaling parameter from these current-frame pixels or macroblocks and their corresponding pixels or macroblocks in the reference frame;
The parameter determination module 212 may comprise an ROI moving speed distribution parameter estimation module 2121 and an ROI scaling parameter estimation module 2122, wherein:
The ROI moving speed distribution parameter estimation module 2121 is configured to estimate the moving speed distribution parameter of the ROI, which is given by $P(MV_{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI}) / |G|$, where δ is the Dirac function, G is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels or macroblocks lie inside the ROI of the reference frame, $MV_i$ is the i-th motion vector in G, and $MV_{ROI}$ is the moving speed of the ROI from the previous frame (the reference frame) to the current frame;
The ROI scaling parameter estimation module 2122 is configured to estimate the ROI scaling parameter, which is given by:
Figure GDA00002719392900103
where a and b are affine parameters obtained by least squares in combination with a four-parameter transformation model, the four-parameter transformation model being established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
(2) A particle state transition module 22, configured to perform state transition on the particles sampled in the current frame, using the ROI moving speed distribution parameter and the scaling parameter determined by the moving speed distribution parameter and scaling parameter determination module 21;
Further, with reference to Figure 8, the particle state transition module may comprise:
A particle position and speed transition module 221, configured to determine the state transition speed $v_1$, $v_2$ of a particle from the ROI moving speed distribution parameter obtained by the ROI moving speed distribution parameter estimation module 2121, as follows: a random number μ is drawn from the uniform distribution on [0, 1]; if μ < α, then $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$, and the flag variable $T_{k-1}^n = 1$; otherwise, one element is drawn from the motion vector set G collected by the preceding module according to the distribution $P(MV_{ROI})$, and its two components $(MV'_x, MV'_y)$ are taken as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and the flag variable of the particle is set to 0. α is the state transition parameter; its initial value is preset, and it is updated in the subsequent particle update process as $\alpha = \sum_{n=1}^{N} T_k^n / N$. Here $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle n in frame k-1, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle n in frame k-1;
A particle size transition module 222, configured to determine the particle scaling parameter η from the ROI scaling parameter obtained by the ROI scaling parameter estimation module 2122, as follows: a random number γ is drawn from the uniform distribution on [0, 1]; if γ < β, then η = ρ, otherwise η = 1, where β is a preset parameter describing how strongly the target size changes and ρ is the ROI scaling parameter above;
A particle state transition result determination module 223, configured to carry out the particle state transition using the state transition speed determined by the particle position and speed transition module 221 and the particle scaling parameter determined by the particle size transition module 222. The state of the n-th particle in frame k after the transition comprises:
the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$;
the moving speed of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$;
the semi-major axis of the ellipse associated with the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$, and its semi-minor axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$;
where $\epsilon_x$, $\epsilon_y$ are random variables following N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables following N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables following N(0, 0.1).
(3) An ROI determination module 23, configured to determine the ROI position and size in the current frame from the particles on which the particle state transition module 22 has performed state transition;
Specifically, as shown in Figure 4, the ROI determination module may comprise:
A per-particle region color histogram module 231, configured to compute a color histogram over the region of each particle after the state transition;
A particle update module 232, configured to update the particles according to the color histograms obtained by the per-particle region color histogram module 231;
An ROI position and size calculation module 233, configured to calculate the ROI position and size from the result of the particle update performed by the particle update module 232.
Likewise, the device above uses the MV information present in the compressed bitstream or produced during encoding to guide the particle state transition. Fewer particles are therefore needed for the same tracking quality or, with the same number of particles, a more stable tracking result can be obtained.
In other words, to improve the accuracy of ROI tracking and reduce its complexity, the embodiment of the present invention guides the state transition of the particles in the particle filter algorithm with MV information that is produced during encoding or already present in the bitstream, and thereby obtains a better tracking result.
To make the embodiment of the present invention easier to understand, it is described in detail below with reference to the accompanying drawings and a concrete application process.
In the embodiment of the present invention, the ROI is tracked on the basis of particle filtering theory, using the MV information that is produced during encoding or already present in the bitstream. The ROI is obtained frame by frame: the position, size and other data of the ROI in the current frame are obtained from the information of the ROI in its reference frame. The tracking process mainly comprises:
First, the reference frame of the current frame is fetched from the reference frame buffer, and the ROI moving speed distribution parameter and scaling parameter estimation module estimates, from the state information of the ROI in the reference frame and the MVs of pixels (or macroblocks), the speed distribution and scaling parameter with which the ROI moves to the current frame;
Second, the particle state transition module estimates the distribution of the ROI position and size information in the current frame;
Third, using the color histogram of each particle region computed by the per-particle region color histogram module, the particle update module further corrects the ROI information distribution obtained by the particle state transition module; the position and size of the ROI in the current frame are then calculated and output, and stored together with the current frame in the reference frame buffer for estimating the ROI in subsequent frames.
Specifically, as shown in Figure 4, the device provided by the embodiment of the present invention may comprise:
(1) Reference frame acquisition module
This module obtains the reference frame of the current frame. It works in the same way as reference frame acquisition in conventional coding methods and is not shown in the figure.
(2) ROI moving speed distribution parameter and scaling parameter estimation module
This module is the moving speed distribution parameter and scaling parameter determination module. It estimates the moving speed distribution parameter and the scaling parameter of the ROI region. As shown in Figure 5, it may further comprise an MV acquisition module, an ROI moving speed distribution parameter estimation module and an ROI scaling parameter estimation module, whose processing functions are described in turn below:
(21) The MV acquisition module, that is, the motion vector acquisition module, is configured to collect the MVs of the pixels whose reference pixels lie inside the ROI region of the reference frame (every occurrence of "pixel" in the description below may be replaced by "macroblock"). Each MV is a two-dimensional vector $(MV_x, MV_y)$, that is, the components of the pixel's MV in the x and y directions;
Specifically, as is readily known from the H.264/SVC coding standard, the position in the reference frame of the reference pixel of a pixel in the current frame can be obtained from the MV of that pixel. Therefore, if the reference pixel of a pixel in the current frame lies inside the ROI region of the reference frame, the MV of that pixel is collected. The current-frame pixels whose reference pixels lie inside the ROI region of the reference frame form a set G; the number of elements in G is |G|, the i-th element is $g_i$, and the MV of $g_i$ is denoted $MV_i = (MV_x^i, MV_y^i)$, i = 0, 1, ..., |G|-1. Further, the set of the MVs of all elements in G is denoted M;
For example, as shown in Figure 6, the reference pixels associated with MV1, MV2 and MV3 lie inside the ROI region of the reference frame, so the current-frame pixels corresponding to MV1, MV2 and MV3 are counted, that is, they are recorded in the set G; the reference pixel associated with MV0 lies outside the ROI of the reference frame, so the current-frame pixel corresponding to MV0 is not counted.
(22) The moving speed distribution estimation module is configured to compute the distribution histogram of (-MV) over the current-frame pixels whose reference pixels lie inside the ROI region of the reference frame, where -MV is obtained by negating both components of MV. That is, it computes the -MV histogram over the pixels in the set G obtained by the MV acquisition module and normalizes it with the L1 norm. The result serves as the estimate of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, and is denoted as the moving speed distribution parameter $P(MV_{ROI})$, that is:
$P(MV_{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI}) / |G|$, where δ is the Dirac function, G is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels lie inside the ROI of the reference frame, $MV_i$ is the i-th motion vector in G, and $MV_{ROI}$ is the moving speed of the ROI from the previous frame (the reference frame) to the current frame.
Figure 6 gives just one example of computing $P(MV_{ROI})$. Each '+' represents a pixel. The reference pixel of the pixel at the tail of the dashed arrow does not lie in the ROI region of the reference frame, so its motion vector $MV_0$ is not counted; the other motion vectors, such as $MV_1$ and $MV_2$, are counted.
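The selection and histogram steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names and the dictionary layout of `motion_vectors` are hypothetical, and the ROI in the reference frame is assumed to be an axis-aligned ellipse with center (cx, cy), semi-axis a along x and semi-axis b along y.

```python
from collections import Counter


def inside_ellipse(px, py, cx, cy, a, b):
    """True if point (px, py) lies inside the axis-aligned ellipse."""
    return ((px - cx) / a) ** 2 + ((py - cy) / b) ** 2 <= 1.0


def roi_speed_distribution(motion_vectors, roi):
    """motion_vectors maps a current-frame pixel (x, y) to its MV (mvx, mvy).

    Returns the selected pixels (the set G with their MVs) and the
    L1-normalized histogram of -MV over those pixels.
    """
    cx, cy, a, b = roi
    selected = {}
    for (x, y), (mvx, mvy) in motion_vectors.items():
        # The reference pixel of (x, y) sits at (x + mvx, y + mvy).
        if inside_ellipse(x + mvx, y + mvy, cx, cy, a, b):
            selected[(x, y)] = (mvx, mvy)
    # Histogram of the negated motion vectors, as described in the text.
    counts = Counter((-mvx, -mvy) for mvx, mvy in selected.values())
    total = sum(counts.values())
    dist = {v: c / total for v, c in counts.items()} if total else {}
    return selected, dist
```

A pixel whose reference position falls outside the ellipse, like the one associated with $MV_0$ in Figure 6, simply does not contribute to the histogram.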
(23) The ROI region scaling parameter estimation module is configured to estimate the scaling parameter of the ROI region, which indicates the scaling relation between the ROI region in the reference frame and the ROI region in the current frame. Specifically, the scaling parameter can be obtained on the basis of an affine transformation, and may be:
Figure GDA00002719392900141
where a and b are affine transformation parameters;
Specifically, the ROI scaling parameter may be obtained as follows:
Let the coordinates of the i-th element $g_i$ in G be $(x_i, y_i)$; the coordinates of its reference pixel are then $(u_i, v_i) = (x_i + MV_x^i, y_i + MV_y^i)$. A special case of the affine transformation, the four-parameter transformation model, is adopted as the motion transformation model of the target, that is:
$\begin{pmatrix} x_i \\ y_i \end{pmatrix} = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \times \begin{pmatrix} u_i \\ v_i \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix}$, where a, b, c, d are the affine transformation parameters;
A single four-parameter transformation model for the whole ROI is estimated over all pixels in the set G using the least squares criterion. The following matrices A, B, C can be constructed:
$A = \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}$, $B = \begin{pmatrix} \vdots & \vdots & \vdots & \vdots \\ u_i & -v_i & 1 & 0 \\ v_i & u_i & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \end{pmatrix}$, $C = \begin{pmatrix} \vdots \\ x_i \\ y_i \\ \vdots \end{pmatrix}$;
The motion transformation model of the target then becomes $C = B \times A$, and the least squares criterion gives $A = (B^T B)^{-1} B^T C$, where $B^T$ is the transpose of B. The affine transformation parameters a, b are the a, b components of the matrix A. Since the matrices B and C are known, the values of a, b can be obtained, and from them the value ρ of the scaling parameter can be calculated.
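The least squares fit can be sketched in plain Python. For this particular four-parameter model the normal equations $A = (B^T B)^{-1} B^T C$ reduce to a closed form once the coordinates are centered, which is what the hypothetical helper `fit_four_parameter` below uses. The formula image for ρ is not reproduced in this text, so `scale_from_affine` assumes $\rho = \sqrt{a^2 + b^2}$, the usual scale factor of such a transformation; treat that as an assumption rather than the patent's stated formula.

```python
import math


def fit_four_parameter(ref_pts, cur_pts):
    """Least squares fit of x = a*u - b*v + c, y = b*u + a*v + d.

    ref_pts holds the reference-frame coordinates (u_i, v_i) and
    cur_pts the matching current-frame coordinates (x_i, y_i).
    """
    n = len(ref_pts)
    um = sum(u for u, _ in ref_pts) / n
    vm = sum(v for _, v in ref_pts) / n
    xm = sum(x for x, _ in cur_pts) / n
    ym = sum(y for _, y in cur_pts) / n
    denom = num_a = num_b = 0.0
    for (u, v), (x, y) in zip(ref_pts, cur_pts):
        du, dv, dx, dy = u - um, v - vm, x - xm, y - ym
        denom += du * du + dv * dv
        num_a += du * dx + dv * dy
        num_b += du * dy - dv * dx
    if denom == 0.0:
        raise ValueError("need at least two distinct reference points")
    a = num_a / denom
    b = num_b / denom
    # Translation follows from the centroids of both point sets.
    c = xm - a * um + b * vm
    d = ym - b * um - a * vm
    return a, b, c, d


def scale_from_affine(a, b):
    """Assumed scaling parameter: rho = sqrt(a^2 + b^2)."""
    return math.hypot(a, b)
```

With noise-free correspondences the fit recovers the generating parameters up to rounding; with real motion vectors it returns the least squares compromise over all pixels in G.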
(3) Particle transition module
This module is the particle state transition module. It uses the estimate $P(MV_{ROI})$ of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, and the estimate ρ of the scaling parameter, both obtained by the ROI moving speed distribution parameter and scaling parameter estimation module, to guide the state transition of the particles and thus achieve a better tracking result.
Specifically, the purpose of this module is to make a preliminary estimate of the ROI distribution in the current frame. As shown in Figure 7, each circle represents a particle (which carries information such as the position and size of the ROI). The particle set in the reference frame represents the distribution of the ROI state information in the reference frame. After the particle state transition, the particle set of the reference frame yields the particle set of the current frame, which represents the distribution of the ROI state information in the current frame;
During the particle state transition, this module needs the following information: the particle states representing the ROI distribution in the reference frame, the state transition parameter α, and $P(MV_{ROI})$ and ρ. The initial value of α may be, but is not limited to, 0.5, and the value of α can be updated automatically in subsequent processing.
Specifically, the same processing is carried out for every particle. Only the state transition of the n-th particle is described here as an example. As shown in Figure 8, the state transition of the n-th particle may comprise:
First, the state transition speed of the particle is determined: a random number μ is drawn from the uniform distribution on [0, 1]. If μ < α, then $v_1 = s_{k-1}^n\{v_x\}$ (the $v_x$ component of particle n in frame k-1; expressions of this form below are read in the same way), $v_2 = s_{k-1}^n\{v_y\}$, and the flag variable of the particle is set to 1. Otherwise, one element is drawn from the previously obtained motion vector set M according to $P(MV_{ROI})$, the estimate of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, and its two components $(MV'_x, MV'_y)$ are taken as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and the flag variable of the particle is set to 0. Here $v_1$, $v_2$ are temporary variables;
Second, the particle scaling parameter is determined: a random number γ is drawn from the uniform distribution on [0, 1]; if γ < β, then η = ρ, otherwise η = 1. Here η is a temporary variable, and β is the target size change parameter, a constant that controls how strongly the target size may change: if the size of the target changes quickly, β takes a larger value, otherwise a smaller one. Since the target size usually does not change sharply between two adjacent frames, β may be, but is not limited to, 0.2;
Third, the particle state transition is carried out using the temporary variables $v_1$, $v_2$ and η determined above. Specifically, the state transition result $s_k^n$ of the n-th particle in frame k is calculated as follows:
the position of the particle along the x axis, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$;
the position of the particle along the y axis, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$;
the moving speed of the particle along the x axis, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$;
the moving speed of the particle along the y axis, $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$;
the semi-major axis of the ellipse associated with the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$;
the semi-minor axis of the ellipse associated with the particle at that position, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$;
where $\epsilon_x$, $\epsilon_y$ are random variables following N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables following N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables following N(0, 0.1). Here $s_k^n$ denotes the n-th particle in frame k, with the following meaning: in frame k, the ROI appears at (x, y) with probability $\pi_k^n$, its semi-major and semi-minor axes are a and b respectively, and it moves to the next frame with speed $(v_x, v_y)$;
(4) Per-particle region color histogram module
This module computes the observation model of particle filtering theory. The same operation is carried out for every particle; the computation is described below taking particle n as an example.
Particle n in the current frame has state $s_k^n = \{x, v_x, y, v_y, a, b\}$. On the image of the current frame it corresponds to an ellipse centered at (x, y) with semi-axes a and b, denoted $R_k^n$. As shown in Figure 9, this module computes the color histogram of the elliptical region in the figure and normalizes it with the L1 norm. The result is denoted $P(sb_k^n)$ and is computed as:
$P(sb_k^n = b) = \sum_{t=0}^{|R_k^n|-1} \delta(b_Y^t \times N_U \times N_V + b_U^t \times N_V + b_V^t - b) / |R_k^n|$;
where the video input is assumed to be in the YUV color space; $N_Y$, $N_U$, $N_V$ are the numbers of bins into which each color channel is divided (obtained by quantizing the YUV space; for example, if each dimension of the YUV space is divided into 10 parts, $N_Y$, $N_U$, $N_V$ all equal 10); $b_Y^t$, $b_U^t$, $b_V^t$ are the indices of the bins into which the t-th pixel in $R_k^n$ falls on each color channel (bin indices on each channel start at 0 and increase, up to 9 in this example); and $|R_k^n|$ is the number of pixels in the ellipse $R_k^n$.
Specifically, the way this module computes the observation model of particle filtering theory is the same as in the prior art, except that the input of this module is the particle state transition result obtained by the method of the invention,
$s_k^n = \{x, v_x, y, v_y, a, b\}$.
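The histogram computation for one particle can be sketched as follows. The sketch assumes 8-bit YUV samples stored as `yuv_image[row][col] = (Y, U, V)` and an axis-aligned ellipse with semi-axis a along x and b along y; the function name and these layout choices are hypothetical and only serve to illustrate the bin index $b_Y N_U N_V + b_U N_V + b_V$ and the L1 normalization.

```python
def ellipse_color_histogram(yuv_image, cx, cy, a, b, bins=(10, 10, 10)):
    """L1-normalized YUV histogram of the pixels inside an ellipse."""
    n_y, n_u, n_v = bins
    hist = [0.0] * (n_y * n_u * n_v)
    count = 0
    height, width = len(yuv_image), len(yuv_image[0])
    # Scan only the bounding box of the ellipse, clipped to the image.
    for row in range(max(0, int(cy - b)), min(height, int(cy + b) + 1)):
        for col in range(max(0, int(cx - a)), min(width, int(cx + a) + 1)):
            if ((col - cx) / a) ** 2 + ((row - cy) / b) ** 2 > 1.0:
                continue  # pixel lies outside the ellipse
            y, u, v = yuv_image[row][col]
            # Quantize each 8-bit channel into its bin index (0-based).
            b_y = y * n_y // 256
            b_u = u * n_u // 256
            b_v = v * n_v // 256
            hist[b_y * n_u * n_v + b_u * n_v + b_v] += 1
            count += 1
    return [h / count for h in hist] if count else hist
```

With 10 bins per channel the histogram has 1000 entries, so the cost per particle is dominated by the number of pixels inside the ellipse rather than by the histogram size.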
(5) Particle update module
This module updates the particles. Specifically, for the particles obtained through state transition that represent the ROI distribution in the current frame (the output of the particle transition module), it calculates their weights, obtains the estimate of the ROI distribution in the current frame through resampling, and updates the state transition parameter α.
Specifically, as shown in Figure 10, the particle update process may comprise:
First, the weight of each particle is updated: $\pi_k^n = \pi_{k-1}^n \times \exp\{-(1 - \sum_{j=1}^{N} (P(sb_0)[j] \times P(sb_k^n)[j])^{1/2}) / \sigma\}$, where $\pi_k^n$ is the weight of the n-th particle in frame k (the probability that the region represented by the particle is the ROI region); N here is the number of bins of the color histogram used, $N = N_Y N_U N_V$; $P(sb_k^n)[j]$ is the value of the j-th bin of the color histogram of particle n obtained by the preceding module (the per-particle region color histogram module); $P(sb_0)$ is the color histogram of the initial ROI region; and σ is a constant which may be, but is not limited to, 0.25. The particle weights are then normalized: $\pi_k^n = \pi_k^n / \sum_{n=1}^{N} \pi_k^n$, where N in this sum is the number of particles.
Second, resampling is performed. The effective number of particles is first calculated as
Figure GDA00002719392900172
and then the condition
Figure GDA00002719392900173
is checked. If the effective number of particles satisfies
Figure GDA00002719392900174
resampling is performed; otherwise resampling is skipped and processing goes directly to the third step, updating the state transition parameter α. Here N is the number of particles and λ is a constant parameter that can be preset. The resampling operation is as follows: the particle set of the current frame $S = \{s_k^n, \pi_k^n, T_k^n\}_{n=1}^{N}$ is sampled according to the weights $\pi_k^n$, and the sampled particles are put into a new particle set $S' = \{s_k^{\prime n}, \pi_k^{\prime n}, T_k^{\prime n}\}_{n=1}^{N}$; that is, particle $s_k^n$ is added to the new particle set S' with probability $\pi_k^n$. After resampling, the weight of each particle in S' is updated to
Figure GDA000027193929001710
and S is set to S', which completes the resampling operation.
Third, the state transition parameter α is updated: $\alpha = \sum_{n=1}^{N} T_k^n / N$.
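The three update steps can be sketched together in Python. Several details are assumptions, because the corresponding formula images are not reproduced in this text: the effective number of particles is taken to be the common $1 / \sum_n (\pi_k^n)^2$, resampling is triggered when it falls below λN with a hypothetical default λ = 0.5, and weights are reset to 1/N after resampling. All function names are hypothetical.

```python
import math
import random


def bhattacharyya(p, q):
    """Sum over bins of sqrt(p[j] * q[j]) for two normalized histograms."""
    return sum(math.sqrt(pj * qj) for pj, qj in zip(p, q))


def update_weights(prev_weights, histograms, ref_hist, sigma=0.25):
    """Weight update followed by normalization over all particles."""
    raw = [w * math.exp(-(1.0 - bhattacharyya(ref_hist, h)) / sigma)
           for w, h in zip(prev_weights, histograms)]
    total = sum(raw)
    return [w / total for w in raw]


def effective_count(weights):
    """Assumed definition: 1 / sum of squared normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def resample_and_update_alpha(states, weights, flags, lam=0.5, rng=random):
    """Resample when too few particles are effective, then update alpha."""
    n = len(weights)
    if effective_count(weights) < lam * n:  # assumed threshold lambda * N
        picks = rng.choices(range(n), weights=weights, k=n)
        states = [states[i] for i in picks]
        flags = [flags[i] for i in picks]
        weights = [1.0 / n] * n  # assumed reset to uniform weights
    # alpha is the fraction of particles that reused their old velocity.
    alpha = sum(flags) / n
    return states, weights, flags, alpha
```

Because the flags travel with the particles through resampling, α grows when particles that kept their previous velocity tend to survive, and shrinks when particles driven by the motion vectors do.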
(6) ROI position and size calculation module
After the processing of the modules above, the particle set of the current frame, $S = \{s_k^n, \pi_k^n, T_k^n\}_{n=1}^{N}$, has been obtained. The position and size of the ROI in the current frame can now be calculated: the position of the ROI center is $(x, y) = (\sum_{n=1}^{N} \pi_k^n \times s_k^n\{x\}, \sum_{n=1}^{N} \pi_k^n \times s_k^n\{y\})$, and the size of the ROI is given by $a = \sum_{n=1}^{N} \pi_k^n \times s_k^n\{a\}$ and $b = \sum_{n=1}^{N} \pi_k^n \times s_k^n\{b\}$.
The particle set S representing the ROI state in the current frame must then be stored in the reference frame buffer so that it can be used in subsequent tracking.
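The final estimate is a weighted mean over the particle set, which can be sketched as below. The dictionary layout of a particle and the function name `estimate_roi` are hypothetical, and the weights are assumed to be already normalized.

```python
def estimate_roi(particles, weights):
    """Weighted mean of position (x, y) and size (a, b) over all particles."""
    return {
        key: sum(w * p[key] for p, w in zip(particles, weights))
        for key in ("x", "y", "a", "b")
    }
```

For instance, two particles at x = 0 and x = 4 with weights 0.25 and 0.75 give an estimated center at x = 3.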
With the embodiment of the present invention described above, a better tracking result can be obtained at lower complexity when tracking the ROI.
Specifically, the ROI tracking scheme provided by the embodiment of the present invention was compared experimentally with a prior-art tracking scheme, using the coastguard and stephen sequences as input. In the coastguard sequence, the prior-art scheme begins to lose the target at frame 49 and has lost it completely by frame 56, whereas the scheme proposed by the embodiment tracks well. In the stephen sequence, the prior-art scheme begins to lose the target at frame 33 and has lost it completely by frame 78, whereas the scheme proposed by the embodiment again tracks well.
In particle filtering theory, one criterion for judging the quality of a tracking scheme is how the effective number of particles changes over the tracking time. Figure 11 and Figure 12 show, for the coastguard and stephen sequences respectively, how the effective number of particles changes over tracking time for the prior-art scheme and for the scheme proposed by the embodiment. In both figures the upper curve corresponds to the scheme proposed by the embodiment and the lower curve to the prior-art scheme. The effective number of particles of the proposed scheme decreases more slowly over tracking time, which indicates that the proposed scheme clearly outperforms the prior-art scheme.
The environments in which the method and device for tracking a region of interest in a video provided by the embodiment of the present invention can be applied are illustrated below.
Application Example 1
The method and device for tracking a region of interest in a video provided by the embodiment of the present invention can be applied in scalable video coding (SVC) that contains an ROI.
Specifically, the structure of a scalable SVC encoder containing an ROI is shown in Figure 13; SVC provides support for region-of-interest coding. For the viewer, the ROI is usually a region of the video frame containing an object with clear high-level semantics, such as a person or an object. While a user browses video, if the display of the device is small or the available bandwidth drops, the sharpness of the region of interest should be preserved as far as possible so that the viewing experience is not affected. As shown in Figure 13, the method and device provided by the invention can be used to obtain the ROI region in a scalable SVC encoder containing an ROI, that is, to implement the enhancement layer ROI region acquisition module shown in Figure 13. The inputs of the module are the base layer MVs and the already encoded video data of the current frame; after tracking, the module outputs the size and position of the ROI in the current frame for use in encoding the ROI region of the enhancement layer.
Application Example 2
In addition to scalable video coding, the method and device for tracking a region of interest in a video provided by the embodiment of the present invention can also be applied to ROI transcoding.
Because of constraints such as the screen size of the video terminal and the network bandwidth, an existing compressed video bitstream often has to be transcoded into the target bitstream that the client needs. To preserve visual quality, the original compressed bitstream can be transcoded into an ROI bitstream, that is, visual information that is unimportant to the client is discarded and only the ROI region is kept at high quality. The structure of the corresponding cascaded transcoder is shown in Figure 14. The method and device provided by the embodiment of the present invention can be placed in the ROI tracking module of the cascaded transcoder shown in Figure 14 to obtain the position and size of the ROI, so that the ROI is encoded into a bitstream of higher quality.
Of course, the method and device provided by the embodiment of the present invention can also be applied in other similar environments that require ROI tracking, which are not listed one by one here.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims (8)

1. A method for tracking a region of interest in a video, characterized by comprising:
Obtaining the motion vectors of pixels or macroblocks in the current frame, determining the moving-speed distribution parameter of the region of interest (ROI) according to the motion vectors, and determining the ROI scaling parameter according to the state information of the ROI in the reference frame;
Wherein the step of determining the moving-speed distribution parameter of the ROI comprises: determining, according to the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the reference pixels or macroblocks corresponding to the pixels or macroblocks in the current frame; selecting, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose corresponding reference pixels or macroblocks are located in the ROI of the reference frame; and determining the moving-speed distribution parameter of the ROI according to the selected motion vectors;
The step of determining the ROI scaling parameter comprises: determining the corresponding pixels or macroblocks in the current frame according to the selected motion vectors, and determining the ROI scaling parameter according to these current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;
Performing state transition on the particles sampled in the current frame using the moving-speed distribution parameter of the ROI and the scaling parameter, and determining the ROI position and size of the current frame according to the particles after state transition.
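The motion-vector selection step of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Ellipse` class, the function name `select_roi_motion_vectors` and the input layout are hypothetical, and the sketch assumes that each motion vector is expressed as the displacement from the reference frame to the current frame, so that the reference position is the current position minus the vector. A codec that stores the opposite sign would need its vectors negated first.

```python
from dataclasses import dataclass


@dataclass
class Ellipse:
    """Hypothetical ROI state: centre (cx, cy) and semi-axes (a, b), in pixels."""
    cx: float
    cy: float
    a: float
    b: float

    def contains(self, x: float, y: float) -> bool:
        # Standard point-in-ellipse test for an axis-aligned ellipse.
        return ((x - self.cx) / self.a) ** 2 + ((y - self.cy) / self.b) ** 2 <= 1.0


def select_roi_motion_vectors(blocks, roi_ref):
    """Return the set G and the matched (reference, current) block centres.

    blocks: iterable of (x, y, mvx, mvy), where (x, y) is a block centre in the
    current frame and (mvx, mvy) is its motion vector.
    Assumption: the vector points from the reference frame to the current
    frame, so the reference position is (x - mvx, y - mvy).
    """
    G, pairs = [], []
    for x, y, mvx, mvy in blocks:
        rx, ry = x - mvx, y - mvy        # position in the reference frame
        if roi_ref.contains(rx, ry):     # keep only blocks that came from the ROI
            G.append((mvx, mvy))
            pairs.append(((rx, ry), (x, y)))
    return G, pairs
```

The returned list `G` feeds the moving-speed distribution, and the matched position pairs are the kind of correspondences from which a scaling parameter could be estimated.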
2. The method according to claim 1, characterized in that the moving-speed distribution parameter of the ROI comprises: $P(MV_{ROI}) = \frac{1}{|G|}\sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI})$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the current-frame pixels or macroblocks whose corresponding reference pixels or macroblocks are located in the ROI of the reference frame, $MV_i$ is the $i$-th motion vector in $G$, and $MV_{ROI}$ is the moving speed of the ROI from the previous frame to the current frame; the ROI scaling parameter comprises:
Figure FDA00002719392800013
where $a$ and $b$ are affine parameters solved by the least-squares method in combination with a four-parameter transformation model, the four-parameter transformation model being established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
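As an illustration of claim 2, the sketch below builds the empirical distribution $P(MV_{ROI})$ and fits a four-parameter model by least squares. Several points are assumptions rather than content of the patent: the model is taken to be the common similarity form $x' = ax - by + c$, $y' = bx + ay + d$, the function names are hypothetical, and the scale factor $\sqrt{a^2 + b^2}$ is the usual scale of such a transform, whereas the patent's own scaling formula is given only in a figure that is not reproduced here.

```python
import math
import random
from collections import Counter


def mv_distribution(G):
    """Empirical distribution P(MV_ROI): each vector in G carries mass 1/|G|."""
    n = len(G)
    return {mv: count / n for mv, count in Counter(G).items()}


def sample_mv(G, rng=random):
    """A uniform draw from the list G (repeats included) is a draw from P(MV_ROI)."""
    return rng.choice(G)


def fit_four_parameter(pairs):
    """Closed-form least-squares fit of x' = a*x - b*y + c, y' = b*x + a*y + d.

    pairs: list of ((x, y), (x', y')) mapping reference to current positions.
    """
    n = len(pairs)
    mx = sum(p[0][0] for p in pairs) / n
    my = sum(p[0][1] for p in pairs) / n
    mxp = sum(p[1][0] for p in pairs) / n
    myp = sum(p[1][1] for p in pairs) / n
    num_a = num_b = den = 0.0
    for (x, y), (xp, yp) in pairs:
        u, v, up, vp = x - mx, y - my, xp - mxp, yp - myp   # centred coordinates
        num_a += u * up + v * vp
        num_b += u * vp - v * up
        den += u * u + v * v
    if den == 0.0:
        raise ValueError("need at least two distinct reference positions")
    a, b = num_a / den, num_b / den
    c = mxp - a * mx + b * my
    d = myp - b * mx - a * my
    return a, b, c, d


def scale_from_affine(a, b):
    """Assumed scale factor of the fitted transform (not taken from the patent)."""
    return math.hypot(a, b)
```

Centring the coordinates decouples the translation terms from $a$ and $b$, which is why the fit needs no matrix solver.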
3. The method according to claim 2, characterized in that the step of performing state transition on the particles sampled in the current frame comprises:
Determining the speeds $v_1$, $v_2$ of the particle state transition according to the moving-speed distribution parameter of the ROI, comprising: generating a random number $\mu$ from the uniform distribution on 0~1; if $\mu < \alpha$, setting $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$ and the flag variable $T_{k-1}^n = 1$; otherwise, selecting, according to the distribution $P(MV_{ROI})$, the two components $(MV'_x, MV'_y)$ of one element of the motion vector set $G$ collected in the preceding module as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and recording
Figure FDA00002719392800021
$\alpha$ is the state transition parameter; its initial value is preset, and it is updated in the subsequent particle update process by $\alpha = \sum_{n=1}^{N} T_k^n / N$, where $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$ and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
Determining the particle scaling parameter $\eta$ according to the ROI scaling parameter, comprising: generating a random number $\gamma$ from the uniform distribution on 0~1; if $\gamma < \beta$, setting $\eta = \rho$, otherwise setting $\eta = 1$, where $\beta$ is a preset parameter describing the degree of change of the target size and $\rho$ is the scaling parameter;
Performing the particle state transition according to the speeds of the particle state transition and the particle scaling parameter, the result of the state transition of the $n$-th particle in frame $k$ comprising: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the semi-major axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$, and its semi-minor axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; where $\epsilon_x$, $\epsilon_y$ are random variables obeying the N(0, 0.2) distribution, $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
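The state transition of claim 3 can be sketched for a single particle as follows. The dictionary layout and the function names are hypothetical. Two points are assumptions: the second argument of N(0, ·) is read as a variance, and the flag is set to 0 when the velocity is drawn from $G$, since the claim gives that value only in a figure that is not reproduced.

```python
import math
import random


def transition(p, G, alpha, beta, rho, rng=random,
               var_pos=0.2, var_vel=0.25, var_size=0.1):
    """One state transition for a particle p with keys x, y, vx, vy, a, b.

    Returns (new_particle, T), where T is 1 if the particle kept its own
    velocity and 0 (assumed) if the velocity was drawn from G.
    """
    def noise(var):
        # Assumption: N(0, var) is parameterised by its variance.
        return rng.gauss(0.0, math.sqrt(var))

    if rng.random() < alpha:          # mu < alpha: keep the previous velocity
        v1, v2, T = p["vx"], p["vy"], 1
    else:                             # otherwise draw (MV'_x, MV'_y) from P(MV_ROI)
        v1, v2 = rng.choice(G)
        T = 0
    eta = rho if rng.random() < beta else 1.0   # gamma < beta: apply the ROI scale

    new_p = {
        "x": p["x"] + v1 + noise(var_pos),
        "y": p["y"] + v2 + noise(var_pos),
        "vx": v1 + noise(var_vel),
        "vy": v2 + noise(var_vel),
        "a": eta * p["a"] + noise(var_size),
        "b": eta * p["b"] + noise(var_size),
    }
    return new_p, T


def update_alpha(flags):
    """alpha = (sum of the flags T over all N particles) / N."""
    return sum(flags) / len(flags)
```

Mixing the particle's own velocity with velocities drawn from the codec's motion vectors is what lets the motion information steer the particles, and `update_alpha` adapts the mixing ratio from the flags of the current particle set.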
4. The method according to any one of claims 1 to 3, characterized in that the step of determining the ROI of the current frame according to the particles after state transition comprises:
Computing color histogram statistics over the region of each particle after state transition, and performing the particle update according to the color histogram statistics of each particle region;
Calculating the ROI position and size according to the result of the particle update.
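A hedged sketch of the histogram-based update in claim 4 follows. The claim does not name a similarity measure or a weighting function; the Bhattacharyya coefficient and the exponential weighting used here are common choices in color-based particle filters and are assumptions, as are the function names and the `sigma` parameter. Resampling is omitted.

```python
import math


def bhattacharyya(h1, h2):
    """Similarity in [0, 1] between two normalised histograms of equal length."""
    return sum(math.sqrt(p * q) for p, q in zip(h1, h2))


def particle_weights(histograms, target_hist, sigma=0.2):
    """Weight each particle by how well its region histogram matches the target."""
    raw = [math.exp(-(1.0 - bhattacharyya(h, target_hist)) / (2.0 * sigma ** 2))
           for h in histograms]
    total = sum(raw)
    return [w / total for w in raw]


def estimate_roi(particles, weights):
    """Weighted mean of the particle states: ROI centre (x, y) and semi-axes (a, b)."""
    return {k: sum(w * p[k] for p, w in zip(particles, weights))
            for k in ("x", "y", "a", "b")}
```

Particles whose region looks like the target dominate the weighted mean, so the estimated ROI follows the color model even when some particles drift.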
5. A device for tracking a region of interest in a video, characterized by comprising:
A moving-speed distribution parameter and scaling parameter determination module, configured to obtain the motion vectors of pixels or macroblocks in the current frame, to determine the moving-speed distribution parameter of the region of interest (ROI) according to the motion vectors, and to determine the scaling parameter according to the state information of the ROI in the reference frame;
The moving-speed distribution parameter and scaling parameter determination module specifically comprises: a motion vector acquisition module, configured to determine, according to the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the reference pixels or macroblocks corresponding to the pixels or macroblocks in the current frame, and to select, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose corresponding reference pixels or macroblocks are located in the ROI of the reference frame; and a parameter determination module, configured to determine the moving-speed distribution parameter and the scaling parameter of the ROI according to the motion vectors selected by the motion vector acquisition module, and further to determine the corresponding pixels or macroblocks in the current frame according to the selected motion vectors and to determine the ROI scaling parameter according to these current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;
A particle state transition module, configured to perform state transition on the particles sampled in the current frame using the moving-speed distribution parameter of the ROI and the scaling parameter determined by the moving-speed distribution parameter and scaling parameter determination module;
An ROI determination module, configured to determine the ROI position and size of the current frame according to the particles on which the particle state transition module has performed state transition.
6. The device according to claim 5, characterized in that the parameter determination module comprises:
An ROI moving-speed distribution parameter estimation module, configured to estimate the moving-speed distribution parameter of the ROI, the estimated moving-speed distribution parameter of the ROI comprising: $P(MV_{ROI}) = \frac{1}{|G|}\sum_{i=0}^{|G|-1} \delta(MV_i - MV_{ROI})$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the current-frame pixels or macroblocks whose corresponding reference pixels or macroblocks are located in the ROI of the reference frame, $MV_i$ is the $i$-th motion vector in $G$, and $MV_{ROI}$ is the moving speed of the ROI from the previous frame to the current frame;
An ROI scaling parameter estimation module, configured to estimate the ROI scaling parameter, the estimated ROI scaling parameter comprising:
Figure FDA00002719392800033
where $a$ and $b$ are affine parameters solved by the least-squares method in combination with a four-parameter transformation model, the four-parameter transformation model being established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
7. The device according to claim 6, characterized in that the particle state transition module comprises:
A particle position and speed transition module, configured to determine the speeds $v_1$, $v_2$ of the particle state transition according to the moving-speed distribution parameter of the ROI obtained by the ROI moving-speed distribution parameter estimation module, comprising: generating a random number $\mu$ from the uniform distribution on 0~1; if $\mu < \alpha$, setting $v_1 = s_{k-1}^n\{v_x\}$, $v_2 = s_{k-1}^n\{v_y\}$ and the flag variable $T_{k-1}^n = 1$; otherwise, selecting, according to the distribution $P(MV_{ROI})$, the two components $(MV'_x, MV'_y)$ of one element of the motion vector set $G$ collected in the preceding module as the values of $v_1$, $v_2$, that is, $v_1 = MV'_x$, $v_2 = MV'_y$, and recording
Figure FDA00002719392800037
$\alpha$ is the state transition parameter; its initial value is preset, and it is updated in the subsequent particle update process by $\alpha = \sum_{n=1}^{N} T_k^n / N$, where $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$ and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
A particle size transition module, configured to determine the particle scaling parameter $\eta$ according to the ROI scaling parameter obtained by the ROI scaling parameter estimation module, comprising: generating a random number $\gamma$ from the uniform distribution on 0~1; if $\gamma < \beta$, setting $\eta = \rho$, otherwise setting $\eta = 1$, where $\beta$ is a preset parameter describing the degree of change of the target size and $\rho$ is the ROI scaling parameter;
A particle state transition result determination module, configured to perform the particle state transition according to the speeds of the particle state transition determined by the particle position and speed transition module and the particle scaling parameter determined by the particle size transition module, the result of the state transition of the $n$-th particle in frame $k$ comprising: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the semi-major axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$, and its semi-minor axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; where $\epsilon_x$, $\epsilon_y$ are random variables obeying the N(0, 0.2) distribution, $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
8. The device according to any one of claims 5 to 7, characterized in that the ROI determination module comprises:
A particle region color histogram statistics module, configured to compute color histogram statistics over the region of each particle after state transition;
A particle update module, configured to perform the particle update according to the color histogram statistics of each particle region;
An ROI position and size calculation module, configured to calculate the ROI position and size according to the result of the particle update.
CN 201210132913 2012-04-28 2012-04-28 Method and device for tracking region of interest in video Active CN102682454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201210132913 CN102682454B (en) 2012-04-28 2012-04-28 Method and device for tracking region of interest in video

Publications (2)

Publication Number Publication Date
CN102682454A CN102682454A (en) 2012-09-19
CN102682454B CN102682454B (en) 2013-05-08

Family

ID=46814319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201210132913 Active CN102682454B (en) 2012-04-28 2012-04-28 Method and device for tracking region of interest in video

Country Status (1)

Country Link
CN (1) CN102682454B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104185078A (en) * 2013-05-20 2014-12-03 华为技术有限公司 Video monitoring processing method, device and system thereof
CN103517073B (en) * 2013-07-12 2016-11-02 上海交通大学 Video coding-decoding method, Apparatus and system
CN106575362A (en) * 2014-08-28 2017-04-19 高通股份有限公司 Object selection based on region of interest fusion
EP3029937B1 (en) 2014-12-03 2016-11-16 Axis AB Method and encoder for video encoding of a sequence of frames
US9542751B2 (en) 2015-05-08 2017-01-10 Qualcomm Incorporated Systems and methods for reducing a plurality of bounding regions
US9865062B2 (en) 2016-02-12 2018-01-09 Qualcomm Incorporated Systems and methods for determining a region in an image
CN110996099B (en) * 2019-11-15 2021-05-25 网宿科技股份有限公司 Video coding method, system and equipment
CN110933446B (en) * 2019-11-15 2021-05-25 网宿科技股份有限公司 Method, system and equipment for identifying region of interest
CN111105442B (en) * 2019-12-23 2022-07-15 中国科学技术大学 Switching type target tracking method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127121A (en) * 2007-09-13 2008-02-20 复旦大学 Target tracking algorism based on self-adaptive initial search point forecast
CN101894378A (en) * 2010-06-13 2010-11-24 南京航空航天大学 Moving target visual tracking method and system based on double ROI (Region of Interest)
CN102124727A (en) * 2008-03-20 2011-07-13 无线电技术研究学院有限公司 A method of adapting video images to small screen sizes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
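Real-time gesture tracking algorithm based on a color probability model (基于颜色概率模型的实时手势跟踪算法); Xu Liqun, Wu Xiaojuan; Computer Engineering & Science (《计算机工程与科学》); 2005-12-30; Vol. 27; pp. 44-47 *
Xu Liqun, Wu Xiaojuan. Real-time gesture tracking algorithm based on a color probability model. Computer Engineering & Science (《计算机工程与科学》). 2005, Vol. 27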

Also Published As

Publication number Publication date
CN102682454A (en) 2012-09-19

Similar Documents

Publication Publication Date Title
CN102682454B (en) Method and device for tracking region of interest in video
Hu et al. Improving deep video compression by resolution-adaptive flow coding
US11234006B2 (en) Training end-to-end video processes
US11528492B2 (en) Machine learning for visual processing
CN101268690B (en) Adaptive area of influence filter for moving object boundaries
CN101946514B (en) Estimation of true motion vectors using an adaptive search range
CN104363451A (en) Image forecasting method and related device
CN104365095A (en) Method and apparatus for encoding a selected spatial portion of a video stream
CN104144313A (en) Video communication method, video communication system and cloud image server
CN113810763A (en) Video processing method, device and storage medium
CN107211081A (en) The transmission of video of context update based on absolute coding
CN112468806B (en) Panoramic video transmission optimization method for cloud VR platform
CN115834922A (en) Picture enhancement type decoding method facing real-time video analysis
CN115131673A (en) Task-oriented remote sensing image compression method and system
Shen et al. Prediction-assistant frame super-resolution for video streaming
Tang et al. A Constrained Deformable Convolutional Network for Efficient Single Image Dynamic Scene Blind Deblurring with Spatially-Variant Motion Blur Kernels Estimation
Oliveira et al. Conventional versus learning-based video coding benchmarking: Where are we?
Pande et al. Meeting real-time requirements for a low bitrate multimedia encoding framework
CN114071188A (en) Method, apparatus and computer readable storage medium for processing video data
Forster et al. The effect of image compression on automotive optical flow algorithms
Xu et al. Neural adaptive transport framework for internet-scale interactive media streaming services
Ge et al. 3RE-Net: Joint Loss-REcovery and Super-REsolution Neural Network for REal-Time Video
Fu et al. 3DAttGAN: A 3D Attention-Based Generative Adversarial Network for Joint Space-Time Video Super-Resolution
Liu et al. An Improvement for View Synthesis Optimization Algorithm
Deng et al. STDF: Spatio-Temporal Deformable Fusion for Video Quality Enhancement on Embedded Platforms

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant