CN102682454A - Method and device for tracking region of interest in video - Google Patents
- Publication number: CN102682454A
- Application number: CN201210132913XA
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a method and a device for tracking a region of interest (ROI) in a video. The method comprises: obtaining motion vectors of pixels or macroblocks in the current frame; determining a moving speed distribution parameter of the ROI according to the motion vectors; determining an ROI scaling parameter according to the state of the ROI in a reference frame; using the moving speed distribution parameter and the scaling parameter of the ROI to perform state transition on particles obtained by sampling in the current frame; and determining the position and size of the ROI in the current frame according to the particles after the state transition. The method and the device can use motion vector information that already exists in a compressed code stream, or that is generated during encoding, to guide the state transition of the particles. This reduces the number of particles required for tracking while preserving the tracking effect, and thus reduces the complexity of the tracking processing.
Description
Technical Field
The invention relates to the technical field of video processing, and in particular to a method and a device for tracking a region of interest during video processing.
Background
With the popularization and development of communication technology, video services such as mobile television, video conferencing, and video surveillance have developed rapidly. Users access these services through a variety of terminals and access modes, and the diversity of user terminals and the complexity of the network environment make the efficient transmission of video content a major challenge in designing a video service system.
Currently, SVC (scalable video coding) technology can transmit video content efficiently to some extent. In SVC, sub-streams with multiple bit rates, resolutions, and frame rates are encoded into a single code stream, and a transmission node can generate an adapted code stream by a simple extraction operation according to the network conditions and the requirements of the user or user equipment. Compared with single-stream technology, SVC provides a code stream that is scalable in space, time, and quality; that is, sub-streams can be extracted from it. These sub-streams can meet the requirements of the network transmission rate and of end users with respect to the spatial resolution, temporal resolution, and signal-to-noise ratio of the video, so SVC allows a video stream to adapt better to various network environments and user terminals.
In SVC, the coding layer that provides the lowest quality is called the BL (base layer), and a coding layer that enhances the spatial resolution, temporal resolution, or signal-to-noise ratio is called an EL (enhancement layer). Spatial scalability uses layered coding, exploiting motion, texture, and residual information between layers; temporal scalability uses the hierarchical bi-directional predictive frame (Hierarchical B) coding technique; and signal-to-noise ratio scalability may use CGS (coarse grain quality scalability) or MGS (medium grain quality scalability).
SVC also supports ROI (region of interest) coding. An ROI generally refers to a region of a video frame that contains an object with clear high-level semantic meaning to the viewer, such as a person or an object. While the user is browsing a video, if the display of the device is small or the available bandwidth decreases, the definition of the region of interest can be preserved as far as possible so that the viewing experience is not affected. For example, when the access bandwidth is insufficient to transmit the code streams of the base layer and the enhancement layer, part of the non-interest region may be dropped and only the code streams of the base layer and the ROI transmitted. This makes full use of the available bandwidth, maintains the video quality to a certain extent, and protects the subjective experience of the user.
To apply ROI coding in these applications, the position and size of the ROI in each frame of the video must be determined; video tracking techniques are generally used for this purpose.
The following describes the implementation of the presently adopted video tracking technique for determining ROIs in frames of video.
One currently used ROI tracking method is based on the particle filter algorithm. Specifically, a tracked region is represented as a particle (rectangular, elliptical, etc.) whose state may include, for example, the motion speed, motion direction, and region size. During tracking, a certain number of particles are generated in the current frame by importance sampling and are weighted according to their correlation with the particles of the tracked region in the reference frame; the weighted result gives the stable state of the particles in the current frame and hence the region to be tracked in the current frame.
Correspondingly, a specific processing mode for obtaining the region to be tracked in the current frame based on the particle filter algorithm includes:
(1) Initialization: set $k = 0$ (the initial time) and draw $N$ samples $x_0^{(i)}$, $i = 1, \dots, N$, from $p(x_0)$, where $p(x_0)$ is the initial probability distribution of the target; specifically, it may be set to a Gaussian distribution whose mean is the position and size of the target in the first frame.
(2) Importance sampling: draw $\tilde{x}_k^{(i)} \sim q(x_k \mid x_{0:k-1}^{(i)}, z_{1:k})$ and let $\tilde{x}_{0:k}^{(i)} = (x_{0:k-1}^{(i)}, \tilde{x}_k^{(i)})$, $i = 1, \dots, N$, where $x_k^{(i)}$ denotes the state of the $i$-th particle at time $k$, $x_{0:k}^{(i)}$ denotes the state of the particle from time 0 (the initial time) to time $k$, $z_{1:k}$ denotes the observations of the target (generally the color histogram of the tracked target) from time 1 (the initial tracking time) to time $k$, and $q(x_k \mid x_{0:k-1}^{(i)}, z_{1:k})$ is the importance function, that is, an estimate of the state distribution of the particle in frame $k$ given the observations of frames 1 to $k$ and the states of the $i$-th particle up to frame $k-1$.
(3) Weight calculation: $\omega_k^{(i)} = \omega_{k-1}^{(i)} \, \dfrac{p(z_k \mid \tilde{x}_k^{(i)}) \, p(\tilde{x}_k^{(i)} \mid x_{k-1}^{(i)})}{q(\tilde{x}_k^{(i)} \mid x_{0:k-1}^{(i)}, z_{1:k})}$. If the one-step state transition distribution is used as the importance function, the equation simplifies to $\omega_k^{(i)} = \omega_{k-1}^{(i)} \, p(z_k \mid \tilde{x}_k^{(i)})$, where $p(z_k \mid \tilde{x}_k^{(i)})$ is the observation model, i.e. the probability that particle $\tilde{x}_k^{(i)}$ is the tracked target, and $p(x_k \mid x_{k-1})$ is the state transition model, i.e. the probability distribution of the target moving from frame $k-1$ to frame $k$.
(4) Weight normalization: $\tilde{\omega}_k^{(i)} = \omega_k^{(i)} / \sum_{j=1}^{N} \omega_k^{(j)}$, $i = 1, \dots, N$.
(5) Resampling: copy or discard the samples $\tilde{x}_{0:k}^{(i)}$ according to the size of their normalized weights $\tilde{\omega}_k^{(i)}$ to obtain $N$ samples $x_{0:k}^{(i)}$ that approximately follow the distribution $p(x_{0:k} \mid z_{1:k})$, and let $\omega_k^{(i)} = \tilde{\omega}_k^{(i)} = 1/N$, $i = 1, \dots, N$.
(6) Output: the output of the algorithm is the particle set $\{x_{0:k}^{(i)} : i = 1, \dots, N\}$, with which the posterior probability and the expectation of $x_{0:k}$ can be approximated:
posterior probability: $\hat{p}(x_{0:k} \mid z_{1:k}) = \dfrac{1}{N} \sum_{i=1}^{N} \delta_{x_{0:k}^{(i)}}(dx_{0:k})$;
expectation of $x_{0:k}$: $E(x_{0:k}) = \dfrac{1}{N} \sum_{i=1}^{N} x_{0:k}^{(i)}$.
(7) Set k = k + 1 and repeat steps (2) to (6).
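The steps above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the transition model itself is used as the importance function (the simplified weight in step (3)) and assuming multinomial resampling; the function name `bootstrap_particle_filter`, its callback parameters, and the one-dimensional toy model are hypothetical and do not come from the patent.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles, init_sampler,
                              transition_sampler, likelihood, rng):
    """Return one state estimate per observation using a bootstrap (SIR) filter."""
    # (1) Initialization: draw N samples from p(x_0).
    particles = init_sampler(n_particles, rng)
    estimates = []
    for z in observations:
        # (2) Importance sampling, using the transition model as importance function.
        particles = transition_sampler(particles, rng)
        # (3) Weights: after resampling all previous weights equal 1/N, so the
        #     new weight is proportional to the observation likelihood alone.
        weights = likelihood(z, particles)
        # (4) Normalization.
        weights = weights / weights.sum()
        # (5) Resampling: copy or discard particles according to their weights.
        keep = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[keep]
        # (6) Output: the expectation is approximated by the particle mean.
        estimates.append(particles.mean(axis=0))
    return np.array(estimates)

# Toy 1-D example (hypothetical model): the state grows by 1 per step and is
# observed with Gaussian noise of standard deviation 0.5.
rng = np.random.default_rng(0)
true_states = np.arange(1.0, 21.0)
observations = true_states + rng.normal(0.0, 0.5, size=true_states.size)
estimates = bootstrap_particle_filter(
    observations,
    n_particles=500,
    init_sampler=lambda n, r: r.normal(0.0, 1.0, size=n),
    transition_sampler=lambda x, r: x + 1.0 + r.normal(0.0, 0.5, size=x.shape),
    likelihood=lambda z, x: np.exp(-0.5 * ((z - x) / 0.5) ** 2),
    rng=rng,
)
```

In this sketch the cost of every frame grows linearly with `n_particles`, which is the complexity problem the following paragraph refers to.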
In the above implementation, a stable tracking effect requires a large number of particles, and the amount of calculation required for tracking grows with the number of particles, which greatly increases the processing complexity.
Disclosure of Invention
The invention aims to provide a method and a device for tracking a region of interest in a video, which aim to reduce the processing complexity in the tracking process on the premise of ensuring the tracking effect.
The purpose of the invention is realized by the following technical scheme:
a method of region of interest tracking in a video, comprising:
acquiring a motion vector of a pixel or a macro block in a current frame, determining a moving speed distribution parameter of an ROI (region of interest) according to the motion vector, and determining an ROI scaling parameter according to state information of the ROI in a reference frame;
and performing state transition processing on the particles obtained by sampling in the current frame by using the moving speed distribution parameter and the scaling parameter of the ROI, and determining the position and the size of the ROI of the current frame according to the particles after the state transition.
Optionally, the step of determining the moving speed distribution parameter of the ROI includes:
determining, according to the motion vector of each pixel or macroblock in the current frame, the position in the reference frame of the corresponding reference pixel or macroblock; selecting, from the motion vectors of the pixels or macroblocks in the current frame, those whose reference pixel or macroblock is located within the ROI of the reference frame; and determining the moving speed distribution parameter of the ROI according to the selected motion vectors;
Optionally, the step of determining the ROI scaling parameter includes:
determining the corresponding pixels or macroblocks in the current frame according to the selected motion vectors, and determining the ROI scaling parameter according to these pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
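Optionally, the moving speed distribution parameter of the ROI includes: $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^i - MV^{ROI}) / |G|$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the pixels or macroblocks in the current frame whose reference pixels or macroblocks are located within the ROI of the reference frame, $MV^i$ is the $i$-th motion vector in $G$, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame to the current frame.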
Optionally, the step of performing state transition processing on the particles sampled and obtained in the current frame includes:
determining the velocities $v_1$ and $v_2$ of the particle state transition according to the moving speed distribution parameter of the ROI, including: generating a random number $\mu$ from the uniform distribution on [0, 1]; if $\mu < \alpha$, letting [equation missing in source] and setting a flag variable [value missing in source]; otherwise, selecting, according to the distribution $P(MV^{ROI})$, one element of the motion vector set $G$ collected in the previous module and taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$ and $v_2$, that is, letting $v_1 = MV'_x$ and $v_2 = MV'_y$, and setting the flag variable [value missing in source]; $\alpha$ is a state transition parameter whose initial value is preset and which is updated in the subsequent particle update process according to [equation missing in source]; $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
determining a particle scaling parameter $\eta$ according to the ROI scaling parameter, including: generating a random number $\gamma$ from the uniform distribution on [0, 1]; if $\gamma < \beta$, letting $\eta = \rho$, otherwise letting $\eta = 1$, where $\beta$ is a preset parameter describing the degree of change of the target size and $\rho$ is the ROI scaling parameter;
performing particle state transition processing according to the particle state transition velocities and the particle scaling parameter, where the result of the state transition of the $n$-th particle in frame $k$ includes: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$ and $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; and the semi-axes of the ellipse corresponding to the particle at that position, namely the major semi-axis $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$ and the minor semi-axis $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$, where $\epsilon_x$ and $\epsilon_y$ are random variables following $N(0, 0.2)$, $\epsilon_{v_x}$ and $\epsilon_{v_y}$ are random variables following $N(0, 0.25)$, and $\epsilon_a$ and $\epsilon_b$ are random variables following $N(0, 0.1)$.
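The state transition described above may be sketched as follows. Several points are assumptions rather than statements of the patent: the branch taken when the random number is below α is assumed to reuse the particle's own previous velocity (the corresponding equation is not legible in this text), the second argument of N(0, ·) is assumed to be a variance, and the returned flag `from_mv` is a hypothetical stand-in for the flag variable whose values are likewise missing. The function name and the dictionary layout of a particle are illustrative only.

```python
import numpy as np

def transition_particle(prev, G, alpha, beta, rho, rng):
    """Propagate one particle state (dict with keys x, y, vx, vy, a, b) by one frame."""
    if rng.uniform() < alpha:
        # Assumption: the particle keeps its own previous velocity in this branch.
        v1, v2 = prev["vx"], prev["vy"]
        from_mv = False
    else:
        # Drawing one element of G uniformly at random samples from P(MV^ROI).
        v1, v2 = G[rng.integers(len(G))]
        from_mv = True

    # Scale the ellipse by rho with probability beta, otherwise keep its size.
    eta = rho if rng.uniform() < beta else 1.0

    def noise(variance):
        # Assumption: N(0, s) is read as zero mean with variance s.
        return rng.normal(0.0, np.sqrt(variance))

    new = {
        "x": prev["x"] + v1 + noise(0.2),
        "y": prev["y"] + v2 + noise(0.2),
        "vx": v1 + noise(0.25),
        "vy": v2 + noise(0.25),
        "a": eta * prev["a"] + noise(0.1),
        "b": eta * prev["b"] + noise(0.1),
    }
    return new, from_mv

# Usage with hypothetical values: alpha = 0 always takes a motion vector from G,
# beta = 1 always applies the scaling parameter rho.
rng = np.random.default_rng(1)
prev_state = {"x": 10.0, "y": 20.0, "vx": 0.0, "vy": 0.0, "a": 8.0, "b": 4.0}
G = [(3.0, -1.0)]
new_state, from_mv = transition_particle(prev_state, G, alpha=0.0, beta=1.0,
                                         rho=1.5, rng=rng)
```

Because the velocity is drawn from motion vectors that already point at where the ROI content moved, the particles concentrate near plausible positions, which is why fewer of them are expected to be needed.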
Optionally, the step of determining the ROI of the current frame according to the state-transferred particles includes:
performing the statistical processing of the color histogram of each particle area on the particles subjected to the state transition processing, and performing particle updating processing according to the statistical processing result of the color histogram of each particle area;
the ROI position size is calculated based on the result obtained by the particle update process.
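As an illustration of the color histogram statistics, the following sketch computes a normalized color histogram over the elliptical region of a particle and compares it with a target histogram. The text does not state which similarity measure is used; the Bhattacharyya coefficient below is an assumption chosen because it is common in color-based particle tracking, and the function names, the 8-bin-per-channel quantization, and the synthetic image are hypothetical.

```python
import numpy as np

def ellipse_histogram(image, cx, cy, a, b, bins=8):
    """Normalized joint RGB histogram of the pixels inside an axis-aligned ellipse.

    image is an (H, W, 3) uint8 array; bins is assumed to divide 256.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    mask = ((xs - cx) / a) ** 2 + ((ys - cy) / b) ** 2 <= 1.0
    pixels = image[mask]                                  # shape (n, 3)
    idx = (pixels // (256 // bins)).astype(int)           # per-channel bin index
    flat = (idx[:, 0] * bins + idx[:, 1]) * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / max(hist.sum(), 1.0)

def bhattacharyya(p, q):
    """Similarity in [0, 1] between two normalized histograms (assumed measure)."""
    return float(np.sum(np.sqrt(p * q)))

# Synthetic frame: a red square on a black background.
frame = np.zeros((64, 64, 3), dtype=np.uint8)
frame[20:40, 20:40] = (255, 0, 0)

target_hist = ellipse_histogram(frame, cx=30, cy=30, a=5, b=5)
on_target = bhattacharyya(target_hist, ellipse_histogram(frame, 30, 30, 5, 5))
off_target = bhattacharyya(target_hist, ellipse_histogram(frame, 10, 10, 5, 5))
```

A particle whose region matches the target receives a similarity close to 1 and one that covers only background receives a similarity close to 0; such a similarity can then be turned into a particle weight during the particle update.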
An apparatus for region of interest tracking in a video, comprising:
a moving speed distribution parameter and scaling parameter determining module, for obtaining the motion vector of the pixel or macro block in the current frame, and determining the moving speed distribution parameter of the ROI according to the motion vector, and also for determining the scaling parameter according to the state information of the ROI in the reference frame;
the particle state transition module is used for carrying out state transition processing on the particles sampled and obtained in the current frame by utilizing the moving speed distribution parameters and the scaling parameters of the ROI determined by the moving speed distribution parameter and scaling parameter determining module;
and the ROI determining module is used for determining the position and the size of the ROI of the current frame according to the particles subjected to state transfer by the particle state transfer module.
Optionally, the module for determining the moving speed distribution parameter and the scaling parameter specifically includes:
the motion vector acquisition module is used for determining the position of a reference pixel or a macro block corresponding to the pixel or the macro block in the current frame in the reference frame according to the motion vector of the pixel or the macro block in the current frame, and selecting the motion vector of the pixel or the macro block in the current frame corresponding to the reference pixel or the macro block of which the position is positioned in the ROI in the reference frame from the motion vector of the pixel or the macro block in the current frame;
the parameter determining module is used for determining the moving speed distribution parameter and the zooming parameter of the ROI according to the motion vector of the pixel or the macro block in the current frame selected and obtained by the motion vector obtaining module; and determining the corresponding pixel or macro block in the current frame according to the selected motion vector of the pixel or macro block in the current frame, and determining the ROI scaling parameter according to the pixel or macro block in the current frame and the pixel or macro block in the corresponding reference frame.
Optionally, the parameter determination module includes:
an ROI moving speed distribution parameter estimation module, configured to estimate the moving speed distribution parameter of the ROI, where the estimated moving speed distribution parameter of the ROI includes: $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^i - MV^{ROI}) / |G|$, where $\delta$ is the Dirac function, $G$ is the set of motion vectors of the pixels or macroblocks in the current frame whose reference pixels or macroblocks are located within the ROI of the reference frame, $MV^i$ is the $i$-th motion vector in $G$, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame to the current frame;
an ROI scaling parameter estimation module, configured to estimate the ROI scaling parameter, where the estimated ROI scaling parameter includes: [equation missing in source], where $a$ and $b$ are affine parameters solved by the least squares method combined with a four-parameter transformation model, and the four-parameter transformation model is established according to the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
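A least squares fit of a four-parameter transformation model might look like the sketch below. The model form x' = a·x − b·y + c, y' = b·x + a·y + d, the mapping direction (reference frame to current frame), and the scale ρ = sqrt(a² + b²) are assumptions: the formula for the scaling parameter is not legible in this text, and the square-root expression is simply the scale factor of such a similarity transform. The function name and the sample points are hypothetical.

```python
import numpy as np

def fit_four_parameter(ref_pts, cur_pts):
    """Least squares fit of x' = a*x - b*y + c, y' = b*x + a*y + d.

    ref_pts and cur_pts are (n, 2) arrays of matching block positions in the
    reference frame and the current frame. Returns (a, b, c, d).
    """
    ref_pts = np.asarray(ref_pts, dtype=float)
    cur_pts = np.asarray(cur_pts, dtype=float)
    x, y = ref_pts[:, 0], ref_pts[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    # One row per x' equation, then one row per y' equation.
    A = np.vstack([
        np.column_stack([x, -y, ones, zeros]),
        np.column_stack([y, x, zeros, ones]),
    ])
    rhs = np.concatenate([cur_pts[:, 0], cur_pts[:, 1]])
    (a, b, c, d), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return a, b, c, d

# Block positions in the reference frame and where they appear in the current
# frame under a = 1.0, b = 0.5, c = 2, d = 1 (hypothetical values).
ref = [(0, 0), (1, 0), (0, 1), (1, 1)]
cur = [(2.0, 1.0), (3.0, 1.5), (1.5, 2.0), (2.5, 2.5)]
a, b, c, d = fit_four_parameter(ref, cur)
rho = float(np.hypot(a, b))  # assumed scaling parameter of the ROI
```

With real motion vectors the point pairs come from the selected blocks: the block position in the current frame and the position its motion vector points to in the reference frame.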
Optionally, the particle state transition module includes:
a particle position and velocity transfer module, configured to determine the velocities $v_1$ and $v_2$ of the particle state transition according to the moving speed distribution parameter of the ROI obtained by the ROI moving speed distribution parameter estimation module, including: generating a random number $\mu$ from the uniform distribution on [0, 1]; if $\mu < \alpha$, letting [equation missing in source] and setting a flag variable [value missing in source]; otherwise, selecting, according to the distribution $P(MV^{ROI})$, one element of the motion vector set $G$ collected in the previous module and taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$ and $v_2$, that is, letting $v_1 = MV'_x$ and $v_2 = MV'_y$, and setting the flag variable [value missing in source]; $\alpha$ is a state transition parameter whose initial value is preset and which is updated in the subsequent particle update process according to [equation missing in source]; $s_{k-1}^n\{v_x\}$ is the $v_x$ component of particle $n$ in frame $k-1$, and $s_{k-1}^n\{v_y\}$ is the $v_y$ component of particle $n$ in frame $k-1$;
a particle size transfer module, configured to determine a particle scaling parameter $\eta$ according to the ROI scaling parameter obtained by the ROI scaling parameter estimation module, including: generating a random number $\gamma$ from the uniform distribution on [0, 1]; if $\gamma < \beta$, letting $\eta = \rho$, otherwise letting $\eta = 1$, where $\beta$ is a preset parameter describing the degree of change of the target size and $\rho$ is the ROI scaling parameter;
a particle state transition result determining module, configured to perform particle state transition processing according to the particle state transition velocities determined by the particle position and velocity transfer module and the particle scaling parameter determined by the particle size transfer module, where the result of the state transition of the $n$-th particle in frame $k$ includes: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$ and $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; and the semi-axes of the ellipse corresponding to the particle at that position, namely the major semi-axis $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$ and the minor semi-axis $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$, where $\epsilon_x$ and $\epsilon_y$ are random variables following $N(0, 0.2)$, $\epsilon_{v_x}$ and $\epsilon_{v_y}$ are random variables following $N(0, 0.25)$, and $\epsilon_a$ and $\epsilon_b$ are random variables following $N(0, 0.1)$.
Optionally, the ROI determination module comprises:
each particle area color histogram statistical processing module is used for executing each particle area color histogram statistical processing on the particles after the state transition processing;
the particle updating module is used for performing particle updating processing according to the statistical processing result of the color histogram of each particle area;
and the ROI position size calculation module is used for calculating the ROI position size according to the result obtained by the particle updating processing.
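The last two modules can be illustrated with a short sketch, under the assumption that the ROI position and size are taken as the weighted mean of the particle states and that the particle update uses systematic resampling. Neither choice is stated in this text, and the function names and sample values are hypothetical.

```python
import numpy as np

def estimate_roi(states, weights):
    """Weighted mean of particle states; columns are assumed to be x, y, a, b."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(states, dtype=float)

def systematic_resample(weights, rng):
    """Return indices of the particles that survive systematic resampling."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(w)
    # One random offset, then n evenly spaced positions in [0, 1).
    positions = (rng.uniform() + np.arange(n)) / n
    cumulative = np.cumsum(w)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)

states = [(10.0, 10.0, 5.0, 3.0),
          (20.0, 10.0, 5.0, 3.0),
          (30.0, 10.0, 5.0, 3.0)]
roi = estimate_roi(states, weights=[1.0, 2.0, 1.0])

rng = np.random.default_rng(0)
survivors = systematic_resample([0.0, 1.0, 0.0], rng)
```

In the usage above the ROI estimate is pulled toward the middle particle because it carries half of the total weight, and after resampling only that particle is copied.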
It can be seen from the above technical solutions that, the ROI tracking technology provided in the embodiments of the present invention can use motion vector information existing in a compressed code stream or generated during encoding to guide a particle state transition process, so as to reduce the number of particles required in a tracking process under a condition that a tracking effect is ensured, thereby reducing a complexity of tracking processing, and obtaining a better tracking effect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without any inventive work.
FIG. 1 is a schematic process diagram of a method according to an embodiment of the present invention;
FIG. 2 is a first schematic structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 3 is a second schematic structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 4 is a third schematic structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 5 is a fourth schematic structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 6 is a MV distribution histogram provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of a particle state transition provided by an embodiment of the present invention;
FIG. 8 is a fifth schematic structural diagram of an apparatus according to an embodiment of the present invention;
FIG. 9 is a particle region color histogram provided by an embodiment of the present invention;
FIG. 10 is a schematic diagram of a particle update process according to an embodiment of the present invention;
FIG. 11 is a first diagram illustrating an application effect of the embodiment of the present invention;
FIG. 12 is a second schematic view illustrating an application effect of the embodiment of the present invention;
FIG. 13 is a first schematic diagram of an application environment of an embodiment of the present invention;
FIG. 14 is a second schematic diagram of an application environment of an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
The embodiment of the invention provides a method for tracking a region of interest in a video, and the specific implementation manner of the method is shown in fig. 1, and the method can comprise the following steps:
the frame preceding the current frame in the coding structure is referred to as the reference frame, and the ROI position in the reference frame is known;
specifically, in this step, the process of determining the moving speed distribution parameter of the ROI may specifically include, but is not limited to:
first, determining, according to the motion vector of each pixel or macroblock in the current frame, the position in the reference frame of the corresponding reference pixel or macroblock; then, selecting the reference pixels or macroblocks whose positions are within the ROI of the reference frame, and acquiring the motion vectors of the pixels or macroblocks in the current frame that correspond to them; and finally, determining the moving speed distribution parameter of the ROI according to the selected motion vectors;
the step of determining ROI scaling parameters from the state information of the ROI in the reference frame may include, but is not limited to:
firstly, determining the corresponding pixels or macro blocks in the current frame according to the selected motion vectors of the pixels or macro blocks in the current frame; the ROI scaling parameters may then be determined from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
The calculation method of the moving speed distribution parameter of the ROI and the ROI scaling parameter may include, but is not limited to:
the moving speed distribution parameter of the ROI is $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^{i} - MV^{ROI}) / |G|$, where δ is the Dirac function, G is the set of motion vectors of the pixels or macroblocks in the current frame whose reference pixels or macroblocks are located in the ROI of the reference frame, $MV^{i}$ is the i-th motion vector in G, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame (i.e. the reference frame) to the current frame;
the ROI scaling parameter ρ is computed from a and b, where a and b are affine parameters solved by combining the least-squares method with a four-parameter transformation model, and the four-parameter transformation model is established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
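Because the moving speed distribution is an equally weighted sum of Dirac impulses located at the vectors in G, evaluating it at a given vector amounts to counting matches, and drawing a velocity from it amounts to picking one element of G uniformly at random. The following Python fragment is a minimal sketch of that observation; the function names `p_mv_roi` and `sample_roi_velocity`, and the representation of motion vectors as integer tuples, are illustrative assumptions rather than part of the described method.

```python
import random


def p_mv_roi(mv, G):
    """Evaluate P(MV^ROI = mv) as the fraction of vectors in G equal to mv.

    G is a list of (MVx, MVy) tuples; duplicates are kept on purpose,
    because each one contributes 1/|G| to the distribution.
    """
    if not G:
        raise ValueError("G must contain at least one motion vector")
    return sum(1 for g in G if g == mv) / len(G)


def sample_roi_velocity(G, rng=random):
    """Draw one (MVx, MVy) according to P(MV^ROI).

    A uniform pick from G is equivalent to sampling the empirical
    distribution, since repeated vectors appear repeatedly in the list.
    """
    return rng.choice(G)
```

Keeping G as a list with duplicates, rather than a set, is what makes the uniform pick reproduce the distribution.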
the specific implementation manner of the step may include:
first, determining the speeds $v_1$, $v_2$ of the particle state transition according to the moving speed distribution parameter of the ROI, which includes: generating a random number μ from the uniform distribution on 0-1; if μ is less than α, setting $v_1 = s_{k-1}^{n}\{v_x\}$ and $v_2 = s_{k-1}^{n}\{v_y\}$; otherwise, selecting, according to the distribution $P(MV^{ROI})$, the two components $(MV'_x, MV'_y)$ of one element of the previously counted motion vector set G, and setting $v_1 = MV'_x$, $v_2 = MV'_y$; here α is a state transition parameter whose initial value is preset and whose value is updated in the subsequent particle update process, $s_{k-1}^{n}\{v_x\}$ is the $v_x$ component of particle n in frame k-1, $s_{k-1}^{n}\{v_y\}$ is the $v_y$ component of particle n in frame k-1, the current frame is the k-th frame, and the reference frame is the (k-1)-th frame;
then, determining a particle scaling parameter η according to the ROI scaling parameter, which includes: generating a random number γ from the uniform distribution on 0-1; if γ is less than β, setting η = ρ, otherwise setting η = 1, where β is a target size change degree parameter whose value is preset, and ρ is the ROI scaling parameter;
finally, performing the particle state transition processing according to the particle state transition speed and the particle scaling parameter, where the result obtained after the state transition of the n-th particle in the k-th frame includes: the position of the particle, ($s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$); the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the major semi-axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$; and the minor semi-axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; where $\epsilon_x$, $\epsilon_y$ are random variables obeying N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
specifically, the specific implementation manner of this step may include:
performing the statistical processing of the color histogram of each particle area on the particles subjected to the state transition processing, and performing particle updating processing according to the statistical processing result of the color histogram of each particle area;
the position and size of the ROI are calculated based on the result obtained by the particle update process.
The embodiment of the invention mainly utilizes the additional information generated in the code stream or during the coding to guide the state transition of the particles, thereby ensuring the tracking effect aiming at the ROI and reducing the processing complexity of the tracking process. Specifically, the invention utilizes MV information existing in compressed code stream or generated during coding to guide the particle state transfer process in the particle filter algorithm, thereby reducing the number of required particles under the condition of ensuring the tracking effect and further reducing the complexity of the tracking processing process; alternatively, the present invention can obtain a more stable tracking effect in the case of performing tracking processing using the same number of particles. In addition, the invention also utilizes the stability of the particle filter theory on noise to obtain the robustness of the tracking algorithm, and further ensures the tracking effect of the tracking technical scheme provided by the embodiment of the invention.
An embodiment of the present invention further provides a device for tracking a region of interest in a video, and a specific implementation structure of the device is shown in fig. 2, where the device may include:
(1) a moving speed distribution parameter and scaling parameter determining module 21, configured to obtain a motion vector of a pixel or a macroblock in a current frame, determine a moving speed distribution parameter of a region of interest ROI according to the motion vector, and determine a scaling parameter of the ROI according to state information of the ROI in a reference frame;
further, as shown in fig. 3, the moving speed distribution parameter and scaling parameter determining module 21 may specifically include:
a motion vector obtaining module 211, configured to determine, from the motion vector of each pixel or macroblock in the current frame, the position in the reference frame of the corresponding reference pixel or macroblock; to select the reference pixels or macroblocks whose positions lie within the ROI of the reference frame; and to obtain the motion vectors of the pixels or macroblocks in the current frame that correspond to the selected reference pixels or macroblocks;
a parameter determining module 212, configured to determine a moving speed distribution parameter of the corresponding ROI according to the motion vector of the pixel or the macroblock in the current frame selected and obtained by the motion vector obtaining module 211; determining a corresponding pixel or macro block in the current frame according to the selected motion vector of the pixel or macro block in the current frame, and determining a corresponding ROI (region of interest) scaling parameter according to the pixel or macro block in the current frame and the pixel or macro block in the corresponding reference frame;
the parameter determination module 212 may specifically include a ROI movement velocity distribution parameter estimation module 2121 and a ROI scaling parameter estimation module 2122, wherein:
an ROI moving speed distribution parameter estimation module 2121, configured to estimate the moving speed distribution parameter of the ROI, where the estimated moving speed distribution parameter of the ROI includes: $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^{i} - MV^{ROI}) / |G|$, where δ is the Dirac function, G is the set of motion vectors of the pixels or macroblocks in the current frame whose reference pixels or macroblocks are located in the ROI of the reference frame, $MV^{i}$ is the i-th motion vector in G, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame (i.e. the reference frame) to the current frame;
an ROI scaling parameter estimation module 2122, configured to estimate the ROI scaling parameter, where the estimated ROI scaling parameter ρ is computed from a and b, a and b being affine parameters solved by combining the least-squares method with a four-parameter transformation model, and the four-parameter transformation model being established from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
(2) A particle state transition module 22, configured to perform state transition processing on the particles sampled and obtained in the current frame by using the moving speed distribution parameter and the scaling parameter of the ROI determined by the moving speed distribution parameter and scaling parameter determining module 21;
further, referring to fig. 8, the corresponding particle state transition module may include:
a particle position and velocity transition module 221, configured to determine the speeds $v_1$, $v_2$ of the particle state transition according to the moving speed distribution parameter of the ROI obtained by the ROI moving speed distribution parameter estimation module 2121, which includes: generating a random number μ from the uniform distribution on 0-1; if μ is less than α, setting $v_1 = s_{k-1}^{n}\{v_x\}$ and $v_2 = s_{k-1}^{n}\{v_y\}$; otherwise, selecting, according to the distribution $P(MV^{ROI})$, the two components $(MV'_x, MV'_y)$ of one element of the motion vector set G counted in the previous module, and setting $v_1 = MV'_x$, $v_2 = MV'_y$; here α is a state transition parameter whose initial value is preset and whose value is updated in the subsequent particle update process, $s_{k-1}^{n}\{v_x\}$ is the $v_x$ component of particle n in frame k-1, and $s_{k-1}^{n}\{v_y\}$ is the $v_y$ component of particle n in frame k-1;
a particle size transfer module 222, configured to determine a particle scaling parameter η according to the ROI scaling parameter obtained by the ROI scaling parameter estimation module 2122, which may specifically include: generating a random number γ from the uniform distribution on 0-1; if γ is less than β, setting η = ρ, otherwise setting η = 1, where β is a target size change degree parameter whose value is preset, and ρ is the ROI scaling parameter;
a particle state transition result determining module 223, configured to perform particle state transition processing according to the particle state transition speed determined by the particle position and velocity transition module 221 and the particle scaling parameter determined by the particle size transfer module 222, where the result obtained after the state transition of the n-th particle in the k-th frame includes: the position of the particle, ($s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$); the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the major semi-axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$; and the minor semi-axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; where $\epsilon_x$, $\epsilon_y$ are random variables obeying N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
(3) An ROI determining module 23, configured to determine a position and a size of an ROI of the current frame according to the particle subjected to the state transition by the particle state transition module 22;
specifically, as shown in fig. 4, the ROI determining module may specifically include:
a particle region color histogram statistical processing module 231, configured to perform particle region color histogram statistical processing on the particles subjected to the state transition processing;
a particle update module 232, configured to perform particle update processing according to a result obtained by performing the color histogram statistical processing on each particle region by the particle region color histogram statistical processing module 231;
the ROI position size calculation module 233 is configured to calculate the ROI position size according to a result obtained by the particle update module 232 performing the particle update process.
In the device, the MV information existing in the compressed code stream or generated during coding is used for guiding the particle state transition process, so that the required particle number can be reduced under the condition of ensuring the tracking effect, or the more stable tracking effect can be obtained under the condition of adopting the same particle number for tracking processing.
That is to say, in order to improve the accuracy of ROI tracking and reduce the tracking complexity, the embodiment of the present invention employs an implementation manner that MV information generated in a corresponding encoding process or already existing in a code stream is used to guide state transition of particles in a particle filter algorithm, so as to obtain a more superior tracking effect.
For a better understanding of the embodiments of the present invention, reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings and described in the following description.
In the embodiment of the invention, the ROI is tracked by utilizing MV information generated in the encoding process or already existing in a code stream according to the particle filter theory. The ROI is obtained frame by frame, and the data of the position, the size and the like of the ROI in the current frame are obtained according to the information of the ROI in the reference frame. The corresponding tracking procedure may mainly include:
firstly, obtaining a reference frame of a current frame from a buffer memory of the reference frame, and estimating the speed distribution and the scaling parameters of the ROI moving to the current frame by an ROI moving speed distribution parameter and scaling parameter estimation module according to the state information of the ROI in the reference frame and the MV of pixels (or macro blocks);
secondly, estimating the position and size information distribution of the ROI in the current frame by a particle state transfer module;
thirdly, the color histogram of each particle area obtained by statistics of the color histogram statistical module of each particle coverage area is utilized, the information distribution of the ROI obtained by the particle state transition module is further corrected by the particle updating module, the position size information of the ROI in the current frame is calculated and output, and the position size information and the current frame are stored into a reference frame for caching so as to estimate the ROI information in the subsequent frame.
Specifically, as shown in fig. 4, the apparatus provided in the embodiment of the present invention may specifically include:
(1) reference frame acquisition module
This module is used to obtain the reference frame of the current frame, and is the same as the method for obtaining the reference frame in the conventional encoding method, which is not shown in the figure.
(2) ROI moving speed distribution parameter and scaling parameter estimation module
The module is a moving speed distribution parameter and scaling parameter determining module, which is used to obtain moving speed distribution parameter estimation and scaling parameter estimation of the ROI region, as shown in fig. 5, the module may further include an MV obtaining module, an ROI moving speed distribution parameter estimation module and an ROI scaling parameter estimation module, and the processing functions performed by each module will be described below:
(21) an MV obtaining module, i.e. the motion vector obtaining module, for collecting the MVs of the pixels whose reference pixels are located in the ROI region of the reference frame (or of macroblocks; wherever pixels appear in the following description they can be replaced by macroblocks), where each MV is a two-dimensional vector $(MV_x, MV_y)$, i.e. the components of the MV of a pixel in the x and y directions;
specifically, as is known from the H.264/SVC coding standard, the position of the reference pixel in the reference frame can be obtained from the MV of the pixel in the current frame; therefore, if the reference pixel of a pixel in the current frame is located in the ROI region of the reference frame, the MV of that pixel is obtained. The pixels in the current frame whose reference pixels are located in the ROI region of the reference frame are recorded as a set G, the number of elements in G is |G|, the i-th element is $g_i$, and the MV of $g_i$ is recorded as $MV^{i}$, i = 0, 1, …, |G|-1; further, the set of the MVs of the elements in G is recorded as M;
for example, as shown in fig. 6, the reference pixels associated with MV1, MV2, and MV3 are located in the ROI region of the reference frame, so the pixels in the current frame corresponding to MV1, MV2, and MV3 are included in the statistics, i.e. these pixels are recorded in the set G; the reference pixel associated with MV0 is located outside the ROI in the reference frame, so the pixel in the current frame corresponding to MV0 is not included in the statistics.
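The selection rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the ROI in the reference frame is taken to be an axis-aligned ellipse `(cx, cy, a, b)`, the motion field is a plain dictionary, and the reference position is assumed to be the current position plus the MV. The names `inside_ellipse` and `collect_roi_motion_vectors` are hypothetical.

```python
def inside_ellipse(x, y, cx, cy, a, b):
    """True if (x, y) lies inside or on the axis-aligned ellipse."""
    return ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0


def collect_roi_motion_vectors(motion_field, roi):
    """Build the pixel set G and the matching motion vector set M.

    motion_field: dict mapping a current-frame position (x, y) to (MVx, MVy).
    roi: (cx, cy, a, b), the ROI ellipse in the reference frame.
    Assumption: reference position = current position + MV.
    """
    cx, cy, a, b = roi
    G, M = [], []
    for (x, y), (mvx, mvy) in motion_field.items():
        ref_x, ref_y = x + mvx, y + mvy
        if inside_ellipse(ref_x, ref_y, cx, cy, a, b):
            G.append((x, y))      # pixel whose reference lies in the ROI
            M.append((mvx, mvy))  # its motion vector
    return G, M
```

The same function works for macroblocks if each key is taken to be a macroblock position instead of a pixel position.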
(22) A moving speed distribution estimation module, for computing the distribution histogram of -MV over the pixels in the current frame whose reference pixels are located in the ROI region of the reference frame, where -MV is obtained by negating each of the two components of MV; that is, the -MV distribution histogram of the pixels in the set G obtained by the MV obtaining module is normalized by the L1 norm and used as the estimate of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, which can be recorded as the moving speed distribution parameter $P(MV^{ROI})$.
FIG. 6 illustrates only the statistics for $P(MV^{ROI})$, where '+' represents a pixel; the reference pixel corresponding to the end of the dotted arrow is not located in the ROI region of the reference frame, so its motion vector $MV_0$ is not included in the statistics; other motion vectors, such as $MV_1$ and $MV_2$, are included in the statistics.
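As a rough illustration of the histogram step, the sketch below negates each motion vector in M, counts occurrences, and divides by the total count, which is the L1 norm of the raw histogram. The function name `roi_velocity_histogram` and the dictionary representation of the histogram are assumptions made for this example.

```python
from collections import Counter


def roi_velocity_histogram(M):
    """L1-normalised histogram of -MV over the motion vectors in M.

    M is a list of integer (MVx, MVy) tuples. The result maps each
    distinct velocity (-MVx, -MVy) to its estimated probability.
    """
    if not M:
        raise ValueError("M is empty: no reference pixel fell inside the ROI")
    negated = [(-mvx, -mvy) for mvx, mvy in M]
    counts = Counter(negated)
    total = sum(counts.values())  # L1 norm of the raw histogram, equals len(M)
    return {v: c / total for v, c in counts.items()}
```

For sub-pixel motion vectors, one option is to keep them in their native integer units (for example quarter-pel units) so that equal vectors compare equal.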
(23) An ROI scaling parameter estimation module, configured to estimate the scaling parameter of the ROI region, where the scaling parameter indicates the scaling relationship between the ROI region in the reference frame and the ROI region in the current frame; specifically, the scaling parameter ρ may be obtained based on an affine transformation and is computed from a and b, where a and b are affine transformation parameters;
the manner of specifically obtaining the ROI scaling parameter may include:
let the i-th element $g_i$ in G have coordinates $(x_i, y_i)$, which together with the coordinates of its corresponding reference pixel form a point pair; a special case of the affine transformation, namely the four-parameter transformation model, is used as the motion transformation model of the target;
using the least squares criterion to estimate a uniform four-parameter transformation model of the ROI for all pixels in the set G, the following matrix A, B, C can be constructed, namely:
thus, the motion transformation model of the target can be written as C = B × A, and by the least-squares criterion $A = (B^{T}B)^{-1}B^{T}C$, where $B^{T}$ is the transpose of matrix B; since matrix B and matrix C are known quantities, the values of the affine transformation parameters a and b can be obtained, and the value ρ of the corresponding scaling parameter can then be calculated.
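The least-squares step can be illustrated with a short, dependency-free Python sketch. The concrete model and scale formula below are assumptions, since the matrices are not reproduced in the text: the four-parameter model is taken to be x' = a·x − b·y + c, y' = b·x + a·y + d, and the scale is taken to be ρ = sqrt(a² + b²). The helper names `solve_linear` and `fit_four_parameter_model` are hypothetical, and which frame plays the role of source or destination is left to the caller.

```python
import math


def solve_linear(mat, vec):
    """Solve mat * x = vec by Gauss-Jordan elimination with partial pivoting."""
    n = len(vec)
    aug = [row[:] + [vec[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: point pairs are degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_four_parameter_model(src, dst):
    """Fit x' = a*x - b*y + c, y' = b*x + a*y + d in the least-squares sense.

    src, dst: equally long lists of (x, y) point pairs.
    Returns ((a, b, c, d), rho) with rho = sqrt(a^2 + b^2) (assumed formula).
    """
    B, C = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        B.append([x, -y, 1.0, 0.0])
        C.append(xp)
        B.append([y, x, 0.0, 1.0])
        C.append(yp)
    # Normal equations (B^T B) A = B^T C, i.e. A = (B^T B)^{-1} B^T C.
    BtB = [[sum(r[i] * r[j] for r in B) for j in range(4)] for i in range(4)]
    BtC = [sum(r[i] * c for r, c in zip(B, C)) for i in range(4)]
    a, b, c, d = solve_linear(BtB, BtC)
    return (a, b, c, d), math.hypot(a, b)
```

At least two distinct point pairs are needed for the normal equations to be solvable; with noisy motion vectors, more pairs give a steadier estimate.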
(3) Particle transfer module
The module, i.e. the particle state transition module, may be configured to use the estimate $P(MV^{ROI})$ of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, and the estimate ρ of the scaling parameter, both produced by the ROI moving speed distribution parameter and scaling parameter estimation module, to guide the state transition of the particles so as to achieve a better tracking result.
Specifically, the purpose of the module is to perform preliminary estimation on ROI distribution of a current frame, as shown in fig. 7, each circle represents a particle (including information such as position and size of the ROI), a particle set in a reference frame represents state information distribution of the ROI in the reference frame, the particle set in the reference frame is subjected to particle state transition to obtain a particle set in the current frame, and the particle set in the current frame represents the state information distribution of the ROI in the current frame;
in the process of particle state transition, the information that the module needs to use includes: the particle states in the reference frame representing the ROI distribution, the state transition parameter α, $P(MV^{ROI})$, and ρ, where the initial value of α may be set to, but is not limited to, 0.5, and the value of α is automatically updated during subsequent processing.
Specifically, the same processing needs to be performed for each particle, and the description is given only by taking the state transition processing for the nth particle as an example, and as shown in fig. 8, the state transition processing procedure for the nth particle may include:
first, determining the speed of the particle state transition, which may specifically include: generating a random number μ from the uniform distribution on 0-1; if μ is less than α, setting $v_1 = s_{k-1}^{n}\{v_x\}$ and $v_2 = s_{k-1}^{n}\{v_y\}$ (where $s_{k-1}^{n}\{v_x\}$ is the $v_x$ component of particle n in frame k-1, and similar expressions below are read the same way); otherwise, according to the estimate $P(MV^{ROI})$ of the probability distribution of the moving speed of the ROI region from the reference frame to the current frame, selecting the two components $(MV'_x, MV'_y)$ of one element of the previously obtained motion vector set M, and setting $v_1 = MV'_x$, $v_2 = MV'_y$; here $v_1$, $v_2$ are temporary variables;
secondly, determining a particle scaling parameter, which may specifically include: generating a random number γ from the uniform distribution on 0-1; if γ is less than β, setting η = ρ, otherwise setting η = 1, where η is a temporary variable and β is a target size change degree parameter, a constant that controls the degree of change of the target size; if the size of the target changes quickly, β takes a relatively large value, otherwise a relatively small one; since the target size generally does not change drastically between two adjacent frames, β may be set to, but is not limited to, 0.2;
third, the particle state transition: based on the previously determined temporary variables $v_1$, $v_2$ and η, the particle state transition processing is performed; specifically, the state transition result $s_k^n$ of the n-th particle in the k-th frame is calculated as follows: the position of the particle in the x-axis direction $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$, the position of the particle in the y-axis direction $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$, the moving speed of the particle in the x-axis direction $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$, the moving speed of the particle in the y-axis direction $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$, the major semi-axis of the ellipse corresponding to the particle at that position $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$, and the minor semi-axis of that ellipse $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$, where $\epsilon_x$, $\epsilon_y$ are random variables obeying N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1). Here $s_k^n$ represents the n-th particle in the k-th frame and means: the ROI in the k-th frame appears, with probability $\pi_k^n$, at (x, y) with major and minor semi-axes a and b, respectively, and moves to the next frame at speed $(v_x, v_y)$;
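The three steps above can be sketched for a single particle as follows. This is a minimal Python illustration under stated assumptions, not a definitive implementation: the `Particle` dataclass, the function name `transition`, and the `velocities` list (candidate velocities distributed according to the estimated moving speed distribution) are hypothetical, and the second argument of N(0, ·) is assumed to be a variance, so its square root is passed to `gauss` as the standard deviation.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Particle:
    x: float   # ellipse centre, x
    y: float   # ellipse centre, y
    vx: float  # velocity, x component
    vy: float  # velocity, y component
    a: float   # major semi-axis
    b: float   # minor semi-axis


# Assumption: N(0, 0.2), N(0, 0.25), N(0, 0.1) give variances, not deviations.
STD_POS = math.sqrt(0.2)
STD_VEL = math.sqrt(0.25)
STD_AXIS = math.sqrt(0.1)


def transition(p, velocities, rho, alpha=0.5, beta=0.2, rng=random):
    """Move one particle from frame k-1 to frame k."""
    # Step 1: keep the particle's own velocity with probability alpha,
    # otherwise draw one guided by the motion vectors.
    if rng.random() < alpha:
        v1, v2 = p.vx, p.vy
    else:
        v1, v2 = rng.choice(velocities)
    # Step 2: apply the ROI scale with probability beta, else keep the size.
    eta = rho if rng.random() < beta else 1.0
    # Step 3: propagate the state and add Gaussian noise to each component.
    return Particle(
        x=p.x + v1 + rng.gauss(0.0, STD_POS),
        y=p.y + v2 + rng.gauss(0.0, STD_POS),
        vx=v1 + rng.gauss(0.0, STD_VEL),
        vy=v2 + rng.gauss(0.0, STD_VEL),
        a=eta * p.a + rng.gauss(0.0, STD_AXIS),
        b=eta * p.b + rng.gauss(0.0, STD_AXIS),
    )
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing tracking results across parameter settings.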
(4) color histogram statistical module for each particle region
This module is used to calculate an observation model of the particle filter theory, and the same operation needs to be performed for each particle, and the following description will take the observation model of the particle filter theory calculated for the particle n as an example.
Particle n in the current frame has state $s_k^n$, which can be mapped onto the image of the current frame as an ellipse centered at (x, y) with semi-axes a and b, recorded as $sb_k^n$, as shown in FIG. 9. This module computes the color histogram of the elliptical area, normalized by the L1 norm, which can be recorded as $P(sb_k^n)$; the calculation method is as follows:
where the video input is assumed to be in the YUV color space; $N_Y$, $N_U$, $N_V$ are the numbers of bins into which each color channel is divided (obtained by quantizing the YUV space; if each dimension of the YUV space is equally divided into 10 parts, then $N_Y$, $N_U$, $N_V$ all equal 10); for the t-th pixel falling within the ellipse, the bin indices are the numbers of the bins into which that pixel falls on each color channel (the bin numbers of each color channel increase from 0 up to 9); and the normalizing count is the number of pixels in the ellipse.
In particular, the observation model of the particle filter theory is computed in this module in the same way as in the prior art, except that the input to the module is the state transition result of the particles obtained according to the method of the invention.
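A color histogram of this kind might be computed as in the following Python sketch. Several details are assumptions made for illustration only: the frame is a nested list `frame[y][x]` of integer (Y, U, V) triples in the range 0 to 255, the three per-channel bin indices are flattened into one index as `(bY * N_U + bU) * N_V + bV`, and the function name `ellipse_color_histogram` is hypothetical.

```python
def ellipse_color_histogram(frame, cx, cy, a, b, bins=(10, 10, 10), max_value=256):
    """L1-normalised joint YUV histogram of the pixels inside an ellipse.

    frame[y][x] is an integer (Y, U, V) triple with values in [0, max_value).
    (cx, cy) is the ellipse centre, a and b its semi-axes along x and y.
    """
    n_y, n_u, n_v = bins
    hist = [0.0] * (n_y * n_u * n_v)
    count = 0
    height, width = len(frame), len(frame[0])
    # Scan only the bounding box of the ellipse, clipped to the frame.
    for py in range(max(0, int(cy - b)), min(height, int(cy + b) + 1)):
        for px in range(max(0, int(cx - a)), min(width, int(cx + a) + 1)):
            if ((px - cx) / a) ** 2 + ((py - cy) / b) ** 2 > 1.0:
                continue  # outside the ellipse
            y_val, u_val, v_val = frame[py][px]
            b_y = y_val * n_y // max_value
            b_u = u_val * n_u // max_value
            b_v = v_val * n_v // max_value
            hist[(b_y * n_u + b_u) * n_v + b_v] += 1.0
            count += 1
    if count == 0:
        raise ValueError("ellipse does not cover any pixel of the frame")
    return [h / count for h in hist]
```

With 10 bins per channel the histogram has 1000 entries, most of them zero for a small ellipse, so a sparse representation could be used instead if memory matters.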
(5) Particle update module
The module is used for performing corresponding particle update processing, and specifically is used for calculating the weight of particles (namely the output of the particle transfer module) which are obtained through state transfer and represent the ROI state distribution in the current frame, obtaining the distribution estimation of the ROI in the current frame through resampling, and updating the state transfer parameter alpha.
Specifically, as shown in fig. 10, the corresponding particle update process may include:
first, the weight of each particle is updated: $\pi_k^n = \pi_{k-1}^n \times \exp\left\{-\left\{1 - \sum_{j=1}^{N} \sqrt{P(sb_0)[j] \times P(sb_k^n)[j]}\right\}^{2} / \sigma\right\}$, where $\pi_k^n$ represents the weight of the n-th particle in the k-th frame (the probability that the region represented by the particle is the ROI region); N represents the number of bins of the color histogram used, with $N = N_Y N_U N_V$; $P(sb_k^n)[j]$ is the value of the j-th bin of the color histogram of particle n computed in the previous module (the module that computes the color histogram of each particle region); $P(sb_0)$ is the color histogram of the initialized ROI region; and the parameter σ is a constant, which may be set to, but is not limited to, 0.25. Each particle weight is then normalized: $\pi_k^n = \pi_k^n / \sum_{m=1}^{N} \pi_k^m$, where N in this sum is the number of particles.
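The weight update and normalization above can be sketched as a small Python function. This is a minimal sketch, assuming histograms are plain lists of equal length that each sum to 1; the function name `update_weights` and its argument names are hypothetical and do not come from the patent.

```python
import math


def update_weights(prev_weights, ref_hist, particle_hists, sigma=0.25):
    """Update and normalize particle weights from color histograms.

    prev_weights:   weights of the particles in frame k-1.
    ref_hist:       histogram of the initialized ROI region, P(sb_0).
    particle_hists: one histogram per particle in frame k, P(sb_k^n).
    sigma:          the constant from the text (0.25 is one possible value).
    """
    new_weights = []
    for w_prev, hist in zip(prev_weights, particle_hists):
        # Sum over bins of sqrt(P(sb_0)[j] * P(sb_k^n)[j]).
        similarity = sum(math.sqrt(p * q) for p, q in zip(ref_hist, hist))
        # Previous weight times exp(-(1 - similarity)^2 / sigma).
        new_weights.append(w_prev * math.exp(-((1.0 - similarity) ** 2) / sigma))
    total = sum(new_weights)
    if total <= 0.0:
        raise ValueError("all particle weights are zero")
    # Normalize so that the weights sum to 1 over all particles.
    return [w / total for w in new_weights]
```

A particle whose histogram matches the reference exactly keeps its previous weight before normalization, while a poorer match is scaled down by the exponential factor.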
Secondly, the resampling operation may specifically include: first, the effective particle number is calculated; it is then judged whether the effective particle number satisfies the resampling condition, which is defined in terms of N and λ, where N is the number of particles and λ is a constant parameter that can be preset. If the condition is satisfied, the resampling operation is performed; otherwise, no resampling is performed and the process goes directly to the third step, which updates the state transition parameter α. Specifically, the resampling operation includes: sampling from the particle set S of the current frame according to the particle weights and placing the samples in a new particle set S′, that is, each particle is added to S′ with a probability given by its weight; after resampling, the weight of each particle in S′ is updated, and S is set equal to S′, which completes the resampling operation.
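The resampling step can be sketched as follows. Several details are assumptions rather than statements of the patent: the effective particle number is taken to be the common estimate $1 / \sum_n (\pi_k^n)^2$, the resampling condition is taken to be that this number is below λN, and the weights after resampling are reset to 1/N. The names `effective_particle_count` and `resample_if_needed` are hypothetical.

```python
import random


def effective_particle_count(weights):
    """Common estimate of the effective particle number (assumption)."""
    return 1.0 / sum(w * w for w in weights)


def resample_if_needed(particles, weights, lam=0.5, rng=random):
    """Resample particles in proportion to their normalized weights.

    lam is the preset constant; the threshold lam * N is an assumed
    form of the resampling condition. Returns (particles, weights).
    """
    n = len(particles)
    if effective_particle_count(weights) >= lam * n:
        # Enough effective particles: keep the set unchanged.
        return particles, weights
    # Draw N particles with replacement, each chosen with probability
    # equal to its weight, to form the new set S'.
    new_particles = rng.choices(particles, weights=weights, k=n)
    # Assumption: weights are made uniform after resampling.
    return new_particles, [1.0 / n] * n
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can help when comparing tracking results between schemes.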
(6) ROI position and size calculation module
After the calculation in the above modules, the particle set of the current frame is obtained, and the position and size of the ROI in the current frame can now be calculated. The position of the ROI center is $(x, y) = \left(\sum_{n=1}^{N} \pi_k^n \times s_k^n\{x\},\ \sum_{n=1}^{N} \pi_k^n \times s_k^n\{y\}\right)$, and the size parameters of the ROI (its semi-axes) are $a = \sum_{n=1}^{N} \pi_k^n \times s_k^n\{a\}$ and $b = \sum_{n=1}^{N} \pi_k^n \times s_k^n\{b\}$.
Then, the particle set representing the ROI state in the current frame also needs to be stored in the reference frame buffer, so that it can be used in the subsequent tracking process.
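The weighted averages used to obtain the ROI position and size can be sketched in Python as follows. This is a minimal sketch, assuming each particle is a dictionary with keys `x`, `y`, `a` and `b` and that the weights are already normalized; the function name `estimate_roi` and the dictionary layout are hypothetical.

```python
def estimate_roi(particles, weights):
    """Weighted mean of particle states.

    particles: list of dicts with keys "x", "y", "a", "b".
    weights:   normalized particle weights (assumed to sum to 1).
    Returns (x, y, a, b) for the ROI in the current frame.
    """
    def weighted(key):
        # Sum over particles of weight times the state component.
        return sum(w * p[key] for w, p in zip(weights, particles))

    return weighted("x"), weighted("y"), weighted("a"), weighted("b")
```

For example, two particles at x = 0 and x = 4 with weights 0.25 and 0.75 give an estimated center x of 3.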
The implementation of the embodiment of the invention can obtain better tracking effect with lower complexity in the process of tracking the ROI.
Specifically, the ROI tracking scheme provided in the embodiment of the present invention was compared experimentally with the prior-art tracking scheme, using the coastguard and stephen sequences as input, with the following results: in the coastguard sequence, the prior-art scheme loses the tracking result from the 49th frame to the 56th frame, whereas the scheme provided by the embodiment of the invention shows a good tracking result; in the stephen sequence, the prior-art scheme begins to lose the tracking result at the 33rd frame and loses it completely from the 78th frame, whereas the scheme proposed by the embodiment of the present invention again shows a good tracking result.
In particle filter theory, one criterion for measuring the quality of a tracking scheme is how the number of effective particles varies with tracking time. Fig. 11 and fig. 12 show this variation for the coastguard and stephen sequences respectively, for both the prior-art scheme and the scheme proposed by the embodiments of the present invention. In both figures, the upper line corresponds to the proposed scheme and the lower line to the prior-art scheme. It can be seen that the effective particle number of the proposed scheme decreases more slowly with tracking time; that is, by this criterion the proposed scheme is significantly superior to the prior-art scheme.
An environment in which the method and apparatus for tracking a region of interest in a video according to an embodiment of the present invention can be applied will be described below.
The method and device for tracking a region of interest in a video provided by the embodiment of the invention can be applied to Scalable Video Coding (SVC) with ROI scalability.
Specifically, the structure of an SVC encoder with ROI scalability is shown in fig. 13, and the corresponding SVC provides support for region of interest coding. The ROI is often a region of the video frame that contains an object with a clear high level of semantic meaning to the viewer, such as a person, an object, etc. In the process of browsing a video by a user, if the display size of the device is small or the available bandwidth is reduced, in order not to affect the viewing experience of the video, it is necessary to maintain the definition of the region of interest as much as possible. As shown in fig. 13, the present invention provides a method and apparatus for tracking a region of interest in a video, which can be used in the process of acquiring a ROI region in an SVC encoder with ROI scalability, i.e. can be used to implement the module for acquiring an ROI region in an enhancement layer shown in fig. 13. The input of the module is the coded base layer MV and video data information of the current frame, and after tracking processing, the size and position information of ROI in the current frame is output, so as to facilitate the ROI area coding of the enhancement layer.
Application example two
The method and the device for tracking the region of interest in the video provided by the embodiment of the invention can be applied to scalable video coding technology and can also be applied to transcoding of the region of interest.
Due to the limitations of the size of a display screen of a video terminal, network bandwidth and the like, an existing compressed video code stream is often required to be transcoded into a target code stream required by a client, and in order to ensure visual quality, an original compressed code stream can be transcoded into an ROI code stream, that is, visual information which is not critical to the client is discarded, and only the high quality of an ROI area is reserved. The structure of the corresponding cascaded transcoder is shown in fig. 14, and the ROI tracking module of the cascaded transcoder shown in fig. 14 can be used to obtain the position and size information of the ROI, so as to encode the ROI into a code stream with high quality.
Of course, the method and the device for tracking the ROI in the video according to the embodiments of the present invention can also be applied to other similar application environments that require ROI tracking, which are not enumerated here.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A method for tracking regions of interest in a video, comprising:
acquiring a motion vector of a pixel or a macro block in a current frame, determining a moving speed distribution parameter of an ROI (region of interest) according to the motion vector, and determining an ROI scaling parameter according to state information of the ROI in a reference frame;
and performing state transition processing on the particles obtained by sampling in the current frame by using the moving speed distribution parameter and the scaling parameter of the ROI, and determining the position and the size of the ROI of the current frame according to the particles after the state transition.
2. The method of claim 1,
the step of determining the moving speed distribution parameter of the ROI includes:
determining, according to the motion vector of each pixel or macroblock in the current frame, the position in the reference frame of the reference pixel or macroblock corresponding to that pixel or macroblock, and selecting, from the motion vectors of the pixels or macroblocks in the current frame, those whose corresponding reference pixel or macroblock is located within the ROI in the reference frame; determining the moving speed distribution parameter of the ROI according to the selected motion vectors of the pixels or macroblocks in the current frame;
the step of determining ROI scaling parameters comprises:
and determining the corresponding pixel or macro block in the current frame according to the selected motion vector of the pixel or macro block in the current frame, and determining the ROI scaling parameter according to the pixel or macro block in the current frame and the pixel or macro block in the corresponding reference frame.
3. The method of claim 2, wherein the moving speed distribution parameter of the ROI comprises: $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^{i} - MV^{ROI}) / |G|$, wherein δ is the Dirac function, G is the set of motion vectors of the pixels or macroblocks in the current frame whose corresponding reference pixels or macroblocks are located within the ROI in the reference frame, $MV^{i}$ is the i-th motion vector in G, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame to the current frame; the ROI scaling parameter is determined from affine parameters a and b, wherein the affine parameters are solved by combining the least squares method with a four-parameter transformation model, and the four-parameter transformation model is established according to the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
4. The method of claim 3, wherein the step of performing state transition processing on the sampled particles in the current frame comprises:
determining the velocities $v_1$, $v_2$ of the particle state transition according to the moving speed distribution parameter of the ROI, comprising: generating a random number μ from a uniform distribution on 0-1; if μ is less than α, setting $v_1$, $v_2$ and the sign variable for that case; otherwise, selecting, according to the distribution $P(MV^{ROI})$, one element of the motion vector set G counted in the previous module and taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$ and $v_2$ respectively, that is, setting $v_1 = MV'_x$ and $v_2 = MV'_y$, and setting the sign variable for that case; wherein α is a state transition parameter whose initial value is preset and whose value is updated in the subsequent particle update process, and wherein the $v_x$ component and the $v_y$ component of particle n in frame k-1 are used;
determining a particle scaling parameter η according to the ROI scaling parameter, comprising: generating a random number γ from a uniform distribution on 0-1; if γ is less than β, setting η equal to ρ, otherwise setting η equal to 1, wherein β is a target size change degree parameter whose value is preset, and ρ is the ROI scaling parameter;
performing particle state transition processing according to the particle state transition velocity and the particle scaling parameter, wherein the result of the state transition for the n-th particle in the k-th frame comprises: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$ and $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the major semi-axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$; and the minor semi-axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; wherein $\epsilon_x$, $\epsilon_y$ are random variables obeying the distribution N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
5. The method according to any of claims 1-4, wherein the step of determining the ROI of the current frame from the state-transferred particles comprises:
performing the statistical processing of the color histogram of each particle area on the particles subjected to the state transition processing, and performing particle updating processing according to the statistical processing result of the color histogram of each particle area;
the ROI position size is calculated based on the result obtained by the particle update process.
6. An apparatus for tracking regions of interest in a video, comprising:
a moving speed distribution parameter and scaling parameter determining module, for obtaining the motion vector of the pixel or macro block in the current frame, and determining the moving speed distribution parameter of the ROI according to the motion vector, and also for determining the scaling parameter according to the state information of the ROI in the reference frame;
the particle state transition module is used for carrying out state transition processing on the particles sampled and obtained in the current frame by utilizing the moving speed distribution parameters and the scaling parameters of the ROI determined by the moving speed distribution parameter and scaling parameter determining module;
and the ROI determining module is used for determining the position and the size of the ROI of the current frame according to the particles subjected to state transfer by the particle state transfer module.
7. The apparatus of claim 6, wherein the moving speed distribution parameter and scaling parameter determining module specifically comprises:
the motion vector acquisition module is used for determining the position of a reference pixel or a macro block corresponding to the pixel or the macro block in the current frame in the reference frame according to the motion vector of the pixel or the macro block in the current frame, and selecting the motion vector of the pixel or the macro block in the current frame corresponding to the reference pixel or the macro block of which the position is positioned in the ROI in the reference frame from the motion vector of the pixel or the macro block in the current frame;
the parameter determining module is used for determining the moving speed distribution parameter and the zooming parameter of the ROI according to the motion vector of the pixel or the macro block in the current frame selected and obtained by the motion vector obtaining module; and determining the corresponding pixel or macro block in the current frame according to the selected motion vector of the pixel or macro block in the current frame, and determining the ROI scaling parameter according to the pixel or macro block in the current frame and the pixel or macro block in the corresponding reference frame.
8. The apparatus of claim 7, wherein the parameter determination module comprises:
a moving speed distribution parameter estimation module of the ROI, configured to estimate the moving speed distribution parameter of the ROI, wherein the estimated moving speed distribution parameter of the ROI comprises: $P(MV^{ROI}) = \sum_{i=0}^{|G|-1} \delta(MV^{i} - MV^{ROI}) / |G|$, wherein δ is the Dirac function, G is the set of motion vectors of the pixels or macroblocks in the current frame whose corresponding reference pixels or macroblocks are located within the ROI in the reference frame, $MV^{i}$ is the i-th motion vector in G, and $MV^{ROI}$ is the moving speed of the ROI from the previous frame to the current frame;
an ROI scaling parameter estimation module, configured to estimate the ROI scaling parameter, wherein the ROI scaling parameter is determined from affine parameters a and b, the affine parameters are solved by combining the least squares method with a four-parameter transformation model, and the four-parameter transformation model is established according to the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
9. The apparatus of claim 8, wherein the particle state transition module comprises:
a particle position and velocity transfer module, configured to determine the velocities $v_1$, $v_2$ of the particle state transition according to the ROI moving speed distribution parameter obtained by the moving speed distribution parameter estimation module of the ROI, comprising: generating a random number μ from a uniform distribution on 0-1; if μ is less than α, setting $v_1$, $v_2$ and the sign variable for that case; otherwise, selecting, according to the distribution $P(MV^{ROI})$, one element of the motion vector set G counted in the previous module and taking its two components $(MV'_x, MV'_y)$ as the values of $v_1$ and $v_2$ respectively, that is, setting $v_1 = MV'_x$ and $v_2 = MV'_y$, and setting the sign variable for that case; wherein α is a state transition parameter whose initial value is preset and whose value is updated in the subsequent particle update process, and wherein the $v_x$ component and the $v_y$ component of particle n in frame k-1 are used;
a particle size transfer module, configured to determine a particle scaling parameter η according to the ROI scaling parameter obtained by the ROI scaling parameter estimation module, comprising: generating a random number γ from a uniform distribution on 0-1; if γ is less than β, setting η equal to ρ, otherwise setting η equal to 1, wherein β is a target size change degree parameter whose value is preset, and ρ is the ROI scaling parameter;
a particle state transition result determining module, configured to perform particle state transition processing according to the particle state transition velocity determined by the particle position and velocity transfer module and the particle scaling parameter determined by the particle size transfer module, wherein the result of the state transition for the n-th particle in the k-th frame comprises: the position of the particle, $s_k^n\{x\} = s_{k-1}^n\{x\} + v_1 + \epsilon_x$ and $s_k^n\{y\} = s_{k-1}^n\{y\} + v_2 + \epsilon_y$; the moving speeds of the particle in the x and y directions, $s_k^n\{v_x\} = v_1 + \epsilon_{v_x}$ and $s_k^n\{v_y\} = v_2 + \epsilon_{v_y}$; the major semi-axis of the ellipse corresponding to the particle at that position, $s_k^n\{a\} = \eta \times s_{k-1}^n\{a\} + \epsilon_a$; and the minor semi-axis, $s_k^n\{b\} = \eta \times s_{k-1}^n\{b\} + \epsilon_b$; wherein $\epsilon_x$, $\epsilon_y$ are random variables obeying the distribution N(0, 0.2), $\epsilon_{v_x}$, $\epsilon_{v_y}$ are random variables obeying N(0, 0.25), and $\epsilon_a$, $\epsilon_b$ are random variables obeying N(0, 0.1).
10. The apparatus of any one of claims 6-9, wherein the ROI determination module comprises:
each particle area color histogram statistical processing module is used for executing each particle area color histogram statistical processing on the particles after the state transition processing;
the particle updating module is used for performing particle updating processing according to the statistical processing result of the color histogram of each particle area;
and the ROI position size calculation module is used for calculating the ROI position size according to the result obtained by the particle updating processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201210132913 CN102682454B (en) | 2012-04-28 | 2012-04-28 | Method and device for tracking region of interest in video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102682454A (en) | 2012-09-19 |
CN102682454B (en) | 2013-05-08 |
Family
ID=46814319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201210132913 Active CN102682454B (en) | 2012-04-28 | 2012-04-28 | Method and device for tracking region of interest in video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102682454B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101127121A (en) * | 2007-09-13 | 2008-02-20 | 复旦大学 | Target tracking algorism based on self-adaptive initial search point forecast |
CN102124727A (en) * | 2008-03-20 | 2011-07-13 | 无线电技术研究学院有限公司 | A method of adapting video images to small screen sizes |
CN101894378A (en) * | 2010-06-13 | 2010-11-24 | 南京航空航天大学 | Moving target visual tracking method and system based on double ROI (Region of Interest) |
Non-Patent Citations (1)
Title |
---|
徐力群,吴晓娟: "基于颜色概率模型的实时手势跟踪算法", 《计算机工程与科学》 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104185078A (en) * | 2013-05-20 | 2014-12-03 | 华为技术有限公司 | Video monitoring processing method, device and system thereof |
CN103517073B (en) * | 2013-07-12 | 2016-11-02 | 上海交通大学 | Video coding-decoding method, Apparatus and system |
WO2015003573A1 (en) * | 2013-07-12 | 2015-01-15 | 华为技术有限公司 | Video codec method, device and system |
CN103517073A (en) * | 2013-07-12 | 2014-01-15 | 上海交通大学 | Video encoding and decoding method, device and system |
US10620826B2 (en) | 2014-08-28 | 2020-04-14 | Qualcomm Incorporated | Object selection based on region of interest fusion |
CN106575362A (en) * | 2014-08-28 | 2017-04-19 | 高通股份有限公司 | Object selection based on region of interest fusion |
WO2016029399A1 (en) * | 2014-08-28 | 2016-03-03 | Qualcomm Incorporated | Object selection based on region of interest fusion |
TWI613910B (en) * | 2014-12-03 | 2018-02-01 | 安訊士有限公司 | Method and encoder for video encoding of a sequence of frames |
US9936217B2 (en) | 2014-12-03 | 2018-04-03 | Axis Ab | Method and encoder for video encoding of a sequence of frames |
US9542751B2 (en) | 2015-05-08 | 2017-01-10 | Qualcomm Incorporated | Systems and methods for reducing a plurality of bounding regions |
US9865062B2 (en) | 2016-02-12 | 2018-01-09 | Qualcomm Incorporated | Systems and methods for determining a region in an image |
CN110933446A (en) * | 2019-11-15 | 2020-03-27 | 网宿科技股份有限公司 | Method, system and equipment for identifying region of interest |
CN110996099A (en) * | 2019-11-15 | 2020-04-10 | 网宿科技股份有限公司 | Video coding method, system and equipment |
CN110933446B (en) * | 2019-11-15 | 2021-05-25 | 网宿科技股份有限公司 | Method, system and equipment for identifying region of interest |
CN110996099B (en) * | 2019-11-15 | 2021-05-25 | 网宿科技股份有限公司 | Video coding method, system and equipment |
CN111105442A (en) * | 2019-12-23 | 2020-05-05 | 中国科学技术大学 | Switching type target tracking method |
CN111105442B (en) * | 2019-12-23 | 2022-07-15 | 中国科学技术大学 | Switching type target tracking method |
Also Published As
Publication number | Publication date |
---|---|
CN102682454B (en) | 2013-05-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |