EP1606952A1 - Method for motion vector determination - Google Patents

Method for motion vector determination

Info

Publication number
EP1606952A1
EP1606952A1 EP04719556A EP04719556A EP1606952A1 EP 1606952 A1 EP1606952 A1 EP 1606952A1 EP 04719556 A EP04719556 A EP 04719556A EP 04719556 A EP04719556 A EP 04719556A EP 1606952 A1 EP1606952 A1 EP 1606952A1
Authority
EP
European Patent Office
Prior art keywords
block
pixels
motion vectors
groups
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP04719556A
Other languages
German (de)
French (fr)
Inventor
Gerard De Haan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV
Priority to EP04719556A
Publication of EP1606952A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection
    • H04N5/145 Movement estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/537 Motion estimation other than block-based

Definitions

  • the invention relates to a method for determining motion vectors from image data for blocks or objects of an image taken from an image sequence.
  • the invention further relates to a display device comprising a determinator for determining motion vectors for blocks or objects of an image taken from an image sequence, and to a computer program product comprising software code portions for determining motion vectors for blocks or objects of an image taken from an image sequence.
  • motion is represented by motion vectors that describe the motion (or object displacement) from one image to another. Determination of motion vectors can for instance be used for motion-compensated predictive coding. Since one picture in an image sequence is normally very similar to a displaced copy of its predecessors, encoding the determined motion vector data together with information on the difference between the actual image and its prediction, either in the pixel domain or in the DCT domain, allows the temporal redundancy in the coded signal to be vastly reduced.
  • Further examples for the estimation of motion vectors comprise methods to estimate the motion model for image segments (objects), where the components of the motion vectors then contain the parameters of the motion model.
  • Block Matching Algorithm (BMA)
  • an image is decomposed into blocks of fixed or variable size.
  • the image can be decomposed into its dominant objects instead of blocks (object segmentation), so that the subsequent description holds equally well for objects instead of blocks.
  • For each block of the current image a similar block in the previous image is searched, where a similarity measure is applied to identify the previous block most similar to the current block.
  • the motion vector associated to the block of the previous image, for which the largest similarity was determined, then represents the motion vector associated to the pixels of the current block. Note that, when calculating the similarity measure, not all pixels of the two blocks that are to be compared have to be evaluated.
  • the blocks can be spatially sub-sampled, so that only every k-th pixel of both blocks is considered for the evaluation of the similarity measure.
  • block-matching motion estimators are used to calculate a displacement vector for every block of pixels in an image, usually by selecting that vector from a candidate vector set that minimizes a match criterion. That vector is then the motion vector for the relevant block of the image.
  • motion may be any type of displacement, encompassing e.g. real motion (e.g. one or more objects moving within a displayed image), but also zooming in or zooming out of an image (the image becoming larger or smaller) or camera movement, in which case the image as a whole moves within the frame of the camera.
  • Motion vectors comprise, within the concept of the invention, any estimation of motion or displacement data for blocks or objects, resulting from any method in which, based on a number of images of which the data are known, one or more further images are constructed. Said motion vectors are estimated to predict the position or other parameters of blocks or objects within said further images.
  • An example of such a method is for instance video format conversion, in which method, by use of e.g. picture interpolation and/or de-interlacing, from video data in one video format (format A, source format) video data in another video format (format B, target format) are derived.
  • vectors can be used to estimate, for blocks or objects, the data in the other video format (the target format) on the basis of the known data in the one known video format (the source format). It is to be noted that with picture interpolation new images are constructed from the known images, whereas with de-interlacing the known images are not changed but the distribution of data over scan lines is changed. Using vectors assigned to blocks in such a video conversion method simplifies calculation and e.g. enables compression of data. A further example is so-called "disparity estimation" for stereoscopic video, in which the local depth is estimated on the basis of two images representing two stereoscopic views.
  • vectors can be attributed to blocks or objects, which vectors make it possible to predict, on the basis of known images, the position or other parameters of said blocks in further images, e.g. interpolated or de-interlaced images, or slightly displaced images due to a 3D effect in a stereoscopic image pair.
  • the invention is in particular useful for "classical" motion vectors, i.e. for predicting motion vectors for blocks or objects based on a number of preceding images in order to construct a following image (or a series of following images).
  • Although the use of block motion vectors is generally a useful strategy, artifacts may appear, for instance around the boundary of objects or when an object overlays a subtitle.
  • a critical parameter in a block-matching algorithm is the size of the block. This parameter both determines the resolution of the estimated vector field and the sensitivity of the estimator for noise and periodic structures in the image. As a consequence, the optimal block size is a compromise. On the one hand, small block sizes lead to noisy motion vectors and a high sensitivity for periodic structures, whereas, on the other hand, big blocks lead to a poor vector resolution. A poor vector resolution yields vector fields in which the object boundary can only be coarsely approximated, resulting in blocking artifacts in applications that use these motion vectors.
  • the method in accordance with the invention is characterized in that for a block or object the pixels are divided into two or more groups in accordance with a comparison of a division criterion with the information of the pixels, and for each group within the block or object a motion vector is determined, and the respective motion vectors are assigned to the block and applied to the pixels of the respective groups within the block.
  • the block or object is divided into two or more groups based on a comparison between the information in the pixels within the block or object and a division criterion.
  • the relation between the groups of pixels and the block, i.e. which pixels belong to which group within the block, follows from the comparison; it is not fixed, as would be the case when e.g. a block is divided into a number of equal parts. The latter would simply mean that the block size is reduced, i.e. smaller blocks are used.
  • a block is divided into groups based on the comparison between the division criterion and the information of the pixels.
  • the method in accordance with the invention allows larger block sizes to be used, while yet achieving better vector resolution. It has also been shown in experiments that artifacts are reduced.
  • the separation criterion may be a simple criterion, independent of the information content of the pixels.
  • An example of such a simple criterion is a fixed threshold intensity, e.g. dividing each block into two groups, the first one comprising the pixels having a luminance value below a certain percentage (e.g. 50 %) of the maximum luminance value, the second one comprising the pixels having a luminance value above said threshold.
  • the division criterion is based on the information content of the pixels within the block. Examples of such criteria are e.g. dividing the block into two groups, wherein the criterion is an average intensity, or a color point area around the average color point. Such a division criterion is preferred because it gives better results: it makes it possible to assign more than one vector in all regions, regardless of their brightness levels.
  • the blocks or objects are divided into four or fewer, preferably two, groups.
  • Although a block or object may be divided into any number of groups, a small number of groups, four or fewer and preferably two, is preferred. In most circumstances the additional division into more than four groups, and often even into more than two, only leads to a marginal improvement, or even to noisy vector estimates, while complicating the method.
  • the division criterion is an average luminance value for the pixels within the block. This has proven to be a useful and simple criterion. This may be the average luminance value, i.e. the quotient of the sum of all luminance values and the number of pixels, or the median luminance value, i.e. that luminance value for which 50% of the pixels have a higher luminance value and 50% have a luminance value less than or equal to it.
  • a comparison is made, after estimating the motion vectors for the groups constituting the block, between the motion vectors and if the difference between motion vectors of a number of groups is less than a threshold value, an average motion vector is calculated and attributed to said number of groups.
  • the division into groups provides for an improved method. However, if the block is divided into a number of groups while the block in fact comprises only one object (and thus only one motion vector is the appropriate one), splitting the block into two or more groups will lead to small differences between the calculated motion vectors.
  • the calculation of a motion vector is usually an approximation and is done on a limited number of pixels, so there is an error margin.
  • Figs. 1A to 1C illustrate different image sequences showing different types of motions or displacements.
  • FIG. 2 illustrates a rotating wheel against a background.
  • Figure 3 illustrates by means of a flow diagram the different steps of an exemplary method in accordance with the invention.
  • Figs. 1A to 1C illustrate very schematically different motions or displacements of or within an image.
  • In Fig. 1A a sequence of images is illustrated in which an object (a very simplified image of a bird) is moving against a background. From the first two images of the sequence a motion vector (the arrow in the middle image) can be established, which can be used to predict the position of the object (the bird) in the following image.
  • Fig. 1B illustrates an action in which the camera zooms in on the bird.
  • a motion vector which could be more generally called a displacement vector
  • Motion vectors may be assigned to blocks or objects of the image, calculated on the basis of the information in preceding images, to predict the position of said blocks or objects in the following image(s) within the sequence (or, if the method is used for e.g. interpolation, in a number of "intermediate" images).
  • Fig. 1C illustrates a camera movement: the camera moves with respect to the image.
  • the vectors for blocks or objects may be different for different objects, dependent on e.g. their position vis-a-vis the camera, e.g. whether the objects are in the foreground or the background of the image.
  • objects may deform (because different parts of an object are positioned differently vis-a-vis the camera) as the camera position is changed.
  • motion vector is thus to be broadly interpreted as denoting a set of parameters to predict any transformation of an object or block, such as for instance a simple translation T (for which a simple addition of a vector suffices, see e.g. Fig. 1A), a rotation R (for which a matrix multiplication with a rotational matrix suffices), enlargement (matrix multiplication, see for instance Fig. 1B) or deformation (more complex matrices will then be needed) or any combination of such translation, rotation, enlargement and/or deformation.
  • the prediction is usable to predict the position of an object or block in succeeding images based on the information available in preceding images.
  • block-matching motion estimators are used to calculate a motion vector for every block of pixels in an image, usually by selecting that vector from a candidate vector set that minimizes a match criterion. That vector is then the motion vector for the relevant block of the image.
  • US 5 072 293 e.g. discloses such a BMA, where predictions from a 3D neighborhood are used as candidate vectors for motion vector estimation.
  • the set of candidate motion vectors comprises both spatial (2D) and temporal (1D) predictions of motion vectors, the best of which is determined for each block recursively.
  • the technique is recursive in that at least one candidate motion vector in the set of candidate motion vectors for a block in the current image n depends on already determined motion vectors of other blocks in the image n (spatial predictions) or in the preceding image n-1 (temporal predictions).
  • Figure 2 illustrates a rotating wheel against a background of stars. The wheel rotates whereas the stars have a fixed position.
  • SAD Summed Absolute Difference
  • the motion vector that results at the output -one vector per block- is the candidate vector that gives the lowest SAD value.
  • the quality of the above motion estimator is largely determined by the way the candidate vectors are generated. In this invention disclosure, we are indifferent concerning this choice. Good results (depending on the application) can be achieved with a full-search, a three-step search, a one-at-a-time search, or a 3-D Recursive Search block matcher. Also possible is a so-called hierarchical motion estimator method, in which method conventionally first for a relatively large block (e.g. 32x32 pixels) a motion vector is estimated, whereafter the large block is cut into smaller blocks (e.g.
  • the motion vector of the large block is transferred to the next hierarchical level, i.e. the motion vector of the large block is used as a starting point for the calculation of the motion vectors for the smaller blocks.
  • the method in accordance with the invention can be used in a hierarchical motion estimator method, in which case two (or more, depending on the division criterion) motion vectors are transferred to the next hierarchical level.
  • a critical parameter in any such block-matching (or any matching) algorithm is the size of the block. This parameter determines both the resolution of the estimated vector field and the sensitivity of the estimator to noise and periodic structures in the image. As a consequence, the optimal block size is a compromise. On the one hand, small block sizes lead to noisy motion vectors and a high sensitivity to periodic structures, whereas, on the other hand, big blocks lead to a poor vector resolution. A poor vector resolution yields vector fields in which the object boundary can only be coarsely approximated, resulting in blocking artifacts in applications that use these motion vectors. If a relatively large block or object size is chosen, such as for instance schematically indicated by the dotted rectangle in figure 2, the resulting motion vector will be off target.
  • If the motion vector (which may be a rotational matrix) is near the true rotation of the wheel, the motion vector is wrong for the stars; if the motion vector is found to be near zero (correct for the stars), the predicted motion vector is off target for the wheel. Any average value is off target for both the wheel and the stars.
  • With a smaller block size, such as schematically indicated in the lower part of figure 2, the problems are reduced for most of the figure, however at the cost of a reduced accuracy of the predicted motion vector(s). Yet even for a small block size, such as for instance shown by rectangle 21, the same problem occurs.
  • the present invention aims to provide a way to resolve or at least reduce the problems.
  • the match criterion is modified, and based on that modified criterion, more than one vector per block is assigned.
  • the method in accordance with the invention is characterized in that for a block or object the pixels are divided into two or more groups in accordance with a comparison of a division criterion with the information of the pixels, and for each group within the block or object a motion vector is determined, and the respective motion vectors are assigned to the block and applied to the pixels of the respective groups within the block.
  • the basic insight is that, if within a block or object there are groups of pixels for which the best predictions for the motion vector differ, a division can be made between groups on the basis of the information of the pixels, by comparing the information of the pixels to a division criterion, whereafter for each of the groups a motion vector is determined, and the respective motion vectors are assigned to the pixels within each group.
  • each block is split into two groups of pixels, while the estimator assigns a motion vector to both groups, i.e. 2 vectors per block.
  • B(X) would in this example be e.g. the pixels within an object or within a rectangle of a size nxm, for instance with n and m between 4 and 32, for instance 16x16.
  • the two groups $G_a(X)$ and $G_b(X)$ of pixels together form block $B(X)$:
  • $G_a(X) = \{\, x \in B(X) \mid F(x,n) > Av(X,n) \,\}$ (3), i.e. those pixels within block $B(X)$ with a luminance value larger than the average luminance value, and
  • $G_b(X) = \{\, x \in B(X) \mid F(x,n) \le Av(X,n) \,\}$ (4), i.e. those pixels with a luminance value equal to or smaller than the average luminance value.
  • motion vectors $D_a$ and $D_b$ are calculated such that $D_a$ is the candidate vector that minimizes $SAD_a$ for the pixels in group $G_a$:
  • Both motion vectors $D_a$ and $D_b$ are assigned to block $B(X)$, such that a vector field results with two motion vectors for every block in the image. More precisely, $D_a$ is applied to the pixels with a luminance value above the average luminance in the block, and $D_b$ to the other pixels.
  • the average luminance value is used as a division criterion. Even such a simple division based on the average luminance value will lead to the formation of two groups, one mostly comprising the low-intensity pixels, such as the stars, and another mostly comprising the pixels associated with the wheel. The two predicted motion vectors are then close to the correct values for the wheel and the stars and, assigned to the respective groups, will give a better result.
  • a different division criterion would be the median luminance value, which would also lead to good results.
  • the median luminance value has the advantage that the groups each comprise 50% of the pixels, and thus a statistically relatively large number of pixels. It is entirely possible to divide the block into more groups, and they need not be of equal size. In this example, for instance, a division into three groups, one for pixels having a luminance value smaller than 0.5 times the average luminance value, one for pixels between 0.5 and 1.5 times the average luminance value, and one for pixels having a luminance value higher than 1.5 times the average luminance value, may give better results under certain conditions.
  • Figure 3 illustrates by means of a flow diagram the different steps of an exemplary method in accordance with the invention.
  • In a first step 31, information on the luminance values of the pixels within block B(X) is obtained.
  • In a step 32, a division criterion is defined, e.g. the average pixel value, or, if the division criterion is already known, a value for the division criterion, e.g. the average luminance value, is calculated.
  • In a step 33, groups within the block B(X) are defined, i.e. it is calculated which pixels belong to which group by comparing the information of the pixel to the division criterion, in the example for instance deciding whether a pixel belongs to the one or the other group on the basis of the sign of the difference between the luminance value of the pixel and the average luminance value.
  • In step 34, for each of the groups (in the example for the two groups) the motion vector is determined.
  • For determining the motion vector, any known determination method may be used within the present invention. Good results (depending on the application) can be achieved with a full-search, a three-step search, a one-at-a-time search, or a 3-D Recursive Search block matcher.
  • In step 35, the difference between the motion vectors is compared to a threshold and, if the difference is less than the threshold, an average motion vector is used for both groups.
  • In step 36, the motion vectors are assigned to block B(X) such that a vector field results with two motion vectors per block, and one of the vectors is applied to each pixel (Da to group Ga, Db to group Gb).
  • the invention also relates to a display device comprising a determinator for determining motion vectors for blocks or objects of an image taken from an image sequence, characterized in that the determinator comprises a divider to divide the pixels of a block or object into two or more groups in accordance with a comparison of a division criterion with the information of the pixels, the determinator subsequently determines a motion vector for each group within the block or object, and the determinator comprises an assignator to assign the respective motion vectors to the block for application to the pixels of the respective groups within the block.
  • the invention further relates to a computer program product comprising software code portions for determining motion vectors for blocks or objects of an image taken from an image sequence in accordance with the method of the invention in its broadest sense, as well as in any of the embodiments, in particular the preferred embodiment.
  • determinator is to be broadly understood and to comprise e.g. any piece of hardware (such as a determinator, divider, assignator), any circuit or sub-circuit designed for performing a determination, division or assignment as described, as well as any piece of software (computer program or sub-program or set of computer programs, or program code(s)) designed or programmed to perform a determination, division or assignment, as well as any combination of pieces of hardware and software acting as such, alone or in combination, without being restricted to the exemplary embodiments given below. It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove.
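Steps 31 to 36 listed above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation of the application: images are nested lists of luminance values, a vector (dy, dx) points from a pixel of the current image to its match in the previous image, candidates are assumed to keep all displaced pixels inside the previous image, the distance between two vectors is taken as an L1 distance, and all function names and the `merge_threshold` parameter are hypothetical.

```python
import random


def split_by_average(img, pixels):
    """Steps 31-33: split `pixels` into G_a (luminance above the block
    average) and G_b (luminance equal to or below it)."""
    average = sum(img[y][x] for y, x in pixels) / len(pixels)
    group_a = [(y, x) for y, x in pixels if img[y][x] > average]
    group_b = [(y, x) for y, x in pixels if img[y][x] <= average]
    return group_a, group_b


def group_sad(cur, prev, pixels, vector):
    """Summed absolute difference over a set of pixels for one candidate."""
    dy, dx = vector
    return sum(abs(cur[y][x] - prev[y + dy][x + dx]) for y, x in pixels)


def best_group_vector(cur, prev, pixels, candidates):
    """Step 34: the candidate with the lowest SAD over `pixels`."""
    return min(candidates, key=lambda v: group_sad(cur, prev, pixels, v))


def vectors_for_block(cur, prev, top, left, size, candidates,
                      merge_threshold=1.5):
    pixels = [(y, x) for y in range(top, top + size)
              for x in range(left, left + size)]
    group_a, group_b = split_by_average(cur, pixels)
    if not group_a or not group_b:
        # Flat block: only one group is populated, so use a single vector.
        d_a = d_b = best_group_vector(cur, prev, pixels, candidates)
    else:
        d_a = best_group_vector(cur, prev, group_a, candidates)
        d_b = best_group_vector(cur, prev, group_b, candidates)
        # Step 35 (assumed metric: L1 distance between the two vectors).
        if abs(d_a[0] - d_b[0]) + abs(d_a[1] - d_b[1]) < merge_threshold:
            d_a = d_b = ((d_a[0] + d_b[0]) / 2, (d_a[1] + d_b[1]) / 2)
    # Step 36: per-pixel vector field for this block.
    field = {p: d_a for p in group_a}
    field.update({p: d_b for p in group_b})
    return d_a, d_b, field


# Synthetic scene: a static dark textured background and a bright textured
# strip (columns 12..19) that moves two rows down between prev and cur.
rng = random.Random(1)
background = [[rng.randrange(0, 50) for _ in range(24)] for _ in range(24)]
texture = [[rng.randrange(200, 256) for _ in range(8)] for _ in range(20)]


def render(strip_top):
    img = [row[:] for row in background]
    for r in range(20):
        for c in range(8):
            img[strip_top + r][12 + c] = texture[r][c]
    return img


prev, cur = render(2), render(4)
candidates = [(dy, dx) for dy in range(-3, 4) for dx in range(-3, 4)]
# The 8x8 block at (8, 8) straddles the boundary between background and strip.
d_a, d_b, field = vectors_for_block(cur, prev, 8, 8, 8, candidates)
```

In this synthetic scene the bright group follows the strip and the dark group stays with the static background, so the block carries two different vectors. A full search is used here only for brevity; any of the candidate-generation strategies named above could supply `candidates`.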

Abstract

In a method for determining motion vectors from image data for blocks or objects of an image taken from an image sequence, a block B(X) or object of pixels is divided (33) into two or more groups (Ga, Gb); for each group within the block B(X) or object a motion vector (Da, Db) is determined, and the respective motion vectors are assigned to the block (B(X)) and applied to the pixels of the respective groups (Ga, Gb) within the block.

Description

Method for motion vector determination
The invention relates to a method for determining motion vectors from image data for blocks or objects of an image taken from an image sequence. The invention further relates to a display device comprising a determinator for determining motion vectors for blocks or objects of an image taken from an image sequence, and to a computer program product comprising software code portions for determining motion vectors for blocks or objects of an image taken from an image sequence.
Determination of motion vectors from image data is required for a broad range of image processing applications. In a video coding framework such as MPEG or H.261, motion is represented by motion vectors that describe the motion (or object displacement) from one image to another. Determination of motion vectors can for instance be used for motion-compensated predictive coding. Since one picture in an image sequence is normally very similar to a displaced copy of its predecessors, encoding the determined motion vector data together with information on the difference between the actual image and its prediction, either in the pixel domain or in the DCT domain, allows the temporal redundancy in the coded signal to be vastly reduced.
Further examples for the estimation of motion vectors comprise methods to estimate the motion model for image segments (objects), where the components of the motion vectors then contain the parameters of the motion model.
State-of-the-art techniques to estimate or determine motion vectors from image data usually apply some kind of Block Matching Algorithm (BMA), where an image is decomposed into blocks of fixed or variable size. Equally, the image can be decomposed into its dominant objects instead of blocks (object segmentation), so that the subsequent description holds equally well for objects instead of blocks. For each block of the current image, a similar block in the previous image is searched, where a similarity measure is applied to identify the previous block most similar to the current block. The motion vector associated with the block of the previous image for which the largest similarity was determined then represents the motion vector associated with the pixels of the current block. Note that, when calculating the similarity measure, not all pixels of the two blocks that are to be compared have to be evaluated. E.g., the blocks can be spatially sub-sampled, so that only every k-th pixel of both blocks is considered for the evaluation of the similarity measure. In general, block-matching motion estimators are used to calculate a displacement vector for every block of pixels in an image, usually by selecting that vector from a candidate vector set that minimizes a match criterion. That vector is then the motion vector for the relevant block of the image.
Within the concept of the invention "motion" may be any type of displacement, encompassing e.g. real motion (e.g. one or more objects moving within a displayed image), but also zooming in or zooming out of an image (the image becoming larger or smaller) or camera movement, in which case the image as a whole moves within the frame of the camera. Motion vectors comprise, within the concept of the invention, any estimation of motion or displacement data for blocks or objects, resulting from any method in which, based on a number of images of which the data are known, one or more further images are constructed. Said motion vectors are estimated to predict the position or other parameters of blocks or objects within said further images. An example of such a method is video format conversion, in which, by use of e.g. picture interpolation and/or de-interlacing, video data in another video format (format B, target format) are derived from video data in one video format (format A, source format). In such a method, vectors can be used to estimate, for blocks or objects, the data in the other video format (the target format) on the basis of the known data in the one known video format (the source format). It is to be noted that with picture interpolation new images are constructed from the known images, whereas with de-interlacing the known images are not changed but the distribution of data over scan lines is changed. Using vectors assigned to blocks in such a video conversion method simplifies calculation and e.g. enables compression of data. A further example is so-called "disparity estimation" for stereoscopic video, in which the local depth is estimated on the basis of two images representing two stereoscopic views. In such embodiments vectors can be attributed to blocks or objects, which vectors make it possible to predict, on the basis of known images, the position or other parameters of said blocks in further images, e.g. interpolated or de-interlaced images, or slightly displaced images due to a 3D effect in a stereoscopic image pair.
Notwithstanding these further possible applications of the invention, the invention is particularly useful for "classical" motion vectors, i.e. for predicting motion vectors for blocks or objects based on a number of preceding images in order to construct a following image (or a series of following images). Although the use of block motion vectors is generally a useful strategy, artifacts may appear, for instance around the boundary of objects or when an object overlays a subtitle.
Furthermore a critical parameter in a block-matching algorithm is the size of the block. This parameter both determines the resolution of the estimated vector field and the sensitivity of the estimator for noise and periodic structures in the image. As a consequence, the optimal block size is a compromise. On the one hand, small block sizes lead to noisy motion vectors and a high sensitivity for periodic structures, whereas, on the other hand, big blocks lead to a poor vector resolution. A poor vector resolution yields vector fields in which the object boundary can only be coarsely approximated, resulting in blocking artifacts in applications that use these motion vectors.
It is an object of the invention to provide for an improved method and device and computer program product of the type as described in the opening paragraph.
To this end the method in accordance with the invention is characterized in that for a block or object the pixels are divided in two or more groups in accordance with a comparison of a division criterion with the information of the pixels, and for each group within the block or object a motion vector is determined, and the respective motion vectors are assigned to the block and applied to the pixels of the respective groups within the block.
Within the concept of the invention the block or object is divided into two or more groups based on a comparison between the information in the pixels within the block and a division criterion. As a consequence the relation between the groups of pixels and the block, i.e. which pixels belong to which group within the block, follows from the comparison and is not fixed, as would be the case when e.g. a block is divided into a number of equal parts. The latter would simply mean that the block size is reduced, i.e. that smaller blocks are used. Within the concept of the invention a block is divided into groups based on the comparison between the division criterion and the information of the pixels.
The method in accordance with the invention allows larger block sizes to be used, while yet achieving better vector resolution. It has also been shown in experiments that artifacts are reduced.
The division criterion may be a simple criterion, independent of the information content of the pixels. An example of such a simple criterion is a fixed threshold intensity, e.g. dividing each block into two groups, the first one comprising the pixels having a luminance value below a certain percentage (e.g. 50 %) of the maximum luminance value, the second one comprising the pixels having a luminance value above said threshold.
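A fixed-threshold division of this kind can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_by_fixed_threshold`, the 8-bit maximum of 255 and the choice to put pixels exactly at the threshold into the second group are all assumptions made for the example.

```python
def split_by_fixed_threshold(block, max_luma=255, fraction=0.5):
    """Split the pixel positions of a block into two groups.

    block: list of rows of luminance values (assumed 8-bit here).
    Returns (below, above): positions with luminance below the fixed
    threshold, and all remaining positions.
    """
    threshold = fraction * max_luma  # fixed, independent of the block content
    below, above = [], []
    for y, row in enumerate(block):
        for x, luma in enumerate(row):
            (below if luma < threshold else above).append((x, y))
    return below, above


mixed = [[10, 20, 200],
         [15, 220, 210]]
below, above = split_by_fixed_threshold(mixed)

# A uniformly dark block ends up entirely in one group.
dark = [[10, 20],
        [30, 40]]
dark_below, dark_above = split_by_fixed_threshold(dark)
```

In the second call every pixel lies below the threshold, so the block is effectively not divided at all; a fixed threshold only separates pixels in blocks whose luminance values straddle it.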
Preferably, however, the division criterion is based on the information content of the pixels within the block. Examples of such a criterion are dividing the block into two groups wherein the criterion is an average intensity, or a color point area around the average color point. Such a division criterion is preferred because it leads to better results: it makes it possible to assign more than one vector in all regions, regardless of their brightness levels.
In preferred embodiments the blocks or objects are divided into four or fewer, preferably two, groups. Although within the broadest concept of the invention a block or object may be divided into any number of groups, a small number of groups, four or fewer and preferably two, is preferred. In most circumstances the additional division into more than four groups, and often even into more than two, only leads to a marginal improvement, or even to noisy vector estimates, while complicating the method.
In preferred embodiments the division criterion is an average luminance value for the pixels within the block. This has proven to be a useful and simple criterion. This may be the average luminance value, i.e. the quotient of the sum of all luminance values and the number of pixels, or the median luminance value, i.e. that luminance value for which 50% of the pixels have a luminance value higher than that value and 50% have a luminance value less than or equal to that value.
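The difference between the two averages can be illustrated with a short sketch. The luminance values and the helper name `split_block` are hypothetical; only the two criteria themselves come from the text.

```python
from statistics import mean, median


def split_block(values, criterion):
    """Split luminance values into (above, at_or_below) the criterion."""
    above = [v for v in values if v > criterion]
    at_or_below = [v for v in values if v <= criterion]
    return above, at_or_below


# A mostly dark block containing two bright pixels (hypothetical values).
values = [10, 12, 11, 13, 14, 15, 240, 250]

avg = mean(values)    # sum of all values divided by the number of pixels
med = median(values)  # middle of the sorted values

above_avg, below_avg = split_block(values, avg)
above_med, below_med = split_block(values, med)

print(avg, len(above_avg), len(below_avg))  # 70.625 2 6
print(med, len(above_med), len(below_med))  # 13.5 4 4
```

With the mean, the two bright pixels form a small group of their own; with the median, the groups are of equal size. When several pixels share the median value the split is no longer exactly even, because ties all fall into the "less than or equal" group.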
In preferred embodiments, after the motion vectors for the groups constituting the block have been estimated, the motion vectors are compared, and if the difference between the motion vectors of a number of groups is less than a threshold value, an average motion vector is calculated and attributed to said number of groups. The division into groups provides for an improved method. However, if the block is divided into a number of groups while the block in fact comprises only one object (so that only one motion vector is appropriate), splitting the block into two or more groups will lead to small differences between the calculated motion vectors. The calculation of a motion vector is usually an approximation and is done on a limited number of pixels, so there is an error margin. If the difference between the calculated motion vectors is below a threshold (for instance the error margin in the calculation of the motion vectors), it is likely that the difference is due to approximation or calculation inaccuracies. In such cases it is useful to assign the same motion vector to the relevant groups, and to choose an average of the motion vectors found.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.
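The comparison of the group vectors can be sketched for the two-group case as follows. The text does not specify how the difference between two vectors is measured; the largest absolute component difference used here, like the function name `merge_similar_vectors`, is an assumption made for illustration.

```python
def merge_similar_vectors(da, db, threshold):
    """Return the vectors to assign to the two groups of a block.

    da, db: (dx, dy) motion vectors estimated for the two groups.
    If they differ by less than `threshold` (measured here as the largest
    absolute component difference, an assumption), both groups receive
    the average vector; otherwise each group keeps its own vector.
    """
    diff = max(abs(da[0] - db[0]), abs(da[1] - db[1]))
    if diff < threshold:
        avg = ((da[0] + db[0]) / 2, (da[1] + db[1]) / 2)
        return avg, avg
    return da, db


# Nearly equal vectors: probably one object, so use the average.
print(merge_similar_vectors((4, 0), (5, 0), threshold=2))  # ((4.5, 0.0), (4.5, 0.0))

# Clearly different vectors: keep both.
print(merge_similar_vectors((4, 0), (0, 0), threshold=2))  # ((4, 0), (0, 0))
```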
In the drawings:
Fig. 1A to 1C illustrate different image sequences showing different types of motions or displacements.
Fig. 2 illustrates a rotating wheel against a background.
Fig. 3 illustrates by means of a flow diagram the different steps of an exemplary method in accordance with the invention.
The figures are not drawn to scale. Generally, identical components are denoted by the same reference numerals in the figures.
Fig. 1A to 1C illustrate very schematically different motions or displacements of or within an image.
In Fig. 1A a sequence of images is illustrated in which an object (a very simplified image of a bird) is moving against a background. From the first two images of the sequence a motion vector (the arrow in the middle image) can be established, which can be used to predict the position of the object (the bird) in the following image.
Fig. 1B illustrates an action in which the camera zooms in on the bird. Again, using the first two images of the sequence (or, more generally, a number of preceding images), a motion vector (which could more generally be called a displacement vector) can be assigned to each block or object to predict the position of the object or block in the following image within the sequence. Motion vectors, calculated on the basis of the information in preceding images, may thus be assigned to blocks or objects of the image to predict the position of said blocks or objects in the following image or, if the method is used for e.g. interpolation, in a number of "intermediate" images within the sequence.
Fig. 1C illustrates a camera movement, in which the camera moves with respect to the image. Again, using the first two images of the sequence (or, more generally, a number of preceding images), a motion vector (which could more generally be called a displacement vector) can be assigned to each block or object to predict the position of the object or block in the following image within the sequence. Of course, in this case this holds for those parts of the image that reoccur in the following shot, not for new parts. It is remarked that even for simple camera movements such as scanning the horizon, the vectors for blocks or objects may differ between objects, dependent on e.g. their position vis-a-vis the camera, e.g. whether the objects are in the foreground or the background of the image. Objects may also deform (because different parts of an object are positioned differently vis-a-vis the camera) as the camera position is changed.
Within the concept of the invention "motion vector" is thus to be broadly interpreted as denoting a set of parameters to predict any transformation of an object or block, such as for instance a simple translation T (for which a simple addition of a vector suffices, see e.g. Fig. 1A), a rotation R (for which a matrix multiplication with a rotational matrix suffices), an enlargement (matrix multiplication, see for instance Fig. 1B) or a deformation (more complex matrices will then be needed), or any combination of such translation, rotation, enlargement and/or deformation. The prediction is usable to predict the position of an object or block in succeeding images based on the information available in preceding images. In general, block-matching motion estimators (BMA) are used to calculate a motion vector for every block of pixels in an image, usually by selecting from a candidate vector set the vector that minimizes a match criterion. That vector is then the motion vector for the relevant block of the image.
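The transformations named above can be written as a 2x2 matrix multiplication followed by the addition of a translation vector. The sketch below only illustrates that arithmetic; the function names and the representation of a "motion vector" as a matrix plus a translation are assumptions, not a parameterization prescribed by the text.

```python
import math


def apply_motion(point, matrix=((1.0, 0.0), (0.0, 1.0)), translation=(0.0, 0.0)):
    """Map a pixel position (x, y) to matrix * (x, y) + translation."""
    x, y = point
    (a, b), (c, d) = matrix
    return (a * x + b * y + translation[0],
            c * x + d * y + translation[1])


def rotation(angle_rad):
    """Rotation matrix R for a counter-clockwise rotation about the origin."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s), (s, c))


def enlargement(factor):
    """Uniform scaling matrix, as in a zoom."""
    return ((factor, 0.0), (0.0, factor))


moved = apply_motion((2.0, 3.0), translation=(1.0, -1.0))   # pure translation T
zoomed = apply_motion((2.0, 3.0), matrix=enlargement(2.0))  # enlargement
turned = apply_motion((1.0, 0.0), matrix=rotation(math.pi / 2))  # rotation R
```

A combination of the transformations is obtained by passing both a matrix and a translation in one call.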
US 5 072 293, for example, discloses such a BMA, in which predictions from a 3D neighborhood are used as candidate vectors for motion vector estimation. The set of candidate motion vectors comprises both spatial (2D) and temporal (1D) predictions of motion vectors, the best of which is determined for each block recursively. The technique is recursive in that at least one candidate motion vector in the set of candidate motion vectors for a block in the current image n depends on already determined motion vectors of other blocks in the image n (spatial predictions) or in the preceding image n-1 (temporal predictions).
Figure 2 illustrates a rotating wheel against a background of stars. The wheel rotates whereas the stars have a fixed position.
In a (block-matching, or any other type of) motion estimator, an attempt is made to match a shifted portion of a previous (or next, or both) image to a fixed portion of the present image. In the example used to elucidate the invention, the estimator uses e.g. the Summed Absolute Difference (SAD) as the matching criterion:

$$\mathrm{SAD}(\vec{C},\vec{X},n) = \sum_{\vec{x}\in B(\vec{X})} \left| F(\vec{x}-\vec{C},\,n-1) - F(\vec{x},\,n) \right| \qquad (1)$$

where C is the candidate vector under test, vector X indicates the position of the block B(X), F(x,n) is the luminance signal, and n is the picture or field number. The motion vector that results at the output (one vector per block) is the candidate vector that gives the lowest SAD value.
The quality of the above motion estimator is largely determined by the way the candidate vectors are generated. In this invention disclosure, we are indifferent concerning this choice. Good results (depending on the application) can be achieved with a full search, a three-step search, a one-at-a-time search, or a 3-D Recursive Search block matcher. Also possible is a so-called hierarchical motion estimator method, in which conventionally a motion vector is first estimated for a relatively large block (e.g. 32x32 pixels), after which the large block is cut into smaller blocks (e.g. four of 16x16 pixels) and the motion vector of the large block is transferred to the next hierarchical level, i.e. the motion vector of the large block is used as a starting point for the calculation of the motion vectors for the smaller blocks. The method in accordance with the invention can be used in a hierarchical motion estimator method, in which case two (or more, depending on the division criterion) motion vectors are transferred to the next hierarchical level.
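The SAD criterion of equation (1), combined with a full search over a small window, can be sketched as follows. This is a minimal illustration under stated assumptions: images are lists of rows, positions outside the previous image are clamped to the nearest border pixel (the text does not say how borders are handled), and the names `sad` and `full_search` are hypothetical.

```python
def sad(prev, cur, block_pixels, candidate):
    """Summed absolute difference of equation (1) for one candidate vector.

    prev, cur: previous (n-1) and current (n) images as lists of rows.
    block_pixels: (x, y) positions belonging to block B(X).
    candidate: displacement C = (cx, cy) under test.
    """
    height, width = len(prev), len(prev[0])
    cx, cy = candidate
    total = 0
    for x, y in block_pixels:
        # Clamp x - C to the image area (border handling is an assumption).
        px = min(max(x - cx, 0), width - 1)
        py = min(max(y - cy, 0), height - 1)
        total += abs(prev[py][px] - cur[y][x])
    return total


def full_search(prev, cur, block_pixels, search_range):
    """Test every displacement in a square window and keep the lowest SAD."""
    candidates = [(cx, cy)
                  for cy in range(-search_range, search_range + 1)
                  for cx in range(-search_range, search_range + 1)]
    return min(candidates, key=lambda c: sad(prev, cur, block_pixels, c))


# A 2x2 textured patch on a black background moves one pixel to the right.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
pattern = [[50, 90], [70, 30]]
for dy in range(2):
    for dx in range(2):
        prev[1 + dy][1 + dx] = pattern[dy][dx]
        cur[1 + dy][2 + dx] = pattern[dy][dx]

block = [(2, 1), (3, 1), (2, 2), (3, 2)]  # position of the patch in `cur`
print(full_search(prev, cur, block, search_range=1))  # (1, 0)
```

Any of the other search strategies mentioned in the text could replace `full_search`; only the way the candidate set is generated would change, not the criterion.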
A critical parameter in any such block-matching (or any matching) algorithm is the size of the block. This parameter determines both the resolution of the estimated vector field and the sensitivity of the estimator to noise and periodic structures in the image. As a consequence, the optimal block size is a compromise. On the one hand, small block sizes lead to noisy motion vectors and a high sensitivity to periodic structures, whereas, on the other hand, big blocks lead to a poor vector resolution. A poor vector resolution yields vector fields in which the object boundary can only be coarsely approximated, resulting in blocking artifacts in applications that use these motion vectors. If a relatively large block or object size is chosen, such as for instance schematically indicated by the dotted rectangle in figure 2, the resulting motion vector will be off target. If the calculated motion vector (which may be a rotational matrix) is near the true rotation of the wheel, the motion vector is wrong for the stars; if the motion vector is found to be near zero (correct for the stars), the predicted motion vector is off target for the wheel. Any average value is off target for both the wheel and the stars. By using a smaller block size, such as schematically indicated in the lower part of figure 2, the problems are reduced for most of the figure, however at the cost of a reduced accuracy for the predicted motion vector(s). Even so, even for a small block size, such as for instance shown by rectangle 21, the same problem occurs. The present invention aims to provide a way to resolve or at least reduce these problems.
To this end, the match criterion is modified and, based on that modified criterion, more than one vector is assigned per block.
The method in accordance with the invention is characterized in that for a block or object the pixels are divided in two or more groups in accordance with a comparison of a division criterion with the information of the pixels, and for each group within the block or object a motion vector is determined, and the respective motion vectors are assigned to the block and applied to the pixels of the respective groups within the block. The basic insight is that, if within a block or object there are groups of pixels for which the best predictions for the motion vector differ, a division can be made between groups on the basis of the information of the pixels, by comparing the information of the pixels to a division criterion, whereafter for each of the groups a motion vector is determined, and the respective motion vectors are assigned to the pixels within each group.
To elucidate the invention, we shall describe an example motion estimator, according to our invention, in which each block is split into two groups of pixels, while the estimator assigns a motion vector to both groups, i.e. 2 vectors per block.
The average pixel value of the pixels in block B(X) may be defined as follows:

$$\mathrm{Av}(\vec{X},n) = \frac{1}{N} \sum_{\vec{x}\in B(\vec{X})} F(\vec{x},\,n) \qquad (2)$$

where N is the number of pixels in B(X). B(X) would in this example be e.g. the pixels within an object or within a rectangle of a size n x m, for instance with n and m between 4 and 32, for instance 16x16.
Now we define two groups, Ga(X) and Gb(X), of pixels together forming block B(X):

$$G_a(\vec{X}) = \{\, \vec{x}\in B(\vec{X}) \mid F(\vec{x},n) > \mathrm{Av}(\vec{X},n) \,\} \qquad (3)$$

i.e. those pixels within block B(X) with a luminance value larger than the average luminance value, and

$$G_b(\vec{X}) = \{\, \vec{x}\in B(\vec{X}) \mid F(\vec{x},n) \le \mathrm{Av}(\vec{X},n) \,\} \qquad (4)$$

i.e. those pixels with a luminance value equal to or smaller than the average luminance value. In the proposed estimator, motion vectors Da and Db are now calculated for the groups Ga(X) and Gb(X), such that Da is the candidate vector that minimizes SADa for the pixels in group Ga:

$$\mathrm{SAD}_a(\vec{C},\vec{X},n) = \sum_{\vec{x}\in G_a(\vec{X})} \left| F(\vec{x}-\vec{C},\,n-1) - F(\vec{x},\,n) \right| \qquad (5)$$

and Db is the candidate vector that minimizes SADb for the pixels in group Gb:

$$\mathrm{SAD}_b(\vec{C},\vec{X},n) = \sum_{\vec{x}\in G_b(\vec{X})} \left| F(\vec{x}-\vec{C},\,n-1) - F(\vec{x},\,n) \right| \qquad (6)$$
Both motion vectors Da and Db are assigned to block B(X), such that a vector field results with two motion vectors for every block in the image. More precisely, Da is applied to the pixels with a luminance value above the average luminance in the block, and Db to the other pixels. In the given example the average luminance value is used as the division criterion. Even such a simple division based on the average luminance value will lead to the formation of two groups, one group mostly comprising the low-intensity pixels, such as the stars, and another mostly comprising the pixels associated with the wheel. The two predicted motion vectors are then close to the correct values for the wheel and the stars and, assigned to the different groups, will give a better result. A different division criterion would be the median luminance value, which would also lead to good results.
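Equations (2) to (6) can be tried out on a small synthetic block. The sketch below is an illustration under assumptions, not the patented implementation: the frames, the five-vector candidate set, the clamping at image borders and all function names are hypothetical. A bright 2x2 object moves one pixel to the right over a static, gently shaded background.

```python
def sad(prev, cur, pixels, candidate):
    """SAD over an arbitrary set of pixels, as in equations (5) and (6)."""
    height, width = len(prev), len(prev[0])
    cx, cy = candidate
    total = 0
    for x, y in pixels:
        px = min(max(x - cx, 0), width - 1)   # border clamping is an assumption
        py = min(max(y - cy, 0), height - 1)
        total += abs(prev[py][px] - cur[y][x])
    return total


def best_vector(prev, cur, pixels, candidates):
    """Candidate vector with the lowest SAD for the given pixels."""
    return min(candidates, key=lambda c: sad(prev, cur, pixels, c))


def split_by_average(cur, block):
    """Groups Ga and Gb of equations (3) and (4)."""
    av = sum(cur[y][x] for x, y in block) / len(block)  # equation (2)
    ga = [(x, y) for x, y in block if cur[y][x] > av]
    gb = [(x, y) for x, y in block if cur[y][x] <= av]
    return ga, gb


def make_frame(object_x):
    """4x7 frame: shaded static background plus a bright 2x2 object."""
    frame = [[10 + 3 * x + 7 * y for x in range(7)] for y in range(4)]
    bright = [[200, 220], [210, 230]]
    for dy in range(2):
        for dx in range(2):
            frame[1 + dy][object_x + dx] = bright[dy][dx]
    return frame


prev, cur = make_frame(1), make_frame(2)           # object moves right by one
block = [(x, y) for y in (1, 2) for x in range(2, 6)]
candidates = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

ga, gb = split_by_average(cur, block)
da = best_vector(prev, cur, ga, candidates)        # bright group: the object
db = best_vector(prev, cur, gb, candidates)        # dark group: the background
single = best_vector(prev, cur, block, candidates) # conventional, one per block

print(da, db, single)  # (1, 0) (0, 0) (1, 0)
```

In this example the single block vector follows the bright object and is therefore wrong for the background pixels, whereas the two group vectors are each correct for their own pixels.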
Using the median luminance value has the advantage that the groups always comprise 50% of the pixels, and thus a statistically relatively large number of pixels. It is entirely possible to divide the block into more groups, and they need not be of equal size. In this example, for instance, a division into three groups, one for pixels with a luminance value smaller than 0.5 times the average luminance value, one for pixels with a luminance value between 0.5 and 1.5 times the average luminance value, and one for pixels with a luminance value higher than 1.5 times the average luminance value, may give better results under certain conditions.
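The three-way division mentioned above can be sketched as follows. The luminance values and the function name are hypothetical, and the text does not say to which group a pixel exactly on a boundary belongs; placing it in the middle group is an assumption.

```python
def split_three_ways(values):
    """Divide luminance values into three groups around the block average.

    low:  below 0.5 times the average
    mid:  from 0.5 to 1.5 times the average (boundaries included, an assumption)
    high: above 1.5 times the average
    """
    av = sum(values) / len(values)
    low = [v for v in values if v < 0.5 * av]
    mid = [v for v in values if 0.5 * av <= v <= 1.5 * av]
    high = [v for v in values if v > 1.5 * av]
    return low, mid, high


values = [10, 20, 90, 100, 110, 120, 250, 260]  # average is 120
low, mid, high = split_three_ways(values)
print(low, mid, high)  # [10, 20] [90, 100, 110, 120] [250, 260]
```

As the example shows, the groups produced this way are generally not of equal size.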
Figure 3 illustrates by means of a flow diagram the different steps of an exemplary method in accordance with the invention. In a first step 31 the information on the luminance values of the pixels within block B(X) is obtained. This is followed by a step 32 in which a division criterion is defined, e.g. the average pixel value, or, if the division criterion is already known, in which a value for the division criterion, e.g. the average luminance value, is calculated. This step is followed by a step 33 in which the groups within the block B(X) are defined, i.e. it is calculated which pixels belong to which group by comparing the information of the pixel to the division criterion, in the example for instance by deciding whether a pixel belongs to the one or the other group on the basis of the sign of the difference between the luminance value of the pixel and the average luminance value. This step is followed by a step 34 in which the motion vector is determined for each of the groups (in the example for the two groups). For the determination of the motion vector any known determination method may be used within the present invention. Good results (depending on the application) can be achieved with a full search, a three-step search, a one-at-a-time search, or a 3-D Recursive Search block matcher. The step in which the motion vectors are determined is optionally followed by a step 35 in which the difference between the motion vectors is compared to a threshold and, if the difference is less than the threshold, an average motion vector is used for both groups. Finally, in step 36, the motion vectors are assigned to block B(X) such that a vector field results with two motion vectors per block, and the vector of its group is applied to each pixel (Da to group Ga, Db to group Gb).
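The sequence of steps 31 to 36 can be outlined in code as follows. This is a sketch only: the function `process_block`, the representation of a block as a dictionary of positions, the sum of absolute component differences used in step 35 and the stand-in estimator are all assumptions. Step 34 is delegated to a caller-supplied function, since any known estimation method may be used there.

```python
def process_block(luma, estimate_vector, merge_threshold=None):
    """Run steps 32 to 36 on one block.

    luma: dict mapping (x, y) -> luminance for the pixels of B(X) (step 31).
    estimate_vector: callable taking a list of pixel positions and returning
        a (dx, dy) motion vector (step 34; any estimator may be plugged in).
    merge_threshold: if given, enables the optional step 35.
    Returns a dict mapping each pixel position to its motion vector.
    """
    # Step 32: calculate the division criterion (average luminance).
    av = sum(luma.values()) / len(luma)
    # Step 33: compare each pixel with the criterion to form the groups.
    ga = [p for p, v in luma.items() if v > av]
    gb = [p for p, v in luma.items() if v <= av]
    # Step 34: determine a motion vector for each group.
    da, db = estimate_vector(ga), estimate_vector(gb)
    # Step 35 (optional): use an average vector if the two are nearly equal.
    if merge_threshold is not None:
        diff = abs(da[0] - db[0]) + abs(da[1] - db[1])
        if diff < merge_threshold:
            da = db = ((da[0] + db[0]) / 2, (da[1] + db[1]) / 2)
    # Step 36: assign Da to the pixels of Ga and Db to the pixels of Gb.
    field = {p: da for p in ga}
    field.update({p: db for p in gb})
    return field


def stub_estimator(pixels):
    """Hypothetical stand-in for a real motion estimator (step 34)."""
    return (1, 0) if (0, 0) in pixels else (0, 0)


luma = {(0, 0): 200, (1, 0): 20, (0, 1): 210, (1, 1): 30}
separate = process_block(luma, stub_estimator)
merged = process_block(luma, stub_estimator, merge_threshold=2)
```

Without step 35 the bright pixels receive (1, 0) and the dark pixels (0, 0); with a threshold of 2 the two vectors are considered equal within the error margin and every pixel receives their average.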
The invention also relates to a display device comprising a determinator for determining motion vectors for blocks or objects of an image taken from an image sequence, characterized in that the determinator comprises a divider to divide the pixels of a block or object in two or more groups in accordance with a comparison of a division criterion with the information of the pixels, the determinator subsequently determines a motion vector for each group within the block or object, and the determinator comprises an assignator to assign the respective motion vectors to the block for application to the pixels of the respective groups within the block.
The invention further relates to a computer program product comprising software code portions for determining motion vectors for blocks or objects of an image taken from an image sequence in accordance with the method of the invention in its broadest sense, as well as in any of the embodiments, in particular the preferred embodiment.
Within the concept of the invention "determinator", "divider" and "assignator" are to be broadly understood and to comprise e.g. any piece of hardware (such as a determinator, divider or assignator), any circuit or sub-circuit designed for performing a determination, division or assignment as described, as well as any piece of software (computer program or sub-program or set of computer programs, or program code(s)) designed or programmed to perform a determination, division or assignment, as well as any combination of pieces of hardware and software acting as such, alone or in combination, without being restricted to the exemplary embodiments given here. It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove. The invention resides in each and every novel characteristic feature and each and every combination of characteristic features. Reference numerals in the claims do not limit their protective scope. Use of the verb "to comprise" and its conjugations does not exclude the presence of elements other than those stated in the claims. Use of the article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
The present invention has been described in terms of specific embodiments, which are illustrative of the invention and not to be construed as limiting. The invention may be implemented in hardware, firmware or software, or in a combination of them. Other embodiments are within the scope of the following claims.

Claims

1. A method for determining motion vectors from image data for blocks or objects of an image taken from an image sequence, characterized in that for a block B(X) or object the pixels are divided (33) in two or more groups (Ga, Gb) in accordance with a comparison of a division criterion (Av(X,n)) with the information F(x,n) of the pixels, and for each group (Ga, Gb) within the block B(X) or object a motion vector (Da, Db) is determined (34), and the respective motion vectors (Da, Db) are assigned to the block (B(X)) and applied to the pixels of the respective groups (Ga, Gb) within the block.
2. A method as claimed in claim 1, characterized in that the number of groups per block is equal to or less than four.
3. A method as claimed in claim 1, characterized in that the number of groups per block is two.
4. A method as claimed in claim 1, characterized in that the division criterion (Av(X,n)) is determined based on the information content (F(x,n)) of the pixels within the block B(X).
5. A method as claimed in claim 4, characterized in that the division criterion is the average luminance value of the pixels within the group.
6. A method as claimed in claim 4, characterized in that the division criterion is the median luminance value of the pixels within the group.
7. A method as claimed in claim 1, characterized in that the difference between motion vectors determined for different groups is compared to a threshold value, and, if the difference is less than the threshold value, the respective motion vectors are replaced by a combination of said motion vectors.
8. A display device comprising a determinator for determining motion vectors for blocks or objects of an image taken from an image sequence, characterized in that the determinator comprises a divider to divide a block or object the pixels in two or more groups in accordance with a comparison of a division criterion with the information of the pixels, the determinator determines subsequently for each group within the block or object a motion vector, and the determinator comprises an assignator to assign the respective motion vectors to the block for application to the pixels of the respective groups within the block.
9. A computer program product directly loadable into the internal memory of a digital computer, comprising software code portions for performing the steps of claim 1 when said product is run on a computer.
EP04719556A 2003-03-14 2004-03-11 Method for motion vector determination Withdrawn EP1606952A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP04719556A EP1606952A1 (en) 2003-03-14 2004-03-11 Method for motion vector determination

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP03100649 2003-03-14
EP03100649 2003-03-14
PCT/IB2004/050233 WO2004082294A1 (en) 2003-03-14 2004-03-11 Method for motion vector determination
EP04719556A EP1606952A1 (en) 2003-03-14 2004-03-11 Method for motion vector determination

Publications (1)

Publication Number Publication Date
EP1606952A1 true EP1606952A1 (en) 2005-12-21

Family

ID=32981927

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04719556A Withdrawn EP1606952A1 (en) 2003-03-14 2004-03-11 Method for motion vector determination

Country Status (5)

Country Link
EP (1) EP1606952A1 (en)
JP (1) JP2006521740A (en)
KR (1) KR20050108397A (en)
CN (1) CN1762160A (en)
WO (1) WO2004082294A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0500174D0 (en) 2005-01-06 2005-02-16 Kokaram Anil Method for estimating motion and occlusion
JP2012129791A (en) * 2010-12-15 2012-07-05 Hitachi Kokusai Electric Inc Image encoder
CN102215417A (en) * 2011-05-04 2011-10-12 山东大学 Parallax prediction method capable of establishing mathematical model based on block matching
US9510018B2 (en) 2011-11-23 2016-11-29 Luca Rossato Signal analysis and generation of transient information
KR101939628B1 (en) 2012-05-30 2019-01-17 삼성전자주식회사 Method of detecting motion and motion detector
CN104427345B (en) * 2013-09-11 2019-01-08 华为技术有限公司 Acquisition methods, acquisition device, Video Codec and its method of motion vector
US10554965B2 (en) 2014-08-18 2020-02-04 Google Llc Motion-compensated partitioning
JP6294810B2 (en) * 2014-11-11 2018-03-14 日本電信電話株式会社 Moving picture encoding apparatus, moving picture decoding apparatus, and computer program
US10462457B2 (en) 2016-01-29 2019-10-29 Google Llc Dynamic reference motion vector coding mode
US10397600B1 (en) 2016-01-29 2019-08-27 Google Llc Dynamic reference motion vector coding mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69423166T2 (en) * 1993-09-08 2000-07-06 Thomson Consumer Electronics Method and device for motion evaluation with block matching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2004082294A1 *

Also Published As

Publication number Publication date
WO2004082294A1 (en) 2004-09-23
JP2006521740A (en) 2006-09-21
CN1762160A (en) 2006-04-19
KR20050108397A (en) 2005-11-16

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20051014

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20061030