GB2487777A - Estimating motion in a sequence of digital images - Google Patents

Estimating motion in a sequence of digital images

Publication number: GB2487777A
Authority
GB
United Kingdom
Prior art keywords
image
motion estimation
portions
reference image
image portions
Prior art date
Legal status: Granted
Application number
GB1101942.9A
Other versions
GB201101942D0 (en)
GB2487777B (en)
Inventor
Guillaume Laroche
Patrice Onno
Current Assignee: Canon Inc
Original Assignee: Canon Inc
Application filed by Canon Inc
Priority to GB201101942A
Publication of GB201101942D0
Publication of GB2487777A
Application granted
Publication of GB2487777B
Status: Expired - Fee Related


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/567 Motion estimation based on rate distortion criteria
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one

Abstract

Motion estimation in a sequence of digital images uses a first reference image 202 and a second reference image 208, for a current image portion 214, 215, 216 of a current image 201 of the sequence of digital images, the motion estimation being applied to the first reference image using a first set of image portions (301). The first reference image 202 is a first reconstruction of at least a first image in the sequence of digital images, and the second reference image 208 is a second reconstruction of said at least a first image in the sequence of digital images. The motion estimation is applied to the second reference image 208 using a subset of image portions, the subset of image portions being selected from among a second set of image portions, the image portions of the first set of image portions (301) and of the second set of image portions being at the same position in the first reference image 202 and in the second reference image 208 respectively. The subset of image portions may be selected on the basis of a first set of values representing prediction costs of the current image portion, the first set of values being determined for each image portion of the first set of image portions. Image portions are determined as belonging to the subset of image portions when the value associated with the image portion is lower than the minimum value of the first set of values.

Description

METHOD AND DEVICE FOR MOTION ESTIMATION IN A SEQUENCE OF
IMAGES
The present invention concerns a method and a device for motion estimation in a sequence of digital images.
More particularly, it concerns a method and a device for motion estimation in a sequence of digital images using at least a first reference image and a second reference image.
To compress an image, coding methods use spatial and temporal predictions.
According to compression standards such as the H.264/AVC standard, each image of a video sequence is divided into slices, each slice is divided into macroblocks and each macroblock is divided into blocks.
Spatial prediction enables an image block (Intra block) to be coded predictively on the basis of neighboring image blocks.
Temporal prediction enables an image block (Inter block) to be coded predictively on the basis of at least a reference image.
Thus, during video compression, each block of an image (the current image) of a sequence of digital images (or video sequence) being compressed is predicted either spatially by an "Intra" predictor block, or temporally by an "Inter" predictor block.
A coder compressing the video selects a predictor block for predicting a current block of a current image from among Intra and Inter predictor blocks.
Next, a residual block is derived from the predictor block. This residual block is then transformed for example by means of a discrete cosine transform (DCT), and then quantized.
The coefficients of the quantized transformed residual block are then coded and inserted into a compressed data stream.
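The coding pipeline described above can be sketched as follows. This is a minimal illustration with a hypothetical scalar quantizer of step q and no actual DCT or entropy coding; the function names are assumptions for illustration, not the H.264 implementation:

```python
# Sketch of predictive coding of one (flattened) block:
# residual = current - predictor, then quantize / inverse quantize.

def residual(current, predictor):
    # difference between the current block and its predictor block
    return [c - p for c, p in zip(current, predictor)]

def quantize(res, q):
    # round-to-nearest scalar quantization with step q
    return [int(round(r / q)) for r in res]

def dequantize(levels, q, offset=0):
    # inverse quantization; a non-zero 'offset' shifts reconstructed
    # values towards the centre of the quantization interval
    return [0 if l == 0 else l * q + (offset if l > 0 else -offset)
            for l in levels]

cur = [14, 16, 14, 8]
pred = [10, 16, 12, 10]
res = residual(cur, pred)                                  # [4, 0, 2, -2]
lev = quantize(res, 3)                                     # [1, 0, 1, -1]
rec = [p + r for p, r in zip(pred, dequantize(lev, 3))]    # [13, 16, 15, 7]
```

The reconstructed block `rec` differs from `cur` because quantization is lossy; the levels `lev` are what a real coder would entropy-code into the stream.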
In temporal prediction, an estimation of motion between a current block of a current image of the sequence of images (or video sequence) and reference images is implemented in order to identify, in one of the reference images, a block to be used as the predictor block of the current block.
The predictor block is subtracted from the current block so as to obtain the residual block. This is called "motion compensation" in conventional compression algorithms.
Motion information is coded and inserted into the compressed data stream.
This motion information contains a motion vector, indicating the position of the predictor block in the reference image relative to the position of the current block, and an image index indexing the reference image among the reference images.
Motion estimation is applied to several reference images. The best predictor block, for the motion compensation, is selected in one of the several reference images. Thus, the selected predictor block is the block, among all blocks of a reference image, that is most strongly correlated with the current block.
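The search for the best-correlated block can be illustrated with a small full-search sketch using the Sum of Absolute Differences as the matching measure. The helper names and the toy images are assumptions for illustration, not the patented method:

```python
# Full-search motion estimation over one reference image (sketch).

def sad(block_a, block_b):
    # Sum of Absolute Differences between two blocks (lists of rows)
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(img, x, y, size):
    # extract a size x size block whose upper-left pixel is (x, y)
    return [row[x:x + size] for row in img[y:y + size]]

def full_search(cur_block, ref, cx, cy, size, radius):
    # test every displacement (dx, dy) within the search radius around
    # the co-located position (cx, cy); keep the minimum-SAD position
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if 0 <= x <= len(ref[0]) - size and 0 <= y <= len(ref) - size:
                cost = sad(cur_block, get_block(ref, x, y, size))
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best  # (minimum SAD, motion vector)

ref = [[x + 10 * y for x in range(8)] for y in range(8)]  # toy gradient image
cur = get_block(ref, 3, 2, 2)           # current block copied from (3, 2)
cost, mv = full_search(cur, ref, 2, 2, 2, 2)   # finds cost 0 at mv (1, 0)
```

Because the toy image is a strict gradient, only the true position matches exactly, so the search returns a zero cost and the displacement (1, 0).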
Reference images used in temporal prediction consist of images in the video sequence that have already been coded and then decoded in order to be reconstructed.
Figure 1 shows a current image 101 of a video sequence being coded and several reference images 102, 103, 104, 105, 106, 107. In the example illustrated, three reference images 102, 103, 104 are used in the Inter prediction of blocks of the current image 101.
For each block of the current image 101, a predictor block (Inter predictor block) is selected.
In the current image 101, only three blocks 108, 109, 110 have been represented.
For the block 108 of the current image 101, a first predictor block 111 belonging to a second reference image 103 is selected. Thus, the first block 108 in the current image 101 is predicted by using the first predictor block 111 of the second reference image 103.
A second block 109 in the current image 101 is predicted by using a second predictor block 112 of a first reference image 102. A third block 110 in the current image 101 is predicted by using a third predictor block 113 of a third reference image 104.
For the three predicted blocks 108, 109, 110 of the current image 101, a motion vector 114, 115, 116 and a reference image index 102, 103, 104 are coded and inserted into the compressed data stream.
Patent US 7,321,626 describes a method for using a global motion predictor to reduce computations associated with predictive motion estimation for the compression of a video sequence.
Global motion parameters are defined with respect to a first reference frame and are used for a second reference frame.
Thus, motion estimation is applied for each reference frame using the global motion parameters defined with respect to the first reference frame.
Even if the computational complexity is reduced with respect to conventional video coding, for example in the context of the H.264 standard, it is necessary to apply motion estimation for both reference frames, and computational complexity is still high.
The present invention is directed to mitigating the aforesaid limitations and to providing a method for motion estimation in a sequence of digital images and a device associated with that method, making it possible to improve the reduction of the computational complexity of the motion estimation when multiple reference images are used for the prediction of a current image.
To that end, according to a first aspect, the present invention concerns a method for motion estimation in a sequence of digital images using at least a first reference image and a second reference image, for a current image portion of a current image of the sequence of digital images, the motion estimation being applied to the first reference image using a first set of image portions.
According to the invention, the first reference image is a first reconstruction of at least a first image in the sequence of digital images, and the second reference image is a second reconstruction of said at least a first image in the sequence of digital images, and the motion estimation is applied to the second reference image using a subset of image portions, the subset of image portions being selected from among a second set of image portions, the image portions of the first set of image portions and of the second set of image portions being at the same position in the first reference image and in the second reference image respectively.
Thus, the computational complexity is greatly reduced since motion estimation applied to the second reference image is performed only on a limited number of image portions.
Therefore, the processing time related to motion estimation in the second reference image is drastically reduced, and the coding time needed for the motion estimation is reduced.
In practice, said first and second reference images are produced from a quantized version of said at least a first image and said first and second reconstruction images differ through different inverse quantizations.
According to a feature, the subset of image portions is selected on the basis of a first set of values representing a prediction cost of said current image portion, the first set of values being determined for each image portion of the set of image portions of said first reference image.
Therefore, the prediction cost of said current image portion using each image portion of the set of image portions of the first reference image is taken into account when searching for the best motion vector for the current image in the second reference image.
In practice, selecting the subset of image portions comprises:
- a step of selecting the minimum value from among said first set of values;
- a step of comparing each value of a second set of values with the minimum value selected, said second set of values being determined on the basis of the first set of values; and
- a step of determining the image portions that belong to the subset of image portions on the basis of the result of the comparison.
Thus, the number of image portions in the subset of image portions is at most the number of image portions of the first set of image portions.
According to one embodiment, in the determining step, an image portion is determined as belonging to the subset of image portions when the value associated with the image portion is lower than the selected value.
For example, the set of values representing a prediction cost of the current image portion corresponds to a set of rate-distortion costs.
In another example, the set of values representing a prediction cost of said current image portion corresponds to a set of distortion values.
According to a second aspect, the present invention concerns a device for motion estimation in a sequence of digital images using at least a first reference image and a second reference image, for a current image portion of a current image of the sequence of digital images, the motion estimation being applied to the first reference image using a first set of image portions.
According to the invention, the device is adapted to reconstruct the first reference image and the second reference image, the first reference image being a first reconstruction of at least a first image in the sequence of digital images, and the second reference image being a second reconstruction of said at least a first image in said sequence of digital images, and to apply the motion estimation to the second reference image using a subset of image portions, the subset of image portions being selected from among a second set of image portions, the image portions of the first set of image portions and of the second set of image portions being at the same position in the first reference image and in the second reference image respectively.
This motion estimation device has features and advantages that are similar to those described above in relation to the motion estimation method.
According to a feature, the device for motion estimation is adapted to produce the first and second reference images from a quantized version of said at least one first image, and the first and second reference images differ through different inverse quantizations.
According to another feature, the device comprises means for selecting the subset of image portions on the basis of a first set of values representing prediction costs of said current image portion, the first set of values being determined for each image portion of the set of image portions of the first reference image.
According to a feature, the means for selecting the subset of image portions comprises:
- means for selecting the minimum value from among the first set of values;
- means for comparing each value of a second set of values with the minimum value selected, said second set of values being determined on the basis of the first set of values; and
- means for determining the image portions that belong to the subset of image portions on the basis of the result of the comparison.
According to another feature, said means for determining are adapted to determine the image portions as belonging to the subset of image portions when the value associated with the image portion is lower than the selected value.
According to a third aspect, the present invention concerns an information storage means which can be read by a computer or a microprocessor holding instructions of a computer program, adapted to implement a motion estimation method in accordance with the invention, when said information is read by said computer or said microprocessor.
In a particular embodiment, this storage means is partially or totally removable.
According to a fourth aspect, the present invention concerns a computer program product able to be loaded into a programmable apparatus, comprising sequences of instructions for implementing a motion estimation method in accordance with the invention, when said computer program product is loaded and executed by said programmable apparatus. Such a computer program may be transitory or non-transitory. In an implementation, the computer program can be stored on a non-transitory computer-readable carrier medium.
The advantages and particular features of the information storage means and of the computer program products are similar to those of the motion estimation method that they implement.
Still other particularities and advantages of the invention will appear in the following description, made with reference to the accompanying drawings, which are given by way of non-limiting example, and in which:
- Figure 1 represents the principle of motion compensation of a video coder of the prior art;
- Figure 2 represents the principle of motion compensation of a video coder including multiple reconstruction images of images in a list of reference images, in accordance with the invention;
- Figure 3 illustrates an image region comprising blocks tested in the motion estimation method in accordance with the invention;
- Figure 4 is a flow chart representing an embodiment of a method of motion estimation in accordance with the invention; and
- Figure 5 represents a particular hardware configuration of a device suitable for implementation of the method according to the invention.
As described above, in order to identify a predictor image portion of a current image portion of a current image, an estimation of motion between the current image portion and reference images is implemented.
In this example, an image portion corresponds to a block.
According to the standard H.264/AVC, each image of a video sequence is divided into slices. Each slice is divided into macroblocks, and each macroblock is divided into blocks.
In the H.264 standard, the macroblock is the coding unit.
In the H.264 standard, a macroblock is a block having a size of 16 x 16 pixels, and a block may have different sizes, for example 4 x 4, 4 x 8, 8 x 4, 8 x 8, 8 x 16 or 16 x 8 pixels.
It should be noted that a block may also have different shapes, for example, a square shape, a rectangular shape or other shapes.
The method for motion estimation in a sequence of digital images in accordance with the invention is used in particular in the context of video compression. However, the method can be used in other processing operations, for example in the estimation of motion during sequence analysis.
With reference to Figure 2, a description will first of all be given of the principle of motion compensation in a sequence of digital images in accordance with the invention.
Figure 2 shows a current image 201 being coded and a group of reference images 200.
This group of reference images 200 comprises a first sub-group of reference images 202, 203, 204, 205, 206, 207 consisting of images in the video sequence that have already been coded and then decoded in order to be reconstructed. These images are images used for generating the decoded video sequence that is output by a decoder, for example, in order to be displayed.
The group of reference images 200 also comprises a second sub-group of reference images 208, 209, 210, 211, 212, 213 consisting of images resulting from other decoding operations of the same images in the video sequence that have already been coded and then decoded. These images 208, 209, 210, 211, 212, 213 are reconstructed by using parameters different from those used for conventional reconstructions (or decoding operations) used for generating the decoded video sequence. These reconstructions (or decoding operations) are called "second reconstructions".
Therefore, in the example illustrated in Figure 2, a first image 202 of the first sub-group of reference images is a first reconstruction of one first image in the sequence of images (not illustrated) to be coded.
Images referenced 208 and 211 (reference images in the second sub-group of images) correspond respectively to a first and a second "second reconstruction" of the first image in the sequence of images.
A second reference image 203 in the first sub-group of reference images corresponds to a first reconstruction of a second image (not illustrated) in the video sequence. Images referenced 209 and 212 (reference images in the second sub-group of images) correspond respectively to a first and a second "second reconstruction" of the second image in the image sequence.
A third reference image 204 in the first sub-group of reference images corresponds to a first reconstruction of a third image (not illustrated) in the video sequence. Images referenced 210 and 213 (reference images in the second sub-group of images) correspond respectively to a first and a second "second reconstruction" of the third image in the image sequence.
In this Figure, image 208 corresponding to a second reconstruction of the first reference image contains a predictor block 218 for a first block 214 of the current image 201; image 202 corresponding to a first reconstruction of the first reference image contains a predictor block 217 for a second block 215 of the current image 201; and image 212 corresponding to a second reconstruction of a second reference image contains a predictor block 219 for a third block 216 of the current image 201.
A first reconstruction of a reference image is obtained by applying a first inverse quantization to an image.
A second reconstruction of a reference image is obtained by applying a second inverse quantization different from the first one.
This means that each block of the reference image, in particular of the reference image closest in time to the current image, is inverse quantized with parameters at least one of which differs from the corresponding parameter used in the first inverse quantization. These parameters are, for example, the number of transformed coefficients and the value of the inverse quantization offset. These parameters are sent to the decoder for each reconstruction.
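A toy illustration of two reconstructions of the same quantized coefficients that differ only in their inverse quantization offset; the function name and the sign-dependent offset rule are assumptions for illustration, not the standard's exact dequantizer:

```python
# Two reconstructions from one set of quantized levels (sketch).

def inverse_quantize(levels, q, offset):
    # a non-zero offset shifts each non-zero reconstructed value
    # towards the centre of its quantization interval
    return [0 if l == 0 else l * q + (offset if l > 0 else -offset)
            for l in levels]

levels = [3, 0, -2, 1]
first = inverse_quantize(levels, 4, 0)    # conventional reconstruction: [12, 0, -8, 4]
second = inverse_quantize(levels, 4, 1)   # "second reconstruction":     [13, 0, -9, 5]
```

Both reconstructions come from the same coded data; only the decoder-side parameter differs, which is why each second reconstruction can serve as an extra reference image at no extra rate for the residual data.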
It should be noted that a coefficient is an element of a residual block transformed for example by the transform of the H.264 standard, and then quantized. In the case of coding without transformation, a coefficient is a pixel of a quantized residual.
Thus, according to the H.264 standard, the residual block of the current pixel block is transformed, for example by applying a DCT, and then quantized. The quantization is applied to each of the coefficients of the residual block (there are as many coefficients as there are pixels in the block). The coefficient matrix is scanned in a particular order, making it possible to obtain an index number for each coefficient.
Therefore, the number of a transformed coefficient corresponds to an index number representing its position when scanning a matrix of transformed coefficients of the block. For example, the scan used is a zigzag scan.
The inverse quantization offset or reconstruction offset is a value that makes it possible to center the quantization interval.
The reference image corresponding to the first reconstruction of an image of the image sequence is named "first reference image" and the reference image corresponding to the second reconstruction of that image of the image sequence is named "second reference image". A description of the reconstruction of a second reference image will now be given.
In this example, a second reference image (208 in Figure 2) is generated in the inverse quantization interval. Each second reconstruction, or second reference image, of the first reference image (previously coded and decoded image) is defined by the index c of the coefficient in the transformed matrix (in this example, the DCT matrix) and its inverse quantization offset θ.
A second reference image is generated by the addition of the same offset block to all blocks of the first reference image.
Thus, a block at the position k in the second reference image may be obtained according to the following equation:

b'_k = b_k + b_{c,θ} (1)

where b'_k is the block at the position k in the second reference image ref'0 (208 in Figure 2), b_k is the block at the position k in the first reference image ref0, and b_{c,θ} is the offset block obtained in the transform domain with the coefficient index c and the inverse quantization offset θ.
In other embodiments, a second reference image is obtained by adding a pixel offset o to each pixel of the image. This offset is the same whatever the pixel position in the image.
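This pixel-offset variant can be sketched in a few lines; the function name is illustrative, not the patent's terminology:

```python
# Second reference image obtained by adding the same pixel offset o
# to every pixel of the first reference image (sketch).

def second_reference(ref, o):
    # o is identical whatever the pixel position in the image
    return [[p + o for p in row] for row in ref]

ref0 = [[10, 20], [30, 40]]
ref0_prime = second_reference(ref0, 2)    # [[12, 22], [32, 42]]
```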
Information on the number of second reconstructions and the associated parameters is inserted in the coded data stream for the purpose of informing the decoder of the number of second reconstructions and the parameters to use for these reconstructions.
In this context, the method of motion estimation according to the invention is implemented.
As described above, in temporal prediction, an estimation of motion between a current block of a current image and reference images is implemented in order to identify, in one of the reference images, a block to be used as the predictor block of the current block.
Motion information contains a motion vector, indicating the position of the predictor block in the reference image relative to the position of the current block, and an image index indexing the reference image among the reference images.
Motion estimation is applied in reference images. The best predictor block, for the motion compensation, is selected in one of the reference images.
Thus, the selected predictor block is the block, among all blocks of a reference image, that is most strongly correlated with the current block.
A set of blocks of each reference image is tested in order to find the predictor block (best predictor block) for a current block 214, 215, 216 of the current image 201. This set of blocks is defined by a search region 301 (Figure 3) in the reference image.
In the example of Figure 3, the search region has a size of 8x8 pixels. Each square 302 represents a pixel, and each block has a size of 2x2 pixels.
In Figure 3, the search region 301 is centered at the origin of a Cartesian coordinate system. The block 303 situated at the origin has exactly the same position in a reference image as the current block 214, 215, 216 in the current image 201.
If, for example, a predictor block for the first current block 214 of the current image 201 is searched for, the block 303 is situated at the same position in each of reference images 202 to 213 as the first current block 214 in the current image 201.
In the described embodiment, all blocks or block positions are tested in order to find the (best) predictor block (full motion estimation).
In other embodiments, a restricted number of predictor block positions are tested.
The motion vector gives the movement to reach a particular block relative to the center, or origin, of the search region.
For example, the block 305 in the search region 301 illustrated in Figure 3 has a motion vector (mv_x, mv_y) equal to (-1, 3) (this means that its X component mv_x is equal to -1 and its Y component mv_y is equal to 3), and the block 304 has a motion vector (mv_x, mv_y) equal to (-3, 3).
In this example, the motion vector (mv_x, mv_y) points to the pixel situated at the upper left corner of the block.
With reference to Figure 4, a description is now given of the method for motion estimation according to an embodiment of the invention.
At a selecting step 401, the search region 301 is defined. The definition of the search region 301 is carried out as described above, for example by selecting an area centered on a block located at the same spatial position (co-located block) as the current block to encode. At this selecting step 401, motion estimation parameters are selected in order to obtain a set of blocks or a set of block positions in the first reference image 202 to be tested.
A list of blocks containing the blocks of the set of blocks to be tested in the first reference image 202 is obtained in an obtaining step 402.
At a motion estimation step 403, a value representing a prediction cost of a current block is determined for each block in the list of blocks obtained at the obtaining step 402. Thus, a set of values is determined for an image.
In this example, at the motion estimation step 403, a Rate-Distortion cost (RD cost) is computed for each block in the list of blocks or block positions obtained at the obtaining step 402. All RD costs computed are stored in memory.
For a block, an RD cost is computed according to the following formula:

J(mv) = D(mv) + λ × R(mv) (2)

where J(mv) is the RD cost for the motion vector mv, D(mv) is its distortion, R(mv) is its related rate and λ is the Lagrange parameter, which depends on the quantization parameter (QP).
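As a numeric illustration of the RD cost formula, a sketch with hypothetical distortion, rate and λ values (not taken from any real coder):

```python
# RD cost: distortion plus Lagrange-weighted rate (sketch).

def rd_cost(distortion, rate_bits, lam):
    # J = D + lambda * R
    return distortion + lam * rate_bits

j_a = rd_cost(100, 12, 4.0)   # low distortion, expensive motion vector: 148.0
j_b = rd_cost(120, 4, 4.0)    # higher distortion, cheap motion vector:  136.0
best = min(j_a, j_b)          # the cheaper-rate candidate wins overall
```

This shows why rate-distortion selection can prefer a slightly worse match whose motion information costs fewer bits.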
The distortion D can be the Sum of Absolute Differences (SAD) between each pixel of the current block and the corresponding pixel in a block of the reference image.
Thus, the distortion D of a block having coordinates (x, y) in a reference image ref, pointed to by the motion vector mv_k, is given by the equation below:

D_{mv_k}^{ref} = Σ_{i=n..n+S_x−1} Σ_{j=m..m+S_y−1} | org_{i,j} − ref_{i+mv_x, j+mv_y} | (3)

where org_{i,j} is the pixel at the position (i, j) in the current image, (n, m) is the relative position of the current block in the reference image ref, ref_{x,y} is the pixel at the position (x, y) of the reference image ref, and (S_x, S_y) is the size in pixels of the blocks.
In this equation, i varies from n to n + S_x − 1 and j varies from m to m + S_y − 1.
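A direct transcription of this distortion into code might look as follows; org and ref are toy images and the argument names are assumptions for illustration:

```python
# SAD distortion of the block at (n, m) displaced by (mv_x, mv_y) (sketch).

def distortion(org, ref, n, m, mv, size):
    mv_x, mv_y = mv
    s_x, s_y = size
    # images are indexed [row][column], i.e. org[j][i] is pixel (i, j)
    return sum(abs(org[j][i] - ref[j + mv_y][i + mv_x])
               for j in range(m, m + s_y)
               for i in range(n, n + s_x))

org = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ref = [[v + 1 for v in row] for row in org]      # reference = original + 1
d = distortion(org, ref, 1, 1, (0, 0), (2, 2))   # 4 pixels, each off by 1
```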
Of course, other kinds of distortion measures may be used.
In the example described (full motion estimation), the RD cost of each block of the search region or of the set of blocks 301 is computed.
At a storage step 404, RD costs calculated for each block or block positions in the list of blocks obtained at the obtaining step 402 are stored in memory.
Once RD costs have been computed for all block positions of the set of blocks, the (best) predictor block (or the (best) predictor block position) is selected at a selecting step 405.
In this example, the predictor block corresponds to the predictor block of the reference image ref0 (202 in Figure 2) having the minimum RD cost J_min^{ref0}.
At a second selecting step 407, a subset of blocks of a second reference image ref'0 (208 in Figure 2) is selected. This subset of blocks is selected from among a second set of blocks, the blocks of the first set of blocks and of the second set of blocks being at the same position in the first reference image and the second reference image respectively.
The selection of the subset of blocks is made on the basis of a first set of values representing a prediction cost of a current block, the set of values being determined for each block of the set of blocks of the first reference image.
In this example, the selection of the subset of blocks is made on the basis of a set of RD costs calculated for each block of the set of blocks of the first reference image refO. Indeed, the set of values consists of RD costs.
It should be noted that the distortion D^{ref'0}_{mv_k} of a block pointed to by the motion vector mv_k in the second reference image ref'0 (208 in Figure 2) is given by the following formula:

D^{ref'0}_{mv_k} = \sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| org_{i,j} - ref'0_{i+mv_k^x,\, j+mv_k^y} \right| \qquad (4)

For example, if a 4x4 transform is used, the block offset b added to each block of the first reference image ref0 to obtain the second reference image ref'0 has a size of 4x4.
Thus, equation (4) above can be rewritten as follows:

D^{ref'0}_{mv_k} = \sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| org_{i,j} - \left( ref0_{i+mv_k^x,\, j+mv_k^y} + b_{(i+mv_k^x)\%4,\,(j+mv_k^y)\%4} \right) \right| \qquad (5)

where the remainder operator % gives the remainder of a division.
It will be noted that in this equation (5), the distortion D^{ref'0}_{mv_k} does not depend directly on the second reference image ref'0, but only on the block offset b added to ref0 to obtain ref'0.
By the properties of the absolute value (|a - (c + d)| >= |a - c| - |d|), we can write:

D^{ref'0}_{mv_k} \ge \sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| org_{i,j} - ref0_{i+mv_k^x,\, j+mv_k^y} \right| - \sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| b_{(i+mv_k^x)\%4,\,(j+mv_k^y)\%4} \right| \qquad (6)

This formula (6) can be rewritten as follows:

D^{ref0}_{mv_k} - \sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| b_{(i+mv_k^x)\%4,\,(j+mv_k^y)\%4} \right| \le D^{ref'0}_{mv_k} \qquad (7)

This equation means that the minimum bound of D^{ref'0}_{mv_k} is D^{ref0}_{mv_k} minus the sum over the block of the absolute offset values. That sum can be written as follows:

\sum_{i=n}^{n+S_x-1} \sum_{j=m}^{m+S_y-1} \left| b_{(i+mv_k^x)\%4,\,(j+mv_k^y)\%4} \right| = \frac{S_x}{4} \times \frac{S_y}{4} \times \sum_{i=0}^{3} \sum_{j=0}^{3} \left| b_{i,j} \right| = \frac{S_x \times S_y}{16} \times B \qquad (8)

where B = \sum_{i=0}^{3} \sum_{j=0}^{3} |b_{i,j}|. Generally, in a video coder, the minimum block size depends on the minimum transform size.
Moreover, the term B is constant for all blocks. Thus, it is computed only once per reference image.
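As a sketch of this precomputation (the names are illustrative; B is simply the sum of |b_ij| over the 4x4 block offset, computed once per reference image):

```python
import numpy as np

def distortion_lower_bound(d_ref0, block_offset, block_size):
    """Minimum bound of equation (7): D_ref0 - (Sx*Sy/16) * B, where
    B is the sum of |b_ij| over the 4x4 block offset."""
    b_sum = int(np.abs(np.asarray(block_offset)).sum())   # B, precomputed once
    sx, sy = block_size
    return d_ref0 - (sx * sy / 16.0) * b_sum

offset = np.full((4, 4), 2)        # hypothetical block offset with b_ij = 2
print(distortion_lower_bound(1000, offset, (8, 8)))   # 1000 - 4 * 32 = 872.0
```

In practice B would be computed when the second reconstruction is built and reused for every candidate block.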
As described above, in other embodiments, a second reference image is obtained by adding the same pixel offset o to each pixel of the image.
That is to say, b_{i,j} = o for any value of (i,j) in the block offset.
In this embodiment, the maximum bound of the offset term may be

\frac{S_x \times S_y}{16} \times (16 \times o) = o \times S_x \times S_y

In the embodiment described, equation (7) becomes:

D^{ref0}_{mv_k} - \frac{S_x \times S_y}{16} \times B \le D^{ref'0}_{mv_k} \qquad (9)

Thus, the minimum bound of the distortion is

D^{ref0}_{mv_k} - \frac{S_x \times S_y}{16} \times B

This minimum bound depends on the distortion of the first reference image ref0.
At the second selecting step 407, the subset of blocks of the second reference image ref'0 (208 in Figure 2) is selected on the basis of the result of this formula:

D^{ref0}_{mv_k} - \frac{S_x \times S_y}{16} \times B + \lambda R^{ref'0}_{mv_k} < J_{ref0} \qquad (10)

Indeed, for each block pointed to by the motion vector mv_k belonging to the second set of blocks, this equation (10) is tested.
Therefore, in this example, the selecting step 407 comprises a step of comparing a second set of values with the minimum RD cost (J_{ref0}) selected at the first selecting step 405.
The second set of values is determined on the basis of the first set of values and corresponds to a set of RD costs calculated for each block on the basis of each RD cost calculated for each block of the set of blocks of the first reference image. The second set of values also depends on the block offset added to each block of the first reference image refO to obtain the second reference image ref'O.
The selecting step 407 also comprises a step of determining the blocks that belong to the subset of blocks on the basis of the result of the comparison.
In this example, each block of the second set of blocks for which the condition of formula (10) is validated is selected as belonging to the second subset of blocks.
Thus, blocks selected as belonging to the second subset of blocks may have a lower RD cost than the minimum RD cost found during the motion estimation on the first reference image refO.
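The test of formula (10) at step 407 might be sketched as follows, with hypothetical names and a placeholder lambda:

```python
def select_subset(rd_info, j_ref0, b_sum, block_size, lam=10.0):
    """Formula (10): keep only block positions whose RD-cost lower bound
    on ref'0 could still beat the best cost J_ref0 found on ref0."""
    sx, sy = block_size
    bound_term = (sx * sy / 16.0) * b_sum
    return [mv for mv, (d_ref0, rate) in rd_info.items()
            if d_ref0 - bound_term + lam * rate < j_ref0]

# Maps candidate motion vectors to (distortion on ref0, rate proxy).
rd_info = {(0, 0): (100, 0), (1, 0): (500, 1), (-1, 0): (90, 1)}
print(select_subset(rd_info, j_ref0=110.0, b_sum=32, block_size=(4, 4)))
# -> [(0, 0), (-1, 0)]
```

Only the surviving positions need a full distortion computation against ref'0, which is the source of the speed-up claimed below.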
Therefore, a list of blocks containing the blocks in the second subset of blocks is obtained at a second obtaining step 408.
Blocks in this list of blocks are tested during a second motion estimation step 410 on the second reference image ref'0.
Selecting step 407, second obtaining step 408 and second motion estimation step 410 are repeated for each second reconstruction of the first reference image refO.
Since in motion estimation on second reconstructions only blocks belonging to the second subset of blocks are tested, the search for the (best) predictor block for a current block is simplified, and the time needed for motion estimation is reduced.
Thus, the time needed for compressing a video sequence is reduced without modifying the coding efficiency.
In another embodiment, the set of values representing a prediction cost contains values of distortion.
Thus, in this embodiment, in the selecting step 407, the comparing step consists of comparing the distortion D calculated for each block of the set of blocks with the minimum distortion of the blocks of the set of blocks in the first reference image ref0.
In this embodiment, a block of the second reference image is selected when equation (9) is validated.
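This distortion-only variant might be sketched as follows (again with illustrative names):

```python
def select_subset_distortion(d_by_mv, d_min, b_sum, block_size):
    """Distortion-only variant: keep a block position when its bound
    D_ref0 - (Sx*Sy/16) * B is below the minimum distortion D_min."""
    sx, sy = block_size
    bound_term = (sx * sy / 16.0) * b_sum
    return [mv for mv, d in d_by_mv.items() if d - bound_term < d_min]

print(select_subset_distortion({(0, 0): 100, (1, 1): 400},
                               d_min=120, b_sum=32, block_size=(4, 4)))
# -> [(0, 0)]
```

The structure is identical to the RD-cost variant, with the lambda-weighted rate term removed.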
With reference now to Figure 5, a description is given by way of example of a particular hardware configuration of a video sequence processing device suitable for implementing the method according to the invention.
An information processing device implementing the invention is for example a microcomputer 50, a workstation, a personal digital assistant, or a mobile telephone connected to various peripherals. According to yet another embodiment of the invention, the information processing device is in the form of a photographic apparatus provided with a communication interface for enabling connection to a network.
The peripherals connected to the information processing device comprise for example a digital camera 64, or a scanner or any other image acquisition or storage means, connected to an input/output card (not shown) and supplying information on multimedia data (for example of the video sequence type) to the processing device.
The device 50 comprises a communication bus 51 to which there are connected:
- a central processing unit CPU 52 in the form for example of a microprocessor;
- a read only memory 53 able to contain programs whose execution enables the method according to the invention to be implemented; this may be a flash memory or EEPROM;
- a random access memory 54 which, after the device 50 is powered up, contains the executable code of the programs of the invention necessary for implementing the invention; this random access memory 54 is of the RAM type (with random access), which offers fast access compared with the read only memory 53; this RAM memory 54 stores in particular the various images and the various blocks of pixels of video sequences as processing is carried out (transformation, quantization, storage of reference images, sets and subsets of blocks, computed RD costs, minimum RD cost, etc.);
- a screen 55 for displaying data, in particular video data, and/or serving as a graphical interface with the user, who can thus interact with the programs of the invention by means of a keyboard 56 or any other means such as a pointing device, for example a mouse 57 or an optical stylus;
- a hard disk 58 or a storage memory, such as a memory of the compact flash type, able to contain the programs of the invention as well as data used or produced during the implementation of the invention;
- an optional diskette drive 59, or another drive for a removable data carrier, adapted to receive a diskette 63 and to read/write thereon data processed or to be processed in accordance with the invention; and
- a communication interface 60 connected to the telecommunication network 61, the interface 60 being able to transmit and receive data.
In the case of audio data, the device 50 is preferably equipped with an input/output card (not shown), which is connected to a microphone 62.
The communication bus 51 enables communication and interoperability between the various elements included in the device 50 or connected thereto. The representation of the bus 51 is not limiting and in particular the central processing unit 52 is able to communicate instructions to any element of the device 50 directly or by means of another element of the device 50.
The diskettes 63 can be replaced by any information carrier such as for example a compact disk (CD-ROM), rewritable or not, a zip disk or a memory card. In general terms, an information storage means, able to be read by a microcomputer or by a microprocessor, integrated or not in the video sequence processing device (coding or decoding), and possibly removable, is adapted to store one or more programs the execution of which enables the method according to the invention to be implemented.
The executable code enabling the video sequence processing device to implement the invention can be either stored in read only memory 53, on the hard disk 58 or on a removable digital medium such as for example a diskette 63 as described previously. According to a variant, the executable code of the programs is received by means of the telecommunication network 61, via the interface 60, in order to be stored in one of the storage means of the device 50 (for example the hard disk 58) before being executed.
The central processing unit 52 controls and directs the execution of the instructions or portions of software code of the program or programs of the invention, the instructions or portions of software code being stored in one of the aforementioned storage means. When the device 50 is powered up, the program or programs stored in a non-volatile memory, for example the hard disk 58 or the read only memory 53, are transferred into the random access memory 54, which then contains the executable code of the program or programs of the invention, as well as registers for storing the variables and parameters necessary for implementing the invention.
It should also be noted that the device implementing the invention or incorporating it can also be produced in the form of a programmed apparatus.
For example, such a device can then contain the code of the computer program or programs in a fixed form in an application specific integrated circuit (ASIC).
The devices described here and in particular the central processing unit 52 are able to implement all or some of the processing operations described in relation to Figure 4, in order to implement the methods that are the subject matter of the present invention and constitute the devices that are the subject matter of the present invention.
Thus, by virtue of the invention, computational complexity is greatly reduced.
Therefore, the processing time related to motion estimation in the second reference image, and hence the overall coding time, is drastically reduced.

Claims (17)

1. Method for motion estimation in a sequence of digital images using at least a first reference image (202-207) and a second reference image (208-213), for a current image portion (214, 215, 216) of a current image (201) of the sequence of digital images, the motion estimation being applied to the first reference image (202-207) using a first set (301) of image portions, the method being characterized in that said first reference image (202-207) is a first reconstruction of at least a first image in the sequence of digital images, and said second reference image (208-213) is a second reconstruction of said at least a first image in said sequence of digital images, and said motion estimation is applied to said second reference image (208-213) using a subset of image portions, said subset of image portions being selected from among a second set of image portions, the image portions of said first set of image portions (301) and of said second set of image portions being at the same position in said first reference image (202-207) and in said second reference image (208-213) respectively.
2. Method for motion estimation according to claim 1, characterized in that said first and second reference images (208-213) are produced from a quantized version of said at least a first image, and said first (202-207) and second (208-213) reference images differ through different inverse quantizations.
3. Method for motion estimation according to one of claims 1 to 2, characterized in that it comprises selecting (407) said subset of image portions on the basis of a first set of values representing prediction costs of said current image portion (214, 215, 216), said first set of values being determined for each image portion of said first set of image portions (301) of said first reference image (202-207).
4. Method for motion estimation according to claim 3, characterized in that said selecting (407) of the subset of image portions comprises:
- a step of selecting the minimum value (Jref0) from among said first set of values;
- a step of comparing each value of a second set of values with the minimum value (Jref0) selected, said second set of values being determined on the basis of the first set of values; and
- a step of determining the image portions that belong to said subset of image portions on the basis of the result of the comparison.
5. Method for motion estimation according to claim 4, characterized in that in said determining step, the image portions are determined as belonging to said subset of image portions when the value associated with the image portion is lower than said selected value.
6. Method for motion estimation according to one of claims 3 to 5, characterized in that said set of values representing prediction costs of said current image portion (214, 215, 216) corresponds to a set of rate-distortion (RD) costs.
7. Method for motion estimation according to one of claims 3 to 5, characterized in that said set of values representing prediction costs of said current image portion (214, 215, 216) corresponds to a set of distortion (D) values.
8. Method for motion estimation according to one of claims 1 to 7, characterized in that said first subset of image portions and said second subset of image portions comprise the entirety of image portions of said first reference image (202-207) and of said second reference image (208-213) respectively.
9. Device for motion estimation in a sequence of digital images using at least a first reference image (202-207) and a second reference image (208-213), for a current image portion (214, 215, 216) of a current image of the sequence of digital images, the motion estimation being applied to the first reference image using a first set of image portions, the device being characterized in that it is adapted to reconstruct said first reference image (202-207) and said second reference image (208-213), said first reference image (202-207) being a first reconstruction of at least a first image in the sequence of digital images, and said second reference image (208-213) being a second reconstruction of said at least a first image in said sequence of digital images, and to apply said motion estimation to said second reference image (208-213) using a subset of image portions, said subset of image portions being selected from among a second set of image portions, the image portions of said first set of image portions and of said second set of image portions being at the same position in said first reference image (202-207) and in said second reference image (208-213) respectively.
10. Device for motion estimation according to claim 9, characterized in that it is adapted to produce said first and second reference images (208-213) from a quantized version of said at least one first image, and said first (202-207) and second (208-213) reference images differ through different inverse quantizations.
11. Device for motion estimation according to one of claims 9 or 10, characterized in that it comprises means (52, 53, 54, 58) for selecting said subset of image portions on the basis of a first set of values representing prediction costs of said current image portion (214, 215, 216), said first set of values being determined for each image portion of said set of image portions of said first reference image (202-207).
12. Device for motion estimation according to claim 11, characterized in that said means for selecting of the subset of image portions comprises:
- means (52, 53, 54, 58) for selecting the minimum value from among said first set of values;
- means (52, 53, 54, 58) for comparing each value of a second set of values with the minimum value selected, said second set of values being determined on the basis of the first set of values; and
- means (52, 53, 54, 58) for determining the image portions that belong to said subset of image portions on the basis of the result of the comparison.
13. Device for motion estimation according to claim 12, characterized in that said means (52, 53, 54, 58) for determining are adapted to determine the image portions as belonging to said subset of image portions when the value associated with the image portion is lower than said selected value.
14. Means for storing information which can be read by a computer or a microprocessor holding instructions of a computer program, characterized in that it is adapted to implement a method of motion estimation according to any one of claims 1 to 8, when said information is read by said computer or said microprocessor.
15. Means for storing information according to claim 14, characterized in that it is partially or totally removable.
16. Computer program product which can be loaded into a programmable apparatus, characterized in that it comprises sequences of instructions for implementing a motion estimation method according to any one of claims 1 to 8, when said computer program product is loaded into and executed by said programmable apparatus.
17. A method, device or computer program for motion estimation in a sequence of digital images substantially as hereinbefore described with reference to the accompanying drawings.
GB201101942A 2011-02-04 2011-02-04 Method and device for motion estimation in a sequence of images Expired - Fee Related GB2487777B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB201101942A GB2487777B (en) 2011-02-04 2011-02-04 Method and device for motion estimation in a sequence of images


Publications (3)

Publication Number Publication Date
GB201101942D0 GB201101942D0 (en) 2011-03-23
GB2487777A true GB2487777A (en) 2012-08-08
GB2487777B GB2487777B (en) 2015-01-07

Family

ID=43836212

Family Applications (1)

Application Number Title Priority Date Filing Date
GB201101942A Expired - Fee Related GB2487777B (en) 2011-02-04 2011-02-04 Method and device for motion estimation in a sequence of images

Country Status (1)

Country Link
GB (1) GB2487777B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007120286A2 (en) * 2006-01-04 2007-10-25 Freescale Semiconductor Inc. System and method for fast motion estimation
WO2008035842A1 (en) * 2006-09-20 2008-03-27 Electronics And Telecommunications Research Institute Apparatus and method for encoding and decoding using alternative converter according to the correlation of residual signal
WO2008082102A1 (en) * 2007-01-03 2008-07-10 Samsung Electronics Co,. Ltd. Method and apparatus for predicting motion vector using global motion vector, encoder, decoder, and decoding method
EP2088783A2 (en) * 2008-02-05 2009-08-12 Samsung Electronics Co., Ltd. Method and apparatus to encode/decode image efficiently
US20100061458A1 (en) * 2008-09-11 2010-03-11 General Instrument Corporation Method and apparatus for fast motion estimation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2951345B1 (en) * 2009-10-13 2013-11-22 Canon Kk METHOD AND DEVICE FOR PROCESSING A VIDEO SEQUENCE
GB2486733A (en) * 2010-12-24 2012-06-27 Canon Kk Video encoding using multiple inverse quantizations of the same reference image with different quantization offsets



Similar Documents

Publication Publication Date Title
US10104393B2 (en) Video prediction encoding device, video prediction encoding method, video prediction encoding program, video prediction decoding device, video prediction decoding method, and video prediction decoding program
JP6701270B2 (en) Encoding device, decoding method, encoding method, decoding method, and program
US20170208329A1 (en) Video encoder and video encoding method
JP5490823B2 (en) Method for decoding a stream representing a sequence of images, method for encoding a sequence of images and encoded data structure
US10341679B2 (en) Encoding system using motion estimation and encoding method using motion estimation
KR20200015734A (en) Motion Vector Improvement for Multiple Reference Prediction
GB2492778A (en) Motion compensated image coding by combining motion information predictors
US11310524B2 (en) Method and apparatus for determining motion vector of affine code block
US11956425B2 (en) Image encoding/decoding method and device
GB2486751A (en) Encoding &amp; Decoding a Video Sequence Including Selecting Offsets Based on Distortion Measures Calculated Using a Restricted Set of Data Blocks
JP2022525688A (en) Methods and devices for improving predictions using optical flow
CN113994692A (en) Method and apparatus for predictive refinement with optical flow
US20120106644A1 (en) Reference frame for video encoding and decoding
EP2398240A1 (en) A method and device for encoding and decoding a video signal
US20120218443A1 (en) Decoder-derived geometric transformations for motion compensated inter prediction
US9077996B2 (en) Predicted motion vectors
KR20220046707A (en) Methods and apparatuses for prediction improvement by optical flow, bidirectional optical flow and decoder-side motion vector improvement
WO2023048646A2 (en) Methods and systems for performing combined inter and intra prediction
GB2487777A (en) Estimating motion in a sequence of digital images
US11330269B2 (en) Moving image coding device, moving image coding method and moving image coding program
US20240080498A1 (en) Systems and methods for motion compensated temporal filtering during image processing operations
US20120163465A1 (en) Method for encoding a video sequence and associated encoding device
CN113542748B (en) Video encoding and decoding method, apparatus, and non-transitory computer readable storage medium
US20220060734A1 (en) Intra prediction methods in video coding
CN113261279B (en) Prediction value determination method, encoder, decoder, and storage medium

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20170204