MXPA98001810A - Prediction and coding of bidirectionally predicted video object planes for interlaced digital video - Google Patents

Prediction and coding of bidirectionally predicted video object planes for interlaced digital video

Info

Publication number
MXPA98001810A
MXPA98001810A MXPA/A/1998/001810A MX9801810A
Authority
MX
Mexico
Prior art keywords
field
macroblock
current
image
bot
Prior art date
Application number
MXPA/A/1998/001810A
Other languages
Spanish (es)
Inventor
Chen Xuemin
O Eifrig Robert
Luthra Ajay
Original Assignee
General Instrument Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Instrument Corporation filed Critical General Instrument Corporation
Publication of MXPA98001810A publication Critical patent/MXPA98001810A/en

Links

Abstract

The present invention relates to a system for encoding digital video images, for example bidirectionally predicted video object planes (B-VOPs) (420), in particular when the B-VOP and/or a reference image (400, 440) used to encode the B-VOP is coded in an interlaced manner. For a B-VOP macroblock (420) that is co-located with a field-predicted macroblock in a future anchor image (440), direct mode prediction is performed by calculating four field motion vectors (MVf,top, MVf,bot, MVb,top, MVb,bot) and then generating the prediction macroblock. The four field motion vectors and their reference fields are determined from (1) a delta term (MVD) coded for the current macroblock, (2) the two field motion vectors (MVtop, MVbot) of the future anchor image, (3) the reference fields (405, 410) used by the two field motion vectors of the co-located future anchor macroblock, and (4) the temporal separations (TRB,top, TRB,bot, TRD,top, TRD,bot), in field periods, between the current B-VOP fields and the anchor fields. Additionally, a coding mode decision process for the current MB selects a forward, backward or average field coding mode according to a minimum sum of absolute difference (SAD) error obtained over the top (430) and bottom (425) fields of the current MB (420).

Description

PREDICTION AND CODING OF BIDIRECTIONALLY PREDICTED VIDEO OBJECT PLANES FOR INTERLACED DIGITAL VIDEO

BACKGROUND OF THE INVENTION

The present invention provides a method and an apparatus for encoding digital video images, for example bidirectionally predicted video object planes (B-VOPs), in particular when the B-VOP and/or a reference image used to encode the B-VOP is coded in an interlaced manner. The invention is particularly suitable for use with various multimedia applications and is compatible with the MPEG-4 Verification Model (VM) standard described in document ISO/IEC/JTC1/SC29/WG11 N1642, entitled "MPEG-4 Video Verification Model Version 7.0", April 1997, incorporated herein by reference. The MPEG-2 standard is a precursor to the MPEG-4 standard and is described in ISO/IEC 13818-2, entitled "Information Technology - Generic Coding of Moving Pictures and Associated Audio, Recommendation H.262", March 25, 1994, incorporated herein by reference. MPEG-4 is a new coding standard that provides a flexible framework and an open set of coding tools for communication, access and manipulation of digital audiovisual data.
P1136/98 MX

These tools support a wide range of features. The flexible MPEG-4 framework supports various combinations of coding tools and their corresponding functionalities for the applications required by the computer, telecommunications and entertainment (i.e., TV and film) industries, such as database browsing, information retrieval and interactive communications. MPEG-4 provides standardized core technologies that enable efficient storage, transmission and manipulation of video data in multimedia environments. MPEG-4 achieves efficient compression, object scalability, spatial and temporal scalability, and error resilience. The MPEG-4 video Verification Model (VM) coder/decoder is a block- and object-based hybrid coder with motion compensation. Texture is encoded with an 8x8 discrete cosine transform (DCT) using overlapped block motion compensation. Object shapes are represented as alpha maps and encoded using a content-based arithmetic encoding (CAE) algorithm or a modified DCT coder, both using temporal prediction. The coder can handle moving objects or sprites as they are known in computer graphics. Other coding methods, such as wavelet coding, can also be used for special applications. Motion-compensated texture coding is a well-known approach to video coding and can be modeled as a three-stage process. The first stage is signal processing, which includes motion estimation and compensation (ME/MC) and a two-dimensional (2-D) spatial transformation. The objective of ME/MC and the spatial transformation is to take advantage of the temporal and spatial redundancy in a video sequence to optimize the rate-distortion performance of the quantization and entropy coding under a complexity constraint.
The most common technique for ME/MC has been block matching, and the most common spatial transformation has been the DCT. However, special issues arise for the ME/MC of VOPs, particularly when the VOP itself is coded with interlacing and/or uses reference images that are coded with interlacing. In particular, it is desirable to have an efficient technique for providing motion vector (MV) predictors for an MB in a B-VOP. It is also desirable to have an efficient technique for direct mode coding of a field-coded MB in a B-VOP. It is further desirable to have a coding mode decision process for an MB in a field-coded B-VOP that selects the reference image resulting in the most efficient coding. The present invention provides a system having the above and other advantages.
SUMMARY OF THE INVENTION

In accordance with the present invention, a method and an apparatus are presented for encoding digital video images, for example a current image (e.g., a macroblock) in a bidirectionally predicted video object plane (B-VOP), in particular when the current image and/or a reference image used to encode the current image is coded in an interlaced (i.e., field) manner. In a first aspect of the invention, a method provides direct mode motion vectors (MVs) for a current, field-coded, bidirectionally predicted image, for example a macroblock (MB) having top and bottom fields, in a sequence of digital video images. A past, field-coded reference image having top and bottom fields, and a future, field-coded reference image having top and bottom fields, are determined. The future image is predicted using the past image, so that MVtop, a forward MV of the top field of the future image, references either the top or the bottom field of the past image. The referenced field contains a best-match MB for an MB in the top field of the future image. This MV is called a "forward" MV because, even though it references a past image (i.e., backward in time), the prediction runs from the past image to the future image, that is, forward in time. As a mnemonic, the direction of prediction can be thought of as opposite to the direction of the corresponding MV. Similarly, MVbot, a forward motion vector of the bottom field of the future image, references either the top or the bottom field of the past image. The forward and backward MVs for predicting the top and/or bottom fields of the current image are determined by scaling the forward MV of the corresponding field of the future image.
In particular, MVf,top, the forward motion vector for predicting the top field of the current image, is determined according to the expression

MVf,top = (MVtop * TRB,top) / TRD,top + MVD,

where MVD is a delta motion vector for a search area, TRB,top corresponds to the temporal separation between the top field of the current image and the field of the past image referenced by MVtop, and TRD,top corresponds to the temporal separation between the top field of the future image and the field of the past image referenced by MVtop. The temporal separations can be related to a frame rate at which the images are displayed. Similarly, MVf,bot, the forward motion vector for predicting the bottom field of the current image, is determined according to the expression

MVf,bot = (MVbot * TRB,bot) / TRD,bot + MVD,

where MVD is a delta motion vector, TRB,bot corresponds to the temporal separation between the bottom field of the current image and the field of the past image referenced by MVbot, and TRD,bot corresponds to the temporal separation between the bottom field of the future MB and the field of the past MB referenced by MVbot. MVb,top, the backward motion vector for predicting the top field of the current MB, is determined according to the equation

MVb,top = ((TRB,top - TRD,top) * MVtop) / TRD,top when the delta motion vector MVD = 0, or
MVb,top = MVf,top - MVtop when MVD ≠ 0.

MVb,bot, the backward motion vector for predicting the bottom field of the current MB, is determined according to the equation

MVb,bot = ((TRB,bot - TRD,bot) * MVbot) / TRD,bot when the delta motion vector MVD = 0, or
MVb,bot = MVf,bot - MVbot when MVD ≠ 0.

A corresponding decoder is also presented. In another aspect of the invention, a method is presented for selecting a coding mode for a current, field-coded, predicted MB having top and bottom fields in a sequence of digital video MBs.
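As a concrete illustration, the four direct-mode field motion vectors defined by the expressions above can be computed as in the following sketch. The function name is hypothetical, motion vectors are (x, y) tuples, and truncating division is used as a simplification of the standard's exact rounding rules.

```python
def direct_mode_field_mvs(mv_top, mv_bot, mvd,
                          trb_top, trd_top, trb_bot, trd_bot):
    # mv_top / mv_bot: forward field MVs of the co-located future anchor MB.
    # mvd: delta motion vector MVD; trb_*/trd_*: temporal separations
    # in field periods, as defined in the text above.
    def scale(mv, num, den):
        # Component-wise (MV * num) / den, truncated toward zero
        # (an assumption; the normative rounding may differ).
        return tuple(int(c * num / den) for c in mv)

    # Forward MVs: scaled anchor MV plus the delta vector MVD.
    mvf_top = tuple(s + d for s, d in zip(scale(mv_top, trb_top, trd_top), mvd))
    mvf_bot = tuple(s + d for s, d in zip(scale(mv_bot, trb_bot, trd_bot), mvd))

    # Backward MVs: closed-form scaling when MVD = 0, otherwise the
    # difference between the forward MV and the anchor MV.
    if mvd == (0, 0):
        mvb_top = scale(mv_top, trb_top - trd_top, trd_top)
        mvb_bot = scale(mv_bot, trb_bot - trd_bot, trd_bot)
    else:
        mvb_top = tuple(f - a for f, a in zip(mvf_top, mv_top))
        mvb_bot = tuple(f - a for f, a in zip(mvf_bot, mv_bot))
    return mvf_top, mvf_bot, mvb_top, mvb_bot
```

For example, with MVtop = (4, 2), MVbot = (6, 0), MVD = 0 and TRB = 1, TRD = 2 for both fields, the sketch yields MVf,top = (2, 1) and MVb,top = (-2, -1), matching the two branches of the equations above.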
The coding mode can be a backward mode, where the reference MB is temporally after the current MB in display order; a forward mode, where the reference MB is before the current MB; or an average (e.g., bidirectional) mode, where an average of the previous and subsequent reference MBs is used. The method includes the step of determining a forward sum of absolute difference error, SADforward,field, for the current MB relative to a past reference MB, corresponding to a forward coding mode. SADforward,field indicates the error in pixel luminance values between the current MB and a best-match MB in the past reference MB. A backward sum of absolute difference error, SADbackward,field, for the current MB relative to a future reference MB, corresponding to a backward coding mode, is also determined. SADbackward,field indicates the error in pixel luminance values between the current MB and a best-match MB in the future reference MB. An average sum of absolute difference error, SADaverage,field, for the current MB relative to an average of the past and future reference MBs, corresponding to an average coding mode, is also determined. SADaverage,field indicates the error in pixel luminance values between the current MB and an MB that is the average of the best-match MBs of the past and future reference MBs. The coding mode is selected according to the minimum of the SADs. Bias terms that account for the number of MVs required by the respective coding modes can also be factored into the coding mode decision process. SADforward,field, SADbackward,field and SADaverage,field are determined by summing the component terms over the top and bottom fields.
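The minimum-SAD selection among the forward, backward and average modes described above can be sketched as follows. The names are hypothetical, the MBs are flattened lists of luminance samples (top and bottom fields concatenated), and the bias terms for the number of MVs are omitted for brevity.

```python
def select_bvop_field_mode(cur, past_best, future_best):
    # cur: luminance samples of the current MB; past_best / future_best:
    # best-match MBs found in the past and future reference images.
    def sad(a, b):
        # Sum of absolute differences over pixel luminance values.
        return sum(abs(x - y) for x, y in zip(a, b))

    # Average-mode prediction: per-pixel mean of the two best matches
    # (rounding up is an assumed convention, not taken from the source).
    avg = [(p + f + 1) // 2 for p, f in zip(past_best, future_best)]

    sads = {'forward': sad(cur, past_best),
            'backward': sad(cur, future_best),
            'average': sad(cur, avg)}
    return min(sads, key=sads.get), sads
```

For instance, a current MB of [10, 20] against a past best match [10, 22] and a future best match [14, 20] gives SADs of 2, 4 and 3, so the forward mode is selected.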
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is an illustration of a video object plane (VOP) coding and decoding process in accordance with the present invention.
Figure 2 is a block diagram of an encoder in accordance with the present invention.
Figure 3 illustrates an interpolation scheme for a half-pixel search.
Figure 4 illustrates direct mode coding of the top field of an interlaced-coded B-VOP in accordance with the present invention.
Figure 5 illustrates direct mode coding of the bottom field of an interlaced-coded B-VOP in accordance with the present invention.
Figure 6 illustrates the reordering of pixel lines in an adaptive frame/field prediction scheme in accordance with the present invention.
Figure 7 is a block diagram of a decoder in accordance with the present invention.
Figure 8 illustrates a macroblock layer structure in accordance with the present invention.
DETAILED DESCRIPTION OF THE INVENTION

A method and an apparatus are presented for encoding a digital video image, for example a macroblock (MB) in a bidirectionally predicted video object plane (B-VOP), in particular where the MB and/or a reference image used to encode the MB is interlaced coded. The scheme provides a method for selecting a prediction motion vector (PMV) for the top and bottom fields of a current field-coded MB, including forward and backward PMVs as required, as well as for frame-coded MBs. A direct coding mode for a field-coded MB is also presented, in addition to a coding mode decision process that uses minimum sum of absolute difference terms to select an optimal mode.

Figure 1 is an illustration of a video object plane (VOP) coding and decoding process in accordance with the present invention. Frame 105 includes three image elements: a square foreground element 107, an oblong foreground element 108 and a background landscape element 109. In frame 115, the elements are designated as VOPs using a segmentation mask, such that VOP 117 represents the square foreground element 107, VOP 118 represents the oblong foreground element 108 and VOP 119 represents the background landscape element 109. A VOP may have an arbitrary shape, and a succession of VOPs is known as a video object. A full rectangular video frame can also be considered a VOP. Thus, the term "VOP" is used herein to indicate both arbitrary and non-arbitrary (e.g., rectangular) image area shapes. A segmentation mask is obtained using known techniques and has a format similar to that of ITU-R 601 luminance data. Each pixel is identified as belonging to a certain region of the video frame. Frame 105 and the VOP data of frame 115 are supplied to separate coding functions.
In particular, VOPs 117, 118 and 119 undergo shape, motion and texture coding in encoders 137, 138 and 139. With shape coding, binary and gray-scale shape information is encoded. With motion coding, the shape information is encoded using motion estimation within a frame. With texture coding, a spatial transformation such as the DCT is performed to obtain transformation coefficients that can be variable-length coded for compression. The coded VOP data is then combined in a multiplexer (MUX) 140 for transmission over a channel 145. Alternatively, the data may be stored on a recording medium. The received coded VOP data is separated by a demultiplexer (DEMUX) 150, so that the separate VOPs 117-119 are decoded and recovered. Frames 155, 165 and 175 show that VOPs 117, 118 and 119, respectively, have been decoded and recovered and can therefore be individually manipulated using a compositor 160 that interfaces with a video library 170, for example. The compositor can be a device such as a personal computer located at the user's premises, allowing the user to edit the received data to provide a personalized image. For example, the user's personal video library 170 may include a previously stored VOP 178 (e.g., a circle) that is different from the received VOPs. The user can compose a frame 185 in which the circular VOP 178 replaces the square VOP 117.
Frame 185 thus includes the received VOPs 118 and 119 and the locally stored VOP 178. In another example, the background VOP 109 may be replaced by a background of the user's selection. For example, when watching a television news broadcast, the anchorperson may be coded as a VOP that is separate from the background, such as a news studio. The user may select a background from the library 170 or from another television program, such as a channel with stock market or weather information. The user can therefore act as a video editor. The video library 170 can also store VOPs received through channel 145, and VOPs and other image elements can be accessed via a network such as the Internet. Generally, a video session comprises a single VOP or a sequence of VOPs. The video object coding and decoding process of Figure 1 enables many entertainment, business and educational applications, including personal computer games, virtual environments, graphical user interfaces, videoconferencing, Internet applications and the like. In particular, the capability for ME/MC with interlaced-coded (e.g., field mode) VOPs in accordance with the present invention provides even greater capabilities.

Figure 2 is a block diagram of an encoder in accordance with the present invention. The encoder is suitable for use with both predictive-coded VOPs (P-VOPs) and bidirectionally coded VOPs (B-VOPs). A P-VOP may include several macroblocks, each of which can be individually encoded using an intra-frame mode or an inter-frame mode. With intra-frame (INTRA) coding, the macroblock is encoded without reference to another macroblock. With inter-frame (INTER) coding, the macroblock is differentially encoded with respect to another frame in a mode known as forward prediction.
The frame used as the reference is known as an anchor frame. The anchor frame (e.g., a VOP) must be a P-VOP or an I-VOP, not a B-VOP. With forward prediction, the current macroblock is compared with a macroblock search area in the anchor frame to determine the best match. A corresponding motion vector describes the relative displacement of the current macroblock with respect to the best-match macroblock. Additionally, an advanced prediction mode for P-VOPs can be used, where motion compensation is performed on 8x8 blocks instead of 16x16 macroblocks. Moreover, P-VOP macroblocks, whether intra-frame or inter-frame coded, can be coded in a frame mode or in a field mode. B-VOPs can use forward prediction as described above in connection with P-VOPs, as well as backward prediction, bidirectional prediction and direct mode, all of which are inter-frame techniques. B-VOPs do not currently use intra-frame coded macroblocks under Version 7.0 of MPEG-4, although this is subject to change. The anchor frame (e.g., a VOP) must be a P-VOP or an I-VOP, not a B-VOP. With backward prediction of B-VOPs, the current macroblock is compared with a search area of macroblocks in a temporally subsequent anchor frame to determine the best match. A corresponding motion vector, known as a backward MV, describes the relative displacement of the current macroblock with respect to the best-match macroblock. With bidirectional prediction of a B-VOP MB, the current MB is compared with a macroblock search area in both a temporally previous anchor frame and a temporally subsequent anchor frame to determine the best-match MBs. Forward and backward motion vectors describe the relative displacement of the current macroblock with respect to the best-match macroblocks. In addition, an average image is obtained from the best-match MBs for use in coding the current MB.
With direct mode prediction of B-VOPs, a motion vector is derived for an 8x8 block when the co-located macroblock in the next P-VOP uses the advanced 8x8 prediction mode. The motion vector of the 8x8 block in the P-VOP is linearly scaled to derive a motion vector for the block in the B-VOP, without the need for a search to find a best-match block.

The encoder, shown generally at 200, includes a shape encoder 210, a motion estimation function 220, a motion compensation function 230 and a texture encoder 240, each of which receives video pixel data input at terminal 205. The motion estimation function 220, the motion compensation function 230, the texture encoder 240 and the shape encoder 210 also receive VOP shape information input at terminal 207, such as the MPEG-4 parameter VOP_of_arbitrary_shape. When this parameter is zero, the VOP has a rectangular shape, and the shape encoder 210 is therefore not used.
A reconstructed anchor VOP function 250 provides a reconstructed anchor VOP for use by the motion estimation function 220 and the motion compensation function 230. For P-VOPs, the anchor VOP occurs before the current VOP in presentation order and can be separated from the current VOP by one or more intermediate images. A motion-compensated anchor VOP is subtracted from the current VOP in subtractor 260 to provide a residue that is encoded by the texture encoder 240. The texture encoder 240 performs the DCT to provide the texture information (e.g., transformation coefficients) to a multiplexer (MUX) 280. The texture encoder 240 also provides information that is summed with the output of the motion compensator 230 in adder 270 for input to the reconstructed anchor VOP function 250. Motion information (e.g., motion vectors) is supplied from the motion estimation function 220 to the MUX 280, while shape information indicating the shape of the VOP is supplied from the shape coding function 210 to the MUX 280. The MUX 280 provides a corresponding multiplexed data stream to a buffer 290 for subsequent communication over a data channel.
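The subtractor 260 / adder 270 loop above reduces to forming a residue against the motion-compensated prediction and adding it back at reconstruction. A minimal sketch, with quantization omitted and hypothetical names:

```python
def form_residue(current, predicted):
    # Subtractor 260: point-by-point difference between the current VOP
    # block and the motion-compensated anchor prediction.
    return [c - p for c, p in zip(current, predicted)]

def reconstruct(residue, predicted):
    # Adder 270: the (here unquantized) residue is added back to the
    # prediction to rebuild the block for the reconstructed anchor VOP.
    return [r + p for r, p in zip(residue, predicted)]
```

In the real encoder the residue is DCT coded and quantized before reconstruction, so the rebuilt block is only an approximation of the current block; the sketch shows the lossless round trip.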
The pixel data input to the encoder can have a YUV 4:2:0 format. The VOP is represented by a bounding rectangle. The upper-left coordinate of the bounding rectangle is rounded to the nearest even number not greater than the upper-left coordinate of the smallest rectangle enclosing the VOP. Accordingly, the upper-left coordinate of the bounding rectangle in the chrominance component is one half that of the luminance component.

Figure 3 illustrates an interpolation scheme for a half-pixel search. Motion estimation and motion compensation (ME/MC) generally includes matching a block of a current video frame (e.g., a current block) with a block in a search area of a reference frame (e.g., a predicted block or reference block). The reference frame(s) can be separated from the current frame by one or more intermediate images. The displacement of the reference block with respect to the current block is the motion vector (MV), which has horizontal (x) and vertical (y) components. Positive values of the MV components indicate that the predicted block is to the right of and below the current block. A motion-compensated difference block is formed by subtracting the pixel values of the predicted block from those of the current block, point by point. Texture coding is then performed on the difference block. The coded MV and the coded texture information of the difference block are transmitted to the decoder. The decoder can then reconstruct an approximate current block by adding the quantized difference block to the predicted block in accordance with the MV. The block used for ME/MC can be a 16x16 frame block (macroblock), an 8x8 frame block or a 16x8 field block. The accuracy of the MV is set to one half of a pixel. Interpolation must be used on the anchor frame so that p(i+x, j+y) is defined when x or y is half of an integer. The interpolation is performed as shown in Figure 3. Integer pixel positions are represented by the "+" symbol, as shown at A, B, C and D. Half-pixel positions are indicated by circles, as shown at a, b, c and d. As shown, a = A, b = (A+B)//2, c = (A+C)//2, and d = (A+B+C+D)//4, where "//" denotes rounding division. Other details of the interpolation are discussed in MPEG-4 VM 8.0, referenced above, and in a commonly assigned United States patent application, Serial No. 08/897,847 of Eifrig et al., filed on 21
The whole pixel positions are ? presented by the "+" symbol, as shown in A, B, C, and D. Half-pixel positions are indicated by circles, as shown in a, b, c, and d. As observed, a = A, b = (A + B) // 2 c = (A + C) // 2, and d = (A + B + C + D) // 4, where "//" denotes the division rounded Other details of the interpolation are discussed in MPEG-4 VM 8.0 which is referenced above as a United States patent application assigned in common form with No. of Series 08 / 897,847 of Eifrig et al., Filed on 21 P1136 / 98 MX iM July 1997 and titled "Compensation and Estimation of Motion of Video Object Planes for Digital Video Interlaced ", which is mentioned here by reference Figure 6 illustrates the rearrangement of the pixel lines in a frame / field adaptive prediction scheme according to the present invention. In a first aspect of the advanced prediction technique, an adaptive technique is used to decide whether a fl | Current macroblock (MB) of 16x16 pixels must be encoded ME / MC as it stands, or divided into four blocks of 8x8 pixels each, where each 8x8 block is coded separately ME / MC, or the movement estimate based on the field should be used when the lines of pixel in the MB reorder to group the same lines field in two field blocks 16x8, and each block of 16x8 is encoded separately by ME / MC. f A 16x16 macroblock of field mode (MB) is generally displayed at 600. The MB includes even-numbered lines 602, 604, 606, 608, 610, 612, 614, and 616 and lines of number non-603, 605, 607, 609, 611, 613, 615 and 617. The lines with even number and non-interlace and form upper and lower (or first and second) fields, respectively. When the pixel lines in image 600 are swapped to form luminance blocks of the same field, the MB shown in general at 650 can be formed. The P1136 / 98 MX, _ arrows, shown generally at 645, indicate the reordering of lines 602-617. For example, the par 602 line, which is the first line of MB 600, is also the first line of MB 650. 
The par 604 line is reordered as the second line in MB 650. Similarly, even lines 606, 608, 610, 612, 614 and 616 are reordered as the third to eighth lines, respectively, of MB 650. In this way, a 680 luminance region is formed. of 16x8 with lines numbered pair. Similarly, the lines numbered non-603, 605, 607, 609, 611, 613, 615 and 617 form a region 685 of 16x8. The decision process to select the MC mode for P-VOPs is as follows. For a picture-frame video, the sum of the absolute differences 15 (SAD) for a single MB of 16x16 is first obtained, for example SAD16 (MVx, MVy); and for four blocks of 8x8, for example, AL SAD8 (MV? l, MVyl), SAD8 (MV? 2, MVy2), SAD8 (MVx3, MVy3), and SAD8 (MVx4, MVy4). 4 20 Yes? SAD8 (MVxi, MVyi) < SAD16 (MVx, MVy) -129, i = l the 8x8 prediction is selected; otherwise, the 16x16 prediction is selected. The constant "129" is obtained from Nb / 2 + 1, where Nb is the number of non-transparent pixels in one MB. For interlaced video, get P1136 / 98 MX Jtf SADtop (MV? _top, Mv-y -op), SADbottom (MVx_bottom, MVy_bottom) / where (MVx_top, MVy-top) and (MVx_bottom, MVyjD? Ttom) are the motion vector (MV) for the upper (par) and lower (non) fields. Subsequently, the reference field having the smallest SAD (for example for SAD-0p and S Dj-x-j-t ^ ojn) is selected from the search of the half-field sample. The decision of the global prediction mode is based on selecting the minimum of: L§ 4 (a) SAD1 6 (MVx, MVy), (b)? SAD8 (MV? I, MVyi) +129, i = l Y (c) SADtop (x_topr Wy top) + SADbottom (MVx_bottom, MVy_pottom) +65. 15 If term (a) is the minimum, the 16x16 prediction is used. If the term (b) is the minimum, 8x8 motion compensation (advanced prediction mode) is used. If the term (c) is the minimum, the estimation of. movement based in the field. The constant "65" is obtained from Mb / 4 + 1. If the 8x8 prediction is selected, there are four MVs for the four 8x8 luminance blocks, that is, one MV for each 8x8 block. 
The MV for the two chrominance blocks is obtained after taking the average of these four MVs and dividing the average value by two. As each MV for the 8x8 luminance block has a P1136 / 98 MX fc half pixel precision, the MV for the chrominance blocks can have a value of one sixteenth of a pixel. Table 1 below specifies the conversion of the value of a sixteenth pixel to a value of half a pixel for the chrominance MVs. For example, 0 to 2/16 is rounded to 0, 3/16 to 13/16 rounds to a half, and 14/16 and 15/16 rounds to 2/2 = 1.
TABLE 1

With field prediction, there are two MVs, one for each of the two 16x8 blocks. The luminance prediction is generated in the following way. The even lines of the MB (e.g., lines 602, 604, 606, 608, 610, 612, 614 and 616) are predicted by the top field MV using the specified reference field. The MV is specified in frame coordinates, so that full-pixel vertical offsets correspond to even integer values of the vertical MV coordinate, and half-pixel vertical offsets are denoted by odd integer values. When a half-pixel vertical offset is specified, only pixels from lines within the same reference field are combined. The MVs for the two chrominance blocks are derived from the luminance MV by dividing each component by two and then rounding. The horizontal component is rounded by mapping all fractional values to a half-pixel offset. The vertical luminance MV component is an integer, and the resulting vertical chrominance MV component is rounded to an odd integer: if the result of dividing by two gives a non-integer value, it is rounded to the adjacent odd integer. Note that odd integer values denote vertical interpolation between lines in the same field.

The second aspect of the advanced prediction technique is the overlapping of motion compensation for luminance blocks, which is discussed in more detail in MPEG-4 VM 8.0 and in the Eifrig et al. application referenced above. The coding techniques specific to B-VOPs are discussed next. For INTER-coded VOPs, for example B-VOPs, there are four prediction modes, namely direct mode, interpolated mode (e.g., averaged or bidirectional), backward mode and forward mode. The last three modes are non-direct modes. The forward or backward prediction
In addition, the blocks of a B-VOP and the anchor block can be encoded in a progressive manner (for example frame) or in an interlaced form (for example field) I. A single B-VOP can have different MBs that are predicted with different modes. The term "B-VOP" only indicates that the predicted bi-directional blocks can included, but not required. In contrast, with P-VOPs and I-VOPs, MBs that are predicted bi-directionally are not used. For B-VOP MBs non-directly, the MVs are coded differentially. For the advance MVs in the forward and bi-directional modes and the backward MVs in the backward and bi-directional modes, the MV "of the same type flj" (for example, forward or backward of the MB immediately preceding the current MB in the same row, it is used as a predictor, the same thing happens with the MB immediately preceding in order of the plot, and, in general, in the order of transmission. However, if the frame order differs from the transmission order, the MVs of the immediately preceding MB in the order of transmission must be used in order to avoid the need to store and reorder the MBs and the corresponding MVs in the P1136 / 98 MX fl decoder. Using the MV of the same type and assuming that the order of transmission is the same as the frame order, and that the frame order goes from left to right, from 5 up to down, the MV in advance of the neighboring MB to the left ee used as a predictor for the MV in advance of the current MB of the B-VOP. In a similar way, the backward MV of the left neighbor MB is used as a predictor flfc for the backward MV of the current MB of the B-VOP. 0 The MVs of the current MB are then encoded differentially using the predictors. That is, the difference between the predictor and the MV that is determined for the current MB is transmitted as a difference of the vector of the movement towards a decoder. In the 5 decoder, the MV of the current MB is determined by recovering and adding the PMV and the difference MV. 
In case the current MB is located at the left edge of the VOP, the predictor for the current MB is set to zero.

For interlaced B-VOPs, each of the top and bottom fields has two associated prediction motion vectors, for a total of four MVs. The four prediction MVs represent, in transmission order, the top forward field and the bottom forward field of the previous anchor MB, and the top backward field and the bottom backward field of the next anchor MB. The current MB and the forward anchor MB, and/or the current MB and the backward anchor MB, can be separated by one or more intermediate images that are not used for ME/MC coding of the current MB. B-VOPs do not contain INTRA coded MBs, so every MB in the B-VOP is coded by ME/MC. The forward and backward anchor MBs can belong to a P-VOP or an I-VOP, and can be frame or field coded.

For non-direct interlaced mode B-VOP MBs, four possible prediction motion vectors (PMVs) are shown in Table 2 below. The first column of Table 2 shows the prediction function, while the second column shows a designator for the PMV. These PMVs are used as shown in Table 3 below for the different MB prediction modes.
TABLE 2

TABLE 3

For example, Table 3 shows that a current field mode MB with a forward prediction mode (for example "field, forward") uses the PMVs for the top field forward ("0") and the bottom field forward ("1"). After being used in differential coding, the motion vectors of a current MB become the PMVs for a subsequent MB, in transmission order.

The PMVs are reset to zero at the beginning of each row of MBs, since the MVs of an MB at the end of a preceding row are unlikely to be similar to the MVs of an MB at the beginning of the current row. Predictors are also not used for direct mode MBs. For skipped MBs, the PMVs retain their last value.

With direct mode coding of B-VOP MBs, no vector differences are transmitted. Instead, the forward and backward MVs are calculated directly by the decoder from the MVs of the temporally next P-VOP MB, with correction by a single delta MV that is not predicted. The technique is efficient since less MV data is transmitted.

Table 4 below summarizes which PMVs are used to code the motion vectors of the current B-VOP MBs, based on the previous and current MB types. For B-VOPs, an array of prediction motion vectors, pmv[], can be provided, indexed from zero to three (for example, pmv[0], pmv[1], pmv[2] and pmv[3]). The pmv[] indices are not transmitted, but the decoder can determine the pmv[] index to be used according to the type of MV coding and the particular vector being decoded.

After coding an MB of a B-VOP, some of the PMV vectors are updated to be the same as the motion vectors of the current MB. The first one, two or four PMVs are updated, depending on the number of MVs associated with the current MB. For example, an MB that is field predicted, forward, has two motion vectors, where pmv[0] is the PMV for the top field forward, and pmv[1] is the PMV for the bottom field forward.
For an MB that is field predicted, backward, pmv[2] is the PMV for the top field backward and pmv[3] is the PMV for the bottom field backward. For an MB that is field predicted bi-directionally, pmv[0] is the PMV for the top field forward, pmv[1] is the PMV for the bottom field forward, pmv[2] is the PMV for the top field backward, and pmv[3] is the PMV for the bottom field backward. A frame mode B-VOP MB that is predicted forward or backward has only one MV, so only pmv[0] is used for forward prediction and only pmv[2] is used for backward prediction. For a frame mode B-VOP MB predicted on average (for example, bi-directionally), there are two MVs, namely pmv[0] for the forward MV and pmv[2] for the backward MV. The row designated "pmv[]'s to update" indicates whether one, two or four MVs are updated.
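The predictor update just described can be sketched as follows. Since the body of Table 4 is not reproduced above, the slot assignments are reconstructed from the prose, and the mode names are illustrative labels rather than bitstream values.

```python
# Predictor slots: pmv[0] = forward top, pmv[1] = forward bottom,
# pmv[2] = backward top, pmv[3] = backward bottom.
PMV_SLOTS = {
    "frame_forward":  (0,),
    "frame_backward": (2,),
    "frame_average":  (0, 2),
    "field_forward":  (0, 1),
    "field_backward": (2, 3),
    "field_average":  (0, 1, 2, 3),
}

def update_pmvs(pmv, mbtype, mvs):
    """Copy the just-decoded MB's motion vectors into the predictor
    array, updating one, two or four slots depending on how many MVs
    the MB carries; the remaining slots keep their previous values."""
    for slot, mv in zip(PMV_SLOTS[mbtype], mvs):
        pmv[slot] = mv
    return pmv
```

For example, decoding a field forward MB with top/bottom field vectors (1, 2) and (3, 4) updates only pmv[0] and pmv[1], leaving the backward predictors untouched.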
TABLE 4 - MOTION VECTOR PREDICTION, pmv[] INDEX

It will be appreciated that Table 4 is simply one notation for implementing the technique of the present invention for selecting a prediction MV for a current MB. The scheme can be expressed in many other ways.

Adaptive DC block prediction can use the same algorithm described in MPEG-4 VM 8.0, regardless of the value of dct_type. Adaptive intra block AC prediction is carried out as described in MPEG-4 VM 8.0, except that the first row of coefficients may be copied from the block coded above. This operation is only allowed if dct_type has the same value for the current MB and the block above. If the dct_types differ, then AC prediction can occur only by copying the first column of the block to the left. If there is no block to the left, zero is used for the AC predictors.

Figure 4 illustrates direct mode coding of the top field of an interlaced coded B-VOP, according to the present invention. The progressive direct coding mode is used for the current macroblock (MB) whenever the MB of the future anchor image in the same relative position (for example, co-located) as the current MB is coded as (1) a 16x16 (frame) MB, (2) an intra MB, or (3) an 8x8 (advanced prediction) MB. The direct mode prediction is interlaced when the co-located future anchor image MB is coded as an interlaced MB. Direct mode is used to code the current MB if its derived SAD is the minimum among all the B-VOP MB predictors. The direct mode for an interlaced coded MB forms the prediction MB separately for the top and bottom fields of the current MB.
The four field motion vectors (MVs) of a bi-directionally field motion compensated MB (for example, top field forward, bottom field forward, top field backward and bottom field backward) are calculated directly from the respective MVs of the corresponding MB of the future anchor image. The technique is efficient since the required search is considerably reduced and the amount of MV data transmitted is reduced. Once the MVs and the reference fields are determined, the current MB is treated as a bi-directionally field predicted MB. Only one delta motion vector (used for both fields) is present in the bit stream for the field predicted MB.

The prediction for the top field of the current MB is based on the top field MV of the MB of the future anchor image (which can be a P-VOP, or an I-VOP with MV = 0), and on a past reference field from a previous anchor image that is selected by the corresponding top field MV of the future anchor MB. That is, the top field of the MB of the future anchor image that is correspondingly positioned (for example, co-located) with the current MB has a best match MB in either the top or the bottom field of the past anchor image. This best match field MB is then used as the anchor MB for the top field of the current MB. An exhaustive search is used to determine the delta motion vector MVD given the co-located future anchor MV, on an MB basis. The motion vectors for the bottom field of the current MB are determined similarly, using the correspondingly positioned bottom field MV of the future anchor MB, which in turn refers to a best match MB in the top or bottom field of the past anchor image.
Essentially, the top field motion vector is used to construct a predictor MB that is the average of: (a) pixels obtained from the top field of the correspondingly positioned future anchor MB, and (b) pixels of the past anchor field referenced by the top field MV of the correspondingly positioned future anchor MB. In a similar way, the bottom field motion vector is used to construct a predictor MB that is the average of: (a) pixels obtained from the bottom field of the correspondingly positioned future anchor MB, and (b) pixels of the past anchor field referenced by the bottom field MV of the correspondingly positioned future anchor MB.

As shown in Figure 4, the current B-VOP MB 420 includes a top field 430 and a bottom field 425, the past anchor VOP MB 400 includes a top field 410 and a bottom field 405, and the future anchor VOP MB 440 includes a top field 450 and a bottom field 445. The motion vector MVtop is the forward motion vector for the top field 450 of the future anchor MB 440, indicating the best match MB in the past anchor MB 400. Even though MVtop refers to a previous image (for example, backward in time), it is a forward MV, since the future anchor VOP 440 is forward in time relative to the past anchor VOP 400. In the example, MVtop refers to the bottom field 405 of the past anchor MB 400, although either the top field 410 or the bottom field 405 can be referenced.
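The per-field averaging that forms each direct mode predictor can be sketched as follows; rounding the average by adding one before halving is an assumption, since the text does not state the rounding rule.

```python
def average_field_prediction(future_field_block, past_field_block):
    """Form the direct mode predictor for one field of the current MB as
    the pixel-by-pixel average of (a) the co-located field block of the
    future anchor MB and (b) the past anchor block selected by that
    field's MV. Blocks are lists of rows of integer pixel values."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_future, row_past)]
        for row_future, row_past in zip(future_field_block, past_field_block)
    ]
```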
MVf,top is the forward MV of the top field of the current MB, and MVb,top is the backward MV of the top field of the current MB. The pixel data of the bi-directionally predicted MB is derived in a decoder by averaging the pixel data of the future and past anchor images identified by MVb,top and MVf,top, respectively, and adding the averaged image to a residue that was transmitted.

The motion vectors for the top field are calculated in the following way:

MVf,top = (TRB,top * MVtop) / TRD,top + MVD;
MVb,top = ((TRB,top - TRD,top) * MVtop) / TRD,top if MVD = 0; and
MVb,top = MVf,top - MVtop if MVD ≠ 0.

MVD is a delta, or offset, vector. Note that the motion vectors are two-dimensional. Additionally, the motion vectors are integral half luma pixel motion vectors. The slash "/" denotes integer division with truncation toward zero. Also, the future anchor VOP is always a P-VOP for the field direct mode; if the future anchor were an I-VOP, its MVs would be zero and the 16x16 progressive direct mode would be used. TRB,top is the temporal distance, in fields, between the past reference field (top or bottom), which is the bottom field 405 in this example, and the top field 430 of the current B-VOP 420. TRD,top is the temporal distance between the past reference field (top or bottom), which is the bottom field 405 in this example, and the future top reference field 450.

Figure 5 illustrates direct mode coding of the bottom field of an interlaced coded B-VOP according to the present invention. Note that the source interlaced video may have a top field first or a bottom field first format. A bottom field first format is shown in Figures 4 and 5. Like-numbered elements are the same as in Figure 4. Here, the motion vector MVbot is the forward motion vector for the bottom field 445 of the future anchor macroblock (MB) 440, indicating the best match MB in the past anchor MB 400. In the example, MVbot refers to the bottom field 405 of the past anchor MB 400, although either the top field 410 or the bottom field 405 may be used. MVf,bot and MVb,bot are the forward and backward motion vectors, respectively.

The motion vectors for the bottom field are calculated in parallel with the top field motion vectors, in the following way:

MVf,bot = (TRB,bot * MVbot) / TRD,bot + MVD;
MVb,bot = ((TRB,bot - TRD,bot) * MVbot) / TRD,bot if MVD = 0; and
MVb,bot = MVf,bot - MVbot if MVD ≠ 0.

TRB,bot is the temporal distance between the past reference field (top or bottom), which is the bottom field 405 in this example, and the bottom field 425 of the current B-VOP 420. TRD,bot is the temporal distance between the past reference field (top or bottom), which is the bottom field 405 in this example, and the future bottom reference field 445.

With respect to the examples of Figures 4 and 5, the calculation of TRB,top, TRD,top, TRB,bot and TRD,bot depends not only on the current field, the reference field and the temporal frame references, but also on whether the current video is top field first or bottom field first:

TRD,top or TRD,bot = 2 * (TRfuture - TRpast) + d, and
TRB,top or TRB,bot = 2 * (TRcurrent - TRpast) + d;

where TRfuture, TRcurrent and TRpast are the frame numbers of the future, current and past images, respectively, in display order, and d is an additive correction to the temporal distance between the fields, given by Table 5 below. d has units of field periods.
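The field MV derivation above, applied to either the top or the bottom field, can be sketched as follows. Vectors are treated as (x, y) pairs in half luma pixel units, and division truncates toward zero as specified; the function and variable names are illustrative.

```python
def div_trunc(a, b):
    """Integer division truncated toward zero (the '/' of the formulas);
    b is a positive field-period distance here."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b > 0) else -q

def direct_mode_field_mvs(mv_anchor, tr_b, tr_d, mv_delta):
    """Derive the forward and backward MVs for one field of a direct mode
    B-VOP MB from the co-located future anchor field MV (mv_anchor), the
    field-period distances TRB and TRD, and the single delta vector MVD."""
    mv_f = tuple(div_trunc(tr_b * c, tr_d) + d
                 for c, d in zip(mv_anchor, mv_delta))
    if mv_delta == (0, 0):
        mv_b = tuple(div_trunc((tr_b - tr_d) * c, tr_d) for c in mv_anchor)
    else:
        mv_b = tuple(f - c for f, c in zip(mv_f, mv_anchor))
    return mv_f, mv_b
```

For example, with an anchor field MV of (4, -6), TRB = 3, TRD = 6 and MVD = (0, 0), the forward MV is (2, -3) and the backward MV is (-2, 3).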
For example, the designation "1" in the last row of the first column indicates that the future anchor field is the top field and the reference field is the bottom field. This is shown in Figure 4. The designation "1" in the last row of the second column indicates that the future anchor field is the bottom field and the reference field is also the bottom field. This is shown in Figure 5.
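The field-period distances above can be sketched as follows. Because the entries of Table 5 are not reproduced above, the parity correction d is passed in as a parameter rather than looked up; the same correction form is shown for both distances in the text, so separate values can be passed if the top and bottom fields need different corrections.

```python
def temporal_distances(tr_future, tr_current, tr_past, d):
    """Compute TRB and TRD in field periods from frame numbers in
    display order. d is the Table 5 parity correction, in field
    periods, which depends on the anchor/reference field parities
    and on whether the source is top field first or bottom field
    first."""
    tr_d = 2 * (tr_future - tr_past) + d
    tr_b = 2 * (tr_current - tr_past) + d
    return tr_b, tr_d
```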
TABLE 5 - TEMPORAL CORRECTION, d

For efficient coding, a proper coding mode decision process is required. As indicated, for B-VOPs, an MB can be coded using (1) direct coding; (2) 16x16 motion compensation (including forward, backward and average modes); or (3) field motion compensation (including forward, backward and average modes). Frame or field direct coding of a current MB is used when the co-located future anchor MB is frame or field coded, respectively.

For a field motion compensated MB in a B-VOP, a decision is made to code the MB in a forward, backward or average mode based on the minimum luminance pixel SADs with respect to the decoded anchor images. Specifically, seven biased SAD terms are calculated, as follows:

(1) SADdirect + b1,
(2) SADforward + b2,
(3) SADbackward + b2,
(4) SADaverage + b3,
(5) SADforward,field + b3,
(6) SADbackward,field + b3, and
(7) SADaverage,field + b4,

where the subscripts indicate direct mode, forward motion prediction, backward motion prediction, average (that is, interpolated or bi-directional) motion prediction, frame mode (that is, locally progressive) and field mode (that is, locally interlaced). The field SADs above (that is, SADforward,field, SADbackward,field and SADaverage,field) are the sums of the top and bottom field SADs, each with its own motion vector and reference field. Specifically:

SADforward,field = SADforward,top field + SADforward,bottom field;
SADbackward,field = SADbackward,top field + SADbackward,bottom field; and
SADaverage,field = SADaverage,top field + SADaverage,bottom field.

SADdirect is the best direct mode prediction, SADforward is the best 16x16 prediction from the forward (past) reference, SADbackward is the best 16x16 prediction from the backward (future) reference, SADaverage is the best 16x16 prediction formed by a pixel-by-pixel average of the best forward and backward references, SADforward,field is the best field prediction from the forward (past) reference, SADbackward,field is the best field prediction from the backward (future) reference, and SADaverage,field is the best field prediction formed by a pixel-by-pixel average of the best forward and backward references. The bi are bias values, defined in Table 6 below, that account for the prediction modes requiring more motion vectors. Direct mode and the modes with fewer MVs are favored.
TABLE 6

The negative bias for direct mode is for consistency with the existing MPEG-4 VMs for progressive video, and can result in relatively more skipped MBs.

Figure 7 is a block diagram of a decoder in accordance with the present invention. The decoder, shown generally at 700, can be used to receive and decode the coded data signals transmitted from the encoder of Figure 2. The coded video image data and the differentially coded motion vector (MV) data are received at terminal 740 and supplied to a demultiplexer (DEMUX) 742. The coded video image data are typically differentially coded DCT transform coefficients representing a prediction error signal (for example, a residue).

A shape decoding function 744 processes the data when the VOP has an arbitrary shape, to recover shape information that is, in turn, supplied to a motion compensation function 750 and a VOP reconstruction function 752. A texture decoding function 746 performs an inverse DCT on the transform coefficients to recover the residual information. For INTRA coded macroblocks (MBs), the pixel information is recovered directly and supplied to the VOP reconstruction function 752.

For INTER coded blocks and MBs, such as those of the B-VOPs, the pixel information supplied from the texture decoding function 746 to the VOP reconstruction function 752 represents a residue between the current MB and a reference image. The reference image can be pixel data from a single anchor MB indicated by a forward or backward MV. Alternatively, for an interpolated (for example, averaged) MB, the reference image is an average of the pixel data of two reference MBs, for example, a past anchor MB and a future anchor MB. In this case, the decoder must calculate the averaged pixel data in accordance with the forward and backward MVs before recovering the pixel data of the current MB.
For INTER coded blocks and MBs, a motion decoding function 748 processes the coded MV data to recover the differential MVs, and supplies them to the motion compensation function 750 and to a motion vector memory 749, such as a RAM. The motion compensation function 750 receives the differential MV data and determines a reference motion vector (for example, a prediction motion vector, or PMV) in accordance with the present invention. The PMV is determined in accordance with the coding mode (for example, forward, backward, bi-directional or direct).

Once the motion compensation function 750 determines a full reference MV and adds it to the differential MV of the current MB, the full MV of the current MB is available. Accordingly, the motion compensation function 750 can now retrieve the best match anchor frame data from a VOP memory 754, such as a RAM, calculate an averaged image if required, and supply the anchor frame pixel data to the VOP reconstruction function 752 to reconstruct the current MB. The retrieved or calculated best match data is added back to the pixel residue in the VOP reconstruction function 752 to obtain the decoded current MB or block. The reconstructed block is output as a video output signal and is also supplied to the VOP memory 754 to provide new anchor frame data. Note that appropriate buffering of the video data may be required, depending on the frame transmission and presentation orders, since an anchor frame for a B-VOP MB may be a temporally future frame or field in presentation order.

Figure 8 illustrates an MB packet structure in accordance with the present invention. The structure is suitable for B-VOPs and indicates the format of the data received by the decoder. Note that the packets are shown in four rows for convenience only; they are actually transmitted serially, starting from the top row and from left to right within a row.
The first row 810 includes the fields first_shape_code, MVD_shape, CR, ST and BAC. The second row 830 includes the fields MODB and MBTYPE. The third row 850 includes the fields CBPB, DQUANT, interlaced_information, MVD and MVDB. The fourth row includes the fields CODA, CBPBA, Alpha Block Data and Block Data. Each of the above fields is defined in accordance with MPEG-4 VM 8.0.

first_shape_code indicates whether an MB is in the bounding box of a VOP. CR indicates a conversion ratio for Binary Alpha Blocks. ST indicates a horizontal or vertical scan order. BAC refers to a binary arithmetic codeword. MODB, which indicates the mode of an MB, is present for each coded (non-skipped) MB in a B-VOP. The difference motion vectors (MVDf, MVDb or MVDB) and CBPB are present if indicated by MODB. The macroblock type is indicated by MBTYPE, which also signals the motion vector modes (MVDs) and the quantization (DQUANT). In interlaced mode, there can be up to four MVs per MB. MBTYPE indicates the coding type, for example forward, backward, bi-directional or direct. CBPB is the Coded Block Pattern for a type B macroblock. CBPBA is defined similarly to CBPB, except that it has a maximum of four bits. DQUANT defines changes in the value of the quantizer.

The interlaced_information field in the third row 850 indicates whether an MB is interlaced coded, and provides field MV reference data that informs the decoder of the coding mode of the current block or MB. The decoder uses this information to calculate the MV of a current MB. The interlaced_information field may be stored for subsequent use, as required, in the MV memory 749 or in another memory of the decoder.
The interlaced_information field may also include a dct_type flag that indicates whether the top and bottom field pixel lines in a field coded MB are reordered from the interlaced order, as discussed above in connection with Figure 6. The MB layer structure is used when VOP_prediction_type == 10. If COD indicates skipped (COD == "1") for an MB in the most recently decoded I- or P-VOP, then the correspondingly located (for example, co-located) MB in the B-VOP is also skipped; that is, no information for it is included in the bit stream.

MVDf is the motion vector of an MB in a B-VOP with respect to a temporally previous reference VOP (an I- or a P-VOP). It consists of a variable length codeword for the horizontal component followed by a variable length codeword for the vertical component. For an interlaced B-VOP MB with field_prediction of "1" and MBTYPE of forward or interpolate, MVDf represents a pair of field motion vectors (top field followed by bottom field) that refer to the past anchor VOP.

MVDb is the motion vector of an MB in a B-VOP with respect to a temporally following reference VOP (an I- or a P-VOP). It consists of a variable length codeword for the horizontal component followed by a variable length codeword for the vertical component. For an interlaced B-VOP MB with field_prediction of "1" and MBTYPE of backward or interpolate, MVDb represents a pair of field MVs (top field followed by bottom field) that refer to the future anchor VOP.

MVDB is present in B-VOPs only if direct mode is indicated by MODB and MBTYPE, and consists of a variable length codeword for the horizontal component followed by a variable length codeword for the vertical component of each vector. The MVDBs represent the delta vectors used to correct the B-VOP MB motion vectors obtained by scaling the P-VOP MB motion vectors. CODA refers to gray scale shape coding.
The arrangement shown in Figure 8 is only an example, and various other arrangements for communicating the relevant information to the decoder will be apparent to those skilled in the art. The bit stream syntax and the MB layer syntax to be used in accordance with the present invention are described in MPEG-4 VM 8.0, as well as in the Eifrig et al. application referred to previously.

Accordingly, it can be seen that the present invention provides a scheme for coding a current MB in a B-VOP, in particular when the current MB is field coded and/or an anchor MB is field coded. A scheme for direct coding of a field coded MB is presented, as well as a coding mode decision process that uses minimum sum of absolute differences terms to select an optimal mode. A prediction motion vector (PMV) is also provided for the top and bottom fields of a current field coded MB, including forward and backward PMVs as required, as well as for frame coded MBs.

Although the invention has been described in connection with various specific embodiments, those skilled in the art will appreciate that many adaptations and modifications can be made thereto without departing from the spirit and scope of the invention as set forth in the claims.

Claims (27)

NOVELTY OF THE INVENTION

Having described the present invention, it is considered a novelty and, therefore, the content of the following CLAIMS is claimed as property:

1. A method for calculating direct mode motion vectors for a current field coded, bi-directionally predicted image having top and bottom fields, in a sequence of digital video images, comprising the steps of: determining a past field coded reference image having top and bottom fields and a future field coded reference image having top and bottom fields; wherein the future image is predicted using the past image such that MVtop, a forward motion vector of the top field of the future image, refers to one of the top and bottom fields of the past image, and MVbot, a forward motion vector of the bottom field of the future image, refers to one of the top and bottom fields of the past image; and determining forward and backward motion vectors for predicting at least one of the top and bottom fields of the current image by scaling the forward motion vector of the corresponding field of the future image.
  2. 2. The method according to claim 1, wherein: MVf top. the forward motion vector to predict the upper field of the current image is determined in accordance with the expression (MVtop * TRß top) / TRD top + MVD; where TRB top corresponds to a separation # temporal between the upper field of the current image and the field of the past image referenced by MV op 'TRD, top corresponds to a temporal separation between the upper field of the future image and the field of the image passed to which is referred to by V "0p, and MVQ is a delta motion vector 3. The method according to claim 2, wherein: MVf -t-0 is determined using the integer division with zero truncation, and Vtop Y bo bo on integer half luma pixel movement vectors 4. The method according to claim 2 or 3, wherein: TRß top AND TRD top incorporate a temporal correction that explains whether the current field image P1136 / 98 MX ^ coded is the first upper field or the first lower field. The method according to one of the preceding claims, wherein: MVf bot / l forward movement vector to predict the lower field of the current image is determined in accordance with the expression (MV ot * TRB bot) / TRD bot + MVD; where TRß bot corresponds to a temporal separation between the lower field of the current image and the field of the past image that is referred to by MV ot, Rp bot corresponds to a temporal separation between the lower field of the future image and the field of the past image that is referred to by MVbot, and, MVD is a delta motion vector. The method according to claim 5, wherein: M f bot is determined using the integer division with zero truncation; and V op and MVbot are pixel movement vectors from half lu to integers. The method according to claim 5 or 6, wherein: TRß bot AND TRrj bot incorporate a temporal correction that explains whether the current field image P1136 / 98 MX coded is the first upper field or the first lower field. 
The method according to one of the preceding claims, wherein: MVb top-e-L reverse motion vector for predicting the upper field of the current image is determined in accordance with one of equations (a) MVb / top = ((TB, top-T D, top) * MVtop) / TD, top Y () MVb top = MVf, top "MVtop; where TRg top corresponds to a temporary separation between the upper field of the image current and the field of the past image that is referred by MV-ko, TRD top corresponds to a temporal separation between the upper field of the future image and the field of the past image that is referred by MVtop, and MVf top is e ^ forward movement vector for predicting the upper field of the current image 9. The method according to claim 8, wherein: the equation of (a) is selected when a delta motion vector MVD = 0 y, the equation of ( b) is selected when MVQ? 0. 10. The method according to one of the preceding claims, wherein: MVb or the backward motion vector P1136 / 98 MX ijjp to predict the lower field of the current image is determined in accordance with one of the equations (a) MVb, b? T = ((TRBfbot-TRDfbot) * MVbot) / TRD / bot Y () MVb, bot = Vf, bot "MVbot; where TRB bot corresponds to a temporary separation between the lower field of the current image and the field of the past image that is referred to by MVbo -? -, TRD bot corresponds to a temporary separation between the ^^. lower field of the future image and the field of the past image that is referred to by MVbot, and MV bot is the forward motion vector to predict the lower field of the current image. The method according to claim 10, wherein: the equation of (a) is selected when a delta motion vector MVQ = 0 and, the equation of (b) is selected when MVQ? 0. 12. 
A method for selecting a coding mode for a current coded and predicted field macroblock having upper and lower fields, in a sequence of digital video images, comprising the steps of: determining an advancing sum of the absolute error differences, SADforward field For the current macroblock in relation to a past reference macroblock, Pll36 / 98 MX .J-ÉÉÍTue corresponds to a coding mode in advance; determine a backward sum of the absolute error differences, SADbac] ward, fie? d for the current macroblock in relation to a macroblock of 5 future reference, which corresponds to a backward encoding mode; determine an average sum of absolute error differences, SADaverage, field for the macroblock Current ^ in relation to an average of the macroblocks of 0 past and future reference, which correspond to an average coding mode; and select the coding mode in accordance with the minimum of the SADs. The method according to claim 12, which comprises the additional step of: selecting the encoding mode of jflk according to the minimum of the respective sums of the SADs with the corresponding bias terms that explain the number of required motion vectors of 0 the respective coding modes. The method according to claim 12 or 13, wherein: SADforward field? E determines in accordance with the sum of: (a) the sum of the absolute differences of the upper field of the current macroblock in relation to a P1136 / 98 MX upper field of the reference macroblock and, (b) the sum of the absolute differences of the lower field of the current macroblock relative to a lower field of the reference macroblock passed. 15. The method according to one of the claims 12 to 14, where: SADbackward field? 
E determines in accordance with the sum of: (a) the sum of the absolute differences of the - ^ upper field of the current macroblock in relation to a higher field of the future reference macroblock and, (b) the sum of the absolute differences of the lower field of the current macroblock in relation to a lower field of the future reference macroblock. 16. The method according to one of claims 15 to 15, wherein: SADaverage fieid is determined in accordance with j ^ - the sum of: (a) the sum of the absolute differences of the upper field of the current macroblock relative to a average of the upper fields of the past and future reference macroblocks and, (b) the sum of the absolute differences of the lower field of the current macroblock relative to an average of the lower fields of the past and future reference macroblocks. 17. A decoder for recovering a current macroblock of coded field and directly P1136 / 9í MX p having upper and lower fields in a digital video macroblock sequence from a received bit stream, where the current macroblock is predicted is bi-directionally using a reference macroblock 5 past coded field having upper and lower fields and a future coded field reference macroblock having upper and lower fields, comprising: - a means to recover MVtop, a vector of 10 forward movement of the upper field of the future macroblock that refers to one of the upper and lower fields of the macroblock passed and, MVb0. a forward movement vector of the lower field of the future macroblock that refers to one of the upper fields 15 below the last macroblock; and a means for determining forward and backward movement vectors for predicting at least one of the upper and lower fields of the current macroblock by scaling to the forward motion vector of the current macroblock. 20 corresponding field of the future macroblock. 18. 
The decoder according to claim 17, further comprising: a means for determining MV_f,top, the forward motion vector for predicting the upper field of the current macroblock, in accordance with the expression (MV_top * TR_B,top) / TR_D,top + MV_D; where TR_B,top corresponds to a temporal separation between the upper field of the current macroblock and the field of the past macroblock that is referred to by MV_top, TR_D,top corresponds to a temporal separation between the upper field of the future macroblock and the field of the past macroblock that is referred to by MV_top, and MV_D is a delta motion vector.

19. The decoder according to claim 18, wherein: MV_f,top is determined using integer division with truncation toward zero; and MV_top and MV_bot are motion vectors in half luma pixel units.

20. The decoder according to claim 18 or 19, wherein: TR_B,top and TR_D,top incorporate a temporal correction that accounts for whether the current field-coded image is top field first or bottom field first.

21. The decoder according to one of claims 17 to 20, further comprising: a means for determining MV_f,bot, the forward motion vector for predicting the lower field of the current macroblock, in accordance with the expression (MV_bot * TR_B,bot) / TR_D,bot + MV_D; where TR_B,bot corresponds to a temporal separation between the lower field of the current macroblock and the field of the past macroblock that is referred to by MV_bot, TR_D,bot corresponds to a temporal separation between the lower field of the future macroblock and the field of the past macroblock that is referred to by MV_bot, and MV_D is a delta motion vector.

22. The decoder according to claim 21, wherein: MV_f,bot is determined using integer division with truncation toward zero; and MV_top and MV_bot are motion vectors in half luma pixel units.

23.
The decoder according to claim 21 or 22, wherein: TR_B,bot and TR_D,bot incorporate a temporal correction that accounts for whether the current field-coded image is top field first or bottom field first.

24. The decoder according to one of claims 17 to 23, further comprising: a means for determining MV_b,top, the backward motion vector for predicting the upper field of the current macroblock, in accordance with one of the equations (a) MV_b,top = ((TR_B,top - TR_D,top) * MV_top) / TR_D,top and (b) MV_b,top = MV_f,top - MV_top; where TR_B,top corresponds to a temporal separation between the upper field of the current macroblock and the field of the past macroblock that is referred to by MV_top, TR_D,top corresponds to a temporal separation between the upper field of the future macroblock and the field of the past macroblock that is referred to by MV_top, and MV_f,top is the forward motion vector for predicting the upper field of the current macroblock.

25. The decoder according to claim 24, further comprising: a means for selecting equation (a) when a delta motion vector MV_D = 0; and a means for selecting equation (b) when MV_D ≠ 0.

26. The decoder according to one of claims 17 to 24, further comprising: a means for determining MV_b,bot, the backward motion vector for predicting the lower field of the current macroblock, in accordance with one of the equations (a) MV_b,bot = ((TR_B,bot - TR_D,bot) * MV_bot) / TR_D,bot and (b) MV_b,bot = MV_f,bot - MV_bot; where TR_B,bot corresponds to a temporal separation between the lower field of the current macroblock and the field of the past macroblock that is referred to by MV_bot, TR_D,bot corresponds to a temporal separation between the lower field of the future macroblock and the field of the past macroblock that is referred to by MV_bot, and MV_f,bot is the forward motion vector for predicting the lower field of the current macroblock.

27.
The decoder according to claim 26, further comprising: a means for selecting equation (a) when a delta motion vector MV_D = 0; and a means for selecting equation (b) when MV_D ≠ 0.

SUMMARY OF THE INVENTION

A system for encoding digital video images, for example video object planes with bidirectional prediction (B-VOPs) (420), in particular when the B-VOP and/or a reference image (400, 440) that is used to encode the B-VOP is encoded in an interlaced manner. For a B-VOP macroblock (420) that is co-located with a field-predicted macroblock in a future anchor image (440), direct-mode prediction is performed by calculating four field motion vectors (MV_f,top, MV_f,bot, MV_b,top, MV_b,bot) and subsequently generating the prediction macroblock. The four field motion vectors and their reference fields are determined from (1) a delta term (MV_D) of the coding vector of the current macroblock, (2) the two field motion vectors of the future anchor image (MV_top, MV_bot), (3) the reference fields (405, 410) used by the two field motion vectors of the co-located future anchor macroblock, and (4) the temporal separations (TR_B,top, TR_B,bot, TR_D,top, TR_D,bot), in field periods, between the current B-VOP fields and the anchor fields. Additionally, a coding mode decision process for the current MB selects a forward, backward, or average field coding mode according to a minimum sum of absolute difference error (SAD) obtained over the upper (430) and lower (425) fields of the current MB (420).

P1136/98 MX
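The direct-mode field motion vector derivation of claims 18 through 27, and the SAD-based mode decision of claims 12 and 13, can be sketched in code. This is an illustrative sketch only, not the patented implementation: the function and variable names are hypothetical, motion vectors are modeled as (x, y) pairs in half luma pixel units, and integer division truncates toward zero as claim 19 requires (Python's `//` operator floors instead, so a small helper is needed).

```python
def div_trunc(a, b):
    """Integer division truncating toward zero (Python's // floors instead)."""
    q = a // b
    if q < 0 and q * b != a:
        q += 1
    return q

def scale_mv(mv, tr_b, tr_d, mv_delta):
    """Forward field MV per claims 18/21: (MV * TR_B) / TR_D + MV_D, per component."""
    return tuple(div_trunc(c * tr_b, tr_d) + d for c, d in zip(mv, mv_delta))

def backward_mv(mv, mv_f, tr_b, tr_d, mv_delta):
    """Backward field MV per claims 24-27: equation (a) when MV_D = 0, else (b)."""
    if mv_delta == (0, 0):
        # (a): MV_b = ((TR_B - TR_D) * MV) / TR_D
        return tuple(div_trunc((tr_b - tr_d) * c, tr_d) for c in mv)
    # (b): MV_b = MV_f - MV
    return tuple(f - c for f, c in zip(mv_f, mv))

def direct_mode_field_mvs(mv_top, mv_bot, tr_b_top, tr_d_top,
                          tr_b_bot, tr_d_bot, mv_delta):
    """Compute the four direct-mode field motion vectors
    (MV_f,top, MV_f,bot, MV_b,top, MV_b,bot) from the two anchor field
    motion vectors, the field-period separations, and the delta vector."""
    mv_f_top = scale_mv(mv_top, tr_b_top, tr_d_top, mv_delta)
    mv_f_bot = scale_mv(mv_bot, tr_b_bot, tr_d_bot, mv_delta)
    mv_b_top = backward_mv(mv_top, mv_f_top, tr_b_top, tr_d_top, mv_delta)
    mv_b_bot = backward_mv(mv_bot, mv_f_bot, tr_b_bot, tr_d_bot, mv_delta)
    return mv_f_top, mv_f_bot, mv_b_top, mv_b_bot

def select_field_coding_mode(sad_forward, sad_backward, sad_average,
                             bias=(0, 0, 0)):
    """Pick the forward, backward, or average mode by minimum biased SAD
    (claims 12-13). Each SAD argument is assumed to already be the sum of the
    upper-field and lower-field SADs; the optional bias terms account for the
    number of motion vectors each mode requires."""
    candidates = [(sad_forward + bias[0], 'forward'),
                  (sad_backward + bias[1], 'backward'),
                  (sad_average + bias[2], 'average')]
    return min(candidates)[1]
```

For example, with MV_top = (4, -6) in half-pel units, TR_B,top = 1 and TR_D,top = 3 field periods, and MV_D = (0, 0), the sketch yields MV_f,top = (1, -2) by scaling and MV_b,top = (-2, 4) by equation (a); with a nonzero delta vector, equation (b) takes over and only one backward vector subtraction per field is needed.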
MXPA/A/1998/001810A 1997-03-07 1998-03-06 Prediction and coding of bi-directionally predicted video object planes for interlaced digital video

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US040120 1997-03-07
US042245 1997-03-31
US08944118 1997-10-06

Publications (1)

Publication Number Publication Date
MXPA98001810A (en) 1999-05-31

Similar Documents

Publication Publication Date Title
US5991447A (en) Prediction and coding of bi-directionally predicted video object planes for interlaced digital video
US6005980A (en) Motion estimation and compensation of video object planes for interlaced digital video
CA2230422C (en) Intra-macroblock dc and ac coefficient prediction for interlaced digital video
CA2238900C (en) Temporal and spatial scaleable coding for video object planes
US6483874B1 (en) Efficient motion estimation for an arbitrarily-shaped object
US7463685B1 (en) Bidirectionally predicted pictures or video object planes for efficient and flexible video coding
JP2000023193A (en) Method and device for picture encoding, method and device for picture decoding and provision medium
EP1820351A1 (en) Apparatus for universal coding for multi-view video
EP2293576B1 (en) Motion picture encoding
JP2000023194A (en) Method and device for picture encoding, method and device for picture decoding and provision medium
JP3440830B2 (en) Image encoding apparatus and method, and recording medium
USRE38564E1 (en) Motion estimation and compensation of video object planes for interlaced digital video
MXPA98001810A (en) Prediction and coding of bi-directionally predicted video object planes for interlaced digital video
AU758254B2 (en) Padding of video object planes for interlaced digital video
AU728756B2 (en) Motion estimation and compensation of video object planes for interlaced digital video
MXPA98001809A (en) Motion estimation and compensation of video object planes for interlaced digital video
MXPA98001807A (en) Intra-macroblock DC and AC coefficient prediction for interlaced digital video
EP1958449A1 (en) Method of predicting motion and texture data
MXPA98004502A (en) Temporal and spatial scalable coding for video object planes