CN111989926B - Method and apparatus for universal OBMC - Google Patents

Method and apparatus for universal OBMC

Info

Publication number
CN111989926B
CN111989926B (application number CN201980015256.0A)
Authority
CN
China
Prior art keywords
block
sub
current
motion vector
smoothing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201980015256.0A
Other languages
Chinese (zh)
Other versions
CN111989926A (en)
Inventor
A. Robert
F. Le Leannec
T. Poirier
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
InterDigital VC Holdings Inc
Original Assignee
InterDigital VC Holdings Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from EP18305196.0A external-priority patent/EP3531701A1/en
Application filed by InterDigital VC Holdings Inc filed Critical InterDigital VC Holdings Inc
Publication of CN111989926A publication Critical patent/CN111989926A/en
Application granted granted Critical
Publication of CN111989926B publication Critical patent/CN111989926B/en

Classifications

    All classifications fall under H04N19/00 (HELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; methods or arrangements for coding, decoding, compressing or decompressing digital video signals):
    • H04N19/513 Processing of motion vectors (under H04N19/50 predictive coding, H04N19/503 temporal prediction, H04N19/51 motion estimation or motion compensation)
    • H04N19/583 Motion compensation with overlapping blocks (under H04N19/50, H04N19/503, H04N19/51)
    • H04N19/117 Filters, e.g. for pre-processing or post-processing (under H04N19/10 adaptive coding, H04N19/102 characterised by the element, parameter or selection affected or controlled by the adaptive coding)
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter (under H04N19/10, H04N19/134)
    • H04N19/176 The coding unit being an image region, the region being a block, e.g. a macroblock (under H04N19/10, H04N19/169, H04N19/17)

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

For a block of video data partitioned into sub-blocks, predictions from neighboring sub-blocks are used to refine each sub-block's prediction: the current prediction of a sub-block is combined with weighted versions of the neighboring predictions. Each neighboring sub-block's motion vector is checked to determine whether it differs from the motion vector of the sub-block being predicted; if it does, a prediction of the current sub-block is generated using that neighboring sub-block. In an embodiment, when the size of the block containing the sub-block is less than a particular size, only two rows or columns of pixels within the sub-block are filtered to form the prediction of the current sub-block.

Description

Method and apparatus for universal OBMC
Technical Field
The present principles relate to the field of video compression.
Background
In the HEVC video compression standard (international telecommunication union, ITU-T h.265 efficient video coding), pictures are divided into so-called Coding Tree Units (CTUs), which are typically 64×64, 128×128, or 256×256 pixels in size.
Each CTU is represented by a coding tree in the compressed domain. As shown in fig. 1, this is a quadtree partitioning of CTUs, where each leaf is called a Coding Unit (CU).
Then, some intra or inter prediction parameters (prediction information) are provided for each CU. To this end, a CU is spatially partitioned into one or more Prediction Units (PUs), each PU being assigned some prediction information. Intra or inter coding modes are allocated at the CU level, see fig. 2.
According to the HEVC standard, the coding units are also recursively divided into so-called transform units following a "transform tree". Thus, the transform tree is a quadtree partition of the coding unit and the transform units are leaves of the transform tree. The transform unit encapsulates a square transform block corresponding to each picture component of the square spatial region under consideration. The transform block is a square block of samples in a single component, where the same transform is applied.
The emerging video compression tools include coding tree unit representations in the compressed domain, which are proposed for representing picture data in a more flexible manner in the compressed domain. An advantage of this flexible representation of the coding tree is that it provides increased compression efficiency compared to the CU/PU/TU arrangement of the HEVC standard.
Disclosure of Invention
These and other drawbacks and disadvantages of the prior art are addressed by at least one of the described embodiments, which is directed to a method and apparatus for encoding or decoding a block of video data. In at least one embodiment, it is proposed to generalize the overlapped block motion compensation (OBMC) process applied after motion-compensated inter prediction for all coding units.
According to at least one general embodiment described herein, a method for encoding a block of video data is provided. The method comprises the following steps: comparing pairs of horizontal and vertical motion vectors around a sub-block of a video coding block with current motion vectors of the sub-block, respectively, to check for differences; filtering the predicted pixels of the sub-block using at least one neighboring sub-block prediction from a different motion vector and the predicted pixels using the current motion vector to generate a prediction of the sub-block; and encoding the sub-block using the filtered prediction.
According to at least one general embodiment described herein, a method for decoding a block of video data is provided. The method comprises the following steps: comparing pairs of horizontal and vertical motion vectors around a sub-block of a video coding block with current motion vectors of the sub-block, respectively, to check for differences; filtering the predicted pixels of the sub-block using at least one neighboring sub-block prediction from a different motion vector and the predicted pixels using the current motion vector to generate a prediction of the sub-block; and decoding the sub-block using the filtered prediction.
According to another general embodiment described herein, an apparatus for encoding a block of video data is provided. The device comprises: a memory, and a processor configured to: comparing pairs of horizontal and vertical motion vectors around a sub-block of a video coding block with current motion vectors of the sub-block, respectively, to check for differences; filtering the predicted pixels of the sub-block using at least one neighboring sub-block prediction from a different motion vector and the predicted pixels using the current motion vector to generate a prediction of the sub-block; and encoding the sub-block using the filtered prediction.
According to another general embodiment described herein, an apparatus for decoding a block of video data is provided. The device comprises: a memory, and a processor configured to: comparing pairs of horizontal and vertical motion vectors around a sub-block of a video coding block with current motion vectors of the sub-block, respectively, to check for differences; filtering the predicted pixels of the sub-block using at least one neighboring sub-block prediction from a different motion vector and the predicted pixels using the current motion vector to generate a prediction of the sub-block; and decoding the sub-block using the filtered prediction.
According to another aspect described herein, there is provided a non-transitory computer readable storage medium containing data content generated by a method according to any of the described method embodiments or by an apparatus of any of the described apparatus embodiments for playback using a processor.
According to another aspect described herein, there is provided a signal comprising video data generated according to the method for encoding any of the blocks of video data in the described method embodiments or generated by the apparatus for encoding any of the blocks of video data in the described apparatus embodiments for playback using a processor.
According to another aspect described herein, there is provided a computer program product comprising instructions which, when the program is executed by a computer, cause the computer to perform the method of any of the described method embodiments.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
Drawings
Fig. 1 shows an example of a coding tree unit and coding tree concept representing compressed pictures.
Fig. 2 shows an example of dividing a coding tree unit into a coding unit, a prediction unit, and a transform unit.
Fig. 3 shows a standard generic video compression scheme.
Fig. 4 shows a standard generic video decompression scheme.
Fig. 5 shows a prior art OBMC applied to a 32 x 16 entire inter-coded unit.
Fig. 6 shows a prior art OBMC process for an entire inter-coded unit.
Fig. 7 shows a prior art OBMC applied to inter-coded units of 16 x 16 partition.
Fig. 8 shows a prior art OBMC process for partitioned inter-coded units.
Fig. 9 shows prior art OBMC applied to an inter-coded unit divided into 16 x 16 partitions but detected as a whole coding unit.
Fig. 10 shows prior art OBMC applied to a 32 x 16 whole inter-coded unit detected as divided into sub-blocks.
Fig. 11 shows one embodiment of the proposed generic OBMC procedure for all inter-coded units.
Fig. 12 shows an example of a proposed generic OBMC applied to 32 x 16 inter coded units in merge ATMVP mode.
Fig. 13 shows a corresponding prior art OBMC applied to 32 x 16 inter coded units in merge ATMVP mode.
Fig. 14 shows an example of the proposed OBMC procedure for a fine coding unit.
Fig. 15 shows a corresponding prior art OBMC procedure for a fine coding unit.
Fig. 16 shows an example embodiment of a proposed generic OBMC procedure for an entire inter-coded unit.
Fig. 17 shows an example embodiment of the proposed generic OBMC procedure for all inter-coded units.
Fig. 18 shows an embodiment of the proposed OBMC method for an encoder.
Fig. 19 shows an embodiment of the proposed OBMC method for a decoder.
Fig. 20 shows an embodiment of an apparatus for the proposed OBMC procedure.
Detailed description of the preferred embodiments
A method for improving the coding efficiency of a video signal is described. In particular, an improved method of OBMC (Overlapped Block Motion Compensation) is described.
In the HEVC video compression standard, motion compensated temporal prediction is employed to exploit redundancy that exists between successive pictures of video.
To this end, a motion vector is associated with each Prediction Unit (PU), as will now be described. Each CTU is represented by a coding tree in the compressed domain. As shown in fig. 1, this is a quadtree partitioning of the CTU, where each leaf is called a Coding Unit (CU).
Then, some intra or inter prediction parameters (prediction information) are provided for each CU. To this end, a CU is spatially partitioned into one or more Prediction Units (PUs), each PU being assigned some prediction information. Intra or inter coding modes are allocated at the CU level, see fig. 2.
In HEVC, each PU is assigned exactly one motion vector. The motion vector is used for motion compensated temporal prediction of the PU under consideration.
In the Joint Exploration Model (JEM) developed by the JVET (joint video exploration team) group, CUs are no longer divided into PUs or TUs, and some motion data is directly assigned to each CU. In this new codec design, a CU may be divided into sub-CUs, and a motion vector may be calculated for each sub-CU.
In JEM, for all inter CUs, regardless of coding mode, the motion compensation step is followed by a process called overlapped block motion compensation (OBMC), which aims to attenuate motion transitions between CUs (somewhat as a deblocking filter attenuates blocking artifacts). However, the OBMC method applied depends on the CU coding mode: there are two different procedures, one for CUs that are divided into smaller parts (affine, FRUC, etc.), and another for the other CUs (whole CUs).
The method proposes to generalize the process of the OBMC tool performed on the encoder and decoder side immediately after the motion compensated inter prediction process.
The problem solved by this method is how to generalize the OBMC procedure of all CUs to simplify the design and improve the overall compression performance of the video codec under consideration.
In the prior art approach, the CU divided into sub-portions does not follow the same OBMC procedure as the undivided CU (the entire CU).
The basic idea of the proposed method is to generalize the OBMC procedure performed after motion compensated inter prediction for all CUs.
The smoothing process applied after the prediction operation is described with respect to Overlapped Block Motion Compensation (OBMC). OBMC operates at the sub-block level, on sub-blocks of size 4 x 4 pixels. In the figures, the complete block is a Coding Unit (CU), while the small squares are 4 x 4 sub-blocks.
At each step, the process builds two predictions, Pc and Pn: Pc is the current sub-block compensated with the motion vector of the current CU, and Pn is the corresponding sub-block compensated with the motion vector of a neighboring sub-block (in both cases the sub-block is selected from the reference picture using motion compensation, i.e. using the motion vector).
The current prediction (Pc) is then smoothed using another prediction (Pn) to give a new current prediction.
For example, if Pn is obtained using the left-side neighboring motion vector, the first (leftmost) column of pixels becomes Pc' = 3/4 Pc + 1/4 Pn, the second column becomes Pc' = 7/8 Pc + 1/8 Pn, and so on.
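The weighting above can be sketched as follows (an illustrative sketch only, not part of the claimed embodiments; function and variable names are hypothetical):

```python
# Smooth the first columns of the current 4x4 sub-block prediction Pc with
# the prediction Pn obtained from the left neighbor's motion vector, using
# the weights 1/4 and 1/8 from the text: column k becomes (1-w)*Pc + w*Pn.

def blend_from_left(pc, pn, weights=(1 / 4, 1 / 8)):
    out = [row[:] for row in pc]
    for k, w in enumerate(weights):
        for r in range(len(pc)):
            out[r][k] = (1 - w) * pc[r][k] + w * pn[r][k]
    return out

pc = [[100] * 4 for _ in range(4)]  # flat current prediction
pn = [[20] * 4 for _ in range(4)]   # flat left-neighbor prediction
res = blend_from_left(pc, pn)
# First column: 3/4*100 + 1/4*20 = 80; second column: 7/8*100 + 1/8*20 = 90
```

Columns beyond the weighted ones are left untouched, matching the "and so on" tapering of the weights toward the sub-block interior.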
Thus, in the figures, the striped lines indicate the direction of the smoothing process. If the stripes fill a 4 x 4 sub-block, 4 rows/columns are filtered; if they fill only half of the sub-block, only 2 rows/columns are filtered. The stripes themselves do not represent the number of rows/columns of pixels.
One major difference between the prior art OBMC and the generic OBMC procedure is the manner in which the number of rows/columns of pixels to be filtered are defined.
In the prior art, the number of filtered rows/columns is set a priori for all directions according to the coding mode of the current coding unit. If the coding unit is divided into sub-blocks, 2 rows/columns are filtered using each neighbor; if the entire coding unit is treated as one entity, 4 rows/columns are filtered, except that if the area of the coding unit is less than 64, only 2 rows/columns are filtered.
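The prior-art rule can be summarized by a small helper (a sketch with hypothetical names, not taken from the JEM source code):

```python
# Prior art: the number of rows/columns filtered per neighbor is fixed
# a priori from the coding mode and the CU area, the same for all directions.

def prior_art_num_lines(divided_into_subblocks, width, height):
    if divided_into_subblocks:
        return 2                 # sub-block CUs: 2 rows/columns per neighbor
    if width * height < 64:      # small whole CUs: 4x4, 8x4, 4x8
        return 2
    return 4                     # other whole CUs: 4 rows/columns
```

For example, a 32 x 16 whole CU gets 4 lines per neighbor, while a 4 x 8 whole CU or any sub-block-divided CU gets 2.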
In an embodiment of the generic OBMC, the number may be different for each direction and for each sub-block. It is determined from the two opposite neighbors: if both are available and have a motion vector different from that of the current sub-block, 2 rows/columns are filtered using each neighbor; if only one is available, 4 rows/columns are filtered using that neighbor; and if neither is available, or both motion vectors are equal to the current sub-block's motion vector, no filtering is performed.
For example, when testing the horizontal direction, the left and right motion vectors are taken, if possible, from the left and right sub-blocks of the current block. If both differ from the current motion vector of the current sub-block, the first (leftmost) column of pixels of the current sub-block becomes Pc' = 3/4 Pc + 1/4 PL, the second column becomes Pc' = 7/8 Pc + 1/8 PL, the last column becomes Pc' = 3/4 Pc + 1/4 PR, and the third column becomes Pc' = 7/8 Pc + 1/8 PR.
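The two-sided horizontal smoothing above can be sketched as follows (an illustrative sketch for a 4 x 4 sub-block; names such as blend_horizontal are hypothetical):

```python
# Columns 0 and 1 are blended with the left prediction PL (weights 1/4, 1/8),
# and columns 3 and 2 with the right prediction PR, mirroring the weights.

def blend_horizontal(pc, pl, pr):
    out = [row[:] for row in pc]
    last = len(pc[0]) - 1
    for k, w in enumerate((1 / 4, 1 / 8)):
        for r in range(len(pc)):
            out[r][k] = (1 - w) * pc[r][k] + w * pl[r][k]                    # from PL
            out[r][last - k] = (1 - w) * pc[r][last - k] + w * pr[r][last - k]  # from PR
    return out

pc = [[100] * 4 for _ in range(4)]
pl = [[20] * 4 for _ in range(4)]
pr = [[60] * 4 for _ in range(4)]
res = blend_horizontal(pc, pl, pr)
# Columns become: 80 (PL, 1/4), 90 (PL, 1/8), 95 (PR, 1/8), 90 (PR, 1/4)
```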
Furthermore, in the prior-art OBMC, if the area of the current CU is smaller than 64 (4×4, 8×4, 4×8), the number of rows/columns of pixels to be filtered is forced to 2.
This constraint limits filtering when the coding unit is not large. For example, for a 4 x 4 CU there is only one sub-block, and if this criterion is not used, all rows and columns will be filtered from the left and then above, which may be somewhat excessive. If this criterion is used, only two rows and two columns will be filtered.
Some CUs with an area larger than 64 may have a size of 4 pixels in one direction (4×16, 16×4, 4×32, 32×4, etc.). Since in the proposed generic OBMC the number of pixels to be filtered may be set differently for each direction, the area criterion can be replaced by a direction-dependent size criterion: if the size of the CU in a given direction is smaller than 8, the number of rows/columns smoothed in that direction is forced to 2, while it may remain 4 in the other direction, as shown in fig. 14. Fig. 15 shows the excessive filtering of the prior art.
The proposed embodiments comprise:
- generalizing the OBMC procedure for all CUs, whatever their coding mode (encoder/decoder);
- adapting the procedure to fine CUs (encoder/decoder);
- speeding up the process for some specific CUs (encoder/decoder).
The affected codec modules are the motion compensation 170 and motion estimation 175 of fig. 3, and 275 of fig. 4.
OBMC (overlapped block motion compensation) aims to reduce motion transitions between CUs and, for CUs internally divided into sub-blocks, between sub-blocks.
In the prior art, the first step of the OBMC procedure consists in detecting the kind of CU to be filtered: whether it is whole or divided into sub-blocks. By default, an incoming CU is considered whole. In the actual JEM, a CU divided into sub-blocks is one encoded using merge mode with ATMVP/STMVP predictors, FRUC merge mode, or affine mode.
The OBMC procedure then applied to these two kinds of CUs differs.
According to the prior art, for a whole CU, motion transitions occur on the upper and left boundaries (the lower and right neighbors have not yet been encoded/decoded), so for these CUs OBMC is applied only to the 4 x 4 sub-blocks of the upper row and left column, as shown in fig. 5 (prior art OBMC applied to a 32 x 16 whole inter-coded coding unit).
Pc represents the current 4 x 4 sub-block prediction obtained using motion compensation with the motion vector of the current CU, and Pn (where n is above (T) or left (L)) represents the corresponding 4 x 4 sub-block prediction obtained using the neighboring 4 x 4 sub-block's motion vector. Pn exists only if the motion vector of the neighboring 4 x 4 sub-block differs from the current motion vector, in which case OBMC can be applied to the current 4 x 4 sub-block.
For each 4 x 4 sub-block, the current prediction Pc is then filtered using the available predictions Pn to smooth the motion transitions.
Pixels of 4 rows and/or columns of Pn are added to the current prediction Pc, using the weighting factors {1/4, 1/8, 1/16, 1/32} for Pn and {3/4, 7/8, 15/16, 31/32} for Pc.
If the current CU area is smaller than 64 (4 x 4, 8 x 4, and 4 x 8 CUs), only the pixels of the first two rows/columns are filtered, using the first two weighting factors.
Since several pixels are filtered in turn in the first 4 x 4 sub-block, the result depends on the order in which the different neighbors are used. Here, OBMC filters from the left and then from above, which means that Pc becomes:
- for the first 4 x 4 sub-block: Pc --PL--> Pc' --PT--> Pc''
- for the other sub-blocks of the first column: Pc --PL--> Pc'
- for the other sub-blocks of the first row: Pc --PT--> Pc'
The complete OBMC procedure for the entire CU is shown in fig. 6.
According to the prior art, for CUs divided into sub-blocks, motion transitions occur at the boundaries between sub-blocks as well as above and to the left of each sub-block, so for these CUs OBMC is applied to every 4 x 4 sub-block, as shown in fig. 7 (prior art OBMC applied to an inter-coded coding unit divided into 16 x 16 partitions).
In this case, Pn may be obtained from 4 different neighbors, i.e., the neighboring sub-blocks above (T), left (L), below (B), and right (R), if available (belonging to a causal CU or to the current CU, and with a motion vector different from the current motion vector).
For each 4 x 4 sub-block, the current prediction Pc is then filtered using all available predictions Pn to smooth motion transitions.
The first two rows or columns of pixels of each sub-block are a weighted sum of the current prediction Pc and the Pn from the adjacent sub-block above or to the left, respectively. The last two rows or columns of pixels of each sub-block use the Pn from the adjacent sub-block below or to the right, respectively. The weighting factors used here are {1/4, 1/8} for Pn and {3/4, 7/8} for Pc.
Since almost all pixels are filtered several times in sequence, the result depends on the order in which the different neighbors are used. Here, OBMC filters from the left, from above, then from the right and from below, which means that Pc becomes:
- for all 4 x 4 sub-blocks except those of the last row and last column:
Pc --PL--> Pc' --PT--> Pc'' --PR--> Pc''' --PB--> Pc''''
- for the last-row 4 x 4 sub-blocks except the last 4 x 4 sub-block:
Pc --PL--> Pc' --PT--> Pc'' --PR--> Pc'''
- for the last-column 4 x 4 sub-blocks except the last 4 x 4 sub-block:
Pc --PL--> Pc' --PT--> Pc'' --PB--> Pc'''
- for the last 4 x 4 sub-block: Pc --PL--> Pc' --PT--> Pc''
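The sequential, order-dependent application described by these chains can be sketched as follows (an illustrative sketch, not the JEM implementation; the stub blend merely records the order of application):

```python
# Apply the neighbor-based smoothing once per available side, in the fixed
# order left, above, right, below. `blend` stands for the weighted smoothing
# of the text; here a stub appends a prime per application.

def obmc_sequence(available_sides, blend):
    pc = "Pc"
    for side in ("L", "T", "R", "B"):
        if side in available_sides:
            pc = blend(pc, side)
    return pc

applied = []
result = obmc_sequence({"T", "L", "B"},
                       lambda pc, side: applied.append(side) or pc + "'")
# For a last-column sub-block (no right neighbor): Pc -> Pc' -> Pc'' -> Pc'''
```

Because each step filters the output of the previous one, swapping the order of the sides would generally change the final prediction, which is why the order is fixed.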
The OBMC process of these sub-block divided CUs is shown in fig. 8.
The main limitation of this tool is that it needs to detect whether the CU to be processed is divided into sub-blocks, and it defaults to whole.
In practice, when a new coding mode is added, the CU is considered whole by default. In this case, if the CU is actually divided into sub-blocks, the classification error shown in fig. 9 can occur.
That is, if a CU divided into sub-blocks is not classified correctly, only the sub-blocks of the first row and first column are filtered, but over four rows/columns of pixels, as shown in fig. 9, instead of the result of fig. 7.
In the same way, if a whole CU is classified as divided into sub-blocks, all sub-blocks will be processed. Since all motion vectors within the current CU have the same value, the OBMC will filter only the sub-blocks of the first row and first column, but smooth only two rows/columns of pixels, as shown in fig. 10, instead of the correct result of fig. 5.
Thus, when a CU is not well classified, the subsequent OBMC process will be different, as will the result.
Every time an existing tool is modified and changes from one category to the other, or a new tool is added, the OBMC must be modified to classify the CU correctly; otherwise the process is sub-optimal.
The following paragraphs describe the general OBMC procedure proposed in this method.
The proposed solution does not require classifying the CU to obtain the same results as the prior art OBMC, and it is not limited by the CU area.
The proposed method is based on a prior art OBMC procedure for CUs divided into sub-blocks, wherein all surrounding neighbors are considered.
In the proposed generic OBMC procedure, the four neighboring sub-blocks are examined in pairs, left-right and above-below (i.e. horizontally and vertically), but are still used sequentially to smooth the current prediction.
For each 4 x 4 sub-block of the current coding unit and each direction, the two neighboring motion vectors are taken, if available, i.e. if they differ from the motion vector of the current sub-block.
If both neighboring motion vectors are available, the number of pixel rows/columns to be filtered along the tested direction is set to 2: the first two rows or columns are filtered using the first neighbor and the last two using the second neighbor. If only one neighboring motion vector is available, the number is set to 4: 4 rows/columns are filtered using that neighbor. If neither is available, OBMC is not applied along that direction.
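This availability rule can be captured in a minimal sketch for one direction, left-right or above-below (names are hypothetical):

```python
# Number of rows/columns filtered per available neighbor along one direction
# of the generic OBMC: 2 each when both opposite neighbors are available,
# 4 from the single available one, 0 when neither is available.

def generic_num_lines(first_available, second_available):
    if first_available and second_available:
        return 2   # 2 lines from each of the two opposite neighbors
    if first_available or second_available:
        return 4   # 4 lines from the single available neighbor
    return 0       # neither available: no filtering in this direction
```

The same rule is evaluated independently for the horizontal and vertical pairs, which is what allows different numbers of smoothed lines per direction.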
The sub-blocks are then smoothed using the available neighboring motion vectors, their associated Pn, and the same weighting factors as the prior art OBMC ({1/4, 1/8, 1/16, 1/32} for Pn, {3/4, 7/8, 15/16, 31/32} for Pc). These weights are applied starting from the tested neighbor and decreasing toward its opposite neighbor (e.g. left to right for the left neighbor, right to left for the right neighbor, and likewise top to bottom and bottom to top).
In the proposed solution, neighbors are examined in pairs per direction (left-right and above-below). This allows the number of rows or columns of smoothed pixels to be chosen as 2 or 4, instead of being set a priori according to the CU coding mode.
The OBMC process smoothes motion transitions by filtering the current prediction; the proposed solution described herein therefore slightly modifies the current prediction of each sub-block of the coding unit.
Furthermore, it is important to note that this solution allows the number of smoothed rows/columns to be selected individually in each direction, and they may differ. This adapts to motion transitions within the CU better than the prior art OBMC of fig. 13, as shown in fig. 12 for a CU encoded in merge ATMVP/STMVP mode.
In this example, with the generic OBMC procedure the 8 x 16 partition on the right is completely smoothed as if it were a separate CU, which is not the case with the prior art OBMC.
For the entire CU as shown in fig. 5, and for the CU (affine, FRUC) fully divided into sub-blocks as shown in fig. 7, the output of the generic OBMC process is the same as the output of the prior art OBMC.
Only for CUs that are not fully divided into sub-blocks can a difference in the results be observed, as shown in fig. 12 for a CU encoded using the merge ATMVP/STMVP mode.
In the prior art OBMC, only two rows/columns of pixels are filtered when the area of the CU is smaller than 64.
This concept can also be generalized in the newly proposed OBMC process.
For each pair of neighbors, if the corresponding size of the CU (width for the horizontal pair, height for the vertical pair) is less than 8, the number of pixels to be filtered is forced to 2. Fine CUs may then be filtered along 2 rows and 4 columns of pixels, or vice versa, as shown in fig. 14.
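Combining the pair-availability rule with this direction-dependent size criterion can be sketched as follows (an illustrative sketch with hypothetical names, not the codec's actual implementation):

```python
# Per-direction line count for the generic OBMC: 2 per neighbor when both
# opposite neighbors are available, 4 for a single one, 0 for none; and if
# the CU measures fewer than 8 pixels along the tested direction, at most
# 2 rows/columns are filtered in that direction.

def generic_num_lines_sized(first_available, second_available, cu_size_along_dir):
    if first_available and second_available:
        n = 2
    elif first_available or second_available:
        n = 4
    else:
        return 0
    return 2 if cu_size_along_dir < 8 else n

# A 4x16 CU: at most 2 columns filtered horizontally (size 4 < 8),
# while up to 4 rows may still be filtered vertically (size 16 >= 8).
```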
This generalized criterion also adapts to motion transitions better than the prior art OBMC, which filters only 2 pixels for CUs with area smaller than 64 (4 x 4, 8 x 4, and 4 x 8) and 4 pixels for larger CUs (4 x 16, 16 x 4, 4 x 32, 32 x 4, ...), in which case the entire width or height is smoothed, as shown in fig. 15.
The complexity of the proposed generic OBMC is higher than that of the prior art OBMC, because it examines the 4 neighbors of all sub-blocks of all CUs, whereas the prior art OBMC does so only for CUs divided into sub-blocks.
In order to speed up the generic OBMC process, the same classification as in the prior art OBMC can be used.
The classification used herein isolates whole CUs (rather than CUs divided into sub-blocks) and treats an incoming CU as divided into sub-blocks by default (rather than as whole). This reverse classification limits the errors associated with poor detection: the error of fig. 9 can still be observed, but the error of fig. 10 no longer occurs. Moreover, when a new coding mode is added, it is by default treated as divided into sub-blocks; if the CU is in fact whole, the misclassification of fig. 10 occurs, but with the generic OBMC it no longer produces a different result, only extra complexity. Thus, the OBMC must be modified when a coding mode changes from whole to divided, but when a coding mode changes from divided to whole, or when a new coding mode is added, the OBMC result is unchanged, and the complexity can be reduced simply by informing the OBMC.
After classification, whole CUs go through a simplified and faster OBMC process in which only the first neighbor of each pair is used, and only the sub-blocks of the first row and first column are filtered (the opposite neighbor is always considered unavailable).
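The simplified whole-CU path can be sketched by enumerating which sub-blocks are touched (a sketch with illustrative names; indices are over the grid of 4 x 4 sub-blocks):

```python
# For a CU classified as whole, only sub-blocks of the first row and first
# column are smoothed, using only the first neighbor of each pair (above and
# left); all other sub-blocks are skipped.

def whole_cu_filtered_subblocks(sub_cols, sub_rows):
    first_row = [(x, 0) for x in range(sub_cols)]
    first_col = [(0, y) for y in range(1, sub_rows)]
    return first_row + first_col

# A 32x16 CU has an 8x4 grid of 4x4 sub-blocks: 8 + 3 = 11 filtered sub-blocks
# instead of 32, which is where the speed-up comes from.
subs = whole_cu_filtered_subblocks(8, 4)
```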
The OBMC procedure for the CU classified as a whole will become as shown in fig. 16, while the OBMC procedure for the other CUs is still as shown in fig. 11.
In a preferred embodiment, the three proposed improvements of OBMC are used together: the generic procedure, the generalized criterion for fine CUs, and the simplified version for whole CUs.
The process for whole CUs is depicted in Fig. 16, the process for the other CUs is depicted in Fig. 11, and the additional thin-CU management is shown in Fig. 17.
The prediction operation following the smoothing process is described here with respect to Overlapped Block Motion Compensation (OBMC), but it can be generalized to other prediction methods.
At each step, the process builds two predictions, Pc and Pn, which are corresponding sub-blocks selected from the reference picture using motion compensation (i.e., using motion vectors): Pc is compensated with the motion vector of the current CU, and Pn with the motion vector of the neighboring sub-block.
The current prediction (Pc) is then smoothed using the other prediction (Pn) to give a new current prediction.
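As a sketch of this smoothing, the following blends Pn into the first rows of Pc with per-row weighting factors. The weights {1/4, 1/8, 1/16, 1/32} are an assumption made here for illustration (they match JEM-style OBMC) and are not prescribed by this description:

```python
import numpy as np

# Assumed per-row/column blending weights, by distance from the CU boundary.
WEIGHTS = [1 / 4, 1 / 8, 1 / 16, 1 / 32]

def smooth_with_top_neighbor(pc: np.ndarray, pn: np.ndarray, num_rows: int) -> np.ndarray:
    """Return a new current prediction: the first `num_rows` rows of Pc are
    blended with the co-located rows of Pn (the prediction obtained with the
    top neighbor's motion vector); the remaining rows are left untouched."""
    out = pc.astype(np.float64).copy()
    for r in range(num_rows):
        w = WEIGHTS[r]
        out[r, :] = (1 - w) * out[r, :] + w * pn[r, :]
    return out
```

Filtering from a left neighbor is the same operation applied to columns instead of rows.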
One major difference between the prior art OBMC and the generic OBMC process is the way the number of rows/columns of pixels to be filtered is defined.
In the prior art, this number is set a priori for all directions according to the coding mode of the current coding unit: if the coding unit is divided into sub-blocks, 2 rows/columns are filtered using each neighbor; if the whole coding unit is treated as one entity, 4 rows/columns are filtered, unless the area of the coding unit is smaller than 64, in which case 2 rows/columns are filtered.
In an embodiment of the generic OBMC, the number may differ for each direction and for each sub-block. It is defined according to the two opposite neighbors: 2 rows/columns are filtered using each neighbor if both are available and have a motion vector different from that of the current sub-block, and 4 rows/columns are filtered using the available neighbor if only one of them is available.
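This per-direction rule can be sketched as follows (a hypothetical helper; "usable" is shorthand, introduced here, for a neighbor that is available and has a motion vector different from the current one):

```python
def filter_depth(this_usable: bool, opposite_usable: bool) -> int:
    """Number of rows/columns to filter from one neighbor of a pair:
    0 if this neighbor is unusable (unavailable, or same motion vector),
    2 if both neighbors of the pair are usable,
    4 if only this neighbor is usable."""
    if not this_usable:
        return 0
    return 2 if opposite_usable else 4
```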
Furthermore, in the prior art OBMC, if the area of the current CU is smaller than 64 (4×4, 8×4, 4×8), the number of pixels to be filtered is forced to 2.
This constraint limits the filtering when the coding unit is small. For example, a 4×4 CU contains only one sub-block; without this criterion, all rows and columns would be filtered from the left and then from above, which may be excessive. With this criterion, only two rows and two columns are filtered.
Some CUs with an area larger than 64 may nevertheless be only 4 pixels in one direction (4×16, 16×4, 4×32, 32×4, etc.). Since, in the proposed generic OBMC, the number of pixels to be filtered can be set separately for each direction, the area criterion can be turned into a per-direction size criterion: if the size of the CU in a given direction is smaller than 8, the number of pixels smoothed in that direction is forced to 2, while it may be 4 in the other direction, as shown in Fig. 14. Fig. 15 shows the excessive filtering of the prior art.
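The per-direction size criterion can be layered on top of the depth chosen from the neighbor pair; again a sketch with assumed names:

```python
def capped_filter_depth(depth: int, cu_size_in_direction: int) -> int:
    """Apply the per-direction size criterion: when the CU is less than
    8 pixels in this direction, force the smoothing depth to at most 2
    (cf. Fig. 14); otherwise keep the depth chosen from the neighbor
    pair (2 or 4)."""
    if cu_size_in_direction < 8:
        return min(depth, 2)
    return depth
```

For a 4×16 CU this yields a depth of at most 2 across its 4-pixel dimension while still allowing 4 along its 16-pixel dimension.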
The burden of the OBMC design can be reduced by not computing Illumination Compensation (IC) parameters for each OBMC 4×S band, but instead inheriting the IC parameters from the neighboring 4×4 sub-blocks. For this purpose, the IC parameters of the current picture are stored together with the motion field information at 4×4 sub-block resolution.
The above embodiments are described with respect to an encoder or encoding operation. A corresponding decoder simply interprets the partitions generated by the encoder using the described embodiments, in the same way as with an RDO process or any other type of partitioning embodiment.
Fig. 18 illustrates one embodiment of a method 1800 for encoding a block of video data. The method begins at start block 1801 and proceeds to block 1810, where horizontal and vertical pairs of motion vectors around a sub-block are compared; the sub-block may be part of a larger block to be encoded. The method compares the motion vectors of the neighboring sub-blocks with the current motion vector of the sub-block. Control proceeds from block 1810 to block 1820, where the prediction of the sub-block is filtered using the current prediction of the sub-block and the neighboring predictions of the sub-blocks whose motion vectors differ from the current motion vector, to generate a smoothed prediction of the current sub-block. Control then proceeds from block 1820 to block 1830, where the sub-block is encoded using its smoothed prediction.
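The flow of blocks 1810-1830 can be sketched end to end. Everything here is illustrative, not a definitive implementation of the figure: the helper name, the restriction to top/left neighbors, and the two-deep blend with weights 1/4 and 1/8 are all assumptions:

```python
import numpy as np

def smooth_prediction(pc, current_mv, neighbors):
    """Blocks 1810-1820 of Fig. 18, sketched: compare the current sub-block's
    motion vector with each neighbor's; when they differ, blend the neighbor's
    prediction into the current prediction over two rows or two columns.
    `neighbors` maps a side ('top' or 'left') to a (motion_vector, Pn) pair."""
    out = pc.astype(np.float64).copy()
    weights = [1 / 4, 1 / 8]  # assumed, for illustration
    for side, (mv, pn) in neighbors.items():
        if mv == current_mv:  # block 1810: same motion, nothing to smooth
            continue
        for d, w in enumerate(weights):  # block 1820: blend two lines
            if side == "top":
                out[d, :] = (1 - w) * out[d, :] + w * pn[d, :]
            elif side == "left":
                out[:, d] = (1 - w) * out[:, d] + w * pn[:, d]
    return out  # block 1830 then encodes the sub-block from this prediction
```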
Fig. 19 illustrates one embodiment of a method 1900 for decoding a block of video data. The method begins at start block 1901 and proceeds to block 1910, where horizontal and vertical pairs of motion vectors around a sub-block are compared; the sub-block may be part of a larger block to be decoded. The method compares the motion vectors of the neighboring sub-blocks with the current motion vector of the sub-block. Control proceeds from block 1910 to block 1920, where the prediction of the sub-block is filtered using the current prediction of the sub-block and the neighboring predictions of the sub-blocks whose motion vectors differ from the current motion vector, to generate a smoothed prediction of the current sub-block. Control then proceeds from block 1920 to block 1930, where the sub-block is decoded using its smoothed prediction.
Fig. 20 illustrates one embodiment of an apparatus 2000 for encoding or decoding a block of video data. The apparatus includes a processor 2010 having input and output ports and being in signal connection with a memory 2020, which also has input and output ports. The apparatus may perform any of the foregoing method embodiments or any variation thereof.
The functions of the various elements shown in the figures may be provided through dedicated hardware as well as through hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Furthermore, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
Other conventional and/or custom hardware may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
The present description illustrates the present concepts. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present concepts and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the principles and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
In the claims hereof any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function; or b) any form of software, including, therefore, firmware, microcode, etc., combined with appropriate circuitry for executing the software to perform the function. The present principles defined by such claims reside in the fact that: the functions provided by the various recited elements are combined and brought together in the manner which is claimed. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to "one embodiment" or "an embodiment" of the present principles and other variations means that a particular feature, structure, characteristic, or the like described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.

Claims (14)

1. A method, comprising:
determining that at least one motion vector of a first pair of neighboring sub-blocks of a current sub-block of a current block is different from a motion vector of the current sub-block, the motion vector of the current sub-block being referred to as a current motion vector, the neighboring sub-blocks being located on opposite sides of the current sub-block in a first one of horizontal and vertical directions;
obtaining a prediction block of the current sub-block from each motion vector of the first pair of neighboring sub-blocks that is different from the current motion vector;
smoothing the predicted pixels of the current sub-block by sequentially adding, using weighting factors, the predicted pixels of each obtained prediction block;
repeating the determining, obtaining and smoothing for a second pair of neighboring sub-blocks located on opposite sides of the current sub-block in a second direction different from the first direction;
wherein, for the horizontal and vertical directions respectively, smoothing the predicted pixels of the current sub-block comprises smoothing two rows and two columns of predicted pixels, respectively, when both motion vectors of the pair are different from the current motion vector, and smoothing four rows and four columns of predicted pixels, respectively, when only one motion vector of the pair is different from the current motion vector; and
encoding the sub-block using the smoothed prediction.
2. The method of claim 1, wherein smoothing the predicted pixels of the current sub-block if only one motion vector of the pair is different from the current motion vector comprises smoothing four rows and four columns, respectively, only if the size of the current block is greater than a particular size, and otherwise smoothing two rows and two columns, respectively.
3. The method of any of claims 1-2, wherein the sub-block is part of a coding unit.
4. An apparatus for encoding a block of video data, comprising:
a memory; and
a processor configured to:
determine that at least one motion vector of a first pair of neighboring sub-blocks of a current sub-block of a current block is different from a motion vector of the current sub-block, the motion vector of the current sub-block being referred to as a current motion vector, the neighboring sub-blocks being located on opposite sides of the current sub-block in a first one of horizontal and vertical directions;
obtain a prediction block of the current sub-block from each motion vector of the first pair of neighboring sub-blocks that is different from the current motion vector;
smooth the predicted pixels of the current sub-block by sequentially adding, using weighting factors, the predicted pixels of each obtained prediction block;
repeat the determining, obtaining and smoothing for a second pair of neighboring sub-blocks located on opposite sides of the current sub-block in a second direction different from the first direction;
wherein, for the horizontal and vertical directions respectively, smoothing the predicted pixels of the current sub-block comprises smoothing two rows and two columns of predicted pixels, respectively, when both motion vectors of the pair are different from the current motion vector, and smoothing four rows and four columns of predicted pixels, respectively, when only one motion vector of the pair is different from the current motion vector; and
encode the sub-block using the smoothed prediction.
5. The apparatus of claim 4, wherein smoothing the predicted pixels of the current sub-block if only one motion vector of the pair is different from the current motion vector comprises smoothing four rows and four columns, respectively, only if the size of the current block is greater than a particular size, and otherwise smoothing two rows and two columns, respectively.
6. The apparatus of any of claims 4 to 5, wherein the sub-block is part of a coding unit.
7. A method, comprising:
determining that at least one motion vector of a first pair of neighboring sub-blocks of a current sub-block of a current block is different from a motion vector of the current sub-block, the motion vector of the current sub-block being referred to as a current motion vector, the neighboring sub-blocks being located on opposite sides of the current sub-block in a first one of horizontal and vertical directions;
obtaining a prediction block of the current sub-block from each motion vector of the first pair of neighboring sub-blocks that is different from the current motion vector;
smoothing the predicted pixels of the current sub-block by sequentially adding, using weighting factors, the predicted pixels of each obtained prediction block;
repeating the determining, obtaining and smoothing for a second pair of neighboring sub-blocks located on opposite sides of the current sub-block in a second direction different from the first direction;
wherein, for the horizontal and vertical directions respectively, smoothing the predicted pixels of the current sub-block comprises smoothing two rows and two columns of predicted pixels, respectively, when both motion vectors of the pair are different from the current motion vector, and smoothing four rows and four columns of predicted pixels, respectively, when only one motion vector of the pair is different from the current motion vector; and
decoding the sub-block using the smoothed prediction.
8. The method of claim 7, wherein smoothing the predicted pixels of the current sub-block if only one motion vector of the pair is different from the current motion vector comprises smoothing four rows and four columns, respectively, only if the size of the current block is greater than a particular size, and otherwise smoothing two rows and two columns, respectively.
9. The method of any of claims 7 to 8, wherein the sub-block is part of a coding unit.
10. An apparatus for decoding a block of video data, comprising:
a memory; and
a processor configured to:
determine that at least one motion vector of a first pair of neighboring sub-blocks of a current sub-block of a current block is different from a motion vector of the current sub-block, the motion vector of the current sub-block being referred to as a current motion vector, the neighboring sub-blocks being located on opposite sides of the current sub-block in a first one of horizontal and vertical directions;
obtain a prediction block of the current sub-block from each motion vector of the first pair of neighboring sub-blocks that is different from the current motion vector;
smooth the predicted pixels of the current sub-block by sequentially adding, using weighting factors, the predicted pixels of each obtained prediction block;
repeat the determining, obtaining and smoothing for a second pair of neighboring sub-blocks located on opposite sides of the current sub-block in a second direction different from the first direction;
wherein, for the horizontal and vertical directions respectively, smoothing the predicted pixels of the current sub-block comprises smoothing two rows and two columns of predicted pixels, respectively, when both motion vectors of the pair are different from the current motion vector, and smoothing four rows and four columns of predicted pixels, respectively, when only one motion vector of the pair is different from the current motion vector; and
decode the sub-block using the smoothed prediction.
11. The apparatus of claim 10, wherein smoothing the predicted pixels of the current sub-block if only one motion vector of the pair is different from the current motion vector comprises smoothing four rows and four columns, respectively, only if the size of the current block is greater than a particular size, and otherwise smoothing two rows and two columns, respectively.
12. The apparatus of any of claims 10 to 11, wherein the sub-block is part of a coding unit.
13. A non-transitory computer readable medium having instructions stored thereon, which when executed by a processor, cause the processor to perform the method of any of claims 1-3 and 7-9.
14. A computer program product comprising instructions which, when executed by a computer, cause the computer to decode a stream according to the method of any one of claims 7 to 9 or by the apparatus of any one of claims 10 to 12.
CN201980015256.0A 2018-02-26 2019-02-22 Method and apparatus for universal OBMC Active CN111989926B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
EP18305196.0A EP3531701A1 (en) 2018-02-26 2018-02-26 Method and apparatus for generalized obmc
EP18305196.0 2018-02-26
EP18305386.7 2018-03-30
EP18305386 2018-03-30
PCT/US2019/019073 WO2019165162A1 (en) 2018-02-26 2019-02-22 Method and apparatus for generalized obmc

Publications (2)

Publication Number Publication Date
CN111989926A CN111989926A (en) 2020-11-24
CN111989926B true CN111989926B (en) 2024-05-07

Family

ID=65576747

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980015256.0A Active CN111989926B (en) 2018-02-26 2019-02-22 Method and apparatus for universal OBMC

Country Status (5)

Country Link
US (1) US11563970B2 (en)
EP (1) EP3759919A1 (en)
KR (1) KR20200123787A (en)
CN (1) CN111989926B (en)
WO (1) WO2019165162A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023207511A1 (en) * 2022-04-29 2023-11-02 Mediatek Inc. Method and apparatus of adaptive weighting for overlapped block motion compensation in video coding system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105794210A (en) * 2013-12-06 2016-07-20 联发科技股份有限公司 Method and apparatus for motion boundary processing
WO2016200115A1 (en) * 2015-06-07 2016-12-15 엘지전자(주) Method and device for performing deblocking filtering
WO2017184970A1 (en) * 2016-04-22 2017-10-26 Vid Scale, Inc. Prediction systems and methods for video coding based on filtering nearest neighboring pixels
WO2017195554A1 (en) * 2016-05-13 2017-11-16 シャープ株式会社 Predicted image generation device, video decoding device and video encoding device
TW201742465A (en) * 2016-05-16 2017-12-01 高通公司 Affine motion prediction for video coding

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8144778B2 (en) * 2007-02-22 2012-03-27 Sigma Designs, Inc. Motion compensated frame rate conversion system and method
US9807424B2 (en) 2011-01-10 2017-10-31 Qualcomm Incorporated Adaptive selection of region size for identification of samples in a transition zone for overlapped block motion compensation
US9338476B2 (en) * 2011-05-12 2016-05-10 Qualcomm Incorporated Filtering blockiness artifacts for video coding
US9232237B2 (en) * 2011-08-05 2016-01-05 Texas Instruments Incorporated Block-based parallel deblocking filter in video coding
US9883203B2 (en) 2011-11-18 2018-01-30 Qualcomm Incorporated Adaptive overlapped block motion compensation
EP2952003B1 (en) * 2013-01-30 2019-07-17 Intel Corporation Content adaptive partitioning for prediction and coding for next generation video
US10602155B2 (en) * 2013-04-29 2020-03-24 Intellectual Discovery Co., Ltd. Intra prediction method and apparatus
WO2016008157A1 (en) * 2014-07-18 2016-01-21 Mediatek Singapore Pte. Ltd. Methods for motion compensation using high order motion model
US10230980B2 (en) 2015-01-26 2019-03-12 Qualcomm Incorporated Overlapped motion compensation for video coding
CN105872559A (en) 2016-03-20 2016-08-17 信阳师范学院 Frame rate up-conversion method based on mixed matching of chromaticity
US11638027B2 (en) * 2016-08-08 2023-04-25 Hfi Innovation, Inc. Pattern-based motion vector derivation for video coding
US10834396B2 (en) * 2018-04-12 2020-11-10 Qualcomm Incorporated Bilateral filter for predicted video data
US10999594B2 (en) * 2018-12-20 2021-05-04 Qualcomm Incorporated Virtual search area for current picture referencing (CPR) and intra block copy (IBC)


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Jianle Chen et al., "Algorithm Description of Joint Exploration Test Model 7 (JEM 7)", Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting: Torino, IT, 13-21 July 2017, pp. 16-17 *
Christos Grecos, "Beyond the High Efficiency Video Coding standard: an overview", Proceedings of SPIE, vol. 10223, p. 10 *

Also Published As

Publication number Publication date
US11563970B2 (en) 2023-01-24
US20200413085A1 (en) 2020-12-31
CN111989926A (en) 2020-11-24
KR20200123787A (en) 2020-10-30
EP3759919A1 (en) 2021-01-06
WO2019165162A1 (en) 2019-08-29

Similar Documents

Publication Publication Date Title
CN111194555B (en) Method and apparatus for filtering with pattern-aware deep learning
US10397569B2 (en) Method and apparatus for template-based intra prediction in image and video coding
US20200244997A1 (en) Method and apparatus for filtering with multi-branch deep learning
JP7277447B2 (en) Improved Predictor Candidates for Motion Compensation
US20220353502A1 (en) Texture-based partitioning decisions for video compression
US11259030B2 (en) Video coding method and device which use sub-block unit intra prediction
CN110832854B (en) Method and apparatus for intra prediction using interpolation
CN109565592B (en) Video coding device and method using partition-based video coding block partitioning
CN111989926B (en) Method and apparatus for universal OBMC
CN111052745B (en) Refinement of inner sub-blocks of coding units
EP3471413A1 (en) Encoding and decoding methods and corresponding devices
EP3531701A1 (en) Method and apparatus for generalized obmc
RU2783219C2 (en) Specification of internal sub-blocks of encoding unit
JP7460802B2 (en) Image enhancement method and device
US20230412798A1 (en) Method and apparatus for video coding for improving predicted signals of intra prediction
EP3518537A1 (en) Method and apparatus for video encoding and decoding based on a linear model responsive to neighboring samples
KR20240068078A (en) Method and apparatus for filtering with mode-aware deep learning
KR20210080392A (en) Deblocking or deringing filters and encoders, decoders and methods for applying and changing the strength of deblocking and deringing filters

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant