CN110249628B - Video encoder and decoder for predictive partitioning - Google Patents


Info

Publication number
CN110249628B
CN110249628B (application CN201780085826.4A)
Authority
CN
China
Prior art keywords
block
partition
reference picture
picture
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780085826.4A
Other languages
Chinese (zh)
Other versions
CN110249628A (en)
Inventor
Zhijie Zhao
Max Bläser
Mathias Wien
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN110249628A publication Critical patent/CN110249628A/en
Application granted granted Critical
Publication of CN110249628B publication Critical patent/CN110249628B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/537 Motion estimation other than block-based
    • H04N 19/543 Motion estimation other than block-based using regions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/567 Motion estimation based on rate distortion criteria

Abstract

The present invention provides a video encoder 100 and a video decoder 200 operable to partition a block 301 in a current picture 302 based on at least one partition prediction value. The encoder 100 and the decoder 200 are configured to select at least one reference picture 303a and a plurality of blocks 304a in the at least one reference picture 303a. The projected position of each selected block 304a in the current picture 302 is then calculated based on the motion vector associated with the selected block 304a in the reference picture 303a. Each selected block 304a whose projected position spatially overlaps the block 301 in the current picture 302 is determined to be a reference block, and a partition prediction value is generated for at least one reference block based on partition information associated with the at least one reference picture 303a, e.g., partition information stored in the at least one reference picture 303a.

Description

Video encoder and decoder for predictive partitioning
Technical Field
The present invention relates to the field of video processing, and in particular to video coding. The invention proposes a video encoder and a video decoder for partitioning a block in a current picture based on at least one partition prediction value, i.e. for predictive block partitioning. The invention also relates to corresponding video encoding and decoding methods.
Background
In current video coding schemes, such as H.264/AVC and HEVC, the motion information of inter-predicted pictures is partitioned into rectangular blocks of configurable size. In H.264/AVC, motion is partitioned into symmetric blocks with a maximum size of 16x16 pixels, called macroblocks, which may be further subdivided down to a minimum of 4x4 pixels. HEVC replaces the macroblock with the Coding Tree Unit (CTU), with a maximum size of 64x64 pixels. The CTU is not merely a larger macroblock: using a quadtree decomposition scheme, a CTU can be divided into smaller Coding Units (CUs), which can in turn be subdivided down to a minimum of 8x8 pixels. Furthermore, unlike H.264/AVC, HEVC supports asymmetric motion partitioning (AMP) of coding units into Prediction Units (PUs).
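To make the quadtree decomposition described above concrete, the following sketch (purely illustrative, not part of the claimed subject matter) recursively splits a 64x64 CTU into CUs down to the 8x8 minimum; the split decision function is a hypothetical stand-in for an encoder's actual rate-distortion decision:

```python
def quadtree_split(x, y, size, should_split, min_size=8):
    """Recursively decompose a CTU into CUs, HEVC-style.

    should_split(x, y, size) is a caller-supplied decision function
    (in a real encoder this would be a rate-distortion decision).
    Returns a list of (x, y, size) leaf coding units.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(x + dx, y + dy, half,
                                     should_split, min_size)
    return leaves

# Example: split the 64x64 CTU once into four 32x32 CUs, then stop.
cus = quadtree_split(0, 0, 64, lambda x, y, s: s == 64)
```

Splitting all the way down yields the 8x8 minimum CUs named in the text; the decision function is the only encoder-specific part.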
The block partitioning of HEVC is based entirely on rectangular blocks. For arbitrarily shaped moving objects, which are common in natural video sequences, a very fine block partitioning along the object boundaries may result. Since the motion vectors on each side of the boundary are similar in direction and magnitude, coding overhead is introduced: additional side information has to be transmitted in order to describe the fine block partitioning, along with redundant motion vectors.
This problem can be avoided by applying a different block partitioning strategy. In video coding, the following block partitioning methods are generally distinguished: rectangular block partitioning, geometric block partitioning, and object-based block partitioning.
Examples of these different partitioning methods are shown in fig. 9, which depicts a simple scene with a moving foreground object and a moving background. The quadtree PU partitioning of HEVC, and the related quadtree-binary-tree partitioning method, are representative of rectangular block partitioning. Geometric partitioning divides a block with a straight line into two segments (also referred to herein as wedges). Object-based partitioning is the most flexible way of partitioning blocks, since a block can be divided into arbitrarily shaped segments.
However, more flexible block partitioning presents the following challenges: more side information may need to be sent for the partitioning structure than for rectangular block partitions, and determining the partitioning at the encoder typically adds significant complexity.
In the prior art, for example in HEVC, determining the best partitioning is a task of the encoder. Typically, partitions are determined by rate-distortion optimization in an exhaustive search. Furthermore, the rate-distortion optimization depends strongly on a variety of internal and external conditions, such as the encoder implementation, target bit rate, quality, application scenario, etc.
Block partitioning in HEVC is also limited to rectangular partitions of the coding block. In detail, this means that a square coding block can be divided into two rectangular prediction blocks, where each prediction block is associated with at most two motion vectors. As in AVC, horizontal and vertical splitting into two equally sized rectangular blocks is specified. As an extension, four asymmetric partitions are specified to further increase flexibility. In total, eight partitioning modes are thus specified in HEVC.
A simplified method of temporal motion projection is used for the coding of motion vectors. In merge mode, the merge candidate list is composed of spatially and temporally neighboring motion vectors. For the spatial motion vectors, the motion vector field of the current picture is used, which includes the motion vectors associated with the blocks of the current picture. Motion vectors sampled at specific positions around the current prediction block are added to the merge candidate list. For the temporal motion vector, the motion vector field of a reference picture is used. Here, the motion vector field is sampled at two collocated positions, denoted C0 and C1, as shown in fig. 10.
Assuming that the motion vector fields of the current picture and the reference picture are highly correlated and do not change significantly, it can be expected that a motion vector predictor can be found at position C0 or C1 of the reference picture's motion vector field.
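For illustration only, the collocated sampling described above can be sketched as follows; the motion vector field is modeled as a mapping from grid-aligned positions to motion vectors, and the exact C0 (bottom-right) and C1 (center) positions follow the common HEVC convention, which is an assumption here:

```python
def temporal_mv_candidate(ref_mv_field, x, y, w, h, grid=4):
    """Sample the reference picture's motion vector field at the two
    collocated positions C0 (bottom-right, just outside the block) and
    C1 (block center); return the first available motion vector.

    ref_mv_field maps grid-aligned (x, y) positions to (mvx, mvy).
    """
    def snap(px, py):
        # Snap a pixel position onto the motion-field grid.
        return (px // grid * grid, py // grid * grid)

    c0 = snap(x + w, y + h)            # bottom-right collocated position
    c1 = snap(x + w // 2, y + h // 2)  # center collocated position
    for pos in (c0, c1):
        if pos in ref_mv_field:
            return ref_mv_field[pos]
    return None
```

If neither position carries a motion vector, a real codec falls back to other merge candidates; the sketch simply returns None.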
Disclosure of Invention
In view of the above problems and disadvantages, it is an object of the present invention to improve the prior art of video coding. It is a particular object of the present invention to provide an encoder and a decoder employing a predictive block partitioning method. The present invention is therefore intended to improve the coding of block-partition-related side information and to improve block partitioning methods, in particular geometric partitioning and object-based partitioning, such as segmentation-based partitioning (SBP for short).
The object of the invention is achieved by the solution presented in the attached independent claims. Advantageous implementations of the invention are further defined in the dependent claims.
In particular, the invention proposes to use a temporal projection process based on the motion vector field of at least one reference picture in order to generate partition predictors for block partitioning of the current picture. The motion vector field of a picture typically includes the motion vectors associated with the blocks of that picture. That is, the motion vector field of the reference picture includes the motion vectors associated with the blocks in the reference picture.
A first aspect of the present invention provides a video encoder configured to: select at least one reference picture and a plurality of blocks in the at least one reference picture; calculate a projected position in a current picture of each selected block based on a motion vector associated with the selected block in the reference picture; determine each selected block whose projected position spatially overlaps a block in the current picture to be a reference block; and generate a partition prediction value for at least one reference block based on partition information associated with the at least one reference picture, e.g., partition information stored in the at least one reference picture.
According to the first aspect, predictive block partitioning is implemented. Specifically, using for example the motion vector field of an already coded picture, a partition structure can be temporally projected along the motion of objects, so that it can be used as a partition prediction value in the current picture. That is, the partition prediction value is a prediction of the partition structure to be applied to a coding block in the current picture. In other words, the partition predictor is an estimate of the best partitioning of the current coding block. The projection of partition information may be performed for all existing block partitioning methods, e.g. for rectangular, geometric, and object-based partitioning.
Predictive block partitioning yields several benefits. First, a prediction of the partition structure of the current coding block (the partition predictor) may be generated, which may be used directly by the current block and may be signalled through a predictor flag or, in case there are multiple predictors, a predictor index. Multiple predictors arise when several selected blocks have been determined to be reference blocks and a partition predictor is calculated for each of them. The partition predictors may further be refined using differential coding methods where this is beneficial according to a rate-distortion criterion. Second, a partition predictor may be used as a starting point for the rate-distortion optimization of the encoder. That is, the encoder may be configured to partition a block in the current picture based on at least one partition prediction value. A fast decision method may thus be used which terminates the rate-distortion optimization after a specified number of refinement steps, or once the achieved rate-distortion cost falls below a specified threshold. This reduces complexity and speeds up encoding.
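The four steps of the first aspect can be sketched as below; the block and picture interfaces (`ref_picture.blocks`, `block.partition_info`, the `project` helper) are illustrative assumptions, not the encoder's actual interfaces:

```python
def partition_predictors(current_block, ref_picture, project):
    """Collect partition predictors for current_block.

    current_block        : (x, y, w, h) of the block in the current picture
    ref_picture.blocks   : iterable of blocks in the reference picture
    block.partition_info : stored partition information (or None)
    project(block)       : returns the block's projected (x, y, w, h)
                           position in the current picture

    Returns one partition-info entry per reference-picture block whose
    projected position spatially overlaps current_block.
    """
    cx, cy, cw, ch = current_block
    predictors = []
    for block in ref_picture.blocks:
        px, py, pw, ph = project(block)
        # Axis-aligned rectangle overlap test.
        overlaps = (px < cx + cw and cx < px + pw and
                    py < cy + ch and cy < py + ph)
        if overlaps and block.partition_info is not None:
            predictors.append(block.partition_info)
    return predictors
```

Each collected entry corresponds to one reference block in the sense of the first aspect; indexing the returned list gives the indexed predictor list of the fourth implementation form.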
In a first implementation form of the encoder according to the first aspect, the encoder is configured to: calculating a temporal motion trajectory for each selected block based on a motion vector associated with the selected block in the reference picture and a temporal Picture Order Count (POC) distance between the current picture and the at least one reference picture, and calculating the projected location of each selected block based on a location of the selected block in the reference picture and the motion trajectory.
In this way, the motion trajectory can be calculated accurately and efficiently.
In a second implementation form of the encoder according to the first implementation form of the first aspect, the encoder is configured to: calculate the motion trajectory by inverting the motion vector associated with the selected block and scaling it by the ratio of two POC distances, i.e. the ratio of the POC distance between the current picture and the reference picture to the POC distance between the reference picture and the reference picture associated with the selected block.
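A minimal sketch of this inversion-and-scaling step, under the assumption that a motion vector points from the reference picture toward the (earlier) picture it itself references, so that projection toward the current picture inverts its direction:

```python
def projected_position(block_pos, mv, poc_cur, poc_ref, poc_mv_ref):
    """Project a reference-picture block into the current picture.

    block_pos  : (x, y) of the block in the reference picture
    mv         : (mvx, mvy) pointing from the reference picture to the
                 picture the motion vector itself references
    poc_cur    : POC of the current picture
    poc_ref    : POC of the reference picture holding the block
    poc_mv_ref : POC of the picture referenced by mv

    The motion trajectory is the inverted motion vector scaled by the
    ratio of the two POC distances.
    """
    scale = (poc_cur - poc_ref) / (poc_ref - poc_mv_ref)
    x, y = block_pos
    mvx, mvy = mv
    # Inversion: subtract the scaled motion vector.
    return (x - mvx * scale, y - mvy * scale)
```

With equal POC spacing the scale factor is 1 and the projection is simply the block position shifted by the inverted motion vector.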
This implementation form represents a practical realization of predictive block partitioning.
In a third implementation of the encoder according to the first aspect as such or according to any of the preceding implementations of the first aspect, the plurality of blocks selected in each reference picture comprises all blocks of the reference picture or blocks of the reference picture within a projection range centered on the position of the block in the current picture.
The first alternative provides the highest accuracy for predictive block partitioning, but at an increased computational cost. The second alternative is a reliable and computationally inexpensive solution.
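The second alternative can be sketched as a simple window test; the square window shape and the choice of top-left block positions are illustrative assumptions:

```python
def blocks_in_projection_range(block_positions, center, search_range):
    """Keep only reference-picture blocks whose top-left position lies
    within a square window of +/- search_range around center.

    block_positions : list of (x, y) block positions in the reference picture
    center          : (x, y) position of the block in the current picture
    """
    cx, cy = center
    return [(x, y) for (x, y) in block_positions
            if abs(x - cx) <= search_range and abs(y - cy) <= search_range]
```

Passing the full block list with an unbounded range reproduces the first alternative; a small range trades accuracy for speed, as noted above.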
In a fourth implementation form of the encoder according to the first aspect as such or according to any of the preceding implementation forms of the first aspect, the encoder is configured to construct and output a list comprising a plurality of indexed partition predictors.
The list of indexed partition predictors is beneficial because only an index needs to be sent, which reduces signaling overhead.
In a fifth implementation form of the encoder according to the first aspect as such or according to any of the preceding implementation forms of the first aspect, the at least one partition prediction value comprises at least one of: line parameters for geometric partitions, boundary motion vectors for object-based partitions, and rectangular partition information.
Thus, predictive block partitioning is compatible with existing block partitioning methods.
In a sixth implementation form of the encoder according to the fifth implementation form of the first aspect, the line parameter is specified by a polar coordinate or a truncation point at the reference block boundary and/or the boundary motion vector specifies a partition boundary in a reference picture.
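To illustrate the polar-coordinate line parameters named here, the sketch below intersects the line x*cos(theta) + y*sin(theta) = rho, with the origin placed at the block center, with the block boundary; this particular parameterisation is an assumption, since only the representation is named above:

```python
import math

def polar_line_block_intersections(rho, theta, size, eps=1e-9):
    """Intersect the line x*cos(theta) + y*sin(theta) = rho (origin at
    the block center) with the boundary of a size x size block.
    Returns up to two (x, y) intersection points (the truncation points)."""
    h = size / 2.0
    c, s = math.cos(theta), math.sin(theta)
    pts = []
    # Vertical edges x = +/- h : solve for y.
    for x in (-h, h):
        if abs(s) > eps:
            y = (rho - x * c) / s
            if -h - eps <= y <= h + eps:
                pts.append((x, y))
    # Horizontal edges y = +/- h : solve for x.
    for y in (-h, h):
        if abs(c) > eps:
            x = (rho - y * s) / c
            if -h - eps <= x <= h + eps:
                pts.append((x, y))
    # Deduplicate corner hits.
    uniq = []
    for p in pts:
        if all(abs(p[0] - q[0]) > eps or abs(p[1] - q[1]) > eps for q in uniq):
            uniq.append(p)
    return uniq[:2]
```

The two returned points are exactly the truncation points at the block boundary mentioned in this implementation form, so the two encodings (polar parameters or truncation points) carry the same line.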
In a seventh implementation form of the encoder according to the first aspect as such or according to any of the preceding implementation forms of the first aspect, the encoder is configured to: generating an initial partition for the block in the current picture using the at least one partition prediction value.
Starting from the initial partition, the encoder may also find the best partition of the block in terms of rate-distortion optimization. Thus, the optimal partitioning of the block in the current picture can be performed more efficiently and faster.
In an eighth implementation form of the encoder according to the first aspect as such or according to any of the preceding implementation forms of the first aspect, the encoder is configured to: sending the at least one partition predictor or at least one index to a decoder, the index pointing to a position of the at least one partition predictor in an indexed partition predictor list.
Thus, the partition prediction value may be used at the decoder side. Sending only the index reduces the signaling overhead.
In a ninth implementation form of the encoder according to the first aspect as such or according to any of the preceding implementation forms of the first aspect, the encoder is configured to: sending, to a decoder, difference information between the at least one partition prediction value and a final partition applied to the block in the current picture. Optionally, the encoder may be configured to additionally send a partition predictor flag and/or a partition predictor index.
The final partition of the current block is typically determined according to rate-distortion optimization. The difference information is information regarding a difference between the estimated partition, i.e., the partition according to the partition prediction value, and a final (best) partition of the current block.
For geometric partitioning, the difference information is the offset between the start and end coordinates of the partition line in the current block and the (adjusted, as shown in fig. 7) start and end coordinates of the partition line in the block associated with the partition predictor (as shown in fig. 8).
For segmentation-based partitioning, the difference information is the difference between the boundary motion vector of the current block and the boundary motion vector of the block associated with the partition predictor.
Sending only the difference information significantly reduces signaling overhead and enables the decoder to obtain partition prediction values and then apply the block partitions directly to blocks in the current picture based on the partition prediction values and the difference information.
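On the decoder side, applying the difference information described above reduces to adding the signalled offsets to the predicted partition parameters; the start/end-coordinate line coding and the tuple layouts below are illustrative assumptions:

```python
def apply_line_difference(predicted_line, diff):
    """Geometric case.

    predicted_line : ((x0, y0), (x1, y1)) from the partition predictor
    diff           : ((dx0, dy0), (dx1, dy1)) offsets from the bitstream
    Returns the final partition line for the current block.
    """
    (x0, y0), (x1, y1) = predicted_line
    (dx0, dy0), (dx1, dy1) = diff
    return ((x0 + dx0, y0 + dy0), (x1 + dx1, y1 + dy1))

def apply_boundary_mv_difference(predicted_mv, diff_mv):
    """Segmentation-based case: the final boundary motion vector is the
    predicted boundary motion vector plus the signalled difference."""
    return (predicted_mv[0] + diff_mv[0], predicted_mv[1] + diff_mv[1])
```

A zero difference means the predictor is used directly, which is the flag-only case mentioned earlier.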
A second aspect of the present invention provides a video decoder configured to: obtain difference information; select at least one reference picture and a plurality of blocks in the at least one reference picture; calculate a projected position in a current picture of each selected block based on a motion vector associated with the selected block in the reference picture; determine each selected block whose projected position spatially overlaps the block in the current picture to be a reference block; generate a partition prediction value for at least one reference block based on partition information associated with the at least one reference picture, e.g., partition information stored in the at least one reference picture; and partition the block in the current picture according to the obtained partition prediction value and the difference information. Optionally, the decoder may be configured to additionally obtain/receive a partition predictor flag and/or a partition predictor index.
Thus, the advantages of the prediction block partitioning discussed above at the encoder side are also available at the decoder side. The decoder may use the obtained partition prediction values and the obtained difference information, e.g., information obtained from the encoder, to find a block partition for the block in the current picture.
In a first implementation form of the decoder according to the second aspect, the decoder is configured to: calculating a temporal motion trajectory for each selected block based on a motion vector associated with the selected block in the reference picture and a temporal Picture Order Count (POC) distance between the current picture and the at least one reference picture, and calculating the projected location of each selected block based on a location of the selected block in the reference picture and the motion trajectory.
In a second implementation form of the decoder according to the first implementation form of the second aspect, the decoder is configured to: calculate the motion trajectory by inverting the motion vector associated with the selected block and scaling it by the ratio of two POC distances, i.e. the ratio of the POC distance between the current picture and the reference picture to the POC distance between the reference picture and the reference picture associated with the selected block.
In a third implementation of the decoder according to the second aspect as such or according to any of the preceding implementations of the second aspect, the plurality of blocks selected in each reference picture comprises all blocks of the reference picture or blocks of the reference picture within a projection range centered on the position of the block in the current picture.
In a fourth implementation form of the decoder according to the second aspect as such or according to any of the preceding implementation forms of the second aspect, the at least one partition predictor comprises at least one of: line parameters for geometric partitions, boundary motion vectors for object-based partitions, and rectangular partition information.
In a fifth implementation form of the decoder according to the fourth implementation form of the second aspect, the line parameter is specified by a polar coordinate or a truncation point at the reference block boundary and/or the boundary motion vector specifies a partition boundary in a reference picture.
The implementation of the decoder achieves the same advantages as the encoder described above.
A third aspect of the present invention provides a video encoding method, comprising the steps of: selecting at least one reference picture and a plurality of blocks in the at least one reference picture; calculating a projected position in the current picture of each selected block based on motion vectors associated with the selected blocks in the reference picture; determining each selected block whose projected position spatially overlaps the block in the current picture to be a reference block; and generating a partition prediction value for at least one reference block based on partition information associated with the at least one reference picture, e.g., partition information stored in the at least one reference picture.
In a first implementation form of the video coding method according to the third aspect, the method further comprises: calculating a temporal motion trajectory for each selected block based on a motion vector associated with the selected block in the reference picture and a temporal Picture Order Count (POC) distance between the current picture and the at least one reference picture, and calculating the projected location of each selected block based on a location of the selected block in the reference picture and the motion trajectory.
In a second implementation form of the video coding method according to the first implementation form of the third aspect, the method further comprises: calculating the motion trajectory by inverting the motion vector associated with the selected block and scaling it by the ratio of two POC distances, i.e. the ratio of the POC distance between the current picture and the reference picture to the POC distance between the reference picture and the reference picture associated with the selected block.
In a third implementation of the video coding method according to the third aspect as such or according to any of the preceding implementations of the third aspect, the plurality of blocks selected in each reference picture comprises all blocks of the reference picture or blocks of the reference picture within a projection range centered on the position of the block in the current picture.
In a fourth implementation form of the video coding method according to the third aspect as such or according to any of the preceding implementation forms of the third aspect, the method further comprises constructing and outputting a list comprising a plurality of index partition predictors.
In a fifth implementation form of the video coding method according to the third aspect as such or according to any of the preceding implementation forms of the third aspect, the at least one partition prediction value comprises at least one of: line parameters for geometric partitions, boundary motion vectors for object-based partitions, and rectangular partition information.
In a sixth implementation form of the video coding method according to the fifth implementation form of the third aspect, the line parameter is specified by a polar coordinate or a truncation point at the reference block boundary, and/or the boundary motion vector specifies a partition boundary in a reference picture.
In a seventh implementation form of the video coding method according to the third aspect as such or according to any of the preceding implementation forms of the third aspect, the method further comprises: generating an initial partition for the block in the current picture using the at least one partition prediction value.
In an eighth implementation form of the video coding method according to the third aspect as such or according to any of the preceding implementation forms of the third aspect, the method further comprises: sending a partition prediction flag, the at least one partition prediction value, or at least one index to a decoder, the index pointing to a position of the at least one partition prediction value in an indexed partition prediction value list.
In a ninth implementation of the video coding method according to the third aspect as such or according to any of the preceding implementations of the third aspect, the method further comprises: sending, to a decoder or the like, difference information between the at least one partition prediction value and a final partition applied to the block in the current picture.
The method of the third aspect and its implementation achieve the same advantages as the encoder of the first aspect and its implementation, respectively.
A fourth aspect of the present invention provides a video decoding method, comprising the steps of: obtaining difference information; selecting at least one reference picture and a plurality of blocks in the at least one reference picture; calculating a projected position in the current picture of each selected block based on motion vectors associated with the selected blocks in the reference picture; determining each selected block whose projected position spatially overlaps the block in the current picture to be a reference block; generating a partition prediction value for at least one reference block based on partition information associated with the at least one reference picture, e.g., partition information stored in the at least one reference picture; and partitioning the block in the current picture according to the partition prediction value and the difference information. Optionally, the method may include obtaining/receiving a partition predictor flag and/or a partition predictor index.
In a first implementation form of the video decoding method according to the fourth aspect, the method further comprises: calculating a temporal motion trajectory for each selected block based on the motion vector associated with the selected block in the reference picture and a temporal Picture Order Count (POC) distance between the current picture and the at least one reference picture, and calculating the projected location of each selected block based on the location of the selected block in the reference picture and the motion trajectory.
In a second implementation form of the video decoding method according to the first implementation form of the fourth aspect, the method further comprises: calculating the motion trajectory by inverting the motion vector and scaling it by the ratio of two POC distances, i.e. the ratio of the POC distance between the current picture and the reference picture to the POC distance between the reference picture and the reference picture associated with the selected block.
In a third implementation of the video decoding method according to the fourth aspect as such or according to any of the preceding implementations of the fourth aspect, the plurality of blocks selected in each reference picture comprises all blocks of the reference picture or blocks of the reference picture within a projection range centered on the position of the block in the current picture.
In a fourth implementation form of the video decoding method according to the fourth aspect as such or according to any of the preceding implementation forms of the fourth aspect, the at least one partition prediction value comprises at least one of: line parameters for geometric partitions, boundary motion vectors for object-based partitions, and rectangular partition information.
In a fifth implementation of the video decoding method according to the fourth implementation of the fourth aspect, the line parameters are specified by polar coordinates or truncation points at the reference block boundaries and/or the boundary motion vectors specify partition boundaries in reference pictures.
The method of the fourth aspect and its implementation achieve the same advantages as the decoder of the second aspect and its implementation, respectively, described above.
A fifth aspect of the invention provides a computer program product comprising program code for performing the method according to the third or fourth aspect when run on a computer.
Thus, the computer program product of the fifth aspect achieves all the advantages of the methods of the third and fourth aspects.
It should be noted that all devices, elements, units and means described in the present application may be implemented in software or hardware elements or any combination thereof.
The steps performed by the various entities described in this application and the functions to be performed by the various entities described are intended to mean that the various entities are used to perform the various steps and functions.
Even if, in the following description of specific embodiments, a specific function or step performed by an entity is not reflected in the description of a specific detailed element of that entity performing that specific step or function, the skilled person will appreciate that these methods and functions may be implemented in respective software or hardware elements, or any combination thereof.
Drawings
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
fig. 1 illustrates an encoder and an encoding method, respectively, according to embodiments of the present invention.
Fig. 2 shows a decoder and a decoding method according to embodiments of the present invention, respectively.
Fig. 3 illustrates a concept of projecting partition information of a reference picture onto a block in a current picture.
Fig. 4 shows a simplified hybrid encoder/decoder (decoder grey shaded) model according to an embodiment of the present invention.
Fig. 5 shows an example of block partition information for the case of a) geometric partitioning and b) object-based partitioning using motion vectors.
Fig. 6 shows an example of motion vectors stored on a 4 × 4 pixel grid for a block of size 32 × 32 pixels, in the case of a) geometric partitions and b) object-based partitions.
Fig. 7 illustrates the adjustment of reference block partition information for geometric partitioning.
Fig. 8 shows the difference information (offset of line start and end coordinates) for a geometric partition with two truncation points.
Fig. 9 shows examples of different conventional motion partitioning methods in video coding.
Fig. 10 shows temporally collocated motion vectors in HEVC.
Detailed Description
Fig. 1 illustrates a video encoder 100 (and corresponding video encoding method) provided by an embodiment of the present invention. Referring to fig. 3, the encoder 100 of fig. 1 may be configured to partition a block 301 in a current picture 302 based on at least one partition prediction value, i.e., prediction block partitioning may be performed. To this end, the encoder 100 is configured to perform the video encoding method according to the embodiment of the present invention.
Specifically, in a first step 101, the encoder 100 is configured to: select at least one reference picture 303a and a plurality of blocks 304a in the at least one reference picture 303a. In a second step 102, the encoder 100 is configured to: calculate the projected position of each selected block 304a in the current picture 302 based on the motion vector associated with the selected block 304a in the reference picture 303a. In a third step 103, the encoder 100 is configured to: determine each selected block 304a whose projected position spatially overlaps the block 301 in the current picture 302 as a reference block. In a fourth step 104, the encoder 100 is configured to: generate a partition prediction value for at least one reference block based on partition information associated with the at least one reference picture 303a, e.g., partition information stored in the at least one reference picture 303a.
Fig. 2 illustrates a video decoder 200 (and corresponding video decoding method) provided by an embodiment of the present invention. Referring again to fig. 3, the decoder 200 of fig. 2 may be used to decode the partitions of a block 301 in a current picture 302 based on at least one partition predictor, i.e., prediction block partitioning may occur. For this purpose, the decoder 200 is configured to perform the video decoding method provided by the embodiment of the present invention.
Specifically, in the first step 201, the decoder 200 is configured to: the difference information is obtained from the encoder 100 and the like. In a second step 202, the decoder 200 is configured to: at least one reference picture 303a and a plurality of blocks 304a in the at least one reference picture 303a are selected. In a third step 203, the decoder 200 is configured to: the projected position of each selected block 304a in the current picture 302 is calculated based on the motion vector associated with the selected block 304a in the reference picture 303a, the POC distance 306 between the current picture 302 and the reference picture 303a, and the POC distance 306 between the reference picture 303a and the reference picture 303b associated with the selected block 304 a. In a fourth step 204, the decoder 200 is configured to: a selected block 304a for which each projection position spatially overlaps the block 301 in the current picture 302 is determined as a reference block. In a fifth step 205, the decoder 200 is configured to: a partition prediction value is generated for at least one reference block based on partition information associated with the at least one reference picture 303a, e.g., partition information stored in the at least one reference picture 303 a. In a sixth step 206, the decoder 200 is configured to obtain partitions of the block 301, in other words to partition the block 301 in the current picture 302 based on the partition prediction values and the difference information.
Fig. 3 shows the main basic idea of an embodiment of the invention, namely temporal projection, to estimate the partitioning of the current block 301 in the current picture 302 (also called picture P0). For example, the motion vector of the block 301 has not yet been determined (at the encoder 100) or decoded. If the current block 301 contains a moving object, the motion of the object over time can be tracked using at least one available reference picture 303a. In particular, the block 304b in the reference picture 303b of the reference picture 303a is the reference block 304b of the selected block 304a (e.g., these two pictures 303a and 303b may be referred to as P-1 and P-2; the indices -1 and -2 of these pictures are exemplary, and pictures from frames other than the first and second previous frames may be used as well). The temporal movement between the selected block 304a in the reference picture 303a and the block 304b in the reference picture 303b results in a motion trajectory 305, wherein the block 304b in the reference picture 303b is associated with the block 304a in the reference picture 303a.
The motion trajectory 305 can be modeled by a linear function, provided that the direction and magnitude of the motion do not change significantly over a reasonable time interval. If the pictures P-1 and P-2 are both temporally prior to the current picture 302, the process of linearly modeling the motion trajectory is referred to as forward projection. That is, the continuation of the motion trajectory 305 from the reference picture 303a to the current picture 302 is calculated as a forward projection. By continuing the motion trajectory 305 to the current picture 302, the predicted position of the selected reference picture block 304a (containing the moving object) in the current picture 302 can be obtained. This is based on the assumption that the block (its content and partition structure) remains unchanged over the time span between the reference picture 303a and the current picture 302. If the predicted position spatially overlaps with the position of the block 301 in the current picture 302, the partition information associated with the reference picture 303a of the selected block 304a, e.g., stored in the reference picture 303a, may be effectively reused. That is, a partition prediction value based on the associated or stored partition information may be generated. The partition predictor may then be used for encoding and partitioning of the current block 301. For the case of bi-prediction, the projection may also be performed using a reference picture 303a that is later in time than the current picture 302, together with a reference picture 303b associated with at least one selected block 304a in said reference picture 303a (e.g., the pictures 303a and 303b may then be referred to as P1, P2, etc., not shown in fig. 3, where the indices 1 and 2 are only examples of pictures in subsequent frames), so that a continuation of the motion trajectory 305 can be calculated as a back projection.
In a practical implementation of the above idea, a motion vector field associated with the reference picture 303a can be processed element-wise, i.e., motion vector by motion vector, for each motion vector included in the motion vector field. In any case, the motion trajectory 305 may be generated by inversion and scaling of the motion vector according to the ratio of two POC distances: a first POC distance 306 between the current picture P0 and the reference picture P-1, and a second POC distance 306 between the reference picture P-1 and the reference picture P-2 associated with the corresponding motion vector (i.e., associated with the selected block 304a in the reference picture 303a). The forward and backward projection of the motion vectors can be processed automatically. Each projected motion vector may then be used in a process similar to motion compensation, where partition information, rather than pixel values, is compensated.
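The inversion-and-scaling rule just described can be sketched in a few lines. The following is an illustrative Python sketch (not part of the patent text); the function name and signature are chosen here for illustration, with t_b and t_d being the two POC distances described above:

```python
def project_motion_vector(mv, tb, td):
    """Project a stored motion vector onto the current picture.

    mv: (mvx, mvy) of the selected block in the reference picture P-1,
        pointing towards that block's own reference picture P-2.
    tb: POC distance between the current picture P0 and P-1.
    td: POC distance between P-1 and the picture P-2 that mv points to.
    The minus sign inverts the vector; the factor tb/td scales it.
    """
    mvx, mvy = mv
    return (-mvx * tb / td, -mvy * tb / td)
```

If t_b equals t_d, the projection reduces to a pure inversion of the motion vector, matching the 180° inversion illustrated in fig. 3.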
Fig. 4 shows a simplified structure of a hybrid encoder/decoder 400 (where the decoder portions are shaded in grey) according to an embodiment of the invention. The hybrid encoder/decoder 400 may perform the functions of the encoder 100 and the decoder 200 shown in fig. 1 and 2, respectively. Fig. 4 specifically shows the location of the projection subunit 401 in such a hybrid video encoder/decoder 400. Said sub-unit 401 performs the task of generating projected motion vectors, i.e. calculating a motion trajectory 305 based on the motion vectors of the selected block 304 a; and to perform the task of finding the projection position of the selected block 304a in the current picture 302. The sub-unit 401 then applies the partition information associated with the selected block 304a in the at least one reference picture 303a to the current block 301 in a motion compensation-like process. That is, when encoding the reference picture 303a, information on how to partition the selected block 304a may be reused. For example, the current block 301 may be partitioned identically to the selected block 304 a. However, in general, the partition predictor associated with the selected block 304a is used only as a starting point (initial partition) to obtain the best partition of the current block 301.
The input of the sub-unit 401 is at least one reference picture 303a, which may be stored in a picture buffer 402. The reference picture 303a contains all information needed for the projection process, including at least the motion vector field and partition information.
For example, referring to fig. 5, the partition information may include: line parameters of the geometric partition 501, e.g., block-boundary truncation coordinates (x_s, y_s)^T and (x_e, y_e)^T (where index s represents the "start point" and index e represents the "end point") or polar coordinates (ρ, θ), as shown in fig. 5a; or motion vectors for the case of object-based partitioning 502, in particular for the case of segmentation-based partitioning, where the partition boundary is generated by segmentation using a boundary motion vector MV_B (where index B represents a "boundary"), as shown in fig. 5b. Alternatively, any other method of partitioning blocks is possible, as long as the partition lines or boundaries can be parameterized.
In practical encoder/decoder implementations, such partition information is typically stored in a form similar to the storage field of the motion vector field, where each element of the storage field addresses a sub-block of pixels. The size of the sub-blocks is typically specified by the smallest independently addressable block of pixels.
Motion vectors are typically stored block by block using a regular grid, where each motion vector represents the translational motion of a block of pixels. The size of each pixel block is set by the encoder and controls the motion vector field resolution. Hereinafter, unless stated otherwise, a fixed block size of 4 × 4 pixels is assumed for simplicity. All sub-blocks belonging to the same prediction block in a rectangular motion partition, or to the same prediction segment in a geometric or segmentation-based partition, may share the same motion vector. Fig. 6 illustrates these two cases.
Fig. 6a shows an example of motion vectors stored on a 4 × 4 pixel grid for a block of size 32 × 32 pixels in the case of geometric partitioning. In fig. 6a, the prediction block is divided into two prediction segments S0 and S1. Each 4 × 4 pixel block in S0 (grey shaded area) is associated with motion vector MV0, i.e., these pixel blocks share the motion vector, and each 4 × 4 pixel block in S1 (white area) is associated with motion vector MV1. Fig. 6b shows the same situation for object-based partitioning. That is, the two partitions are distinguished in a binary manner by the two motion vectors.
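The shared-motion-vector storage of fig. 6 can be illustrated with a short sketch. This is an assumption-laden illustration, not the patent's implementation: the helper `in_s1` is a hypothetical boolean mask marking which 4 × 4 sub-blocks belong to segment S1.

```python
GRID = 4  # smallest independently addressable sub-block: 4 x 4 pixels

def build_mvf(block_size, in_s1, mv0, mv1):
    """Fill a motion vector storage field for one block of
    block_size x block_size pixels: every 4x4 sub-block of segment S0
    stores mv0, every sub-block of segment S1 stores mv1 (all sub-blocks
    of a segment share one motion vector, as in Fig. 6)."""
    n = block_size // GRID
    return [[mv1 if in_s1(gx * GRID, gy * GRID) else mv0
             for gx in range(n)] for gy in range(n)]
```

For a 32 × 32 block this yields an 8 × 8 storage field, mirroring the grid shown in fig. 6.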
One motion vector MV_{P,k} can point to reference picture P_i while another motion vector points to a different reference picture P_j. Since motion vectors of the same motion vector field may point to different reference pictures 303a, the corresponding reference picture indices are preferably also stored on a 4 × 4 grid as part of the motion vector field. The temporal POC distance 306 between reference picture P_i and the other reference picture P_j is denoted t_d.
Hereinafter, the motion vector field of the selected reference picture P_i 303a shall be denoted MVF_P(x, y), where the index P represents "picture", and the partition information of the reference picture is denoted I_P(x, y). A single motion vector of the motion vector field MVF_P at position (x_k, y_k)^T shall be denoted MV_{P,k}, and similarly, the partition information at (x_k, y_k)^T is denoted I_{P,k}. The motion vector MV_{P,k} is associated with a selected block 304a in the reference picture 303a. The current block 301 has position (x_c, y_c)^T in the current picture 302.
The projection and compensation processes may be performed at the encoder 100 and decoder 200 sides, respectively, with an exemplary implementation comprising the steps of:
1. For an encoding block 301 at position (x_c, y_c)^T and of a given size S (e.g., in luma or chroma samples), which is part of the current picture P0, select a reference picture P_i from the reference picture buffer 402 or the like, and determine the temporal POC distance 306 between the pictures P0 and P_i, denoted t_b.
2. Access the motion vector field of the reference picture P_i, wherein the motion vector field comprises the motion vectors MV_{P,k} of the reference picture 303a, and the index k represents the address of the motion vector located at (x_k, y_k)^T within the motion vector field. For the subsequent projection processing, a projection range centered on the collocated position (e.g., a 3 × 3 CTU window) may be specified, i.e., a projection range in the reference picture 303a centered on the position of the block 301 in the current picture 302; alternatively, the entire motion vector field of the reference picture 303a may be processed. The projection process may proceed in raster scan order, or may start at the center of the projection range and access the motion vectors in an outward spiral, until all elements within the projection range have been processed.
3. For each motion vector MV_{P,k} at position (x_k, y_k)^T within the projection range, the projected motion vector MVT_{P,k} is calculated from the following formula:

MVT_{P,k} = -(t_b / t_d) · MV_{P,k}
where t_d represents the temporal POC distance 306 between the current reference picture 303a (P_i) and the reference picture 303b to which the motion vector of the selected block 304a refers. That is, the motion trajectory 305 of the selected block 304a is calculated based on the motion vector of the selected block 304a and the temporal POC distances 306 between the current picture 302, the reference picture 303a, and the reference picture 303b.
4. The projected position (x_p, y_p)^T is then determined by adding the projected motion vector (motion trajectory 305) to the current position:

(x_p, y_p)^T = (x_k, y_k)^T + MVT_{P,k}
That is, the projection position of the selected block 304a is calculated based on the current position of the selected block 304a in the reference picture 303a and the motion trajectory 305.
5. If the projected position (x_p, y_p)^T lies within the boundary of the current encoding block 301 at (x_c, y_c)^T, i.e., if the projected position of the selected block 304a spatially overlaps the block 301 in the current picture 302, the selected block 304a is a reference block, i.e., a candidate partition predictor has been found:

(x_c ≤ x_p ≤ x_c + S) ∩ (y_c ≤ y_p ≤ y_c + S) → candidate predictor
6. The partition information I_{P,k} stored at location (x_k, y_k)^T is added to the partition predictor candidate list of the current block 301.
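Steps 3 to 6 of the exemplary implementation can be condensed into the following sketch. This is illustrative only, not the patent's implementation: `mvf` and `partition_info` are assumed to be dictionaries keyed by the stored grid positions (x_k, y_k), and `td_of` supplies the per-vector POC distance t_d (the vectors may point to different reference pictures).

```python
def find_partition_predictors(mvf, partition_info, td_of, tb, xc, yc, S):
    """Project every stored motion vector, test the projected position for
    spatial overlap with the current block at (xc, yc) of size S, and
    collect the partition information of overlapping blocks as candidates."""
    candidates = []
    for (xk, yk), (mvx, mvy) in mvf.items():
        td = td_of[(xk, yk)]                       # POC distance for this vector
        mvtx = -mvx * tb / td                      # step 3: MVT = -(tb/td) * MV
        mvty = -mvy * tb / td
        xp, yp = xk + mvtx, yk + mvty              # step 4: projected position
        if xc <= xp <= xc + S and yc <= yp <= yc + S:     # step 5: overlap test
            candidates.append(partition_info[(xk, yk)])   # step 6: add candidate
    return candidates
```

A block whose projection lands inside the current coding block contributes its stored partition information to the candidate list; all other blocks are skipped.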
1) In the case of geometric partitioning, the line parameters of the reference block, (x_s, y_s)_P^T and (x_e, y_e)_P^T or the polar parameters, may need to be adjusted according to the center position of the current block relative to the reference block position. This is shown in fig. 7 and explained below.
If line truncation points are specified, the new truncation points (x_s, y_s)_C^T and (x_e, y_e)_C^T can easily be calculated from the motion-compensated truncation points (x_s, y_s)_P^T + MVT_{P,k} and (x_e, y_e)_P^T + MVT_{P,k} by intersecting the shifted partition line with the four block boundaries of the current block, whose line start and end coordinates are represented by (x_s, y_s)_B^T and (x_e, y_e)_B^T. The intermediate calculation results may further be denoted (x_{s,o}, y_{s,o}) and (x_{e,o}, y_{e,o}).
In a polar representation, this corresponds to offsetting the coordinate system by an offset vector, so that the polar line parameters are recomputed relative to the center of the current block.
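One plausible way to realize the truncation-point adjustment of fig. 7 (the patent's exact formula is not reproduced here; this sketch and its tolerance handling are illustrative assumptions) is to intersect the shifted partition line with the four boundaries of the current block:

```python
def clip_line_to_block(p0, p1, xc, yc, S):
    """Intersect the infinite line through the motion-compensated truncation
    points p0 and p1 with the four boundaries of the current block at
    (xc, yc) of size S, yielding the new truncation points."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    eps = 1e-9
    pts = []
    if abs(dx) > eps:                      # left/right boundaries x = xc, xc + S
        for xb in (xc, xc + S):
            t = (xb - x0) / dx
            y = y0 + t * dy
            if yc - eps <= y <= yc + S + eps:
                pts.append((xb, y))
    if abs(dy) > eps:                      # top/bottom boundaries y = yc, yc + S
        for yb in (yc, yc + S):
            t = (yb - y0) / dy
            x = x0 + t * dx
            if xc - eps <= x <= xc + S + eps:
                pts.append((x, yb))
    uniq = []                              # deduplicate corner hits
    for p in pts:
        if all(abs(p[0] - q[0]) > eps or abs(p[1] - q[1]) > eps for q in uniq):
            uniq.append(p)
    return uniq[:2]
```

For example, a horizontal shifted line crossing a 32 × 32 block is re-anchored to truncation points on the block's left and right boundaries.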
2) In the case of object-based partitioning using boundary motion vectors, such an adjustment is not required. After appropriate motion vector scaling according to the POC distance 306 associated with MV_B, the partition information I_{P,k} at the reference location can be used directly, wherein the partition information comprises the boundary motion vector MV_B (where index B represents a "boundary").
7. The above projection process may be repeated until a specified maximum number of partition predictors has been found, or terminated once all motion vectors MV_{P,k} of the current reference picture 303a (P_i) have been processed. In this way, a candidate list of projected partition information may be constructed. For transmission, the selected partition may be signaled by an index pointing into the partition predictor candidate list.
In summary, the output of the projection subunit 401 is the candidate list, which may also be referred to as a partition prediction value list. The partition predictors may include geometric partition line parameters, boundary motion vectors for object-based partitions, and rectangular partition information. The partition prediction values may be used in the subsequent stages described below.
For example, at the encoder 100 side, a segment-based motion estimation may be performed. In practical implementations, the partition prediction values may be used to generate initial partitions, which may be further refined by rate-distortion estimation.
On the encoder side, the projected partition information (line parameters for geometric partitions, boundary motion vectors for segmentation-based partitions, quadtree and binary tree partitions for rectangular partitions) can be used as the partitioning starting point for the current block 301 in a rate-distortion optimization process. A fast decision method may be used, which ends the partitioning in the rate-distortion optimization process after a specified number of refinement steps, or once the achieved rate-distortion cost falls below a specified threshold. For line parameters given by line start and end coordinates in a geometric partition, a small range of offsets around the projected line start and end coordinates may be defined for the block partitioning. The search range is reduced in this case, because only a limited number of offsets around the projected line start and end coordinates need to be tested. An optimal partition may then be selected based on rate-distortion optimization. In this way, the number of tested partition lines is significantly reduced. Thus, this approach may reduce the complexity of the encoder and shorten the encoding time.
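A minimal sketch of such a reduced offset search follows. It is illustrative only: `rd_cost` is a hypothetical caller-supplied rate-distortion cost callback, and the truncation points are represented here, for simplicity, as scalar positions along the block boundary.

```python
import itertools

def refine_partition_line(pred_start, pred_end, rd_cost,
                          search_range=2, threshold=None):
    """Test a limited set of offsets around the projected line start/end
    positions and keep the candidate with the smallest rate-distortion cost.
    If threshold is set, the search stops early (fast decision)."""
    best, best_cost = None, float('inf')
    offsets = range(-search_range, search_range + 1)
    for ds, de in itertools.product(offsets, offsets):
        cand = (pred_start + ds, pred_end + de)
        cost = rd_cost(*cand)
        if cost < best_cost:
            best, best_cost = cand, cost
        if threshold is not None and best_cost < threshold:
            break            # fast decision: cost already below threshold
    return best, best_cost
```

With a search range of ±2, only 25 candidate lines are tested instead of all parameterizable lines of the block, which is the complexity reduction described above.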
For example, at the decoder 200 side, segment motion compensation may be performed by applying decoded segment motion vectors to segments generated by the decoded partition information.
Fig. 8 shows the difference information for geometric partitioning using two offset values for the truncation point coordinates. The partition line of the geometric partition in fig. 8 is described by two points P_0 = [x_0, y_0]^T and P_1 = [x_1, y_1]^T located on the boundary of a given block. The two points define a straight line:

(y − y_0)(x_1 − x_0) = (y_1 − y_0)(x − x_0)

Since the direct encoding of the two points P_0 and P_1 consumes too high a bit rate, temporal/spatial prediction of the partition line is used. In particular, for the temporal prediction of the partition line, if the coordinates of the two truncation points of the predictor's partition line are P_{p,0} = [x_{p,0}, y_{p,0}]^T and P_{p,1} = [x_{p,1}, y_{p,1}]^T, then, as shown in fig. 8, the difference information is (Δ_s, Δ_e), where
Δ_s = (x_0 − x_{p,0}, y_0 − y_{p,0})
Δ_e = (x_1 − x_{p,1}, y_1 − y_{p,1})
A negative offset moves a point forward along the block boundary and vice versa. Sending only the difference information significantly reduces the signaling overhead of the two truncation points.
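The offset signaling can be sketched as a simple encode/decode round trip (an illustrative sketch; function names are chosen here for illustration):

```python
def encode_line_difference(p0, p1, pp0, pp1):
    """Difference information (delta_s, delta_e) between the actual
    truncation points (p0, p1) and the predicted ones (pp0, pp1)."""
    ds = (p0[0] - pp0[0], p0[1] - pp0[1])
    de = (p1[0] - pp1[0], p1[1] - pp1[1])
    return ds, de

def decode_line_difference(pp0, pp1, ds, de):
    """Decoder side: reconstruct the truncation points from the
    predictor and the received difference information."""
    p0 = (pp0[0] + ds[0], pp0[1] + ds[1])
    p1 = (pp1[0] + de[0], pp1[1] + de[1])
    return p0, p1
```

Only the (usually small) offsets are transmitted, rather than the absolute truncation point coordinates, which is what reduces the signaling overhead.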
In other words, fig. 8 shows an example of a partition 801 generated based on a partition prediction value (e.g., an initial partition) and another partition 802. The partition 801 may be an initial partition based on the partition prediction value. The other partition may be any partition obtained using other rules, e.g., a partition obtained by testing all possible partitions or a given subset of fixed partitions, or by modifying the partition 801 based on a given rule. Embodiments may be used to evaluate different candidate partitions (including the partition 801 and any other partitions 802) and select only one partition as the final partition. The evaluation and selection of the final partition may be performed, for example, by comparing the rate-distortion costs of the different partitions and selecting the partition with the smallest rate-distortion cost, or by selecting the first partition with a rate-distortion cost below a predetermined threshold.
With reference to the above description of the various embodiments, the embodiments of the video encoder and video decoder may be used to: select, for the current block (i.e., the block currently to be encoded or decoded; the index 0 indicates the current time instance), at least one reference picture 303a (only one exemplary reference picture is shown in fig. 3) and a plurality of blocks 304a in the at least one reference picture 303a (only one exemplary block is shown in fig. 3). The embodiments may be used to select a previous picture, e.g., the immediately preceding picture (P-1) or any other preceding picture (e.g., P-2, P-3, etc.; a negative index indicates a previous time instance), and/or a subsequent picture, e.g., the immediately following picture (P1) or any other subsequent picture (e.g., P2, P3, etc.; a positive index indicates a subsequent time instance).
Embodiments of the video encoder and video decoder may be further configured to: calculate a projected position of each selected block 304a of the plurality of selected blocks in the current picture 302 based on a motion vector associated with the selected block 304a in the reference picture 303a. Reference numeral 304m denotes the motion vector of the reference block 304a, e.g., a shift or motion of the reference block 304a in the reference picture 303a relative to the reference block 304b in the corresponding picture 303b, which reference block 304b is or has been used to predict and/or reconstruct the reference block 304a in the reference picture 303a. Thus, the reference block 304b and the reference picture 303b may also be referred to as the reference block 304b of the reference block 304a and the reference picture 303b of the reference block 304a. When encoding (or decoding) the current block 301, the reference block 304a and at least part of the reference picture 303a have been previously reconstructed and, for example, stored in the picture buffer 402 (see fig. 4). Thus, the motion vector 304m has also been decoded or reconstructed and is, for example, also stored in the picture buffer 402. Consequently, all information needed to project the position of the reference block 304a to a position in the current picture 302, according to how the reference block 304a was moved (represented by motion vector 304m), is available, e.g., from the picture buffer 402 (see reference numeral 304p). Reference numeral 304p denotes the projected or predicted position of the reference block 304a in the current picture 302, and reference numeral 304r denotes the corresponding position of the reference block 304b of the reference block 304a in the reference picture 303a.
In the ideal case (ideal with respect to prediction), the projected position 304p of the reference block 304a is the same as the position of the current block 301 in the current picture 302. In a scenario or use case where the temporal distance between the current picture 302 and the reference picture 303a (e.g., measured in actual time or by picture order count) and the temporal distance between the reference picture 303a and the reference picture 303b of the reference block 304a are the same, embodiments may directly use the motion vector for the projection by only inverting the motion vector 304m, as shown in fig. 3 (see the arrows indicating a 180° inversion). In other cases, embodiments may apply scaling to improve the prediction or projection, as described above.
embodiments of the video encoder and video decoder may be further configured to: a selected block 304a for which each projection position 304p spatially overlaps the block 301 in the current picture 302 is determined as a reference block. In other words, embodiments of the video encoder and video decoder may be further used to: if the projection position 304p of a selected block 304a spatially overlaps the block 301 in the current picture 302, this selected block 304a is selected as a reference block. Embodiments may be used to select as reference blocks all selected blocks 304a of which projection locations 304p spatially overlap said current block 301, or only some or only one, depending on predetermined or adaptive rules, e.g. fast mode limits the number of reference blocks to a certain number, or overlap mode requires that said projection blocks at projection locations 304p only need to meet a certain minimum percentage (not just any overlap) overlap or only a certain number of pixels overlap in the block area.
Embodiments of the video encoder and video decoder may be further configured to: the partition prediction value of the current block 301 is generated for at least one reference block based on partition information associated with at least one reference picture 303a, in particular based on partition information associated with the reference block 304 a. In other words, embodiments of the video encoder and video decoder may be used to: the partition predictor is generated for only one reference block or some or all of the reference blocks, e.g., one partition predictor is generated for each reference block. The number of reference blocks to be selected may be determined based on the fast mode or the overlap mode or other modes as examples described above, and the like.
Embodiments of the video encoder may be further configured to: partitioning the block 301 in the current picture 302 based on the partition prediction values, and optionally, additionally deriving difference information, wherein the difference information comprises or indicates a difference between the partition information of the partition prediction value (e.g., 801) and the partition information of a final partition (e.g., 802) to improve the partition. The difference information may also be referred to as partition difference information. Embodiments of the video encoder may be used to send the difference information or the partition prediction value and the difference information.
Embodiments of the video decoder may be further configured to: partitioning the block 301 in the current picture 302 based on the partition prediction values, and optionally, partitioning may be performed based on difference information, wherein the difference information includes or indicates a difference between the partition information of the partition prediction value (e.g., 801) and partition information of a final partition (e.g., 802) to improve the partitioning. The difference information may also be referred to as partition difference information, and may also be zero in the case where the partition information of the partition prediction value is selected as the final partition. Embodiments of the video decoder may be used to receive the difference information or the partition prediction value and the difference information.
Embodiments of the video encoder and video decoder may be further configured to: a partition prediction flag (to indicate use or enable/disable of the partition prediction) and a partition prediction value index (e.g., where multiple partition prediction values are available) are sent or received.
Embodiments of the present invention may be performed by hardware, software, or any combination thereof. Embodiments of the video encoder and video decoder may include a processor, and embodiments of the video encoding and decoding methods may be performed by the processor.
The invention has been described in connection with various embodiments and implementations as examples. Other variations will be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the independent claims. In the claims and the description, the term "comprising" does not exclude other elements or steps, and "a" does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (19)

1. A video encoder (100), comprising a processor configured to:
selecting at least one reference picture (303a) and a plurality of blocks (304a) in the at least one reference picture (303 a);
calculating a projection position of each selected block (304a) in the current picture (302) based on the motion vector associated with the selected block (304a) in the reference picture (303 a);
determining a selected block (304a) for which each projection position spatially overlaps with a block (301) in the current picture (302) as a reference block;
generating a partition predictor of the current block (301) for at least one reference block based on partition information associated with the at least one reference picture (303 a).
2. The video encoder (100) of claim 1, wherein the processor is configured to:
calculating a motion trajectory (305) for each selected block (304a) in time based on the motion vector associated with the selected block (304a) in the reference picture (303a) and a temporal Picture Order Count (POC) distance (306) between the current picture (302) and the at least one reference picture (303 a);
calculating the projection position of each selected block (304a) based on the position of the selected block (304a) in the reference picture (303a) and the motion trajectory (305).
3. The video encoder (100) of claim 2, wherein the processor is configured to:
calculating the motion trajectory (305) by inversion and scaling of the motion vector associated with the selected block (304a) as a function of the ratio of two POC distances, namely the POC distance (306) between the current picture (302) and the reference picture (303a) and the POC distance (306) between the reference picture (303a) and the reference picture (303b) associated with the selected block (304a).
4. The video encoder (100) of one of claims 1 to 3, wherein the selected plurality of blocks (304a) in each reference picture (303a) comprises:
all blocks of the reference picture (303a), or
A block of the reference picture (303a) within a projection range centered on the position of the block (301) in the current picture (302).
5. A video encoder (100), characterized in that the video encoder (100) comprises the features of the video encoder (100) of any of claims 1 to 4, and the processor is configured to:
a list is constructed and output that includes a plurality of indexed partition predictors.
6. A video encoder (100), characterized in that the video encoder (100) comprises the features of the video encoder (100) of any of claims 1 to 5, and in that the at least one partition prediction value of the video encoder (100) comprises at least one of: line parameters of the geometric partition (501), boundary motion vectors of the object-based partition (502), and rectangular partition information.
7. The video encoder (100) of claim 6, wherein:
the line parameters are specified by polar coordinates or truncation points at the reference block boundaries, and/or
The boundary motion vector specifies a partition boundary in a reference picture.
8. A video encoder (100), characterized in that the video encoder (100) comprises the features of the video encoder (100) of any of claims 1 to 7, and the processor is configured to:
generating an initial partition for the block (301) in the current picture (302) using the at least one partition prediction value.
9. A video encoder (100), characterized in that the video encoder (100) comprises the features of the video encoder (100) of any of claims 1 to 8, and the processor is configured to:
sending the at least one partition predictor or at least one index to a decoder (200), the index pointing to a position of the at least one partition predictor in an indexed partition predictor list.
10. A video encoder (100), characterized in that the video encoder (100) comprises the features of the video encoder (100) of any of claims 1 to 9, and the processor is configured to:
sending, to a decoder (200), difference information between the at least one partition prediction value and a final partition applied to the block (301) in the current picture (302).
11. A video decoder (200) comprising a processor configured to:
obtaining difference information, which is information on a difference between the estimated partition and a final partition of the current block;
selecting at least one reference picture (303a) and a plurality of blocks (304a) in the at least one reference picture (303 a);
calculating a projection position of each selected block (304a) in the current picture (302) based on the motion vector associated with the selected block (304a) in the reference picture (303 a);
determining a selected block (304a) for which each projection position spatially overlaps with a block (301) in the current picture (302) as a reference block;
generating a partition predictor of the current block (301) for at least one reference block based on partition information associated with the at least one reference picture (303 a); partitioning the block (301) in the current picture (302) based on the partition prediction value and the difference information.
12. The video decoder (200) of claim 11, wherein the processor is configured to:
calculating a motion trajectory (305) for each selected block (304a) in time based on the motion vector associated with the selected block (304a) in the reference picture (303a) and a temporal Picture Order Count (POC) distance (306) between the current picture (302) and the at least one reference picture (303 a);
calculating the projection position of each selected block (304a) based on the position of the selected block (304a) in the reference picture (303a) and the motion trajectory (305).
13. The video decoder (200) of claim 12, wherein the processor is configured to:
calculating the motion trajectory (305) by inversion and scaling of the motion vector associated with the selected block (304a) as a function of the ratio of two POC distances, namely the POC distance (306) between the current picture (302) and the reference picture (303a) and the POC distance (306) between the reference picture (303a) and the reference picture (303b) associated with the selected block (304a).
14. A video decoder (200), characterized in that the video decoder (200) comprises the features of the video decoder (200) of any of claims 11 to 13, and in that the selected plurality of blocks (304a) in each reference picture (303a) comprises:
all blocks of the reference picture (303a), or
A block of the reference picture (303a) within a projection range centered on the position of the block (301) in the current picture (302).
15. A video decoder (200), characterized in that said video decoder (200) comprises the features of said video decoder (200) of any of claims 11 to 14, said at least one partition prediction value comprising at least one of: line parameters of the geometric partition (501), boundary motion vectors of the object-based partition (502), and rectangular partition information.
16. The video decoder (200) of claim 15, wherein:
the line parameters are specified by polar coordinates or truncation points at the reference block boundaries, and/or
The boundary motion vector specifies a partition boundary in a reference picture (303 a).
17. A video encoding method, characterized in that said method comprises the steps of:
selecting (101) at least one reference picture (303a) and a plurality of blocks (304a) in the at least one reference picture (303 a);
calculating (102) a projection position of each selected block (304a) in the current picture (302) based on the motion vector associated with the selected block (304 a);
determining (103) a selected block (304a) for which each projection position spatially overlaps with a block (301) in the current picture (302) as a reference block;
generating (104) a partition prediction value for the current block (301) for at least one reference block based on partition information associated with the at least one reference picture (303 a).
18. A video decoding method, characterized in that it comprises the steps of:
receiving (201) difference information, wherein the difference information is information on a difference between an estimated partition and a final partition of a current block;
selecting (202) at least one reference picture (303a) and a plurality of blocks (304a) in the at least one reference picture (303 a);
calculating (203) a projection position of each selected block (304a) in the current picture (302) based on the motion vector associated with the selected block (304 a);
determining (204) a selected block (304a) for which each projection position spatially overlaps with a block (301) in the current picture (302) as a reference block;
generating (205) a partition prediction value for the current block (301) for at least one reference block based on partition information associated with the at least one reference picture (303 a);
partitioning (206) the block (301) in the current picture (302) based on the partition prediction value and the difference information.
19. A computer storage medium storing a computer program, characterized in that,
the computer program comprises program code for performing, when run on a computer, the method according to claim 17 or 18.
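The indexed partition predictor list of claims 5 and 9 can be illustrated with the following sketch; this is a hypothetical reading (the candidate values and deduplication policy are assumptions), showing why signalling only an index suffices when encoder and decoder construct the same list:

```python
# Hypothetical sketch: the encoder builds an indexed list of partition
# predictors from the reference blocks, selects one, and signals only its
# index; the decoder rebuilds the identical list and looks the value up.

def build_predictor_list(candidate_predictors):
    """Collect partition predictors in a fixed order, dropping duplicates
    so that each index identifies exactly one predictor."""
    seen, indexed = set(), []
    for predictor in candidate_predictors:
        if predictor not in seen:
            seen.add(predictor)
            indexed.append(predictor)
    return indexed

# Two reference blocks yield the same line parameters, so the duplicate
# is dropped and the list has two entries:
plist = build_predictor_list([(45.0, 8.0), (90.0, 4.0), (45.0, 8.0)])
assert plist == [(45.0, 8.0), (90.0, 4.0)]
# The encoder signals index 1; the decoder recovers the same predictor:
assert plist[1] == (90.0, 4.0)
```

Because both sides derive the list from the same reference-block partition information, the index alone identifies the chosen partition prediction value.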
CN201780085826.4A 2017-02-06 2017-02-06 Video encoder and decoder for predictive partitioning Active CN110249628B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/052568 WO2018141416A1 (en) 2017-02-06 2017-02-06 Video encoder and decoder for predictive partitioning

Publications (2)

Publication Number Publication Date
CN110249628A CN110249628A (en) 2019-09-17
CN110249628B true CN110249628B (en) 2021-08-20

Family

ID=57965960

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780085826.4A Active CN110249628B (en) 2017-02-06 2017-02-06 Video encoder and decoder for predictive partitioning

Country Status (4)

Country Link
US (1) US10771808B2 (en)
EP (1) EP3571839B1 (en)
CN (1) CN110249628B (en)
WO (1) WO2018141416A1 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11943467B2 (en) * 2018-09-21 2024-03-26 Vid Scale, Inc. Affine motion estimation for affine model-based video coding
CN117615125A (en) * 2018-10-08 2024-02-27 华为技术有限公司 Apparatus and method for inter prediction of geometrically partitioned blocks of coded blocks
WO2020094076A1 (en) 2018-11-06 2020-05-14 Beijing Bytedance Network Technology Co., Ltd. Motion candidates for inter prediction
US11025947B2 (en) * 2018-11-29 2021-06-01 Mediatek Inc. Method and apparatus for generating motion field motion vectors for blocks of current frame in on-the-fly manner
EP3905688A4 (en) 2018-12-28 2022-09-28 JVCKENWOOD Corporation Dynamic-image encoding device, dynamic-image encoding method, dynamic-image encoding program, dynamic-image decoding device, dynamic-image decoding method, and dynamic-image decoding program
BR112021012418A2 (en) 2018-12-28 2021-09-08 Jvckenwood Corporation IMAGE ENCODING DEVICE, IMAGE ENCODING METHOD, IMAGE DECODING DEVICE AND IMAGE DECODING METHOD
WO2020159982A1 (en) * 2019-01-28 2020-08-06 Op Solutions, Llc Shape adaptive discrete cosine transform for geometric partitioning with an adaptive number of regions
WO2020184456A1 (en) * 2019-03-08 2020-09-17 株式会社Jvcケンウッド Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program
EP3932060A4 (en) * 2019-03-23 2022-12-07 Beijing Dajia Internet Information Technology Co., Ltd. Methods and apparatus of video coding for triangle prediction
CN113841408B (en) * 2019-05-17 2024-02-06 北京字节跳动网络技术有限公司 Motion information determination and storage for video processing
US11375243B2 (en) * 2019-07-17 2022-06-28 Tencent America LLC Method and apparatus for video coding
WO2021015581A1 (en) * 2019-07-23 2021-01-28 한국전자통신연구원 Method, apparatus, and recording medium for encoding/decoding image by using geometric partitioning
CN117499625A (en) 2019-09-01 2024-02-02 北京字节跳动网络技术有限公司 Alignment of prediction weights in video coding
WO2021068921A1 (en) * 2019-10-10 2021-04-15 Beijing Bytedance Network Technology Co., Ltd. Motion vector handling in geometry partition mode
US20230319271A1 (en) * 2020-07-20 2023-10-05 Electronics And Telecommunications Research Institute Method, apparatus, and recording medium for encoding/decoding image by using geometric partitioning
US11386664B2 (en) * 2020-10-01 2022-07-12 Disney Enterprises, Inc. Tunable signal sampling for improved key-data extraction
WO2022114742A1 (en) * 2020-11-24 2022-06-02 현대자동차주식회사 Apparatus and method for video encoding and decoding
CN112565753B (en) * 2020-12-06 2022-08-16 浙江大华技术股份有限公司 Method and apparatus for determining motion vector difference, storage medium, and electronic apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101682778A (en) * 2007-06-08 2010-03-24 三星电子株式会社 Method and apparatus for encoding and decoding image using object boundary based partition
CN102939751A (en) * 2010-03-31 2013-02-20 法国电信 Methods and devices for encoding and decoding an image sequence, which implement prediction by forward motion compensation, and corresponding stream and computer program
CN103402045A (en) * 2013-08-20 2013-11-20 长沙超创电子科技有限公司 Image de-spin and stabilization method based on subarea matching and affine model
CN104737540A (en) * 2012-11-13 2015-06-24 英特尔公司 Video codec architecture for next generation video
JP2016127372A (en) * 2014-12-26 2016-07-11 Kddi株式会社 Video encoder, video decoder, video processing system, video encoding method, video decoding method, and program

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7003035B2 (en) 2002-01-25 2006-02-21 Microsoft Corporation Video coding methods and apparatuses
EP2047687B1 (en) 2006-08-02 2018-05-16 Thomson Licensing DTV Adaptive geometric partitioning for video encoding
US8879632B2 (en) 2010-02-18 2014-11-04 Qualcomm Incorporated Fixed point implementation for geometric motion partitioning
US20120147961A1 (en) 2010-12-09 2012-06-14 Qualcomm Incorporated Use of motion vectors in evaluating geometric partitioning modes
EP2464116A1 (en) 2010-12-13 2012-06-13 Thomson Licensing Method and device for video encoding using geometry adaptive block partitioning
PL3136727T3 (en) 2011-04-12 2018-11-30 Sun Patent Trust Motion-video coding method and motion-video coding apparatus
KR101945720B1 (en) * 2012-01-10 2019-02-08 삼성전자주식회사 Apparatus and Method for virtual view generation on multi-view image reconstruction system
CN104704827B (en) 2012-11-13 2019-04-12 英特尔公司 Content-adaptive transform decoding for next-generation video
US10368097B2 (en) * 2014-01-07 2019-07-30 Nokia Technologies Oy Apparatus, a method and a computer program product for coding and decoding chroma components of texture pictures for sample prediction of depth pictures


Also Published As

Publication number Publication date
CN110249628A (en) 2019-09-17
WO2018141416A1 (en) 2018-08-09
EP3571839A1 (en) 2019-11-27
US10771808B2 (en) 2020-09-08
US20190364296A1 (en) 2019-11-28
EP3571839B1 (en) 2020-10-28

Similar Documents

Publication Publication Date Title
CN110249628B (en) Video encoder and decoder for predictive partitioning
US11750818B2 (en) Inter-prediction mode based image processing method, and apparatus therefor
CN111385569B (en) Coding and decoding method and equipment thereof
CN110557640B (en) Weighted interleaved prediction
US20210144400A1 (en) Difference calculation based on partial position
US11800150B2 (en) Method for deriving a motion vector
WO2020098803A1 (en) Harmonization between affine mode and other inter coding tools
US11778194B2 (en) MV planar mode with block level
CN110740327A (en) Motion compensation of overlapping blocks
US20220070448A1 (en) Inter prediction encoding and decoding method and device
WO2020058958A1 (en) Construction for motion candidates list
CN113709498B (en) Inter prediction method, encoder, decoder, and computer storage medium
WO2020049447A1 (en) Fast encoding methods for interweaved prediction
CN110557639B (en) Application of interleaved prediction
US10432960B2 (en) Offset temporal motion vector predictor (TMVP)
CN112449181A (en) Encoding and decoding method, device and equipment
US20230388484A1 (en) Method and apparatus for asymmetric blending of predictions of partitioned pictures
KR20200085678A (en) Motion information prediction method and apparatus for distortion due to projection formation conversion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant