WO2022047117A1 - Methods and systems of adaptive geometric partitioning - Google Patents

Methods and systems of adaptive geometric partitioning

Publication number
WO2022047117A1
Authority
WO
WIPO (PCT)
Prior art keywords
current block
encoder
modes
geometric
partition
Prior art date
Application number
PCT/US2021/047886
Other languages
English (en)
Inventor
Borivoje Furht
Hari Kalva
Velibor Adzic
Original Assignee
Op Solutions, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Op Solutions, Llc
Publication of WO2022047117A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Definitions

  • the present invention generally relates to the field of video compression.
  • the present invention is directed to methods and systems of adaptive geometric partitioning.
  • a video codec can include an electronic circuit or software that compresses or decompresses digital video. It can convert uncompressed video to a compressed format or vice versa.
  • a device that compresses video (and/or performs some function thereof) can typically be called an encoder, and a device that decompresses video (and/or performs some function thereof) can be called a decoder.
  • a format of the compressed data can conform to a standard video compression specification.
  • the compression can be lossy in that the compressed video lacks some information present in the original video. A consequence of this can include that decompressed video can have lower quality than the original uncompressed video because there is insufficient information to accurately reconstruct the original video.
  • Motion compensation can include an approach to predict a video frame or a portion thereof given a reference frame, such as previous and/or future frames, by accounting for motion of the camera and/or objects in the video. It can be employed in the encoding and decoding of video data for video compression, for example in the encoding and decoding using the Moving Picture Experts Group (MPEG)'s advanced video coding (AVC) standard (also referred to as H.264). Motion compensation can describe a picture in terms of the transformation of a reference picture to the current picture. The reference picture can be previous in time when compared to the current picture, or from the future when compared to the current picture. When images can be accurately synthesized from previously transmitted and/or stored images, compression efficiency can be improved.
  • MPEG Moving Picture Experts Group
  • AVC advanced video coding
  • an encoder includes circuitry configured to receive a bitstream, wherein the bitstream includes a current picture, the current picture including a current block of pixels with multiple geometric partition boundaries, at least a first partition boundary partitioning the block into first and second non-rectangular regions, and a second partition boundary, non-parallel to and intersecting the at least a first partition boundary, partition the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into at least three portions, determine a first predictor for use on a first side of the at least a first partition boundary using a first motion vector, wherein the first motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle, determine a second predictor as a function of a second motion vector, wherein the second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector, and encode the current block using the first motion vector and the second motion vector, wherein decoding further comprises smoothing the first predictor
  • a method includes, receiving, by an encoder, a bitstream, wherein the bitstream includes a current picture, the current picture including a current block of pixels with multiple geometric partition boundaries, at least a first partition boundary partitioning the block into first and second non-rectangular regions, and a second partition boundary, non-parallel to and intersecting the at least a first partition boundary, partitioning, by the encoder, the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into at least three portions, determining, by the encoder, a first predictor for use on a first side of the at least a first partition boundary using a first motion vector, wherein the first motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle, determining, by the encoder, a second predictor as a function of a second motion vector, wherein the second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector, and encoding
  • an encoder includes circuitry configured to receive an input video, select a current block of the input video, determine a number of geometric partitioning modes applicable to the current block, identify a geometric partition of the current block as a function of the number of geometric partitioning modes, and encode the current block as a function of the geometric partition.
  • a method includes receiving an input video, selecting a current block of the input video, determining a number of geometric partitioning modes applicable to the current block, identifying a geometric partition of the current block as a function of the number of geometric partitioning modes, and encoding the current block as a function of the geometric partition.
  • FIG. 1 is a block diagram illustrating an exemplary embodiment of block partitioning
  • FIG. 2 is an illustration of an exemplary embodiment of geometric partitioning
  • FIG. 3 is an illustration of an exemplary embodiment of geometric partitioning
  • FIG. 4 is an illustration of an exemplary embodiment of a set of available angles for geometric partitioning
  • FIG. 5 is an illustration of an exemplary embodiment of a set of available angles for geometric partitioning
  • FIG. 6 is an illustration of an exemplary embodiment of a set of available angles for geometric partitioning
  • FIG. 7 is an illustration of an exemplary embodiment of a set of available angles for geometric partitioning
  • FIG. 8 is a process flow diagram illustrating an example process of adaptive geometric partitioning
  • FIG. 9 is a system block diagram illustrating an example decoder capable of decoding a bit stream according to some implementations of the current subject matter
  • FIG. 10 is a system block diagram illustrating an example video encoder according to some implementations of the current subject matter.
  • FIG. 11 is a block diagram of a computing system that can be used to implement any one or more of the methodologies disclosed herein and any one or more portions thereof.
  • Some implementations of the current subject matter include performing inter prediction with regions that have been partitioned with a geometric partitioning mode, as selected from an adaptive number of possible geometric partitioning modes, in which a rectangular block may be divided into two or more non-rectangular regions.
  • Performing inter prediction with non-rectangular blocks that have been partitioned by geometric partitioning into an adaptive number of regions may allow partitioning to more closely follow object boundaries, resulting in lower motion compensation prediction error, smaller residuals, and thus improved compression efficiency.
  • motion compensation may be performed using motion vectors predicted for blocks (e.g., coding units, prediction units, and the like) determined according to a geometric partitioning mode.
  • Motion vectors may be predicted using advanced motion vector prediction (AMVP) and/or via merge mode, where the motion vector is selected from a list of motion vector candidates without encoding a motion vector difference.
  • AMVP advanced motion vector prediction
  • the current subject matter may be applied to relatively larger blocks, such as blocks with a size of 128 x 128 or 64 x 64, for example.
  • the geometric partitioning may involve partitioning a current block into an adaptive number of regions, such as two or more regions for a given current block, and motion information can be determined for each region.
  • Motion compensation may include an approach to predict a video frame or a portion thereof given the previous and/or future frames by accounting for motion of the camera and/or objects in the video.
  • Motion compensation may be employed in encoding and decoding of video data for video compression, for example in the encoding and decoding using the Moving Picture Experts Group (MPEG)'s advanced video coding (AVC) standard (also referred to as H.264).
  • Motion compensation may describe a picture in terms of the transformation of a reference picture to a current picture.
  • a reference picture may be previous in time or from the future when compared to a current picture.
  • When images can be accurately synthesized from previously transmitted and/or stored images, compression efficiency may be improved.
  • Block partitioning may refer to a method in video coding to find regions of similar motion. Some form of block partitioning may be found in video codec standards including MPEG-2, H.264 (also referred to as AVC or MPEG-4 Part 10), and H.265 (also referred to as High Efficiency Video Coding (HEVC)).
  • HEVC High Efficiency Video Coding
  • nonoverlapping blocks of a video frame may be partitioned into rectangular sub-blocks to find block partitions that contain pixels with similar motion. This approach can work well when all pixels of a block partition have similar motion. Motion of pixels in a block may be determined relative to previously coded frames.
  • An initial rectangular picture or block 100, which may itself be a sub-block (e.g., a node within a coding tree), can be partitioned into rectangular sub-blocks.
  • block 100 is partitioned into two rectangular sub-blocks 110a and 110b.
  • Sub-blocks 110a and 110b may then be processed separately.
  • block 100 may be partitioned into four rectangular sub-blocks 120a, 120b, 120c, and 120d.
  • Sub-blocks may themselves be further divided until it is determined that the pixels within the sub-blocks share the same motion, a minimum block size is reached, or another criterion is met.
  • a motion vector may describe the motion of all pixels in that region.
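  • The recursive rectangular splitting described above can be sketched as follows. This is a minimal illustration, not the method of any particular codec: the per-pixel motion labels, the function name `quadtree_partition`, and the `min_size` parameter are all assumptions introduced here. A block is split into four quadrants until its pixels share one motion label or the minimum size is reached.

```python
def quadtree_partition(motion, x, y, size, min_size=8):
    """Return leaf blocks as (x, y, size) tuples.

    `motion` is a hypothetical 2-D list of per-pixel motion labels.
    A block is split into four quadrants while its pixels carry more
    than one label and it is still larger than `min_size`.
    """
    labels = {motion[r][c]
              for r in range(y, y + size)
              for c in range(x, x + size)}
    if len(labels) == 1 or size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_partition(motion, x + dx, y + dy, half, min_size)
    return leaves


# 16x16 toy frame: the top-left 8x8 quadrant moves differently from the rest.
motion = [[1 if (r < 8 and c < 8) else 0 for c in range(16)] for r in range(16)]
leaves = quadtree_partition(motion, 0, 0, 16, min_size=4)
```

  • In this toy input a single split is enough, because each 8x8 quadrant already has uniform motion. A real encoder would typically decide splits by rate-distortion cost rather than by exact label equality.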
  • some approaches to video coding can include geometric partitioning, which may be a form of exponential partitioning in which a rectangular block (e.g., as illustrated in FIG. 1) is further divided by a straight line segment into two regions that may be non-rectangular.
  • FIG. 2 illustrates various exemplary geometric partitions that may be formed according to geometric partitioning modes; each geometric partition may be defined by intersecting a block using a line segment. Where a geometric partition divides a block into more than two regions, two or more line segments as described in this disclosure may be used to define the geometric partition; line segments may be specified as overlapping or nonoverlapping.
  • FIG. 3 is a diagram illustrating an example of geometric partitioning.
  • An exemplary rectangular block 300, which may be described as having a width of M pixels and a height of N pixels, denoted as MxN pixels, may be divided along a straight line segment P1P2 304 into two regions, a first region 308 and a second region 312.
  • As a non-limiting example, rectangular block 300 may be 64 x 64, with a width M of 64 pixels and a height N of 64 pixels.
  • As a further non-limiting example, rectangular block 300 may be 128 x 128, with a width M of 128 pixels and a height N of 128 pixels.
  • When pixels in first region 308 have similar motion, a motion vector may describe the motion of all pixels in that region. The motion vector may be used to compress first region 308. Similarly, when pixels in second region 312 have similar motion, an associated motion vector may describe the motion of pixels in the second region 312.
  • Such a geometric partition may be signaled to the receiver (e.g., decoder) by encoding positions P1 and P2 (or representations of positions P1 and P2) in the video bitstream.
  • geometric inter prediction and/or geometric partitioning may be signaled in terms of an angle, denoted in FIG. 3 as a, from the horizontal and displacement b from a point situated at a geometric reference of block, where displacement may be interpreted as any possible form of distance or norm, including without limitation the Euclidean definition of a length of a line segment orthogonal to the line segment forming the partition and terminating at the point at the geometric reference.
  • a “geometric reference” is a reference point and/or origination point of displacement b that exists within current block.
  • geometric reference may denote a geometric center, such as but not limited to a central location and/or point of current block.
  • geometric inter prediction and/or geometric partitioning may be signaled in terms of an angle, denoted in FIG. 3 as a, from the horizontal and displacement b from a point situated at a geometric center of block, where displacement may be interpreted as any possible form of distance or norm, including without limitation the Euclidean definition of a length of a line segment orthogonal to the line segment forming the partition and terminating at the point at the geometric center.
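  • To make the angle and displacement parameterization concrete, the sketch below builds a two-region mask for a block from an angle a and a Euclidean displacement b measured from the block center. It is an illustration under stated assumptions: the function name `geo_mask`, the choice of the pixel-grid center as the geometric reference, the sign convention for which side is region 1, and the image-coordinate orientation (y grows downward) are not taken from the source.

```python
import math


def geo_mask(width, height, angle_deg, displacement):
    """Label each pixel 0 or 1 according to its side of the partition line.

    The line runs at `angle_deg` from the horizontal and lies at
    perpendicular distance `displacement` from the block center
    (assumed geometric reference). Side labeling is an assumed convention.
    """
    a = math.radians(angle_deg)
    nx, ny = -math.sin(a), math.cos(a)          # unit normal to the line
    cx, cy = (width - 1) / 2, (height - 1) / 2  # center of the pixel grid
    return [[1 if nx * (x - cx) + ny * (y - cy) >= displacement else 0
             for x in range(width)]
            for y in range(height)]


horizontal_split = geo_mask(8, 8, 0.0, 0.0)   # line through the center
diagonal_split = geo_mask(8, 8, 45.0, 0.0)    # diagonal through the center
shifted_split = geo_mask(8, 8, 0.0, 2.0)      # same angle, displaced line
```

  • With an angle of 0 and zero displacement the block splits into equal upper and lower halves; increasing b moves the line away from the center without changing its slope, which is the sense in which (a, b) pairs enumerate distinct modes.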
  • geometric partitions may have possible modes specified according to potential positions, defined by b, and potential angles, defined by a, of line segment slopes used to perform such partitions.
  • a number of possible modes may be specified and/or signaled by specifying and/or signaling how many possible values of a and/or b are available for use in specifying each line segment.
  • possible values for line segment slope angle a may be a range of quantized angles between 0 and 360 degrees with 11.25 degrees of separation, which gives a total of 32 angles.
  • a number of possible modes may be signaled and/or specified by defining a range and/or set of possible values for a, given a fixed set of possible values for b.
  • a first set of possible modes may be defined by a first set of 32 values for a, depicted in FIG. 4 as angles from a horizontal line of rays 0-31, which may be combined with a set of possible values of b, such as 5 values, for a total of 140 modes.
  • a second set of modes, which may be smaller, may be specified by a second, potentially smaller, set of possible values of a as shown in FIG.
  • a third set of modes, which may be smaller, may be specified by a third, potentially smaller, set of possible values of a, which may be derived, in a non-limiting example, by removal of angles as defined in a larger set; for instance, and without limitation, angles defined by rays 5, 7, 17, and 19 from the set of rays as depicted in FIG. 5 may be eliminated from the third set of possible angles, resulting in 20 possible angles, depicted as a non-limiting example as defined by rays 0 through 19, which in a non-limiting example may be combined with 5 possible values of b for a total of 64 possible modes.
  • a fourth set of modes, which may be smaller, may be specified by a fourth, potentially smaller, set of possible values of a, which may be derived, in a non-limiting example, by removal of angles as defined in a larger set; for instance, and without limitation, angles defined by rays 1, 9, 11, and 19 as depicted in FIG. 6 may be eliminated from the fourth set of possible angles, resulting in 16 possible angles, which in a non-limiting example may be combined with 5 possible values of b for a total of 50 possible modes.
  • numbers of modes may alternatively be specified by varying the number of values for b, either keeping values for a fixed, or in combination with variations in sets of possible values for a.
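  • The way a mode set follows from a set of angles and a set of displacements can be sketched as below. The helper names are hypothetical, and the 24-ray starting set is an inference (20 remaining angles plus 4 removed rays), not a figure stated in the text. Note that the plain product of angles and displacements (for example 32 x 5 = 160) is larger than the totals quoted above (140, 64, 50); the text does not say which combinations are excluded, so the sketch computes only the unpruned upper bound.

```python
def quantized_angles(step_deg=11.25):
    """Angles in [0, 360) at a fixed separation; 11.25 degrees gives 32."""
    return [i * step_deg for i in range(int(360 / step_deg))]


def prune_rays(rays, removed):
    """Drop the listed ray indices from a ray set."""
    return [r for r in rays if r not in removed]


def candidate_modes(angles, displacements):
    """All (angle, displacement) pairs: an upper bound on the mode count."""
    return [(a, b) for a in angles for b in displacements]


angles = quantized_angles()                             # rays 0-31
reduced = prune_rays(range(24), {5, 7, 17, 19})         # assumed 24-ray set
upper_bound = len(candidate_modes(angles, range(5)))    # 32 * 5
```

  • A mode count can then be varied either by changing the angle set, by changing the number of displacement values, or both, as the preceding paragraph notes.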
  • a video is received, for instance and without limitation at an encoder.
  • Video includes any of video as described in this disclosure.
  • Video includes a current picture, the current picture includes a current block of pixels with multiple geometric partition boundaries.
  • Current picture includes any of the current picture as described above, in reference to FIGS. 1-7.
  • Encoder may establish a first partition boundary partitioning a block of the current picture into first and second non-rectangular regions.
  • First partition boundary includes any of the first partition boundary as described above, in reference to FIGS. 1-7.
  • Encoder may establish a second partition boundary that is non-parallel to and intersecting the at least a first partition boundary.
  • Second partition boundary includes any of the second partition boundary as described above, in reference to FIGS. 1-7.
  • encoder partitions the second non-rectangular region of the current block.
  • Partition includes any of the partition as described above, in reference to FIGS. 1-7.
  • Encoder partitions the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into three portions.
  • Geometric partitioning mode includes any of the geometric partitioning mode as described above, in reference to FIGS. 1-7.
  • Encoder may determine the geometric partitioning mode as a function of receiving, in the bitstream, a signal identifying the number of partitioning modes, such as an integer value; determination may be performed as a function of the signal.
  • Determining the number of partitioning modes may include receiving, in the bitstream, a signal identifying a label corresponding to a number of partitioning modes, where a label is defined as a datum used to identify a stored number of partitioning modes; as a non-limiting example, a label may include one or more bits in a field corresponding to a signaled number of modes in a sequence parameter set (SPS), picture parameter set (PPS) or the like, such as a 2-bit field corresponding to 4 possible numbers of modes or the like. As a non-limiting example, a label may be used to retrieve a number of modes from a lookup table.
  • SPS sequence parameter set
  • PPS picture parameter set
  • numbers of modes may be identified and/or defined in any manner as disclosed above.
  • a number of geometric partitioning modes may correspond to a distribution of line segment slope angles.
  • a number of modes may be selected from a plurality of possible mode quantities, which may include at least four distinct possible mode quantities, for instance as described above.
  • a plurality of possible mode quantities may include a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140.
  • geometric partition may be signaled in bitstream, for instance and without limitation in an SPS or a PPS, including without limitation communication of angle a and/or displacement b as described above, and/or one or more labels, as described above, suitable for looking up such angle and/or displacement, or the like.
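  • The label-based signaling described above, in which a 2-bit field selects one of four stored mode quantities, might look like the following sketch. The assignment of label values to quantities, the position of the field within a byte, and the names `MODE_COUNT_TABLE`, `parse_label`, and `mode_count_for` are all hypothetical; this is not actual SPS or PPS syntax.

```python
# Hypothetical mapping from a 2-bit label to a stored number of modes.
# The four quantities come from the text; the ordering is an assumption.
MODE_COUNT_TABLE = {0b00: 50, 0b01: 64, 0b10: 82, 0b11: 140}


def parse_label(header_byte, shift=6):
    """Extract a 2-bit label from a byte; the field position is assumed."""
    return (header_byte >> shift) & 0b11


def mode_count_for(label):
    """Look up the number of geometric partitioning modes for a label."""
    if label not in MODE_COUNT_TABLE:
        raise ValueError(f"unknown mode-count label: {label}")
    return MODE_COUNT_TABLE[label]


label = parse_label(0b10000000)
num_modes = mode_count_for(label)
```

  • Signaling a short label instead of the integer itself keeps the parameter-set overhead fixed at two bits regardless of how large the mode quantities are.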
  • encoder determines a first predictor for use on a first side of the at least a first partition boundary using a first motion vector.
  • First predictor includes any of the first predictor as described above, in reference to FIGS. 1-7.
  • First motion vector includes any of the first motion vector as described above, in reference to FIGS. 1-7.
  • First motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle.
  • Line segment slope angle includes any of the line segment slope angle as described above, in reference to FIGS. 1-7.
  • encoder determines a second predictor as a function of a second motion vector.
  • Second predictor includes any of the second predictor as described above, in reference to FIGS. 1-7.
  • Second motion vector includes any of the second motion vector as described above, in reference to FIGS. 1-7.
  • Second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector.
  • Geometric reference includes any of the geometric reference as described above, in reference to FIGS. 1-7.
  • method 800 includes encoding the current block using the first motion vector and the second motion vector.
  • Encoding the current block further comprises smoothing the first predictor and the second predictor across the at least a first partition boundary.
  • First partition boundary includes any of the first partition boundary as described above, in reference to FIGS. 1-7.
  • Encoding the current block further comprises adding residual pixel values to the first predictor and the second predictor. Residual pixel values include any of the residual pixel values as described above, in reference to FIGS. 1-7.
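  • Smoothing two predictors across a partition boundary and then adding residual pixel values can be illustrated as below. The linear weight ramp, its width `ramp`, the per-pixel signed-distance input, and the 8-bit clipping range are assumptions made for the sketch; the text does not specify the smoothing filter.

```python
def blend_predictors(pred0, pred1, signed_dist, ramp=2.0):
    """Blend two predictors near the boundary.

    `signed_dist` holds each pixel's signed distance to the partition
    line. The weight of `pred1` rises linearly from 0 to 1 over
    [-ramp, +ramp] (assumed smoothing profile).
    """
    out = []
    for row0, row1, drow in zip(pred0, pred1, signed_dist):
        out_row = []
        for p0, p1, d in zip(row0, row1, drow):
            w = min(1.0, max(0.0, (d + ramp) / (2 * ramp)))
            out_row.append((1 - w) * p0 + w * p1)
        out.append(out_row)
    return out


def add_residual(pred, residual, max_val=255):
    """Add residual values to a predictor and clip to the sample range."""
    return [[min(max_val, max(0, round(p + r))) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]


# One row crossing the boundary: region 0 predicts 100, region 1 predicts 200.
blended = blend_predictors([[100] * 5], [[200] * 5], [[-4, -2, 0, 2, 4]])
reconstructed = add_residual(blended, [[0, 5, -10, 60, 0]])
```

  • Pixels far from the boundary keep their own region's predictor unchanged, while the pixel on the boundary receives an equal mix of both, which avoids a visible seam along the partition line.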
  • encoding current block may include partitioning the current block into a plurality of regions as a function of geometric partition.
  • Encoding may include determining a motion vector associated with at least one region of plurality of regions defined by geometric partition, and encoding current block using the determined motion vector.
  • bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block; where not enabled, encoder may disregard bitstream parameters regarding numbers of geometric partitioning modes, geometric partition parameters, or the like, and/or such parameters may be excluded from parameter sets such as an SPS, PPS, or the like.
  • Encoder may encode, in the bitstream, a signal identifying the number of partitioning modes.
  • Encoder may encode, in the bitstream, a signal identifying a label corresponding to a number of partitioning modes.
  • Geometric partition of the current block may be signaled in a bitstream.
  • Encoder may encode current block by partitioning the current block into a plurality of regions as a function of geometric partition.
  • Bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block.
  • Current block may form part of a quadtree plus binary decision tree.
  • Current block may include a non-leaf node of the quadtree plus binary decision tree.
  • Current block may include a coding tree unit and/or a coding unit.
  • Current block may include a coding unit and/or a prediction unit.
  • encoder may use motion information from spatial and temporal neighbors of current block to reduce the number of GEO modes to be evaluated.
  • Encoder may perform a pre-processing step where input frames are processed and edges in a current frame including current block are detected.
  • Encoder may also use edge information to determine whether GEO mode is applicable and to reduce complexity by reducing a number of modes to be evaluated.
  • Encoder may also use edge information and GEO information of neighboring blocks and/or motion information of the neighboring blocks to determine the GEO mode of the current block. For instance, and without limitation, analysis of pixels in a block may determine orientation and strength of edges passing through the block.
  • Edge detectors such as a Canny edge detector may be used; if a block shows a strong diagonal edge, geometric partitioning mode may be beneficial, and an encoder and/or other device or component may identify that geometric partitioning may be used. If a block does not have any significant edges, geometric inter prediction may not be useful, and an encoder and/or other device or component may identify that geometric partitioning may not be used.
  • edge information can be used to skip evaluation of certain geometric modes and thereby reduce computational complexity; for instance, edges may correspond to line segment angles corresponding to a particular set of modes, which may be selected by encoder or the like and signaled in a bitstream.
  • geometric modes selected for spatial and/or temporal neighbors may be used to determine the likely mode to use and modes to skip in the current block and/or region, and signaled in the bitstream.
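  • The edge-driven pruning of candidate modes can be sketched as follows. The text names a Canny edge detector; to stay dependency-free, this sketch swaps in plain Sobel gradients with double-angle averaging, which is a simpler technique and only a stand-in. The strength threshold, the angular tolerance, and the function names are assumptions, and the angle convention (measured in image coordinates, edges treated as undirected) is chosen for illustration.

```python
import math


def dominant_edge(block):
    """Return (mean gradient magnitude, edge angle in degrees, 0-180)."""
    h, w = len(block), len(block[0])
    sum_mag = sum_cos2 = sum_sin2 = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            sum_mag += math.hypot(gx, gy)
            sum_cos2 += gx * gx - gy * gy   # double-angle accumulation so that
            sum_sin2 += 2 * gx * gy         # opposite gradients reinforce
    gradient_angle = 0.5 * math.degrees(math.atan2(sum_sin2, sum_cos2))
    edge_angle = (gradient_angle + 90.0) % 180.0  # edge is normal to gradient
    return sum_mag / ((h - 2) * (w - 2)), edge_angle


def select_angles(block, angles, min_strength=50.0, tol_deg=22.5):
    """Keep only partition angles close to the block's dominant edge.

    Returns an empty list when no significant edge is found, meaning
    geometric partitioning would be skipped for this block.
    """
    strength, edge = dominant_edge(block)
    if strength < min_strength:
        return []
    return [a for a in angles
            if abs((a - edge + 90.0) % 180.0 - 90.0) <= tol_deg]


all_angles = [i * 11.25 for i in range(32)]
edge_block = [[0] * 4 + [255] * 4 for _ in range(8)]   # vertical edge
flat_block = [[128] * 8 for _ in range(8)]             # no edges
kept = select_angles(edge_block, all_angles)
skipped = select_angles(flat_block, all_angles)
```

  • For the vertical-edge block only the angles near 90 and 270 degrees survive, so the encoder would evaluate a fraction of the 32 candidates; for the flat block the list is empty and geometric modes need not be evaluated at all.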
  • number of geometric partitioning modes corresponds to a distribution of line segment slope angles.
  • Number of modes may be selected from a plurality of possible mode quantities.
  • Plurality of possible mode quantities includes at least four distinct possible mode quantities.
  • Plurality of possible mode quantities includes a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140 modes.
  • FIG. 9 is a system block diagram illustrating an example decoder 900 capable of decoding a bitstream according to adaptive geometric partitioning as described above.
  • Decoder 900 may include an entropy decoder processor 904, an inverse quantization and inverse transformation processor 908, a deblocking filter 912, a frame buffer 916, a motion compensation processor 920 and/or an intra prediction processor 924.
  • bit stream 928 may be received by decoder 900 and input to entropy decoder processor 904, which may entropy decode portions of bit stream into quantized coefficients.
  • Quantized coefficients may be provided to inverse quantization and inverse transformation processor 908, which may perform inverse quantization and inverse transformation to create a residual signal, which may be added to an output of motion compensation processor 920 or intra prediction processor 924 according to a processing mode.
  • An output of the motion compensation processor 920 and intra prediction processor 924 may include a block prediction based on a previously decoded block.
  • a sum of prediction and residual may be processed by deblocking filter 912 and stored in a frame buffer 916.
  • decoder 900 may include circuitry configured to implement any operations as described above in any embodiment as described above, in any order and with any degree of repetition.
  • decoder 900 may be configured to perform a single step or sequence repeatedly until a desired or commanded outcome is achieved; repetition of a step or a sequence of steps may be performed iteratively and/or recursively using outputs of previous repetitions as inputs to subsequent repetitions, aggregating inputs and/or outputs of repetitions to produce an aggregate result, reduction or decrement of one or more variables such as global variables, and/or division of a larger processing task into a set of iteratively addressed smaller processing tasks.
  • Decoder may perform any step or sequence of steps as described in this disclosure in parallel, such as simultaneously and/or substantially simultaneously performing a step two or more times using two or more parallel threads, processor cores, or the like; division of tasks between parallel threads and/or processes may be performed according to any protocol suitable for division of tasks between iterations.
  • steps, sequences of steps, processing tasks, and/or data may be subdivided, shared, or otherwise dealt with using iteration, recursion, and/or parallel processing.
  • FIG. 10 is a system block diagram illustrating an example video encoder 1000 capable of encoding for adaptive geometric partitioning as described above.
  • the example video encoder 1000 receives an input video 1005, which can be initially segmented or divided according to a processing scheme, such as a tree-structured macro block partitioning scheme (e.g., quad-tree plus binary tree).
  • a tree-structured macro block partitioning scheme can include partitioning a picture frame into large block elements called coding tree units (CTU).
  • CTU coding tree units
  • each CTU can be further partitioned one or more times into a number of subblocks called coding units (CU).
  • the final result of this partitioning can include a group of subblocks that can be called prediction units (PU).
  • Transform units (TU) can also be utilized.
  • example video encoder 1000 includes an intra prediction processor 1015, a motion estimation/compensation processor 1020 (also referred to as an inter-prediction processor) capable of supporting adaptive cropping, a transform/quantization processor 1025, an inverse quantization/inverse transform processor 1030, an in-loop filter 1035, a decoded picture buffer 1040, and an entropy coding processor 1045.
  • Bit stream parameters can be input to the entropy coding processor 1045 for inclusion in the output bit stream 1050.
  • if the block is to be processed via intra prediction, the intra prediction processor 1010 can perform the processing to output the predictor. If the block is to be processed via motion estimation/compensation, the motion estimation/compensation processor 1020 can perform the processing, including using adaptive cropping, if applicable.
  • a residual can be formed by subtracting the predictor from the input video.
  • the residual can be received by the transform/quantization processor 1025, which can perform transformation processing (e.g., discrete cosine transform (DCT)) to produce coefficients, which can be quantized.
  • the quantized coefficients and any associated signaling information can be provided to the entropy coding processor 1045 for entropy encoding and inclusion in the output bit stream 1050.
  • the entropy encoding processor 1045 can support encoding of signaling information related to encoding the current block.
  • the quantized coefficients can be provided to the inverse quantization/inverse transformation processor 1030, which can reproduce pixels, which can be combined with the predictor and processed by the in-loop filter 1035, the output of which is stored in the decoded picture buffer 1040 for use by the motion estimation/compensation processor 1020 that is capable of adaptive cropping.
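  • The residual, transform, quantization, and reconstruction path described above can be sketched as follows. This is a minimal illustration, assuming an orthonormal floating-point DCT and a single uniform quantizer step `qstep`; all function names are hypothetical, and the in-loop filter and entropy coding are omitted. Practical codecs use integer transforms and more elaborate quantization.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(v)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
            for i in range(n)]


def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(list(r)) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def encode_block(block, predictor, qstep):
    """Return (quantized levels, reconstructed block) for one block."""
    # Residual = input minus predictor.
    residual = [[b - p for b, p in zip(br, pr)] for br, pr in zip(block, predictor)]
    # Forward transform and uniform quantization.
    coeffs = transform_2d(residual, dct_1d)
    levels = [[round(c / qstep) for c in row] for row in coeffs]
    # Inverse quantization and inverse transform, then add the predictor back,
    # mirroring what the decoder will reconstruct.
    dequant = [[lv * qstep for lv in row] for row in levels]
    recon_res = transform_2d(dequant, idct_1d)
    recon = [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(predictor, recon_res)]
    return levels, recon
```

  • The `levels` would go to the entropy coder, while `recon` (after in-loop filtering) would be stored in the decoded picture buffer so that the encoder predicts from the same pixels the decoder will see.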
  • current blocks can include any symmetric blocks (8x8, 16x16, 32x32, 64x64, 128x128, and the like) as well as any asymmetric blocks (8x4, 16x8, and the like).
  • a quadtree plus binary decision tree (QTBT) can be implemented.
  • the partition parameters of QTBT are dynamically derived to adapt to the local characteristics without transmitting any overhead.
  • a joint-classifier decision tree structure can eliminate unnecessary iterations and control the risk of false prediction.
  • LTR frame block update mode can be available as an additional option available at every leaf node of the QTBT.
  • additional syntax elements can be signaled at different hierarchy levels of the bit stream.
  • a flag can be enabled for an entire sequence by including an enable flag coded in a Sequence Parameter Set (SPS).
  • a CTU flag can be coded at the coding tree unit (CTU) level.
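  • The two signaling levels mentioned above (a sequence-wide enable flag in the SPS and a per-CTU flag) can be illustrated with the sketch below. The bit layout, the function names, and the choice to omit CTU flags when the sequence-level flag is off are assumptions for illustration; the disclosure does not define the actual syntax.

```python
from typing import List, Tuple


def write_flags(sps_enabled: bool, ctu_flags: List[bool]) -> List[int]:
    """Serialize one SPS enable flag followed by per-CTU flags.

    Assumption: CTU-level flags are only written when the SPS flag is set.
    """
    bits = [int(sps_enabled)]
    if sps_enabled:
        bits.extend(int(f) for f in ctu_flags)
    return bits


def read_flags(bits: List[int], num_ctus: int) -> Tuple[bool, List[bool]]:
    """Parse the flags back; a disabled sequence implies every CTU is disabled."""
    enabled = bool(bits[0])
    if not enabled:
        return False, [False] * num_ctus
    return True, [bool(b) for b in bits[1:1 + num_ctus]]
```

  • Gating the CTU flags on the sequence flag means a stream that never uses the tool pays only one bit for it.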
  • encoder 1000 may include circuitry configured to implement any operations as described above in reference to FIGS. 8 or 10 in any embodiment, in any order and with any degree of repetition.
  • encoder 1000 may be configured to perform a single step or sequence repeatedly until a desired or commanded outcome is achieved; repetition of a step or a sequence of steps may be performed iteratively and/or recursively using outputs of previous repetitions as inputs to subsequent repetitions, aggregating inputs and/or outputs of repetitions to produce an aggregate result, reduction or decrement of one or more variables such as global variables, and/or division of a larger processing task into a set of iteratively addressed smaller processing tasks.
  • Encoder 1000 may perform any step or sequence of steps as described in this disclosure in parallel, such as simultaneously and/or substantially simultaneously performing a step two or more times using two or more parallel threads, processor cores, or the like; division of tasks between parallel threads and/or processes may be performed according to any protocol suitable for division of tasks between iterations.
  • Persons skilled in the art upon reviewing the entirety of this disclosure, will be aware of various ways in which steps, sequences of steps, processing tasks, and/or data may be subdivided, shared, or otherwise dealt with using iteration, recursion, and/or parallel processing.
  • non-transitory computer program products may store instructions, which when executed by one or more data processors of one or more computing systems, cause at least one data processor to perform operations, and/or steps thereof described in this disclosure, including without limitation any operations described above and/or any operations decoder 900 and/or encoder 1000 may be configured to perform.
  • computer systems are also described that may include one or more data processors and memory coupled to the one or more data processors. The memory may temporarily or permanently store instructions that cause at least one processor to perform one or more of the operations described herein.
  • methods can be implemented by one or more data processors either within a single computing system or distributed among two or more computing systems.
  • Such computing systems can be connected and can exchange data and/or commands or other instructions or the like via one or more connections, including a connection over a network (e.g. the Internet, a wireless wide area network, a local area network, a wide area network, a wired network, or the like), via a direct connection between one or more of the multiple computing systems, or the like.
  • An exemplary embodiment includes an encoder, the encoder including circuitry configured to receive a bitstream, wherein the bitstream includes a current picture, the current picture including a current block of pixels with multiple geometric partition boundaries, at least a first partition boundary partitioning the block into first and second non-rectangular regions, and a second partition boundary, non-parallel to and intersecting the at least a first partition boundary, partition the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into at least three portions, determine a first predictor for use on a first side of the at least a first partition boundary using a first motion vector, wherein the first motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle, determine a second predictor as a function of a second motion vector, wherein the second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector, and encode the current block using the first motion vector and the second motion vector, wherein decoding further includes smooth
  • the encoder is further configured to encode, in the bitstream, a signal identifying a number of geometric partitioning modes. In some embodiments, the encoder is further configured to encode, in the bitstream, a signal identifying a label corresponding to the number of geometric partitioning modes.
  • the number of geometric partitioning modes may correspond to a distribution of line segment slope angles.
  • the number of geometric partitioning modes may be selected from a plurality of possible mode quantities.
  • the plurality of possible mode quantities may include at least four distinct possible mode quantities.
  • the plurality of possible mode quantities may include a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140 modes.
  • the geometric partition of the current block may be signaled in the bitstream.
  • the encoder may be further configured to encode the current block by partitioning the current block into a plurality of regions as a function of the geometric partition.
  • the bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block.
  • a current block forms part of a quadtree plus binary decision tree.
  • the current block may include a non-leaf node of the quadtree plus binary decision tree.
  • the current block may be a coding tree unit or a coding unit.
  • the current block may include a coding unit or a prediction unit.
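  • One way to picture the signaling of a label that identifies the number of geometric partitioning modes is sketched below, using the four mode quantities listed above (50, 64, 82, and 140). The 2-bit label, the ordering of the table, and the uniform spacing of slope angles are assumptions for illustration only; the disclosure does not state how labels are coded or how the angles are distributed.

```python
from typing import List

# Mode quantities named in the text; the index order is an assumption.
MODE_QUANTITIES = (50, 64, 82, 140)


def encode_label(num_modes: int) -> List[int]:
    """Map a mode quantity to a hypothetical 2-bit label (MSB first)."""
    index = MODE_QUANTITIES.index(num_modes)  # raises ValueError if unsupported
    return [(index >> 1) & 1, index & 1]


def decode_label(bits: List[int]) -> int:
    """Recover the mode quantity from the 2-bit label."""
    return MODE_QUANTITIES[(bits[0] << 1) | bits[1]]


def uniform_slope_angles(count: int) -> List[float]:
    """Evenly spaced line-segment slope angles in [0, 180) degrees.

    Assumption: one candidate angle per step; real mode tables may combine
    angles with offsets and need not be uniform.
    """
    return [i * 180.0 / count for i in range(count)]
```

  • A larger mode quantity gives finer angular resolution for the partition boundary at the cost of more bits per mode index, which is one plausible reason for making the quantity selectable.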
  • a method includes receiving, by an encoder, a bitstream, wherein the bitstream includes a current picture, the current picture including a current block of pixels with multiple geometric partition boundaries, at least a first partition boundary partitioning the block into first and second non-rectangular regions, and a second partition boundary, non-parallel to and intersecting the at least a first partition boundary, partitioning, by the encoder, the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into at least three portions, determining, by the encoder, a first predictor for use on a first side of the at least a first partition boundary using a first motion vector, wherein the first motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle, determining, by the encoder, a second predictor as a function of a second motion vector, wherein the second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector, and encoding,
  • the method may further include encoding, in the bitstream, a signal identifying a number of geometric partitioning modes.
  • the method may include encoding, in the bitstream, a signal identifying a label corresponding to a number of geometric partitioning modes.
  • the number of geometric partitioning modes may correspond to a distribution of line segment slope angles.
  • the number of geometric partitioning modes may be selected from a plurality of possible mode quantities.
  • the plurality of possible mode quantities may include at least four distinct possible mode quantities.
  • the plurality of possible mode quantities may include a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140 modes.
  • the geometric partition of the current block may be signaled in the bitstream.
  • the method may include encoding the current block by partitioning the current block into a plurality of regions as a function of the geometric partition.
  • the bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block.
  • the current block may form part of a quadtree plus binary decision tree.
  • the current block may include a non-leaf node of the quadtree plus binary decision tree.
  • the current block may include a coding tree unit or a coding unit.
  • the current block may include a coding unit or a prediction unit.
  • an encoder includes circuitry configured to receive an input video, select a current block of the input video, determine a number of geometric partitioning modes applicable to the current block, identify a geometric partition of the current block as a function of the number of geometric partitioning modes, and encode the current block as a function of the geometric partition.
  • the encoder may be configured to encode, in the bitstream, a signal identifying the number of partitioning modes.
  • the encoder may be further configured to encode, in the bitstream, a signal identifying a label corresponding to a number of partitioning modes.
  • a number of geometric partitioning modes may correspond to a distribution of line segment slope angles.
  • a number of modes may be selected from a plurality of possible mode quantities.
  • the plurality of possible mode quantities may include at least four distinct possible mode quantities.
  • the plurality of possible mode quantities may include a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140 modes.
  • the geometric partition of the current block may be signaled in the bitstream.
  • the encoder may be further configured to encode the current block by partitioning the current block into a plurality of regions as a function of the geometric partition.
  • the bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block.
  • the current block may form part of a quadtree plus binary decision tree.
  • the current block may include a non-leaf node of the quadtree plus binary decision tree.
  • the current block may include a coding tree unit or a coding unit.
  • the current block may include a coding unit or a prediction unit.
  • a method includes receiving an input video, selecting a current block of the input video, determining a number of geometric partitioning modes applicable to the current block, identifying a geometric partition of the current block as a function of the number of geometric partitioning modes, and encoding the current block as a function of the geometric partition.
  • the method may include encoding, in the bitstream, a signal identifying the number of partitioning modes.
  • the method may include encoding, in the bitstream, a signal identifying a label corresponding to a number of partitioning modes.
  • the number of geometric partitioning modes may correspond to a distribution of line segment slope angles.
  • the number of modes may be selected from a plurality of possible mode quantities.
  • the plurality of possible mode quantities may include at least four distinct possible mode quantities.
  • the plurality of possible mode quantities may include a possible mode quantity of 50 modes, a possible mode quantity of 64 modes, a possible mode quantity of 82 modes, and a possible mode quantity of 140 modes.
  • the geometric partition of the current block may be signaled in the bitstream.
  • the method may include encoding the current block by partitioning the current block into a plurality of regions as a function of the geometric partition.
  • the bitstream may include a parameter indicating whether the geometric partitioning mode is enabled for the current block.
  • the current block may form part of a quadtree plus binary decision tree.
  • the current block may include a non-leaf node of the quadtree plus binary decision tree.
  • the current block may include a coding tree unit or a coding unit.
  • the current block may include a coding unit or a prediction unit.
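  • To make the idea of partitioning a block into at least three portions with two intersecting boundaries more concrete, the sketch below labels each pixel of a block. The first boundary splits the block into regions 0 and 1, and the second boundary further splits region 1 into regions 1 and 2. Describing each boundary by a slope angle and a signed offset from the block center is an assumption of this example, as are the function names.

```python
import math
from typing import List, Tuple

Boundary = Tuple[float, float]  # (slope angle in degrees, signed offset in pixels)


def side_of_boundary(x: float, y: float, cx: float, cy: float,
                     slope_deg: float, offset: float) -> int:
    """Return 0 or 1 depending on which side of the line the pixel lies.

    The line passes at distance `offset` from (cx, cy) and has direction
    (cos t, sin t); its normal is therefore (-sin t, cos t).
    """
    t = math.radians(slope_deg)
    d = (x - cx) * -math.sin(t) + (y - cy) * math.cos(t) - offset
    return 0 if d < 0 else 1


def three_region_labels(width: int, height: int,
                        first: Boundary, second: Boundary) -> List[List[int]]:
    """Label pixels 0, 1, or 2 using two non-parallel boundaries."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            if side_of_boundary(x, y, cx, cy, *first) == 0:
                row.append(0)  # first region, left untouched by the second boundary
            else:
                row.append(1 + side_of_boundary(x, y, cx, cy, *second))
        labels.append(row)
    return labels
```

  • For an 8x8 block, `three_region_labels(8, 8, (0.0, 0.0), (90.0, 0.0))` puts the top half in region 0 and divides the bottom half between regions 1 and 2; each region could then be predicted with its own motion vector.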
  • any one or more of the aspects and embodiments described herein may be conveniently implemented using one or more machines (e.g., one or more computing devices that are utilized as a user computing device for an electronic document, one or more server devices, such as a document server, etc.) programmed according to the teachings of the present specification, as will be apparent to those of ordinary skill in the computer art.
  • Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those of ordinary skill in the software art.
  • Aspects and implementations discussed above employing software and/or software modules may also include appropriate hardware for assisting in the implementation of the machine executable instructions of the software and/or software module.
  • Such software may be a computer program product that employs a machine-readable storage medium.
  • a machine-readable storage medium may be any medium that is capable of storing and/or encoding a sequence of instructions for execution by a machine (e.g., a computing device) and that causes the machine to perform any one of the methodologies and/or embodiments described herein.
  • Examples of a machine-readable storage medium include, but are not limited to, a magnetic disk, an optical disc (e.g., CD, CD-R, DVD, DVD-R, etc.), a magneto-optical disk, a read-only memory “ROM” device, a random access memory “RAM” device, a magnetic card, an optical card, a solid-state memory device, an EPROM, an EEPROM, and any combinations thereof.
  • a machine-readable medium, as used herein, is intended to include a single medium as well as a collection of physically separate media, such as, for example, a collection of compact discs or one or more hard disk drives in combination with a computer memory.
  • a machine-readable storage medium does not include transitory forms of signal transmission.
  • Such software may also include information (e.g., data) carried as a data signal on a data carrier, such as a carrier wave.
  • machine-executable information may be included as a data-carrying signal embodied in a data carrier in which the signal encodes a sequence of instructions, or portion thereof, for execution by a machine (e.g., a computing device) and any related information (e.g., data structures and data) that causes the machine to perform any one of the methodologies and/or embodiments described herein.
  • Examples of a computing device include, but are not limited to, an electronic book reading device, a computer workstation, a terminal computer, a server computer, a handheld device (e.g., a tablet computer, a smartphone, etc.), a web appliance, a network router, a network switch, a network bridge, any machine capable of executing a sequence of instructions that specify an action to be taken by that machine, and any combinations thereof.
  • a computing device may include and/or be included in a kiosk.
  • FIG. 11 shows a diagrammatic representation of one embodiment of a computing device in the exemplary form of a computer system 1100 within which a set of instructions for causing a control system to perform any one or more of the aspects and/or methodologies of the present disclosure may be executed. It is also contemplated that multiple computing devices may be utilized to implement a specially configured set of instructions for causing one or more of the devices to perform any one or more of the aspects and/or methodologies of the present disclosure.
  • Computer system 1100 includes a processor 1104 and a memory 1108 that communicate with each other, and with other components, via a bus 1112.
  • Bus 1112 may include any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of bus architectures.
  • Processor 1104 may include any suitable processor, such as without limitation a processor incorporating logical circuitry for performing arithmetic and logical operations, such as an arithmetic and logic unit (ALU), which may be regulated with a state machine and directed by operational inputs from memory and/or sensors; processor 1104 may be organized according to Von Neumann and/or Harvard architecture as a non-limiting example.
  • Processor 1104 may include, incorporate, and/or be incorporated in, without limitation, a microcontroller, microprocessor, digital signal processor (DSP), Field Programmable Gate Array (FPGA), Complex Programmable Logic Device (CPLD), Graphical Processing Unit (GPU), general purpose GPU, Tensor Processing Unit (TPU), analog or mixed signal processor, Trusted Platform Module (TPM), a floating point unit (FPU), and/or system on a chip (SoC).
  • Memory 1108 may include various components (e.g., machine-readable media) including, but not limited to, a random-access memory component, a read only component, and any combinations thereof.
  • a basic input/output system 1116 (BIOS), including basic routines that help to transfer information between elements within computer system 1100, such as during start-up, may be stored in memory 1108.
  • Memory 1108 may also include (e.g., stored on one or more machine-readable media) instructions (e.g., software) 1120 embodying any one or more of the aspects and/or methodologies of the present disclosure.
  • memory 1108 may further include any number of program modules including, but not limited to, an operating system, one or more application programs, other program modules, program data, and any combinations thereof.
  • Computer system 1100 may also include a storage device 1124.
  • Examples of a storage device include, but are not limited to, a hard disk drive, a magnetic disk drive, an optical disc drive in combination with an optical medium, a solid-state memory device, and any combinations thereof.
  • Storage device 1124 may be connected to bus 1112 by an appropriate interface (not shown).
  • Example interfaces include, but are not limited to, SCSI, advanced technology attachment (ATA), serial ATA, universal serial bus (USB), IEEE 1394 (FIREWIRE), and any combinations thereof.
  • storage device 1124 (or one or more components thereof) may be removably interfaced with computer system 1100 (e.g., via an external port connector (not shown)).
  • storage device 1124 and an associated machine-readable medium 1128 may provide nonvolatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for computer system 1100.
  • software 1120 may reside, completely or partially, within machine-readable medium 1128. In another example, software 1120 may reside, completely or partially, within processor 1104.
  • Computer system 1100 may also include an input device 1132.
  • a user of computer system 1100 may enter commands and/or other information into computer system 1100 via input device 1132.
  • Examples of an input device 1132 include, but are not limited to, an alpha-numeric input device (e.g., a keyboard), a pointing device, a joystick, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), a cursor control device (e.g., a mouse), a touchpad, an optical scanner, a video capture device (e.g., a still camera, a video camera), a touchscreen, and any combinations thereof.
  • Input device 1132 may be interfaced to bus 1112 via any of a variety of interfaces (not shown) including, but not limited to, a serial interface, a parallel interface, a game port, a USB interface, a FIREWIRE interface, a direct interface to bus 1112, and any combinations thereof.
  • Input device 1132 may include a touch screen interface that may be a part of or separate from display 1136, discussed further below.
  • Input device 1132 may be utilized as a user selection device for selecting one or more graphical representations in a graphical interface as described above.
  • a user may also input commands and/or other information to computer system 1100 via storage device 1124 (e.g., a removable disk drive, a flash drive, etc.) and/or network interface device 1140.
  • a network interface device such as network interface device 1140, may be utilized for connecting computer system 1100 to one or more of a variety of networks, such as network 1144, and one or more remote devices 1148 connected thereto. Examples of a network interface device include, but are not limited to, a network interface card (e.g., a mobile network interface card, a LAN card), a modem, and any combination thereof.
  • Examples of a network include, but are not limited to, a wide area network (e.g., the Internet, an enterprise network), a local area network (e.g., a network associated with an office, a building, a campus or other relatively small geographic space), a telephone network, a data network associated with a telephone/voice provider (e.g., a mobile communications provider data and/or voice network), a direct connection between two computing devices, and any combinations thereof.
  • a network such as network 1144, may employ a wired and/or a wireless mode of communication. In general, any network topology may be used.
  • Information (e.g., data, software 1120, etc.) may be communicated to and/or from computer system 1100 via network interface device 1140.
  • Computer system 1100 may further include a video display adapter 1152 for communicating a displayable image to a display device, such as display device 1136.
  • Examples of a display device include, but are not limited to, a liquid crystal display (LCD), a cathode ray tube (CRT), a plasma display, a light emitting diode (LED) display, and any combinations thereof.
  • Display adapter 1152 and display device 1136 may be utilized in combination with processor 1104 to provide graphical representations of aspects of the present disclosure.
  • computer system 1100 may include one or more other peripheral output devices including, but not limited to, an audio speaker, a printer, and any combinations thereof.
  • peripheral output devices may be connected to bus 1112 via a peripheral interface 1156. Examples of a peripheral interface include, but are not limited to, a serial port, a USB connection, a FIREWIRE connection, a parallel connection, and any combinations thereof.
  • the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.”
  • a similar interpretation is also intended for lists including three or more items.
  • the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.”
  • use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an encoder comprising circuitry configured to receive a bitstream, partition the second non-rectangular region of the current block via a geometric partitioning mode to partition the current block into three portions, determine a first predictor for use on a first side of the at least a first partition boundary using a first motion vector, wherein the first motion vector extends from the first partition boundary to the second partition boundary as a function of a line segment slope angle, determine a second predictor as a function of a second motion vector, wherein the second motion vector originates at a geometric reference of the current block of pixels and extends to the first motion vector, and decode the current block using the first motion vector and the second motion vector.
PCT/US2021/047886 2020-08-28 2021-08-27 Methods and systems of adaptive geometric partitioning WO2022047117A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063072067P 2020-08-28 2020-08-28
US63/072,067 2020-08-28

Publications (1)

Publication Number Publication Date
WO2022047117A1 true WO2022047117A1 (fr) 2022-03-03

Family

ID=80354078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/047886 WO2022047117A1 (fr) 2020-08-28 2021-08-27 Methods and systems of adaptive geometric partitioning

Country Status (1)

Country Link
WO (1) WO2022047117A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170280156A1 (en) * 2006-08-02 2017-09-28 Thomson Licensing Methods and apparatus for adaptive geometric partitioning for video decoding
US20170280162A1 (en) * 2016-03-22 2017-09-28 Qualcomm Incorporated Constrained block-level optimization and signaling for video coding tools
US20200014950A1 (en) * 2017-08-22 2020-01-09 Panasonic Intellectual Property Corporation Of America Image encoder, image decoder, image encoding method, and image decoding method

Similar Documents

Publication Publication Date Title
US20210360271A1 (en) Inter prediction in exponential partitioning
JP7482536B2 (ja) Shape-adaptive discrete cosine transform for geometric partitioning with an adaptive number of regions
US11451810B2 (en) Merge candidate reorder based on global motion vector
US11102498B2 (en) Block-based adaptive resolution management
WO2020219940A1 (fr) Global motion for merge mode candidates in inter prediction
US11695922B2 (en) Inter prediction in geometric partitioning with an adaptive number of regions
WO2020219952A1 (fr) Candidates in frames with global motion
WO2020072494A1 (fr) Methods and systems of exponential partitioning
US11622105B2 (en) Adaptive block update of unavailable reference frames using explicit and implicit signaling
EP3959889A1 (fr) Adaptive motion vector prediction candidates in frames with global motion
WO2020219948A1 (fr) Selective motion vector prediction candidates in frames with global motion
WO2021211651A1 (fr) Methods and systems of video coding using reference regions
WO2022047117A1 (fr) Methods and systems of adaptive geometric partitioning
WO2021092319A1 (fr) Methods and systems for adaptive cropping
WO2022047099A1 (fr) Methods and systems of adaptive geometric partitioning
US11812044B2 (en) Signaling of global motion relative to available reference frames
US11706410B2 (en) Methods and systems for combined lossless and lossy coding
WO2022047129A1 (fr) Methods and systems for combined lossless and lossy coding
WO2023081280A1 (fr) Systems and methods for video coding using image segmentation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21862796

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21862796

Country of ref document: EP

Kind code of ref document: A1