US20220038702A1 - A method and an apparatus for encoding and decoding of digital image/video material - Google Patents

A method and an apparatus for encoding and decoding of digital image/video material

Info

Publication number
US20220038702A1
Authority
US
United States
Prior art keywords
transform
mode
block
determining
coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/275,948
Other languages
English (en)
Inventor
Jani Lainema
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy
Assigned to NOKIA TECHNOLOGIES OY (assignment of assignors interest; see document for details). Assignors: LAINEMA, JANI
Publication of US20220038702A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/18 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Definitions

  • the present solution generally relates to image/video encoding and decoding.
  • the solution relates to a method and an apparatus for adaptively selecting a transform mode for a block.
  • a video coding system may comprise an encoder that transforms an input video into a compressed representation suited for storage/transmission, and a decoder that can uncompress the compressed video representation back into a viewable form.
  • the encoder may discard some information in the original video sequence in order to represent the video in a more compact form, for example, to enable the storage/transmission of the video information at a lower bitrate than otherwise might be needed.
  • Hybrid video codecs may encode the video information in two phases. Firstly, sample values (i.e. pixel values) in a certain picture area are predicted, e.g., by motion compensation means or by spatial means. Secondly, the prediction error, i.e. the difference between the prediction block of samples and the original block of samples, is coded.
  • the video decoder reconstructs the output video by applying prediction means similar to the encoder to form a prediction representation of the sample blocks and prediction error decoding. After applying prediction and prediction error decoding means, the decoder sums up the prediction and prediction error signals to form the output video frame.
  • a method comprising determining a coding mode of a transform block, wherein a transform block comprises a set of transform coefficients; determining a shape of the transform block; determining at least one transform mode for a block based at least partly on said coding mode and said shape of the transform block; applying the determined transform mode to a set of transform coefficients to produce sample values; and adding said sample values to a block of predicted sample values.
  • an apparatus comprising means for determining a coding mode of a transform block, wherein a transform block comprises a set of transform coefficients; means for determining a shape of the transform block; means for determining at least one transform mode for a block based at least partly on said coding mode and said shape of the transform block; means for applying the determined transform mode to a set of transform coefficients to produce sample values; and means for adding said sample values to a block of predicted sample values.
  • a computer program product comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to determine a coding mode of a transform block, wherein a transform block comprises a set of transform coefficients; determine a shape of the transform block; determine at least one transform mode for a block based at least partly on said coding mode and said shape of the transform block; apply the determined transform mode to a set of transform coefficients to produce sample values; and add said sample values to a block of predicted sample values.
  • the shape of the transform block is determined by comparing the width and height of the transform block.
  • the direction is horizontal or vertical.
  • the coding mode relates to at least one of the following: a coding unit or a prediction unit is inter predicted or intra predicted; the transform block belongs to an intra slice or an inter slice; a mode of the transform; intra prediction mode of the block.
  • the shape of the transform block is determined by classifying the block into one of predefined categories.
  • two transform modes are determined for a block having two directions of different sizes.
  • the transform mode comprises discrete cosine transforms (DCT) or discrete sine transforms (DST).
  • determining the transform mode comprises selecting a discrete cosine transform (DCT) for the direction of the block with larger size and a discrete sine transform (DST) for the direction of the block with smaller size.
  • a bitstream includes a signal indicating the at least one determined transform mode.
  • the apparatus comprises at least one processor and memory including computer program code.
  • FIG. 1 shows an encoding process according to an embodiment
  • FIG. 2 shows a decoding process according to an embodiment
  • FIG. 3 shows a method according to an embodiment
  • FIG. 4 shows an example of different shapes of transform blocks
  • FIG. 5 shows a transform selection according to an embodiment
  • FIG. 6 shows an encoding process according to an embodiment
  • FIG. 7 shows an embodiment of a shape adaptive transform process
  • FIG. 8 is a flowchart illustrating a method according to an embodiment.
  • FIG. 9 shows an apparatus according to an embodiment.
  • the several embodiments will be described in the context of coding and decoding of digital image/video material.
  • the several embodiments enable selection of a well performing transform for a transform block based on the shape of the block.
  • A video codec consists of an encoder that transforms the input video into a compressed representation suited for storage/transmission and a decoder that can uncompress the compressed video representation back into a viewable form.
  • the encoder may discard some information in the original video sequence in order to represent the video in a more compact form (that is, at lower bitrate).
  • Hybrid video codecs may encode the video information in two phases.
  • pixel values in a certain picture area are predicted for example by motion compensation means (finding and indicating an area in one of the previously coded video frames that corresponds closely to the block being coded) or by spatial means (using the pixel values around the block to be coded in a specified manner).
  • the prediction error, i.e. the difference between the predicted block of pixels and the original block of pixels, is coded. This may be done by transforming the difference in pixel values using a specified transform, e.g. the Discrete Cosine Transform (DCT).
  • FIG. 1 illustrates an image to be encoded (I_n); a predicted representation of an image block (P′_n); a prediction error signal (D_n); a reconstructed prediction error signal (D′_n); a preliminary reconstructed image (I′_n); a final reconstructed image (R′_n); a transform (T) and inverse transform (T⁻¹); a quantization (Q) and inverse quantization (Q⁻¹); entropy encoding (E); a reference frame memory (RFM); inter prediction (P_inter); intra prediction (P_intra); mode selection (MS) and filtering (F).
  • video pictures are divided into coding units (CU) covering the area of the picture.
  • a CU consists of one or more prediction units (PU) defining the prediction process for the samples within the CU and one or more transform units (TU) defining the prediction error coding process for the samples in the said CU.
  • a CU may consist of a square block of samples with a size selectable from a predefined set of possible CU sizes.
  • a CU with the maximum allowed size may be named as LCU (largest coding unit) or CTU (coding tree unit), and the video picture is divided into non-overlapping CTUs.
  • a CTU can be further split into a combination of smaller CUs, e.g.
  • Each resulting CU may have at least one PU and at least one TU associated with it.
  • Each PU and TU can be further split into smaller PUs and TUs in order to increase granularity of the prediction and prediction error coding processes, respectively.
  • Each PU has prediction information associated with it defining what kind of a prediction is to be applied for the pixels within that PU (e.g. motion vector information for inter predicted PUs and intra prediction directionality information for intra predicted PUs).
  • each TU is associated with information describing the prediction error decoding process for the samples within the said TU (including e.g. DCT coefficient information).
  • the division of the image into CUs, and division of CUs into PUs and TUs may be signaled in the bitstream allowing the decoder to reproduce the intended structure of these units.
  • the decoder reconstructs the output video by applying prediction means similar to the encoder to form a predicted representation of the pixel blocks (using the motion or spatial information created by the encoder and stored in the compressed representation) and prediction error decoding (inverse operation of the prediction error coding recovering the quantized prediction error signal in spatial pixel domain). After applying prediction and prediction error decoding means the decoder sums up the prediction and prediction error signals (pixel values) to form the output video frame.
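  • The summation step described above can be sketched as follows. This is a minimal illustration, not the decoder of any particular codec: the function name `reconstruct_block`, the list-of-rows block layout, and the clipping to an assumed 8-bit sample range are all assumptions made for the example.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the decoded prediction error to the predicted samples.

    Both blocks are lists of rows of integers with the same dimensions.
    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, residual)
    ]


if __name__ == "__main__":
    predicted = [[100, 250], [30, 0]]
    error = [[-5, 10], [4, -3]]
    print(reconstruct_block(predicted, error))
```

  • Any additional in-loop filtering would be applied to the result of this summation.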
  • the decoder (and encoder) can also apply additional filtering means to improve the quality of the output video before passing it for display and/or storing it as prediction reference for the forthcoming frames in the video sequence. A decoding process according to an embodiment is illustrated in FIG. 2.
  • FIG. 2 illustrates a predicted representation of an image block (P′_n); a reconstructed prediction error signal (D′_n); a preliminary reconstructed image (I′_n); a final reconstructed image (R′_n); an inverse transform (T⁻¹); an inverse quantization (Q⁻¹); an entropy decoding (E⁻¹); a reference frame memory (RFM); a prediction (either inter or intra) (P); and filtering (F).
  • a color palette based coding can be used.
  • Palette based coding refers to a family of approaches for which a palette, i.e. a set of colors and associated indexes, is defined and the value for each sample within a coding unit is expressed by indicating its index in the palette.
  • Palette-based coding can achieve good coding efficiency in coding units with a relatively small number of colors (such as image areas which are representing computer screen content, like text or simple graphics).
  • different kinds of palette index prediction approaches can be utilized, or the palette indexes can be run-length coded to be able to represent larger homogenous image areas efficiently.
  • escape coding can be utilized. Escape coded samples are transmitted without referring to any of the palette indexes. Instead their values are indicated individually for each escape coded sample.
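  • As a rough illustration of palette indexes, run-length coding and escape coding, consider the sketch below. The `ESCAPE` marker value, the function names and the flat list of sample values are hypothetical choices for this example; real codecs define their own syntax for these elements.

```python
ESCAPE = -1  # hypothetical marker for samples that are not in the palette


def palette_encode(samples, palette):
    """Map each sample to its palette index; out-of-palette samples are escape coded."""
    index_of = {color: i for i, color in enumerate(palette)}
    indexes, escapes = [], []
    for s in samples:
        if s in index_of:
            indexes.append(index_of[s])
        else:
            indexes.append(ESCAPE)
            escapes.append(s)  # escape value is carried individually
    return indexes, escapes


def run_length(indexes):
    """Collapse consecutive identical indexes into (index, run) pairs."""
    runs = []
    for idx in indexes:
        if runs and runs[-1][0] == idx:
            runs[-1][1] += 1
        else:
            runs.append([idx, 1])
    return [tuple(r) for r in runs]


def palette_decode(indexes, escapes, palette):
    """Inverse of palette_encode."""
    escape_values = iter(escapes)
    return [next(escape_values) if i == ESCAPE else palette[i] for i in indexes]


if __name__ == "__main__":
    samples = [10, 10, 10, 200, 10, 77]
    idx, esc = palette_encode(samples, [10, 200])
    print(run_length(idx), esc)
    print(palette_decode(idx, esc, [10, 200]) == samples)
```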
  • the motion information may be indicated with motion vectors associated with each motion compensated image block in video codecs.
  • Each of these motion vectors represents the displacement between the image block in the picture to be coded (in the encoder side) or decoded (in the decoder side) and the prediction source block in one of the previously coded or decoded pictures.
  • the predicted motion vectors may be created in a predefined way, for example calculating the median of the encoded or decoded motion vectors of the adjacent blocks.
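  • The median-based predictor mentioned above can be sketched as a component-wise median over the neighbouring motion vectors. The function name and the representation of a motion vector as an `(x, y)` tuple are assumptions of this example; `statistics.median_low` is used so that the result is always one of the input component values, also for an even number of neighbours.

```python
from statistics import median_low


def median_mv_predictor(neighbor_mvs):
    """Component-wise median of the motion vectors of adjacent blocks."""
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    return (median_low(xs), median_low(ys))


if __name__ == "__main__":
    # Motion vectors of three hypothetical neighbouring blocks.
    print(median_mv_predictor([(4, -2), (6, 0), (5, 3)]))
```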
  • Another way to create motion vector predictions is to generate a list of candidate predictions from adjacent blocks and/or co-located blocks in temporal reference pictures and signaling the chosen candidate as the motion vector predictor.
  • the reference index of previously coded/decoded picture can be predicted.
  • the reference index may be predicted from adjacent blocks and/or co-located blocks in a temporal reference picture.
  • high efficiency video codecs may employ an additional motion information coding/decoding mechanism, often called merging/merge mode, where all the motion field information, which includes motion vector and corresponding reference picture index for each available reference picture list, is predicted and used without any modification/correction.
  • predicting the motion field information is carried out using the motion field information of adjacent blocks and/or co-located blocks in temporal reference pictures, and the used motion field information is signaled as an entry in a motion field candidate list filled with the motion field information of available adjacent/co-located blocks.
  • Video codecs may support motion compensated prediction from one source image (uni-prediction) and two sources (bi-prediction).
  • In the case of uni-prediction, a single motion vector is applied, whereas in the case of bi-prediction two motion vectors are signaled and the motion compensated predictions from two sources are averaged to create the final sample prediction.
  • In the case of weighted prediction, the relative weights of the two predictions can be adjusted, or a signaled offset can be added to the prediction signal.
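  • A minimal sketch of bi-prediction averaging and weighted prediction is given below. The integer weights, the rounding to the nearest integer and the function name `bi_predict` are assumptions of this example and do not reproduce the exact arithmetic of any standard.

```python
def bi_predict(pred0, pred1, w0=1, w1=1, offset=0):
    """Combine two motion compensated predictions sample by sample.

    With the default weights this is a plain rounded average (bi-prediction).
    Other weights and a non-zero offset illustrate weighted prediction.
    """
    total = w0 + w1
    return [
        (w0 * a + w1 * b + total // 2) // total + offset
        for a, b in zip(pred0, pred1)
    ]


if __name__ == "__main__":
    p0 = [100, 50]
    p1 = [104, 53]
    print(bi_predict(p0, p1))                       # plain average
    print(bi_predict(p0, p1, w0=3, w1=1, offset=2)) # weighted, with offset
```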
  • the displacement vector indicates from where in the same picture a block of samples can be copied to form a prediction of the block to be coded or decoded.
  • This kind of intra block copying method can improve the coding efficiency substantially in the presence of repeating structures within the frame, such as text or other graphics.
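  • Intra block copying can be sketched as a simple copy from the already reconstructed part of the same picture. The function name, the `(dx, dy)` displacement convention and the absence of any validity checks on the referenced area are assumptions of this example.

```python
def intra_block_copy(picture, x, y, width, height, displacement):
    """Form a prediction for the block at (x, y) by copying from the same picture.

    `picture` is a list of rows of already reconstructed samples and
    `displacement` is a (dx, dy) vector pointing to the source block.
    This sketch assumes the referenced area lies inside the picture and
    has already been reconstructed.
    """
    dx, dy = displacement
    src_x, src_y = x + dx, y + dy
    return [row[src_x:src_x + width] for row in picture[src_y:src_y + height]]


if __name__ == "__main__":
    picture = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(intra_block_copy(picture, 2, 2, 2, 2, (-2, -2)))
```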
  • the prediction residual after motion compensation or intra prediction may be first transformed with a transform kernel (like DCT) and then coded.
  • Video encoders may utilize Lagrangian cost functions to find optimal coding modes, e.g. the desired Macroblock mode and associated motion vectors.
  • This kind of cost function uses a weighting factor λ to tie together the (exact or estimated) image distortion due to lossy coding methods and the (exact or estimated) amount of information that is required to represent the pixel values in an image area:
  • $C = D + \lambda R$
  • where C is the Lagrangian cost to be minimized,
  • D is the image distortion (e.g. Mean Squared Error) with the mode and motion vectors considered, and
  • R is the number of bits needed to represent the required data to reconstruct the image block in the decoder (including the amount of data to represent the candidate motion vectors).
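  • A minimal sketch of mode selection with such a cost function is shown below. The candidate dictionaries, their field names and the numeric values are made up for illustration; the only part taken from the text is that the cost combines distortion D and rate R through the weighting factor λ.

```python
def lagrangian_cost(distortion, rate, lam):
    """Cost that ties distortion and rate together with the weighting factor lam."""
    return distortion + lam * rate


def best_mode(candidates, lam):
    """Return the candidate with the smallest Lagrangian cost."""
    return min(
        candidates,
        key=lambda c: lagrangian_cost(c["distortion"], c["rate"], lam),
    )


if __name__ == "__main__":
    # Hypothetical candidates: distortion as a squared-error sum, rate in bits.
    modes = [
        {"name": "intra", "distortion": 100, "rate": 20},
        {"name": "inter", "distortion": 60, "rate": 50},
    ]
    for lam in (1.0, 2.0):
        print(lam, best_mode(modes, lam)["name"])
```

  • A larger λ penalizes bits more heavily, so the choice shifts toward the cheaper mode even though its distortion is higher.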
  • Scalable video coding refers to a coding structure where one bitstream can contain multiple representations of the content at different bitrates, resolutions or frame rates. In these cases the receiver can extract the desired representation depending on its characteristics (e.g. the resolution that best matches the display device). Alternatively, a server or a network element can extract the portions of the bitstream to be transmitted to the receiver depending on e.g. the network characteristics or processing capabilities of the receiver.
  • a scalable bitstream may consist of a “base layer” providing the lowest quality video available and one or more enhancement layers that enhance the video quality when received and decoded together with the lower layers. In order to improve coding efficiency for the enhancement layers, the coded representation of that layer typically depends on the lower layers. E.g. the motion and mode information of the enhancement layer can be predicted from lower layers. Similarly the pixel data of the lower layers can be used to create prediction for the enhancement layer.
  • a scalable video codec for quality scalability (also known as Signal-to-Noise or SNR scalability) and/or spatial scalability may be implemented as follows.
  • For a base layer, a conventional non-scalable video encoder and decoder is used.
  • the reconstructed/decoded pictures of the base layer are included in the reference picture buffer for an enhancement layer.
  • the base layer decoded pictures may be inserted into a reference picture list(s) for coding/decoding of an enhancement layer picture similarly to the decoded reference pictures of the enhancement layer. Consequently, the encoder may choose a base-layer reference picture as inter prediction reference and indicate its use e.g.
  • the decoder decodes from the bitstream, for example from a reference picture index, that a base-layer picture is used as inter prediction reference for the enhancement layer.
  • When a base-layer picture is used as inter prediction reference for an enhancement layer, it is referred to as an inter-layer reference picture.
  • base layer information could be used to code enhancement layer to minimize the additional bitrate overhead.
  • Scalability can be enabled in two basic ways: either by introducing new coding modes for performing prediction of pixel values or syntax from lower layers of the scalable representation, or by placing the lower layer pictures in the reference picture buffer (decoded picture buffer, DPB) of the higher layer.
  • the first approach is more flexible and thus can provide better coding efficiency in most cases.
  • the second approach, reference frame-based scalability, can be implemented very efficiently with minimal changes to single layer codecs while still achieving a majority of the coding efficiency gains available.
  • a reference frame-based scalability codec can be implemented by utilizing the same hardware or software implementation for all the layers, just taking care of the DPB management by external means.
  • images can be split into independently codable and decodable image segments (slices or tiles).
  • “Slices” in this description may refer to image segments constructed of a certain number of basic coding units that are processed in default coding or decoding order, while “tiles” may refer to image segments that have been defined as rectangular image regions that are processed at least to some extent as individual frames.
  • prediction error may be transformed into the frequency domain and quantized to a desired accuracy.
  • This transformation can be done for example using a transform mode from the family of discrete cosine transforms (DCT) or discrete sine transforms (DST).
  • the received transform coefficients are inverse quantized, and the inverse of the selected transform is applied to these inverse quantized transform coefficients to recover the prediction error signal in the spatial or sample domain.
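  • The chain of transform, quantization, inverse quantization and inverse transform can be sketched in one dimension as follows. The orthonormal floating-point DCT-2 matrix and the uniform quantizer with a single step size are simplifying assumptions of this example; practical codecs use integer approximations and more elaborate quantization.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-2 matrix; row i holds the i-th basis function."""
    rows = []
    for i in range(n):
        scale = math.sqrt((1.0 if i == 0 else 2.0) / n)
        rows.append(
            [scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n)) for j in range(n)]
        )
    return rows


def forward_transform(samples, matrix):
    """Project the prediction error samples onto the basis functions."""
    return [sum(row[j] * samples[j] for j in range(len(samples))) for row in matrix]


def inverse_transform(coeffs, matrix):
    """Apply the transposed matrix, which inverts an orthonormal transform."""
    n = len(coeffs)
    return [sum(matrix[i][j] * coeffs[i] for i in range(n)) for j in range(n)]


def quantize(coeffs, step):
    """Uniform quantization of coefficients to integer levels (encoder side)."""
    return [round(c / step) for c in coeffs]


def inverse_quantize(levels, step):
    """Scale the received levels back to coefficient values (decoder side)."""
    return [level * step for level in levels]


if __name__ == "__main__":
    m = dct2_matrix(4)
    residual = [4.0, 6.0, 5.0, 3.0]
    levels = quantize(forward_transform(residual, m), step=1.0)
    print(levels)
    print(inverse_transform(inverse_quantize(levels, 1.0), m))
```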
  • the selected transform or inverse transform can be indicated in the bitstream separately for the horizontal and vertical directions. However, that may lead to an increased number of bits needed to signal the selected transform and an increased burden on the encoder side to evaluate the effectiveness of different transform alternatives.
  • The H.265/HEVC video coding standard defines two different types of transforms that are used in prediction error coding.
  • DST transform is used for 4×4 intra predicted luma blocks, while DCT transform is used for the rest of the blocks.
  • H.265 also defines a “transform skip” mode, where the sample values are not transformed into frequency domain but transmitted as quantized sample values.
  • The Versatile Video Coding (VVC) test model number 2 (VTM-2) defines a mechanism to select between different combinations of DCT-2, DCT-8 and DST-7 transform modes. The encoder can select between using DCT-2 for both the horizontal and vertical directions, or alternatively select a combination of DCT-8 and DST-7 transforms for different directions. The selection is indicated in the bitstream and the decoder is expected to perform a matching inverse transform to recover the prediction error signal.
  • Joint Exploration Test Model 6 (JEM-6) defines a mechanism for “mode-dependent non-separable transform” coding where, depending on the intra prediction mode, different transform basis functions are selected for the transform block.
  • the “adaptive multiple core transform” defines a mechanism that selects a different transform pair for horizontal and vertical directions based on the intra prediction direction.
  • the present embodiments provide a method for selecting a well performing transform for a transform block based on the shape of the block.
  • a video or image decoder operating according to the method has two alternative transforms for the horizontal and vertical direction.
  • One of the transforms is characterized by a smaller first coefficient for its first basis function, and the other transform is characterized by a larger first coefficient for its first basis function.
  • Such a pair of transforms can be defined for example using a (DST-7, DCT-2) pair.
  • the decoder is further configured to select the transform with smaller first coefficient for the first basis function for the direction of the smaller dimension of the transform block, and the transform with larger first coefficient for the first basis function for the direction of the larger dimension of the transform block.
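  • The difference between the two transforms of a (DST-7, DCT-2) pair can be checked numerically. The sketch below uses the commonly cited orthonormal floating-point definitions of the DCT-2 and DST-7 basis functions, which are an assumption here since the text does not spell them out; the function names are hypothetical.

```python
import math


def dct2_basis(n, i):
    """i-th orthonormal DCT-2 basis function of length n."""
    scale = math.sqrt((1.0 if i == 0 else 2.0) / n)
    return [scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n)) for j in range(n)]


def dst7_basis(n, i):
    """i-th orthonormal DST-7 basis function of length n."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [
        scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
        for j in range(n)
    ]


if __name__ == "__main__":
    for n in (4, 8):
        first_dct2 = dct2_basis(n, 0)[0]
        first_dst7 = dst7_basis(n, 0)[0]
        print(n, round(first_dct2, 4), round(first_dst7, 4))
```

  • Under these definitions the first DCT-2 basis function is flat, while the first DST-7 basis function starts from a smaller value and grows, so DST-7 plays the role of the transform with the smaller first coefficient.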
  • the method according to an embodiment comprises the following steps, as shown in FIG. 3 :
  • the method can be performed by a video or image decoder/encoder.
  • the above method can be implemented in different ways.
  • the order of operations can be changed or the operations can be interleaved in different ways.
  • different additional operations can be applied in different stages of the processing. For example, there may be additional filtering or other processing applied to the residual sample values before adding those to the predicted sample values.
  • Determining the coding mode of a transform block may include determining if a coding unit or a prediction unit to which the transform block belongs is inter predicted or intra predicted. It may also include determining if the transform block belongs to an intra slice or an inter slice. In addition, or instead, it may include determining the mode of the transform. For example, it may be signaled with a flag in a bitstream that a transform block is using the default transforms for the block. In case the default transform mode is used, the shape adaptive transform selection process described here may be invoked. Such an example is shown in FIG. 4.
  • FIG. 4 illustrates an example of an encoding process when signaling for a default transform is employed and when the default transform utilizes shape adaptive transform selection.
  • It is first determined whether a default transform is used. If not, horizontal and vertical transforms are signaled 420.
  • If the default transform mode is used, it is then determined whether the transform block is square 430. If yes, the same transform is used for both the horizontal and vertical directions 440. If not, different transforms are used for the horizontal and vertical directions 450.
  • a shape adaptive transform process selects a transform between DST-7 and DCT-2, while the explicit signaling of transforms is used to select between DST-7 and DCT-8 transforms.
  • it is determined 510 whether a default transform is used. If not, it is signaled 520 if a horizontal transform is DST-7 or DCT-8 or some other, and it is signaled 530 if a vertical transform is DST-7 or DCT-8 or some other.
  • If the default transform mode is used, it is then determined whether the transform block is square 540. If yes, the same transform, e.g. DCT-2, is used for both the horizontal and vertical directions 550. If not, different transforms are used for the horizontal and vertical directions 560, for example DST-7 for the direction of the shorter side and DCT-2 for the other direction.
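  • The decision flow described for steps 510 to 560 can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `select_transforms`, the string labels, and the way the explicitly signaled transforms are passed in are assumptions made for the example.

```python
def select_transforms(width, height, default_mode,
                      signaled_hor=None, signaled_ver=None):
    """Return (horizontal, vertical) transform names for one transform block.

    default_mode   -- True when the default transform is used (step 510)
    signaled_hor/ver -- transforms read from the bitstream when the default
                        mode is not used (steps 520 and 530); hypothetical API
    """
    if not default_mode:
        # Explicit signaling branch: both directions are signaled.
        if signaled_hor is None or signaled_ver is None:
            raise ValueError("explicit mode requires both signaled transforms")
        return signaled_hor, signaled_ver

    if width == height:
        # Square block (step 550): same transform in both directions.
        return "DCT-2", "DCT-2"

    # Non-square block (step 560): DST-7 along the shorter side.
    if width > height:
        return "DCT-2", "DST-7"   # vertical direction is the shorter one
    return "DST-7", "DCT-2"       # horizontal direction is the shorter one
```

  • For example, `select_transforms(16, 4, True)` yields DCT-2 horizontally and DST-7 vertically, while `select_transforms(8, 8, False, "DST-7", "DCT-8")` simply returns the signaled pair.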
  • Determination of the shape of the block can be done in different ways. It can, for example, include classifying the block in one of the three categories, also shown in FIG. 6 : a) square (width W of the block is identical to the height H of the block), b) wide (width W of the block is larger than the height H) and c) narrow (height H of the block is larger than the width W). It is appreciated that additional categories may be defined. For example, there can be separate categories for blocks with width twice the height of the block and width more than twice the height of the block.
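  • The shape classification above can be sketched as a small helper. The category labels, including the two finer "wide" categories, are hypothetical names chosen for the example; the text only says that such additional categories may be defined.

```python
def classify_shape(width, height):
    """Classify a block into the three basic categories of FIG. 6."""
    if width == height:
        return "square"
    return "wide" if width > height else "narrow"


def classify_shape_fine(width, height):
    """Variant with separate categories for width == 2*height and
    width > 2*height (labels are illustrative assumptions)."""
    if width == 2 * height:
        return "wide_2x"
    if width > 2 * height:
        return "wide_more_than_2x"
    return classify_shape(width, height)
```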
  • the shape adaptive transform selection selects between two transforms: one characterized by a smaller first coefficient for its first basis function and one characterized by a larger first coefficient for its first basis function.
  • a pair of transforms can be defined for example using transform mode comprising a (DST-7, DCT-2) pair.
  • the decoder is further configured to select the transform with smaller first coefficient for the first basis function (DST-7) for the direction of the smaller dimension of the transform block and the transform with larger first coefficient for the first basis function (DCT-2) for the direction of the larger dimension of the transform block.
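  • To make the "first coefficient of the first basis function" criterion concrete, the sketch below evaluates the first basis function of DCT-2 and DST-7. It assumes the commonly used orthonormal definitions of the two transforms, which are not spelled out in the text; the function names are illustrative.

```python
import math


def dct2_first_basis(n):
    """First (DC) basis function of an orthonormal N-point DCT-2: flat."""
    return [math.sqrt(1.0 / n)] * n


def dst7_first_basis(n):
    """First basis function of an orthonormal N-point DST-7: starts near
    zero and grows towards the far end of the block."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [scale * math.sin(math.pi * (j + 1) / (2 * n + 1))
            for j in range(n)]


# Compare the first coefficient of each first basis function.
for n in (4, 8, 16, 32):
    print(n, round(dst7_first_basis(n)[0], 4), round(dct2_first_basis(n)[0], 4))
```

  • Under these definitions the DST-7 value is the smaller one for every listed size, which matches the pairing of DST-7 with the smaller first coefficient and DCT-2 with the larger one.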
  • the shape adaptive transform process selects a transform mode between DST-7 and DCT-2.
  • DST-7 transform is selected for the shorter direction of the transform block, and DCT-2 is used for the longer direction of the transform block, or if the transform block is square.
  • DCT-2 is used for both horizontal and vertical directions if the transform block is square ( FIG. 7 , a). If the width of the block is larger than the height, as shown in FIG. 7 b), DCT-2 is used for horizontal and DST-7 is used for the vertical direction. If the height of the block is larger than the width, as shown in FIG. 7 c), DCT-2 is used for the vertical and DST-7 is used for the horizontal direction.
  • the DCT-2 and DST-7 transforms are used.
  • DCT-2, DST-7 and DCT-8 transforms are used.
  • the shape adaptivity can be enabled for some or all the transform modes that can be indicated in the bitstream.
  • the shape adaptive transform selection can be enabled conditional on an MTS_CU_flag (multiple transform selection for coding unit flag) as defined in the following pseudo-code:
  • the coding mode (inter or intra) is used.
  • Other examples include determining if the current slice is intra coded or combining different conditions.
  • the condition may include determining the intra prediction mode of the block and enabling shape adaptive transform selection for certain combinations of prediction modes or prediction directions and shapes.
  • the shape of the block is used to give priority (or shorter binarization for related syntax elements or different context for context adaptive arithmetic coding) for certain transforms.
  • If the codec is configured to select between DCT-2, DST-7 and DCT-8 transforms for both the horizontal and vertical directions, a shorter binary codeword is used for DST-7 for the shorter dimension of the block. That is, for blocks with width larger than height, the binarized codeword for selecting DST-7 for the vertical direction can be 0, whereas the binarized codewords for DCT-2 and DCT-8 can be 10 and 11 (or 11 and 10), respectively.
  • In that case, the binarized codeword for the horizontal transform could be 0 for DCT-2, and DST-7 and DCT-8 could have binarized codewords of 10 and 11, respectively.
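  • The shape-dependent binarization can be sketched as a lookup. The function name and the handling of square blocks and of blocks taller than wide are assumptions: the text only gives the codewords for blocks with width larger than height, and the sketch mirrors them for the other orientation.

```python
def transform_codewords(width, height, direction):
    """Return {transform: binarized codeword} for one direction.

    direction -- "hor" or "ver"
    The transform favored for that direction gets the 1-bit codeword.
    """
    if direction not in ("hor", "ver"):
        raise ValueError("direction must be 'hor' or 'ver'")
    along = width if direction == "hor" else height
    across = height if direction == "hor" else width

    if along < across:
        # Direction of the shorter dimension: DST-7 gets the short codeword.
        return {"DST-7": "0", "DCT-2": "10", "DCT-8": "11"}
    # Longer dimension (and, as an assumption, square blocks): DCT-2 first.
    return {"DCT-2": "0", "DST-7": "10", "DCT-8": "11"}
```

  • Both tables are prefix-free, so a decoder can read one bit and, only if it is 1, a second bit to identify the transform.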
  • the context and initial probability selection for context adaptive arithmetic coding of the syntax elements can be configured to provide the adaptivity or prioritization of specific transforms.
  • the signaling can be conditioned to the position of the last non-zero coefficient of the transform block in selected scan order.
  • the explicit signaling of the horizontal and vertical transforms can be done before transform coefficient coding in the encoder and before the transform coefficient decoding in a decoder.
  • the proposed approach of conditioning the signaling to the position of the last active transform coefficient allows the coefficient coding and decoding processes to be adapted based on the transforms selected.
  • explicit or implicit signaling of transform types for horizontal and vertical direction is performed prior to transform coefficient decoding.
  • explicit or implicit signaling of transform types for horizontal and vertical directions is performed prior to transform coefficient decoding and transform coefficient decoding is adapted based on the determined type of the transforms.
  • explicit or implicit signaling of transform types for horizontal and vertical directions is performed if the scan position of the last non-zero transform coefficient is larger than a predefined threshold.
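  • The threshold condition on the last non-zero coefficient can be sketched as below. The function names and the convention of returning -1 for an all-zero block are assumptions for illustration; the text does not specify the threshold value or what happens when the condition is not met.

```python
def last_nonzero_scan_position(coeffs_in_scan_order):
    """Index of the last non-zero coefficient in scan order, or -1 if the
    block has no non-zero coefficients."""
    for pos in range(len(coeffs_in_scan_order) - 1, -1, -1):
        if coeffs_in_scan_order[pos] != 0:
            return pos
    return -1


def transform_types_signaled(coeffs_in_scan_order, threshold):
    """True when transform types are signaled for this block, i.e. when the
    last non-zero scan position is larger than the predefined threshold."""
    return last_nonzero_scan_position(coeffs_in_scan_order) > threshold
```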
  • 8-point transform matrices for DCT-2, DCT-8 and DST-7 are defined using the following coefficients:
  • DCT-8:
       350   338   314   280   237   185   127    65
       338   237    65  -127  -280  -350  -314  -185
       314    65  -237  -350  -185   127   338   280
       280  -127  -350   -65   314   237  -185  -338
       237  -280  -185   314   127  -338   -65   350
       185  -350   127   237  -338    65   280  -314
       127  -314   338  -185   -65   280  -350   237
        65  -185   280  -338   350  -314   237  -127
  • DST-7:
        65   127   185   237   280   314   338   350
       185   314   350   280   127   -65  -237  -338
       280   338   127  -185  -350  -237    65   314
       338   185  -237  -314    65   350   127  -280
       350   -65  -338   127   314  -185  -280   237
       314  -280   -65   338  -237  -127   350  -185
       237  -350   280   -65  -185   338  -314   127
       127  -237   314  -350   338  -280   185   -65
  • 4-point transform matrices for DCT-2, DCT-8 and DST-7 are defined using the following coefficients:
  • DCT-8:
       336   296   219   117
       296     0  -296  -296
       219  -296  -117   336
       117  -296   336  -219
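  • The listed DCT-8 and DST-7 integers appear to match the usual closed-form basis functions scaled by 256·sqrt(N) and rounded to the nearest integer. That scaling is inferred from the numbers and is not stated in the text, so the sketch below should be read as a consistency check rather than as the normative derivation; the function names are illustrative.

```python
import math


def _scale(n):
    # Inferred scaling: orthonormal factor sqrt(4 / (2N + 1)) times 256 * sqrt(N).
    return 256.0 * math.sqrt(n) * math.sqrt(4.0 / (2 * n + 1))


def dst7_int(n):
    """Integer N-point DST-7 matrix; row i is the i-th basis function."""
    s = _scale(n)
    return [[round(s * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1)))
             for j in range(n)] for i in range(n)]


def dct8_int(n):
    """Integer N-point DCT-8 matrix; row i is the i-th basis function."""
    s = _scale(n)
    return [[round(s * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2)))
             for j in range(n)] for i in range(n)]


for row in dct8_int(4):
    print(row)
```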
  • FIG. 8 is a flowchart illustrating a method according to an embodiment.
  • a method comprises determining a coding mode of a transform block 810 , wherein a transform block comprises a set of transform coefficients; determining a shape of the transform block 820 ; determining at least one transform mode for a block based at least partly on said coding mode and said shape of the transform block 830 ; applying the determined transform mode to a set of transform coefficients to produce sample values 840 ; and adding said sample values to a block of predicted sample values 850 .
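  • The five steps 810 to 850 can be sketched end to end with floating-point transforms. This is an illustrative sketch under stated assumptions: orthonormal DCT-2 and DST-7 definitions are used instead of the integer matrices, shape adaptivity is applied only to intra blocks (one of several possible conditions), and all function names are hypothetical.

```python
import math


def dct2(n):
    """Orthonormal DCT-2 matrix; row i is the i-th basis function."""
    return [[math.sqrt((1.0 if i == 0 else 2.0) / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dst7(n):
    """Orthonormal DST-7 matrix; row i is the i-th basis function."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def choose_transforms(coding_mode, width, height):
    """Steps 810-830: pick (horizontal, vertical) transform matrices.
    Assumption: only intra blocks use shape adaptive selection."""
    if coding_mode != "intra":
        return dct2(width), dct2(height)
    hor = dst7(width) if width < height else dct2(width)
    ver = dst7(height) if height < width else dct2(height)
    return hor, ver


def reconstruct(coeffs, predicted, coding_mode="intra"):
    """Steps 840-850: inverse transform the coefficients, add the prediction."""
    height, width = len(coeffs), len(coeffs[0])
    t_hor, t_ver = choose_transforms(coding_mode, width, height)
    # Separable inverse transform: residual = T_ver^T * C * T_hor.
    residual = matmul(matmul(transpose(t_ver), coeffs), t_hor)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]
```

  • Because both matrices are orthonormal, forward-transforming a residual with `T_ver * X * T_hor^T` and passing the result to `reconstruct` returns the prediction plus the original residual, up to floating-point error.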
  • An apparatus according to an embodiment comprises means for performing the method. The means comprises at least one processor and a memory including computer program code, wherein the processor may further comprise processor circuitry.
  • the memory and the computer program code are configured to, with the at least one processor, cause the apparatus to perform the method of FIG. 8 according to various embodiments.
  • An example of a data processing system for an apparatus is illustrated in FIG. 9.
  • the data processing system comprises a main processing unit 100 , a memory 102 , a storage device 104 , an input device 106 , an output device 108 , and a graphics subsystem 110 , which are all connected to each other via a data bus 112 .
  • the main processing unit 100 is a conventional processing unit arranged to process data within the data processing system.
  • the main processing unit 100 may comprise or be implemented as one or more processors or processor circuitry.
  • the memory 102 , the storage device 104 , the input device 106 , and the output device 108 may include conventional components as recognized by those skilled in the art.
  • the memory 102 and storage device 104 store data in the data processing system.
  • Computer program code resides in the memory 102 for implementing, for example, the method according to the flowchart of FIG. 8.
  • the input device 106 inputs data into the system while the output device 108 receives data from the data processing system and forwards the data, for example to a display.
  • the data bus 112 is a conventional data bus and, while shown as a single line, it may be any combination of the following: a processor bus, a PCI bus, a graphical bus, an ISA bus. Accordingly, a skilled person readily recognizes that the apparatus may be any data processing device, such as a computer device, a personal computer, a server computer, a mobile phone, a smart phone or an Internet access device, for example an Internet tablet computer.
  • a device may comprise circuitry and electronics for handling, receiving and transmitting data, computer program code in a memory, and a processor that, when running the computer program code, causes the device to carry out the features of an embodiment.
  • a network device like a server may comprise circuitry and electronics for handling, receiving and transmitting data, computer program code in a memory, and a processor that, when running the computer program code, causes the network device to carry out the features of an embodiment.
  • the computer program code comprises one or more operational characteristics.
  • Said operational characteristics are being defined through configuration by said computer based on the type of said processor, wherein a system is connectable to said processor by a bus, wherein a programmable operational characteristic of the system comprises determining a coding mode of a transform block, wherein a transform block comprises a set of transform coefficients; determining a shape of the transform block; determining at least one transform mode for a block based at least partly on said coding mode and said shape of the transform block; applying the determined transform mode to a set of transform coefficients to produce sample values; and adding said sample values to a block of predicted sample values.
  • the computer program code can be a part of a computer program product that may be embodied on a non-transitory computer readable medium.
  • the computer program product may be downloadable via a communication network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
US17/275,948 2018-09-20 2019-09-13 A method and an apparatus for encoding and decoding of digital image/video material Pending US20220038702A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FI20185782 2018-09-20
FI20185782 2018-09-20
PCT/FI2019/050656 WO2020058568A1 (en) 2018-09-20 2019-09-13 A method and an apparatus for encoding and decoding of digital image/video material

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2019/050656 A-371-Of-International WO2020058568A1 (en) 2018-09-20 2019-09-13 A method and an apparatus for encoding and decoding of digital image/video material

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/435,271 Division US20240187594A1 (en) 2018-09-20 2024-02-07 Method And An Apparatus for Encoding and Decoding of Digital Image/Video Material

Publications (1)

Publication Number Publication Date
US20220038702A1 true US20220038702A1 (en) 2022-02-03

Family

ID=69888422

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/275,948 Pending US20220038702A1 (en) 2018-09-20 2019-09-13 A method and an apparatus for encoding and decoding of digital image/video material

Country Status (9)

Country Link
US (1) US20220038702A1 (zh)
EP (1) EP3854081A4 (zh)
KR (1) KR102507024B1 (zh)
CN (1) CN112889280B (zh)
BR (1) BR112021005238A2 (zh)
MX (1) MX2021003205A (zh)
PH (1) PH12021550614A1 (zh)
WO (1) WO2020058568A1 (zh)
ZA (1) ZA202102521B (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102657540B1 (ko) * 2019-03-03 2024-04-12 후아웨이 테크놀러지 컴퍼니 리미티드 변환 프로세스를 위해 사용되는 인코더, 디코더 및 대응하는 방법
US11785254B2 (en) * 2020-05-29 2023-10-10 Tencent America LLC Implicit mode dependent primary transforms

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140064361A1 (en) * 2012-09-04 2014-03-06 Qualcomm Incorporated Transform basis adjustment in scalable video coding

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8571104B2 (en) * 2007-06-15 2013-10-29 Qualcomm, Incorporated Adaptive coefficient scanning in video coding
US9641846B2 (en) * 2010-10-22 2017-05-02 Qualcomm Incorporated Adaptive scanning of transform coefficients for video coding
US10390046B2 (en) * 2011-11-07 2019-08-20 Qualcomm Incorporated Coding significant coefficient information in transform skip mode
WO2013154366A1 (ko) * 2012-04-12 2013-10-17 주식회사 팬택 블록 정보에 따른 변환 방법 및 이러한 방법을 사용하는 장치
US9813737B2 (en) * 2013-09-19 2017-11-07 Blackberry Limited Transposing a block of transform coefficients, based upon an intra-prediction mode
JP6595711B2 (ja) * 2015-12-23 2019-10-23 華為技術有限公司 階層的分割内でのブロックレベルの変換選択および黙示的シグナリングを伴う変換コーディングのための方法および装置
US10735731B2 (en) * 2016-02-12 2020-08-04 Samsung Electronics Co., Ltd. Image encoding method and apparatus, and image decoding method and apparatus
ES2710807B1 (es) * 2016-03-28 2020-03-27 Kt Corp Metodo y aparato para procesar senales de video
CN114422796A (zh) * 2016-06-24 2022-04-29 韩国电子通信研究院 用于基于变换的图像编码/解码的方法和设备
US10972733B2 (en) * 2016-07-15 2021-04-06 Qualcomm Incorporated Look-up table for enhanced multiple transform
US10764583B2 (en) * 2016-08-31 2020-09-01 Kt Corporation Method and apparatus for processing video signal
US10440394B2 (en) * 2016-09-08 2019-10-08 Google Llc Context adaptive scan order for entropy coding
KR20240017992A (ko) 2018-09-02 2024-02-08 엘지전자 주식회사 다중 변환 선택에 기반한 영상 코딩 방법 및 그 장치

Also Published As

Publication number Publication date
ZA202102521B (en) 2024-01-31
EP3854081A1 (en) 2021-07-28
KR102507024B1 (ko) 2023-03-06
WO2020058568A1 (en) 2020-03-26
PH12021550614A1 (en) 2021-10-04
EP3854081A4 (en) 2022-07-06
BR112021005238A2 (pt) 2021-06-15
CN112889280A (zh) 2021-06-01
MX2021003205A (es) 2021-05-27
CN112889280B (zh) 2023-09-26
KR20210059768A (ko) 2021-05-25

Legal Events

Date Code Title Description
AS Assignment

Owner name: NOKIA TECHNOLOGIES OY, FINLAND

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LAINEMA, JANI;REEL/FRAME:055578/0644

Effective date: 20181026

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED