CN114600463A - Video encoding and video decoding - Google Patents


Info

Publication number
CN114600463A
Authority
CN
China
Prior art keywords: prediction, intra, component, video, block
Prior art date
Legal status
Pending
Application number
CN202080074323.9A
Other languages
Chinese (zh)
Inventor
Maria Claudia Santamaria Gomez
Saverio Blasi
Marta Mrak
Ebroul Izquierdo
Current Assignee
British Broadcasting Corp
Original Assignee
British Broadcasting Corp
Priority date
Filing date
Publication date
Application filed by British Broadcasting Corp filed Critical British Broadcasting Corp
Publication of CN114600463A

Classifications

    • H04N19/593 — predictive coding involving spatial prediction techniques
    • H04N19/132 — Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/105 — Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/11 — Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H04N19/157 — Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 — Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/176 — adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/46 — Embedding additional information in the video signal during the compression process
    • H04N19/70 — characterised by syntax aspects related to video coding, e.g. related to compression standards

Abstract

Intra prediction modes are defined such that the prediction obtained according to those modes is based on the sum of a first component that depends on the group of reference samples and a second component that does not depend on the reference samples.

Description

Video encoding and video decoding
Technical Field
The present disclosure relates to video codecs. More particularly, it relates to intra prediction in video codecs.
Background
Intra-prediction consists in predicting a block of samples of a video frame using reference samples extracted from within the same frame. Such predictions may be obtained by different techniques, referred to as "modes" in conventional codec architectures.
A video compression standard is currently being developed by the Joint Video Experts Team (JVET) of the Moving Picture Experts Group (MPEG), a working group established jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). This draft standard is known as Versatile Video Coding (VVC). In the context of VVC, a frame of samples is subdivided into a plurality of blocks, called Coding Units (CUs).
In the current VVC draft specification, intra prediction may be performed using a variety of different modes. Conventional intra prediction modes include angular intra prediction, as well as prediction performed by well-known techniques such as planar or DC prediction. Angular prediction may be performed using one of a number of different modes (depending on the CU shape, which may include wide-angle extensions). In addition, when intra-predicting a block of samples, a variety of tools may be used. A cross-component linear model (CCLM) may be used to predict chroma samples from the reconstructed luma samples of the same CU. Position-dependent intra prediction combination (PDPC) may be used to combine unfiltered boundary reference samples with the prediction obtained using filtered samples. Intra sub-partitions (ISP) perform prediction and transformation independently on smaller sub-partitions of a CU.
Further, the latest VVC draft specification proposes to predict luma sample blocks using matrix-based intra prediction (MIP). MIP consists in multiplying the reference samples by a fixed matrix to obtain the prediction of the current block. Such matrices are obtained by offline training, to ensure that meaningful predictions can be produced. A number of different modes may be employed, corresponding to the use of different matrices. These matrices are derived by training a Neural Network (NN)-based approach, where the network coefficients are trained on a training set formed by various sequences of different content at different resolutions.
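The MIP idea described above can be illustrated with a small Python sketch: the reference samples are multiplied by a fixed matrix and an offset is added, then the result is clipped to the valid sample range. The matrix and offset values below are invented for illustration; they are not the trained matrices of the VVC specification.

```python
def mip_predict(refs, matrix, offset, bit_depth=10):
    """Toy matrix-based intra prediction: multiply the reference samples by a
    fixed matrix, add an offset, and clip to the valid sample range.
    The matrix/offset values used below are illustrative assumptions."""
    max_val = (1 << bit_depth) - 1
    pred = []
    for row, off in zip(matrix, offset):
        p = sum(w * r for w, r in zip(row, refs)) + off
        pred.append(min(max(p, 0), max_val))
    return pred

# 4 reference samples -> 4 predicted samples, using a diagonal toy matrix.
refs = [100, 200, 300, 400]
A = [[0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 0.5, 0], [0, 0, 0, 0.5]]
b = [50, 50, 50, 50]
pred = mip_predict(refs, A, b)   # 0.5 * refs + 50 -> [100.0, 150.0, 200.0, 250.0]
```

In the real codec, different modes correspond to different pre-trained matrices, and the reference samples are first reduced (downsampled) before the multiplication.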
Drawings
FIG. 1 is a schematic diagram generally illustrating an encoding method according to embodiments described herein;
FIG. 2 is a schematic diagram illustrating the mathematical operations on which intra prediction is based, according to an embodiment;
fig. 3 is a schematic diagram of a communication network according to an embodiment;
FIG. 4 is a schematic diagram of a transmitter of the communication network of FIG. 3;
FIG. 5 is a schematic diagram illustrating an encoder implemented on the transmitter of FIG. 4;
FIG. 6 is a flow diagram of a prediction process performed at a prediction module of the encoder of FIG. 5;
FIG. 7 is a schematic diagram of a receiver of the communication network of FIG. 3;
fig. 8 is a schematic diagram illustrating a decoder implemented on the receiver of fig. 7; and
Fig. 9 is a flow diagram of a prediction process performed at a prediction module of the decoder of fig. 8.
Detailed Description
Aspects of the disclosure may correspond to the subject matter of the appended claims.
Neural Networks (NNs) and other complex learning-based techniques can be considered black boxes, because the learned models are often difficult to interpret. In the aspects disclosed herein, an NN-based intra prediction method is analysed to develop an understanding of the operation of the black box. The purpose of this analysis is to arrive at a simplified, transparent method that can achieve results similar to those of the NN-based method.
FIG. 1 conceptually illustrates a method that may be undertaken in accordance with embodiments disclosed herein.
In particular, the prediction may be obtained by manipulating the reference samples. Different "modes" may be used to generate the prediction of a block, where each mode uses different parameters.
In one aspect of the disclosure, such manipulation of a given mode includes adding two components, one component dependent on the reference sample and one component independent of the reference sample.
The prediction of sample i according to mode k (where m is the number of reference samples) can be expressed, in normalised form, as:

p'_i(k) = Σ_{j=1..m} α_{i,j}(k)·r'_j + β_i(k)

where r'_j = (r_j − 512)/512 and p'_i = (p_i − 512)/512 denote samples normalised to the range [-1, 1], and the weights α_{i,j}(k) and offsets β_i(k) are likewise in the range [-1, 1], i.e.:

-1 ≤ α_{i,j}(k) ≤ 1

and

-1 ≤ β_i(k) ≤ 1

Thus, the sample-wise prediction in the range [0, 1023] can be expressed as:

p_i(k) = Σ_{j=1..m} α_{i,j}(k)·r_j + 512·(1 − Σ_{j=1..m} α_{i,j}(k) + β_i(k))

The second term, 512·(1 − Σ_j α_{i,j}(k) + β_i(k)), may be considered a "bias" term. In this case, if Σ_j α_{i,j}(k) is equal or close to 1, the "bias" term depends primarily on β. Otherwise, the "bias" term depends primarily on α.
In the above expression, k identifies the mode among the various possible modes, each mode being associated with a set of parameters. The value 512 is merely an example and may depend on the bit depth of the input signal. Other values may also be used.
The above are examples of functions that may be used to predict samples in a prediction block. In general, the prediction for a given sample can be taken as the sum of two components as shown below:
p(k)=f(r,α(k))+g(α(k),β(k))
also in this expression, k represents a set of possible parameters in various possible patterns, each pattern identifying a set of possible parameters. The above represents a prediction calculated as the sum of a reference sample r dependent component and a reference sample independent component.
Fig. 2 illustrates these mathematical operations as a data process.
As an example, the reference-sample-dependent component of the prediction of each sample may be obtained by defining a set of weights. Each weight is multiplied by a given reference sample; the results of these multiplications are then added together to form the first, reference-sample-dependent component of the prediction.
As an example, the characteristics of the weights may be controlled by the position of the sample within the prediction block. For example, the sum of the weights may depend on the distance of each predicted sample from the reference samples. As an example, information on the position of the sample within the prediction block may be used to derive the weights.
As an example, the reference sample-independent component of the prediction for each sample may depend on various parameters. It may depend on the weight of the first component dependent on the reference sample used to compute the prediction. It may also depend on a fixed parameter independent of the weight of the first component dependent on the reference sample used to calculate the prediction. It can be obtained by a combination of the two.
As another example, the reference-sample-independent component of the prediction of each sample may be obtained based on the current mode used to predict the block, or it may depend on other characteristics of the current block (such as its width or height), and/or it may depend on characteristics of previously decoded blocks, such as their prediction mode or their size.
As another example, the fixed parameters may be extracted from a look-up table (LUT), where various LUTs may be defined. An index may be signaled in the bitstream to reference a particular entry in the LUT. As another example, the correct element in the LUT may depend on the current mode used to predict the block, or it may depend on other characteristics of the current block (such as its width or height), and/or it may depend on characteristics of previously decoded blocks, such as their prediction modes or their sizes.
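A minimal sketch of such a LUT mechanism follows. The table names, the size threshold, and all parameter values are invented for illustration; only the mechanism (select a table from block characteristics, then an entry from a signalled index) reflects the text above.

```python
# Hypothetical look-up tables of fixed "bias" parameters. Which table applies
# depends here on the block size; the entry is chosen by an index signalled
# in the bitstream. All names and values are illustrative assumptions.
BIAS_LUT = {
    "small": [0.0, 0.05, -0.05, 0.1],   # e.g. blocks of at most 64 samples
    "large": [0.0, 0.02, -0.02, 0.04],  # e.g. larger blocks
}

def fixed_bias(signalled_index, block_width, block_height):
    """Select the LUT from the current block's size, then the entry from the
    index extracted from the bitstream."""
    table = BIAS_LUT["small"] if block_width * block_height <= 64 else BIAS_LUT["large"]
    return table[signalled_index]
```

An encoder and decoder sharing these tables need only transmit the small index, not the parameter value itself.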
As another example, the reference sample-independent component of the prediction for each sample may be obtained based on a learning mechanism that occurs during decoding.
As another example, the reference sample-independent component of the prediction for each sample may be obtained based on parameters extracted from the bitstream. As an example, it may rely on weights dependent on the first component of the reference sample for calculating the prediction, wherein such weights may be extracted from the bitstream. It may also depend on fixed parameters independent of the weights dependent on the first component of the reference sample used for calculating the prediction, wherein such fixed parameters may be extracted from the bitstream. It can be obtained by a combination of the two.
As an example, the weights or fixed parameters may be obtained based on a derivation process performed at the decoder. Alternatively, they may be calculated based on information extracted from the bitstream and based on a derivation process performed at the decoder.
As another example, the derivation process may rely on an analysis of the sum of the weights. For example, where the sum of the weights is equal or close to the value 1, the reference-sample-independent component of the prediction of each sample may be obtained solely or mainly from fixed parameters; conversely, where the sum of the weights is not close to the value 1, it may be obtained solely or mainly from the weights.
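This weight-sum test can be sketched as follows; the tolerance value and the exact fall-back rule are assumptions for illustration.

```python
def derive_bias(weights, fixed_param, tol=1e-3, half=512):
    """Sketch of the derivation process: if the weights sum to (approximately)
    1, take the reference-independent component from the fixed parameter only;
    otherwise derive it mainly from the weights themselves."""
    s = sum(weights)
    if abs(s - 1.0) <= tol:
        return half * fixed_param   # bias governed by the fixed parameter
    return half * (1.0 - s)         # bias governed by the weights

b1 = derive_bias([0.5, 0.5], 0.1)   # weights sum to 1   -> 512 * 0.1
b2 = derive_bias([0.1, 0.1], 0.1)   # weights sum to 0.2 -> 512 * 0.8
```

Because the decoder can perform this analysis itself, the choice between the two sources of the bias need not be signalled explicitly.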
As another example, the reference-sample-independent component of the prediction of each sample may be derived by extracting the magnitude of that component from the bitstream. As another example, it may be derived by extracting its sign (i.e., whether the value of the component is greater than or equal to zero) from the bitstream.
The two components of the prediction (i.e., the reference-sample-dependent and reference-sample-independent components) may be used in combination, or the prediction may rely exclusively on one or the other of them. Each of these components may be used in conjunction with other intra prediction methods or separately. For example, an angular prediction mode may be used for a block, and the result of such prediction may then be added to the reference-sample-independent component to obtain the final prediction for the block.
The use of any of these techniques can be signaled in the bitstream as a new set of distinct modes. The signaling may depend on whether a flag indicating the use of these new modes is present in the bitstream. This signaling may make use of a list of Most Probable Modes (MPMs) constructed for the current block depending on whether previously decoded blocks used a particular intra prediction mode.
Other aspects of the disclosure may be determined by the following claims.
An implementation of a communications network embodying the above aspects of the present disclosure will now be described.
As illustrated in fig. 3, the arrangement comprises an illustrative video communication network 10 in which a transmitter 20 and a receiver 30 communicate via a communication channel 40. In practice, the communication channel 40 may comprise a satellite communication channel, a cable network, a terrestrial-based radio broadcast network, a communication channel implemented over a public switched telephone network, such as for providing internet services to homes and small businesses, a fiber optic communication system, or a combination of any of the above and any other conceivable communication media.
The disclosure also extends to communication by means of a physically transported storage medium having stored thereon a machine-readable record of the encoded bitstream, for delivery to a suitably configured receiver capable of reading the medium and retrieving the bitstream therefrom. An example of this is the provision of a Digital Versatile Disc (DVD) or equivalent. The following description focuses on signal transmission, such as through an electronic or electromagnetic signal carrier, but should not be construed to exclude the aforementioned methods involving storage media.
As shown in fig. 4, the transmitter 20 is a computer device in structure and function. It may share certain features with a general purpose computer device, but some features may be implementation specific given the specific functionality of the transmitter 20. The reader will understand which features may be of a general type and which may require special configuration for the video transmitter.
Thus, transmitter 20 includes a Graphics Processing Unit (GPU)202 configured for particular use in processing graphics and similar operations. The transmitter 20 also includes one or more other processors 204, which are typically provided or configured for other purposes such as mathematical operations, audio processing, managing communication channels, and the like.
The input interface 206 provides a means for receiving user input actions. Such user input actions may be caused, for example, by user interaction with a particular input unit comprising one or more control buttons and/or switches, a keyboard, mouse or other pointing device, a speech recognition unit capable of receiving speech and processing it into control commands, a signal processor or remote control receiver configured to receive and control a process from another device such as a tablet or smartphone. This list is to be understood as non-exhaustive and the reader may envision other forms of input, whether user initiated or automatic.
Similarly, output interface 214 may be operable to provide a means for outputting a signal to a user or another device. Such output may include a display signal for driving a local Video Display Unit (VDU) or any other device.
Communication interface 208 enables communication channels, whether broadcast or end-to-end, with one or more signal recipients. In the context of the present embodiment, the communication interface is configured to transmit a signal carrying a bitstream defining the video signal encoded by the transmitter 20.
In operation of the encoder, the processor 204, and in particular for the benefit of this disclosure, the GPU 202 is operable to execute computer programs. In doing so, the data storage provided by the mass storage device 208 is employed, the mass storage device 208 being implemented to provide large-scale data storage, although relatively slow in access speed, and will in practice store computer programs and, in the present context, video presentation data in preparation for performing the encoding process.
A Read Only Memory (ROM)210 is preconfigured with executable programs designed to provide the core functionality of the transmitter 20, and a Random Access Memory (RAM)212 is provided for fast access and storage of data and program instructions during execution of the computer program.
The function of the transmitter 20 will now be described with reference to fig. 5. Fig. 5 shows a processing pipeline performed by an encoder implemented on transmitter 20 through executable instructions on a data file representing a video presentation comprising a plurality of frames for sequential display as a sequence of pictures.
The data file may also include audio playback information accompanying the video presentation, as well as further supplemental information such as electronic program guide information, subtitles, or metadata to enable cataloging of the presentation. The processing of these aspects of the data file is not relevant to the present disclosure.
Referring to fig. 5, a current picture or frame in an image sequence is passed to a partitioning module 230, where it is partitioned into rectangular blocks of a given size for processing by an encoder. The processing may be sequential or parallel. The method may depend on the processing power of a particular implementation.
Each block is then input to a prediction module 232, and the prediction module 232 seeks to discard the temporal and spatial redundancies present in the sequence and to use the previously encoded content to obtain a prediction signal. Information capable of calculating such a prediction is encoded in the bitstream. This information should be sufficient to enable calculations, including the possibility of deriving other information at the receiver that is needed to complete the prediction.
The prediction signal is subtracted from the original signal to obtain a residual signal. The residual signal is then input to a transform module 234, which attempts to further reduce the spatial redundancy within the block by using a more suitable data representation. The reader will note that in some embodiments the domain transform is an optional stage and may be omitted entirely. Whether a domain transform is applied may be signaled in the bitstream.
The resulting signal is then typically quantized by a quantization module 236, and the resulting data, formed by the coefficients and the information needed to calculate the prediction of the current block, is finally input to an entropy coding module 238, which exploits statistical redundancy to represent the signal in a compact form by means of short binary codes. Again, the reader will note that entropy coding may be an optional feature in some embodiments and may be omitted entirely in some cases. The use of entropy coding, and information enabling its decoding, such as the entropy coding mode (e.g., Huffman coding) and/or a codebook index, can be signaled in the bitstream.
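The encoder path just described (predict, subtract, transform, quantize) can be condensed into a toy sketch. The transform stage is left as the identity here, matching the note that it is optional; the quantization step size and all sample values are illustrative.

```python
def encode_block(block, prediction, qstep=8):
    """Simplified encoder path for one block: subtract the prediction to form
    the residual, keep it in the spatial domain (identity transform), and
    quantize with step qstep. Entropy coding is omitted from this sketch."""
    return [[round((b - p) / qstep) for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]

block = [[108, 116], [124, 140]]       # original 2x2 block
pred  = [[100, 100], [100, 100]]       # prediction from the prediction module
levels = encode_block(block, pred)     # quantized residual: [[1, 2], [3, 5]]
```

The quantized levels, together with the information needed to recompute the prediction, are what the entropy coder would then pack into the bitstream.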
By repeated actions of the encoding means of the transmitter 20, a bit stream of block information elements may be constructed for transmission to one or more receivers as the case may be. The bitstream may also carry information elements that are applied across multiple block information elements and, therefore, remain in the bitstream syntax independent of the block information elements. Examples of such information elements include configuration options, parameters applicable to the sequence of frames, and parameters related to the video presentation as a whole.
The prediction module 232 will now be described in more detail with reference to fig. 6. It is to be understood that this is merely an example and that other methods are also contemplated as being within the scope of the present disclosure and the appended claims.
The prediction module 232 is configured to determine, for a given block partitioned from a frame, whether intra-prediction is to be employed, and if so, which of a plurality of predetermined intra-prediction modes is to be used. The prediction module then applies the selected intra prediction mode (if applicable) and then determines a prediction based on which a residual may then be generated as described previously. The prediction employed is signaled in the bitstream for receipt and interpretation by a suitably configured decoder.
The process performed at the prediction module 232 is illustrated in fig. 6.
Fig. 6 illustrates a method for establishing which of a predetermined selection of intra-prediction modes to use for a particular block of a frame of video data with reference to a specified set of reference samples, in accordance with the described embodiments.
In step S1-2, candidate predictions are developed based on a library of intra-prediction modes. These include conventional intra-prediction modes, such as those found in earlier video coding techniques or in earlier drafts of the VVC specification. The library also includes one or more intra-prediction modes developed as models of NN (or other machine learning) approaches to intra prediction. That is, based on training data, the NN identifies suitable intra-prediction modes, which may then be modelled as described above.
In general, such modes produce an intra prediction comprising two components (i.e., a reference-sample-dependent component and a reference-sample-independent component), which may be used in combination, or the prediction may rely exclusively on one or the other of the two.
Then, in step S1-4, one of the modes is selected based on a score, such as the compression ratio achievable with each mode. For the selected mode, a residual is generated, the residual comprising data enabling reconstruction of the block from the residual and equivalent data of the reference block.
Once the residuals are calculated, they are signaled on the bitstream in step S1-8.
Finally, if needed, the selected mode is signaled on the bitstream in step S1-10. Note that in some cases the mode selection may be implicit and need not be signaled. Various methods of signaling modes have been discussed in the context of existing video coding standards and the draft VVC standard; the precise method of signaling is outside the scope of the present disclosure.
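The score-based selection of step S1-4 can be sketched as follows. The sum of absolute differences (SAD) is used here as a simple stand-in for whatever rate-distortion or compression-ratio score a real encoder would compute; the mode names and sample values are invented.

```python
def select_mode(block, modes, cost):
    """Pick the intra mode with the lowest cost score for this block.
    'modes' maps a mode name to the prediction that mode would produce."""
    return min(modes, key=lambda m: cost(block, modes[m]))

block = [50, 52, 54, 56]
modes = {
    "dc":      [53, 53, 53, 53],   # flat prediction
    "angular": [50, 52, 54, 58],   # directional prediction
}
# Sum of absolute differences as a stand-in for a real rate-distortion score.
sad = lambda b, p: sum(abs(x - y) for x, y in zip(b, p))
best = select_mode(block, modes, sad)   # "angular" (SAD 2 vs 8)
```

In practice the candidate set would also include the modelled NN-derived modes described above, evaluated with the same score.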
The architecture of the receiver is illustrated in fig. 7. It has elements that are computer-implemented devices. Thus, receiver 30 includes a GPU 302 configured for handling particular uses in graphics and similar operations. The receiver 30 also includes one or more other processors 304 that are typically provided or configured for other purposes such as mathematical operations, audio processing, managing communication channels, and the like.
As the reader will appreciate, receiver 30 may be implemented in the form of a set-top box, a handheld personal electronic device, a personal computer, or any other device suitable for playback of a video presentation.
The input interface 306 provides a means for receiving user input actions. Such user input actions may be caused, for example, by user interaction with a particular input unit comprising one or more control buttons and/or switches, a keyboard, mouse or other pointing device, a speech recognition unit capable of receiving speech and processing it into control commands, a signal processor or remote control receiver configured to receive and control a process from another device such as a tablet or smartphone. This list is to be understood as non-exhaustive and the reader may envision other forms of input, whether user initiated or automatic.
Similarly, output interface 314 is operable to provide a means for outputting a signal to a user or another device. Such output may include a television signal of a suitable format for driving the local television apparatus.
The communication interface 308 enables a communication channel, whether broadcast or end-to-end, with one or more signal sources. In the context of the present embodiment, the communication interface is configured to receive a signal carrying a bitstream defining the video signal to be decoded by the receiver 30.
In operation of the receiver, the processor 304, and in particular for the benefit of this disclosure, the GPU 302 is operable to execute computer programs. In doing so, a data storage provided by the mass storage device 308 is employed, the mass storage device 308 being implemented to provide mass data storage, although the access speed is relatively slow, and will in practice store computer programs and, in the present context, video presentation data resulting from execution of the receiving process.
The ROM 310 is preconfigured with executable programs designed to provide the core functionality of the receiver 30, and the RAM 312 is provided for fast access and storage of data and program instructions during execution of the computer program.
The function of the receiver 30 will now be described with reference to fig. 8. Fig. 8 shows a processing pipeline, performed by a decoder implemented on the receiver 30 by means of executable instructions, on a bitstream received at the receiver 30, the bitstream including structured information from which a video presentation can be derived, including a reconstruction of frames encoded by the encoder function of the transmitter 20.
The decoding process illustrated in fig. 8 is intended to reverse the process performed at the encoder. The reader will understand that this does not mean that the decoding process is the exact inverse of the encoding process.
The received bitstream includes a series of encoded information elements, each element being associated with a block. A block information element is decoded in the entropy decoding module 330 to obtain a block of coefficients and the information needed to compute the prediction for the current block. The block of coefficients is typically dequantized in the dequantization module 332 and typically inverse transformed to the spatial domain by the transform module 334.
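The dequantization and inverse-transform stages described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and a 1-D floating-point DCT is used for brevity, whereas a practical codec such as VVC employs fixed-point 2-D integer transforms applied to rows and columns.

```python
import math

def dequantize(levels, qstep):
    # Inverse scalar quantization: scale each decoded level by the
    # quantization step to recover approximate transform coefficients.
    return [level * qstep for level in levels]

def idct_1d(coeffs):
    # Orthonormal inverse DCT-II over one row/column of coefficients.
    n = len(coeffs)
    out = []
    for x in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(s)
    return out

# A DC-only coefficient row inverse-transforms to a flat residual signal.
residual = idct_1d(dequantize([1, 0, 0, 0], qstep=2.0))  # → [1.0, 1.0, 1.0, 1.0]
```

In a real decoder, a 2-D transform is obtained by applying such a 1-D transform first to the rows and then to the columns of the dequantized coefficient block.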
As described above, the reader will recognize that entropy decoding, dequantization and inverse transformation need only be employed at the receiver if entropy coding, quantization and transformation, respectively, are employed at the transmitter.
As previously described, the prediction signal is generated by the prediction module 336 from previously decoded samples of the current or a previous frame, using information decoded from the bitstream. Then, in a reconstruction block 338, the reconstruction of the original image block is derived from the decoded residual signal and the calculated prediction block. The prediction module 336 responds to information on the bitstream signalling the use of intra-prediction and, if such information is present, reads from the bitstream the information that enables the decoder to determine which intra-prediction mode has been employed and, therefore, which prediction technique should be employed in the reconstruction of the block information samples.
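The reconstruction step performed in block 338 amounts to adding the decoded residual to the computed prediction and clipping each sample to the valid range. A minimal sketch, with hypothetical names, assuming 8-bit samples:

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    # Reconstruction: add the decoded residual to the prediction and
    # clip each sample to the valid range for the bit depth.
    lo, hi = 0, (1 << bit_depth) - 1
    return [[max(lo, min(hi, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]

pred = [[100, 102], [101, 103]]
res = [[-5, 200], [3, -1]]
rec = reconstruct_block(pred, res)  # → [[95, 255], [104, 102]]; 302 clips to 255
```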
By repeated action of the decoding function on consecutively received block information elements, the image blocks may be reconstructed into frames, which may then be combined to produce a video presentation for playback.
An exemplary decoder algorithm that complements the previously described encoder algorithm is illustrated in fig. 9. Essentially, the process is conventional in structure: in step S2-2, the residual is read from the bitstream; in step S2-4, the employed intra-prediction mode is read from the bitstream; then, in step S2-6, the block is reconstructed based on the signalled intra-prediction mode.
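The steps S2-2, S2-4 and S2-6 above can be sketched as a simple dispatch. All names are hypothetical, and the "bitstream" is represented here as a pre-parsed dictionary rather than an entropy-coded stream:

```python
def decode_block(parsed_stream, predictors):
    # S2-2: read the residual for the current block from the bitstream.
    residual = parsed_stream["residual"]
    # S2-4: read the signalled intra-prediction mode.
    mode = parsed_stream["intra_mode"]
    # S2-6: reconstruct the block from the prediction for that mode.
    prediction = predictors[mode]()
    return [p + r for p, r in zip(prediction, residual)]

# Toy example: a flat DC prediction plus a small decoded residual.
stream = {"residual": [1, -1, 0, 2], "intra_mode": "DC"}
modes = {"DC": lambda: [128, 128, 128, 128]}
block = decode_block(stream, modes)  # → [129, 127, 128, 130]
```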
This approach is unique in the nature of the intra-prediction modes available. That is, alongside (or, in some embodiments, instead of) conventional intra-prediction modes, modes are provided that are defined in the context of models developed by machine learning.
As previously described, the decoder function of the receiver 30 extracts from the bitstream a series of block information elements, defining block information and accompanying configuration information encoded by the encoder means of the transmitter 20.
Typically, the decoder constructs a prediction of the current block using information from previous predictions. In doing so, the decoder may combine knowledge from inter-prediction (i.e., from a previous frame) and intra-prediction (i.e., from another block in the same frame). The present embodiments relate to the implementation of intra-prediction, and in particular to the manner in which intra-prediction modes are implemented.
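The two-component intra-prediction underlying the claims can be illustrated numerically: the first component is a weighted combination of reference samples, while the second component is an offset independent of them. The names, weights and offset below are illustrative assumptions, not the patented parameterization; in the embodiments, such parameters may vary per mode and per sample position.

```python
def two_component_prediction(refs, weights, offset):
    # First component: weighted combination of the reference samples.
    first = sum(w * r for w, r in zip(weights, refs))
    # Second component: a fixed offset, independent of the references.
    return first + offset

refs = [100, 104, 96]        # neighbouring reference samples (assumed values)
weights = [0.5, 0.25, 0.25]  # per-sample weights (may depend on position)
pred = two_component_prediction(refs, weights, offset=4)  # → 104.0
```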
As the reader will see, at the decoder side, the embodiments described herein may simplify the decoding process beyond the arrangements proposed in the current VVC draft specification and the proposals submitted for modifications thereto.
It will be understood that the present invention is not limited to the embodiments described above, and that various modifications and improvements may be made without departing from the concepts described herein. Any feature may be used alone or in combination with any other feature except where mutually exclusive, and the disclosure extends to and includes all combinations and subcombinations of one or more features described herein.

Claims (27)

1. A video encoder operable to encode a block of video sample data relative to a block of reference samples, the block of video sample data comprising a plurality of samples, the video encoder comprising:
intra-prediction means for encoding the block using intra-prediction;
mode selection means operable to select, from a plurality of available predetermined intra-prediction modes, a selected intra-prediction mode for use by the intra-prediction means, at least one of the available modes being such that the intra-prediction means, configured to operate in that mode, is operable to obtain a prediction of samples in the block as a combination of a first component that is dependent on the reference samples and a second component that is independent of the reference samples.
2. The video encoder of claim 1, wherein the combination comprises a sum of the first component and the second component.
3. The video encoder of claim 1 or claim 2, wherein each intra prediction mode has associated therewith a set of parameters controlling the first component and/or the second component of the sample prediction, wherein the parameters comprise a set of weights used as multipliers of reference samples to produce the first component of the prediction.
4. The video encoder of claim 3, wherein the weights are further used to generate the second component of the prediction.
5. The video encoder of any preceding claim, wherein each intra-prediction mode has associated therewith a set of parameters that control the first component and/or the second component of the sample prediction, wherein the parameters comprise fixed parameters, the second component being generated based on the fixed parameters.
6. The video encoder according to any of the preceding claims, wherein the way in which the second component is calculated depends on the characteristics of the block and/or neighboring blocks that have been previously encoded.
7. The video encoder of any preceding claim, comprising a mode signaller operable to place a mode information element on an output bitstream to enable identification of the intra-prediction mode for a particular encoding at a receiver.
8. A method of encoding a block of video sample data relative to a block of reference samples, the block of video sample data comprising a plurality of samples, the method comprising encoding the block using intra-prediction, the method further comprising selecting, from a plurality of available predetermined intra-prediction modes, an intra-prediction mode for use in the intra-prediction, at least one of the plurality of available predetermined intra-prediction modes being such that encoding using intra-prediction obtains a prediction of samples in the block as a combination of a first component that is dependent on the reference samples and a second component that is independent of the reference samples.
9. A video decoder operable to decode encoded video data, the decoder comprising intra-prediction reconstruction means operable to reconstruct samples of a block of video sample data, relative to a block of reference samples, from input intra-prediction data, the intra-prediction reconstruction means being operable in one of a plurality of available predetermined intra-prediction modes such that a corresponding intra-prediction means configured to operate in that mode is operable to obtain a prediction of samples in the block as a combination of a first component that is dependent on the reference samples and a second component that is independent of the reference samples.
10. The video decoder of claim 9, comprising a mode parser operable to compute a list of most probable modes (MPMs) indicating which candidate intra-prediction modes may have been employed, and to decode from an input bitstream a flag indicating whether the mode in which the received encoded video data is encoded is included in the MPM list; if the flag indicates that the employed mode is included in the MPM list, an index is decoded from the input bitstream to determine which mode in the MPM list has been employed, and if the flag indicates that the employed mode is not included in the MPM list, an information element is decoded from the input bitstream to enable identification of which of the remaining modes is employed when decoding the received encoded video data into reconstructed blocks of video sample data.
11. The video decoder of claim 9 or claim 10, wherein the first component is obtained according to one of a set of intra-prediction modes comprising a planar intra-prediction mode, a DC intra-prediction mode, or an angular (directional) intra-prediction mode.
12. The video decoder of any of claims 9 to 11, the video decoder operable to determine, from received encoded video data, an intra-prediction mode to be used for reconstructing a block of video sample data.
13. The video decoder of any of claims 9 to 12, the video decoder being operable to receive a mode identifier, based on which an intra-prediction mode to be used for reconstructing a block of video sample data is determined.
14. The video decoder of any of claims 9 to 13, wherein the combination comprises a sum of the first component and the second component.
15. The video decoder of any of claims 9 to 14, wherein each intra-prediction mode has associated therewith a set of parameters that control the first component and/or the second component of the sample prediction, wherein the parameters comprise a set of weights that are used as multipliers of reference samples to produce the first component of the prediction.
16. The video decoder of claim 15, wherein the weights are further used to generate the second component of the prediction.
17. The video decoder of claim 15 or claim 16, wherein the weights are determined based on a position of the prediction sample within the block or a minimum distance of the position to the reference sample position.
18. The video decoder of any of claims 15 to 17, wherein the parameter comprises a fixed parameter, the second component being generated based on the fixed parameter.
19. The video decoder of claim 18, comprising a mode signal detector operable to detect a characteristic of the fixed parameter on an input bitstream comprising information on a magnitude of the fixed parameter or information on a sign of the fixed parameter.
20. The video decoder of claim 18 or claim 19, wherein the fixed parameter is determined based on a position of a prediction sample within the block or a minimum distance of the position to the reference sample position.
21. The video decoder according to any of claims 9 to 20, wherein the manner in which the second component is calculated depends on the characteristics of the block and/or neighboring blocks that have been previously decoded.
22. The video decoder of any of claims 9 to 21, comprising a mode signal detector operable to detect a mode information element on an input bitstream to enable identification of an intra prediction mode for a particular encoding.
23. A video decoder operable to decode encoded video data, the decoder comprising intra-prediction reconstruction means operable to reconstruct samples of a block of video sample data, relative to a block of reference samples, from input intra-prediction data, the intra-prediction reconstruction means being operable in one of a plurality of available predetermined intra-prediction modes such that a corresponding intra-prediction means configured to operate in that mode is operable to scale prediction samples obtained in dependence on the reference samples, wherein the scaling operates in accordance with a parameter that is dependent on the position of the sample in the block.
24. A method of decoding encoded video data, the method comprising reconstructing samples of a block of video sample data, relative to a block of reference samples, from input intra-prediction data in one of a plurality of predetermined intra-prediction modes, at least one of which is such that a corresponding intra-prediction method configured to operate in that mode obtains a prediction of samples in the block as a combination of a first component that is dependent on the reference samples and a second component that is independent of the reference samples.
25. A method of decoding encoded video data, the method comprising reconstructing samples of a block of video sample data, relative to a block of reference samples, from input intra-prediction data in one of a plurality of predetermined intra-prediction modes, at least one of which is such that a corresponding intra-prediction method configured to operate in that mode scales prediction samples obtained in dependence on the reference samples, wherein the scaling is in accordance with a parameter that is dependent on the position of the sample in the block.
26. A computer program product comprising computer executable instructions operable to configure a general purpose computer as an encoder according to any of claims 1 to 7 or a decoder according to any of claims 9 to 23.
27. A signal carrying information encoded by an encoder according to any one of claims 1 to 7.
CN202080074323.9A 2019-10-22 2020-10-02 Video encoding and video decoding Pending CN114600463A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB1915256.0A GB2588406B (en) 2019-10-22 2019-10-22 Video encoding and video decoding
GB1915256.0 2019-10-22
PCT/EP2020/077745 WO2021078498A1 (en) 2019-10-22 2020-10-02 Video encoding and video decoding

Publications (1)

Publication Number Publication Date
CN114600463A true CN114600463A (en) 2022-06-07

Family

ID=68653484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080074323.9A Pending CN114600463A (en) 2019-10-22 2020-10-02 Video encoding and video decoding

Country Status (6)

Country Link
US (1) US20220377342A1 (en)
EP (1) EP4049453A1 (en)
KR (1) KR20220079996A (en)
CN (1) CN114600463A (en)
GB (1) GB2588406B (en)
WO (1) WO2021078498A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023197189A1 (en) * 2022-04-12 2023-10-19 Oppo广东移动通信有限公司 Coding method and apparatus, decoding method and apparatus, and coding device, decoding device and storage medium

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9654785B2 (en) * 2011-06-09 2017-05-16 Qualcomm Incorporated Enhanced intra-prediction mode signaling for video coding using neighboring mode
CA2997097C (en) * 2015-08-28 2022-06-07 Kt Corporation Method and device for processing video signal
US11259047B2 (en) * 2016-04-06 2022-02-22 Kt Corporation Method and apparatus for processing video signal
CN117221590A (en) * 2016-06-22 2023-12-12 Lx 半导体科技有限公司 Image encoding/decoding method and image data transmission method
CN111034197B (en) * 2017-08-22 2022-07-12 松下电器(美国)知识产权公司 Image encoder, image decoder, image encoding method, and image decoding method
US10965941B2 (en) * 2017-10-09 2021-03-30 Qualcomm Incorporated Position-dependent prediction combinations in video coding
WO2020058893A1 (en) * 2018-09-19 2020-03-26 Beijing Bytedance Network Technology Co., Ltd. History based motion vector predictor for intra block copy
US11095885B2 (en) * 2018-10-05 2021-08-17 Tencent America LLC Mode list generation for multi-line intra prediction
KR102628361B1 (en) * 2018-11-12 2024-01-23 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 Bandwidth control method for inter-prediction
US20200162737A1 (en) * 2018-11-16 2020-05-21 Qualcomm Incorporated Position-dependent intra-inter prediction combination in video coding
CN112514384A (en) * 2019-01-28 2021-03-16 株式会社 Xris Video signal encoding/decoding method and apparatus thereof
WO2020207493A1 (en) * 2019-04-12 2020-10-15 Beijing Bytedance Network Technology Co., Ltd. Transform coding based on matrix-based intra prediction
US20220321910A1 (en) * 2019-06-11 2022-10-06 Sony Group Corporation Image processing device and image processing method
LT3989578T (en) * 2019-06-19 2023-11-10 Sony Group Corporation Image processing device, and image processing method
US11115658B2 (en) * 2019-06-25 2021-09-07 Qualcomm Incorporated Matrix intra prediction and cross-component linear model prediction harmonization for video coding
WO2021015934A1 (en) * 2019-07-22 2021-01-28 Interdigital Vc Holdings, Inc. Method and apparatus for video encoding and decoding with matrix based intra-prediction
US20220256161A1 (en) * 2019-07-22 2022-08-11 Interdigital Vc Holdings, Inc. Method and apparatus for video encoding and decoding with matrix based intra-prediction
EP4007282A4 (en) * 2019-07-25 2022-08-31 Wilus Institute of Standards and Technology Inc. Video signal processing method and device
CN114223208B (en) * 2019-08-17 2023-12-29 北京字节跳动网络技术有限公司 Context modeling for side information of reduced secondary transforms in video
KR20220036970A (en) * 2019-08-22 2022-03-23 엘지전자 주식회사 Matrix intra prediction-based image coding apparatus and method
US11838541B2 (en) * 2019-08-30 2023-12-05 Alibaba Group Holding Limited Matrix weighted intra prediction of video signals
GB2591806B (en) * 2020-02-07 2023-07-19 British Broadcasting Corp Chroma intra prediction in video coding and decoding
JP7415027B2 (en) * 2020-02-29 2024-01-16 北京字節跳動網絡技術有限公司 Constraints for high-level syntax elements
WO2023038447A1 (en) * 2021-09-08 2023-03-16 현대자동차주식회사 Video encoding/decoding method and device

Also Published As

Publication number Publication date
GB201915256D0 (en) 2019-12-04
GB2588406B (en) 2022-12-07
WO2021078498A1 (en) 2021-04-29
EP4049453A1 (en) 2022-08-31
KR20220079996A (en) 2022-06-14
US20220377342A1 (en) 2022-11-24
GB2588406A (en) 2021-04-28

Similar Documents

Publication Publication Date Title
RU2769944C1 (en) Method and apparatus for configuring conversion for video compression
US20230412818A1 (en) Video encoding and video decoding
CN112913233B (en) Method and apparatus for constructing prediction candidates based on HMVP
JP2016226001A (en) Decoder and decoding method
US20220303536A1 (en) Method of signalling in a video codec
CN112956201B (en) Syntax design method and apparatus for performing encoding using syntax
CN115349256A (en) Chroma intra prediction in video encoding and decoding
EP4074031A1 (en) Video encoding and video decoding
CN114600463A (en) Video encoding and video decoding
US11589038B2 (en) Methods for video encoding and video decoding
RU2795258C1 (en) Method and device for configuration of conversion for video compression
CN112806017B (en) Method and apparatus for encoding transform coefficients
EA043408B1 (en) VIDEO ENCODING AND VIDEO DECODING
GB2587363A (en) Method of signalling in a video codec
US20220166967A1 (en) Intra coding mode signalling in a video codec
GB2596394A (en) Method of signalling in a video codec
KR20240054296A (en) Feature encoding/decoding method, device, and recording medium storing bitstream based on inter-channel correlation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination