EP4074031A1 - Video encoding and video decoding - Google Patents

Video encoding and video decoding

Info

Publication number
EP4074031A1
EP4074031A1 (application EP20780738.9A)
Authority
EP
European Patent Office
Prior art keywords
prediction
block
accordance
picture
intra
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP20780738.9A
Other languages
German (de)
English (en)
Inventor
Saverio BLASI
Andre Seixas DIAS
Gosala KULUPANA
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
British Broadcasting Corp
Original Assignee
British Broadcasting Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by British Broadcasting Corp filed Critical British Broadcasting Corp
Publication of EP4074031A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: using adaptive coding
    • H04N19/102: using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/169: using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: the coding unit being an image region, e.g. an object
    • H04N19/176: the region being a block, e.g. a macroblock
    • H04N19/50: using predictive coding
    • H04N19/503: using predictive coding involving temporal prediction
    • H04N19/593: using predictive coding involving spatial prediction techniques

Definitions

  • This disclosure relates to video encoding and video decoding.
  • Video compression provides opportunities to reduce payload on a transmission channel. It also provides opportunities for increasingly efficient storage of encoded video data.
  • Known video coding standards enable transmission (or storage) of bitstream data defining a video, such that a receiver (or retriever) of the bitstream is able to decode the bitstream in such a way as to construct a decoded video which is substantially faithful to the original video from which the encoded bitstream was derived.
  • Figure 1 is a schematic representation of a communications network in accordance with an embodiment;
  • Figure 2 is a schematic representation of an emitter of the communications network of figure 1;
  • Figure 3 is a diagram illustrating an encoder implemented on the emitter of figure 2;
  • Figure 4 is a flow diagram of a prediction process performed at a prediction module of the encoder of figure 3;
  • Figure 5 is a schematic representation of a receiver of the communications network of figure 1;
  • Figure 6 is a diagram illustrating a decoder implemented on the receiver of figure 5;
  • Figure 7 is a flow diagram of a prediction process performed at a prediction module of the decoder of figure 6.
  • decoding of a merge-predicted block is carried out using a combination of inter-prediction and intra-prediction.
  • the combination of inter-prediction and intra-prediction may be performed in one of a plurality of modes.
  • said combination may be operable to generate a prediction of samples of a block of video sample data by the application of a blending mask.
  • said combination may be operable to generate a prediction of samples following a partitioning of the block by means of a geometrical partitioning scheme.
  • the mode or combination of inter-prediction and intra-prediction to be used may be determined by a decoder by way of signalling on a received bitstream, or by way of inference.
  • the inference may be established by a decoder by way of a determination carried out on the basis of characteristics of the block, of surrounding blocks, or of other criteria.
  • Embodiments disclosed herein relate to a method of performing prediction in a video codec by means of more efficiently exploiting redundancies in the video signal.
  • a video presentation generally comprises a plurality of frames, for sequential display on playback equipment.
  • Various strategies are used to reduce the amount of data required to describe each frame in turn on a bitstream transmitted on a communications channel from an emitter to a receiver.
  • the emitter will comprise an encoder for encoding the frame data into a bitstream
  • the receiver will comprise a decoder for generating frame data on the basis of information borne in the bitstream.
  • each frame of the video presentation is partitioned into blocks.
  • the content of a block is predicted based on previously compressed content. Such content may be extracted from the same frame as the current block being predicted, as in the case of intra-prediction, or such content may be extracted from previously encoded frames, as in the case of inter-prediction.
  • This block prediction is subtracted from the actual block, resulting in a set of residual differences (residuals).
  • the residual data can be encoded using a transformation into the frequency domain.
  • the transform of data from the time domain to the frequency domain may be specific to certain implementations, and is not essential to the performance of disclosed embodiments.
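  • By way of illustration only (the following Python sketch is not part of the patent disclosure; the block values and the flat, DC-style predictor are invented for the example), the prediction-and-residual mechanism described above amounts to a per-sample subtraction that the decoder later reverses:

```python
import numpy as np

# Hypothetical 4x4 source block and a flat prediction for it (values
# invented for illustration); in a real codec the prediction comes from
# the intra- or inter-prediction stage described above.
source_block = np.array([[52, 55, 61, 66],
                         [63, 59, 55, 90],
                         [62, 59, 68, 113],
                         [63, 58, 71, 122]], dtype=np.int16)
predicted_block = np.full((4, 4), 60, dtype=np.int16)

# The prediction is subtracted from the actual block, giving residuals,
# which are what the (optional) transform stage then operates on.
residuals = source_block - predicted_block

# The decoder adds the prediction back to the decoded residuals.
reconstructed = predicted_block + residuals
assert np.array_equal(reconstructed, source_block)
```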
  • Merge prediction may be employed to compute the inter-prediction for a given set of samples, in which the content of a partition can be predicted on the basis of the motion information pertaining to another neighbouring block or neighbouring block partition. In certain circumstances, merge inter-prediction may not provide a good prediction for a specific partition.
  • embodiments described herein provide an encoding process, and a corresponding decoding process, in which other types of prediction, such as intra prediction, may be employed on blocks, with a prospect of leading to more accurate results.
  • An approach to video encoding and decoding allows blocks to be predicted using a combination of intra-prediction and inter-prediction.
  • this combination may happen by applying uniform weighting to the inter-prediction and intra-prediction samples to produce a combined prediction.
  • uniform weighting may be derived from characteristics of neighbouring blocks.
  • the combined intra-prediction and inter-prediction block is further partitioned into two partitions.
  • Two predictions are computed, one for each partition in the block, respectively corresponding to an inter-prediction (for instance computed by means of merge prediction) and an intra-prediction (computed by means of a mode that is either inferred, or signalled in the bitstream).
  • the partitioning of the block into two partitions may be extracted from a set of possible partitioning schemes.
  • the combined prediction is formed by using one prediction in one of the two partitions and the other prediction in the other of the two partitions.
  • the partitioning scheme may depend on an angle defining the directionality of the edge dividing the two partitions, and on an offset or distance which depends on the distance between the edge dividing the two partitions and the centre of the block.
  • a smoothing process may be employed whereby samples next to the border between the two partitions are predicted using a weighted combination of the two predictions.
  • the combination of intra-prediction and inter-prediction for the block is performed by means of a blending mask formed of a set of weights.
  • Two predictions are computed, an inter-prediction (for instance computed by means of merge prediction) and an intra-prediction (computed by means of a mode that is either inferred, or signalled in the bitstream).
  • the combined prediction is formed by applying the blending mask to the two predictions.
  • the blending mask may be formed by means of a geometrical partitioning process based on an angle and an offset. The angle and offset, together with other characteristics of the block (for instance the width and height of the block, or the bitdepth of the signal) may determine the weights in the blending mask. A variety of different angles and offsets may be considered, resulting in a variety of possible blending masks.
  • the blending mask to use on a current block may be signalled in the bitstream or may be inferred.
  • Embodiments may accommodate a case wherein some of the weights in the blending mask associated with the inter-prediction are 0 and some of the weights associated with the intra-prediction are 1; in such a case, at least part of at least one of the two partitions will be predicted completely using intra-prediction.
  • embodiments may likewise accommodate a case wherein some of the weights in the blending mask associated with the intra-prediction are 0 and some of the weights associated with the inter-prediction are 1; in such a case, at least part of at least one of the two partitions will be predicted completely using inter-prediction.
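  • To make the blending-mask mechanism concrete, the following Python sketch derives a mask from an angle and an offset and applies it to two predictions. The linear ramp width and the weight derivation are assumptions chosen for illustration, not a normative derivation: samples far from the dividing edge receive weight 0 or 1 (so each partition is predicted entirely by one of the two predictions, as in the two cases above), while samples near the edge are blended:

```python
import numpy as np

def blending_mask(width, height, angle_rad, offset, ramp=2.0):
    """Weight mask for a block: 1.0 on one side of the dividing edge,
    0.0 on the other, with a linear ramp of about +/- `ramp` samples
    around the edge (a simplified model of the scheme described above)."""
    ys, xs = np.mgrid[0:height, 0:width]
    nx, ny = np.cos(angle_rad), np.sin(angle_rad)
    # Signed distance of each sample from the edge, which sits `offset`
    # samples from the block centre along the normal (nx, ny).
    d = (xs - (width - 1) / 2) * nx + (ys - (height - 1) / 2) * ny - offset
    return np.clip(0.5 + d / (2 * ramp), 0.0, 1.0)

def combine(inter_pred, intra_pred, mask):
    # Weight 1 selects the intra sample, weight 0 the collocated inter
    # sample; intermediate weights blend the two.
    return mask * intra_pred + (1.0 - mask) * inter_pred

# Example: an 8x8 block divided along the edge joining the top-right
# corner to the bottom-left corner.
mask = blending_mask(8, 8, angle_rad=np.pi / 4, offset=0.0)
combined = combine(np.full((8, 8), 100.0), np.full((8, 8), 140.0), mask)
```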
  • In the video compression standard known under the heading "VVC", a combined inter-prediction merge and intra-prediction is considered.
  • a list of possible merge candidates is computed. This list is formed by means of a process whereby some of these candidates may use motion information extracted from previously encoded blocks, or where some merge candidates may use motion information that is obtained by combining or manipulating the motion information of previously encoded blocks.
  • a merge candidate can be used to form a merge inter-predicted prediction.
  • an intra-predicted block is also computed. The computation of the intra-predicted block follows from the determination of an intra-prediction mode to be used. This mode may be a fixed mode, for instance the Planar mode, or may be inferred based on information extracted from neighbouring blocks, or may be determined based on signalling in the bitstream.
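  • A minimal sketch of the merge candidate list construction described above is given below; the data structure and the pairwise-averaging rule are illustrative assumptions standing in for the combination and manipulation of motion information mentioned earlier (real codecs add pruning and ordering rules beyond this):

```python
from dataclasses import dataclass

@dataclass
class MotionInfo:
    mv_x: int     # motion vector, horizontal component
    mv_y: int     # motion vector, vertical component
    ref_idx: int  # index of the reference picture

def build_merge_list(neighbour_motion, max_candidates=6):
    """Collect motion information from previously coded neighbouring
    blocks, then derive an extra candidate by combining existing ones."""
    candidates = []
    for mi in neighbour_motion:
        if mi is not None and mi not in candidates:
            candidates.append(mi)
        if len(candidates) == max_candidates:
            return candidates
    if len(candidates) >= 2:  # combined candidate: average of the first two
        a, b = candidates[0], candidates[1]
        candidates.append(MotionInfo((a.mv_x + b.mv_x) // 2,
                                     (a.mv_y + b.mv_y) // 2,
                                     a.ref_idx))
    return candidates[:max_candidates]

# Motion information of hypothetical left, top and top-right neighbours.
merge_list = build_merge_list([MotionInfo(4, -2, 0),
                               MotionInfo(6, 0, 0),
                               None])
```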
  • Each sample in the combined prediction is computed using the two collocated samples in the two predictions, whereby such samples are then weighted to form the combined prediction sample.
  • a fixed weight is used to perform the combination. The weight may be inferred based on information extracted from neighbouring blocks, such as for instance whether the neighbouring blocks are intra-predicted or not.
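  • The fixed-weight combination can be sketched as follows; the weight table mirrors a VVC-style rule in which the intra weight grows with the number of intra-predicted neighbours, and should be read as an illustrative assumption rather than a normative specification:

```python
import numpy as np

def ciip_weight(top_is_intra, left_is_intra):
    # Infer the intra weight (out of a total of 4) from the neighbouring
    # blocks: the more intra neighbours, the heavier the intra weight.
    n_intra = int(top_is_intra) + int(left_is_intra)
    return {0: 1, 1: 2, 2: 3}[n_intra]

def combine_uniform(inter_pred, intra_pred, w_intra):
    # Every sample uses the same weight; integer arithmetic with rounding.
    return (w_intra * intra_pred + (4 - w_intra) * inter_pred + 2) >> 2

inter_pred = np.full((8, 8), 100, dtype=np.int32)
intra_pred = np.full((8, 8), 140, dtype=np.int32)
w = ciip_weight(top_is_intra=True, left_is_intra=False)  # -> 2
combined = combine_uniform(inter_pred, intra_pred, w)    # -> 120 everywhere
```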
  • Signalling may happen by means of a mechanism whereby a bit is signalled to determine whether the current block should employ the combined inter-prediction merge and intra-prediction facility.
  • Embodiments disclosed herein make use of a combined inter-picture merge and intra-picture prediction facility operable to predict samples of a block of video sample data, the combined inter-picture merge and intra-picture prediction facility being operable in one of a plurality of modes.
  • each mode corresponds to the use of a blending mask to compute the combination of intra and inter-prediction.
  • the blending mask is extracted from a list of blending masks, whereby the possible blending masks are derived based on a number of different angles and offsets (or distances). For each combination of a given angle and offset, a particular blending mask can be derived whereby the weights in the blending mask will produce a blended prediction in accordance with that angle and offset.
  • a geometrical partitioning index is assigned to each blending mask, where a computation can be performed to derive the angle and offset of the blending mask depending on the corresponding geometrical partitioning index.
  • each mode corresponds to the use of a partitioning process whereby each block is split into two partitions based on an angle and an offset to compute the combination of intra and inter-prediction.
  • the resulting combined intra- and inter-prediction is then treated in the same way as similar partitioning methods, such as triangular partitions or geometrical partitions, which means it may be smoothed at the edge forming the boundary between the two partitions of the block.
  • determination of the combined inter-picture merge and intra-picture prediction mode to use on a given block may follow from the establishment of a look-up table.
  • the look-up table may contain a number of possible modes where the formation of the look-up table may depend on characteristics of the current block and/or of neighbouring blocks.
  • the look-up table may be used to enable signalling by an encoder and, correspondingly, interpretation of information signalled on a bitstream received by a decoder, to identify which of the possible modes has been employed in an encoding.
  • the look-up table may be formed depending on information extracted from neighbouring blocks.
  • the look-up table may be formed depending on whether the block on the top-left of the current block is intra-predicted or not.
  • the look-up table may be formed depending on whether the block on the bottom-right of the current block is intra-predicted or not. Other indicators may be used to trigger formation of the look-up table, depending on the specific implementation.
  • the look-up table may be formed depending on information extracted from the current block. In one embodiment, the look-up table may be formed depending on the width and height of the current block, or whether the block on the top-left of the current block is intra-predicted or not.
  • other neighbouring blocks may be used to determine how the look-up table is formed.
  • the look-up table may be formed depending on employment of intra-prediction in a combination of neighbouring blocks.
  • the candidates in the look-up table may include a mode whose angle corresponds to having an edge at the border between the two partitions that connects the top-right corner of the block with the bottom-left corner of the block.
  • the look-up table may include modes whose angles are selected to be close to the angle that corresponds to having an edge at the border between the two partitions that connects the top-right corner of the block with the bottom-left corner of the block.
  • these candidates may be included in the look-up table if both blocks on the top-left of the current block and on the bottom-right of the current block are intra-predicted.
  • the candidates in the look-up table may include a mode whose angle corresponds to having an edge at the border between the two partitions that connects the top-left corner of the block with the bottom-right corner of the block.
  • the look-up table may include modes whose angles are selected to be close to the angle that corresponds to having an edge at the border between the two partitions that connects the top-left corner of the block with the bottom-right corner of the block.
  • these candidates may be included in the look-up table if exactly one of the block on the top-left of the current block and the block on the bottom-right of the current block is intra-predicted.
  • the look-up tables contain a number of items equal to a power of 2. In some embodiments, the look-up tables contain 8 candidates.
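  • The look-up table formation described in the preceding paragraphs can be sketched as follows. The concrete angle values, the offset variants used for padding, and the fallback branch are assumptions made for illustration; the principle follows the text above (both neighbours intra-predicted favours the top-right/bottom-left diagonal, exactly one favours the top-left/bottom-right diagonal), and the table size is kept at a power of 2:

```python
def build_mode_lut(top_left_is_intra, bottom_right_is_intra):
    """Return a list of 8 candidate modes, each an (angle_degrees, offset)
    pair describing the edge between the two partitions."""
    tr_bl = 45.0   # edge joining the top-right and bottom-left corners
    tl_br = 135.0  # edge joining the top-left and bottom-right corners
    if top_left_is_intra and bottom_right_is_intra:
        base = tr_bl
    elif top_left_is_intra != bottom_right_is_intra:
        base = tl_br
    else:
        base = tr_bl  # fallback choice; other designs are possible
    lut = []
    # The base angle, plus modes whose angles are close to it.
    for delta in (0.0, -11.25, 11.25, -22.5, 22.5):
        lut.append((base + delta, 0.0))
    # Offset variants pad the table to a power of 2 (here, 8 entries).
    for offset in (-4.0, -2.0, 2.0):
        lut.append((base, offset))
    return lut[:8]

lut = build_mode_lut(top_left_is_intra=True, bottom_right_is_intra=True)
assert len(lut) == 8
```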
  • the encoder may select one mode in the look-up table, and correspondingly form a combined inter-picture merge and intra-picture prediction.
  • the selected mode may be signalled in the bitstream using a combined inter-picture merge and intra-picture prediction mode index.
  • the index is signalled using a fixed number of bits. In one embodiment, where the look-up table contains 8 modes, the index is signalled using 3 bits.
  • Although the index is signalled using 3 bits, this does not compel the definition of 8 modes; it simply means that the table has capacity for 8 modes. It may be convenient, for example, for an embodiment to define fewer than 8 modes, and to leave one or more entries in the table in reserve for future extension.
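  • As an illustrative sketch of fixed-length signalling (the bit-list representation is an assumption; a real codec writes to an entropy-coded bitstream), a 3-bit index gives the table a capacity of 8 modes regardless of how many are actually defined:

```python
def write_mode_index(bits, index, n_bits=3):
    """Append a fixed-length mode index to a bit list, MSB first."""
    for shift in range(n_bits - 1, -1, -1):
        bits.append((index >> shift) & 1)

def read_mode_index(bits, pos, n_bits=3):
    """Read the index back; returns (index, new read position)."""
    index = 0
    for _ in range(n_bits):
        index = (index << 1) | bits[pos]
        pos += 1
    return index, pos

stream = []
write_mode_index(stream, 5)           # stream == [1, 0, 1]
index, _ = read_mode_index(stream, 0)
assert index == 5
```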
  • determination of the partitioning used for combining the inter-picture merge and intra-picture prediction mode to use on a given block may depend on information extracted from neighbouring blocks.
  • the partitioning may be determined depending on whether the block on the top-left of the current block is intra-predicted or not. In one embodiment, the partitioning may be determined depending on whether the block on the bottom-right of the current block is intra-predicted or not. In one embodiment, the partitioning may be determined depending on whether both the block on the bottom-right of the current block and the block on the top-left of the current block are intra-predicted or not.
  • the usage of an intra-predicted candidate or an inter-predicted candidate in one of the two partitions may be determined depending on whether the neighbouring blocks are intra-predicted or not.
  • the combined inter-picture merge and intra-picture prediction mode can be directly extracted from the bitstream.
  • the determination of the combined inter picture merge and intra-picture prediction mode may depend on the detection of a flag indicating usage of combined inter-picture merge and intra-picture prediction.
  • the determination of the combined inter-picture merge and intra-picture prediction mode may depend on the detection of a flag indicating usage of the conventional combined inter-picture merge and intra-picture prediction using uniform weighting.
  • the determination of the combined inter-picture merge and intra-picture prediction mode may depend on signalling that is extracted from the bitstream only upon the determination of flags indicating that conventional combined inter-picture merge and intra-picture prediction is not used.
  • the communications channel 40 may comprise a satellite communications channel, a cable network, a ground-based radio broadcast network, a POTS-implemented communications channel, such as is used for provision of internet services to domestic and small business premises, fibre-optic communications systems, or a combination of any of the above and any other conceivable communications medium.
  • the disclosure also extends to communication, by physical transfer, of a storage medium on which is stored a machine readable record of an encoded bitstream, for passage to a suitably configured receiver capable of reading the medium and obtaining the bitstream therefrom.
  • the following description focuses on signal transmission, such as by electronic or electromagnetic signal carrier, but should not be read as excluding the aforementioned approach involving storage media.
  • the emitter 20 is a computer apparatus, in structure and function. It may share, with general purpose computer apparatus, certain features, but some features may be implementation specific, given the specialised function to which the emitter 20 is to be put. The reader will understand which features can be of general purpose type, and which may be required to be configured specifically for use in a video emitter.
  • the emitter 20 thus comprises a graphics processing unit 202 configured for specific use in processing graphics and similar operations.
  • the emitter 20 also comprises one or more other processors 204, either generally provisioned, or configured for other purposes such as mathematical operations, audio processing, managing a communications channel, and so on.
  • An input interface 206 provides a facility for receipt of user input actions. Such user input actions could, for instance, be caused by user interaction with a specific input unit including one or more control buttons and/or switches, a keyboard, a mouse or other pointing device, a speech recognition unit enabled to receive and process speech into control commands, a signal processor configured to receive and control processes from another device such as a tablet or smartphone, or a remote-control receiver.
  • an output interface 214 is operable to provide a facility for output of signals to a user or another device. Such output could include a display signal for driving a local video display unit (VDU) or any other device.
  • a communications interface 208 implements a communications channel, whether broadcast or end-to-end, with one or more recipients of signals.
  • the communications interface is configured to cause emission of a signal bearing a bitstream defining a video signal, encoded by the emitter 20.
  • the processors 204 and specifically for the benefit of the present disclosure, the GPU 202, are operable to execute computer programs, in operation of the encoder. In doing this, recourse is made to data storage facilities provided by a mass storage device 208 which is implemented to provide large-scale data storage albeit on a relatively slow access basis, and will store, in practice, computer programs and, in the current context, video presentation data, in preparation for execution of an encoding process.
  • a Read Only Memory (ROM) 210 is preconfigured with executable programs designed to provide the core of the functionality of the emitter 20, and a Random Access Memory 212 is provided for rapid access and storage of data and program instructions in the pursuit of execution of a computer program.
  • Figure 3 shows a processing pipeline performed by an encoder implemented on the emitter 20 by means of executable instructions, on a data file representing a video presentation comprising a plurality of frames for sequential display as a sequence of pictures.
  • the data file may also comprise audio playback information, to accompany the video presentation, and further supplementary information such as electronic programme guide information, subtitling, or metadata to enable cataloguing of the presentation.
  • the current picture or frame in a sequence of pictures is passed to a partitioning module 230 where it is partitioned into rectangular blocks of a given size for processing by the encoder.
  • This processing may be sequential or parallel. The approach may depend on the processing capabilities of the specific implementation.
  • Each block is then input to a prediction module 232, which seeks to discard temporal and spatial redundancies present in the sequence and obtain a prediction signal using previously coded content.
  • Information enabling computation of such a prediction is encoded in the bitstream. This should comprise sufficient detail to enable computation, including the possibility of inference, at the receiver, of any other information necessary to complete the prediction.
  • the prediction signal is subtracted from the original signal to obtain a residual signal.
  • This is then input to a transform module 234, which attempts to further reduce spatial redundancies within a block by using a more suitable representation of the data.
  • domain transformation may be an optional stage and may be dispensed with entirely. Employment of domain transformation, or otherwise, may be signalled in the bitstream.
  • entropy coding may, in some embodiments, be an optional feature and may be dispensed with altogether in certain cases.
  • the employment of entropy coding may be signalled in the bitstream, together with information to enable decoding, such as an index to a mode of entropy coding (for example, Huffman coding) and/or a code book.
  • a bitstream of block information elements can be constructed for transmission to a receiver or a plurality of receivers, as the case may be.
  • the bitstream may also bear information elements which apply across a plurality of block information elements and are thus held in bitstream syntax independent of block information elements. Examples of such information elements include configuration options, parameters applicable to a sequence of frames, and parameters relating to the video presentation as a whole.
  • the prediction module 232 will now be described in further detail, with reference to figure 4. As will be understood, this is but an example, and other approaches, within the scope of the present disclosure and the appended claims, could be contemplated.
  • the prediction module 232 is configured to determine, for a given block partitioned from a frame, whether it is advantageous to apply combined inter-picture merge and intra-prediction to the block and, if so, to generate a combined inter-picture merge and intra-picture prediction for the block, together with combination information enabling signalling to a decoder of the manner in which the block has been subjected to combined inter-picture merge and intra-prediction, and of how the resulting prediction information is then to be decoded.
  • the prediction module then applies the selected mode of combined inter-picture merge and intra-prediction, if applicable, and then determines a prediction, on the basis of which residuals can then be generated as previously noted.
  • the prediction employed is signalled in the bitstream, for receipt and interpretation by a suitably configured decoder.
  • conventional prediction methods may be employed to predict the content of the block, including conventional inter-prediction and/or conventional intra prediction techniques.
  • the encoder will signal, by means of a flag on the bitstream, whether or not combined inter-picture merge and intra-prediction has been employed.
  • Turning therefore to the encoder-side algorithm illustrated in figure 4, in step S102 a set of candidate combined inter-picture merge and intra-prediction modes is assembled for the block in question. Candidates are assembled using any of the techniques previously described.
  • Candidates may include the conventional way of performing combined inter-picture merge and intra-prediction as previously described, plus any other combined inter-picture merge and intra-prediction modes which may be identified as suitable. This may include modes that make use of blending masks, and/or modes that correspond to partitioning the block following a geometrical partitioning.
  • the candidates for a given block may be obtained by analysing neighbouring blocks information, or information extracted from the current block. The candidates for a given block may be determined using the same parameters that are used to determine the geometrical partitioning mode.
  • a loop commences in step S104, with operations carried out on each candidate in turn.
  • a prediction is determined using the mode associated with that candidate.
  • a quality measure is determined for that prediction, comprising a score of accuracy of the prediction with respect to the original data.
  • Step S110 signifies the closure of the loop.
  • In step S112, the candidate with the best quality score is selected.
  • the attributes of this candidate are then encoded, for example using a fixed number of bits, an entry established in a look-up table, or a Golomb code. Other techniques may be used for signalling. These attributes are added to the bitstream for transmission.
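  • The encoder-side loop of steps S102 to S114 can be sketched as follows; the SAD score and the prediction callable are stand-ins (the text above only requires a score of accuracy, so the quality measure used here is an assumption for illustration):

```python
import numpy as np

def select_best_mode(original, candidate_modes, predict_fn):
    """For each candidate mode (S104-S110): form a prediction (S106),
    score it against the original samples (S108), and keep the best
    (S112). Returns the index of the winning candidate, whose attributes
    are then encoded onto the bitstream (S114)."""
    best_index, best_score = None, None
    for index, mode in enumerate(candidate_modes):
        prediction = predict_fn(mode)
        score = np.abs(original.astype(np.int32)
                       - prediction.astype(np.int32)).sum()  # SAD
        if best_score is None or score < best_score:
            best_index, best_score = index, score
    return best_index

# Toy usage with two invented modes that predict constant blocks.
original = np.full((8, 8), 118, dtype=np.int32)
modes = [{"value": 100}, {"value": 120}]
best = select_best_mode(
    original, modes,
    lambda m: np.full((8, 8), m["value"], dtype=np.int32))
assert best == 1  # the mode predicting 120 lies closer to the source
```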
  • the structural architecture of the receiver is illustrated in figure 5. It is, in essence, a computer-implemented apparatus.
  • the receiver 30 thus comprises a graphics processing unit 302 configured for specific use in processing graphics and similar operations.
  • the receiver 30 also comprises one or more other processors 304, either generally provisioned, or configured for other purposes such as mathematical operations, audio processing, managing a communications channel, and so on.
  • the receiver 30 may be implemented in the form of a set top box, a hand held personal electronic device, a personal computer, or any other device suitable for the playback of video presentations.
  • An input interface 306 provides a facility for receipt of user input actions. Such user input actions could, for instance, be caused by user interaction with a specific input unit including one or more control buttons and/or switches, a keyboard, a mouse or other pointing device, a speech recognition unit enabled to receive and process speech into control commands, a signal processor configured to receive and control processes from another device such as a tablet or smartphone, or a remote-control receiver.
  • an output interface 314 is operable to provide a facility for output of signals to a user or another device. Such output could include a television signal, in suitable format, for driving a local television device.
  • a communications interface 308 implements a communications channel, whether broadcast or end-to-end, with one or more recipients of signals.
  • the communications interface is configured to receive a signal bearing a bitstream defining a video signal, to be decoded by the receiver 30.
  • the processors 304 and, specifically for the benefit of the present disclosure, the GPU 302, are operable to execute computer programs, in operation of the receiver. In doing this, recourse is made to data storage facilities provided by a mass storage device 308, which is implemented to provide large-scale data storage, albeit on a relatively slow access basis, and will store, in practice, computer programs and, in the current context, video presentation data, resulting from execution of a receiving process.
  • a Read Only Memory (ROM) 310 is preconfigured with executable programs designed to provide the core of the functionality of the receiver 30, and a Random Access Memory 312 is provided for rapid access and storage of data and program instructions in the pursuit of execution of a computer program.
  • Figure 6 shows a processing pipeline performed by a decoder implemented on the receiver 30 by means of executable instructions, on a bitstream received at the receiver 30 comprising structured information from which a video presentation can be derived, including a reconstruction of the frames encoded by the encoder functionality of the emitter 20.
  • the decoding process illustrated in figure 6 aims to reverse the process performed at the encoder. The reader will appreciate that this does not imply that the decoding process is an exact inverse of the encoding process.
  • a received bit stream comprises a succession of encoded information elements, each element being related to a block.
  • a block information element is decoded in an entropy decoding module 330 to obtain a block of coefficients and the information necessary to compute the prediction for the current block.
  • the block of coefficients is typically de-quantised in dequantisation module 332 and typically inverse transformed to the spatial domain by transform module 334.
  • a prediction signal is generated as before, from previously decoded samples from current or previous frames and using the information decoded from the bit stream, by prediction module 336.
  • a reconstruction of the original picture block is then derived from the decoded residual signal and the calculated prediction block in the reconstruction block 338.
  • the prediction module 336 is responsive to information, on the bitstream, signalling the use of combined inter-picture merge and intra-prediction and, if such information is present, reads from the bitstream the mode under which that prediction has been implemented and thus which prediction technique should be employed in reconstruction of a block information sample.
  • the decoder functionality of the receiver 30 extracts from the bitstream a succession of block information elements, as encoded by the encoder facility of the emitter 20, defining block information and accompanying configuration information.
  • the decoder avails itself of information from prior predictions, in constructing a prediction for a present block. In doing so, the decoder may combine the knowledge from inter-prediction, i.e. from a prior frame, and intra-prediction, i.e. from another block in the same frame.
  • In step S202, the information enabling formation of a prediction candidate is extracted from the bitstream.
  • This can be in the form of a flag, which may be binary in syntactical form, indicating whether or not combined inter-picture merge and intra-prediction has been used.
  • In step S204, a decision is taken dependent on the value of this flag. If combined inter-picture merge and intra-prediction is to be used for the merge-predicted block, then in step S206 a look-up table containing a list of possible modes is considered. This list may be pre-determined, or may depend on information inferred from available information (such as the size of the block, or the manner in which neighbouring blocks have been decoded). It may be pre-stored at the receiver, or it may be transmitted thereto on the bitstream. This transmission of look-up table information may be at the commencement of the current bitstream transmission, or it may be, for instance, in a pre-configuration transmission to the receiver, to configure the receiver to be capable of decoding bitstreams encoded to a particular specification.
  • In step S208, an index is extracted from the bitstream to signal which item in the look-up table is to be employed in generating a prediction.
  • In step S210, the look-up table is consulted, in accordance with the index, to obtain a set of attributes defining the combined inter-picture and intra-prediction mode to be used.
  • the attributes may be considered, collectively, as prediction configuration attributes, which can be used by the decoder to configure the way the decoder constructs a prediction of the block samples, whether combined inter-picture and intra-prediction is to be employed, and, if so, the manner in which combined inter-picture and intra-prediction is to be performed to reconstruct the block.
  • the attributes may for instance also specify how the combination should be implemented, such as including weight parameters, or an index to another table of pre-determined weight parameters.
  • In step S212, a prediction is generated using the specific characteristics determined in step S210.
  • In the alternative, if combined inter-picture merge and intra-prediction has not been signalled, using the previously described flag, then conventional techniques are used, in step S220, to generate a prediction of the merge-predicted block.
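  • The decoder-side flow of figure 7 can be summarised in the following sketch; the BitReader class and the three helper callables are assumptions standing in for the bitstream parser and the prediction machinery described above:

```python
class BitReader:
    """Minimal stand-in for a bitstream reader (illustrative only)."""
    def __init__(self, bits):
        self.bits, self.pos = bits, 0
    def read_flag(self):
        bit = self.bits[self.pos]
        self.pos += 1
        return bool(bit)
    def read_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value

def decode_block_prediction(reader, block_ctx, build_lut,
                            predict_combined, predict_conventional):
    if reader.read_flag():                  # S202/S204: combined-mode flag
        lut = build_lut(block_ctx)          # S206: form the look-up table
        index = reader.read_bits(3)         # S208: extract the mode index
        attributes = lut[index]             # S210: consult the table
        return predict_combined(block_ctx, attributes)   # S212
    return predict_conventional(block_ctx)  # S220: conventional prediction

# Toy usage: flag = 1 (combined mode), 3-bit index = 2, invented LUT.
reader = BitReader([1, 0, 1, 0])
lut = [(angle, 0.0) for angle in range(0, 180, 23)]  # 8 entries
result = decode_block_prediction(
    reader, block_ctx={}, build_lut=lambda ctx: lut,
    predict_combined=lambda ctx, attrs: ("combined", attrs),
    predict_conventional=lambda ctx: ("conventional", None))
assert result == ("combined", (46, 0.0))
```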

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Video decoding is carried out using a combined inter-picture merge and intra-picture prediction facility, operable to predict samples of a block of video sample data. The combined inter-picture merge and intra-picture prediction facility is operable in one of a plurality of modes wherein, in at least one of said modes, decoding involves generating a prediction of samples of a block of video sample data by application of a blending mask, the blending mask governing a partitioning of the block into two parts, the blending mask being applied to a first prediction generated by an inter-picture merge prediction process and to a second prediction generated by an intra-picture prediction process.
EP20780738.9A 2019-12-13 2020-10-01 Video encoding and video decoding Pending EP4074031A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1918431.6A GB2589932A (en) 2019-12-13 2019-12-13 Video encoding and video decoding
PCT/EP2020/077607 WO2021115657A1 (fr) 2019-12-13 2020-10-01 Video encoding and video decoding

Publications (1)

Publication Number Publication Date
EP4074031A1 (fr) 2022-10-19

Family

ID=69186637

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20780738.9A Pending EP4074031A1 (fr) Video encoding and video decoding

Country Status (4)

Country Link
EP (1) EP4074031A1 (fr)
CN (1) CN114868385A (fr)
GB (1) GB2589932A (fr)
WO (1) WO2021115657A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111951189B (zh) * 2020-08-13 2022-05-06 神思电子技术股份有限公司 Data augmentation method with multi-scale texture randomisation
WO2023197183A1 (fr) * 2022-04-12 2023-10-19 Oppo广东移动通信有限公司 Video encoding method and apparatus, video decoding method and apparatus, and device, system and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9609343B1 (en) * 2013-12-20 2017-03-28 Google Inc. Video coding using compound prediction
US11172203B2 (en) * 2017-08-08 2021-11-09 Mediatek Inc. Intra merge prediction

Also Published As

Publication number Publication date
CN114868385A (zh) 2022-08-05
GB201918431D0 (en) 2020-01-29
GB2589932A (en) 2021-06-16
WO2021115657A1 (fr) 2021-06-17

Similar Documents

Publication Publication Date Title
US11778201B2 (en) Video encoding and video decoding
TWI692245B (zh) Video decoding device, video encoding method and device, and computer-readable storage medium
US20220303536A1 (en) Method of signalling in a video codec
CN104604237A (zh) Method and device for encoding video, and method and device for decoding video, in which an inter-prediction reference picture list is determined according to block size
US11350104B2 (en) Method for processing a set of images of a video sequence
EP4074031A1 (fr) Video encoding and video decoding
EP4101170A1 (fr) Chroma intra prediction in video encoding and decoding
US11589038B2 (en) Methods for video encoding and video decoding
US20220377342A1 (en) Video encoding and video decoding
EA043408B1 (ru) Video encoding and video decoding
GB2587363A (en) Method of signalling in a video codec
US20220166967A1 (en) Intra coding mode signalling in a video codec
GB2596394A (en) Method of signalling in a video codec
EA045634B1 (ru) Video encoding and video decoding

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20220422

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)