EP3973709A2 - Method, apparatus and computer program product for video encoding and decoding - Google Patents

Method, apparatus and computer program product for video encoding and decoding

Info

Publication number
EP3973709A2
Authority
EP
European Patent Office
Prior art keywords
sample
block
prediction
prediction model
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP20810687.2A
Other languages
German (de)
English (en)
Other versions
EP3973709A4 (fr)
Inventor
Ramin GHAZNAVI YOUVALARI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Publication of EP3973709A2 publication Critical patent/EP3973709A2/fr
Publication of EP3973709A4 publication Critical patent/EP3973709A4/fr
Pending legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/11Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/14Coding unit complexity, e.g. amount of activity or edge presence estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements

Definitions

  • a method comprising receiving a source picture; partitioning the source picture into a set of non-overlapping blocks; for a block, obtaining at least one sample/pixel value and its location from at least one neighboring block; deriving a prediction model for the block that relates the sample(s) of neighboring block(s) to the corresponding location(s) of the neighboring block(s); predicting a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; deriving at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the prediction area being previously predicted; and predicting a sample value of the sample of the at least one other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.
  • an apparatus comprising means for receiving a source picture; means for partitioning the source picture into a set of non-overlapping blocks; for a block, means for obtaining at least one sample/pixel value and its location from at least one neighboring block; means for deriving a prediction model for the block that relates the sample(s) of neighboring block(s) to the corresponding location(s) of the neighboring block(s); means for predicting a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; means for deriving at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the prediction area being previously predicted; and means for predicting a sample value of the sample of the at least one other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.
  • a computer program product comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to receive a source picture; partition the source picture into a set of non-overlapping blocks; for a block, obtain at least one sample/pixel value and its location from at least one neighboring block; derive a prediction model for the block that relates the sample(s) of neighboring block(s) to the corresponding location(s) of the neighboring block(s); predict a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; derive at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the prediction area being previously predicted; and predict a sample value of the sample of the at least one other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.
  • the sample is a pixel or a sub-block.
  • the method is continued for each block.
  • the neighboring block is one of the following: a block on top of a current block, a block on the left of a current block, a block on the top-left of a current block, a block on the bottom-left of a current block, a block on the top-right of a current block.
  • derived prediction models are stored for a certain region in a picture.
  • a prediction model is derived by relating samples and their locations of a certain sample component to samples of another sample component.
  • an existing prediction model is retrieved from a storage, and the retrieved existing prediction model is used together with the first or said at least one other derived prediction model.
  • the model derivation is executed at an encoder and/or in a decoder.
  • the apparatus comprises at least one processor and a memory including computer program code.
  • the computer program product is embodied on a non-transitory computer readable medium.
  • Fig. 1 shows an encoding process according to an embodiment
  • Fig. 2 shows a decoding process according to an embodiment
  • Fig. 3 shows a method according to an embodiment for intra prediction using neighboring samples
  • the Advanced Video Coding standard (which may be abbreviated AVC or H.264/AVC) was developed by the Joint Video Team (JVT) of the Video Coding Experts Group (VCEG) of the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) / International Electrotechnical Commission (IEC).
  • JVT Joint Video Team
  • VCEG Video Coding Experts Group
  • MPEG Moving Picture Experts Group
  • ISO International Organization for Standardization
  • IEC International Electrotechnical Commission
  • the H.264/AVC standard is published by both parent standardization organizations, and it is referred to as ITU-T Recommendation H.264 and ISO/IEC International Standard 14496-10, also known as MPEG-4 Part 10 Advanced Video Coding (AVC).
  • the High Efficiency Video Coding standard (which may be abbreviated HEVC or H.265/HEVC) was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of VCEG and MPEG.
  • JCT-VC Joint Collaborative Team on Video Coding
  • the standard is published by both parent standardization organizations, and it is referred to as ITU-T Recommendation H.265 and ISO/IEC International Standard 23008-2, also known as MPEG-H Part 2 High Efficiency Video Coding (HEVC).
  • Extensions to H.265/HEVC include scalable, multiview, three-dimensional, and fidelity range extensions, which may be referred to as SHVC, MV-HEVC, 3D-HEVC, and REXT, respectively.
  • Some key definitions, bitstream and coding structures, and concepts of H.264/AVC and HEVC and some of their extensions are described in this section as an example of a video encoder, decoder, encoding method, decoding method, and bitstream structure, wherein the embodiments may be implemented.
  • Some of the key definitions, bitstream and coding structures, and concepts of H.264/AVC are the same as in the HEVC standard; hence, they are described below jointly.
  • the aspects of various embodiments are not limited to H.264/AVC or HEVC or their extensions, but rather the description is given for one possible basis on top of which the present embodiments may be partly or fully realized.
  • a syntax element may be defined as an element of data represented in the bitstream.
  • a syntax structure may be defined as zero or more syntax elements present together in the bitstream in a specified order.
  • the bitstream syntax and semantics as well as the decoding process for error-free bitstreams are specified in H.264/AVC and HEVC.
  • the encoding process is not specified, but encoders must generate conforming bitstreams.
  • Bitstream and decoder conformance can be verified with the Hypothetical Reference Decoder (HRD).
  • HRD Hypothetical Reference Decoder
  • the standards contain coding tools that help in coping with transmission errors and losses, but the use of the tools in encoding is optional and no decoding process has been specified for erroneous bitstreams.
  • the elementary unit for the input to an H.264/AVC or HEVC encoder and the output of an H.264/AVC or HEVC decoder, respectively, is a picture.
  • a picture given as an input to an encoder may also be referred to as a source picture, and a picture decoded by a decoder may be referred to as a decoded picture.
  • Arrays representing other unspecified monochrome or tri-stimulus color samplings (for example, YZX, also known as XYZ).
  • a picture may either be a frame or a field.
  • a frame comprises a matrix of luma samples and possibly the corresponding chroma samples.
  • a field is a set of alternate sample rows of a frame. Fields may be used as encoder input for example when the source signal is interlaced.
  • Chroma sample arrays may be absent (and hence monochrome sampling may be in use) or may be subsampled when compared to luma sample arrays.
  • Some chroma formats may be summarized as follows: - In monochrome sampling there is only one sample array, which may be nominally considered the luma array.
  • In 4:2:2 sampling, each of the two chroma arrays has the same height and half the width of the luma array.
  • In H.264/AVC and HEVC it is possible to code sample arrays as separate color planes into the bitstream and respectively decode separately coded color planes from the bitstream.
  • When separate color planes are in use, each one of them is separately processed (by the encoder and/or the decoder) as a picture with monochrome sampling.
  • the location of chroma samples with respect to luma samples may be determined in the encoder side (e.g. as pre-processing step or as part of encoding).
  • the chroma sample positions with respect to luma sample positions may be pre-defined for example in a coding standard, such as H.264/AVC or HEVC, or may be indicated in the bitstream for example as part of VUI of H.264/AVC or HEVC.
  • the source video sequence(s) provided as input for encoding may either represent interlaced source content or progressive source content. Fields of opposite parity have been captured at different times for interlaced source content. Progressive source content contains captured frames.
  • An encoder may encode fields of interlaced source content in two ways: a pair of interlaced fields may be coded into a coded frame or a field may be coded as a coded field.
  • an encoder may encode frames of progressive source content in two ways: a frame of progressive source content may be coded into a coded frame or a pair of coded fields.
  • a field pair or a complementary field pair may be defined as two fields next to each other in decoding and/or output order, having opposite parity (i.e. one being a top field and the other being a bottom field).
  • a partitioning may be defined as a division of a set into subsets such that each element of the set is in exactly one of the subsets.
  • a picture partitioning may be defined as a division of a picture into smaller non-overlapping units.
  • a block partitioning may be defined as a division of a block into smaller non-overlapping units, such as sub-blocks.
  • The term block partitioning may be considered to cover multiple levels of partitioning, for example partitioning of a picture into slices, and partitioning of each slice into smaller units, such as macroblocks of H.264/AVC. It is noted that the same unit, such as a picture, may have more than one partitioning. For example, a coding unit of HEVC may be partitioned into prediction units and separately by another quadtree into transform units.
  • Hybrid video codecs may encode the video information in two phases.
  • In the first phase, pixel values in a certain picture area are predicted, for example, by motion compensation means or by spatial means.
  • predictive coding may be applied, for example, as so-called sample prediction and/or so-called syntax prediction.
  • pixel or sample values in a certain picture area or "block" are predicted. These pixel or sample values can be predicted, for example, using one or more of the following ways:
  • Motion compensation mechanisms (which may also be referred to as inter prediction, temporal prediction, motion-compensated temporal prediction, motion-compensated prediction, or MCP) involve finding and indicating an area in one of the previously encoded video frames that corresponds closely to the block being coded. Inter prediction may reduce temporal redundancy.
  • Intra prediction, where pixel or sample values can be predicted by spatial mechanisms, involves finding and indicating a spatial region relationship. Intra prediction can be performed in the spatial or transform domain, i.e., either sample values or transform coefficients can be predicted. Intra prediction may be exploited in intra coding, where no inter prediction is applied.
  • In syntax prediction, which may also be referred to as parameter prediction, syntax elements and/or syntax element values and/or variables derived from syntax elements are predicted from syntax elements (de)coded earlier and/or variables derived earlier.
  • Non-limiting examples of syntax prediction are provided below:
  • motion vectors, e.g. for inter and/or inter-view prediction, may be coded differentially with respect to a block-specific predicted motion vector.
  • the predicted motion vectors are created in a predefined way, for example by calculating the median of the encoded or decoded motion vectors of the adjacent blocks, as sketched below.
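A minimal sketch of the median-based motion vector predictor described above (the function name and the left/above/above-right neighbor choice follow common H.264/AVC-style practice but are illustrative here):

```python
def median_mv_predictor(mv_left, mv_above, mv_above_right):
    """Component-wise median of three neighboring motion vectors; the
    encoder then codes only the difference mvd = mv - predictor."""
    def median3(a, b, c):
        return sorted((a, b, c))[1]
    return (median3(mv_left[0], mv_above[0], mv_above_right[0]),
            median3(mv_left[1], mv_above[1], mv_above_right[1]))

mv = (5, -2)                                          # actual motion vector
pred = median_mv_predictor((4, -1), (6, -3), (5, 0))  # predictor from neighbors
mvd = (mv[0] - pred[0], mv[1] - pred[1])              # differentially coded part
```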
  • AMVP advanced motion vector prediction
  • Another way to create motion vector predictions is to generate a list of candidate predictions from adjacent blocks and/or co-located blocks in temporal reference pictures and to signal the chosen candidate as the motion vector predictor.
  • the reference index of a previously coded/decoded picture can be predicted. Differential coding of motion vectors may be disabled across slice boundaries.
  • the block partitioning e.g. from CTU to CUs and down to PUs, may be predicted.
  • the filtering parameters e.g. for sample adaptive offset may be predicted.
  • Prediction approaches using image information from a previously coded image can also be called inter prediction methods, which may also be referred to as temporal prediction and motion compensation.
  • Prediction approaches using image information within the same image can also be called intra prediction methods.
  • the second phase is coding the error between the prediction block of samples and the original block of samples. This may be accomplished by transforming the difference in sample values using a specified transform. This transform may be e.g. a Discrete Cosine Transform (DCT) or a variant thereof. After transforming the difference, the transformed difference is quantized, and entropy coded.
  • DCT Discrete Cosine Transform
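A minimal sketch of this residual path, assuming an orthonormal DCT-II and plain uniform quantization (the block size and the step `qstep` are illustrative placeholders, not values fixed by any standard):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n (inverse = transpose)."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] /= np.sqrt(2)
    return m * np.sqrt(2.0 / n)

def code_residual(orig_block, pred_block, qstep=8):
    """Transform, quantize and reconstruct the prediction error,
    mirroring the encoder/decoder residual path described above."""
    d = orig_block - pred_block              # prediction error D_n
    t = dct_matrix(d.shape[0])
    coeffs = t @ d @ t.T                     # forward transform T
    levels = np.round(coeffs / qstep)        # quantization Q (then entropy coded)
    recon_err = t.T @ (levels * qstep) @ t   # Q^-1 followed by T^-1 gives D'_n
    return pred_block + recon_err            # preliminary reconstruction I'_n
```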
  • the decoder reconstructs the output video by applying a prediction mechanism similar to that used by the encoder in order to form a prediction representation of the sample blocks (using the motion or spatial information created by the encoder and included in the compressed representation of the image) and prediction error decoding (the inverse operation of the prediction error coding to recover the quantized error signal in the spatial domain).
  • the decoder After applying sample prediction and error decoding processes the decoder combines the prediction and the prediction error signals (the sample values) to form the output video frame.
  • FIG. 1 illustrates an image to be encoded (I_n); a prediction representation of an image block (P'_n); a prediction error signal (D_n); a reconstructed prediction error signal (D'_n); a preliminary reconstructed image (I'_n); a final reconstructed image (R'_n); a transform (T) and inverse transform (T^-1); a quantization (Q) and inverse quantization (Q^-1); entropy encoding (E); a reference frame memory (RFM); inter prediction (P_inter); intra prediction (P_intra); mode selection (MS); and filtering (F).
  • FIG. 2 illustrates a prediction representation of an image block (P'_n); a reconstructed prediction error signal (D'_n); a preliminary reconstructed image (I'_n); a final reconstructed image (R'_n); an inverse transform (T^-1); an inverse quantization (Q^-1); an entropy decoding (E^-1); a reference frame memory (RFM); a prediction (either inter or intra) (P); and filtering (F).
  • the decoder may also apply additional filtering processes in order to improve the quality of the output video before passing it for display and/or storing as a prediction reference for the forthcoming pictures in the video sequence.
  • motion information is indicated by motion vectors associated with each motion compensated image block.
  • Each of these motion vectors represents the displacement of the image block in the picture to be coded (in the encoder) or decoded (at the decoder) and the prediction source block in one of the previously coded or decoded images (or pictures).
  • In H.264/AVC and HEVC, as in many other video compression standards, a picture is divided into a mesh of rectangles, for each of which a similar block in one of the reference pictures is indicated for inter prediction. The location of the prediction block is coded as a motion vector that indicates the position of the prediction block relative to the block being coded.
  • Intra prediction methods for image/video compression are not efficient, and intra coded images or blocks still consume a significant amount of bitrate compared to the inter predicted frames or blocks in a video.
  • One of the aspects that is not considered in the intra prediction methods is the relation between the position/location of the samples and their pixel values.
  • the texture of an image may include different behaviors in different parts of the image plane, for example, there could be certain deformations, samplings, etc. in certain parts, so that the conventional intra prediction methods are not capable of capturing such sample distribution in an efficient way. This is due to the fact that most of the intra prediction methods use a specific prediction angle/direction for predicting a block of samples based on its neighbors.
  • The present embodiments relate to a method for intra prediction in video compression.
  • the intra prediction method attempts to model the samples inside a block based on the available neighboring samples and their locations. For that, a portion of the neighboring samples from the left, above, above-left, above-right and bottom-left of the block along with their coordinates (x, y) is collected, as shown in Figure 3. Then a prediction model is derived according to the collected information from neighbors. The derived prediction model is used for predicting the samples inside the block using the location of each sample inside the prediction block.
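A minimal sketch of this flow, assuming a simple linear model P(x, y) ≈ a·x + b·y + c fitted by least squares; the model form, the coordinate convention (the block's own top-left sample at (0, 0), neighbors at negative coordinates) and all names are illustrative assumptions, since the text leaves the model family open:

```python
import numpy as np

def predict_block_from_neighbors(neigh_xy, neigh_vals, block_h, block_w):
    """Fit P(x, y) ~ a*x + b*y + c on the collected neighboring samples
    (values plus (x, y) coordinates) and predict every sample of the
    block from its own location inside the prediction block."""
    a_mat = np.column_stack([neigh_xy[:, 0], neigh_xy[:, 1],
                             np.ones(len(neigh_xy))])
    params, *_ = np.linalg.lstsq(a_mat, neigh_vals, rcond=None)
    ys, xs = np.mgrid[0:block_h, 0:block_w]
    return params[0] * xs + params[1] * ys + params[2]

# Neighbors: one row above (y = -1) and one column to the left (x = -1).
xy = np.array([(x, -1) for x in range(8)] + [(-1, y) for y in range(8)], float)
vals = 100 + 2 * xy[:, 0] + 3 * xy[:, 1]      # synthetic gradient-like texture
pred = predict_block_from_neighbors(xy, vals, 8, 8)
```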
  • The new intra prediction method is then used as an additional intra prediction mode along with the existing modes, or it can replace one or more of the existing ones in the codec (e.g., AVC, HEVC, VVC, etc.).
  • the method according to the present embodiments derives a prediction model which can simulate the sample distribution behavior in different parts of the image/video.
  • a prediction model can be derived in a way that relates the sample or pixel value P of a block 300 to its (x,y) location, neighboring sample values and neighboring samples’ locations with at least one weight and/or at least one offset.
  • the prediction method can model the samples 320 based on both or either of the x and y locations of neighboring samples 310.
  • the prediction model can be linear, polynomial, etc.
  • the prediction model can be defined as below, in which the sample P at location (x, y) is calculated based on the functions f(x) and f(y):
  • a sample value P can be calculated separately in each direction (i.e., x and y), and the final predicted sample can be calculated based on certain weights between them:
  • the weights W_0 and W_1 can be calculated in different ways, e.g. based on the properties of the block (e.g., height, width, luma/chroma), or they can be calculated based on a learning process from the neighboring blocks.
  • x and y refer to the location of the predicted sample in (x, y) coordinates.
  • the remaining parameters, i.e., a, b, c, d, e, and P(x_0, y_0), are derived in a training (parameter derivation or learning) process; one possible instantiation is sketched below.
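The equations referenced in the three bullets above did not survive extraction. Purely as an assumed illustration consistent with the names used here (the directional functions f(x) and f(y), the weights W_0 and W_1, the parameters a to e, and the anchor sample P(x_0, y_0)), a linear instantiation could read:

```latex
f(x) = a\,x + b, \qquad f(y) = c\,y + d,
\qquad \hat{P}(x, y) = W_0\, f(x) + W_1\, f(y) + e\, P(x_0, y_0)
```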
  • the training (or parameter derivation or learning) process makes use of the neighboring sample/pixel values along with their (x, y) locations for calculating the relation of the sample/pixel values to their locations.
  • the training (parameter derivation or learning) process can be done in various ways e.g. by using one or more linear or polynomial regression methods.
  • a set of neural networks may be used for the learning process.
  • the training (parameter derivation or learning) process is not limited to the described ones and any method can be used for such purpose considering both sample/pixel values and their (x,y) locations.
  • the model parameters of the above prediction model can be calculated for each block 300 based on the sample/pixel 310 values and their locations from the available neighboring blocks.
  • a neighboring block can be one or more of the following: a block on a top of the current block, a block on the left of the current block, a block on top-left of the current block, a block on bottom-left of the current block, a block on top-right of the current block.
  • the collected neighboring information is used for a training or derivation process, for example with linear regression, polynomial regression, logistic regression, RANSAC (random sample consensus), etc., to calculate the mentioned parameters.
  • the parameter derivation process can also be done, for example, with a gradient estimation approach.
  • P_n(x_i) and P_n(y_i) are the sample values from the neighboring above and left sides of the block.
  • g_x and g_y are the gradient estimations in the horizontal and vertical directions, respectively.
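A minimal sketch of such a gradient-based derivation; taking g_x and g_y as mean sample-to-sample differences along the above row and left column, and anchoring the plane at the corner, are assumptions, since the text only names the two gradients:

```python
import numpy as np

def gradient_plane_prediction(above_row, left_col, block_h, block_w):
    """Estimate g_x and g_y from the above/left neighbor samples and
    extrapolate a planar prediction over the block."""
    g_x = np.mean(np.diff(above_row))            # mean horizontal step
    g_y = np.mean(np.diff(left_col))             # mean vertical step
    corner = 0.5 * (above_row[0] + left_col[0])  # anchor near (0, 0)
    ys, xs = np.mgrid[0:block_h, 0:block_w]
    return corner + g_x * xs + g_y * ys
```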
  • a neighboring block can be an adjacent and/or non-adjacent block in a neighboring region.
  • A neighboring region relates to blocks located at a certain distance from the current block, wherein the distance may be defined by one or more blocks in a certain direction, e.g. up or left.
  • the size of the neighboring region may be selected based on the size of the current block. For example, the blocks in the top, left, top-left, bottom-left and top-right regions of the current block may be used to train the model.
  • the size of the neighboring block may be considered in the training process.
  • the encoder may derive different prediction models for one block based on the neighboring samples and locations. For example, one prediction model may be derived based on the sample and locations of the left-side neighbor, one prediction model may be derived based on samples and their locations from the top-side neighbor, and another prediction model based on both left- and top side neighbor samples and their locations. In such case, the best performing prediction model may be selected out of these prediction models.
  • a flag or a predefined index may be encoded into the bitstream in order to inform the decoder which prediction model is used.
  • the prediction model parameters may be transmitted/encoded into the bitstream.
  • the decoder may identify which prediction model out of the tested ones is selected on the encoder side. The latter option requires that the encoder and the decoder use the same criteria for selecting among the tested prediction model derivation schemes.
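A minimal encoder-side selection sketch; using the sum of absolute differences as the cost and returning the index to be entropy-coded are illustrative choices, since the patent does not fix the selection criterion:

```python
import numpy as np

def select_prediction_model(candidate_preds, source_block):
    """Pick the best of several derived prediction models (e.g. left-only,
    top-only, left+top) by SAD against the source block; the winning index
    would be signaled so the decoder can repeat the choice."""
    costs = [np.abs(source_block - cand).sum() for cand in candidate_preds]
    best = int(np.argmin(costs))
    return best, candidate_preds[best]
```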
  • the parameter derivation (training) process may be applied more than once in order to have more accurate parameters for prediction. This is particularly important for removing the outliers in the collected samples from the neighborhood.
  • the block can be divided into multiple subblocks, and a separate training or model derivation is applied for each subblock.
  • Figures 4-6 illustrate examples of this embodiment.
  • a block 400 is divided into at least two sub-blocks 421, 422.
  • The first model derivation, as described above, is applied to the first subblock (e.g., the top-left subblock) 421, which represents the first prediction area, and the first subblock 421 is predicted by using the first model and the sample values and locations of the available neighboring samples 420.
  • For the second subblock 422, which represents the second prediction area, a separate model is derived that may include samples and their locations from the first subblock 421 along with samples and locations of the default neighboring block(s) 420. This is illustrated in Figure 4b, where the available samples from the first predicted area 425 used for predicting the second prediction area 422 are referred to with reference number 430.
  • the block 500 is divided into two subblocks 521, 522.
  • the first subblock (the first prediction area) 521 is predicted based on the first model which is derived from the neighboring available samples 520 (sample values and locations).
  • samples from the first subblock 525 are available.
  • the values and locations of these available samples 530 from the first subblock 525 are used for predicting the second subblock 522.
  • a separate model is derived for this subblock that may also include the samples and their locations 530 from the first subblock 525 along with the default neighboring samples 520.
  • Figure 6 illustrates an example where a block 600 is divided into four subblocks 621, 622, 623, 624.
  • the division into subblocks 621, 622, 623, 624 may be based on a predefined structure, e.g., based on the size of the block, or it can be based on a separate learning from the neighboring samples. In other words, a learning operation or a simple neighboring check may be applied in order to decide the subblock partitioning based on the sample values in the neighbors. According to an alternative embodiment, the subblock partitioning may be inherited from the neighboring block(s).
  • the current prediction area 621 is predicted based on the first model, which is derived from the neighboring available samples 620, comprising both the sample values and their (x, y) locations.
  • the current prediction area 622 also utilizes the sample/pixel values and the (x, y) locations of the area that has been predicted in Figure 6a, i.e. predicted area 621, which are available for modeling the current prediction area 622.
  • both the first and the second areas 621, 622 have been predicted, and the sample/pixel values and the (x, y) locations of the predicted areas are thus also available for modeling the current prediction area 623.
  • the sample/pixel values and the (x, y) locations of each predicted area 621, 622, 623 are available for modeling the current prediction area 624; a generalized sketch of this progressive scheme is given below.
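A generalized sketch of the progressive subblock scheme of Figures 4-6, reusing the linear least-squares model of the earlier sketch (the raster scan order, the equal split into sub x sub subblocks and the model form are illustrative assumptions):

```python
import numpy as np

def predict_in_subblocks(neigh_xy, neigh_vals, block_h, block_w, sub=2):
    """Predict a block subblock by subblock; for each subblock a fresh
    P(x, y) ~ a*x + b*y + c model is fitted on the default neighboring
    samples plus every sample predicted so far."""
    pred = np.zeros((block_h, block_w))
    xs_n, ys_n = list(neigh_xy[:, 0]), list(neigh_xy[:, 1])
    vals = list(neigh_vals)
    sh, sw = block_h // sub, block_w // sub
    for by in range(sub):
        for bx in range(sub):
            a = np.column_stack([xs_n, ys_n, np.ones(len(vals))])
            p, *_ = np.linalg.lstsq(a, np.asarray(vals), rcond=None)
            ys, xs = np.mgrid[by * sh:(by + 1) * sh, bx * sw:(bx + 1) * sw]
            area = p[0] * xs + p[1] * ys + p[2]
            pred[by * sh:(by + 1) * sh, bx * sw:(bx + 1) * sw] = area
            # the just-predicted area becomes training data for later areas
            xs_n += list(xs.ravel()); ys_n += list(ys.ravel())
            vals += list(area.ravel())
    return pred
```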
  • the partitioning may result in partitions having different sizes.
  • the partitioning may be done in an unequal way based on the block properties, e.g., texture and size of the current and neighboring blocks, etc.
  • the method according to the present embodiments may be used for predicting the non-available areas from the right and/or bottom sides of the prediction block by using the sample values and locations of available samples.
  • the predicted samples for the non-available areas can then be used, for example, as follows:
  • the first prediction is based on the earlier prediction model that is derived based on the available samples (from the left and/or above sides).
  • the second prediction is based on a second prediction model that is derived from the samples of the predicted non-available areas (the second prediction may also include the samples which are available by default).
  • the final prediction may use either or both of these predictions with a certain weighting between them. Higher weights may be applied to the first prediction, since it uses the default available samples rather than the predicted ones (which may not be very accurate).
  • the non-available area prediction is not limited to the right and bottom side areas.
  • when the block lies on a picture boundary, samples from one or more of its sides (left, above-left, above, or all of them) may be non-available.
  • these non-available samples are padded from the available ones. For example, if the above samples are not available but the left-side samples are available, then the left samples are projected or padded to the above side for having a better prediction.
  • if none of the sides are available (e.g., the block is located in the top-left boundary area of the picture), a fixed value may be used for the non-available areas.
  • otherwise, these areas can be predicted from the samples that are available from at least one of the sides for the final prediction. A minimal padding sketch is given below.
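A minimal sketch of these fallback rules; repeating the nearest available left sample across the missing above row, and a mid-range constant when no side is available, are illustrative choices:

```python
import numpy as np

def reference_above(left_col=None, width=8, bitdepth=8):
    """Build the above reference row when it is unavailable: project (pad)
    from the left column if that side exists, otherwise fall back to a
    fixed mid-range value (e.g. 128 for 8-bit content)."""
    if left_col is not None:
        return np.full(width, left_col[0])      # pad from the left side
    return np.full(width, 1 << (bitdepth - 1))  # fixed-value fallback
```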
  • the derived prediction model (consisting of its weights and/or coefficients) of each block may be used for intra prediction of the neighboring blocks.
  • a shared memory may be defined where these models are stored for a certain region in an image.
  • a model can be derived based on the neighboring samples (i.e. sample values and locations); moreover, the available models from the shared memory can be used for having a more efficient prediction.
  • the shared models from the previous blocks may be scaled or tuned before using them for the current block. The scaling or tuning may be done, for example, based on the distance and/or size of the current block relative to the shared model's block.
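A minimal sketch of such a shared model memory; keying models by block position and down-weighting them by Manhattan distance are illustrative assumptions, since the scaling/tuning rule is left open:

```python
class ModelMemory:
    """Shared storage of derived prediction models for a picture region."""

    def __init__(self):
        self._models = {}                 # (x, y) of a block -> parameters

    def store(self, block_pos, params):
        self._models[block_pos] = params

    def fetch_scaled(self, cur_pos):
        """Return (confidence, params) pairs, trusting nearer blocks more."""
        out = []
        for pos, params in self._models.items():
            dist = abs(pos[0] - cur_pos[0]) + abs(pos[1] - cur_pos[1])
            out.append((1.0 / (1 + dist), params))
        return out
```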
  • the model derivation may make use of the neighboring blocks' prediction modes for deciding which side of the neighboring samples can be used for model derivation. This means that the direction of the samples, and also the weighting for the final model parameters, can be inferred from the neighboring block's prediction mode.
  • the present embodiments can also be used in a cross-component scheme.
  • the samples of a block in a certain sample component (e.g., of a YUV or YCbCr representation) can be predicted using a model derived from the samples and locations of another sample component; for example, the chroma components (Cb, Cr) can be predicted from the corresponding luma samples.
  • Cross-chroma prediction can also be used, i.e., Cb from Cr or vice versa.
  • Another approach may be to derive the Luma prediction model from one or more of the chroma (Cb, Cr) corresponding samples. In these cases, sample modification or scaling may be applied to the samples of the channel that are used for prediction model derivation.
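A minimal sketch of the cross-component case, assuming a single linear model chroma ≈ alpha * luma + beta fitted on neighboring reconstructed sample pairs (this echoes CCLM-style tools but is only one possible instantiation here; the luma samples are assumed to be already downsampled to the chroma grid):

```python
import numpy as np

def cross_component_prediction(neigh_luma, neigh_chroma, luma_block):
    """Fit chroma ~ alpha * luma + beta on neighboring reconstructed pairs
    and predict the chroma block from the co-located luma samples."""
    a = np.column_stack([neigh_luma, np.ones(len(neigh_luma))])
    (alpha, beta), *_ = np.linalg.lstsq(a, neigh_chroma, rcond=None)
    return alpha * luma_block + beta
```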
  • the present embodiment may be used jointly with other existing prediction models.
  • a first prediction of a block can be done based on one or more of the existing prediction models;
  • a second prediction can be done based on the method according to the present embodiments.
  • the final prediction of a sample can be done by joint prediction as shown below:
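The joint-prediction formula referenced above did not survive extraction; a plausible weighted combination, stated only as an assumption, would be:

```latex
\hat{P}_{\mathrm{final}}(x, y) = w_1\,\hat{P}_{\mathrm{existing}}(x, y)
                               + w_2\,\hat{P}_{\mathrm{model}}(x, y),
\qquad w_1 + w_2 = 1
```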
  • receiving is understood here to mean that the information is read from a memory or received over a communications connection.
  • An apparatus comprises means for receiving a source picture; means for partitioning the source picture into a set of non-overlapping blocks; for a block, means for obtaining at least one sample/pixel value and its location from at least one neighboring block; means for deriving a prediction model for the block that relates the neighboring sample(s)/pixel(s) value(s) to their corresponding locations; means for predicting a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; means for deriving at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the prediction area being previously predicted; and means for predicting a sample value of the sample of the at least one other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.
  • the means comprises at least one processor, and a memory including a computer program code, wherein the processor may further comprise processor circuitry.
  • the memory and the computer program code are configured to, with the at least one processor, cause the apparatus to perform the method of Figure 8 or Figure 9 according to various embodiments.
  • An apparatus according to an embodiment is illustrated in Figure 10.
  • An apparatus of this embodiment is a camera having multiple lenses and imaging sensors, but also other types of cameras may be used to capture wide view images and/or wide view video.
  • wide view image and wide view video mean an image and a video, respectively, which comprise visual information having a relatively large viewing angle, larger than 100 degrees.
  • a so-called 360 panorama image/video, as well as images/videos captured by using a fisheye lens, may also be called a wide view image/video in this specification.
  • the wide view image/video may mean an image/video in which some kind of projection distortion may occur when a direction of view changes between successive images or frames of the video so that a transform may be needed to find out co-located samples from a reference image or a reference frame. This will be described in more detail later in this specification.
  • the camera 2700 of Figure 10 comprises two or more camera units 2701 and is capable of capturing wide view images and/or wide view video.
  • Each camera unit 2701 is located at a different location in the multi-camera system and may have a different orientation with respect to the other camera units 2701.
  • the camera units 2701 may have an omnidirectional constellation so that the camera has a 360-degree viewing angle in 3D space. In other words, such a camera 2700 may be able to see each direction of a scene so that each spot of the scene around the camera 2700 can be viewed by at least one camera unit 2701.
  • the camera 2700 of Figure 10 may also comprise a processor 2704 for controlling the operations of the camera 2700. There may also be a memory 2706 for storing data and computer code to be executed by the processor 2704, and a transceiver 2708 for communicating with, for example, a communication network and/or other devices in a wireless and/or wired manner.
  • the camera 2700 may further comprise a user interface (UI) 2710 for displaying information to the user, for generating audible signals and/or for receiving user input.
  • UI user interface
  • the camera 2700 need not comprise each feature mentioned above or may comprise other features as well. For example, there may be electric and/or mechanical elements for adjusting and/or controlling optics of the camera units 2701 (not shown).
  • Figure 10 also illustrates some operational elements which may be implemented, for example, as a computer code in the software of the processor, in a hardware, or both.
  • a focus control element 2714 may perform operations related to adjustment of the optical system of camera unit or units to obtain focus meeting target specifications or some other predetermined criteria.
  • An optics adjustment element 2716 may perform movements of the optical system or one or more parts of it according to instructions provided by the focus control element 2714. It should be noted here that the actual adjustment of the optical system need not be performed by the apparatus; it may be performed manually, wherein the focus control element 2714 may provide information for the user interface 2710 to indicate to a user of the device how to adjust the optical system.
  • Said operational characteristics are defined through configuration by said computer based on the type of said processor, wherein a system is connectable to said processor by a bus, wherein a programmable operational characteristic of the system comprises receiving a source picture; partitioning the source picture into a set of non-overlapping blocks; for a block, obtaining at least one sample/pixel value and its location from at least one neighboring block; deriving a prediction model for the block that relates the sample(s) of neighboring block(s) to the corresponding location(s) of the neighboring block(s); predicting a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; deriving at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the prediction area being previously predicted; and predicting a sample value of the sample of the at least one other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments relate to a method and to technical equipment, the method comprising: receiving a source picture; partitioning the source picture into a set of non-overlapping blocks; for a block, obtaining at least one sample/pixel value and its location from at least one neighboring block; deriving a prediction model for the block that relates the sample(s) of one or more neighboring blocks to the corresponding location(s) of the neighboring block(s); predicting a sample value of the sample of a first prediction area based on the derived prediction model and the sample location inside the first prediction area; deriving at least one other prediction model for at least one other prediction area using the predicted samples and their locations from the previously predicted prediction area; and predicting a sample value of the sample of said other prediction area based on said at least one other prediction model and the sample location inside said at least one other prediction area.
EP20810687.2A 2019-05-22 2020-05-15 Method, apparatus and computer program product for video encoding and decoding Pending EP3973709A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI20195421 2019-05-22
PCT/FI2020/050327 WO2020234512A2 (fr) 2019-05-22 2020-05-15 Method, apparatus and computer program product for video encoding and decoding

Publications (2)

Publication Number Publication Date
EP3973709A2 true EP3973709A2 (fr) 2022-03-30
EP3973709A4 EP3973709A4 (fr) 2023-02-01

Family

ID=73459030

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20810687.2A Pending EP3973709A4 (fr) 2019-05-22 2020-05-15 Method, apparatus and computer program product for video encoding and decoding

Country Status (2)

Country Link
EP (1) EP3973709A4 (fr)
WO (1) WO2020234512A2 (fr)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180213224A1 (en) * 2015-07-20 2018-07-26 Lg Electronics Inc. Intra prediction method and device in video coding system
US10652575B2 (en) * 2016-09-15 2020-05-12 Qualcomm Incorporated Linear model chroma intra prediction for video coding

Also Published As

Publication number Publication date
EP3973709A4 (fr) 2023-02-01
WO2020234512A3 (fr) 2021-03-04
WO2020234512A2 (fr) 2020-11-26

Similar Documents

Publication Publication Date Title
US20210297697A1 (en) Method, an apparatus and a computer program product for coding a 360-degree panoramic video
WO2017158236A2 (fr) Method, apparatus and computer program product for coding 360-degree panoramic images and video
JP7366901B2 (ja) Video decoding method and apparatus based on inter prediction in a video coding system
US11496734B2 (en) Cross component filtering-based image coding apparatus and method
KR102062821B1 (ko) Method and apparatus for image encoding/decoding using prediction of filter information
CA3132463A1 (fr) Method and apparatus of video coding with prediction refinement with optical flow (PROF) for affine prediction
US11936875B2 (en) In-loop filtering-based image coding apparatus and method
CN111801944B (zh) Video picture encoder and decoder, and corresponding motion information encoding method
US20220337823A1 (en) Cross-component adaptive loop filtering-based image coding apparatus and method
GB2509563A (en) Encoding or decoding a scalable video sequence using inferred SAO parameters
US11595697B2 (en) Adaptive loop filtering-based image coding apparatus and method
KR20130002242A (ko) Method for encoding and decoding image information
WO2021195546A1 (fr) Methods for signaling video coding data
CN115396666A (zh) Parameter update of neural-network-based filtering
CN114303375A (zh) Video decoding method using bi-directional prediction and device therefor
WO2019158812A1 (fr) Method and apparatus for motion compensation
EP3987808A1 (fr) Method, apparatus and computer program product for video encoding and decoding
CN115988202B (zh) Device and method for intra prediction
WO2020234512A2 (fr) Method, apparatus and computer program product for video encoding and decoding
WO2020008107A1 (fr) Method, apparatus and computer program product for video encoding and decoding
US11991362B2 (en) Method for coding image on basis of deblocking filtering, and apparatus therefor
CN118118669A (zh) Image coding device and method based on adaptive loop filtering
CN118118668A (zh) Image coding device and method based on adaptive loop filtering
CN115152214A (zh) Image encoding device and method based on picture partitioning

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20211222

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Free format text: PREVIOUS MAIN CLASS: H04N0019593000

Ipc: H04N0019110000

A4 Supplementary search report drawn up and despatched

Effective date: 20230105

RIC1 Information provided on ipc code assigned before grant

Ipc: H04N 19/423 20140101ALN20221223BHEP

Ipc: H04N 19/593 20140101ALI20221223BHEP

Ipc: H04N 19/182 20140101ALI20221223BHEP

Ipc: H04N 19/186 20140101ALI20221223BHEP

Ipc: H04N 19/176 20140101ALI20221223BHEP

Ipc: H04N 19/14 20140101ALI20221223BHEP

Ipc: H04N 19/11 20140101AFI20221223BHEP