WO2017054630A1 - Method and apparatus for image prediction - Google Patents
Method and apparatus for image prediction
- Publication number
- WO2017054630A1 (application PCT/CN2016/098464)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- mode
- affine
- image
- prediction mode
- processed
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/537—Motion estimation other than block-based
- H04N19/54—Motion estimation other than block-based using feature points or meshes
Definitions
- the present invention relates to the field of video coding and compression, and more particularly to a method and apparatus for image prediction.
- Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like.
- Digital video devices implement video compression techniques, such as those described in the MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC) standards.
- Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
- Video compression techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences.
- A video slice (i.e., a video frame or a portion of a video frame) may be partitioned into a number of video blocks, which may also be referred to as tree blocks, coding units (CUs), and/or coding nodes.
- Spatial prediction is used to encode video blocks in intra-coded (I) slices of a picture relative to reference samples in neighboring blocks in the same picture.
- a video block in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures.
- a picture may be referred to as a frame, and a reference picture may be referred to as a reference frame.
- Spatial or temporal prediction produces a predictive block of the block to be coded.
- The residual data represents the pixel differences between the original block to be coded and the predictive block.
- the inter-coded block is encoded according to a motion vector that points to a reference sample block that forms a predictive block and residual data that indicates a difference between the coded block and the predictive block.
- the intra-coded block is encoded according to an intra coding mode and residual data.
- the residual data may be transformed from a pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized.
- the quantized transform coefficients initially arranged in a two-dimensional array can be sequentially scanned to produce a one-dimensional vector of transform coefficients, and entropy coding can be applied to achieve more compression.
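The scan step described above can be illustrated with a small sketch (not part of the patent; purely illustrative) that serialises a 2-D coefficient array along anti-diagonals in the zig-zag order commonly used before entropy coding:

```python
def zigzag_scan(block):
    """Serialise a 2-D coefficient array into a 1-D list along
    anti-diagonals, alternating direction (a common pre-entropy-coding
    scan order; shown here for illustration only)."""
    rows, cols = len(block), len(block[0])
    order = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        # Primary key: which anti-diagonal (r + c).
        # Secondary key: odd diagonals run top-right -> bottom-left
        # (row ascending), even diagonals run bottom-left -> top-right
        # (column ascending).
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]
```

Low-frequency (typically larger) coefficients come first, clustering the trailing zeros that entropy coding compresses well.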
- the present invention describes an image prediction method for improving coding efficiency.
- The prediction mode of the image unit to be processed is derived from the prediction information or unit size of neighboring image units of the image unit to be processed, or from a prediction mode candidate set indicated at the region level. Because a priori information is provided for coding the prediction mode, the bit rate of coding the prediction mode is reduced and coding efficiency is improved.
- A prediction image decoding method includes: determining, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; parsing first indication information in the code stream; determining, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
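The claimed decoding flow can be sketched as follows. This is an illustrative reading only, not the patent's normative implementation; the helper names, mode labels, and the representation of the code stream as a list of already-parsed symbols are assumptions:

```python
# Illustrative sketch of the claimed decoding flow (not the patent's
# normative procedure; names and data shapes are hypothetical).

def decode_prediction_mode(neighbors, parsed_symbols):
    """Build the candidate set from neighbour information, then use the
    first indication information to select a mode from that set."""
    # Step 1: the affine merge mode enters the candidate set only if at
    # least one adjacent image unit was itself predicted with an affine model.
    candidate_set = ["translational_merge"]
    if any(n["mode"].startswith("affine") for n in neighbors):
        candidate_set.append("affine_merge")
    # Step 2: parse the first indication information (here: an index).
    index = parsed_symbols.pop(0)
    # Step 3: the parsed index selects the prediction mode.
    return candidate_set[index]
```

For example, with one affine neighbour the set has two entries and an index of 1 selects the affine merge mode; with no affine neighbour the set contains only the translational mode.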
- The adjacent image units of the image unit to be processed include at least the image units located above, to the left, above-right, below-left, and above-left of the image unit to be processed.
- Whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode is determined, according to information of the adjacent image units adjacent to the image unit to be processed, in any of the following implementation manners:
- The first implementation manner includes: when the prediction mode of at least one of the adjacent image units is to obtain a predicted image using an affine model, the second indication information in the code stream is parsed; when the second indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The second implementation manner includes: when the prediction mode of at least one of the adjacent image units is to obtain a predicted image using an affine model, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
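The first and second implementation manners share the same neighbour condition and differ only in whether a confirming flag is parsed once that condition holds. A minimal sketch (helper name and mode labels are hypothetical, not the patent's syntax):

```python
def candidate_set_contains_affine_merge(neighbor_modes, second_indication=None):
    """Gate the affine merge mode on neighbour affine usage.

    second_indication=None models the second implementation manner (no
    flag); an integer models the first manner's parsed flag.
    """
    any_affine = any(m == "affine" for m in neighbor_modes)
    if not any_affine:
        # 'Otherwise' branch of both manners: no affine merge mode.
        return False
    if second_indication is None:
        # Second implementation manner: condition alone decides.
        return True
    # First implementation manner: the parsed flag decides.
    return second_indication == 1
```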
- A third implementation manner includes: the prediction modes include at least a first affine mode, which obtains the predicted image using a first affine model, and a second affine mode, which obtains the predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode, which merges the first affine mode, and a second affine merge mode, which merges the second affine mode. Determining whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode then includes: when the first affine mode is the most frequent among the prediction modes of the neighboring prediction units, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when the most frequent prediction mode of the neighboring prediction units is not an affine mode and the first affine mode is the second most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; and when the most frequent prediction mode is not an affine mode and the second affine mode is the second most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
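The mode-counting rule of the third implementation manner can be sketched as follows. This is illustrative only; the mode labels are placeholders, and tie-breaking by `Counter` insertion order is an assumption, not the patent's normative procedure:

```python
from collections import Counter

AFFINE_MODES = ("affine_1", "affine_2")  # hypothetical labels for the two affine modes

def select_affine_merge_mode(neighbor_modes):
    """Pick which affine merge mode (if any) enters the candidate set,
    based on the frequency ranking of neighbouring prediction modes."""
    if not neighbor_modes:
        return None
    counts = Counter(neighbor_modes)
    ranked = [mode for mode, _ in counts.most_common()]
    # Most frequent neighbour mode is affine: the matching merge mode is used.
    if ranked[0] in AFFINE_MODES:
        return ranked[0] + "_merge"
    # Otherwise fall back to the second most frequent, if it is affine.
    if len(ranked) > 1 and ranked[1] in AFFINE_MODES:
        return ranked[1] + "_merge"
    # No affine mode ranks first or second: no affine merge mode.
    return None
```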
- A fourth implementation manner includes: when the prediction mode of at least one of the adjacent image units is to obtain a predicted image using an affine model, and the width and height of that adjacent image unit are respectively smaller than the width and height of the image unit to be processed, the third indication information in the code stream is parsed; when the third indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- A fifth implementation manner includes: when the prediction mode of at least one of the adjacent image units is to obtain a predicted image using an affine model, and the width and height of that adjacent image unit are respectively smaller than the width and height of the image unit to be processed, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
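The size-gated fourth and fifth implementation manners share the same neighbour condition; only the fourth additionally parses a confirming flag. A sketch under assumed data shapes (the dictionary fields and helper name are hypothetical):

```python
def affine_merge_allowed(neighbors, cur_w, cur_h, third_indication=None):
    """Fourth/fifth implementation manners: a neighbour must be
    affine-predicted AND strictly smaller than the current unit in both
    width and height; the fourth manner additionally checks a parsed flag."""
    gate = any(
        n["affine"] and n["w"] < cur_w and n["h"] < cur_h
        for n in neighbors
    )
    if not gate:
        return False
    if third_indication is None:
        # Fifth implementation manner: condition alone decides.
        return True
    # Fourth implementation manner: the parsed flag decides.
    return third_indication == 1
```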
- A prediction image decoding method includes: parsing first indication information in the code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of the image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
- The first to-be-processed image region includes one of an image frame group, an image frame, an image slice set, an image slice, an image coding unit set, and an image coding unit.
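The region-level signalling above can be sketched as follows (the mode names are illustrative placeholders, not the patent's syntax elements):

```python
def region_candidate_sets(first_indication):
    """Region-level signalling: indication 0 keeps only the translational
    modes for the whole region; indication 1 adds the affine modes."""
    translational = {"skip", "merge", "inter_translational"}  # hypothetical labels
    affine = {"affine_merge", "affine_inter"}                 # hypothetical labels
    if first_indication == 0:
        return translational
    return translational | affine
```

Every image unit inside the region then selects its prediction mode (via the second indication information) only from the set chosen here.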
- A predictive image decoding method includes: determining, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; parsing first indication information in the code stream; determining, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
- A predictive image encoding method includes: determining, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; determining, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determining, according to the prediction mode, a predicted image of the image unit to be processed; and encoding first indication information into the code stream, the first indication information indicating the prediction mode.
- A prediction image decoding method includes: parsing first indication information in the code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of the image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; and determining, according to the prediction mode, the predicted image of the image unit to be processed.
- A predictive image encoding method includes: when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, setting first indication information to 0 and encoding the first indication information into the code stream, the translational mode representing a prediction mode that obtains a predicted image using a translational model; when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, setting the first indication information to 1 and encoding the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; determining, from the candidate mode set of the first to-be-processed image region, a prediction mode of the image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; determining, according to the prediction mode, a predicted image of the image unit to be processed; and encoding second indication information into the code stream, the second indication information representing the prediction mode.
- A predictive image decoding apparatus includes: a first determining module, configured to determine, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; a parsing module, configured to parse first indication information in the code stream; a second determining module, configured to determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed.
- A predictive image encoding apparatus includes: a first determining module, configured to determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; a second determining module, configured to determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed; and an encoding module, configured to encode first indication information into the code stream, the first indication information representing the prediction mode.
- A predictive image decoding apparatus includes: a first parsing module, configured to parse first indication information in the code stream; a first determining module, configured to determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; a second parsing module, configured to parse second indication information in the code stream; a second determining module, configured to determine, according to the second indication information, a prediction mode of the image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed.
- A predictive image encoding apparatus includes: a first encoding module, configured to set first indication information to 0 and encode the first indication information into the code stream when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and to set the first indication information to 1 and encode the first indication information into the code stream when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; a first determining module, configured to determine, from the candidate mode set of the first to-be-processed image region, a prediction mode of the image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; a second determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed; and a second encoding module, configured to encode second indication information into the code stream, the second indication information representing the prediction mode.
- An apparatus for decoding video data, the apparatus comprising a video decoder configured to: determine, according to information of neighboring image units adjacent to an image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; parse first indication information in the code stream; determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
- An apparatus for encoding video data, the apparatus comprising a video encoder configured to: determine, according to information of neighboring image units adjacent to an image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode first indication information into the code stream, the first indication information indicating the prediction mode.
- An apparatus for decoding video data, the apparatus comprising a video decoder configured to: parse first indication information in the code stream; determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parse second indication information in the code stream; determine, according to the second indication information, a prediction mode of the image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
- An apparatus for encoding video data, the apparatus comprising a video encoder configured to: set first indication information to 0 when the candidate mode set of a first to-be-processed image region adopts a candidate translational mode set, and encode the first indication information into the code stream, the translational mode representing a prediction mode that obtains a predicted image using a translational model; set the first indication information to 1 when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, and encode the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; determine, from the candidate mode set of the first to-be-processed image region, a prediction mode of the image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode second indication information into the code stream, the second indication information indicating the prediction mode.
- A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device that decodes video data to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; parse first indication information in the code stream; determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
- A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device that encodes video data to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, the affine merge mode indicating that the image unit to be processed and the adjacent image units of the image unit to be processed obtain their respective predicted images using the same affine model; determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode first indication information into the code stream, the first indication information representing the prediction mode.
- A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device that decodes video data to: parse first indication information in the code stream; determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parse second indication information in the code stream; determine, according to the second indication information, a prediction mode of the image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
- a computer readable storage medium storing instructions that, when executed by one or more processors of a device that encodes video data, cause the device to: when the candidate mode set of the first to-be-processed image region is the candidate translation mode set, set the first indication information to 0 and encode the first indication information into the code stream, the translation mode representing a prediction mode that obtains a predicted image using a translational model; when the candidate mode set of the first to-be-processed image region is the candidate translation mode set together with the candidate affine mode set, set the first indication information to 1 and encode the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; determine, from the candidate prediction mode set of the first to-be-processed image region, a prediction mode of the image unit to be processed, the image unit to be processed belonging to the first to-be-processed image region; determine a predicted image of the image unit to be processed according to the prediction mode; and encode second indication information into the code stream, the second indication information representing the prediction mode.
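As an illustrative sketch only (not a normative implementation of the claims), the first/second indication logic described above could be expressed as follows; the mode names used here are hypothetical placeholders, and the assumption that the second indication information indexes into the candidate set is an illustrative simplification:

```python
def candidate_mode_set(first_indication, translation_modes, affine_modes):
    """Select the candidate prediction mode set for a region:
    0 -> translation modes only, 1 -> translation plus affine modes."""
    if first_indication == 0:
        return list(translation_modes)
    if first_indication == 1:
        return list(translation_modes) + list(affine_modes)
    raise ValueError("first indication information must be 0 or 1")

def select_prediction_mode(second_indication, modes):
    """Illustrative assumption: the second indication information
    indexes into the candidate mode set."""
    return modes[second_indication]

# Hypothetical mode names for illustration:
translation = ["AMVP", "Merge"]
affine = ["Affine-AMVP", "Affine-Merge"]

modes = candidate_mode_set(1, translation, affine)
print(select_prediction_mode(3, modes))  # -> Affine-Merge
```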
- FIG. 1 is a schematic block diagram of a video decoding system in accordance with an embodiment of the present invention.
- FIG. 2 is a schematic block diagram of a video encoder in accordance with an embodiment of the present invention.
- FIG. 3 is a schematic flowchart of an example operation of a video encoder according to an embodiment of the present invention.
- FIG. 4 is a schematic diagram showing the positions of a to-be-processed block and its adjacent reconstructed blocks according to an embodiment of the present invention.
- FIG. 5 is a schematic block diagram of another video encoder according to an embodiment of the present invention.
- FIG. 6 is a schematic flowchart of another example operation of a video encoder according to an embodiment of the present invention.
- FIG. 7 is a schematic block diagram of still another video encoder in accordance with an embodiment of the present invention.
- FIG. 8 is a schematic block diagram of a video decoder according to an embodiment of the present invention.
- FIG. 9 is a schematic flowchart of an example operation of a video decoder according to an embodiment of the present invention.
- FIG. 10 is a schematic block diagram of another video decoder according to an embodiment of the present invention.
- FIG. 11 is a schematic flowchart of another example operation of a video decoder according to an embodiment of the present invention.
- FIG. 12 is a schematic block diagram of still another video decoder in accordance with an embodiment of the present invention.
- Motion compensation is one of the most critical techniques for improving compression efficiency in video coding.
- Traditional block-matching-based motion compensation is widely applied in video encoders, especially in video coding standards.
- an inter prediction block adopts a translational motion model, which assumes that motion vectors at all pixel positions in one block are equal. However, this assumption does not hold in many cases.
- the motion of objects in real-life video is often a complex combination of motions such as translation, rotation, and scaling. If such a complex motion is contained in one pixel block, the prediction signal obtained by the conventional block matching-based motion compensation method is not accurate enough, so the inter-frame correlation cannot be sufficiently removed.
- a high-order motion model is introduced into the motion compensation of video coding. Compared with the translational motion model, a high-order motion model has more degrees of freedom, allowing the motion vector of each pixel in an inter-predicted block to differ; that is, the motion vector field generated by the high-order motion model has better precision.
- the affine motion model based on a control-point description is a representative high-order motion model. Unlike the traditional translational motion model, the motion vector of each pixel in the block is related to its location: it is a first-order linear function of the coordinate position.
- the affine motion model allows the reference block to undergo a warp transformation such as rotation, scaling, etc., and a more accurate prediction block can be obtained in motion compensation.
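For illustration only, the per-pixel motion vector of the widely used two-control-point (four-parameter) affine model can be sketched as below; as described above, the motion vector components are first-order linear functions of the pixel coordinates:

```python
def affine_mv(x, y, v0, v1, w):
    """Per-pixel motion vector under the two-control-point (four-parameter)
    affine model. v0 is the motion vector of the top-left control point,
    v1 that of the top-right control point, and w is the block width.
    vx and vy are first-order linear in (x, y)."""
    a = (v1[0] - v0[0]) / w   # scaling component
    b = (v1[1] - v0[1]) / w   # rotational component
    vx = a * x - b * y + v0[0]
    vy = b * x + a * y + v0[1]
    return vx, vy

# Pure translation: both control points share the same MV,
# so every pixel gets that same MV.
print(affine_mv(5, 7, (2.0, -1.0), (2.0, -1.0), 8))  # (2.0, -1.0)
```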
- an inter prediction type in which the prediction block is obtained by motion compensation using the affine motion model is generally referred to as an affine mode.
- the inter prediction type includes two modes: advanced motion vector prediction (AMVP) and merge (Merge).
- AMVP needs to explicitly pass the prediction direction of each coding block.
- the Merge mode directly derives the motion information of the current coded block by using the motion vector of the neighboring block.
- combining the affine mode with the AMVP and Merge inter-frame prediction methods based on the translational motion model can form new affine-motion-model-based inter-frame prediction modes, such as affine AMVP and affine Merge.
- the Merge mode based on the affine motion model, called the affine merge mode (Affine Merge), may be used.
- affine merge mode Affine Merge
- the new prediction modes and the prediction modes existing in the current standard participate in a "performance-cost ratio" comparison process; the optimal mode is selected as the prediction mode and is used to generate the predicted image of the block to be processed.
- the result of the prediction mode selection is encoded and transmitted to the decoder.
- the affine mode can improve the accuracy of the prediction block and improve coding efficiency.
- However, encoding the motion information of each control point requires a higher code rate.
- In addition, the code rate used to encode the prediction mode selection result also increases because the number of candidate prediction modes increases.
- the code stream is parsed, and indication information is used to determine whether a region uses a candidate prediction mode set that includes the affine mode; the prediction mode is then determined according to the candidate prediction mode set and additional received indication information, and the predicted image is generated.
- the prediction mode information or size information of the image units adjacent to the image unit to be processed may be used as prior knowledge for encoding the prediction information of the to-be-processed block, and the indication information formed from the candidate prediction mode set of the region may also serve as prior knowledge for the prediction information of the to-be-processed block. This prior knowledge guides the coding of the prediction mode, saves the code rate of encoding the mode selection information, and improves coding efficiency.
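A minimal sketch of the neighbour-driven decision described above, assuming hypothetical neighbour records with a `mode` field; the rule shown (include the affine merge candidate only when some adjacent reconstructed block itself used an affine mode) is one illustrative use of such prior knowledge:

```python
def includes_affine_merge(neighbors):
    """Decide whether the candidate prediction mode set of the block to be
    processed includes the affine merge mode, based on the prediction mode
    information of adjacent reconstructed blocks. `neighbors` is a list of
    hypothetical records with a `mode` field."""
    return any(n["mode"].startswith("affine") for n in neighbors)

neighbors = [{"mode": "translational_merge"}, {"mode": "affine_merge"}]
print(includes_affine_merge(neighbors))  # True
```

When no neighbour used an affine mode, no bit needs to be spent distinguishing the affine merge candidate, which is the code-rate saving the text refers to.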
- the affine model is a collective term for a non-translational motion model.
- Actual motions, including rotation, scaling, deformation, perspective, and the like, can be used for motion estimation and motion compensation in inter-frame prediction by establishing different motion models.
- For brevity, they are simply referred to as the first affine model and the second affine model, respectively.
- FIG. 1 is a schematic block diagram of a video coding system 10 in accordance with an embodiment of the present invention.
- video coder generally refers to both a video encoder and a video decoder.
- "video coding" or "coding" may generally refer to video encoding or video decoding.
- video coding system 10 includes source device 12 and destination device 14.
- Source device 12 produces encoded video data.
- source device 12 may be referred to as a video encoding device or a video encoding apparatus.
- Destination device 14 may decode the encoded video data produced by source device 12.
- destination device 14 may be referred to as a video decoding device or a video decoding apparatus.
- Source device 12 and destination device 14 may be examples of video coding devices or video coding apparatuses.
- Source device 12 and destination device 14 may include a wide range of devices, including desktop computers, mobile computing devices, notebook (eg, laptop) computers, tablet computers, set top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
- Channel 16 may include one or more media and/or devices capable of moving encoded video data from source device 12 to destination device 14.
- channel 16 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
- source device 12 may modulate the encoded video data in accordance with a communication standard (eg, a wireless communication protocol) and may transmit the modulated video data to destination device 14.
- the one or more communication media can include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- RF radio frequency
- the one or more communication media may form part of a packet-based network (eg, a local area network, a wide area network, or a global network (eg, the Internet)).
- the one or more communication media can include a router, a switch, a base station, or other device that facilitates communication from the source device 12 to the destination device 14.
- channel 16 can include a storage medium that stores encoded video data generated by source device 12.
- destination device 14 can access the storage medium via disk access or card access.
- the storage medium may include a variety of locally accessible data storage media, such as Blu-ray Disc, DVD, CD-ROM, flash memory, or other suitable digital storage medium for storing encoded video data.
- channel 16 can include a file server or another intermediate storage device that stores encoded video data generated by source device 12.
- destination device 14 may access encoded video data stored at a file server or other intermediate storage device via streaming or download.
- The file server can be a type of server capable of storing encoded video data and transmitting the encoded video data to destination device 14.
- Example file servers include a web server (eg, for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, and a local disk drive.
- FTP file transfer protocol
- NAS network attached storage
- the techniques of the present invention are not limited to wireless applications or settings.
- the techniques can be applied to video coding supporting a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (eg, via the Internet), encoding of video data for storage on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
- video coding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
- source device 12 includes a video source 18, a video encoder 20, and an output interface 22.
- output interface 22 can include a modulator/demodulator (modem) and/or a transmitter.
- Video source 18 may include a video capture device (eg, a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data.
- Video encoder 20 may encode video data from video source 18.
- source device 12 transmits encoded video data directly to destination device 14 via output interface 22.
- the encoded video data may also be stored on a storage medium or file server for later access by the destination device 14 for decoding and/or playback.
- destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
- input interface 28 includes a receiver and/or a modem.
- Input interface 28 may receive encoded video data via channel 16.
- Display device 32 may be integral with destination device 14 or may be external to destination device 14. In general, display device 32 displays the decoded video data.
- Display device 32 may include a variety of display devices such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.
- LCD liquid crystal display
- OLED organic light emitting diode
- video encoder 20 and video decoder 30 may operate in accordance with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multiview video coding (MVC) extensions.
- SVC scalable video coding
- MVC multiview video coding extension
- FIG. 1 is merely an example and the techniques of the present invention are applicable to video coding settings (eg, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
- For example, data may be retrieved from local memory, streamed over a network, or handled in a similar manner.
- the encoding device may encode the data and store the data to a memory, and/or the decoding device may retrieve the data from the memory and decode the data.
- encoding and decoding may be performed by multiple devices that do not communicate with each other, but simply encode data to memory and/or retrieve data from memory and decode it.
- Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partially in software, the device may store the instructions of the software in a suitable non-transitory computer readable storage medium, and may execute the instructions in hardware using one or more processors to perform the techniques of the present invention. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be considered one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated into a combined encoder/decoder (CODEC) in a respective device.
- CODEC combined encoder/decoder
- video encoder 20 may partition the picture into a raster of coding tree blocks (CTBs).
- CTB may be referred to as a "tree block", a "largest coding unit" (LCU), or a "coding tree unit".
- LCU largest coding unit
- coding tree unit
- The CTB of HEVC can be substantially similar to macroblocks of previous standards (eg, H.264/AVC). However, a CTB is not necessarily limited to a particular size and may include one or more coding units (CUs).
- the CTB of the picture can be grouped into one or more slices.
- each of the slices contains an integer number of CTBs.
- video encoder 20 may generate an encoded representation (ie, a coded slice) of each slice of the picture.
- video encoder 20 may encode each CTB of the slice to produce an encoded representation of each of the sliced CTBs (ie, coded CTB).
- video encoder 20 may recursively perform quadtree partitioning on the block of pixels associated with the CTB to segment the block of pixels into progressively decreasing blocks of pixels.
- Each of the smaller blocks of pixels can be associated with a CU.
- a partitioned CU may be a CU that is partitioned into blocks of pixels associated with other CUs.
- An unpartitioned CU may be a CU in which a block of pixels is not partitioned into blocks of pixels associated with other CUs.
- Video encoder 20 may generate one or more prediction units (PUs) for each unpartitioned CU. Each of the PUs of the CU may be associated with a different block of pixels within a block of pixels of the CU. Video encoder 20 may generate predictive pixel blocks for each PU of the CU. The predictive pixel block of the PU can be a block of pixels.
- PUs prediction units
- Video encoder 20 may use intra prediction or inter prediction to generate predictive pixel blocks for the PU. If video encoder 20 uses intra prediction to generate a predictive pixel block for a PU, video encoder 20 may generate the predictive pixel block for the PU based on decoded pixels of the picture associated with the PU. If video encoder 20 uses inter prediction to generate a predictive pixel block for a PU, video encoder 20 may generate the predictive pixel block for the PU based on decoded pixels of one or more pictures different from the picture associated with the PU.
- Video encoder 20 may generate residual pixel blocks of the CU based on the predictive pixel blocks of the PU of the CU.
- the residual pixel block of the CU may indicate the differences between the samples in the predictive pixel blocks of the PUs of the CU and the corresponding samples in the initial pixel block of the CU.
- video encoder 20 may perform recursive quadtree partitioning on the residual pixel block of the CU to partition it into one or more smaller residual pixel blocks associated with the transform units (TUs) of the CU. Because the pixels in the pixel block associated with a TU each comprise one luma sample and two chroma samples, each of the TUs can be associated with one residual sample block of luma samples and two residual sample blocks of chroma samples.
- Video encoder 20 may generate a set of syntax elements that represent coefficients in the quantized coefficient block. Video encoder 20 may apply an entropy encoding operation (eg, a context adaptive binary arithmetic coding (CABAC) operation) to at least some of such syntax elements.
- CABAC context adaptive binary arithmetic coding
- video encoder 20 may binarize the syntax elements to form a binary string that includes a series of one or more bits (referred to as "bins"). Video encoder 20 may encode some of the bins using regular CABAC encoding, and may use bypass encoding to encode the others.
- video encoder 20 may first identify the coding context.
- the coding context can identify the probability of coding a bin having a particular value. For example, the coding context may indicate that the probability of coding a zero-valued bin is 0.7 and the probability of coding a one-valued bin is 0.3.
- video encoder 20 may divide the interval into a lower subinterval and an upper subinterval. One of the subintervals may be associated with a value of 0, and another subinterval may be associated with a value of one. The width of the subinterval may be proportional to the probability indicated by the identified coding context for the associated value.
- When video encoder 20 uses bypass encoding to encode a sequence of bins, it may be able to code several bins in a single cycle, whereas with regular CABAC encoding it may be able to code only a single bin in a cycle.
- Bypass coding can be simpler because it does not require video encoder 20 to select a context, and it allows video encoder 20 to assume that the probabilities of the two symbols (0 and 1) are each 1/2 (50%). Therefore, in bypass coding, the interval is split directly into two halves. In effect, bypass coding bypasses the context-adaptive portion of the arithmetic coding engine.
- Performing bypass coding on a bin may be less computationally intensive than performing regular CABAC coding on a bin.
- Moreover, bypass coding enables higher parallelism and throughput.
- Bins encoded using bypass coding may be referred to as "bypass-coded bins".
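The interval subdivision for regular and bypass coding described above can be sketched as follows; this is an idealised floating-point illustration, not the integer-renormalised arithmetic coder of a real codec:

```python
def regular_bin_subinterval(low, width, p_zero, bin_value):
    """One step of (idealised) arithmetic coding: split the current
    interval in proportion to the context's probability of a zero bin,
    then keep the subinterval matching the coded bin value."""
    zero_width = width * p_zero
    if bin_value == 0:
        return low, zero_width
    return low + zero_width, width - zero_width

def bypass_bin_subinterval(low, width, bin_value):
    """Bypass coding assumes p(0) = p(1) = 1/2, so the interval is
    simply halved -- no context selection is needed."""
    return regular_bin_subinterval(low, width, 0.5, bin_value)

# Context says p(0) = 0.7 (the example in the text); code a zero bin:
print(regular_bin_subinterval(0.0, 1.0, 0.7, 0))   # (0.0, 0.7)
print(bypass_bin_subinterval(0.0, 1.0, 1))         # (0.5, 0.5)
```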
- video encoder 20 may apply inverse quantization and inverse transform to the transform block to reconstruct the residual sample block from the transform block.
- Video encoder 20 may add reconstructed residual sample blocks to corresponding samples from one or more predictive sample blocks to produce reconstructed sample blocks.
- By reconstructing the sample block of each color component, video encoder 20 may reconstruct the pixel block associated with the TU. By reconstructing the pixel block of each TU of the CU in this way, video encoder 20 can reconstruct the pixel block that makes up the CU.
- video encoder 20 may perform a deblocking operation to reduce blockiness artifacts associated with the CU.
- video encoder 20 may use sample adaptive offset (SAO) to modify the reconstructed pixel blocks of the CTBs of the picture.
- SAO sample adaptive offset
- adding an offset value to a pixel in a picture can improve coding efficiency.
- video encoder 20 may store the reconstructed pixel blocks of the CUs in a decoded picture buffer for use in generating predictive pixel blocks for other CUs.
- video decoder 30 may decode the binary having the value associated with the upper subinterval. To decode the next binary of the syntax element, video decoder 30 may repeat these steps with respect to the interval that is the subinterval containing the encoded value. When video decoder 30 repeats these steps for the next binary, video decoder 30 may use the modified probability based on the probability indicated by the identified coding context and the decoded binary. Video decoder 30 may then debinarize the binary to recover the syntax elements. Debinarization may refer to selecting a syntax element value based on a mapping between a binary string and a syntax element value.
- Video decoder 30 may reconstruct a picture of the video data based on the syntax elements extracted from the bitstream. The process of reconstructing video data based on syntax elements is generally reciprocal to the program executed by video encoder 20 to generate syntax elements. For example, video decoder 30 may generate a predictive pixel block of a PU of a CU based on syntax elements associated with the CU. Additionally, video decoder 30 may inverse quantize the coefficient blocks associated with the TUs of the CU. Video decoder 30 may perform an inverse transform on the coefficient block to reconstruct a residual pixel block associated with the TU of the CU. Video decoder 30 may reconstruct a block of pixels that constitute a CU based on the predictive pixel block and the residual pixel block.
- video decoder 30 may perform a deblocking operation to reduce blockiness artifacts associated with the CU. Additionally, video decoder 30 may apply the SAO applied by video encoder 20 based on one or more SAO syntax elements. After video decoder 30 performs such operations, video decoder 30 may store the block of pixels of the CU in a decoded picture buffer. The decoded picture buffer may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on the display device.
- FIG. 2 is a block diagram illustrating an example video encoder 20 that is configured to implement the techniques of the present invention.
- Figure 2 is provided for purposes of explanation and should not be construed as limiting the techniques broadly illustrated and described in the present invention.
- for purposes of explanation, the present invention describes video encoder 20 in the context of HEVC-coded image prediction.
- the techniques of the present invention are applicable to other coding standards or methods.
- video encoder 20 includes prediction processing unit 100, residual generation unit 102, transform processing unit 104, quantization unit 106, inverse quantization unit 108, inverse transform processing unit 110, reconstruction unit 112, and filter unit 113.
- Entropy encoding unit 116 includes a regular CABAC decoding engine 118 and a bypass decoding engine 120.
- the prediction processing unit 100 includes an inter prediction processing unit 121 and an intra prediction processing unit 126.
- the inter prediction processing unit 121 includes a motion estimation unit 122 and a motion compensation unit 124.
- video encoder 20 may include more, fewer, or different functional components.
- Video encoder 20 receives the video data.
- video encoder 20 may encode each slice of each picture of the video data.
- video encoder 20 may encode each CTB in the slice.
- the prediction processing unit 100 may perform quadtree partitioning on the pixel blocks associated with the CTB to divide the pixel block into progressively smaller pixel blocks. Smaller blocks of pixels can be associated with the CU. For example, prediction processing unit 100 may partition a block of CTB into four equally sized sub-blocks, split one or more of the sub-blocks into four equally sized sub-sub-blocks, and the like.
- Video encoder 20 may encode the CU of the CTB in the picture to produce an encoded representation of the CU (ie, the coded CU). Video encoder 20 may encode the CU of the CTB according to the z-scan order. In other words, video encoder 20 may encode the CU by the upper left CU, the upper right CU, the lower left CU, and then the lower right CU. When video encoder 20 encodes a partitioned CU, video encoder 20 may encode the CU associated with the sub-block of the pixel block of the partitioned CU according to the z-scan order.
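The recursive quadtree partitioning and z-scan order described above can be sketched as follows; the `should_split` callback is a hypothetical stand-in for the encoder's actual split decision (in a real encoder it would be driven by rate-distortion cost):

```python
def quadtree_zscan(x, y, size, min_size, should_split):
    """Recursively partition a square block and yield leaf blocks in
    z-scan order: upper-left, upper-right, lower-left, lower-right."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            yield from quadtree_zscan(x + dx, y + dy, half,
                                      min_size, should_split)
    else:
        yield (x, y, size)

# Split a 64x64 CTB once; the four leaf CUs come out in z-scan order.
leaves = list(quadtree_zscan(0, 0, 64, 8, lambda x, y, s: s == 64))
print(leaves)  # [(0, 0, 32), (32, 0, 32), (0, 32, 32), (32, 32, 32)]
```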
- prediction processing unit 100 may partition the pixel blocks of the CU among one or more PUs of the CU.
- Video encoder 20 and video decoder 30 can support a variety of PU sizes. Assuming the size of a particular CU is 2N×2N, video encoder 20 and video decoder 30 may support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar for inter prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter prediction.
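For illustration, the PU partition sizes listed above can be enumerated as follows; the assumption that nU/nD/nL/nR denote 1/4 : 3/4 asymmetric splits follows common HEVC practice:

```python
def pu_partitions(n):
    """Candidate PU sizes (width, height) for a 2Nx2N CU, per the text:
    2Nx2N / NxN for intra; symmetric and asymmetric splits for inter."""
    return {
        "intra":      [(2*n, 2*n), (n, n)],
        "symmetric":  [(2*n, 2*n), (2*n, n), (n, 2*n), (n, n)],
        # Asymmetric motion partitions split one dimension at 1/4 : 3/4.
        "asymmetric": [(2*n, n//2), (2*n, 3*n//2),   # 2NxnU, 2NxnD
                       (n//2, 2*n), (3*n//2, 2*n)],  # nLx2N, nRx2N
    }

print(pu_partitions(16)["asymmetric"])  # [(32, 8), (32, 24), (8, 32), (24, 32)]
```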
- the inter prediction processing unit 121 may generate predictive data of the PU by performing inter prediction on each PU of the CU.
- the predictive data of the PU may include motion information corresponding to the predictive pixel block of the PU and the PU.
- the slice can be an I slice, a P slice, or a B slice.
- Inter prediction unit 121 may perform different operations on a PU of the CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Therefore, if the PU is in an I slice, inter prediction unit 121 does not perform inter prediction on the PU.
- When the PU is in a P slice, motion estimation unit 122 may search the reference pictures in a list of reference pictures (eg, "list 0") to find a reference block for the PU.
- the reference block of the PU may be the pixel block that most closely corresponds to the pixel block of the PU.
- Motion estimation unit 122 may generate a reference picture index indicating the position in list 0 of the reference picture containing the reference block, and a motion vector indicating a spatial displacement between the pixel block of the PU and the reference block.
- the motion estimation unit 122 may output the reference picture index and the motion vector as motion information of the PU.
- Motion compensation unit 124 may generate a predictive pixel block of the PU based on the reference block indicated by the motion information of the PU.
- When the PU is in a B slice, motion estimation unit 122 may perform uni-directional inter prediction or bi-directional inter prediction on the PU.
- When motion estimation unit 122 performs uni-directional inter prediction on the PU, it may search the reference pictures of a first reference picture list ("list 0") or a second reference picture list ("list 1") to find a reference block for the PU.
- motion estimation unit 122 may output the following as the motion information of the PU: a reference picture index indicating the position in list 0 or list 1 of the reference picture containing the reference block, a motion vector indicating the spatial displacement between the pixel block of the PU and the reference block, and a prediction direction indicator indicating whether the reference picture is in list 0 or in list 1.
- When motion estimation unit 122 performs bi-directional inter prediction on the PU, it may search the reference pictures in list 0 to find a reference block for the PU, and may also search the reference pictures in list 1 to find another reference block for the PU.
- Motion estimation unit 122 may generate reference picture indices indicating the positions in list 0 and list 1 of the reference pictures containing the reference blocks. Additionally, motion estimation unit 122 may generate motion vectors indicating the spatial displacements between the reference blocks and the pixel block of the PU.
- the motion information of the PU may include a reference picture index of the PU and a motion vector.
- Motion compensation unit 124 may generate a predictive pixel block of the PU based on the reference block indicated by the motion information of the PU.
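As a minimal sketch of how bi-directional motion compensation might combine two reference blocks, a per-sample average with rounding is shown below; this is an illustrative choice, not the exact weighted prediction of any particular standard:

```python
def bi_predict(block0, block1):
    """Bi-directional prediction (illustrative): average the two
    motion-compensated reference blocks sample by sample, with
    rounding via the +1 before the right shift."""
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
            for r0, r1 in zip(block0, block1)]

print(bi_predict([[100, 50]], [[102, 51]]))  # [[101, 51]]
```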
- Intra prediction processing unit 126 may generate predictive data for the PU by performing intra prediction on the PU.
- the predictive data of the PU may include predictive pixel blocks of the PU and various syntax elements.
- Intra prediction processing unit 126 may perform intra prediction on PUs within I slices, P slices, and B slices.
- intra-prediction processing unit 126 may use multiple intra-prediction modes to generate multiple sets of predictive data for the PU.
- intra-prediction processing unit 126 may extend samples of sample blocks from neighboring PUs across the sample block of the PU in a direction associated with the intra-prediction mode. Assuming a left-to-right, top-to-bottom coding order for PUs, CUs, and CTBs, the adjacent PU may be above the PU, at the upper right of the PU, at the upper left of the PU, or to the left of the PU.
- Intra prediction processing unit 126 may use various numbers of intra prediction modes, for example, 33 directional intra prediction modes. In some examples, the number of intra prediction modes may depend on the size of the pixel block of the PU.
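As a simple illustration of extending neighbouring samples in a prediction direction, the horizontal intra mode copies the reconstructed sample to the left of each row across the predictive block (a sketch of one directional mode, not the full HEVC angular prediction):

```python
def intra_horizontal(left_neighbors, size):
    """Illustrative directional intra prediction (horizontal mode):
    row r of the predictive block repeats the reconstructed sample
    immediately to the left of that row."""
    return [[left_neighbors[r]] * size for r in range(size)]

print(intra_horizontal([10, 20], 2))  # [[10, 10], [20, 20]]
```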
- the prediction processing unit 100 may select the predictive data of the PU of the CU from among the predictive data generated by the inter prediction processing unit 121 for the PU or the predictive data generated by the intra prediction processing unit 126 for the PU. In some examples, prediction processing unit 100 selects predictive data for the PU of the CU based on the rate/distortion metric of the set of predictive data.
- the predictive pixel block of the selected predictive data may be referred to herein as the selected predictive pixel block.
- Residual generation unit 102 may generate a residual pixel block of the CU based on the pixel block of the CU and the selected predictive pixel blocks of the PUs of the CU. For example, residual generation unit 102 may generate the residual pixel block of the CU such that each sample in the residual pixel block has a value equal to the difference between a sample in the pixel block of the CU and the corresponding sample in a selected predictive pixel block of a PU of the CU.
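The residual generation described above is an element-wise difference, which can be sketched as:

```python
def residual_block(original, predicted):
    """Residual pixel block: element-wise difference between the CU's
    original samples and the selected predictive samples."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]

orig = [[100, 102], [98, 101]]
pred = [[ 99, 103], [98, 100]]
print(residual_block(orig, pred))  # [[1, -1], [0, 1]]
```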
- the prediction processing unit 100 may perform quadtree partitioning to partition the residual pixel block of the CU into sub-blocks. Each undivided residual pixel block can be associated with a different TU of the CU. The size and location of the residual pixel block associated with the TU of the CU may or may not be based on the size and location of the pixel block of the PU of the CU.
- Transform processing unit 104 may generate a coefficient block for each TU of the CU by applying one or more transforms to the residual sample block associated with the TU.
- Transform processing unit 104 may apply various transforms to the residual sample block associated with the TU. For example, transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually similar transform to the residual sample block.
- Quantization unit 106 may quantize the coefficients in the coefficient block. The quantization procedure can reduce the bit depth associated with some or all of the coefficients. For example, an n-bit coefficient can be truncated to an m-bit coefficient during quantization, where n is greater than m. Quantization unit 106 may quantize the coefficient block associated with the TU of the CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient block associated with the CU by adjusting the QP value associated with the CU.
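As a simplified sketch of the QP-controlled quantization described above: in HEVC-style codecs the quantization step size approximately doubles for every increase of 6 in QP. The scaling and rounding below are illustrative only, not the normative procedure:

```python
# Simplified scalar quantization sketch. The step size follows the
# approximate HEVC relation step = 2**((QP - 4) / 6); the rounding
# offset here is a plain round-to-nearest, which is an assumption.

def quantize(coeff, qp):
    step = 2 ** ((qp - 4) / 6.0)                 # approximate step size
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + 0.5)   # round to nearest level

def dequantize(level, qp):
    step = 2 ** ((qp - 4) / 6.0)
    return level * step

level = quantize(100, 22)        # QP 22 -> step size 8
print(level, dequantize(level, 22))
```

A larger QP gives a larger step size, so more coefficients collapse to the same level and the bit depth needed to represent them decreases, as the paragraph above describes.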
- Inverse quantization unit 108 and inverse transform processing unit 110 may apply inverse quantization and inverse transform, respectively, to the coefficient block to reconstruct the residual sample block from the coefficient block.
- Reconstruction unit 112 may add samples of the reconstructed residual sample block to corresponding samples from the one or more predictive sample blocks generated by prediction processing unit 100 to generate the reconstructed sample block associated with the TU. By reconstructing the sample blocks for each TU of the CU in this way, video encoder 20 may reconstruct the pixel block of the CU.
- Filter unit 113 may perform a deblocking operation to reduce blockiness artifacts in the block of pixels associated with the CU. Further, the filter unit 113 may apply the SAO offset determined by the prediction processing unit 100 to the reconstructed sample block to recover the pixel block. Filter unit 113 may generate a sequence of SAO syntax elements of the CTB.
- The SAO syntax elements may include regular CABAC coded bins and bypass coded bins. In accordance with the techniques of this disclosure, within the sequence, none of the bypass coded bins of a color component is located between two of the regular CABAC coded bins of the same color component.
- the decoded picture buffer 114 may store the reconstructed block of pixels.
- Inter prediction unit 121 may perform inter prediction on PUs of other pictures using reference pictures containing reconstructed blocks of pixels.
- intra-prediction processing unit 126 can use the reconstructed block of pixels in decoded picture buffer 114 to perform intra-prediction on other PUs in the same picture as the CU.
- Entropy encoding unit 116 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 116 may receive a coefficient block from quantization unit 106 and may receive syntax elements from prediction processing unit 100. Entropy encoding unit 116 may perform one or more entropy encoding operations on the data to produce entropy encoded data. For example, entropy encoding unit 116 may perform a context-adaptive variable-length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a probability interval partitioning entropy (PIPE) coding operation, or another type of entropy encoding operation on the data.
- Entropy encoding unit 116 may encode the SAO syntax elements generated by filter unit 113. As part of encoding the SAO syntax elements, entropy encoding unit 116 may encode the regular CABAC coded bins of the SAO syntax elements using regular CABAC engine 118, and may encode the bypass coded bins using bypass coding engine 120.
- The inter prediction unit 121 determines a set of inter-frame candidate prediction modes.
- Video encoder 20 is an example of a video encoder that, in accordance with the techniques of this disclosure, is configured to: determine, based on information of adjacent image units adjacent to an image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode first indication information into a code stream, where the first indication information indicates the prediction mode.
- FIG. 3 is a flow diagram of an example operation 200 of a video encoder for encoding video data in accordance with one or more techniques of this disclosure.
- Figure 3 is provided as an example. In other examples, the techniques of this disclosure may be practiced using more, fewer, or different steps than those shown in the example of FIG. 3.
- According to the example method of FIG. 3, video encoder 20 performs the following steps:
- S210 Determine, according to information about an adjacent image unit adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode;
- the blocks A, B, C, D, and E are adjacent reconstructed blocks of the current block to be coded, respectively located at the upper, left, upper right, lower left, and upper left positions of the block to be coded. Whether the affine merge mode exists in the candidate prediction mode set of the current block to be coded may be determined by the coding information of the adjacent reconstructed block.
- FIG. 4 in this embodiment of the present invention exemplarily gives the number and locations of the adjacent reconstructed blocks of the block to be coded; the number of adjacent reconstructed blocks may also be more or fewer than five, which is not limited.
- It is determined whether there is a block among the adjacent reconstructed blocks whose prediction type is affine prediction. If not, the candidate prediction mode set of the block to be coded does not include an affine merge mode; if so, the candidate prediction mode set of the block to be coded includes an affine merge mode.
- The adjacent reconstructed blocks may use multiple affine modes, for example a first affine mode or a second affine mode; correspondingly, the affine merge mode includes a first affine merge mode that merges the first affine mode, or a second affine merge mode that merges the second affine mode. The numbers of first affine modes, second affine modes, and non-affine modes among the adjacent reconstructed blocks are counted separately. When the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the candidate prediction mode set does not include any affine merge mode.
- A third implementation may be: when the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the first affine modes and the second affine modes are further compared. When the first affine mode is more frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is more frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
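As an illustrative sketch of this neighbor-mode counting (the mode labels and the tie-breaking order are assumptions, not normative syntax):

```python
# Sketch of the "third implementation": count occurrences of the first
# affine mode, the second affine mode, and non-affine modes among the
# adjacent reconstructed blocks; the majority decides which affine
# merge mode, if any, enters the candidate prediction mode set. Ties
# are broken in favor of the first affine mode, an assumption here.

def candidate_affine_merge(neighbor_modes):
    n1 = neighbor_modes.count("affine1")
    n2 = neighbor_modes.count("affine2")
    nn = len(neighbor_modes) - n1 - n2
    if n1 >= n2 and n1 >= nn and n1 > 0:
        return "affine_merge1"
    if n2 >= nn and n2 > 0:
        return "affine_merge2"
    # Non-affine modes dominate: fall back to comparing the two affine
    # counts; include no affine merge mode when neither affine mode
    # appears among the neighbors at all.
    if n1 == 0 and n2 == 0:
        return None
    return "affine_merge1" if n1 >= n2 else "affine_merge2"

print(candidate_affine_merge(["affine1", "translate", "affine1",
                              "affine2", "translate"]))  # affine_merge1
```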
- Two conditions are determined: (1) whether there is a block among the adjacent reconstructed blocks whose prediction type is an affine mode; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode. If both conditions are satisfied, the encoding process shown in FIG. 2 is performed separately for the case in which the candidate prediction mode set of the block to be coded includes the affine merge mode and the case in which it does not.
- When the candidate prediction mode set of the block to be coded includes the affine merge mode, an indication information, which may be denoted the third indication information, is set to 1 and encoded into the code stream; conversely, when the candidate prediction mode set of the block to be coded does not include the affine merge mode, the third indication information is set to 0 and encoded into the code stream.
- The condition to be satisfied is that the width of the adjacent block in the affine mode is smaller than the width of the block to be coded, and the height of the adjacent block in the affine mode is smaller than the height of the block to be coded.
- The determining condition may alternatively be that the width of the adjacent block in the affine mode is smaller than the width of the block to be coded, or that the height of the adjacent block in the affine mode is smaller than the height of the block to be coded, which is not limited.
- Two conditions are determined: (1) whether there is a block among the adjacent reconstructed blocks whose prediction type is an affine mode; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode; if both conditions are satisfied, the candidate prediction mode set of the block to be coded includes the affine merge mode.
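A minimal sketch of this two-part check; the neighbor records are illustrative `(is_affine, width, height)` tuples, an assumption of this example:

```python
# Sketch of the two conditions above: the candidate set includes an
# affine merge mode only if some adjacent reconstructed block uses an
# affine prediction type AND that block's width and height are both
# smaller than those of the block to be coded.

def include_affine_merge(neighbors, cur_w, cur_h):
    return any(is_affine and w < cur_w and h < cur_h
               for is_affine, w, h in neighbors)

neighbors = [(False, 16, 16), (True, 8, 8), (False, 32, 32)]
print(include_affine_merge(neighbors, 16, 16))  # True
print(include_affine_merge(neighbors, 8, 8))    # False
```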
- The prediction type and size of the adjacent reconstructed blocks are used here as the basis for determining the candidate prediction mode set of the current block to be coded; attribute information obtained by parsing the adjacent reconstructed blocks may also be used for the determination, which is not limited here.
- The determination of whether there is a block among the adjacent reconstructed blocks whose prediction type is affine prediction may also be made as follows: for example, when the prediction type of at least two adjacent blocks is an affine mode, the candidate prediction mode set of the block to be coded includes the affine merge mode; otherwise, the candidate prediction mode set of the block to be coded does not include the affine merge mode.
- The number of adjacent blocks whose prediction type is an affine mode may alternatively be at least three, or at least four, which is not limited.
- Two conditions are determined: (1) whether there is a block among the adjacent reconstructed blocks whose prediction type is an affine mode; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. The second condition may, for example, alternatively be whether the width and height of the adjacent block in the affine mode are less than 1/2, 1/3, or 1/4 of the width and height of the block to be coded, which is not limited.
- Setting the indication information to 0 or 1 as described is exemplary; the opposite setting may also be used.
- It is determined whether there is a block among the adjacent reconstructed blocks whose prediction type is affine prediction. If not, the candidate prediction mode set of the block to be coded does not include the affine merge mode. If so, the encoding process shown in FIG. 2 is performed separately for the case in which the candidate prediction mode set of the block to be coded includes the affine merge mode and the case in which it does not.
- The candidate prediction mode set is the candidate prediction mode set determined in S210. The encoding process shown in FIG. 2 is performed in turn using each prediction mode in the candidate prediction mode set, and the mode with the best coding performance is selected as the prediction mode of the block to be coded.
- the purpose of performing the encoding process shown in FIG. 2 in the embodiment of the present invention is to select a prediction mode that may optimize encoding performance.
- The performance/cost ratio of each prediction mode may be compared, where performance is represented by the quality of image restoration and cost is represented by the bit rate of encoding. Alternatively, only the performance or only the cost of each prediction mode may be compared, and correspondingly the encoding process of FIG. 2 may be stopped once the metric to be compared has been obtained. For example, if the prediction modes are compared using performance only, only the prediction unit needs to be passed through; this is not limited.
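A minimal sketch of this mode comparison, assuming the common Lagrangian formulation J = D + λ·R (distortion plus λ times rate); the cost figures, mode names, and λ value are illustrative assumptions:

```python
# Sketch of rate-distortion mode selection: each candidate mode is run
# through the encoding loop, its distortion D and rate R are measured,
# and the mode minimizing the Lagrangian cost J = D + lambda * R wins.

def select_mode(costs, lam):
    """costs: mapping mode -> (distortion, rate_in_bits)."""
    return min(costs, key=lambda m: costs[m][0] + lam * costs[m][1])

costs = {
    "merge":        (120.0, 10),
    "affine_merge": ( 90.0, 14),
    "intra":        (200.0,  6),
}
print(select_mode(costs, lam=5.0))  # affine_merge
```

Comparing only performance (D) or only cost (R), as the paragraph above allows, amounts to taking λ to one of its extremes.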
- The generation of a predicted image of a block to be coded according to a prediction mode, including prediction modes based on a translational motion model, affine prediction modes, and affine merge modes, is described in the H.265 standard cited above and in application documents such as CN201010247275.7, and details are not repeated here.
- The prediction mode determined in S220 is encoded into the code stream. It should be understood that this step may occur at any time after S220; there is no specific requirement on the order of the steps, provided that it corresponds to the decoding of the first indication information at the decoding end.
- FIG. 5 is a block diagram of an example of another video encoder 40 for encoding video data in accordance with one or more techniques of this disclosure.
- the video encoder 40 includes a first determining module 41, a second determining module 42, a third determining module 43, and an encoding module 44.
- the first determining module 41 is configured to: S210, determining, according to information about a neighboring image unit adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode;
- the second determining module 42 is configured to execute, by S220, determining, from the candidate prediction mode set, a prediction mode of the image unit to be processed;
- the third determining module 43 is configured to execute S230, according to the prediction mode, determining a predicted image of the image unit to be processed;
- The encoding module 44 is configured to execute S240 to encode the first indication information into the code stream.
- Because the current block and its adjacent blocks are very likely to have the same or similar prediction modes, deriving the prediction mode information of the current block from the information of the adjacent blocks reduces the bit rate spent on coding the prediction mode and improves coding efficiency.
- FIG. 6 is a flow diagram of an example operation 300 of a video encoder for encoding video data in accordance with one or more techniques of this disclosure.
- Figure 6 is provided as an example. In other examples, the techniques of this disclosure may be practiced using more, fewer, or different steps than those illustrated in the example of FIG. 6. According to the example method of FIG. 6, video encoder 20 performs the following steps:
- S310: Encode the indication information of the candidate prediction mode set of the first to-be-processed image region.
- When the candidate mode set of the first to-be-processed image region uses the candidate translation mode set, the first indication information is set to 0 and encoded into the code stream, where a translation mode indicates a prediction mode that obtains a predicted image using a translational motion model. When the candidate mode set of the first to-be-processed image region uses the candidate translation mode set and the candidate affine mode set, the first indication information is set to 1 and encoded into the code stream, where an affine mode indicates a prediction mode that obtains a predicted image using an affine model. The first to-be-processed image region may be an image frame group, an image frame, an image slice set, an image tile set, an image slice, an image tile, a set of image coding units, or an image coding unit.
- Correspondingly, the first indication information is encoded into the header of the first to-be-processed image region: for an image frame group, for example, into the video parameter set (VPS), the sequence parameter set (SPS), or supplemental enhancement information (SEI); for an image frame, for example, into the picture parameter set (PPS); or into the header of an image slice set, the header of an image tile set, the slice header of an image slice, the header of an image tile, the header of a set of image coding units, or the header of an image coding unit.
- The determination of the first to-be-processed image region may be preconfigured, or may be adaptively determined during the encoding process; the range of the first to-be-processed image region may be known through the protocol agreed between the encoder and the decoder.
- the range of the first image area to be processed may also be encoded and transmitted through the code stream, which is not limited.
- the determination of the candidate prediction mode set may be pre-configured, or may be determined after comparing the coding performance, and is not limited.
- Setting the indication information to 0 or 1 as described is exemplary; the opposite setting may also be used.
- S320 Determine, for the to-be-processed unit in the first to-be-processed image region, a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region.
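The region-level signalling of S310/S320 can be sketched as follows; the one-bit flag semantics follow the description above, while the mode names are illustrative assumptions:

```python
# Sketch of region-level candidate-set selection: a one-bit first
# indication (0 = translation modes only, 1 = translation + affine
# modes) fixes the candidate prediction mode set for the whole region;
# every to-be-processed unit in the region then chooses its prediction
# mode from that set.

TRANSLATION_MODES = ["merge", "inter", "intra"]
AFFINE_MODES = ["affine_merge", "affine_inter"]

def region_candidate_set(first_indication):
    if first_indication == 0:
        return list(TRANSLATION_MODES)
    return TRANSLATION_MODES + AFFINE_MODES

modes = region_candidate_set(1)
print("affine_merge" in modes)  # True
print(region_candidate_set(0))  # ['merge', 'inter', 'intra']
```

When a region contains no affine motion, signalling 0 once at the region level spares every block in the region from carrying affine-related mode bits, which is the redundancy saving claimed below.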
- FIG. 7 is a block diagram of an example of another video encoder 50 for encoding video data in accordance with one or more techniques of this disclosure.
- The video encoder 50 includes: a first encoding module 51, a first determining module 52, a second determining module 53, and a second encoding module 54.
- the first encoding module 51 is configured to execute, by S310, indication information that encodes a candidate prediction mode set of the first to-be-processed image region;
- the first determining module 52 is configured to: S320, for the to-be-processed unit in the first to-be-processed image region, determine a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region;
- the second determining module 53 is configured to execute S330, according to the prediction mode, to determine a predicted image of the image unit to be processed;
- The second encoding module 54 is configured to execute S340 to encode the prediction mode selected by the unit to be processed into the code stream.
- By setting the candidate prediction mode set at the region level through a selection flag, the bit rate spent on coding redundant modes is avoided, and coding efficiency is improved.
- FIG. 8 is a block diagram illustrating an example video decoder 30 that is configured to implement the techniques of the present invention.
- FIG. 8 is provided for purposes of explanation and does not limit the techniques as broadly exemplified and described in the present invention.
- This disclosure describes video decoder 30 in the context of HEVC coded image prediction.
- the techniques of the present invention are applicable to other coding standards or methods.
- video decoder 30 includes an entropy decoding unit 150, a prediction processing unit 152, an inverse quantization unit 154, an inverse transform processing unit 156, a reconstruction unit 158, a filter unit 159, and a decoded picture buffer 160.
- the prediction processing unit 152 includes a motion compensation unit 162 and an intra prediction processing unit 164.
- Entropy decoding unit 150 includes a regular CABAC decoding engine 166 and a bypass decoding engine 168. In other examples, video decoder 30 may include more, fewer, or different functional components.
- Video decoder 30 can receive the bitstream.
- Entropy decoding unit 150 may parse the bitstream to extract syntax elements from the bitstream. As part of parsing the bitstream, entropy decoding unit 150 may entropy decode the entropy encoded syntax elements in the bitstream.
- Prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, and filter unit 159 may generate decoded video data based on syntax elements extracted from the bitstream.
- the bitstream may comprise a sequence of coded SAO syntax elements of the CTB.
- The SAO syntax elements may include regular CABAC coded bins and bypass coded bins. According to the techniques of this disclosure, in the sequence of coded SAO syntax elements, none of the bypass coded bins is located between two regular CABAC coded bins.
- Entropy decoding unit 150 may decode the SAO syntax elements. As part of decoding the SAO syntax elements, entropy decoding unit 150 may use regular CABAC decoding engine 166 to decode the regular CABAC coded bins, and may use bypass decoding engine 168 to decode the bypass coded bins.
- video decoder 30 may perform a reconstruction operation on the unpartitioned CU. To perform a reconstruction operation on an unpartitioned CU, video decoder 30 may perform a reconstruction operation on each TU of the CU. By performing a reconstruction operation on each TU of the CU, video decoder 30 may reconstruct the residual pixel blocks associated with the CU.
- inverse quantization unit 154 may inverse quantize (ie, dequantize) the coefficient block associated with the TU. Inverse quantization unit 154 may use the QP value associated with the CU of the TU to determine the degree of quantization, and likewise determine the degree of inverse quantization that inverse quantization unit 154 will apply.
- inverse transform processing unit 156 may apply one or more inverse transforms to the coefficient block to generate a residual sample block associated with the TU.
- Inverse transform processing unit 156 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loève transform (KLT), an inverse rotation transform, an inverse directional transform, or another inverse transform to the coefficient block.
- intra prediction processing unit 164 may perform intra prediction to generate a predictive sample block for the PU.
- Intra-prediction processing unit 164 may use an intra-prediction mode to generate a predictive pixel block of a PU based on a block of pixels of a spatially neighboring PU.
- Intra prediction processing unit 164 may determine an intra prediction mode for the PU based on one or more syntax elements parsed from the bitstream.
- Motion compensation unit 162 may construct a first reference picture list (List 0) and a second reference picture list (List 1) based on syntax elements extracted from the bitstream. Furthermore, if the PU uses inter prediction encoding, the entropy decoding unit 150 may extract motion information of the PU. Motion compensation unit 162 may determine one or more reference blocks of the PU based on the motion information of the PU. Motion compensation unit 162 can generate a predictive pixel block of the PU based on one or more reference blocks of the PU.
- the reconstruction unit 158 may use the residual pixel block associated with the TU of the CU and the predictive pixel block of the PU of the CU (ie, intra prediction data or inter prediction data) to reconstruct the block of pixels of the CU, where applicable.
- reconstruction unit 158 may add samples of the residual pixel block to corresponding samples of the predictive pixel block to reconstruct the pixel block of the CU.
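A minimal sketch of this sample-wise addition, assuming 8-bit samples clipped to the valid range (the bit depth is an illustrative assumption):

```python
# Sketch of the reconstruction step: each reconstructed sample is the
# residual sample plus the co-located predicted sample, clipped to the
# valid sample range.

def reconstruct(residual, predicted, bit_depth=8):
    hi = (1 << bit_depth) - 1
    return [[max(0, min(hi, r + p)) for r, p in zip(rrow, prow)]
            for rrow, prow in zip(residual, predicted)]

print(reconstruct([[2, -3]], [[118, 1]]))  # [[120, 0]]
```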
- Filter unit 159 may perform a deblocking operation to reduce blockiness artifacts associated with the pixel blocks of the CUs of the CTB. Additionally, filter unit 159 may modify the pixel block of the CTB based on the SAO syntax elements parsed from the bitstream. For example, filter unit 159 may determine values based on the SAO syntax elements of the CTB and add the determined values to the samples in the reconstructed block of the CTB. By modifying at least some of the pixel blocks of the CTBs of the picture, filter unit 159 may modify the reconstructed picture of the video data based on the SAO syntax elements.
- Video decoder 30 may store the block of pixels of the CU in decoded picture buffer 160.
- the decoded picture buffer 160 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device (eg, display device 32 of FIG. 1).
- video decoder 30 may perform intra-prediction operations or inter-prediction operations on PUs of other CUs based on the blocks of pixels in decoded picture buffer 160.
- prediction processing unit 152 determines a set of inter-frame candidate prediction modes.
- Video decoder 30 is an example of a video decoder that, in accordance with the techniques of this disclosure, is configured to: determine, based on information of adjacent image units adjacent to an image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; parse a code stream to obtain first indication information; determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
- FIG. 9 is a flow diagram of an example operation 400 of a video decoder for decoding video data in accordance with one or more techniques of this disclosure.
- Figure 9 is provided as an example. In other examples, the techniques of the present invention may be practiced using more or fewer steps than those shown in the examples of FIG. 9 or steps that are different therefrom. According to the example method of FIG. 9, video decoder 30 performs the following steps:
- S410 Determine, according to information about a neighboring image unit adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode;
- the blocks A, B, C, D, and E are adjacent reconstructed blocks of the current block to be coded, respectively located at the upper, left, upper right, lower left, and upper left positions of the block to be coded. Whether the affine merge mode exists in the candidate prediction mode set of the current block to be coded may be determined by the coding information of the adjacent reconstructed block.
- FIG. 4 in this embodiment of the present invention exemplarily gives the number and locations of the adjacent reconstructed blocks of the block to be coded; the number of adjacent reconstructed blocks may also be more or fewer than five, which is not limited.
- The second indication information in the code stream is parsed. When the second indication information is 1, the candidate prediction mode set includes the affine merge mode; when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode.
- It is determined whether there is a block among the adjacent reconstructed blocks whose prediction type is affine prediction. If not, the candidate prediction mode set of the block to be coded does not include an affine merge mode; if so, the candidate prediction mode set of the block to be coded includes an affine merge mode.
- The adjacent reconstructed blocks may use multiple affine modes, for example a first affine mode or a second affine mode; correspondingly, the affine merge mode includes a first affine merge mode that merges the first affine mode, or a second affine merge mode that merges the second affine mode. The numbers of first affine modes, second affine modes, and non-affine modes among the adjacent reconstructed blocks are counted separately. When the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the candidate prediction mode set does not include any affine merge mode.
- A third implementation may be: when the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the first affine modes and the second affine modes are further compared. When the first affine mode is more frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is more frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
- Two conditions are determined: (1) whether there is a block among the adjacent reconstructed blocks whose prediction type is an affine mode; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode. If both conditions are satisfied, the third indication information in the code stream is parsed: when the third indication information is 1, the candidate prediction mode set includes the affine merge mode; when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode.
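A minimal sketch of this decoder-side conditional parse; `read_bit` is a hypothetical stand-in for a real bitstream reader:

```python
# Sketch of the decoder logic above: the third indication bit is parsed
# from the stream only when both neighbor conditions hold; otherwise
# the affine merge mode is absent without any bit being consumed,
# mirroring the conditional signalling at the encoding end.

def decode_affine_merge_present(conditions_met, read_bit):
    if not conditions_met:
        return False          # nothing was signalled in the stream
    return read_bit() == 1    # third indication information

bits = iter([1])
print(decode_affine_merge_present(True, lambda: next(bits)))  # True
print(decode_affine_merge_present(False, lambda: 0))          # False
```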
- The condition to be satisfied is that the width of the adjacent block in the affine mode is smaller than the width of the block to be coded, and the height of the adjacent block in the affine mode is smaller than the height of the block to be coded.
- The determining condition may alternatively be that the width of the adjacent block in the affine mode is smaller than the width of the block to be coded, or that the height of the adjacent block in the affine mode is smaller than the height of the block to be coded, which is not limited.
- two conditions are checked: (1) whether any of the adjacent reconstructed blocks has an affine prediction type; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode; if both conditions are satisfied, the candidate prediction mode set of the block to be coded includes the affine merge mode.
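The two-condition check described above can be written as a short predicate. This is a sketch under assumptions: neighbors are represented as illustrative (mode, width, height) tuples, and the "affine" label is hypothetical:

```python
def candidate_set_includes_affine_merge(neighbors, cur_w, cur_h):
    """Return True when the candidate prediction mode set of the current
    block should include the affine merge mode, i.e. when (1) some
    adjacent reconstructed block uses affine prediction and (2) that
    block's width and height are both smaller than the current block's.
    """
    for mode, w, h in neighbors:
        if mode == "affine" and w < cur_w and h < cur_h:
            return True
    return False
```

A variant condition from the text replaces the strict comparison with `w < cur_w or h < cur_h`, or with fractional thresholds such as `w < cur_w / 2`; whichever variant is used must match the encoder side.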
- the prediction type and size of the adjacent reconstructed blocks are used here as the basis for determining the candidate prediction mode set of the current block to be coded; other attribute information obtained by parsing the adjacent reconstructed blocks may also be used for the determination, as long as it corresponds to the encoding end, which is not limited here.
- the number of adjacent reconstructed blocks whose prediction type is an affine prediction may also be used as the determining condition. For example, when the prediction types of at least two neighboring blocks are affine modes, the candidate prediction mode set of the block to be coded includes the affine merge mode; otherwise, the candidate prediction mode set of the block to be coded does not include the affine merge mode.
- the required number of neighboring blocks with an affine prediction type may also be at least three, or at least four, corresponding to the encoding end, and is not limited.
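The count-based variant can be sketched the same way, with the threshold (at least two, three, or four affine neighbors) as a parameter that must match the encoder side; mode labels are illustrative:

```python
def includes_affine_merge_by_count(neighbor_modes, threshold=2):
    """Include the affine merge mode only when at least `threshold`
    neighboring blocks have an affine prediction type."""
    affine_count = sum(1 for mode in neighbor_modes if mode == "affine")
    return affine_count >= threshold
```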
- in another implementation, two conditions are checked: (1) whether any of the adjacent reconstructed blocks has an affine prediction type; (2) whether the width and height of the adjacent block in the affine mode are smaller than the width and height of the block to be coded. The second condition may also be, for example, whether the width and height of the adjacent block in the affine mode are smaller than 1/2, 1/3, or 1/4 of the width and height of the block to be coded; this corresponds to the encoding end and is not limited.
- setting the indication information or encoding it into the code stream in the embodiments of the present invention is exemplary and corresponds to the encoding end.
- the first indication information indicates the index information of the prediction mode of the block to be decoded; this step corresponds to step S240 at the encoding end.
- the prediction mode of the block to be decoded can be found by searching the list of prediction modes corresponding to the candidate prediction mode set determined in S410, using the index information obtained in S420; different candidate prediction mode sets correspond to different prediction mode lists.
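In other words, the parsed index selects an entry from whichever prediction mode list the candidate set determined in S410 implies. A minimal sketch, with hypothetical mode names:

```python
def find_prediction_mode(mode_list, index_info):
    """Look up the prediction mode of the block to be decoded using the
    index information parsed from the code stream. `mode_list` is the
    list corresponding to the candidate prediction mode set from S410."""
    if not 0 <= index_info < len(mode_list):
        raise ValueError("index falls outside the prediction mode list")
    return mode_list[index_info]
```

Because different candidate sets map to different lists, the same index can decode to different modes: index 1 in ["merge", "affine_merge", "inter"] yields "affine_merge", while in ["merge", "inter"] it yields "inter".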
- FIG. 10 is a block diagram of an example of another video decoder 60 for decoding video data in accordance with one or more techniques of this disclosure.
- the video decoder 60 includes a first determining module 61, a parsing module 62, a second determining module 63, and a third determining module 64.
- the first determining module 61 is configured to execute S410: determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode;
- the parsing module 62 is configured to execute S420: parsing the first indication information in the code stream;
- the second determining module 63 is configured to execute S430: determining, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set;
- the third determining module 64 is configured to execute S440: determining a predicted image of the image unit to be processed according to the prediction mode.
- because the current block and its neighboring blocks are very likely to have the same or similar prediction modes, the information of the neighboring blocks is used to derive the prediction mode information of the current block, which reduces the code rate spent on coding the prediction mode and improves coding efficiency.
- FIG. 11 is a flow diagram of an example operation 500 of a video decoder for decoding video data in accordance with one or more techniques of this disclosure.
- Figure 11 is provided as an example. In other examples, the techniques of this disclosure may be implemented using more, fewer, or different steps than those shown in the example of FIG. 11.
- the video decoder performs the following steps:
- the first indication information indicates whether the candidate mode set of the first to-be-processed image region includes an affine motion model, and this step corresponds to step S310 of the encoding end.
- when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts the candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model; when the first indication information is 1, the candidate mode set adopts the candidate translational mode set and the candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model.
- the first image area to be processed may be an image frame group, an image frame, an image tile set, an image slice set, an image tile, an image slice, an image coding unit set, or an image coding unit.
- correspondingly, the first indication information is encoded in the header of the image frame group, for example in the Video Parameter Set (VPS), the Sequence Parameter Set (SPS), or Supplemental Enhancement Information (SEI); in the header of the image frame, for example in the Picture Parameter Set (PPS); or in the header of the image tile set, image slice set, image tile, or image slice.
- the range of the first to-be-processed image area may be pre-configured, or may be adaptively determined during the encoding process, and the representation of the range of the first to-be-processed image area may be learned through the protocol of the codec.
- alternatively, the range of the first image area to be processed may be received from the encoding end through the code stream; it corresponds to the encoding end and is not limited here.
- setting the indication information or encoding it into the code stream in the embodiments of the present invention is exemplary and corresponds to the encoding end.
- the second indication information indicates a prediction mode of a block to be processed in the first image area to be processed, and this step corresponds to step S340 of the encoding end.
- S540: Determine, according to the second indication information, a prediction mode of an image unit to be processed from the candidate prediction mode set of the first to-be-processed image region.
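Steps S520 and S540 can be summarized in a sketch: the region-level flag selects between a translational-only candidate set and one that also contains affine modes. The mode names are illustrative placeholders, not the actual mode lists:

```python
def region_candidate_mode_set(first_indication):
    """Build the candidate mode set of the first to-be-processed image
    region: 0 -> candidate translational mode set only, 1 -> candidate
    translational mode set plus candidate affine mode set."""
    translational = ["skip", "merge", "inter", "intra"]
    affine = ["affine_merge", "affine_inter"]
    if first_indication == 0:
        return translational
    if first_indication == 1:
        return translational + affine
    raise ValueError("first indication information must be 0 or 1")
```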
- FIG. 12 is a block diagram of an example of another video decoder 70 for decoding video data in accordance with one or more techniques of this disclosure.
- the video decoder 70 includes a first parsing module 71, a first determining module 72, a second parsing module 73, a second determining module 74, and a third determining module 75.
- the first parsing module 71 is configured to execute S510: parsing the first indication information in the code stream;
- the first determining module 72 is configured to execute S520: determining, according to the first indication information, the candidate mode set of the first to-be-processed image region;
- the second parsing module 73 is configured to execute S530: parsing the second indication information in the code stream;
- the second determining module 74 is configured to execute S540: determining, according to the second indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region;
- the third determining module 75 is configured to execute S550: determining a predicted image of the image unit to be processed according to the prediction mode.
- by setting a candidate prediction mode set selection flag at the region level, the code rate spent on coding redundant modes is avoided, and coding efficiency is improved.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted as one or more instructions or code via a computer readable medium and executed by a hardware-based processing unit.
- the computer readable medium can comprise a computer readable storage medium (which corresponds to a tangible medium such as a data storage medium) or a communication medium including, for example, any medium that facilitates transfer of a computer program from one place to another in accordance with a communication protocol.
- computer readable media generally may correspond to (1) a non-transitory tangible computer readable storage medium, or (2) a communication medium such as a signal or carrier wave.
- Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in the present disclosure.
- the computer program product can comprise a computer readable medium.
- by way of example and not limitation, such computer-readable storage media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies (e.g., infrared, radio, and microwave), then coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies (e.g., infrared, radio, and microwave) are included in the definition of medium.
- disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
- combinations of the above should also be included within the scope of computer-readable media.
- instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
- accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein.
- the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec.
- the techniques may be fully implemented in one or more circuits or logic elements.
- the techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
- Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.
- The terms "system" and "network" are used interchangeably herein. It should be understood that the term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist: for example, A and/or B may indicate that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the preceding and following objects.
- B corresponding to A means that B is associated with A, and B can be determined from A.
- however, determining B from A does not mean that B is determined based on A alone; B may also be determined based on A and/or other information.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of units is only a logical function division; in actual implementation there may be another division manner, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Abstract
Description
Claims (40)
- A predicted-image decoding method, comprising: determining, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, the affine merge mode indicating that the to-be-processed image unit and a neighboring image unit of the to-be-processed image unit obtain their respective predicted images using a same affine model; parsing first indication information in a code stream; determining, according to the first indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set; and determining a predicted image of the to-be-processed image unit according to the prediction mode.
- The method according to claim 1, wherein the neighboring image units of the to-be-processed image unit include at least the neighboring image units located above, to the left of, to the above right of, to the below left of, and to the above left of the to-be-processed image unit.
- The method according to claim 1 or 2, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, parsing second indication information in the code stream, wherein when the second indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 1 or 2, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 1 or 2, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when, among the prediction modes of the neighboring prediction units, the second affine mode is the most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; and when, among the prediction modes of the neighboring prediction units, a non-affine mode is the most common, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 1 or 2, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when a non-affine mode is the most common and the first affine mode is the second most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; and when a non-affine mode is the most common and the second affine mode is the second most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
- The method according to claim 1 or 2, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, parsing third indication information in the code stream, wherein when the third indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 1 or 2, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- A predicted-image decoding method, comprising: parsing first indication information in a code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region, wherein when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of an image unit to be processed from the candidate prediction mode set of the first to-be-processed image region, the to-be-processed image unit belonging to the first to-be-processed image region; and determining a predicted image of the to-be-processed image unit according to the prediction mode.
- The method according to claim 9, wherein the first to-be-processed image region is one of: an image frame group, an image frame, an image tile set, an image slice set, an image tile, an image slice, an image coding unit set, or an image coding unit.
- A predicted-image encoding method, comprising: determining, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, the affine merge mode indicating that the to-be-processed image unit and a neighboring image unit of the to-be-processed image unit obtain their respective predicted images using a same affine model; determining a prediction mode of the to-be-processed image unit from the candidate prediction mode set; determining a predicted image of the to-be-processed image unit according to the prediction mode; and encoding first indication information into a code stream, the first indication information indicating the prediction mode.
- The method according to claim 11, wherein the neighboring image units of the to-be-processed image unit include at least the neighboring image units located above, to the left of, to the above right of, to the below left of, and to the above left of the to-be-processed image unit.
- The method according to claim 11 or 12, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model and the candidate prediction mode set includes the affine merge mode, setting second indication information to 1 and encoding the second indication information into the code stream; when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model and the candidate prediction mode set does not include the affine merge mode, setting the second indication information to 0 and encoding the second indication information into the code stream; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 11 or 12, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 11 or 12, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; and when a non-affine mode is the most common, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 11 or 12, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when a non-affine mode is the most common and the first affine mode is the second most common, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; and when a non-affine mode is the most common and the second affine mode is the second most common, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
- The method according to claim 11 or 12, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, and the candidate prediction mode set includes the affine merge mode, setting third indication information to 1 and encoding the third indication information into the code stream; when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, and the candidate prediction mode set does not include the affine merge mode, setting the third indication information to 0 and encoding the third indication information into the code stream; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The method according to claim 11 or 12, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the determining, according to information of neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode comprises: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- A predicted-image encoding method, comprising: when a candidate mode set of a first to-be-processed image region adopts a candidate translational mode set, setting first indication information to 0 and encoding the first indication information into a code stream, the translational mode representing a prediction mode that obtains a predicted image using a translational model; when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, setting the first indication information to 1 and encoding the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; determining, from the candidate prediction mode set of the first to-be-processed image region, a prediction mode of an image unit to be processed, the to-be-processed image unit belonging to the first to-be-processed image region; determining a predicted image of the to-be-processed image unit according to the prediction mode; and encoding second indication information into the code stream, the second indication information indicating the prediction mode.
- The method according to claim 19, wherein the first to-be-processed image region is one of: an image frame group, an image frame, an image tile set, an image slice set, an image tile, an image slice, an image coding unit set, or an image coding unit.
- A predicted-image decoding apparatus, comprising: a first determining module, configured to determine, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, the affine merge mode indicating that the to-be-processed image unit and a neighboring image unit of the to-be-processed image unit obtain their respective predicted images using a same affine model; a parsing module, configured to parse first indication information in a code stream; a second determining module, configured to determine, according to the first indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set; and a third determining module, configured to determine a predicted image of the to-be-processed image unit according to the prediction mode.
- The apparatus according to claim 21, wherein the neighboring image units of the to-be-processed image unit include at least the neighboring image units located above, to the left of, to the above right of, to the below left of, and to the above left of the to-be-processed image unit.
- The apparatus according to claim 21 or 22, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, parse second indication information in the code stream, wherein when the second indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 21 or 22, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, include the affine merge mode in the candidate prediction mode set; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 21 or 22, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model, and correspondingly, the first determining module is specifically configured to: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; when the second affine mode is the most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set; and when a non-affine mode is the most common, not include the affine merge mode in the candidate prediction mode set.
- The apparatus according to claim 21 or 22, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the first determining module is specifically configured to: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; when the second affine mode is the most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set; when a non-affine mode is the most common and the first affine mode is the second most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; and when a non-affine mode is the most common and the second affine mode is the second most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set.
- The apparatus according to claim 21 or 22, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, parse third indication information in the code stream, wherein when the third indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 21 or 22, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, include the affine merge mode in the candidate prediction mode set; otherwise, the candidate prediction mode set does not include the affine merge mode.
- A predicted-image decoding apparatus, comprising: a first parsing module, configured to parse first indication information in a code stream; a first determining module, configured to determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, wherein when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the first indication information is 1, the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; a second parsing module, configured to parse second indication information in the code stream; a second determining module, configured to determine, according to the second indication information, a prediction mode of an image unit to be processed from the candidate prediction mode set of the first to-be-processed image region, the to-be-processed image unit belonging to the first to-be-processed image region; and a third determining module, configured to determine a predicted image of the to-be-processed image unit according to the prediction mode.
- The apparatus according to claim 29, wherein the first to-be-processed image region is one of: an image frame group, an image frame, an image tile set, an image slice set, an image tile, an image slice, an image coding unit set, or an image coding unit.
- A predicted-image encoding apparatus, comprising: a first determining module, configured to determine, according to information of neighboring image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, the affine merge mode indicating that the to-be-processed image unit and a neighboring image unit of the to-be-processed image unit obtain their respective predicted images using a same affine model; a second determining module, configured to determine a prediction mode of the to-be-processed image unit from the candidate prediction mode set; a third determining module, configured to determine a predicted image of the to-be-processed image unit according to the prediction mode; and an encoding module, configured to encode first indication information into a code stream, the first indication information indicating the prediction mode.
- The apparatus according to claim 31, wherein the neighboring image units of the to-be-processed image unit include at least the neighboring image units located above, to the left of, to the above right of, to the below left of, and to the above left of the to-be-processed image unit.
- The apparatus according to claim 31 or 32, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model and the candidate prediction mode set includes the affine merge mode, set second indication information to 1 and encode the second indication information into the code stream; when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model and the candidate prediction mode set does not include the affine merge mode, set the second indication information to 0 and encode the second indication information into the code stream; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 31 or 32, wherein the information is the prediction modes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, include the affine merge mode in the candidate prediction mode set; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 31 or 32, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the first determining module is specifically configured to: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; when the second affine mode is the most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set; and when a non-affine mode is the most common, not include the affine merge mode in the candidate prediction mode set.
- The apparatus according to claim 31 or 32, wherein the information of the neighboring image units is the prediction modes of the neighboring image units, the prediction modes including at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode; and correspondingly, the first determining module is specifically configured to: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; when the second affine mode is the most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set; when a non-affine mode is the most common and the first affine mode is the second most common, include the first affine merge mode and not the second affine merge mode in the candidate prediction mode set; and when a non-affine mode is the most common and the second affine mode is the second most common, include the second affine merge mode and not the first affine merge mode in the candidate prediction mode set.
- The apparatus according to claim 31 or 32, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, and the candidate prediction mode set includes the affine merge mode, set third indication information to 1 and encode the third indication information into the code stream; when at least one of the neighboring image units obtains a predicted image using an affine model, the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, and the candidate prediction mode set does not include the affine merge mode, set the third indication information to 0 and encode the third indication information into the code stream; otherwise, the candidate prediction mode set does not include the affine merge mode.
- The apparatus according to claim 31 or 32, wherein the information is the prediction modes and sizes of the neighboring image units, and correspondingly, the first determining module is specifically configured to: when the prediction mode of at least one of the neighboring image units is obtaining a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, include the affine merge mode in the candidate prediction mode set; otherwise, the candidate prediction mode set does not include the affine merge mode.
- A predicted-image encoding apparatus, comprising: a first encoding module, configured to: when a candidate mode set of a first to-be-processed image region adopts a candidate translational mode set, set first indication information to 0 and encode the first indication information into a code stream, the translational mode representing a prediction mode that obtains a predicted image using a translational model, and when the candidate mode set of the first to-be-processed image region adopts a candidate translational mode set and a candidate affine mode set, set the first indication information to 1 and encode the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; a first determining module, configured to determine, from the candidate prediction mode set of the first to-be-processed image region, a prediction mode of an image unit to be processed, the to-be-processed image unit belonging to the first to-be-processed image region; a second determining module, configured to determine a predicted image of the to-be-processed image unit according to the prediction mode; and a second encoding module, configured to encode second indication information into the code stream, the second indication information indicating the prediction mode.
- The apparatus according to claim 39, wherein the first to-be-processed image region is one of: an image frame group, an image frame, an image tile set, an image slice set, an image tile, an image slice, an image coding unit set, or an image coding unit.
Priority Applications (13)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2018003764A MX2018003764A (es) | 2015-09-29 | 2016-09-08 | Metodo y aparato de prediccion de imagen. |
BR112018006271-5A BR112018006271B1 (pt) | 2015-09-29 | 2016-09-08 | método e aparelho para decodificar uma imagem predita |
MYPI2018700879A MY196371A (en) | 2015-09-29 | 2016-09-08 | Image Prediction Method and Apparatus |
SG11201801863QA SG11201801863QA (en) | 2015-09-29 | 2016-09-08 | Image prediction method and apparatus |
AU2016333221A AU2016333221B2 (en) | 2015-09-29 | 2016-09-08 | Image prediction method and apparatus |
RU2018114921A RU2697726C1 (ru) | 2015-09-29 | 2016-09-08 | Способ и устройство предсказания изображений |
JP2018511731A JP6669859B2 (ja) | 2015-09-29 | 2016-09-08 | 画像予測方法および装置 |
KR1020187008717A KR102114764B1 (ko) | 2015-09-29 | 2016-09-08 | 이미지 예측 방법 및 장치 |
KR1020207014280A KR102240141B1 (ko) | 2015-09-29 | 2016-09-08 | 이미지 예측 방법 및 장치 |
EP16850254.0A EP3331243B1 (en) | 2015-09-29 | 2016-09-08 | Image prediction method and device |
ZA2018/01541A ZA201801541B (en) | 2015-09-29 | 2018-03-06 | Image prediction method and apparatus |
US15/923,434 US11323736B2 (en) | 2015-09-29 | 2018-03-16 | Image prediction method and apparatus |
US17/540,928 US20220094969A1 (en) | 2015-09-29 | 2021-12-02 | Image prediction method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510632589.1 | 2015-09-29 | ||
CN201510632589.1A CN106559669B (zh) | 2015-09-29 | 2015-09-29 | 预测图像编解码方法及装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/923,434 Continuation US11323736B2 (en) | 2015-09-29 | 2018-03-16 | Image prediction method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017054630A1 true WO2017054630A1 (zh) | 2017-04-06 |
Family
ID=58414556
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/098464 WO2017054630A1 (zh) | 2015-09-29 | 2016-09-08 | 图像预测的方法及装置 |
Country Status (13)
Country | Link |
---|---|
US (2) | US11323736B2 (zh) |
EP (1) | EP3331243B1 (zh) |
JP (4) | JP6669859B2 (zh) |
KR (2) | KR102114764B1 (zh) |
CN (3) | CN108965871B (zh) |
AU (1) | AU2016333221B2 (zh) |
BR (1) | BR112018006271B1 (zh) |
MX (1) | MX2018003764A (zh) |
MY (1) | MY196371A (zh) |
RU (1) | RU2697726C1 (zh) |
SG (1) | SG11201801863QA (zh) |
WO (1) | WO2017054630A1 (zh) |
ZA (1) | ZA201801541B (zh) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111107373A (zh) * | 2018-10-29 | 2020-05-05 | 华为技术有限公司 | 基于仿射预测模式的帧间预测的方法及相关装置 |
CN112822514A (zh) * | 2020-12-30 | 2021-05-18 | 北京大学 | 基于依赖关系的视频流分组传输方法、系统、终端及介质 |
CN113170210A (zh) * | 2018-10-10 | 2021-07-23 | 交互数字Vc控股公司 | 视频编码和解码中的仿射模式信令 |
US11336907B2 (en) | 2018-07-16 | 2022-05-17 | Huawei Technologies Co., Ltd. | Video encoder, video decoder, and corresponding encoding and decoding methods |
RU2772813C1 (ru) * | 2018-07-16 | 2022-05-26 | Хуавей Текнолоджиз Ко., Лтд. | Видеокодер, видеодекодер и соответствующие способы кодирования и декодирования |
US11438578B2 (en) | 2018-10-29 | 2022-09-06 | Huawei Technologies Co., Ltd. | Video picture prediction method and apparatus |
Families Citing this family (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108965871B (zh) * | 2015-09-29 | 2023-11-10 | 华为技术有限公司 | 图像预测的方法及装置 |
EP4072134A1 (en) * | 2016-12-27 | 2022-10-12 | Samsung Electronics Co., Ltd. | Video affine mode encoding method and device therefor, and decoding method and device therefor |
US10630994B2 (en) * | 2017-06-28 | 2020-04-21 | Agora Lab, Inc. | Specific operation prediction in video compression |
US10979718B2 (en) * | 2017-09-01 | 2021-04-13 | Apple Inc. | Machine learning video processing systems and methods |
US10609384B2 (en) * | 2017-09-21 | 2020-03-31 | Futurewei Technologies, Inc. | Restriction on sub-block size derivation for affine inter prediction |
EP3468195A1 (en) * | 2017-10-05 | 2019-04-10 | Thomson Licensing | Improved predictor candidates for motion compensation |
US20190208211A1 (en) * | 2018-01-04 | 2019-07-04 | Qualcomm Incorporated | Generated affine motion vectors |
US20190222834A1 (en) * | 2018-01-18 | 2019-07-18 | Mediatek Inc. | Variable affine merge candidates for video coding |
EP3788787A1 (en) | 2018-06-05 | 2021-03-10 | Beijing Bytedance Network Technology Co. Ltd. | Interaction between ibc and atmvp |
WO2019244117A1 (en) | 2018-06-21 | 2019-12-26 | Beijing Bytedance Network Technology Co., Ltd. | Unified constrains for the merge affine mode and the non-merge affine mode |
CN113115046A (zh) | 2018-06-21 | 2021-07-13 | 北京字节跳动网络技术有限公司 | 分量相关的子块分割 |
KR20210024487A (ko) * | 2018-07-01 | 2021-03-05 | 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 | 효율적인 아핀 병합 모션 벡터 유도 |
CN110677645B (zh) * | 2018-07-02 | 2022-06-10 | 华为技术有限公司 | 一种图像预测方法及装置 |
US10516885B1 (en) | 2018-07-11 | 2019-12-24 | Tencent America LLC | Method and apparatus for video coding |
CN117499672A (zh) | 2018-08-27 | 2024-02-02 | 华为技术有限公司 | 一种视频图像预测方法及装置 |
CN117241039A (zh) * | 2018-08-28 | 2023-12-15 | 华为技术有限公司 | 帧间预测方法、装置以及视频编码器和视频解码器 |
JP7225375B2 (ja) * | 2018-08-30 | 2023-02-20 | 華為技術有限公司 | パレット符号化を使用するエンコード装置、デコード装置および対応する方法 |
BR112021004667A2 (pt) * | 2018-09-12 | 2021-06-01 | Huawei Technologies Co., Ltd. | codificador de vídeo, decodificador de vídeo e métodos correspondentes |
EP3837835A4 (en) | 2018-09-18 | 2021-06-23 | Huawei Technologies Co., Ltd. | CODING PROCESS, DEVICE AND SYSTEM |
PT3847818T (pt) | 2018-09-18 | 2024-03-05 | Huawei Tech Co Ltd | Codificador de vídeo, um descodificador de vídeo e métodos correspondentes |
GB2579763B (en) | 2018-09-21 | 2021-06-09 | Canon Kk | Video coding and decoding |
GB2577318B (en) * | 2018-09-21 | 2021-03-10 | Canon Kk | Video coding and decoding |
GB2597616B (en) * | 2018-09-21 | 2023-01-18 | Canon Kk | Video coding and decoding |
TWI818086B (zh) | 2018-09-24 | 2023-10-11 | 大陸商北京字節跳動網絡技術有限公司 | 擴展Merge預測 |
WO2020070730A2 (en) * | 2018-10-06 | 2020-04-09 | Beijing Bytedance Network Technology Co., Ltd. | Size restriction based on affine motion information |
GB2578150C (en) | 2018-10-18 | 2022-05-18 | Canon Kk | Video coding and decoding |
GB2595054B (en) | 2018-10-18 | 2022-07-06 | Canon Kk | Video coding and decoding |
CN112997495B (zh) | 2018-11-10 | 2024-02-20 | 北京字节跳动网络技术有限公司 | 当前图片参考中的取整 |
CN112997487A (zh) | 2018-11-15 | 2021-06-18 | 北京字节跳动网络技术有限公司 | 仿射模式与其他帧间编解码工具之间的协调 |
CN113016185B (zh) | 2018-11-17 | 2024-04-05 | 北京字节跳动网络技术有限公司 | 以运动矢量差分模式控制Merge |
CN111263147B (zh) | 2018-12-03 | 2023-02-14 | 华为技术有限公司 | 帧间预测方法和相关装置 |
EP3868107A4 (en) * | 2018-12-21 | 2021-12-15 | Beijing Bytedance Network Technology Co. Ltd. | MOTION VECTOR ACCURACY IN INTERACTING WITH MOTION VECTOR DIFFERENCE MODE |
CN111526362B (zh) * | 2019-02-01 | 2023-12-29 | 华为技术有限公司 | 帧间预测方法和装置 |
WO2020181428A1 (zh) | 2019-03-08 | 2020-09-17 | Oppo广东移动通信有限公司 | 预测方法、编码器、解码器及计算机存储介质 |
WO2020181471A1 (zh) * | 2019-03-11 | 2020-09-17 | Oppo广东移动通信有限公司 | 帧内预测方法、装置及计算机存储介质 |
CN117692660A (zh) | 2019-03-12 | 2024-03-12 | Lg电子株式会社 | 图像编码/解码方法以及数据的传输方法 |
WO2020186882A1 (zh) * | 2019-03-18 | 2020-09-24 | 华为技术有限公司 | 基于三角预测单元模式的处理方法及装置 |
CN113853793B (zh) * | 2019-05-21 | 2023-12-19 | 北京字节跳动网络技术有限公司 | 基于光流的帧间编码的语法信令 |
CN113347434B (zh) * | 2019-06-21 | 2022-03-29 | 杭州海康威视数字技术股份有限公司 | 预测模式的解码、编码方法及装置 |
US11132780B2 (en) | 2020-02-14 | 2021-09-28 | Huawei Technologies Co., Ltd. | Target detection method, training method, electronic device, and computer-readable medium |
CN112801906B (zh) * | 2021-02-03 | 2023-02-21 | 福州大学 | 基于循环神经网络的循环迭代图像去噪方法 |
US20230412794A1 (en) * | 2022-06-17 | 2023-12-21 | Tencent America LLC | Affine merge mode with translational motion vectors |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102377992A (zh) * | 2010-08-06 | 2012-03-14 | Huawei Technologies Co., Ltd. | Method and apparatus for obtaining a motion vector predictor |
CN102934440A (zh) * | 2010-05-26 | 2013-02-13 | LG Electronics Inc. | Method and apparatus for processing a video signal |
EP2645720A2 (en) * | 2010-11-23 | 2013-10-02 | LG Electronics Inc. | Method for encoding and decoding images, and device using same |
CN104363451A (zh) * | 2014-10-27 | 2015-02-18 | Huawei Technologies Co., Ltd. | Image prediction method and related apparatus |
CN104539966A (zh) * | 2014-09-30 | 2015-04-22 | Huawei Technologies Co., Ltd. | Image prediction method and related apparatus |
CN104661031A (zh) * | 2015-02-16 | 2015-05-27 | Huawei Technologies Co., Ltd. | Method for video image encoding and decoding, encoding device, and decoding device |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3654664B2 (ja) | 1994-08-23 | 2005-06-02 | Sharp Corp. | Image encoding apparatus and image decoding apparatus |
JP3681342B2 (ja) * | 2000-05-24 | 2005-08-10 | Samsung Electronics Co., Ltd. | Video coding method |
JP4245587B2 (ja) * | 2005-06-22 | 2009-03-25 | Sharp Corp. | Motion compensated prediction method |
US9258519B2 (en) | 2005-09-27 | 2016-02-09 | Qualcomm Incorporated | Encoder assisted frame rate up conversion using various motion models |
JP2012080151A (ja) | 2009-02-09 | 2012-04-19 | Toshiba Corp | Method and apparatus for video encoding and video decoding using geometric-transform motion-compensated prediction |
US20100246675A1 (en) * | 2009-03-30 | 2010-09-30 | Sony Corporation | Method and apparatus for intra-prediction in a video encoder |
WO2011013253A1 (ja) * | 2009-07-31 | 2011-02-03 | Toshiba Corporation | Prediction signal generation apparatus, video encoding apparatus, and video decoding apparatus using geometric-transform motion-compensated prediction |
KR101611437B1 (ko) | 2009-10-28 | 2016-04-26 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding an image by referring to a plurality of frames |
US8179446B2 (en) * | 2010-01-18 | 2012-05-15 | Texas Instruments Incorporated | Video stabilization and reduction of rolling shutter distortion |
CN107071487B (zh) | 2010-11-04 | 2020-09-15 | GE Video Compression, LLC | Picture coding supporting block merging and skip mode |
US9319716B2 (en) * | 2011-01-27 | 2016-04-19 | Qualcomm Incorporated | Performing motion vector prediction for video coding |
CA2830242C (en) * | 2011-03-21 | 2016-11-22 | Qualcomm Incorporated | Bi-predictive merge mode based on uni-predictive neighbors in video coding |
US9282338B2 (en) * | 2011-06-20 | 2016-03-08 | Qualcomm Incorporated | Unified merge mode and adaptive motion vector prediction mode candidates selection |
CN110139108B (zh) * | 2011-11-11 | 2023-07-18 | GE Video Compression, LLC | Apparatus and method for encoding a multi-view signal into a multi-view data stream |
US9420285B2 (en) * | 2012-04-12 | 2016-08-16 | Qualcomm Incorporated | Inter-layer mode derivation for prediction in scalable video coding |
EP2683165B1 (en) * | 2012-07-04 | 2015-10-14 | Thomson Licensing | Method for coding and decoding a block of pixels from a motion model |
WO2016008157A1 (en) * | 2014-07-18 | 2016-01-21 | Mediatek Singapore Pte. Ltd. | Methods for motion compensation using high order motion model |
WO2016137149A1 (ko) * | 2015-02-24 | 2016-09-01 | LG Electronics Inc. | Polygon-unit-based image processing method and apparatus therefor |
CN109005407B (zh) | 2015-05-15 | 2023-09-01 | Huawei Technologies Co., Ltd. | Method for video image encoding and decoding, encoding device, and decoding device |
CN106331722B (zh) | 2015-07-03 | 2019-04-26 | Huawei Technologies Co., Ltd. | Image prediction method and related device |
CN107925758B (zh) * | 2015-08-04 | 2022-01-25 | LG Electronics Inc. | Inter prediction method and device in video coding system |
CN108965869B (zh) | 2015-08-29 | 2023-09-12 | Huawei Technologies Co., Ltd. | Image prediction method and device |
CN108965871B (zh) * | 2015-09-29 | 2023-11-10 | Huawei Technologies Co., Ltd. | Image prediction method and apparatus |
- 2015
- 2015-09-29 CN CN201811031223.9A patent/CN108965871B/zh active Active
- 2015-09-29 CN CN201510632589.1A patent/CN106559669B/zh active Active
- 2015-09-29 CN CN201811025244.XA patent/CN109274974B/zh active Active
- 2016
- 2016-09-08 WO PCT/CN2016/098464 patent/WO2017054630A1/zh active Application Filing
- 2016-09-08 AU AU2016333221A patent/AU2016333221B2/en active Active
- 2016-09-08 KR KR1020187008717A patent/KR102114764B1/ko active IP Right Grant
- 2016-09-08 JP JP2018511731A patent/JP6669859B2/ja active Active
- 2016-09-08 EP EP16850254.0A patent/EP3331243B1/en active Active
- 2016-09-08 KR KR1020207014280A patent/KR102240141B1/ko active IP Right Grant
- 2016-09-08 SG SG11201801863QA patent/SG11201801863QA/en unknown
- 2016-09-08 RU RU2018114921A patent/RU2697726C1/ru active
- 2016-09-08 MY MYPI2018700879A patent/MY196371A/en unknown
- 2016-09-08 BR BR112018006271-5A patent/BR112018006271B1/pt active IP Right Grant
- 2016-09-08 MX MX2018003764A patent/MX2018003764A/es unknown
- 2018
- 2018-03-06 ZA ZA2018/01541A patent/ZA201801541B/en unknown
- 2018-03-16 US US15/923,434 patent/US11323736B2/en active Active
- 2020
- 2020-02-27 JP JP2020031872A patent/JP6882560B2/ja active Active
- 2021
- 2021-05-06 JP JP2021078566A patent/JP7368414B2/ja active Active
- 2021-12-02 US US17/540,928 patent/US20220094969A1/en active Pending
- 2023
- 2023-10-12 JP JP2023176443A patent/JP2024020203A/ja active Pending
Non-Patent Citations (2)
Title |
---|
HUANG, HAN ET AL.: "Affine Skip and Direct modes for efficient video coding", VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2012, 31 December 2012 (2012-12-31), pages 1-6, XP032309255 * |
ZHANG, NA: "Research on MERGE Mode and Related Technologies of the Next Generation Video Coding Standard", CHINA MASTERS' THESES FULL-TEXT DATABASE (ELECTRONIC JOURNALS), 30 April 2014 (2014-04-30), pages 136-227, XP009503959 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11336907B2 (en) | 2018-07-16 | 2022-05-17 | Huawei Technologies Co., Ltd. | Video encoder, video decoder, and corresponding encoding and decoding methods |
RU2772813C1 (ru) * | 2018-07-16 | 2022-05-26 | Хуавей Текнолоджиз Ко., Лтд. | Видеокодер, видеодекодер и соответствующие способы кодирования и декодирования |
CN113170210A (zh) * | 2018-10-10 | 2021-07-23 | InterDigital VC Holdings, Inc. | Affine mode signaling in video encoding and decoding |
CN111107373A (zh) * | 2018-10-29 | 2020-05-05 | Huawei Technologies Co., Ltd. | Inter prediction method based on affine prediction mode and related apparatus |
US11438578B2 (en) | 2018-10-29 | 2022-09-06 | Huawei Technologies Co., Ltd. | Video picture prediction method and apparatus |
CN111107373B (zh) * | 2018-10-29 | 2023-11-03 | Huawei Technologies Co., Ltd. | Inter prediction method based on affine prediction mode and related apparatus |
CN112822514A (zh) * | 2020-12-30 | 2021-05-18 | Peking University | Dependency-based video stream packet transmission method, system, terminal, and medium |
CN112822514B (zh) * | 2020-12-30 | 2022-06-28 | Peking University | Dependency-based video stream packet transmission method, system, terminal, and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7368414B2 (ja) | Image prediction method and apparatus | |
CN110024406B (zh) | Linear model prediction mode with sample accessing for video coding | |
US11146788B2 (en) | Grouping palette bypass bins for video coding | |
TWI666918B (zh) | Determining palette size, palette entries and filtering of palette coded blocks in video coding | |
KR102478411B1 (ko) | Palette mode for subsampling format | |
TWI524744B (zh) | Signaling of picture order count to timing information relations for video timing in video coding | |
JP2018142972A (ja) | Restriction of prediction units in B slices to uni-directional inter prediction | |
TW201838415A (zh) | Determining neighboring samples for bilateral filtering in video coding | |
TW201830964A (zh) | Deriving bilateral filter information based on a prediction mode in video coding | |
JP2018524906A (ja) | Reference picture list construction in intra block copy mode | |
JP2017519447A (ja) | Intra block copy block vector signaling for video coding | |
TW201603563A (zh) | Palette predictor signaling with run-length code for video coding | |
TW201517599A (zh) | Intra motion compensation extensions | |
JP2017523685A (ja) | Block vector coding for intra block copying | |
TW201334544A (zh) | Determining boundary strength values for deblocking filtering in video coding | |
JP2017525316A (ja) | Methods for palette mode coding | |
TW202126040A (zh) | Simplified palette predictor update for video coding | |
TW202133619A (zh) | History-based motion vector prediction constraints for merge estimation region | |
JP2018511238A (ja) | Fast rate-distortion optimized quantization | |
RU2804871C2 (ru) | Image prediction method and apparatus |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
|  | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16850254; Country of ref document: EP; Kind code of ref document: A1 |
|  | WWE | Wipo information: entry into national phase | Ref document number: 2018511731; Country of ref document: JP |
|  | WWE | Wipo information: entry into national phase | Ref document number: 11201801863Q; Country of ref document: SG |
|  | WWE | Wipo information: entry into national phase | Ref document number: MX/A/2018/003764; Country of ref document: MX |
|  | ENP | Entry into the national phase | Ref document number: 20187008717; Country of ref document: KR; Kind code of ref document: A |
|  | ENP | Entry into the national phase | Ref document number: 2016333221; Country of ref document: AU; Date of ref document: 20160908; Kind code of ref document: A |
|  | NENP | Non-entry into the national phase | Ref country code: DE |
|  | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112018006271; Country of ref document: BR |
|  | WWE | Wipo information: entry into national phase | Ref document number: 2018114921; Country of ref document: RU |
|  | ENP | Entry into the national phase | Ref document number: 112018006271; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20180328 |