WO2017054630A1 - Method and apparatus for image prediction - Google Patents

Method and apparatus for image prediction

Info

Publication number
WO2017054630A1
Authority
WO
WIPO (PCT)
Prior art keywords
mode
affine
image
prediction mode
processed
Prior art date
Application number
PCT/CN2016/098464
Other languages
English (en)
French (fr)
Inventor
陈焕浜
林四新
张红
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to RU2018114921A priority Critical patent/RU2697726C1/ru
Priority to KR1020187008717A priority patent/KR102114764B1/ko
Priority to BR112018006271-5A priority patent/BR112018006271B1/pt
Priority to MYPI2018700879A priority patent/MY196371A/en
Priority to SG11201801863QA priority patent/SG11201801863QA/en
Priority to AU2016333221A priority patent/AU2016333221B2/en
Priority to MX2018003764A priority patent/MX2018003764A/es
Priority to JP2018511731A priority patent/JP6669859B2/ja
Application filed by 华为技术有限公司
Priority to KR1020207014280A priority patent/KR102240141B1/ko
Priority to EP16850254.0A priority patent/EP3331243B1/en
Publication of WO2017054630A1 publication Critical patent/WO2017054630A1/zh
Priority to ZA2018/01541A priority patent/ZA201801541B/en
Priority to US15/923,434 priority patent/US11323736B2/en
Priority to US17/540,928 priority patent/US20220094969A1/en


Classifications

    • H ELECTRICITY
        • H04 ELECTRIC COMMUNICATION TECHNIQUE
            • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
                    • H04N19/10 … using adaptive coding
                        • H04N19/102 … characterised by the element, parameter or selection affected or controlled by the adaptive coding
                            • H04N19/103 Selection of coding mode or of prediction mode
                                • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
                                • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
                        • H04N19/134 … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
                            • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
                                • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
                        • H04N19/169 … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                            • H04N19/17 … the unit being an image region, e.g. an object
                    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
                    • H04N19/50 … using predictive coding
                        • H04N19/503 … involving temporal prediction
                            • H04N19/51 Motion estimation or motion compensation
                                • H04N19/513 Processing of motion vectors
                                    • H04N19/517 Processing of motion vectors by encoding
                                        • H04N19/52 Processing of motion vectors by encoding by predictive encoding
                                • H04N19/537 Motion estimation other than block-based
                                    • H04N19/54 Motion estimation other than block-based using feature points or meshes
                    • H04N19/70 … characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present invention relates to the field of video coding and compression, and more particularly to a method and apparatus for image prediction.
  • Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like.
  • Digital video devices implement video compression techniques such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC).
  • Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
  • Video compression techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences.
  • A video slice (i.e., a video frame or a portion of a video frame) may be partitioned into a number of video blocks, which may also be referred to as tree blocks, coding units (CUs), and/or coding nodes.
  • Spatial prediction is used to encode video blocks in intra-coded (I) slices of a picture relative to reference samples in neighboring blocks in the same picture.
  • A video block in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures.
  • A picture may be referred to as a frame, and a reference picture may be referred to as a reference frame.
  • Spatial or temporal prediction produces a predictive block of the block to be coded.
  • The residual data represents the pixel differences between the original block to be coded and the predictive block.
  • the inter-coded block is encoded according to a motion vector that points to a reference sample block that forms a predictive block and residual data that indicates a difference between the coded block and the predictive block.
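The relationship described above (a predictive block located by a motion vector, plus residual data) can be sketched in a few lines of Python; the function and array names here are invented for illustration and are not part of the patent:

```python
import numpy as np

def reconstruct_inter_block(reference, mv, block_pos, residual):
    """Reconstruct an inter-coded block: fetch the predictive block that the
    motion vector points to in the reference picture, then add the residual.
    All names are illustrative, not taken from the patent text."""
    x, y = block_pos
    dx, dy = mv
    h, w = residual.shape
    prediction = reference[y + dy : y + dy + h, x + dx : x + dx + w]
    return prediction + residual

# A toy 8x8 reference picture and a 2x2 block at (2, 2) displaced by (1, 1).
reference = np.arange(64, dtype=np.int64).reshape(8, 8)
residual = np.array([[1, -1], [0, 2]])
block = reconstruct_inter_block(reference, (1, 1), (2, 2), residual)
```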
  • the intra-coded block is encoded according to an intra coding mode and residual data.
  • the residual data may be transformed from a pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized.
  • the quantized transform coefficients initially arranged in a two-dimensional array can be sequentially scanned to produce a one-dimensional vector of transform coefficients, and entropy coding can be applied to achieve more compression.
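The sequential scan of a two-dimensional coefficient array into a one-dimensional vector can be illustrated with the classic zig-zag order (one common choice; real codecs define several scan orders):

```python
def zigzag_scan(block):
    # Scan an n x n array of quantized transform coefficients into a 1-D
    # vector in zig-zag order: anti-diagonals of increasing index, with the
    # traversal direction alternating on odd/even diagonals.
    n = len(block)
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [block[r][c] for r, c in order]

# Low-frequency coefficients come first; trailing zeros then compress well.
coeffs = [[9, 4, 1],
          [5, 2, 0],
          [3, 0, 0]]
vector = zigzag_scan(coeffs)
```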
  • the present invention describes an image prediction method for improving coding efficiency.
  • The prediction mode of an image unit to be processed is derived from the prediction information or unit size of neighboring image units of the image unit to be processed, or from a prediction mode candidate set indicated at the region level. Because this provides a priori information for coding the prediction mode, the bit rate spent on coding the prediction mode is reduced and coding efficiency is improved.
  • A prediction image decoding method includes: determining, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; parsing first indication information in the code stream; determining, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
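To make the phrase "the same affine model" concrete: under a simplified 4-parameter affine motion model (an illustrative parameterisation, not necessarily the one used in the patent), the motion vector varies with pixel position, and a purely translational model is the special case in which the rotation/scaling terms are zero:

```python
def affine_mv(x, y, params):
    # Motion vector at pixel (x, y) under a simplified 4-parameter affine
    # model (a, b, c, d): rotation/scaling terms a, b plus translation c, d.
    a, b, c, d = params
    return (a * x - b * y + c, b * x + a * y + d)

# With a = b = 0 the model degenerates to pure translation: every pixel of
# the unit gets the same motion vector (c, d).
mv_origin = affine_mv(0, 0, (0.0, 0.0, 1.0, 2.0))
mv_corner = affine_mv(7, 7, (0.0, 0.0, 1.0, 2.0))
```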
  • The adjacent image units of the image unit to be processed include at least the adjacent image units located above, to the left, to the upper right, to the lower left, and to the upper left of the image unit to be processed.
  • Determining, according to information of the adjacent image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode may be implemented in any of the following ways:
  • The first implementation: when the prediction mode of at least one of the adjacent image units obtains a predicted image using an affine model, parse second indication information in the code stream; when the second indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode. Otherwise (no adjacent unit uses an affine model), the candidate prediction mode set does not include the affine merge mode.
  • The second implementation: when the prediction mode of at least one of the adjacent image units obtains a predicted image using an affine model, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
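The second implementation reduces to a single membership test; a minimal sketch, with illustrative mode labels:

```python
def includes_affine_merge(neighbor_modes):
    # Second implementation: the candidate prediction mode set contains the
    # affine merge mode iff at least one adjacent unit was predicted with an
    # affine model. Mode labels are illustrative, not normative.
    return any(mode.startswith("affine") for mode in neighbor_modes)

with_affine = includes_affine_merge(["translational", "affine_inter"])
without = includes_affine_merge(["translational", "intra"])
```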
  • The third implementation: the prediction modes include at least a first affine mode, which obtains the predicted image using a first affine model, and a second affine mode, which obtains the predicted image using a second affine model; correspondingly, the affine merge modes include at least a first affine merge mode, which merges the first affine mode, and a second affine merge mode, which merges the second affine mode. Determining whether the candidate prediction mode set includes an affine merge mode then includes: when the first affine mode is the most frequent among the prediction modes of the neighboring prediction units, the candidate prediction mode set includes the first affine merge mode and does not include the second; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first; when a non-affine mode is the most frequent and the first affine mode is the second most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second; and when a non-affine mode is the most frequent and the second affine mode is the second most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first.
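The third implementation is essentially a frequency count over the neighbours' prediction modes; a sketch under the assumption of two affine mode labels (names invented for illustration):

```python
from collections import Counter

def choose_affine_merge_mode(neighbor_modes):
    # Third implementation, as described above (labels illustrative):
    # include the first affine merge mode when the first affine mode is the
    # most frequent neighbour mode (or, failing that, the runner-up), and
    # symmetrically for the second; return None when neither affine mode is
    # among the two most frequent neighbour modes.
    counts = Counter(neighbor_modes)
    ranked = [mode for mode, _ in counts.most_common()]
    for mode in ranked[:2]:  # most frequent, then second most frequent
        if mode == "affine1":
            return "affine_merge_1"
        if mode == "affine2":
            return "affine_merge_2"
    return None

mode = choose_affine_merge_mode(["affine1", "translate", "affine1"])
```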
  • The fourth implementation: when the prediction mode of at least one of the adjacent image units obtains a predicted image using an affine model, and the width and height of that adjacent image unit are respectively smaller than the width and height of the image unit to be processed, parse third indication information in the code stream; when the third indication information is 1, the candidate prediction mode set includes the affine merge mode, and when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode. Otherwise, the candidate prediction mode set does not include the affine merge mode.
  • The fifth implementation: when the prediction mode of at least one of the adjacent image units obtains a predicted image using an affine model, and the width and height of that adjacent image unit are respectively smaller than the width and height of the image unit to be processed, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
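The fourth and fifth implementations add a size condition to the neighbour test; the fifth (the variant without the parsed flag) can be sketched as:

```python
def includes_affine_merge_sized(neighbors, cur_w, cur_h):
    # Fifth implementation: include the affine merge mode iff some adjacent
    # unit used an affine model AND its width and height are each smaller
    # than those of the unit to be processed. Neighbour records are
    # (mode, width, height); all names are illustrative.
    return any(mode == "affine" and w < cur_w and h < cur_h
               for mode, w, h in neighbors)

ok = includes_affine_merge_sized([("affine", 8, 8), ("translational", 16, 16)], 16, 16)
too_big = includes_affine_merge_sized([("affine", 16, 16)], 16, 16)
```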
  • A prediction image decoding method includes: parsing first indication information in a code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set of the first to-be-processed image region is a candidate translational mode set (a translational mode being a prediction mode that obtains a predicted image using a translational model), and, when the first indication information is 1, the candidate mode set of the first to-be-processed image region is the union of a candidate translational mode set and a candidate affine mode set (an affine mode being a prediction mode that obtains a predicted image using an affine model); parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of an image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
  • The first to-be-processed image region is one of an image frame group, an image frame, an image slice set, an image slice, an image coding unit set, or an image coding unit.
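The region-level signalling can be sketched as a selection between two candidate families; the concrete mode names below are invented for illustration, since the patent only distinguishes translational from affine modes:

```python
def region_candidate_modes(first_indication):
    # Region-level signalling: 0 -> translational candidate modes only,
    # 1 -> translational plus affine candidates. The mode names are
    # illustrative placeholders, not the patent's terminology.
    translational = {"skip", "merge", "inter"}
    affine = {"affine_merge", "affine_inter"}
    return translational | affine if first_indication == 1 else set(translational)

modes0 = region_candidate_modes(0)
modes1 = region_candidate_modes(1)
```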
  • A predictive image decoding method includes: determining, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; parsing first indication information in the code stream; determining, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
  • A predictive image encoding method includes: determining, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; determining, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determining, according to the prediction mode, a predicted image of the image unit to be processed; and encoding first indication information into a code stream, where the first indication information indicates the prediction mode.
  • A prediction image decoding method includes: parsing first indication information in a code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set is a candidate translational mode set (a translational mode being a prediction mode that obtains a predicted image using a translational model), and, when the first indication information is 1, the candidate mode set is the union of a candidate translational mode set and a candidate affine mode set (an affine mode being a prediction mode that obtains a predicted image using an affine model); parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of an image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; and determining, according to the prediction mode, a predicted image of the image unit to be processed.
  • A predictive image encoding method includes: when the candidate mode set of a first to-be-processed image region is a candidate translational mode set, setting first indication information to 0 and encoding the first indication information into a code stream (a translational mode being a prediction mode that obtains a predicted image using a translational model); when the candidate mode set of the first to-be-processed image region is the union of a candidate translational mode set and a candidate affine mode set, setting the first indication information to 1 and encoding the first indication information into the code stream (an affine mode being a prediction mode that obtains a predicted image using an affine model); determining, from the candidate mode set of the first to-be-processed image region, a prediction mode of an image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; determining, according to the prediction mode, a predicted image of the image unit to be processed; and encoding second indication information into the code stream, where the second indication information indicates the prediction mode.
  • A predictive image decoding apparatus includes: a first determining module, configured to determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; a parsing module, configured to parse first indication information in a code stream; a second determining module, configured to determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • A predictive image encoding apparatus includes: a first determining module, configured to determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; a second determining module, configured to determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed; and an encoding module, configured to encode first indication information into a code stream, where the first indication information indicates the prediction mode.
  • A predictive image decoding apparatus includes: a first parsing module, configured to parse first indication information in a code stream; a first determining module, configured to determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set is a candidate translational mode set (a translational mode being a prediction mode that obtains a predicted image using a translational model), and, when the first indication information is 1, the candidate mode set is the union of a candidate translational mode set and a candidate affine mode set (an affine mode being a prediction mode that obtains a predicted image using an affine model); a second parsing module, configured to parse second indication information in the code stream; a second determining module, configured to determine, according to the second indication information, a prediction mode of an image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and a third determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • A predictive image encoding apparatus includes: a first encoding module, configured to set first indication information to 0 and encode it into a code stream when the candidate mode set of a first to-be-processed image region is a candidate translational mode set (a translational mode being a prediction mode that obtains a predicted image using a translational model), and to set the first indication information to 1 and encode it into the code stream when the candidate mode set of the first to-be-processed image region is the union of a candidate translational mode set and a candidate affine mode set (an affine mode being a prediction mode that obtains a predicted image using an affine model); a first determining module, configured to determine, from the candidate mode set of the first to-be-processed image region, a prediction mode of an image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; a second determining module, configured to determine, according to the prediction mode, a predicted image of the image unit to be processed; and a second encoding module, configured to encode second indication information into the code stream, where the second indication information indicates the prediction mode.
  • An apparatus for decoding video data includes a video decoder configured to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; parse first indication information in a code stream; determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • An apparatus for encoding video data includes a video encoder configured to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode first indication information into a code stream, where the first indication information indicates the prediction mode.
  • An apparatus for decoding video data includes a video decoder configured to: parse first indication information in a code stream; determine, according to the first indication information, a candidate mode set of a first to-be-processed image region, where, when the first indication information is 0, the candidate mode set is a candidate translational mode set (a translational mode being a prediction mode that obtains a predicted image using a translational model), and, when the first indication information is 1, the candidate mode set is the union of a candidate translational mode set and a candidate affine mode set (an affine mode being a prediction mode that obtains a predicted image using an affine model); parse second indication information in the code stream; determine, according to the second indication information, a prediction mode of an image unit to be processed from the candidate mode set of the first to-be-processed image region, where the image unit to be processed belongs to the first to-be-processed image region; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • An apparatus for encoding video data includes a video encoder configured to: when the candidate mode set of a first to-be-processed image region is a candidate translational mode set, set first indication information to 0 and encode the first indication information into a code stream (a translational mode being a prediction mode that obtains a predicted image using a translational model); when the candidate mode set of the first to-be-processed image region is the union of a candidate translational mode set and a candidate affine mode set, set the first indication information to 1 and encode the first indication information into the code stream (an affine mode being a prediction mode that obtains a predicted image using an affine model); determine, from the candidate mode set of the first to-be-processed image region, a prediction mode of an image unit to be processed, where the image unit to be processed belongs to the first to-be-processed image region; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode second indication information into the code stream, where the second indication information indicates the prediction mode.
  • A computer-readable storage medium stores instructions that, when executed, cause one or more processors of a device for decoding video data to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; parse first indication information in a code stream; determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • A computer-readable storage medium stores instructions that, when executed, cause one or more processors of a device for encoding video data to: determine, according to information of adjacent image units adjacent to an image unit to be processed, whether a candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image units obtain their respective predicted images using the same affine model; determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; determine, according to the prediction mode, a predicted image of the image unit to be processed; and encode first indication information into a code stream, where the first indication information indicates the prediction mode.
  • a computer readable storage medium storing instructions that, when executed, cause one or more processors of a device that decodes video data to perform the following operations: parsing first indication information in the code stream; determining, according to the first indication information, a candidate mode set of the first to-be-processed image region, where, when the first indication information is 0, the candidate mode set of the first to-be-processed image region adopts a candidate translation mode set, the translation mode representing a prediction mode that obtains a predicted image using a translational model, and, when the first indication information is 1, the candidate mode set of the first to-be-processed image region adopts a candidate translation mode set and a candidate affine mode set, the affine mode representing a prediction mode that obtains a predicted image using an affine model; parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region, the image unit to be processed belonging to the first to-be-processed image region; and determining a predicted image of the image unit to be processed according to the prediction mode.
  • a computer readable storage medium storing instructions that, when executed, cause one or more processors of a device that encodes video data to perform the following operations: when the candidate mode set of the first to-be-processed image region adopts only the candidate translation mode set, setting the first indication information to 0 and encoding the first indication information into the code stream, the translation mode representing a prediction mode that obtains a predicted image using a translational model; when the candidate mode set of the first to-be-processed image region adopts the candidate translation mode set and the candidate affine mode set, setting the first indication information to 1 and encoding the first indication information into the code stream, the affine mode representing a prediction mode that obtains a predicted image using an affine model; determining, from the candidate prediction mode set of the first to-be-processed image region, a prediction mode of the image unit to be processed, the image unit to be processed belonging to the first to-be-processed image region; determining a predicted image of the image unit to be processed according to the prediction mode; and encoding second indication information into the code stream, the second indication information indicating the prediction mode.
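The region-level signalling in the items above can be sketched as a simple mapping from the parsed first indication information to a candidate mode set. This is an illustrative sketch only: the mode names (`skip`, `merge`, `amvp`, `affine_merge`, `affine_amvp`) are hypothetical placeholders, not syntax elements of any standard.

```python
# Hypothetical mode names; only the 0/1 switching logic follows the text above.
TRANSLATIONAL_MODES = ["skip", "merge", "amvp"]   # translational-model modes
AFFINE_MODES = ["affine_merge", "affine_amvp"]    # affine-model modes

def candidate_mode_set(first_indication: int) -> list:
    """Map the first indication information to the region's candidate mode set."""
    if first_indication == 0:
        # The region is restricted to prediction modes using a translational model.
        return list(TRANSLATIONAL_MODES)
    if first_indication == 1:
        # The region may additionally use affine prediction modes.
        return TRANSLATIONAL_MODES + AFFINE_MODES
    raise ValueError("first indication information must be 0 or 1")
```

A decoder would then pick the prediction mode of each image unit in the region from this set according to the second indication information.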
  • FIG. 1 is a schematic block diagram of a video decoding system in accordance with an embodiment of the present invention
  • FIG. 2 is a schematic block diagram of a video encoder in accordance with an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of an example operation of a video encoder according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram showing the positions of a to-be-processed block and its adjacent reconstructed blocks according to an embodiment of the present invention
  • FIG. 5 is a schematic block diagram of another video encoder according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of another example operation of a video encoder according to an embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of still another video encoder in accordance with an embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of a video decoder according to an embodiment of the present invention.
  • FIG. 9 is a schematic flowchart of an example operation of a video decoder according to an embodiment of the present invention.
  • FIG. 10 is a schematic block diagram of another video decoder according to an embodiment of the present invention.
  • FIG. 11 is a schematic flowchart of another example operation of a video decoder according to an embodiment of the present invention.
  • FIG. 12 is a schematic block diagram of still another video decoder in accordance with an embodiment of the present invention.
  • Motion compensation is one of the most critical techniques for improving compression efficiency in video coding.
  • Traditional block-matching-based motion compensation is the mainstream method used in video encoders, and in particular in video coding standards.
  • In block-matching-based motion compensation, an inter-predicted block adopts a translational motion model, which assumes that the motion vectors at all pixel positions in a block are equal. However, this assumption does not hold in many cases.
  • The motion of objects in real-world video is often a complex combination of motions such as translation, rotation, and scaling. If such complex motion is contained in a pixel block, the prediction signal obtained by the conventional block-matching-based motion compensation method is not accurate enough, and the inter-frame correlation cannot be sufficiently removed.
  • To address this, higher-order motion models have been introduced into the motion compensation of video coding. Compared with the translational motion model, a higher-order motion model has more degrees of freedom, allowing the motion vector of each pixel in an inter-predicted block to be different; that is, the motion vector field generated by a higher-order motion model has better precision.
  • The affine motion model based on a control-point description is a representative higher-order motion model. Unlike in the traditional translational motion model, the value of the motion vector of each pixel in the block is related to its position: it is a first-order linear function of the coordinate position.
  • The affine motion model allows the reference block to undergo warping transformations such as rotation and scaling, so that a more accurate prediction block can be obtained in motion compensation.
  • The type of inter prediction in which the prediction block is obtained during motion compensation by the affine motion model is generally referred to as the affine mode.
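As a concrete illustration of that first-order linear relation, the common two-control-point (4-parameter) affine form derives the motion vector at each pixel from the motion vectors of the block's top-left and top-right corners. This is a sketch of the general technique, not the exact formulation used by any particular codec:

```python
def affine_mv(x, y, v0, v1, w):
    """Motion vector at pixel (x, y) of a block of width w, given the
    control-point motion vectors v0 (top-left corner) and v1 (top-right
    corner).  Rotation and scaling are captured by the two shared
    coefficients ax and ay; translation is captured by v0 itself."""
    ax = (v1[0] - v0[0]) / w
    ay = (v1[1] - v0[1]) / w
    vx = ax * x - ay * y + v0[0]
    vy = ay * x + ax * y + v0[1]
    return (vx, vy)
```

Note that the motion vector varies linearly with (x, y): it equals v0 at the top-left corner and v1 at the top-right corner.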
  • the inter prediction type includes two modes: advanced motion vector prediction (AMVP) and merge (Merge).
  • AMVP needs to explicitly signal the prediction direction of each coding block.
  • The Merge mode directly derives the motion information of the current coding block from the motion vectors of neighboring blocks.
  • Combining the affine motion model with the AMVP- and Merge-based inter prediction methods of the translational motion model yields new inter prediction modes based on the affine motion model, such as affine AMVP and affine Merge.
  • For example, the Merge mode based on the affine motion model may be used; this is called the affine merge mode (Affine Merge).
  • In the mode selection process, the new prediction mode and the prediction modes existing in the current standard participate in a “performance cost ratio” comparison; the optimal mode is selected as the prediction mode, and the predicted image of the block to be processed is generated.
  • the result of the prediction mode selection is encoded and transmitted to the decoder.
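The “performance cost ratio” comparison above is essentially a cost minimization over the candidate modes; a minimal sketch, assuming a Lagrangian cost D + λ·R with an illustrative λ and made-up values:

```python
def select_mode(candidates, distortion, rate, lam=0.5):
    """Return the candidate mode minimizing distortion + lam * rate."""
    return min(candidates, key=lambda m: distortion[m] + lam * rate[m])

# An affine mode may win on distortion even though its control-point motion
# information costs more rate (all numbers are invented for illustration):
best = select_mode(["merge", "affine_merge"],
                   distortion={"merge": 10.0, "affine_merge": 4.0},
                   rate={"merge": 2.0, "affine_merge": 8.0})
```

With a larger λ (rate weighted more heavily), the same comparison can flip back to the cheaper translational mode.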
  • Generally, the affine mode can improve the precision of the prediction block and thereby improve coding efficiency.
  • it takes more code rate to encode the motion information of each control point.
  • Moreover, the code rate used to encode the prediction mode selection result also increases because of the larger number of candidate prediction modes.
  • In embodiments of the present invention, the code stream is parsed to obtain indication information that determines whether a certain region uses a candidate prediction mode set including the affine mode; the prediction mode is then determined according to the candidate prediction mode set and additional received indication information, and the predicted image is generated.
  • The prediction mode information or the size information of the adjacent image units of the image unit to be processed may be used as a priori knowledge for encoding the prediction information of the block to be processed, and the indication information formed by the candidate prediction mode set of the region may likewise serve as a priori knowledge of that prediction information. This prior knowledge guides the coding of the prediction mode, saves the code rate spent on coding mode selection information, and improves coding efficiency.
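A minimal sketch of using neighbor information as such a priori knowledge, under the (hypothetical) rule that the affine merge mode enters the candidate set only when at least one adjacent, already-reconstructed unit obtained its predicted image with an affine model:

```python
def includes_affine_merge(neighbor_modes):
    """Decide, from the prediction modes of adjacent image units, whether the
    candidate prediction mode set of the unit to be processed should include
    the affine merge mode.  Mode names are illustrative placeholders."""
    return any(mode.startswith("affine") for mode in neighbor_modes)
```

When this returns False, no bit needs to be spent distinguishing the affine merge mode, which is how the a priori knowledge saves code rate.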
  • In the embodiments of the present invention, the affine model is a collective term for non-translational motion models.
  • Actual motions, including rotation, scaling, deformation, perspective, and the like, can be used for motion estimation and motion compensation in inter-frame prediction by establishing different motion models.
  • For brevity, these are simply referred to as the first affine model, the second affine model, and so on, respectively.
  • FIG. 1 is a schematic block diagram of a video coding system 10 in accordance with an embodiment of the present invention.
  • As used herein, the term “video coder” generally refers to both video encoders and video decoders.
  • The terms “video coding” or “coding” may generally refer to video encoding or video decoding.
  • video coding system 10 includes source device 12 and destination device 14.
  • Source device 12 produces encoded video data.
  • Source device 12 may be referred to as a video encoding device or a video encoding apparatus.
  • Destination device 14 may decode the encoded video data produced by source device 12.
  • Destination device 14 may be referred to as a video decoding device or a video decoding apparatus.
  • Source device 12 and destination device 14 may be examples of video coding devices or video coding apparatuses.
  • Source device 12 and destination device 14 may include a wide range of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called “smart” phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
  • Channel 16 may include one or more media and/or devices capable of moving encoded video data from source device 12 to destination device 14.
  • channel 16 may include one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time.
  • source device 12 may modulate the encoded video data in accordance with a communication standard (eg, a wireless communication protocol) and may transmit the modulated video data to destination device 14.
  • the one or more communication media can include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network (eg, a local area network, a wide area network, or a global network (eg, the Internet)).
  • the one or more communication media can include a router, a switch, a base station, or other device that facilitates communication from the source device 12 to the destination device 14.
  • channel 16 can include a storage medium that stores encoded video data generated by source device 12.
  • destination device 14 can access the storage medium via disk access or card access.
  • the storage medium may include a variety of locally accessible data storage media, such as Blu-ray Disc, DVD, CD-ROM, flash memory, or other suitable digital storage medium for storing encoded video data.
  • channel 16 can include a file server or another intermediate storage device that stores encoded video data generated by source device 12.
  • destination device 14 may access encoded video data stored at a file server or other intermediate storage device via streaming or download.
  • The file server may be a type of server capable of storing encoded video data and transmitting the encoded video data to destination device 14.
  • Example file servers include a web server (e.g., for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, and a local disk drive.
  • the techniques of the present invention are not limited to wireless applications or settings.
  • the techniques can be applied to video coding supporting a variety of multimedia applications, such as aerial television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (eg, via the Internet), and storage on data storage media. Encoding of video data, decoding of video data stored on a data storage medium, or other application.
  • video coding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • source device 12 includes a video source 18, a video encoder 20, and an output interface 22.
  • Output interface 22 may include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 18 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or another source for generating video data.
  • Video encoder 20 may encode video data from video source 18.
  • source device 12 transmits encoded video data directly to destination device 14 via output interface 22.
  • the encoded video data may also be stored on a storage medium or file server for later access by the destination device 14 for decoding and/or playback.
  • destination device 14 includes an input interface 28, a video decoder 30, and a display device 32.
  • input interface 28 includes a receiver and/or a modem.
  • Input interface 28 may receive encoded video data via channel 16.
  • Display device 32 may be integral with destination device 14 or may be external to destination device 14. In general, display device 32 displays the decoded video data.
  • Display device 32 may include a variety of display devices such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.
  • Video encoder 20 and video decoder 30 may operate in accordance with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions.
  • FIG. 1 is merely an example and the techniques of the present invention are applicable to video coding settings (eg, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
  • In other examples, data is retrieved from local memory, streamed over a network, or handled in a similar manner.
  • the encoding device may encode the data and store the data to a memory, and/or the decoding device may retrieve the data from the memory and decode the data.
  • In some examples, encoding and decoding are performed by devices that do not communicate with each other, but simply encode data to memory and/or retrieve data from memory and decode it.
  • Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partially in software, a device may store the instructions of the software in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of the present invention. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.
  • Video encoder 20 may partition a picture into a grid of coding tree blocks (CTBs).
  • A CTB may be referred to as a “tree block,” a “largest coding unit” (LCU), or a “coding tree unit.” The CTBs of HEVC may be substantially similar to the macroblocks of previous standards (e.g., H.264/AVC). However, a CTB is not necessarily limited to a particular size and may include one or more coding units (CUs).
  • the CTB of the picture can be grouped into one or more slices.
  • each of the slices contains an integer number of CTBs.
  • video encoder 20 may generate an encoded representation (ie, a coded slice) of each slice of the picture.
  • video encoder 20 may encode each CTB of the slice to produce an encoded representation of each of the sliced CTBs (ie, coded CTB).
  • video encoder 20 may recursively perform quadtree partitioning on the block of pixels associated with the CTB to segment the block of pixels into progressively decreasing blocks of pixels.
  • Each of the smaller blocks of pixels can be associated with a CU.
  • a partitioned CU may be a CU that is partitioned into blocks of pixels associated with other CUs.
  • An unpartitioned CU may be a CU in which a block of pixels is not partitioned into blocks of pixels associated with other CUs.
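The recursive quadtree partitioning described above can be sketched as follows. The split decision is supplied by the caller as a predicate; in a real encoder it would be driven by rate-distortion cost, so this is illustrative only:

```python
def quadtree_leaves(x, y, size, split, min_size=8):
    """Partition the pixel block at (x, y) of the given size and return the
    (x, y, size) triples of the resulting unpartitioned blocks, visiting the
    four sub-blocks in z-scan order (upper left, upper right, lower left,
    lower right)."""
    if size > min_size and split(x, y, size):
        half = size // 2
        leaves = []
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            leaves += quadtree_leaves(x + dx, y + dy, half, split, min_size)
        return leaves
    return [(x, y, size)]
```

For example, splitting every block larger than 16 turns a 64×64 CTB pixel block into sixteen 16×16 blocks.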
  • Video encoder 20 may generate one or more prediction units (PUs) for each unpartitioned CU. Each of the PUs of the CU may be associated with a different block of pixels within a block of pixels of the CU. Video encoder 20 may generate predictive pixel blocks for each PU of the CU. The predictive pixel block of the PU can be a block of pixels.
  • Video encoder 20 may use intra prediction or inter prediction to generate predictive pixel blocks for a PU. If video encoder 20 uses intra prediction to generate the predictive pixel block of a PU, video encoder 20 may generate the predictive pixel block of the PU based on decoded pixels of the picture associated with the PU. If video encoder 20 uses inter prediction to generate the predictive pixel block of a PU, video encoder 20 may generate the predictive pixel block of the PU based on decoded pixels of one or more pictures different from the picture associated with the PU.
  • Video encoder 20 may generate residual pixel blocks of the CU based on the predictive pixel blocks of the PU of the CU.
  • The residual pixel block of the CU may indicate the differences between the samples in the predictive pixel blocks of the PUs of the CU and the corresponding samples in the initial pixel block of the CU.
  • Moreover, video encoder 20 may perform recursive quadtree partitioning on the residual pixel block of the CU to partition it into one or more smaller residual pixel blocks associated with the transform units (TUs) of the CU. Because the pixels in the pixel block associated with a TU each comprise one luma sample and two chroma samples, each TU may be associated with one residual sample block of luma samples and two residual sample blocks of chroma samples.
  • Video encoder 20 may generate a set of syntax elements that represent coefficients in the quantized coefficient block. Video encoder 20 may apply an entropy encoding operation (eg, a context adaptive binary arithmetic coding (CABAC) operation) to at least some of such syntax elements.
  • To apply CABAC encoding to a syntax element, video encoder 20 may binarize the syntax element to form a binary string comprising a series of one or more bits (referred to as “bins”). Video encoder 20 may encode some of the bins using regular CABAC encoding, and may use bypass encoding to encode the other bins.
  • video encoder 20 may first identify the coding context.
  • The coding context can identify the probability of coding a bin having a particular value. For example, the coding context may indicate that the probability of coding a zero-valued bin is 0.7 and the probability of coding a one-valued bin is 0.3.
  • Based on the identified coding context, video encoder 20 may divide the interval into a lower subinterval and an upper subinterval. One of the subintervals may be associated with the value 0, and the other subinterval with the value 1. The width of each subinterval may be proportional to the probability indicated for the associated value by the identified coding context.
  • When video encoder 20 uses bypass encoding to encode a sequence of bins, it may be able to code several bins in a single cycle, whereas when video encoder 20 uses regular CABAC encoding, it may be able to code only a single bin in a cycle.
  • Bypass coding can be simpler because it does not require video encoder 20 to select a context and allows video encoder 20 to assume that the probabilities of the two symbols (0 and 1) are each 1/2 (50%). Therefore, in bypass coding, the interval is split directly into two halves. In effect, bypass coding bypasses the context-adaptive portion of the arithmetic coding engine.
  • Performing bypass coding on a bin may be less computationally intensive than performing regular CABAC coding on the bin.
  • Moreover, bypass coding enables higher parallelism and throughput.
  • Bins encoded using bypass coding may be referred to as “bypass-coded bins.”
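The interval subdivision behind both regular and bypass coding can be sketched with one helper: the split point is placed in proportion to the probability of the zero-valued bin, and bypass coding is simply the special case p = 0.5:

```python
def split_interval(low, high, p_zero):
    """Split the current arithmetic-coding interval into the subinterval
    associated with the value 0 and the subinterval associated with the
    value 1; each width is proportional to its value's probability."""
    mid = low + (high - low) * p_zero
    return (low, mid), (mid, high)

# Regular coding with the example context above (P(0) = 0.7, P(1) = 0.3):
zero_iv, one_iv = split_interval(0.0, 1.0, 0.7)
# Bypass coding assumes equiprobable symbols, halving the interval:
bypass_zero, bypass_one = split_interval(0.0, 1.0, 0.5)
```

This is only the single-step subdivision; a full arithmetic coder also renormalizes the interval and emits bits, which is omitted here.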
  • video encoder 20 may apply inverse quantization and inverse transform to the transform block to reconstruct the residual sample block from the transform block.
  • Video encoder 20 may add reconstructed residual sample blocks to corresponding samples from one or more predictive sample blocks to produce reconstructed sample blocks.
  • By reconstructing the block of samples of each color component video encoder 20 may reconstruct the block of pixels associated with the TU. By reconstructing the block of pixels of each TU of the CU in this way, video encoder 20 can reconstruct the block of pixels that make up the CU.
  • video encoder 20 may perform a deblocking operation to reduce blockiness artifacts associated with the CU.
  • In some examples, video encoder 20 may use sample adaptive offset (SAO) to modify the reconstructed pixel blocks of the CTBs of the picture.
  • adding an offset value to a pixel in a picture can improve coding efficiency.
  • Video encoder 20 may store the reconstructed pixel blocks of the CU in a decoded picture buffer for use in generating predictive pixel blocks for other CUs.
  • Video decoder 30 may decode the bin having the value associated with the upper subinterval. To decode the next bin of the syntax element, video decoder 30 may repeat these steps with the interval being the subinterval containing the encoded value. When video decoder 30 repeats these steps for the next bin, it may use modified probabilities based on the probabilities indicated by the identified coding context and the decoded bin. Video decoder 30 may then debinarize the bins to recover the syntax element. Debinarization may refer to selecting a syntax element value according to a mapping between a binary string and a syntax element value.
  • Video decoder 30 may reconstruct a picture of the video data based on the syntax elements extracted from the bitstream. The process of reconstructing video data based on syntax elements is generally reciprocal to the program executed by video encoder 20 to generate syntax elements. For example, video decoder 30 may generate a predictive pixel block of a PU of a CU based on syntax elements associated with the CU. Additionally, video decoder 30 may inverse quantize the coefficient blocks associated with the TUs of the CU. Video decoder 30 may perform an inverse transform on the coefficient block to reconstruct a residual pixel block associated with the TU of the CU. Video decoder 30 may reconstruct a block of pixels that constitute a CU based on the predictive pixel block and the residual pixel block.
  • video decoder 30 may perform a deblocking operation to reduce blockiness artifacts associated with the CU. Additionally, video decoder 30 may apply the SAO applied by video encoder 20 based on one or more SAO syntax elements. After video decoder 30 performs such operations, video decoder 30 may store the block of pixels of the CU in a decoded picture buffer. The decoded picture buffer may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on the display device.
  • FIG. 2 is a block diagram illustrating an example video encoder 20 that is configured to implement the techniques of the present invention.
  • Figure 2 is provided for purposes of explanation and should not be construed as limiting the techniques broadly exemplified and described in the present invention.
  • For purposes of explanation, the present invention describes video encoder 20 in the context of HEVC coding for image prediction.
  • the techniques of the present invention are applicable to other coding standards or methods.
  • In the example of FIG. 2, video encoder 20 includes prediction processing unit 100, residual generation unit 102, transform processing unit 104, quantization unit 106, inverse quantization unit 108, inverse transform processing unit 110, reconstruction unit 112, filter unit 113, and entropy encoding unit 116.
  • Entropy encoding unit 116 includes a regular CABAC decoding engine 118 and a bypass decoding engine 120.
  • the prediction processing unit 100 includes an inter prediction processing unit 121 and an intra prediction processing unit 126.
  • the inter prediction processing unit 121 includes a motion estimation unit 122 and a motion compensation unit 124.
  • video encoder 20 may include more, fewer, or different functional components.
  • Video encoder 20 receives the video data.
  • video encoder 20 may encode each slice of each picture of the video data.
  • video encoder 20 may encode each CTB in the slice.
  • the prediction processing unit 100 may perform quadtree partitioning on the pixel blocks associated with the CTB to divide the pixel block into progressively smaller pixel blocks. Smaller blocks of pixels can be associated with the CU. For example, prediction processing unit 100 may partition a block of CTB into four equally sized sub-blocks, split one or more of the sub-blocks into four equally sized sub-sub-blocks, and the like.
  • Video encoder 20 may encode the CU of the CTB in the picture to produce an encoded representation of the CU (ie, the coded CU). Video encoder 20 may encode the CU of the CTB according to the z-scan order. In other words, video encoder 20 may encode the CU by the upper left CU, the upper right CU, the lower left CU, and then the lower right CU. When video encoder 20 encodes a partitioned CU, video encoder 20 may encode the CU associated with the sub-block of the pixel block of the partitioned CU according to the z-scan order.
  • prediction processing unit 100 may partition the pixel blocks of the CU among one or more PUs of the CU.
  • Video encoder 20 and video decoder 30 may support a variety of PU sizes. Assuming that the size of a particular CU is 2N×2N, video encoder 20 and video decoder 30 may support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar for inter prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning into PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter prediction.
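The widths and heights implied by those partition modes can be tabulated directly. The asymmetric dimensions below follow the common HEVC convention that the smaller partition spans one quarter of the CU; this is a sketch, not normative syntax:

```python
def pu_sizes(n, part_mode):
    """(width, height) of each PU of a 2N x 2N CU under the given partition
    mode.  'nU'/'nD'/'nL'/'nR' are the asymmetric inter-prediction modes."""
    two_n = 2 * n
    quarter = n // 2  # the smaller asymmetric partition spans 2N/4 samples
    table = {
        "2Nx2N": [(two_n, two_n)],
        "2NxN":  [(two_n, n)] * 2,
        "Nx2N":  [(n, two_n)] * 2,
        "NxN":   [(n, n)] * 4,
        "2NxnU": [(two_n, quarter), (two_n, two_n - quarter)],
        "2NxnD": [(two_n, two_n - quarter), (two_n, quarter)],
        "nLx2N": [(quarter, two_n), (two_n - quarter, two_n)],
        "nRx2N": [(two_n - quarter, two_n), (quarter, two_n)],
    }
    return table[part_mode]
```

For a 16×16 CU (N = 8), 2N×nU yields a 16×4 PU above a 16×12 PU.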
  • the inter prediction processing unit 121 may generate predictive data of the PU by performing inter prediction on each PU of the CU.
  • the predictive data of the PU may include motion information corresponding to the predictive pixel block of the PU and the PU.
  • the slice can be an I slice, a P slice, or a B slice.
  • Inter prediction unit 121 may perform different operations on a PU of a CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Therefore, if the PU is in an I slice, inter prediction unit 121 does not perform inter prediction on the PU.
  • motion estimation unit 122 may search for a reference picture in a list of reference pictures (eg, "List 0") to find a reference block for the PU.
  • the reference block of the PU may be the pixel block that most closely corresponds to the pixel block of the PU.
  • Motion estimation unit 122 may generate a reference picture index that indicates the position in List 0 of the reference picture containing the reference block of the PU, and a motion vector that indicates the spatial displacement between the pixel block of the PU and the reference block.
  • the motion estimation unit 122 may output the reference picture index and the motion vector as motion information of the PU.
  • Motion compensation unit 124 may generate a predictive pixel block of the PU based on the reference block indicated by the motion information of the PU.
  • motion estimation unit 122 may perform uni-directional inter prediction or bi-directional inter prediction on the PU.
  • motion estimation unit 122 may search for a reference picture of a first reference picture list ("List 0") or a second reference picture list ("List 1") to find a reference block for the PU.
  • Motion estimation unit 122 may output the following as the motion information of the PU: a reference picture index indicating the position in List 0 or List 1 of the reference picture containing the reference block, a motion vector indicating the spatial displacement between the pixel block of the PU and the reference block, and a prediction direction indicator indicating whether the reference picture is in List 0 or List 1.
  • motion estimation unit 122 may search for reference pictures in list 0 to find reference blocks for the PU, and may also search for reference pictures in list 1 to find another reference block for the PU.
  • Motion estimation unit 122 may generate reference picture indices indicating the positions in List 0 and List 1 of the reference pictures containing the reference blocks. Additionally, motion estimation unit 122 may generate motion vectors that indicate the spatial displacements between those reference blocks and the pixel block of the PU.
  • the motion information of the PU may include a reference picture index of the PU and a motion vector.
  • Motion compensation unit 124 may generate a predictive pixel block of the PU based on the reference block indicated by the motion information of the PU.
  • Intra prediction processing unit 126 may generate predictive data for the PU by performing intra prediction on the PU.
  • the predictive data of the PU may include predictive pixel blocks of the PU and various syntax elements.
  • Intra prediction processing unit 126 may perform intra prediction on PUs within I slices, P slices, and B slices.
  • intra-prediction processing unit 126 may use multiple intra-prediction modes to generate multiple sets of predictive data for the PU.
  • To perform intra prediction on a PU using a directional intra-prediction mode, intra-prediction processing unit 126 may extend samples of sample blocks from neighboring PUs across the sample blocks of the PU in the direction associated with the intra-prediction mode. Assuming a left-to-right, top-to-bottom coding order for PUs, CUs, and CTBs, the neighboring PU may be above the PU, above and to the right of the PU, above and to the left of the PU, or to the left of the PU.
  • Intra prediction processing unit 126 may use various numbers of intra prediction modes, for example, 33 directional intra prediction modes. In some examples, the number of intra prediction modes may depend on the size of the pixel block of the PU.
  • the prediction processing unit 100 may select the predictive data of the PU of the CU from among the predictive data generated by the inter prediction processing unit 121 for the PU or the predictive data generated by the intra prediction processing unit 126 for the PU. In some examples, prediction processing unit 100 selects predictive data for the PU of the CU based on the rate/distortion metric of the set of predictive data.
  • the predictive pixel block of the selected predictive data may be referred to herein as the selected predictive pixel block.
  • Residual generation unit 102 may generate a residual pixel block of the CU based on the pixel block of the CU and the selected predictive pixel blocks of the PUs of the CU. For example, residual generation unit 102 may generate the residual pixel block of the CU such that each sample in the residual pixel block has a value equal to the difference between a sample in the pixel block of the CU and the corresponding sample in the selected predictive pixel block of a PU of the CU.
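The per-sample residual computation described above may be sketched as follows; this is an illustrative sketch only, and the function and variable names are assumptions rather than part of this disclosure:

```python
def residual_block(cu_block, predicted_block):
    """Per-sample residual: each value is the CU sample minus the
    corresponding sample of the selected predictive pixel block."""
    return [[orig - pred for orig, pred in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(cu_block, predicted_block)]

# 2x2 example: residuals are the element-wise differences.
cu = [[120, 130], [140, 150]]
pred = [[118, 131], [142, 149]]
print(residual_block(cu, pred))  # [[2, -1], [-2, 1]]
```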
  • the prediction processing unit 100 may perform quadtree partitioning to partition the residual pixel block of the CU into sub-blocks. Each undivided residual pixel block can be associated with a different TU of the CU. The size and location of the residual pixel block associated with the TU of the CU may or may not be based on the size and location of the pixel block of the PU of the CU.
  • Transform processing unit 104 may generate a coefficient block for each TU of the CU by applying one or more transforms to the residual sample block associated with the TU.
  • Transform processing unit 104 may apply various transforms to the residual sample block associated with the TU. For example, transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually similar transform to the residual sample block.
  • Quantization unit 106 may quantize the coefficients in the coefficient block. The quantization procedure can reduce the bit depth associated with some or all of the coefficients. For example, an n-bit coefficient can be truncated to an m-bit coefficient during quantization, where n is greater than m. Quantization unit 106 may quantize the coefficient block associated with the TU of the CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient block associated with the CU by adjusting the QP value associated with the CU.
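As a rough sketch of the QP-controlled quantization described above, the following uses the approximate relation that the quantizer step size doubles every 6 QP values; this is a simplification for illustration and omits the normative HEVC scaling lists and rounding offsets:

```python
def quantize(coeff, qp):
    """Scalar quantization: a larger QP gives a coarser step and hence
    fewer bits for the coefficient (illustrative, non-normative)."""
    step = 2 ** ((qp - 4) / 6.0)  # step size roughly doubles per +6 QP
    return int(round(coeff / step))

def dequantize(level, qp):
    """Inverse quantization: scale the level back by the step size."""
    step = 2 ** ((qp - 4) / 6.0)
    return level * step

print(quantize(100, 22))   # step 8.0 -> level 12
print(dequantize(12, 22))  # reconstructed value 96.0
```

Raising the QP associated with the CU therefore increases the degree of quantization applied to its coefficient blocks, as stated above.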
  • Inverse quantization unit 108 and inverse transform processing unit 110 may apply inverse quantization and inverse transform, respectively, to the coefficient block to reconstruct the residual sample block from the coefficient block.
  • the reconstruction unit 112 may add samples of the reconstructed residual sample block to corresponding samples from the one or more predictive sample blocks generated by the prediction processing unit 100 to generate a reconstructed sample block associated with the TU. By reconstructing sample blocks for each TU of the CU in this way, video encoder 20 may reconstruct the pixel block of the CU.
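The reconstruction step of adding reconstructed residual samples to predictive samples may be sketched as follows; the clipping to the valid sample range is a typical-codec assumption rather than a statement from this passage:

```python
def reconstruct_block(residual, prediction, bit_depth=8):
    """Add residual samples to predictive samples and clip each result
    to the valid sample range [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(r + p, 0), max_val) for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, prediction)]

print(reconstruct_block([[2, -1]], [[118, 131]]))  # [[120, 130]]
```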
  • Filter unit 113 may perform a deblocking operation to reduce blockiness artifacts in the block of pixels associated with the CU. Further, the filter unit 113 may apply the SAO offset determined by the prediction processing unit 100 to the reconstructed sample block to recover the pixel block. Filter unit 113 may generate a sequence of SAO syntax elements of the CTB.
  • the SAO syntax elements may include regular CABAC coded bins and bypass coded bins. In accordance with the techniques of this disclosure, within the sequence, none of the bypass coded bins of a color component is between two regular CABAC coded bins of the same color component.
  • the decoded picture buffer 114 may store the reconstructed block of pixels.
  • Inter prediction unit 121 may perform inter prediction on PUs of other pictures using reference pictures containing reconstructed blocks of pixels.
  • intra-prediction processing unit 126 can use the reconstructed block of pixels in decoded picture buffer 114 to perform intra-prediction on other PUs in the same picture as the CU.
  • Entropy encoding unit 116 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 116 may receive coefficient blocks from quantization unit 106 and may receive syntax elements from prediction processing unit 100. Entropy encoding unit 116 may perform one or more entropy encoding operations on the data to produce entropy encoded data. For example, entropy encoding unit 116 may perform context-adaptive variable length coding (CAVLC) operations, CABAC operations, variable-to-variable (V2V) length coding operations, syntax-based context-adaptive binary arithmetic coding (SBAC) operations, probability interval partitioning entropy (PIPE) coding operations, or another type of entropy encoding operation on the data.
  • entropy encoding unit 116 may encode the SAO syntax elements generated by filter unit 113. As part of encoding the SAO syntax elements, entropy encoding unit 116 may encode the regular CABAC coded bins of the SAO syntax elements using regular CABAC engine 118, and may encode the bypass coded bins using bypass coding engine 120.
  • the inter prediction unit 121 determines a set of inter-frame candidate prediction modes.
  • video encoder 20 is an example of a video encoder configured, in accordance with the techniques of this disclosure, to determine, based on information about neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image unit obtain their respective predicted images using the same affine model; to determine, from the candidate prediction mode set, a prediction mode of the image unit to be processed; to determine, according to the prediction mode, a predicted image of the image unit to be processed; and to encode first indication information into a code stream, where the first indication information indicates the prediction mode.
  • FIG. 3 is a flow diagram of an example operation 200 of a video encoder for encoding video data in accordance with one or more techniques of this disclosure.
  • Figure 3 is provided as an example. In other examples, the techniques of this disclosure may be practiced using more, fewer, or different steps than those shown in the example of FIG. 3.
  • video encoder 20 performs the following steps:
  • S210 Determine, according to information about an adjacent image unit adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode;
  • the blocks A, B, C, D, and E are adjacent reconstructed blocks of the current block to be coded, respectively located at the upper, left, upper right, lower left, and upper left positions of the block to be coded. Whether the affine merge mode exists in the candidate prediction mode set of the current block to be coded may be determined by the coding information of the adjacent reconstructed block.
  • FIG. 4 in this embodiment of the present invention exemplarily shows the number and locations of the adjacent reconstructed blocks of the block to be coded; the number of adjacent reconstructed blocks may be more or fewer than five, which is not limited.
  • In a feasible implementation, it is determined whether any of the adjacent reconstructed blocks has an affine prediction type. If not, the candidate prediction mode set of the block to be coded does not include an affine merge mode; if so, the candidate prediction mode set of the block to be coded includes an affine merge mode.
  • In another feasible implementation, the adjacent reconstructed blocks may use multiple affine modes, for example a first affine mode or a second affine mode; correspondingly, the affine merge mode includes a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode. The numbers of adjacent reconstructed blocks using the first affine mode, the second affine mode, and non-affine modes are counted separately. When the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the candidate prediction mode set does not include any affine merge mode.
  • A third implementation manner may be: when the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the numbers of blocks using the first affine mode and the second affine mode are further compared: when the first affine mode is more frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is more frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
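The majority-count rule described above can be sketched as follows; the mode labels 'affine1', 'affine2', and 'none' are illustrative, and the tie-breaking order is an assumption, since the text does not specify it:

```python
from collections import Counter

def candidate_affine_merge(neighbor_modes):
    """Return the affine merge mode to include in the candidate set,
    or None, based on the most frequent prediction type among the
    adjacent reconstructed blocks."""
    counts = Counter(neighbor_modes)
    # Ties resolve in this listed order (an assumption).
    winner = max(('affine1', 'affine2', 'none'),
                 key=lambda m: counts.get(m, 0))
    if winner == 'affine1':
        return 'affine_merge1'
    if winner == 'affine2':
        return 'affine_merge2'
    return None  # non-affine most frequent: no affine merge mode

print(candidate_affine_merge(['affine1', 'affine1', 'affine2', 'none']))  # affine_merge1
```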
  • In another feasible implementation, two conditions are determined: (1) whether any of the adjacent reconstructed blocks has an affine prediction type; (2) whether the width and height of the affine-mode adjacent block are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode. If both conditions are satisfied, the encoding process shown in FIG. 2 is performed for both cases: the candidate prediction mode set of the block to be coded includes the affine merge mode, and the candidate prediction mode set of the block to be coded does not include the affine merge mode.
  • If encoding performance is better when the candidate prediction mode set of the block to be coded includes the affine merge mode, indication information, which may be called third indication information, is set to 1 and encoded into the code stream; otherwise, the third indication information is set to 0 and encoded into the code stream.
  • It should be understood that the condition that the width of the affine-mode adjacent block is smaller than the width of the block to be coded and the height of the affine-mode adjacent block is smaller than the height of the block to be coded is exemplary; the condition may also be that the width of the affine-mode adjacent block is smaller than the width of the block to be coded or the height of the affine-mode adjacent block is smaller than the height of the block to be coded, which is not limited.
  • In another feasible implementation, two conditions are determined: (1) whether any of the adjacent reconstructed blocks has an affine prediction type; (2) whether the width and height of the affine-mode adjacent block are smaller than the width and height of the block to be coded. If either condition is not satisfied, the candidate prediction mode set of the block to be coded does not include the affine merge mode; if both conditions are satisfied, the candidate prediction mode set of the block to be coded includes the affine merge mode.
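The two-condition availability check above may be sketched as follows, assuming each neighbor is described by an affine flag and its width and height (the field names are illustrative, not from this disclosure):

```python
def affine_merge_available(neighbors, cur_w, cur_h):
    """True when some adjacent reconstructed block is affine-predicted
    (condition 1) and its width and height are both smaller than those
    of the block to be coded (condition 2)."""
    return any(n['affine'] and n['w'] < cur_w and n['h'] < cur_h
               for n in neighbors)

neighbors = [{'affine': False, 'w': 32, 'h': 32},
             {'affine': True, 'w': 8, 'h': 8}]
print(affine_merge_available(neighbors, 16, 16))  # True
```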
  • It should be understood that in these feasible implementations the prediction type and size of the adjacent reconstructed blocks are used as the basis for determining the candidate prediction mode set of the current block to be coded; other attribute information obtained by parsing the adjacent reconstructed blocks may also be used for the determination, which is not limited here.
  • It should also be understood that the determination of whether any adjacent reconstructed block has an affine prediction type may be varied. For example, when the prediction type of at least two adjacent blocks is an affine mode, the candidate prediction mode set of the block to be coded includes an affine merge mode; otherwise, the candidate prediction mode set of the block to be coded does not include an affine merge mode.
  • The required number of adjacent blocks whose prediction type is an affine mode may also be at least three, or at least four, which is not limited.
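The count-threshold variant above may be sketched as follows; the default threshold of 2 follows the example in the text, and 3 or 4 are equally possible:

```python
def affine_merge_by_count(neighbor_is_affine, threshold=2):
    """Include the affine merge mode only when at least `threshold`
    adjacent blocks have an affine prediction type."""
    return sum(neighbor_is_affine) >= threshold

print(affine_merge_by_count([True, True, False]))   # True
print(affine_merge_by_count([True, False, False]))  # False
```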
  • It should also be understood that in the implementations where two conditions are determined, namely (1) whether any adjacent reconstructed block has an affine prediction type and (2) whether the width and height of the affine-mode adjacent block are smaller than those of the block to be coded, the second condition may, for example, also be whether the width and height of the affine-mode adjacent block are smaller than 1/2, 1/3, or 1/4 of the width and height of the block to be coded, which is not limited.
  • It should also be understood that setting the indication information to 0 or to 1 is exemplary; the opposite setting may also be used.
  • In another feasible implementation, it is determined whether any of the adjacent reconstructed blocks has an affine prediction type. If not, the candidate prediction mode set of the block to be coded does not include an affine merge mode; if so, the encoding process shown in FIG. 2 is performed for both cases: the candidate prediction mode set of the block to be coded includes the affine merge mode, and the candidate prediction mode set of the block to be coded does not include the affine merge mode.
  • S220: The candidate prediction mode set is the set determined in S210. The encoding process shown in FIG. 2 is performed in turn with each prediction mode in the candidate prediction mode set, and the mode with the best encoding performance is selected as the prediction mode of the block to be coded.
  • the purpose of performing the encoding process shown in FIG. 2 in the embodiment of the present invention is to select a prediction mode that may optimize encoding performance.
  • In the selection, the performance/cost ratio of each prediction mode may be compared, where performance is represented by the quality of image restoration and cost is represented by the bit rate of encoding; alternatively, only the performance or only the cost of each prediction mode may be compared. Correspondingly, the encoding steps of FIG. 2 may be stopped once the metric to be compared has been obtained; for example, if the prediction modes are compared using performance only, only the prediction unit needs to be passed through, which is not limited.
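The mode decision described above amounts to running the encoding pass for each candidate and keeping the cheapest result under a rate-distortion criterion. The sketch below assumes a cost of D + lambda * R; the `evaluate` callback stands in for the full encoding process of FIG. 2 and is an assumption, not the patent's procedure:

```python
def select_best_mode(candidate_modes, evaluate):
    """Return the candidate prediction mode with the lowest
    rate-distortion cost D + lambda * R."""
    best_mode, best_cost = None, float('inf')
    for mode in candidate_modes:
        distortion, rate, lam = evaluate(mode)
        cost = distortion + lam * rate
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode

# Hypothetical (distortion, rate, lambda) results per mode.
results = {'merge': (100, 10, 1.0), 'affine_merge': (80, 40, 1.0)}
print(select_best_mode(results, results.__getitem__))  # 'merge'
```

Comparing only performance or only cost, as the text allows, corresponds to setting lambda to 0 or dropping the distortion term.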
  • For generating the predicted image of the block to be coded according to the prediction mode, including prediction modes based on a translational motion model, affine prediction modes, and the affine merge mode, reference may be made to the descriptions in the H.265 standard cited above and in application documents such as CN201010247275.7; details are not repeated here.
  • S240: The first indication information, which indicates the prediction mode determined in S220, is encoded into the code stream. It should be understood that this step may occur at any time after S220; there is no strict ordering requirement among the steps, as long as it corresponds to the decoding of the first indication information at the decoding end.
  • FIG. 5 is a block diagram of an example of another video encoder 40 for encoding video data in accordance with one or more techniques of this disclosure.
  • the video encoder 40 includes a first determining module 41, a second determining module 42, a third determining module 43, and an encoding module 44.
  • the first determining module 41 is configured to perform S210: determining, according to information about neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode;
  • the second determining module 42 is configured to perform S220: determining, from the candidate prediction mode set, a prediction mode of the image unit to be processed;
  • the third determining module 43 is configured to perform S230: determining, according to the prediction mode, a predicted image of the image unit to be processed;
  • the encoding module 44 is configured to perform S240: encoding the first indication information into the code stream.
  • Because the current block and its adjacent blocks are highly likely to have the same or similar prediction modes, the information of the adjacent blocks is used to derive the prediction mode information of the current block, which reduces the code rate spent on encoding the prediction mode and improves coding efficiency.
  • FIG. 6 is a flow diagram of an example operation 300 of a video encoder for encoding video data in accordance with one or more techniques of this disclosure.
  • Figure 6 is provided as an example. In other examples, the techniques of this disclosure may be practiced using more, fewer, or different steps than those illustrated in the example of FIG. 6. According to the example method of FIG. 6, video encoder 20 performs the following steps:
  • S310: Encode indication information of the candidate prediction mode set of the first to-be-processed image region.
  • When the candidate mode set of the first to-be-processed image region uses only the candidate translational mode set, set the first indication information to 0 and encode the first indication information into the code stream, where a translational mode denotes a prediction mode that obtains the predicted image using a translational motion model; when the candidate mode set of the first to-be-processed image region uses the candidate translational mode set and the candidate affine mode set, set the first indication information to 1 and encode the first indication information into the code stream, where an affine mode denotes a prediction mode that obtains the predicted image using an affine model. The first to-be-processed image region may be an image frame group, an image frame, an image slice set, an image tile set, an image slice, an image tile, a set of image coding units, or an image coding unit.
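The region-level signalling above may be sketched as a single flag together with its decoder-side interpretation; the 0/1 mapping follows the text, which also notes that the opposite mapping is permitted, and the mode labels are illustrative:

```python
def encode_first_indication(region_allows_affine):
    """0: the region uses only the candidate translational mode set;
    1: it uses the translational and affine mode sets."""
    return 1 if region_allows_affine else 0

def region_candidate_modes(first_indication):
    """Decoder-side view of the same flag."""
    modes = {'translational'}
    if first_indication == 1:
        modes.add('affine')
    return modes

print(region_candidate_modes(encode_first_indication(True)))
```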
  • Correspondingly, the first indication information may be encoded in the header of an image frame group, such as a video parameter set (VPS), a sequence parameter set (SPS), or supplemental enhancement information (SEI); in the header of an image frame, such as a picture parameter set (PPS); in the header of an image slice set; in the header of an image tile set; in an image slice header; in an image tile header; in the header of a set of image coding units; or in the header of an image coding unit.
  • It should be understood that the determination of the range of the first to-be-processed image area may be preconfigured, or may be adaptively determined during the encoding process. The representation of the range may be known through the protocol of the encoder and decoder, or may be encoded and transmitted in the code stream, which is not limited.
  • the determination of the candidate prediction mode set may be pre-configured, or may be determined after comparing the coding performance, and is not limited.
  • It should also be understood that setting the indication information to 0 or to 1 is exemplary; the opposite setting may also be used.
  • S320 Determine, for the to-be-processed unit in the first to-be-processed image region, a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region.
  • FIG. 7 is a block diagram of an example of another video encoder 50 for encoding video data in accordance with one or more techniques of this disclosure.
  • the video encoder 50 includes: a first encoding module 51, a first determining module 52, a second determining module 53, and a second encoding module 54.
  • the first encoding module 51 is configured to perform S310: encoding indication information of the candidate prediction mode set of the first to-be-processed image region;
  • the first determining module 52 is configured to perform S320: for the to-be-processed unit in the first to-be-processed image region, determining a prediction mode of the image unit to be processed from the candidate prediction mode set of the first to-be-processed image region;
  • the second determining module 53 is configured to perform S330: determining, according to the prediction mode, a predicted image of the image unit to be processed;
  • the second encoding module 54 is configured to perform S340: encoding the prediction mode selected by the unit to be processed into the code stream.
  • By setting the candidate prediction mode set at the region level and signalling it with a flag, the code rate spent on encoding redundant mode information is avoided, and coding efficiency is improved.
  • FIG. 8 is a block diagram illustrating an example video decoder 30 that is configured to implement the techniques of the present invention.
  • FIG. 8 is provided for purposes of explanation and does not limit the techniques as broadly exemplified and described in the present invention.
  • For purposes of explanation, the present invention describes video decoder 30 in the context of HEVC coded image prediction.
  • the techniques of the present invention are applicable to other coding standards or methods.
  • video decoder 30 includes an entropy decoding unit 150, a prediction processing unit 152, an inverse quantization unit 154, an inverse transform processing unit 156, a reconstruction unit 158, a filter unit 159, and a decoded picture buffer 160.
  • the prediction processing unit 152 includes a motion compensation unit 162 and an intra prediction processing unit 164.
  • Entropy decoding unit 150 includes a regular CABAC decoding engine 166 and a bypass decoding engine 168. In other examples, video decoder 30 may include more, fewer, or different functional components.
  • Video decoder 30 can receive the bitstream.
  • Entropy decoding unit 150 may parse the bitstream to extract syntax elements from the bitstream. As part of parsing the bitstream, entropy decoding unit 150 may entropy decode the entropy encoded syntax elements in the bitstream.
  • Prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, and filter unit 159 may generate decoded video data based on syntax elements extracted from the bitstream.
  • the bitstream may comprise a sequence of coded SAO syntax elements of the CTB.
  • the SAO syntax elements may include regular CABAC coded bins and bypass coded bins. According to the techniques of the present invention, in the sequence of coded SAO syntax elements, none of the bypass coded bins is between two regular CABAC coded bins.
  • Entropy decoding unit 150 may decode the SAO syntax elements. As part of decoding the SAO syntax elements, entropy decoding unit 150 may use regular CABAC decoding engine 166 to decode the regular CABAC coded bins, and may use bypass decoding engine 168 to decode the bypass coded bins.
  • video decoder 30 may perform a reconstruction operation on the unpartitioned CU. To perform a reconstruction operation on an unpartitioned CU, video decoder 30 may perform a reconstruction operation on each TU of the CU. By performing a reconstruction operation on each TU of the CU, video decoder 30 may reconstruct the residual pixel blocks associated with the CU.
  • inverse quantization unit 154 may inverse quantize (ie, dequantize) the coefficient block associated with the TU. Inverse quantization unit 154 may use the QP value associated with the CU of the TU to determine the degree of quantization, and likewise determine the degree of inverse quantization that inverse quantization unit 154 will apply.
  • inverse transform processing unit 156 may apply one or more inverse transforms to the coefficient block to generate a residual sample block associated with the TU.
  • inverse transform processing unit 156 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotation transform, an inverse directional transform, or another inverse transform to the coefficient block.
  • intra prediction processing unit 164 may perform intra prediction to generate a predictive sample block for the PU.
  • Intra-prediction processing unit 164 may use an intra-prediction mode to generate a predictive pixel block of a PU based on a block of pixels of a spatially neighboring PU.
  • Intra prediction processing unit 164 may determine an intra prediction mode for the PU based on one or more syntax elements parsed from the bitstream.
  • Motion compensation unit 162 may construct a first reference picture list (List 0) and a second reference picture list (List 1) based on syntax elements extracted from the bitstream. Furthermore, if the PU uses inter prediction encoding, the entropy decoding unit 150 may extract motion information of the PU. Motion compensation unit 162 may determine one or more reference blocks of the PU based on the motion information of the PU. Motion compensation unit 162 can generate a predictive pixel block of the PU based on one or more reference blocks of the PU.
  • where applicable, the reconstruction unit 158 may use the residual pixel blocks associated with the TUs of the CU and the predictive pixel blocks of the PUs of the CU (i.e., intra-prediction data or inter-prediction data) to reconstruct the pixel block of the CU.
  • reconstruction unit 158 may add samples of the residual pixel block to corresponding samples of the predictive pixel block to reconstruct the pixel block of the CU.
  • Filter unit 159 may perform a deblocking operation to reduce blockiness artifacts associated with the pixel blocks of the CUs of the CTB. Additionally, filter unit 159 may modify the pixel block of the CTB based on the SAO syntax elements parsed from the bitstream. For example, filter unit 159 may determine values based on the SAO syntax elements of the CTB and add the determined values to the samples in the reconstructed block of the CTB. By modifying at least some of the pixel blocks of the CTBs of the picture, filter unit 159 may modify the reconstructed picture of the video data based on the SAO syntax elements.
  • Video decoder 30 may store the block of pixels of the CU in decoded picture buffer 160.
  • the decoded picture buffer 160 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device (eg, display device 32 of FIG. 1).
  • video decoder 30 may perform intra-prediction operations or inter-prediction operations on PUs of other CUs based on the blocks of pixels in decoded picture buffer 160.
  • prediction processing unit 152 determines a set of inter-frame candidate prediction modes.
  • video decoder 30 is an example of a video decoder configured, in accordance with the techniques of this disclosure, to determine, based on information about neighboring image units adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode, where the affine merge mode indicates that the image unit to be processed and its adjacent image unit obtain their respective predicted images using the same affine model; to parse a code stream to obtain first indication information; to determine, according to the first indication information, a prediction mode of the image unit to be processed from the candidate prediction mode set; and to determine, according to the prediction mode, a predicted image of the image unit to be processed.
  • FIG. 9 is a flow diagram of an example operation 400 of a video decoder for decoding video data in accordance with one or more techniques of this disclosure.
  • Figure 9 is provided as an example. In other examples, the techniques of the present invention may be practiced using more or fewer steps than those shown in the examples of FIG. 9 or steps that are different therefrom. According to the example method of FIG. 9, video decoder 30 performs the following steps:
  • S410 Determine, according to information about a neighboring image unit adjacent to the image unit to be processed, whether the candidate prediction mode set of the image unit to be processed includes an affine merge mode;
  • the blocks A, B, C, D, and E are adjacent reconstructed blocks of the current block to be coded, respectively located at the upper, left, upper right, lower left, and upper left positions of the block to be coded. Whether the affine merge mode exists in the candidate prediction mode set of the current block to be coded may be determined by the coding information of the adjacent reconstructed block.
  • FIG. 4 in this embodiment of the present invention exemplarily shows the number and locations of the adjacent reconstructed blocks of the block to be coded; the number of adjacent reconstructed blocks may be more or fewer than five, which is not limited.
  • In a feasible implementation, second indication information in the code stream is parsed. When the second indication information is 1, the candidate prediction mode set includes the affine merge mode; when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode.
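The decoder-side rule above may be sketched as follows; the mode labels are illustrative, and the 0/1 semantics follow the text (which elsewhere permits the opposite mapping):

```python
def parse_candidate_set(second_indication, base_modes):
    """Build the candidate prediction mode set from a parsed flag:
    the affine merge mode is present iff the flag is 1."""
    modes = set(base_modes)
    if second_indication == 1:
        modes.add('affine_merge')
    else:
        modes.discard('affine_merge')
    return modes

print(parse_candidate_set(1, {'merge', 'amvp'}))
```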
  • In another feasible implementation, it is determined whether any of the adjacent reconstructed blocks has an affine prediction type. If not, the candidate prediction mode set of the block to be coded does not include an affine merge mode; if so, the candidate prediction mode set of the block to be coded includes an affine merge mode.
  • In another feasible implementation, the adjacent reconstructed blocks may use multiple affine modes, for example a first affine mode or a second affine mode; correspondingly, the affine merge mode includes a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode. The numbers of adjacent reconstructed blocks using the first affine mode, the second affine mode, and non-affine modes are counted separately. When the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the candidate prediction mode set does not include any affine merge mode.
  • A third implementation manner may be: when the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when non-affine modes are the most frequent, the numbers of blocks using the first affine mode and the second affine mode are further compared: when the first affine mode is more frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is more frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
  • Two conditions are determined: (1) whether any of the adjacent reconstructed blocks has a prediction type of affine mode; (2) whether the width and height of the affine-mode adjacent block are smaller than the width and height of the to-be-encoded block. If either condition is not satisfied, the candidate prediction mode set of the to-be-encoded block does not include the affine merge mode. If both conditions are satisfied, the third indication information in the code stream is parsed: when the third indication information is 1, the candidate prediction mode set includes the affine merge mode; when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode. Otherwise, the candidate prediction mode set does not include the affine merge mode.
  • The condition is satisfied when the width of the affine-mode adjacent block is smaller than the width of the to-be-encoded block and the height of the affine-mode adjacent block is smaller than the height of the to-be-encoded block.
  • Alternatively, the judging condition may be that the width of the affine-mode adjacent block is smaller than the width of the to-be-encoded block, or that the height of the affine-mode adjacent block is smaller than the height of the to-be-encoded block; this is not limited herein.
  • Two conditions are determined: (1) whether any of the adjacent reconstructed blocks has a prediction type of affine mode; (2) whether the width and height of the affine-mode adjacent block are smaller than the width and height of the to-be-encoded block. If either condition is not satisfied, the candidate prediction mode set of the to-be-encoded block does not include the affine merge mode; if both conditions are satisfied, the candidate prediction mode set of the to-be-encoded block includes the affine merge mode.
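A minimal sketch of this joint prediction-type and size check. The neighbor field names `affine`, `w`, `h` are assumptions chosen for illustration, not names from the patent:

```python
def includes_affine_merge(neighbors, cur_w, cur_h):
    """Fifth-implementation sketch: affine merge enters the candidate
    set only if some neighbor is affine-predicted AND that neighbor is
    strictly smaller than the current block in both width and height."""
    return any(n["affine"] and n["w"] < cur_w and n["h"] < cur_h
               for n in neighbors)
```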
  • The prediction type and size of the adjacent reconstructed blocks are used here as the basis for determining the candidate prediction mode set of the current to-be-encoded block; other attribute information obtained by parsing the adjacent reconstructed blocks may also be used for the determination, in correspondence with the encoding end, which is not limited herein.
  • The criterion for determining whether any of the adjacent reconstructed blocks has a prediction type of affine prediction may also vary. For example, when the prediction type of at least two neighboring blocks is an affine mode, the candidate prediction mode set of the to-be-coded block includes an affine merge mode; otherwise, the candidate prediction mode set of the to-be-coded block does not include an affine merge mode.
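This counted-threshold variant can be sketched as follows (a threshold of 2, with 3 or 4 as the alternatives the text mentions; the mode labels are illustrative placeholders):

```python
def candidate_includes_affine(neighbor_types, threshold=2):
    """Affine merge is admitted into the candidate set only if at
    least `threshold` neighboring blocks are affine-predicted."""
    return sum(t == "affine" for t in neighbor_types) >= threshold
```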
  • The required number of affine-mode neighboring blocks may also be at least three, or at least four, in correspondence with the encoding end; this is not limited herein.
  • Two conditions are determined: (1) whether any of the adjacent reconstructed blocks has a prediction type of affine mode; (2) whether the width and height of the affine-mode adjacent block are smaller than the width and height of the to-be-encoded block. The second condition may also be, for example, whether the width and height of the affine-mode adjacent block are smaller than 1/2, 1/3, or 1/4 of the width and height of the to-be-encoded block, in correspondence with the encoding end; this is not limited herein.
  • The setting of the indication information in the embodiments of the present invention is exemplary and corresponds to the encoding end.
  • the first indication information indicates index information of a prediction mode of a block to be decoded, and this step corresponds to the encoding end step S240.
  • Different candidate prediction mode sets correspond to different prediction mode lists. By searching the prediction mode list corresponding to the candidate prediction mode set determined in S410 with the index information obtained in S420, the prediction mode of the to-be-decoded block can be found.
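The index lookup can be sketched as follows. The list contents and their ordering here are hypothetical; the patent only specifies looking up the mode by index within the list implied by the candidate set:

```python
def decode_prediction_mode(candidate_set_includes_affine, index):
    """Map the parsed index to a prediction mode. The list grows when
    the candidate set admits the affine merge mode, so the same index
    space covers different mode lists on the two sides of the switch."""
    modes = ["merge", "amvp_translational"]  # placeholder base list
    if candidate_set_includes_affine:
        modes.append("affine_merge")
    return modes[index]
```

Because both encoder and decoder derive the same candidate set from the same neighbor information (S410), the index alone is enough to identify the mode.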
  • FIG. 10 is a block diagram of an example of another video decoder 60 for decoding video data in accordance with one or more techniques of this disclosure.
  • the video decoder 60 includes a first determining module 61, a parsing module 62, a second determining module 63, and a third determining module 64.
  • The first determining module 61 is configured to perform S410: determining, according to information of neighboring image units adjacent to the to-be-processed image unit, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode;
  • The parsing module 62 is configured to perform S420: parsing the first indication information in the code stream;
  • The second determining module 63 is configured to perform S430: determining, according to the first indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set;
  • The third determining module 64 is configured to perform S440: determining a predicted image of the to-be-processed image unit according to the prediction mode.
  • Since the current block and its adjacent blocks are very likely to have the same or similar prediction modes, the information of the adjacent blocks is used to derive the prediction mode information of the current block, which reduces the code rate spent on encoding the prediction mode and improves coding efficiency.
  • FIG. 11 is a flow diagram of an example operation 500 of a video decoder for decoding video data in accordance with one or more techniques of this disclosure.
  • Figure 11 is provided as an example; in other examples, the techniques of the present invention may be implemented using more, fewer, or different steps than those shown in the example of FIG. 11.
  • The video decoder performs the following steps:
  • The first indication information indicates whether the candidate mode set of the first to-be-processed image region includes an affine mode; this step corresponds to step S310 at the encoding end.
  • When the first indication information is 0, the candidate mode set of the first to-be-processed image region uses the candidate translational mode set, where the translational mode represents a prediction mode that obtains a predicted image using a translational model; when the first indication information is 1, the candidate mode set of the first to-be-processed image region uses the candidate translational mode set and the candidate affine mode set, where the affine mode represents a prediction mode that obtains a predicted image using an affine model.
  • The first to-be-processed image region may be an image frame group, an image frame, an image tile set, an image strip set, an image tile, an image strip, a set of image coding units, or an image coding unit.
  • The first indication information may be encoded in the header of an image frame group, for example in the Video Parameter Set (VPS), the Sequence Parameter Set (SPS), or Supplemental Enhancement Information (SEI); in an image frame header, for example in the Picture Parameter Set (PPS); or in the header of an image tile set, image strip set, image tile, or image strip.
  • The determination of the first to-be-processed image region may be preconfigured or may be adaptively determined during the encoding process, and the representation of the range of the first to-be-processed image region may be known through the protocol agreed between encoder and decoder.
  • The range of the first to-be-processed image region may also be received from the encoding end via the code stream, in correspondence with the encoding end; this is not limited herein.
  • The setting of the indication information to 0 or 1 in the embodiments of the present invention is exemplary and corresponds to the encoding end.
  • the second indication information indicates a prediction mode of a block to be processed in the first image area to be processed, and this step corresponds to step S340 of the encoding end.
  • S540 Determine, according to the second indication information, a prediction mode of an image unit to be processed from a candidate prediction mode set of the first to-be-processed image region.
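The region-level switch in steps S510–S540 can be sketched as follows (the mode names are illustrative placeholders, not fixed by the patent):

```python
def region_candidate_modes(first_indication_bit):
    """Region-level candidate set: bit 0 -> only translational
    candidate modes; bit 1 -> translational plus affine candidates.
    Every to-be-processed unit inside the region then selects its
    prediction mode from this set via the second indication info."""
    translational = {"merge", "amvp"}
    affine = {"affine_merge", "affine_amvp"}
    return translational | affine if first_indication_bit else translational
```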
  • FIG. 12 is a block diagram of an example of another video decoder 70 for decoding video data in accordance with one or more techniques of this disclosure.
  • the video decoder 70 includes a first parsing module 71, a first determining module 72, a second parsing module 73, a second determining module 74, and a third determining module 75.
  • The first parsing module 71 is configured to perform S510: parsing the first indication information in the code stream;
  • The first determining module 72 is configured to perform S520: determining, according to the first indication information, the candidate mode set of the first to-be-processed image region;
  • The second parsing module 73 is configured to perform S530: parsing the second indication information in the code stream;
  • The second determining module 74 is configured to perform S540: determining, according to the second indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set of the first to-be-processed image region;
  • The third determining module 75 is configured to perform S550: determining a predicted image of the to-be-processed image unit according to the prediction mode.
  • By selecting and signaling the candidate prediction mode set at the region level, the code rate spent on encoding redundant modes is avoided, and coding efficiency is improved.
  • the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted as one or more instructions or code via a computer readable medium and executed by a hardware-based processing unit.
  • The computer-readable medium may comprise a computer-readable storage medium (which corresponds to a tangible medium such as a data storage medium) or a communication medium including any medium that facilitates transfer of a computer program from one place to another, for example according to a communication protocol.
  • computer readable media generally may correspond to (1) a non-transitory tangible computer readable storage medium, or (2) a communication medium such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in the present disclosure.
  • the computer program product can comprise a computer readable medium.
  • By way of example, certain computer-readable storage media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies (e.g., infrared, radio, and microwave), then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies are included in the definition of medium.
  • Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
  • Combinations of the above should also be included within the scope of computer-readable media.
  • Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein.
  • the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec.
  • the techniques may be fully implemented in one or more circuits or logic elements.
  • The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperable hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.
  • "System" and "network" are used interchangeably herein. It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects.
  • B corresponding to A means that B is associated with A, and B can be determined from A.
  • Determining B from A does not mean that B is determined based on A only; B may also be determined based on A and/or other information.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • In actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Abstract

Embodiments of the present invention provide an image prediction method and apparatus. The image prediction method includes: determining, according to information of neighboring image units adjacent to a to-be-processed image unit, whether a candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, where the affine merge mode indicates that the to-be-processed image unit and its neighboring image units obtain their respective predicted images using the same affine model; parsing first indication information in a code stream; determining, according to the first indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set; and determining a predicted image of the to-be-processed image unit according to the prediction mode. The method saves the code rate of encoding the prediction mode and improves coding efficiency.

Description

Image Prediction Method and Apparatus

Technical Field

The present invention relates to the field of video coding and compression, and in particular to an image prediction method and apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and extensions of such standards, to transmit and receive digital video information more efficiently. Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned in order to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
Summary
The present invention describes an image prediction method that improves coding efficiency. The prediction mode of a to-be-processed image unit is derived from the prediction information or unit sizes of its neighboring image units, or from a signaled region-level candidate set of prediction modes. Because prior information is provided for encoding the prediction mode, the code rate of encoding the prediction mode is saved and coding efficiency is improved.
According to a technique of the present invention, a predicted-image decoding method includes: determining, according to information of neighboring image units adjacent to a to-be-processed image unit, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode, where the affine merge mode indicates that the to-be-processed image unit and its neighboring image units obtain their respective predicted images using the same affine model; parsing first indication information in a code stream; determining, according to the first indication information, a prediction mode of the to-be-processed image unit from the candidate prediction mode set; and determining a predicted image of the to-be-processed image unit according to the prediction mode.
The neighboring image units of the to-be-processed image unit include at least the neighboring image units above, to the left of, above-right of, below-left of, and above-left of the to-be-processed image unit.
According to the technique of the present invention, determining, according to the information of the neighboring image units adjacent to the to-be-processed image unit, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode may be implemented as follows:
First implementation: when the prediction mode of at least one neighboring image unit is to obtain a predicted image using an affine model, second indication information in the code stream is parsed; when the second indication information is 1, the candidate prediction mode set includes the affine merge mode; when the second indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
Second implementation: when the prediction mode of at least one neighboring image unit is to obtain a predicted image using an affine model, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
Third implementation: the prediction modes include at least a first affine mode that obtains a predicted image using a first affine model or a second affine mode that obtains a predicted image using a second affine model; correspondingly, the affine merge mode includes at least a first affine merge mode that merges the first affine mode or a second affine merge mode that merges the second affine mode. Correspondingly, determining, according to the information of the neighboring image units adjacent to the to-be-processed image unit, whether the candidate prediction mode set of the to-be-processed image unit includes an affine merge mode includes: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when, among the prediction modes of the neighboring prediction units, the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when, among the prediction modes of the neighboring prediction units, a non-affine mode is the most frequent, the candidate prediction mode set does not include any affine merge mode.
The third implementation may further include: when, among the prediction modes of the neighboring prediction units, the first affine mode is the most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when the second affine mode is the most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode; when a non-affine mode is the most frequent and the first affine mode is the second most frequent, the candidate prediction mode set includes the first affine merge mode and does not include the second affine merge mode; when a non-affine mode is the most frequent and the second affine mode is the second most frequent, the candidate prediction mode set includes the second affine merge mode and does not include the first affine merge mode.
Fourth implementation: when the prediction mode of at least one neighboring image unit is to obtain a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, third indication information in the code stream is parsed; when the third indication information is 1, the candidate prediction mode set includes the affine merge mode; when the third indication information is 0, the candidate prediction mode set does not include the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
Fifth implementation: when the prediction mode of at least one neighboring image unit is to obtain a predicted image using an affine model, and the width and height of the at least one neighboring image unit are respectively smaller than the width and height of the to-be-processed image unit, the candidate prediction mode set includes the affine merge mode; otherwise, the candidate prediction mode set does not include the affine merge mode.
According to a technique of the present invention, a predicted-image decoding method includes: parsing first indication information in a code stream; determining, according to the first indication information, a candidate mode set of a first to-be-processed image region; when the first indication information is 0, the candidate mode set of the first to-be-processed image region uses a candidate translational mode set, where the translational mode represents a prediction mode that obtains a predicted image using a translational model; when the first indication information is 1, the candidate mode set of the first to-be-processed image region uses a candidate translational mode set and a candidate affine mode set, where the affine mode represents a prediction mode that obtains a predicted image using an affine model; parsing second indication information in the code stream; determining, according to the second indication information, a prediction mode of a to-be-processed image unit from the candidate prediction mode set of the first to-be-processed image region, where the to-be-processed image unit belongs to the first to-be-processed image region; and determining a predicted image of the to-be-processed image unit according to the prediction mode.
The first to-be-processed image region includes one of: an image frame group, an image frame, an image tile set, an image strip set, an image tile, an image strip, a set of image coding units, or an image coding unit.
在一个实例中,一种预测图像解码方法,包括:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;解析码流中的第一指示信息;根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种预测图像编码方法,包括:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像;将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
在另一个实例中,一种预测图像解码方法,包括:解析码流中的第一指示信息;根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;解析所述码流中的第二指示信息;根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处 理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种预测图像编码方法,包括:当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像;将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
在另一个实例中,一种预测图像解码装置,包括:第一确定模块,用于根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;解析模块,用于解析码流中的第一指示信息;第二确定模块,用于根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种预测图像编码装置,包括:第一确定模块,用于根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;第二确定模块,用于从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像;编码模块,用于将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
在另一个实例中,一种预测图像解码装置,包括:第一解析模块,用于解析码流中的第一指示信息;第一确定模块,用于根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;当所述第一指示信息为0时,所 述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;
当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;第二解析模块,用于解析所述码流中的第二指示信息;第二确定模块,用于根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种预测图像编码装置,包括:第一编码模块,用于当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;第一确定模块,用于从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;第二确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像;第二编码模块,用于将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
在另一个实例中,一种用于对视频数据进行解码的设备,所述设备包括经配置以进行以下操作的视频解码器:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;解析码流中的第一指示信息;根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种用于对视频数据进行编码的设备,所述设备包括经配置以进行以下操作的视频编码器:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射 合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像;将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
在另一个实例中,一种用于对视频数据进行解码的设备,所述设备包括经配置以进行以下操作的视频解码器:解析码流中的第一指示信息;根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;解析所述码流中的第二指示信息;根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种用于对视频数据进行编码的设备,所述设备包括经配置以进行以下操作的视频编码器:当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像;将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
在另一个实例中,一种存储有指令的计算机可读存储媒体,所述指令在被执行时使用于对视频数据进行解码的设备的一或多个处理器进行以下操作:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示 所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;解析码流中的第一指示信息;根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种存储有指令的计算机可读存储媒体,所述指令在被执行时使用于对视频数据进行编码的设备的一或多个处理器进行以下操作:根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像;将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
在另一个实例中,一种存储有指令的计算机可读存储媒体,所述指令在被执行时使用于对视频数据进行解码的设备的一或多个处理器进行以下操作:解析码流中的第一指示信息;根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;解析所述码流中的第二指示信息;根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像。
在另一个实例中,一种存储有指令的计算机可读存储媒体,所述指令在被执行时使用于对视频数据进行编码的设备的一或多个处理器进行以下操作:当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得 预测图像的预测模式;从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;根据所述预测模式,确定所述待处理图像单元的预测图像;将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.

FIG. 1 is a schematic block diagram of a video coding system according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of a video encoder according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of an example operation of a video encoder according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the positions of a to-be-processed block and its adjacent reconstructed blocks according to an embodiment of the present invention;

FIG. 5 is a schematic block diagram of another video encoder according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of another example operation of a video encoder according to an embodiment of the present invention;

FIG. 7 is a schematic block diagram of yet another video encoder according to an embodiment of the present invention;

FIG. 8 is a schematic block diagram of a video decoder according to an embodiment of the present invention;

FIG. 9 is a schematic flowchart of an example operation of a video decoder according to an embodiment of the present invention;

FIG. 10 is a schematic block diagram of another video decoder according to an embodiment of the present invention;

FIG. 11 is a schematic flowchart of another example operation of a video decoder according to an embodiment of the present invention;

FIG. 12 is a schematic block diagram of yet another video decoder according to an embodiment of the present invention;

Description of Embodiments
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
Motion compensation is one of the key techniques for improving compression efficiency in video coding. Conventional block-matching-based motion compensation is a method widely adopted in mainstream video encoders, and especially in video coding standards. In block-matching-based motion compensation, an inter-predicted block uses a translational motion model, which assumes that the motion vectors at all pixel positions in a block are equal. However, this assumption does not hold in many cases. The motion of objects in real video is often a complex combination of motions such as translation, rotation, and zooming. If a pixel block contains such complex motions, the prediction signal obtained by conventional block-matching-based motion compensation is not accurate enough, and the inter-frame correlation cannot be fully removed. To resolve this problem, higher-order motion models have been introduced into motion compensation in video coding. A higher-order motion model has more degrees of freedom than the translational model and allows the motion vector of each pixel in an inter-predicted block to differ; that is, the motion vector field produced by a higher-order motion model has better precision.
The affine motion model described based on control points is a representative higher-order motion model. Unlike the conventional translational motion model, the value of the motion vector of each pixel in a block is related to its position: it is a first-order linear function of the coordinate position. The affine motion model allows the reference block to undergo warping transformations such as rotation and zooming, so a more accurate predicted block can be obtained during motion compensation.
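As a concrete illustration of this first-order linear relation (the particular four-parameter control-point form below is a common choice and an assumption of this sketch, not a formula given in the original text): with control-point motion vectors $(v_{0x}, v_{0y})$ at the top-left corner and $(v_{1x}, v_{1y})$ at the top-right corner of a block of width $w$, the motion vector $(v_x, v_y)$ at pixel position $(x, y)$ is

```latex
\begin{cases}
v_x = \dfrac{v_{1x}-v_{0x}}{w}\,x \;-\; \dfrac{v_{1y}-v_{0y}}{w}\,y \;+\; v_{0x},\\[6pt]
v_y = \dfrac{v_{1y}-v_{0y}}{w}\,x \;+\; \dfrac{v_{1x}-v_{0x}}{w}\,y \;+\; v_{0y}.
\end{cases}
```

Each component is a first-order linear function of the coordinates $(x, y)$, and the model covers combined rotation, zooming, and translation of the reference block.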
The inter prediction type that obtains a predicted block through the affine motion model during motion compensation is usually called the affine mode. In the current mainstream video compression coding standards, there are two inter prediction modes: advanced motion vector prediction (AMVP) and merge. AMVP needs to explicitly transmit, for each coding block, the prediction direction, the reference frame index, and the difference between the true motion vector and the predicted motion vector, whereas the merge mode directly derives the motion information of the current coding block from the motion vectors of neighboring blocks. Combining the affine mode with inter prediction methods such as AMVP and merge based on the translational motion model can form new inter prediction modes such as AMVP and merge based on the affine motion model; for example, the merge mode based on the affine motion model may be called the affine merge mode. During prediction mode selection, the new prediction modes and the prediction modes existing in the current standard jointly participate in the comparison of "performance-cost ratio", and the optimal mode is selected as the prediction mode to generate the predicted image of the to-be-processed block. Generally, the result of the prediction mode selection is encoded and transmitted to the decoding end.
The affine mode can better improve the precision of the predicted block and improve coding efficiency. On the other hand, however, compared with the unified motion information of the translational motion model, more code rate must be spent on encoding the motion information of each control point; meanwhile, as the number of candidate prediction modes increases, the code rate used to encode the prediction mode selection result also increases. These additional code rate consumptions all affect the improvement of coding efficiency.
According to the technical solutions of the present invention, on one hand, it is determined, according to prediction mode information or size information of neighboring image units of the to-be-processed image unit, whether the candidate prediction mode set of the to-be-processed image unit includes the affine merge mode; the code stream is parsed to obtain indication information; according to the indication information, the prediction mode of the to-be-processed image unit is determined from the candidate prediction mode set; and the predicted image of the to-be-processed image unit is determined according to the prediction mode. On the other hand, the code stream is parsed and, by means of indication information, it is determined whether a region uses a candidate prediction mode set that includes the affine mode; the prediction mode is determined according to the candidate prediction mode set and other received indication information, and the predicted image is generated.
Thus, the prediction mode information or size information of the neighboring image units of the to-be-processed image unit can serve as prior knowledge for encoding the prediction information of the to-be-processed block, and the indication information describing the composition of the region's candidate prediction mode set can likewise serve as such prior knowledge. This prior knowledge guides the encoding of the prediction mode, saving the code rate of encoding the mode selection information and improving coding efficiency.
Meanwhile, there are various solutions for improving the coding efficiency of affine-model motion information, such as the patent applications with application numbers CN201010247275.7, CN201410584175.1, CN201410526608.8, CN201510085362.X, PCT/CN2015/073969, CN201510249484.8, CN201510391765.7, and CN201510543542.8, the entire contents of which are incorporated herein by reference. It should be understood that, since the specific technical problems solved differ, the technical solutions of the present invention can be applied to the above solutions to further improve coding efficiency.
It should also be understood that the affine model is a general term for non-translational motion models. Actual motions including rotation, zooming, deformation, and perspective can all be used for motion estimation and motion compensation in inter prediction by establishing different motion models; they may be briefly referred to as the first affine model, the second affine model, and so on.
图1是根据本发明实施例的视频译码系统10的示意性框图。如本文中所描述,术语“视频译码器”一般指代视频编码器及视频解码器两者。在本发明中,术语“视频译码”或“译码”可一般指代视频编码或视频解码。
如图1中所展示,视频译码系统10包含源装置12及目的地装置14。源装置12产生经编码视频数据。因此,源装置12可被称作视频编码装置或视频编码设备。目的地装置14可解码由源装置12产生的经编码视频数据。因此,目的地装置14可被称作视频解码装置或视频解码设备。源装置12及目的地装置14可为视频译码装置或视频译码设备的实例。源装置12及目的地 装置14可包括广泛范围的装置,包含桌上型计算机、移动计算装置、笔记型(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机,或其类似者。
目的地装置14可经由信道16接收来自源装置12的经编码视频数据。信道16可包括能够将经编码视频数据从源装置12移动到目的地装置14的一或多个媒体及/或装置。在一实例中,信道16可包括使源装置12能够实时地将经编码视频数据直接发射到目的地装置14的一或多个通信媒体。在此实例中,源装置12可根据通信标准(例如,无线通信协议)来调变经编码视频数据,且可将经调变视频数据发射到目的地装置14。所述一或多个通信媒体可包含无线及/或有线通信媒体,例如射频(RF)频谱或一或多根物理传输线。所述一或多个通信媒体可形成基于包的网络(例如,局域网、广域网或全球网络(例如,因特网))的部分。所述一或多个通信媒体可包含路由器、交换器、基站,或促进从源装置12到目的地装置14的通信的其它设备。
在另一实例中,信道16可包含存储由源装置12产生的经编码视频数据的存储媒体。在此实例中,目的地装置14可经由磁盘存取或卡存取来存取存储媒体。存储媒体可包含多种本地存取式数据存储媒体,例如蓝光光盘、DVD、CD-ROM、快闪存储器,或用于存储经编码视频数据的其它合适数字存储媒体。
在另一实例中,信道16可包含文件服务器或存储由源装置12产生的经编码视频数据的另一中间存储装置。在此实例中,目的地装置14可经由流式传输或下载来存取存储于文件服务器或其它中间存储装置处的经编码视频数据。文件服务器可为能够存储经编码视频数据且将所述经编码视频数据发射到目的地装置14的服务器类型。实例文件服务器包含web服务器(例如,用于网站)、文件传送协议(FTP)服务器、网络附接存储(NAS)装置,及本地磁盘驱动器。
目的地装置14可经由标准数据连接(例如,因特网连接)来存取经编码视频数据。数据连接的实例类型包含适合于存取存储于文件服务器上的经编码视频数据的无线信道(例如,Wi-Fi连接)、有线连接(例如,DSL、缆线调制解调器等),或两者的组合。经编码视频数据从文件服务器的发射可为流式传输、下载传输或两者的组合。
本发明的技术不限于无线应用或设定。可将所述技术应用于支持例如以下应用等多种多媒体应用的视频译码:空中电视广播、有线电视发射、卫星电视发射、流式传输视频发射(例如,经由因特网)、存储于数据存储媒体上的视频数据的编码、存储于数据存储媒体上的视频数据的解码,或其它应用。在一些实例中,视频译码系统10可经配置以支持单向或双向视频发射,以支持例如视频流式传输、视频播放、视频广播及/或视频电话等应用。
在图1的实例中,源装置12包含视频源18、视频编码器20及输出接口22。在一些实例中,输出接口22可包含调变器/解调变器(调制解调器)及/或发射器。视频源18可包含视频俘获装置(例如,视频相机)、含有先前俘获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频馈入接口,及/或用于产生视频数据的计算机图形系统,或视频数据的此些源的组合。
视频编码器20可编码来自视频源18的视频数据。在一些实例中,源装置12经由输出接口22将经编码视频数据直接发射到目的地装置14。经编码视频数据还可存储于存储媒体或文件服务器上以供目的地装置14稍后存取以用于解码及/或播放。
在图1的实例中,目的地装置14包含输入接口28、视频解码器30及显示装置32。在一些实例中,输入接口28包含接收器及/或调制解调器。输入接口28可经由信道16接收经编码视频数据。显示装置32可与目的地装置14整合或可在目的地装置14外部。一般来说,显示装置32显示经解码视频数据。显示装置32可包括多种显示装置,例如液晶显示器(LCD)、等离子体显示器、有机发光二极管(OLED)显示器或另一类型的显示装置。
视频编码器20及视频解码器30可根据视频压缩标准(例如,高效率视频译码(H.265)标准)而操作,且可遵照HEVC测试模型(HM)。H.265标准的文本描述ITU-TH.265(V3)(04/2015)于2015年4月29号发布,可从http://handle.itu.int/11.1002/1000/12455下载,所述文件的全部内容以引用的方式并入本文中。
或者,视频编码器20及视频解码器30可根据其它专属或行业标准而操作,所述标准包含ITU-TH.261、ISO/IECMPEG-1Visual、ITU-TH.262或ISO/IECMPEG-2Visual、ITU-TH.263、ISO/IECMPEG-4Visual,ITU-TH.264(还称为ISO/IECMPEG-4AVC),包含其可调式视频译码(SVC)及多视图视频译码 (MVC)扩展。然而,本发明的技术不限于任何特定译码标准或技术。
此外,图1仅为实例且本发明的技术可应用于未必包含编码装置与解码装置之间的任何数据通信的视频译码设定(例如,视频编码或视频解码)。在其它实例中,从本地存储器检索数据,经由网络流式传输数据,或以类似方式操作数据。编码装置可编码数据且将所述数据存储到存储器,及/或解码装置可从存储器检索数据且解码所述数据。在许多实例中,通过彼此不进行通信而仅编码数据到存储器及/或从存储器检索数据及解码数据的多个装置执行编码及解码。
视频编码器20及视频解码器30各自可实施为多种合适电路中的任一者,例如一或多个微处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)、离散逻辑、硬件或其任何组合。如果技术部分地以软件实施,则装置可将软件的指令存储于合适的非暂时性计算机可读存储媒体中,且可使用一或多个处理器执行硬件中的指令以执行本发明的技术。可将前述各者中的任一者(包含硬件、软件、硬件与软件的组合等)视为一或多个处理器。视频编码器20及视频解码器30中的每一者可包含于一或多个编码器或解码器中,其中的任一者可整合为各别装置中的组合式编码器/解码器(编解码器(CODEC))的部分。
本发明大体上可指代视频编码器20将某一信息“用信号发送”到另一装置(例如,视频解码器30)。术语“用信号发送”大体上可指代语法元素及/或表示经编码视频数据的其它数据的传达。此传达可实时或近实时地发生。或者,此通信可在一时间跨度上发生,例如可在编码时以经编码位流将语法元素存储到计算机可读存储媒体时发生,所述语法元素在存储到此媒体之后接着可由解码装置在任何时间检索。
如上文简单提及,视频编码器20编码视频数据。视频数据可包括一或多个图片。所述图片中的每一者可为静态图像。在一些例子中,图片可被称作视频“帧”。视频编码器20可产生位流,所述位流包含形成视频数据的经译码表示的位的序列。视频数据的经译码表示可包含经译码图片及相关联数据。经译码图片为图片的经译码表示。相关联数据可包含序列参数集(SPS)、图片参数集(PPS)及其它语法结构。SPS可含有可应用于图片的零个或多个序列的参数。PPS可含有可应用于零个或多个图片的参数。语法结构可为以指定次序一起呈现于位流中的零个或多个语法元素的集合。
为产生图片的经编码表示,视频编码器20可将图片分割成译码树型块(CTB)的栅格。在一些例子中,CTB可被称作“树型块”、“最大译码单元”(LCU)或“译码树型单元”。HEVC的CTB可大致类似于先前标准(例如,H.264/AVC)的宏块。然而,CTB未必限于特定大小且可包含一或多个译码单元(CU)。
CTB中的每一者可与图片内的像素的具有相等大小的不同块相关联。每一像素可包括一亮度(luminance或luma)样本及两个色度(chrominance或chroma)样本。因此,每一CTB可与亮度样本的一块及色度样本的两个块相关联。为易于解释,本发明可将二维像素阵列称作像素块,且可将二维样本阵列称作样本块。视频编码器20可使用四分树分割来将与CTB相关联的像素块分割成与CU相关联的像素块,因此名称为“译码树型块”。
图片的CTB可经分组成一或多个切片。在一些实例中,切片中的每一者包含整数个CTB。作为编码一图片的部分,视频编码器20可产生所述图片的每一切片的经编码表示(即,经译码切片)。为产生经译码切片,视频编码器20可编码切片的每一CTB以产生切片的CTB中的每一者的经编码表示(即,经译码CTB)。
为产生经译码CTB,视频编码器20可对与CTB相关联的像素块递归地执行四分树分割,以将像素块分割成逐渐减小的像素块。较小像素块中的每一者可与CU相关联。经分割CU可为像素块经分割成与其它CU相关联的像素块的CU。未经分割CU可为像素块未经分割成与其它CU相关联的像素块的CU。
视频编码器20可产生每一未经分割CU的一或多个预测单元(PU)。CU的PU中的每一者可与CU的像素块内的不同像素块相关联。视频编码器20可针对CU的每一PU产生预测性像素块。PU的预测性像素块可为像素的块。
视频编码器20可使用帧内预测或帧间预测来产生PU的预测性像素块。如果视频编码器20使用帧内预测来产生PU的预测性像素块,则视频编码器20可基于与PU相关联的图片的经解码像素来产生PU的预测性像素块。如果视频编码器20使用帧间预测来产生PU的预测性像素块,则视频编码器20可基于不同于与PU相关联的图片的一或多个图片的经解码像素来产生PU的预测性像素块。
视频编码器20可基于CU的PU的预测性像素块来产生CU的残余像素块。CU的残余像素块可指示CU的PU的预测性像素块中的样本与CU的初始像素 块中的对应样本之间的差。
此外,作为编码未经分割CU的部分,视频编码器20可对CU的残余像素块执行递归四分树分割以将CU的残余像素块分割成与CU的变换单元(TU)相关联的一或多个较小残余像素块。因为与TU相关联的像素块中的像素各自包含一亮度样本及两个色度样本,所以TU中的每一者可与亮度样本的一残余样本块及色度样本的两个残余样本块相关联。
视频译码器20可将一或多个变换应用于与TU相关联的残余样本块以产生系数块(即,系数的块)。视频编码器20可对系数块中的每一者执行量化程序。量化大体上指代系数经量化以可能减少用以表示系数的数据量从而提供进一步压缩的程序。
视频编码器20可产生表示经量化系数块中的系数的语法元素的集合。视频编码器20可将熵编码操作(例如,上下文自适应二进制算术译码(CABAC)操作)应用于此些语法元素中的至少一些。
为将CABAC编码应用于语法元素,视频编码器20可将语法元素二进制化以形成包括一连串一或多个位(称作“二进位”)的二进制串。视频编码器20可使用规则CABAC编码来编码二进位中的一些,且可使用旁通编码来编码二进位中的其它者。
当视频编码器20使用规则CABAC编码来编码二进位的序列时,视频编码器20可首先识别译码上下文。译码上下文可识别译码具有特定值的二进位的机率。举例来说,译码上下文可指示译码0值二进位的机率为0.7及译码1值二进位的机率为0.3。在识别译码上下文之后,视频编码器20可将区间分成下部子区间及上部子区间。所述子区间中的一者可与值0相关联,且另一子区间可与值1相关联。子区间的宽度可与由所识别的译码上下文针对相关联值而指示的机率成比例。
如果语法元素的二进位具有与下部子区间相关联的值,则经编码值可等于下部子区间的下边界。如果语法元素的同一二进位具有与上部子区间相关联的值,则经编码值可等于上部子区间的下边界。为编码语法元素的下一二进位,视频编码器20可相对于为与经编码位的值相关联的子区间的区间来重复此些步骤。当视频编码器20针对下一二进位重复此些步骤时,视频编码器20可使用基于由所识别的译码上下文指示的机率及经编码二进位的实际值的经修改机率。
当视频编码器20使用旁通编码来编码二进位的序列时,视频编码器20可能能够在单一循环中译码若干二进位,而当视频编码器20使用规则CABAC编码时,视频编码器20可能能够在一循环中仅译码单一二进位。旁通译码可较简单,这是因为旁通译码不需要视频编码器20选择上下文且可使视频编码器20能够假定两个符号(0及1)的机率为1/2(50%)。因此,在旁通译码中,将区间直接分裂成两半。实际上,旁通译码将算术译码引擎的上下文自适应部分旁通。
对二进位执行旁通译码可能比对二进位执行规则CABAC译码在计算上花费少。此外,执行旁通译码可实现较高平行度及输送量。使用旁通译码来编码的二进位可被称作“经旁通译码二进位”。
除熵编码系数块的语法元素外,视频编码器20可将逆量化及逆变换应用于变换块,以从变换块重建构残余样本块。视频编码器20可将经重建构残余样本块加到来自一或多个预测性样本块的对应样本,以产生经重建构样本块。通过重建构每一色彩分量的样本块,视频编码器20可重建构与TU相关联的像素块。通过以此方式重建构CU的每一TU的像素块,视频编码器20可重建构CU的像素块。
在视频编码器20重建构CU的像素块之后,视频编码器20可执行解块操作以减少与CU相关联的方块效应假影。在视频编码器20执行解块操作之后,视频编码器20可使用样本自适应偏移(SAO)来修改图片的CTB的经重建构像素块。一般来说,将偏移值加到图片中的像素可改进译码效率。在执行此些操作之后,视频编码器20可将CU的经重建构像素块存储于经解码图片缓冲器中以用于产生其它CU的预测性像素块。
视频解码器30可接收位流。所述位流可包含由视频编码器20编码的视频数据的经译码表示。视频解码器30可剖析所述位流以从所述位流提取语法元素。作为从所述位流提取至少一些语法元素的部分,视频解码器30可熵解码位流中的数据。
当视频解码器30执行CABAC解码时,视频解码器30可对一些二进位执行规则CABAC解码且可对其它二进位执行旁通解码。当视频解码器30对语法元素执行规则CABAC解码时,视频解码器30可识别译码上下文。视频解码器30可接着将区间分成下部子区间及上部子区间。所述子区间中的一者可与值0相关联,且另一子区间可与值1相关联。子区间的宽度可与由所识 别的译码上下文针对相关联值而指示的机率成比例。如果经编码值在下部子区间内,则视频解码器30可解码具有与下部子区间相关联的值的二进位。如果经编码值在上部子区间内,则视频解码器30可解码具有与上部子区间相关联的值的二进位。为解码语法元素的下一二进位,视频解码器30可相对于为含有经编码值的子区间的区间而重复此些步骤。当视频解码器30针对下一二进位重复此些步骤时,视频解码器30可使用基于由所识别的译码上下文指示的机率及经解码二进位的经修改机率。视频解码器30可接着将二进位解二进制化以恢复语法元素。解二进制化可指代根据二进制串与语法元素值之间的映射来选择语法元素值。
当视频解码器30执行旁通解码时,视频解码器30可能能够在单一循环内解码若干二进位,而当视频解码器30执行规则CABAC解码时,视频解码器30大体上可仅能够在一循环中解码单一二进位,或需要一个以上循环用于单一二进位。旁通解码可比规则CABAC解码简单,这是因为视频解码器30不需要选择上下文且可假定两个符号(0及1)的机率为1/2。以此方式,旁通二进位的编码及/或解码可比经规则译码二进位在计算上花费少,且可实现较高平行度及输送量。
视频解码器30可基于从位流提取的语法元素来重建构视频数据的图片。基于语法元素来重建构视频数据的程序大体上与由视频编码器20执行以产生语法元素的程序互逆。举例来说,视频解码器30可基于与CU相关联的语法元素来产生CU的PU的预测性像素块。另外,视频解码器30可逆量化与CU的TU相关联的系数块。视频解码器30可对系数块执行逆变换以重建构与CU的TU相关联的残余像素块。视频解码器30可基于预测性像素块及残余像素块来重建构CU的像素块。
在视频解码器30已重建构CU的像素块之后,视频解码器30可执行解块操作以减少与CU相关联的方块效应假影。另外,基于一或多个SAO语法元素,视频解码器30可应用由视频编码器20应用的SAO。在视频解码器30执行此些操作之后,视频解码器30可将CU的像素块存储于经解码图片缓冲器中。经解码图片缓冲器可提供用于后续运动补偿、帧内预测及显示装置上的呈现的参考图片。
图2为说明经配置以实施本发明的技术的实例视频编码器20的框图。图2是出于解释的目的而提供且不应视为限制如本发明广泛例证并描述的技术。出于解释的目的,本发明在HEVC译码的图像预测中描述视频编码器20。然而,本发明的技术可适用于其它译码标准或方法。
在图2的实例中,视频编码器20包含预测处理单元100、残余产生单元102、变换处理单元104、量化单元106、逆量化单元108、逆变换处理单元110、重建构单元112、滤波器单元113、经解码图片缓冲器114及熵编码单元116。熵编码单元116包含规则CABAC译码引擎118及旁通译码引擎120。预测处理单元100包含帧间预测处理单元121及帧内预测处理单元126。帧间预测处理单元121包含运动估计单元122及运动补偿单元124。在其它实例中,视频编码器20可包含更多、更少或不同的功能组件。
视频编码器20接收视频数据。为编码视频数据,视频编码器20可编码视频数据的每一图片的每一切片。作为编码切片的部分,视频编码器20可编码所述切片中的每一CTB。作为编码CTB的部分,预测处理单元100可对与CTB相关联的像素块执行四分树分割,以将像素块分成逐渐变小的像素块。较小像素块可与CU相关联。举例来说,预测处理单元100可将CTB的像素块分割成四个相等大小的子块,将子块中的一或多者分割成四个相等大小的子子块,等等。
视频编码器20可编码图片中的CTB的CU以产生CU的经编码表示(即,经译码CU)。视频编码器20可根据z形扫描次序来编码CTB的CU。换句话说,视频编码器20可按左上CU、右上CU、左下CU及接着右下CU来编码所述CU。当视频编码器20编码经分割CU时,视频编码器20可根据z形扫描次序来编码与经分割CU的像素块的子块相关联的CU。
此外,作为编码CU的部分,预测处理单元100可在CU的一或多个PU当中分割CU的像素块。视频编码器20及视频解码器30可支持各种PU大小。假定特定CU的大小为2N×2N,视频编码器20及视频解码器30可支持2N×2N或N×N的PU大小以用于帧内预测,且支持2N×2N、2N×N、N×2N、N×N或类似大小的对称PU大小以用于帧间预测。视频编码器20及视频解码器30还可支持2N×nU、2N×nD、nL×2N及nR×2N的PU大小的不对称分割以用于帧间预测。
帧间预测处理单元121可通过对CU的每一PU执行帧间预测而产生PU的预测性数据。PU的预测性数据可包含对应于PU的预测性像素块及PU的运动信息。切片可为I切片、P切片或B切片。帧间预测单元121可取决于PU是在I切片、P切片还是B切片中而对CU的PU执行不同操作。在I切片中,所有PU经帧内预测。因此,如果PU在I切片中,则帧间预测单元121不对PU执行帧间预测。
如果PU在P切片中,则运动估计单元122可搜索参考图片的列表(例如,“列表0”)中的参考图片以查找PU的参考块。PU的参考块可为最紧密地对应于PU的像素块的像素块。运动估计单元122可产生指示列表0中的含有PU的参考块的参考图片的参考图片索引,及指示PU的像素块与参考块之间的空间位移的运动向量。运动估计单元122可将参考图片索引及运动向量作为PU的运动信息而输出。运动补偿单元124可基于由PU的运动信息指示的参考块来产生PU的预测性像素块。
如果PU在B切片中,则运动估计单元122可对PU执行单向帧间预测或双向帧间预测。为对PU执行单向帧间预测,运动估计单元122可搜索第一参考图片列表(“列表0”)或第二参考图片列表(“列表1”)的参考图片以查找PU的参考块。运动估计单元122可将以下各者作为PU的运动信息而输出:指示含有参考块的参考图片的列表0或列表1中的位置的参考图片索引、指示PU的像素块与参考块之间的空间位移的运动向量,及指示参考图片是在列表0中还是在列表1中的预测方向指示符。
为对PU执行双向帧间预测,运动估计单元122可搜索列表0中的参考图片以查找PU的参考块,且还可搜索列表1中的参考图片以查找PU的另一参考块。运动估计单元122可产生指示含有参考块的参考图片的列表0及列表1中的位置的参考图片索引。另外,运动估计单元122可产生指示参考块与PU的像素块之间的空间位移的运动向量。PU的运动信息可包含PU的参考图片索引及运动向量。运动补偿单元124可基于由PU的运动信息指示的参考块来产生PU的预测性像素块。
帧内预测处理单元126可通过对PU执行帧内预测而产生PU的预测性数据。PU的预测性数据可包含PU的预测性像素块及各种语法元素。帧内预测处理单元126可对I切片、P切片及B切片内的PU执行帧内预测。
为对PU执行帧内预测,帧内预测处理单元126可使用多个帧内预测模式来产生PU的预测性数据的多个集合。为使用帧内预测模式来产生PU的预测性数据的集合,帧内预测处理单元126可在与帧内预测模式相关联的方向上跨越PU的样本块扩展来自相邻PU的样本块的样本。假定从左向右、从上而下的编码次序用于PU、CU及CTB,相邻PU可在PU的上方,在PU的右上方,在PU的左上方或在PU的左方。帧内预测处理单元126可使用各种数目个帧内预测模式,例如,33个方向性帧内预测模式。在一些实例中,帧内预测模式的数目可取决于PU的像素块的大小。
预测处理单元100可从通过帧间预测处理单元121针对PU而产生的预测性数据或通过帧内预测处理单元126针对PU而产生的预测性数据当中选择CU的PU的预测性数据。在一些实例中,预测处理单元100基于预测性数据的集合的速率/失真量度来选择CU的PU的预测性数据。选定预测性数据的预测性像素块在本文中可被称作选定预测性像素块。
残余产生单元102可基于CU的像素块及CU的PU的选定预测性像素块来产生CU的残余像素块。举例来说,残余产生单元102可产生CU的残余像素块,使得残余像素块中的每一样本具有等于以下两者之间的差的值:CU的像素块中的样本,及CU的PU的选定预测性像素块中的对应样本。
预测处理单元100可执行四分树分割以将CU的残余像素块分割成子块。每一未划分的残余像素块可与CU的不同TU相关联。与CU的TU相关联的残余像素块的大小及位置可能或可能不基于CU的PU的像素块的大小及位置。
因为TU的残余像素块的像素可包括一亮度样本及两个色度样本,所以TU中的每一者可与亮度样本的一块及色度样本的两个块相关联。变换处理单元104可通过将一或多个变换应用于与TU相关联的残余样本块而产生CU的每一TU的系数块。变换处理单元104可将各种变换应用于与TU相关联的残余样本块。举例来说,变换处理单元104可将离散余弦变换(DCT)、方向性变换或概念上类似的变换应用于残余样本块。
量化单元106可量化系数块中的系数。量化程序可减小与所述系数中的一些或全部相关联的位深度。举例来说,n位系数可在量化期间降值舍位到m位系数,其中n大于m。量化单元106可基于与CU相关联的量化参数(QP)值来量化与CU的TU相关联的系数块。视频编码器20可通过调整与CU相关联的QP值来调整应用于与CU相关联的系数块的量化程度。
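基于QP的量化可用如下草图说明(浮点简化;QP到量化步长的映射采用HEVC中QP每增加6步长加倍的近似关系,舍入方式为示例性假设,并非标准的定点实现):

```python
def quantize(coeffs, qp):
    """标量量化草图: 较大的QP对应较大的步长, 即较粗的量化。"""
    qstep = 2.0 ** ((qp - 4) / 6.0)          # QP每增加6, 量化步长加倍
    return [int(round(c / qstep)) for c in coeffs]

# QP=28时qstep=16, n位系数被降值舍位为动态范围更小的经量化系数
levels = quantize([64, -33, 10, 0], 28)
```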
逆量化单元108及逆变换处理单元110可分别将逆量化及逆变换应用于系数块以从系数块重建构残余样本块。重建构单元112可将经重建构残余样本块的样本加到来自由预测处理单元100产生的一或多个预测性样本块的对应样本,以产生与TU相关联的经重建构样本块。通过以此方式重建构CU的每一TU的样本块,视频编码器20可重建构CU的像素块。
滤波器单元113可执行解块操作以减少与CU相关联的像素块中的方块效应假影。此外,滤波器单元113可将由预测处理单元100确定的SAO偏移应用于经重建构样本块以恢复像素块。滤波器单元113可产生CTB的SAO语法元素的序列。SAO语法元素可包含经规则CABAC译码二进位及经旁通译码二进位。根据本发明的技术,在序列内,色彩分量的经旁通译码二进位中无一者在同一色彩分量的经规则CABAC译码二进位中的两者之间。
经解码图片缓冲器114可存储经重建构像素块。帧间预测单元121可使用含有经重建构像素块的参考图片来对其它图片的PU执行帧间预测。另外,帧内预测处理单元126可使用经解码图片缓冲器114中的经重建构像素块来对在与CU相同的图片中的其它PU执行帧内预测。
熵编码单元116可接收来自视频编码器20的其它功能组件的数据。举例来说,熵编码单元116可接收来自量化单元106的系数块且可接收来自预测处理单元100的语法元素。熵编码单元116可对数据执行一或多个熵编码操作以产生经熵编码数据。举例来说,熵编码单元116可对数据执行上下文自适应可变长度译码(CAVLC)操作、CABAC操作、可变到可变(V2V)长度译码操作、基于语法的上下文自适应二进制算术译码(SBAC)操作、机率区间分割熵(PIPE)译码操作,或另一类型的熵编码操作。在一特定实例中,熵编码单元116可编码由滤波器单元113产生的SAO语法元素。作为编码SAO语法元素的部分,熵编码单元116可使用规则CABAC引擎118来编码SAO语法元素的经规则CABAC译码二进位,且可使用旁通译码引擎120来编码经旁通译码二进位。
根据本发明的技术,帧间预测单元121确定帧间候选预测模式的集合。以此方式,视频编码器20是视频编码器的实例,所述视频编码器根据本发明的技术经配置以根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像;将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
图3是根据本发明的一或多种技术的用于编码视频数据的视频编码器的实例操作200的流程图。图3是作为实例而提供。在其它实例中,可使用比图3的实例中所展示的步骤多、少的步骤或与其不同的步骤来实施本发明的技术。根据图3的实例方法,视频编码器20执行如下步骤:
S210,根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式;
具体的,如图4所示,块A、B、C、D、E为当前待编码块的相邻已重构块,分别位于待编码块的上、左、右上、左下和左上的位置,可以通过相邻重构块的编码信息,确定当前待编码块的候选预测模式集合是否存在仿射合并模式。
应理解,本发明实施例中的图4示例性的给出了待编码块的相邻已重构块的数量和位置,所述相邻已重构块的数量可以多于5个或者少于5个,不做限定。
在第一种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块;如果没有,则该待编码块的候选预测模式集合不包含仿射合并模式;如果有,则分别以该待编码块的候选预测模式集合包含仿射合并模式和该待编码块的候选预测模式集合不包含仿射合并模式,两种情况,执行图2所示的编码过程;如果第一种情况的编码性能更好,则该待编码块的候选预测模式集合包含仿射合并模式,将一指示信息,不妨设为第二指示信息,置1,并编入码流,反之,则该待编码块的候选预测模式集合不包含仿射合并模式,将第二指示信息,置0,并编入码流。
在第二种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块;如果没有,则该待编码块的候选预测模式集合不包含仿射合并模式;如果有,则该待编码块的候选预测模式集合包含仿射合并模式。
在第三种可实现方式中,所述的相邻重构块包含多种仿射模式,不妨设包含第一仿射模式或第二仿射模式,对应的,所述仿射合并模式,包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,分别统计所述相邻重构块中第一仿射模式、第二仿射模式和非仿射模式的数量,当第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;当非仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
第三种可实现方式,还可以为:当第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;当非仿射模式最多时,统计第一仿射模式和第二仿射模式哪一个次多,当第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
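上述第三种可实现方式中按数量统计确定候选集合的逻辑,可用如下示例性草图表示(其中 affine1、affine2、non_affine 等模式名称均为说明用的假设标识,并非标准语法元素):

```python
from collections import Counter

def affine_merge_candidates(neighbor_modes):
    """统计相邻重构块的模式数量, 返回候选集合中应包含的仿射合并模式。"""
    cnt = Counter(neighbor_modes)
    # 按数量降序排列三类模式; 计数相同时保持元组原有次序
    order = sorted(('affine1', 'affine2', 'non_affine'),
                   key=lambda m: cnt[m], reverse=True)
    top = order[0]
    if top == 'affine1':
        return {'affine_merge1'}   # 第一仿射模式最多: 仅含第一仿射合并模式
    if top == 'affine2':
        return {'affine_merge2'}   # 第二仿射模式最多: 仅含第二仿射合并模式
    return set()                   # 非仿射模式最多: 不包含仿射合并模式

modes = affine_merge_candidates(
    ['affine1', 'affine1', 'non_affine', 'affine2', 'affine1'])
```

"次多"的扩展变体只需在非仿射模式最多时继续比较 order 中剩余两类的计数即可。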
在第四种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;如果任一条件不满足,则该待编码块的候选预测模式集合不包含仿射合并模式;如果两个条件都满足,则分别以该待编码块的候选预测模式集合包含仿射合并模式和该待编码块的候选预测模式集合不包含仿射合并模式,两种情况,执行图2所示的编码过程;如果第一种情况的编码性能更好,则该待编码块的候选预测模式集合包含仿射合并模式,将一指示信息,不妨设为第三指示信息,置1,并编入码流,反之,则该待编码块的候选预测模式集合不包含仿射合并模式,将第三指示信息,置0,并编入码流。
应理解,所述判断条件(2)在本发明实施例中为同时满足仿射模式的相邻块的宽小于待编码块的宽,仿射模式的相邻块的高小于待编码块的高,在其它的实施例中,该判断条件还可以为:仿射模式的相邻块的宽小于待编码块的宽或者仿射模式的相邻块的高小于待编码块的高,不做限定。
在第五种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;如果任一条件不满足,则该待编码块的候选预测模式集合不包含仿射合并模式;如果两个条件都满足,则该待编码块的候选预测模式集合包含仿射合并模式。
应理解,在本发明的实施例中,采用了相邻重构块的预测类型和尺寸作为判断当前待编码块的候选预测模式集合的根据,还可以采用相邻重构块解析获得的属性信息来判断,在这里不做限定。
还应理解,在本发明实施例的各种可实现方式中,示例性的,比如第二种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块,也可以采用如下的判断准则,示例性的,至少两个相邻块的预测类型是仿射模式时,该待编码块的候选预测模式集合包含仿射合并模式,否则,该待编码块的候选预测模式集合不包含仿射合并模式。相邻块的预测类型是仿射模式的个数也可以是至少三个,或者至少四个,不做限定。
还应理解,在本发明实施例的各种可实现方式中,示例性的,比如第五种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;其中,第二个判断条件,示例性的,也可以为仿射模式的相邻块的宽和高是否小于待编码块的宽和高的1/2或者1/3,1/4,不做限定。
还应理解,在本发明实施例中对指示信息置0或者置1是示例性的,也可以进行相反的设置,示例性的,比如在第一种可实现方式中,还可以判断所述的相邻重构块中是否存在预测类型是仿射预测的块;如果没有,则该待编码块的候选预测模式集合不包含仿射合并模式;如果有,则分别以该待编码块的候选预测模式集合包含仿射合并模式和该待编码块的候选预测模式集合不包含仿射合并模式,两种情况,执行图2所示的编码过程;如果第一种情况的编码性能更好,则该待编码块的候选预测模式集合包含仿射合并模式,将一指示信息,不妨设为第二指示信息,置0,并编入码流,反之,则该待编码块的候选预测模式集合不包含仿射合并模式,将第二指示信息,置1,并编入码流。
S220,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
所述候选预测模式集合即为S210所确定的候选预测模式集合,依次采用候选预测模式集合中的每一个预测模式执行图2所示的编码过程,选择编码性能最好的一种模式,作为待编码块的预测模式。
应理解,本发明实施例中执行图2所示的编码过程的目的是选出可能使编码性能最优的一种预测模式。在所述选择过程中,可以比较各预测模式的性能代价比,其中性能使用图像还原的质量来表示,代价使用编码的码率来表示,也可以只比较各预测模式的性能或者代价,对应的,可以完成图2所示的编码的全部步骤,也可以在获得需要比较的指标后,停止编码步骤,比如,如果仅使用性能比较各预测模式,则仅需要经过预测单元即可,不做限定。
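上述按性能代价比选择预测模式的过程,可抽象为最小化率失真代价 D+λ·R 的草图(其中λ的取值以及各模式的失真、码率数值均为说明用的假设):

```python
def select_mode(candidates, rd_results):
    """在候选预测模式中选择率失真代价最小者。

    rd_results: 模式 -> (失真D, 码率R) 的映射, 由图2所示编码过程得到。
    """
    lam = 0.5   # 拉格朗日乘子(假设值)
    return min(candidates,
               key=lambda m: rd_results[m][0] + lam * rd_results[m][1])

best = select_mode(
    ['translational', 'affine', 'affine_merge'],
    {'translational': (100.0, 40),
     'affine': (80.0, 60),
     'affine_merge': (85.0, 30)})
```

若仅比较性能或仅比较代价,只需将 key 函数换成失真项或码率项即可,与正文"可以只比较各预测模式的性能或者代价"相对应。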
S230,根据所述预测模式,确定所述待处理图像单元的预测图像;
前文引述的H.265标准以及CN201010247275.7等申请文件对根据预测模式,包括平动运动模型的预测模式,仿射预测模式,仿射合并模式等,生成待编码块的预测图像,进行了详细描述,这里不再赘述。
S240,将第一指示信息编入码流;
将S220确定的预测模式编入码流,应理解,该步骤可以发生在S220之后的任意时刻,在步骤次序上没有特定要求,和解码端的解码第一指示信息相对应即可。
图5是根据本发明的一或多种技术的用于编码视频数据的另一视频编码器40的实例框图。
视频编码器40包括:第一确定模块41,第二确定模块42,第三确定模块43,编码模块44。
第一确定模块41用于执行S210根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式;
第二确定模块42用于执行S220从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
第三确定模块43用于执行S230根据所述预测模式,确定所述待处理图像单元的预测图像;
编码模块44用于执行S240将第一指示信息编入码流。
由于相邻块之间的运动信息具有相关性,当前块和相邻块存在很大的可能性具有相同或相似的预测模式,本发明实施例通过判断相邻块的信息,来推导当前块的预测模式信息,减少了编码预测模式的码率,提高了编码效率。
图6是根据本发明的一或多种技术的用于编码视频数据的视频编码器的实例操作300的流程图。图6是作为实例而提供。在其它实例中,可使用比图6的实例中所展示的步骤多、少的步骤或与其不同的步骤来实施本发明的技术。根据图6的实例方法,视频编码器20执行如下步骤:
S310,编码第一待处理图像区域的候选预测模式集合的指示信息。
当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一待处理图像区域的候选模式集合采用所述候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;其中,所述第一待处理图像区域可以是图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一,对应的,所述第一指示信息编码于图像帧组的头部,比如,视频参数集(VPS),序列参数集(SPS),附加增强信息(SEI),图像帧头部,比如图像参数集(PPS),图像分片集的头部,图像条带集的头部,图像分片的头部,比如图像分片头(tile header),图像条带头(slice header),图像编码单元集合的头部、图像编码单元的头部。
应理解,该步骤对于第一待处理图像区域的确定,可以是预先配置的,也可以在编码的过程中自适应确定,第一待处理图像区域范围的表示,可以通过编解码端的协议获知,也可以通过码流将第一待处理图像区域的范围编码传输,不做限定。
还应理解,对于候选预测模式集合的确定,可以是预先配置的,也可以是比较编码性能后确定的,不做限定。
还应理解,在本发明实施例中对指示信息置0或者置1是示例性的,也可以进行相反的设置。
S320,对于所述第一待处理图像区域中的待处理单元,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式;
具体方法和S220类似,不再赘述。
S330,根据所述预测模式,确定所述待处理图像单元的预测图像;
具体方法和S230类似,不再赘述。
S340,将待处理单元所选用的预测模式编入所述码流;
具体方法和S240类似,不再赘述。
图7是根据本发明的一或多种技术的用于编码视频数据的另一视频编码器50的实例框图。
视频编码器50包括:第一编码模块51,第一确定模块52,第二确定模块53,第二编码模块54。
第一编码模块51用于执行S310编码第一待处理图像区域的候选预测模式集合的指示信息;
第一确定模块52用于执行S320对于所述第一待处理图像区域中的待处理单元,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式;
第二确定模块53用于执行S330根据所述预测模式,确定所述待处理图像单元的预测图像;
第二编码模块54用于执行S340将待处理单元所选用的预测模式编入所述码流。
由于相邻块之间的运动信息具有相关性,在同一区域内,很大可能性,仅存在平移运动而不存在仿射运动,本发明实施例通过设置区域级的候选预测模式集合选择标志,避免了编码冗余模式花费的码率,提高了编码效率。
图8为说明经配置以实施本发明的技术的实例视频解码器30的框图。图8是出于解释的目的而提供且不限制如本发明中广泛例证并描述的技术。出于解释的目的,本发明在HEVC译码的图像预测中描述视频解码器30。然而,本发明的技术可适用于其它译码标准或方法。
在图8的实例中,视频解码器30包含熵解码单元150、预测处理单元152、逆量化单元154、逆变换处理单元156、重建构单元158、滤波器单元159及经解码图片缓冲器160。预测处理单元152包含运动补偿单元162及帧内预测处理单元164。熵解码单元150包含规则CABAC译码引擎166及旁通译码引擎168。在其它实例中,视频解码器30可包含更多、更少或不同的功能组件。
视频解码器30可接收位流。熵解码单元150可剖析所述位流以从所述位流提取语法元素。作为剖析位流的部分,熵解码单元150可熵解码位流中的经熵编码语法元素。预测处理单元152、逆量化单元154、逆变换处理单元156、重建构单元158及滤波器单元159可基于从位流提取的语法元素来产生经解码视频数据。
位流可包含CTB的经译码SAO语法元素的序列。SAO语法元素可包含经规则CABAC译码二进位及经旁通译码二进位。根据本发明的技术,在经译码SAO语法元素的序列中,经旁通译码二进位中无一者在经规则CABAC译码二进位中的两者之间。熵解码单元150可解码SAO语法元素。作为解码SAO语法元素的部分,熵解码单元150可使用规则CABAC译码引擎166来解码经规则CABAC译码二进位,且可使用旁通译码引擎168来解码经旁通译码二进位。
另外,视频解码器30可对未经分割CU执行重建构操作。为对未经分割CU执行重建构操作,视频解码器30可对CU的每一TU执行重建构操作。通过对CU的每一TU执行重建构操作,视频解码器30可重建构与CU相关联的残余像素块。
作为对CU的TU执行重建构操作的部分,逆量化单元154可逆量化(即,解量化)与TU相关联的系数块。逆量化单元154可使用与TU的CU相关联的QP值来确定量化程度,且同样地确定逆量化单元154将应用的逆量化程度。
在逆量化单元154逆量化系数块之后,逆变换处理单元156可将一或多个逆变换应用于系数块,以便产生与TU相关联的残余样本块。举例来说,逆变换处理单元156可将逆DCT、逆整数变换、逆卡忽南-拉维(Karhunen-Loeve)变换(KLT)、逆旋转变换、逆方向性变换或另一逆变换应用于系数块。
如果PU使用帧内预测编码,则帧内预测处理单元164可执行帧内预测以产生PU的预测性样本块。帧内预测处理单元164可使用帧内预测模式以基于空间相邻PU的像素块来产生PU的预测性像素块。帧内预测处理单元164可基于从位流剖析的一或多个语法元素来确定PU的帧内预测模式。
运动补偿单元162可基于从位流提取的语法元素来建构第一参考图片列表(列表0)及第二参考图片列表(列表1)。此外,如果PU使用帧间预测编码,则熵解码单元150可提取PU的运动信息。运动补偿单元162可基于PU的运动信息来确定PU的一或多个参考块。运动补偿单元162可基于PU的一或多个参考块来产生PU的预测性像素块。
重建构单元158可在适用时使用与CU的TU相关联的残余像素块及CU的PU的预测性像素块(即,帧内预测数据或帧间预测数据)以重建构CU的像素块。特定来说,重建构单元158可将残余像素块的样本加到预测性像素块的对应样本以重建构CU的像素块。
滤波器单元159可执行解块操作以减少与CTB的CU的像素块相关联的方块效应假影。另外,滤波器单元159可基于从位流剖析的SAO语法元素来修改CTB的像素块。举例来说,滤波器单元159可基于CTB的SAO语法元素来确定值,且将所确定值加到CTB的经重建构像素块中的样本。通过修改图片的CTB的像素块中的至少一些,滤波器单元159可基于SAO语法元素来修改视频数据的经重建构图片。
视频解码器30可将CU的像素块存储于经解码图片缓冲器160中。经解码图片缓冲器160可提供参考图片以用于后续运动补偿、帧内预测及显示装置(例如,图1的显示装置32)上的呈现。举例来说,视频解码器30可基于经解码图片缓冲器160中的像素块来对其它CU的PU执行帧内预测操作或帧间预测操作。
根据本发明的技术,预测处理单元152确定帧间候选预测模式的集合。以此方式,视频解码器30是视频解码器的实例,所述视频解码器根据本发明的技术经配置以根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;解析码流,获得第一指示信息;根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;根据所述预测模式,确定所述待处理图像单元的预测图像。
图9是根据本发明的一或多种技术的用于解码视频数据的视频解码器的实例操作400的流程图。图9是作为实例而提供。在其它实例中,可使用比图9的实例中所展示的步骤多、少的步骤或与其不同的步骤来实施本发明的技术。根据图9的实例方法,视频解码器30执行如下步骤:
S410,根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式;
具体的,如图4所示,块A、B、C、D、E为当前待编码块的相邻已重构块,分别位于待编码块的上、左、右上、左下和左上的位置,可以通过相邻重构块的编码信息,确定当前待编码块的候选预测模式集合是否存在仿射合并模式。
应理解,本发明实施例中的图4示例性的给出了待编码块的相邻已重构块的数量和位置,所述相邻已重构块的数量可以多于5个或者少于5个,不做限定。
在第一种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块;当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,解析所述码流中的第二指示信息,当所述第二指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,当所述第二指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;否则,所述候选预测模式集合不包含所述仿射合并模式。
在第二种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块;如果没有,则该待编码块的候选预测模式集合不包含仿射合并模式;如果有,则该待编码块的候选预测模式集合包含仿射合并模式。
在第三种可实现方式中,所述的相邻重构块包含多种仿射模式,不妨设包含第一仿射模式或第二仿射模式,对应的,所述仿射合并模式,包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,分别统计所述相邻重构块中第一仿射模式、第二仿射模式和非仿射模式的数量,当第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;当非仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
第三种可实现方式,还可以为:当第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;当非仿射模式最多时,统计第一仿射模式和第二仿射模式哪一个次多,当第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;当第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
在第四种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;如果任一条件不满足,则该待编码块的候选预测模式集合不包含仿射合并模式;如果两个条件都满足,解析所述码流中的第三指示信息,当所述第三指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,当所述第三指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;否则,所述候选预测模式集合不包含所述仿射合并模式。
应理解,所述判断条件(2)在本发明实施例中为同时满足仿射模式的相邻块的宽小于待编码块的宽,仿射模式的相邻块的高小于待编码块的高,在其它的实施例中,该判断条件还可以为:仿射模式的相邻块的宽小于待编码块的宽或者仿射模式的相邻块的高小于待编码块的高,不做限定。
在第五种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;如果任一条件不满足,则该待编码块的候选预测模式集合不包含仿射合并模式;如果两个条件都满足,则该待编码块的候选预测模式集合包含仿射合并模式。
应理解,在本发明的实施例中,采用了相邻重构块的预测类型和尺寸作为判断当前待编码块的候选预测模式集合的根据,还可以采用相邻重构块解析获得的属性信息来判断,和编码端相对应即可,在这里不做限定。
还应理解,在本发明实施例的各种可实现方式中,示例性的,比如第二种可实现方式中,判断所述的相邻重构块中是否存在预测类型是仿射预测的块,也可以采用如下的判断准则,示例性的,至少两个相邻块的预测类型是仿射模式时,该待编码块的候选预测模式集合包含仿射合并模式,否则,该待编码块的候选预测模式集合不包含仿射合并模式。相邻块的预测类型是仿射模式的个数也可以是至少三个,或者至少四个,和编码端相对应即可,不做限定。
还应理解,在本发明实施例的各种可实现方式中,示例性的,比如第五种可实现方式中,判断两个条件:(1)所述的相邻重构块中是否存在预测类型是仿射模式的块;(2)仿射模式的相邻块的宽和高是否小于待编码块的宽和高;其中,第二个判断条件,示例性的,也可以为仿射模式的相邻块的宽和高是否小于待编码块的宽和高的1/2或者1/3,1/4,和编码端相对应即可,不做限定。
还应理解,在本发明实施例中对指示信息置0或者置1是和编码端相对应的。
S420,解析码流中的第一指示信息;
所述第一指示信息表示的是待解码块的预测模式的索引信息,这一步骤是和编码端步骤S240相对应的。
S430,根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
在不同的候选预测模式集合对应着不同的预测模式的列表,通过在S420中获得的索引信息,查找在S410中确定的候选预测模式集合对应的预测模式的列表,就可以找到待解码块的预测模式。
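按索引在候选预测模式列表中查表的过程可示意如下(列表内容与索引分配为说明用的假设;实际列表构造须与编码端完全一致,否则同一索引会解出不同模式):

```python
def decode_mode(candidate_set_has_affine_merge, index):
    """根据S410确定的候选集合构建模式列表, 再按第一指示信息(索引)查表。"""
    mode_list = ['skip', 'merge', 'inter', 'intra']   # 示例性平动模式列表
    if candidate_set_has_affine_merge:
        mode_list.insert(1, 'affine_merge')           # 集合不同, 列表与索引含义亦不同
    return mode_list[index]

mode = decode_mode(True, 1)
```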
S440,根据所述预测模式,确定所述待处理图像单元的预测图像;
具体方法和S230类似,不再赘述。
图10是根据本发明的一或多种技术的用于解码视频数据的另一视频解码器60的实例框图。
视频解码器60包括:第一确定模块61,解析模块62,第二确定模块63,第三确定模块64。
第一确定模块61用于执行S410根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式;
解析模块62用于执行S420解析码流中的第一指示信息;
第二确定模块63用于执行S430根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
第三确定模块64用于执行S440根据所述预测模式,确定所述待处理图像单元的预测图像。
由于相邻块之间的运动信息具有相关性,当前块和相邻块存在很大的可能性具有相同或相似的预测模式,本发明实施例通过判断相邻块的信息,来推导当前块的预测模式信息,减少了编码预测模式的码率,提高了编码效率。
图11是根据本发明的一或多种技术的用于解码视频数据的视频解码器的实例操作500的流程图。图11是作为实例而提供。在其它实例中,可使用比图11的实例中所展示的步骤多、少的步骤或与其不同的步骤来实施本发明的技术。根据图11的实例方法,视频解码器30执行如下步骤:
S510,解析码流中的第一指示信息;
所述第一指示信息表示的是第一待处理图像区域的候选模式集合是否包括仿射运动模型,这一步骤和编码端的步骤S310相对应。
S520,根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;
当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用所述候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;其中,所述第一待处理图像区域可以是图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一,对应的,所述第一指示信息编码于图像帧组的头部,比如,视频参数集(VPS),序列参数集(SPS),附加增强信息(SEI),图像帧头部,比如图像参数集(PPS),图像分片集的头部,图像条带集的头部,图像分片的头部,比如图像分片头(tile header),图像条带头(slice header),图像编码单元集合的头部、图像编码单元的头部。
应理解,该步骤对于第一待处理图像区域的确定,可以是预先配置的,也可以在编码的过程中自适应确定,第一待处理图像区域范围的表示,可以通过编解码端的协议获知,也可以通过码流从编码端接收第一待处理图像区域的范围,和编码端相对应即可,不做限定。
还应理解,在本发明实施例中对指示信息置0或者置1是示例性的,和编码端相对应即可。
S530,解析所述码流中的第二指示信息;
所述第二指示信息表示所述第一待处理图像区域中待处理块的预测模式,这一步骤和编码端的步骤S340相对应。
S540,根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式;
具体方法和S320类似,不再赘述。
S550,根据所述预测模式,确定所述待处理图像单元的预测图像;
具体方法和S330类似,不再赘述。
图12是根据本发明的一或多种技术的用于解码视频数据的另一视频解码器70的实例框图。
视频解码器70包括:第一解析模块71,第一确定模块72,第二解析模块73,第二确定模块74,第三确定模块75。
第一解析模块71用于执行S510解析码流中的第一指示信息;
第一确定模块72用于执行S520根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;
第二解析模块73用于执行S530解析所述码流中的第二指示信息;
第二确定模块74用于执行S540根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式;
第三确定模块75用于执行S550根据所述预测模式,确定所述待处理图像单元的预测图像。
由于相邻块之间的运动信息具有相关性,在同一区域内,很大可能性,仅存在平移运动而不存在仿射运动,本发明实施例通过设置区域级的候选预测模式集合选择标志,避免了编码冗余模式花费的码率,提高了编码效率。
在一或多个实例中,所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件实施,则功能可作为一或多个指令或代码而存储于计算机可读媒体上或经由计算机可读媒体而发射,且通过基于硬件的处理单元执行。计算机可读媒体可包含计算机可读存储媒体(其对应于例如数据存储媒体等有形媒体)或通信媒体,通信媒体包含(例如)根据通信协议促进计算机程序从一处传送到另一处的任何媒体。以此方式,计算机可读媒体大体上可对应于(1)非暂时性的有形计算机可读存储媒体,或(2)例如信号或载波等通信媒体。数据存储媒体可为可由一或多个计算机或一或多个处理器存取以检索指令、代码及/或数据结构以用于实施本发明中所描述的技术的任何可用媒体。计算机程序产品可包含计算机可读媒体。
通过实例而非限制,某些计算机可读存储媒体可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储器、磁盘存储器或其它磁性存储装置、快闪存储器,或可用以存储呈指令或数据结构的形式的所要程序代码且可由计算机存取的任何其它媒体。而且,任何连接可适当地称为计算机可读媒体。举例来说,如果使用同轴电缆、光缆、双绞线、数字用户线(DSL)或无线技术(例如,红外线、无线电及微波)而从网站、服务器或其它远程源发射指令,则同轴电缆、光缆、双绞线、DSL或无线技术(例如,红外线、无线电及微波)包含于媒体的定义中。然而,应理解,计算机可读存储媒体及数据存储媒体不包含连接、载波、信号或其它暂时性媒体,而是有关非暂时性有形存储媒体。如本文中所使用,磁盘及光盘包含压缩光盘(CD)、激光光盘、光学光盘、数字影音光盘(DVD)、软性磁盘及蓝光光盘,其中磁盘通常以磁性方式复制数据,而光盘通过激光以光学方式复制数据。以上各物的组合还应包含于计算机可读媒体的范围内。
可由例如一或多个数字信号处理器(DSP)、通用微处理器、专用集成电路(ASIC)、现场可编程逻辑阵列(FPGA)或其它等效集成或离散逻辑电路等一或多个处理器来执行指令。因此,如本文中所使用的术语“处理器”可指代前述结构或适于实施本文中所描述的技术的任何其它结构中的任一者。另外,在一些方面中,可将本文中所描述的功能性提供于经配置以用于编码及解码的专用硬件及/或软件模块内,或并入于组合式编解码器中。而且,所述技术可完全实施于一或多个电路或逻辑元件中。
本发明的技术可以广泛多种装置或设备来实施,所述装置或设备包含无线手持机、集成电路(IC)或IC集合(例如,芯片组)。在本发明中描述各种组件、模块或单元以强调经配置以执行所揭示技术的装置的功能方面,但未必要求通过不同硬件单元来实现。确切地说,如上文所描述,各种单元可组合于编解码器硬件单元中,或通过交互操作性硬件单元(包含如上文所描述的一或多个处理器)的集合结合合适软件及/或固件来提供。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本发明的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。
在本发明的各种实施例中,应理解,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本发明实施例的实施过程构成任何限定。
另外,本文中术语“系统”和“网络”在本文中常可互换使用。应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
在本申请所提供的实施例中,应理解,“与A相应的B”表示B与A相关联,根据A可以确定B。但还应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以所述权利要求的保护范围为准。

Claims (40)

  1. 一种预测图像解码方法,包括:
    根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;
    解析码流中的第一指示信息;
    根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
    根据所述预测模式,确定所述待处理图像单元的预测图像。
  2. 根据权利要求1所述的方法,其特征在于,所述待处理图像单元的相邻图像单元至少包括所述待处理图像单元的上、左、右上、左下和左上的相邻图像单元。
  3. 根据权利要求1或2所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,解析所述码流中的第二指示信息,
    当所述第二指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,
    当所述第二指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  4. 根据权利要求1或2所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  5. 根据权利要求1或2所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
  6. 根据权利要求1或2所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
  7. 根据权利要求1或2所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,解析所述码流中的第三指示信息,
    当所述第三指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,
    当所述第三指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  8. 根据权利要求1或2所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  9. 一种预测图像解码方法,包括:
    解析码流中的第一指示信息;
    根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;
    当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;
    当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;
    解析所述码流中的第二指示信息;
    根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;
    根据所述预测模式,确定所述待处理图像单元的预测图像。
  10. 根据权利要求9所述的方法,其特征在于,所述第一待处理图像区域,包括:图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一。
  11. 一种预测图像编码方法,包括:
    根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;
    从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
    根据所述预测模式,确定所述待处理图像单元的预测图像;
    将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
  12. 根据权利要求11所述的方法,其特征在于,所述待处理图像单元的相邻图像单元至少包括所述待处理图像单元的上、左、右上、左下和左上的相邻图像单元。
  13. 根据权利要求11或12所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述候选预测模式集合包含所述仿射合并模式时,设置第二指示信息为1,将所述第二指示信息编入所述码流;
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述候选预测模式集合不包含所述仿射合并模式时,设置所述第二指示信息为0,将所述第二指示信息编入所述码流;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  14. 根据权利要求11或12所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  15. 根据权利要求11或12所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
  16. 根据权利要求11或12所述的方法,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
  17. 根据权利要求11或12所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高,且所述候选预测模式集合包含所述仿射合并模式时,设置第三指示信息为1,将所述第三指示信息编入所述码流;
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高,且所述候选预测模式集合不包含所述仿射合并模式时,设置所述第三指示信息为0,将所述第三指示信息编入所述码流;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  18. 根据权利要求11或12所述的方法,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,包括:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  19. 一种预测图像编码方法,包括:
    当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;
    当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;
    从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;
    根据所述预测模式,确定所述待处理图像单元的预测图像;
    将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
  20. 根据权利要求19所述的方法,其特征在于,所述第一待处理图像区域,包括:图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一。
  21. 一种预测图像解码装置,包括:
    第一确定模块,用于根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;
    解析模块,用于解析码流中的第一指示信息;
    第二确定模块,用于根据所述第一指示信息,从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
    第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像。
  22. 根据权利要求21所述的装置,其特征在于,所述待处理图像单元的相邻图像单元至少包括所述待处理图像单元的上、左、右上、左下和左上的相邻图像单元。
  23. 根据权利要求21或22所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,解析所述码流中的第二指示信息,
    当所述第二指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,
    当所述第二指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  24. 根据权利要求21或22所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  25. 根据权利要求21或22所述的装置,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述第一确定模块,具体用于:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
  26. 根据权利要求21或22所述的装置,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述第一确定模块,具体用于:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
  27. 根据权利要求21或22所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,解析所述码流中的第三指示信息,
    当所述第三指示信息为1时,所述候选预测模式集合包含所述仿射合并模式,
    当所述第三指示信息为0时,所述候选预测模式集合不包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  28. 根据权利要求21或22所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  29. 一种预测图像解码装置,包括:
    第一解析模块,用于解析码流中的第一指示信息;
    第一确定模块,用于根据所述第一指示信息,确定第一待处理图像区域的候选模式集合;
    当所述第一指示信息为0时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合,所述平动模式表示使用平动模型获得预测图像的预测模式;
    当所述第一指示信息为1时,所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合,所述仿射模式表示使用仿射模型获得预测图像的预测模式;
    第二解析模块,用于解析所述码流中的第二指示信息;
    第二确定模块,用于根据所述第二指示信息,从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;
    第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像。
  30. 根据权利要求29所述的装置,其特征在于,所述第一待处理图像区域,包括:图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一。
  31. 一种预测图像编码装置,包括:
    第一确定模块,用于根据与待处理图像单元相邻的相邻图像单元的信息,确定所述待处理图像单元的候选预测模式集合是否包含仿射合并模式,所述仿射合并模式表示所述待处理图像单元和所述待处理图像单元的相邻图像单元使用相同的仿射模型获得各自的预测图像;
    第二确定模块,用于从所述候选预测模式集合中,确定所述待处理图像单元的预测模式;
    第三确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像;
    编码模块,用于将第一指示信息编入码流,所述第一指示信息表示所述预测模式。
  32. 根据权利要求31所述的装置,其特征在于,所述待处理图像单元的相邻图像单元至少包括所述待处理图像单元的上、左、右上、左下和左上的相邻图像单元。
  33. 根据权利要求31或32所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述候选预测模式集合包含所述仿射合并模式时,设置第二指示信息为1,将所述第二指示信息编入所述码流;
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述候选预测模式集合不包含所述仿射合并模式时,设置所述第二指示信息为0,将所述第二指示信息编入所述码流;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  34. 根据权利要求31或32所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  35. 根据权利要求31或32所述的装置,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述第一确定模块,具体用于:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多时,所述候选预测模式集合不包含所述仿射合并模式。
  36. 根据权利要求31或32所述的装置,其特征在于,所述相邻图像单元的信息为所述相邻图像单元的预测模式,所述预测模式至少包括使用第一仿射模型获得预测图像的第一仿射模式或使用第二仿射模型获得预测图像的第二仿射模式,对应的,所述仿射合并模式至少包括合并所述第一仿射模式的第一仿射合并模式或合并所述第二仿射模式的第二仿射合并模式,对应的,所述第一确定模块,具体用于:
    当所述相邻预测单元的预测模式中,所述第一仿射模式最多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,所述第二仿射模式最多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第一仿射模式次多时,所述候选预测模式集合包含所述第一仿射合并模式,且不包含所述第二仿射合并模式;
    当所述相邻预测单元的预测模式中,非所述仿射模式最多且所述第二仿射模式次多时,所述候选预测模式集合包含所述第二仿射合并模式,且不包含所述第一仿射合并模式。
  37. 根据权利要求31或32所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高,且所述候选预测模式集合包含所述仿射合并模式时,设置第三指示信息为1,将所述第三指示信息编入所述码流;
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高,且所述候选预测模式集合不包含所述仿射合并模式时,设置所述第三指示信息为0,将所述第三指示信息编入所述码流;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  38. 根据权利要求31或32所述的装置,其特征在于,所述信息为所述相邻图像单元的预测模式和尺寸,对应的,所述第一确定模块,具体用于:
    当至少一个所述相邻图像单元的预测模式为使用仿射模型获得预测图像,且所述至少一个所述相邻图像单元的宽和高分别小于所述待处理图像单元的宽和高时,所述候选预测模式集合包含所述仿射合并模式;
    否则,所述候选预测模式集合不包含所述仿射合并模式。
  39. 一种预测图像编码装置,包括:
    第一编码模块,用于当第一待处理图像区域的候选模式集合采用候选平动模式集合时,设置第一指示信息为0,将所述第一指示信息编入码流,所述平动模式表示使用平动模型获得预测图像的预测模式;
    当所述第一待处理图像区域的候选模式集合采用候选平动模式集合和候选仿射模式集合时,设置所述第一指示信息为1,将所述第一指示信息编入码流,所述仿射模式表示使用仿射模型获得预测图像的预测模式;
    第一确定模块,用于从所述第一待处理图像区域的候选预测模式集合中,确定待处理图像单元的预测模式,所述待处理图像单元属于所述第一待处理图像区域;
    第二确定模块,用于根据所述预测模式,确定所述待处理图像单元的预测图像;
    第二编码模块,用于将第二指示信息编入所述码流,所述第二指示信息表示所述预测模式。
  40. 根据权利要求39所述的装置,其特征在于,所述第一待处理图像区域,包括:图像帧组、图像帧、图像分片集、图像条带集、图像分片、图像条带、图像编码单元集合、图像编码单元其中之一。
PCT/CN2016/098464 2015-09-29 2016-09-08 图像预测的方法及装置 WO2017054630A1 (zh)

Priority Applications (13)

Application Number Priority Date Filing Date Title
MX2018003764A MX2018003764A (es) 2015-09-29 2016-09-08 Metodo y aparato de prediccion de imagen.
BR112018006271-5A BR112018006271B1 (pt) 2015-09-29 2016-09-08 método e aparelho para decodificar uma imagem predita
MYPI2018700879A MY196371A (en) 2015-09-29 2016-09-08 Image Prediction Method and Apparatus
SG11201801863QA SG11201801863QA (en) 2015-09-29 2016-09-08 Image prediction method and apparatus
AU2016333221A AU2016333221B2 (en) 2015-09-29 2016-09-08 Image prediction method and apparatus
RU2018114921A RU2697726C1 (ru) 2015-09-29 2016-09-08 Способ и устройство предсказания изображений
JP2018511731A JP6669859B2 (ja) 2015-09-29 2016-09-08 画像予測方法および装置
KR1020187008717A KR102114764B1 (ko) 2015-09-29 2016-09-08 이미지 예측 방법 및 장치
KR1020207014280A KR102240141B1 (ko) 2015-09-29 2016-09-08 이미지 예측 방법 및 장치
EP16850254.0A EP3331243B1 (en) 2015-09-29 2016-09-08 Image prediction method and device
ZA2018/01541A ZA201801541B (en) 2015-09-29 2018-03-06 Image prediction method and apparatus
US15/923,434 US11323736B2 (en) 2015-09-29 2018-03-16 Image prediction method and apparatus
US17/540,928 US20220094969A1 (en) 2015-09-29 2021-12-02 Image prediction method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510632589.1 2015-09-29
CN201510632589.1A CN106559669B (zh) 2015-09-29 2015-09-29 预测图像编解码方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/923,434 Continuation US11323736B2 (en) 2015-09-29 2018-03-16 Image prediction method and apparatus

Publications (1)

Publication Number Publication Date
WO2017054630A1 true WO2017054630A1 (zh) 2017-04-06

Family

ID=58414556

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/098464 WO2017054630A1 (zh) 2015-09-29 2016-09-08 图像预测的方法及装置

Country Status (13)

Country Link
US (2) US11323736B2 (zh)
EP (1) EP3331243B1 (zh)
JP (4) JP6669859B2 (zh)
KR (2) KR102114764B1 (zh)
CN (3) CN108965871B (zh)
AU (1) AU2016333221B2 (zh)
BR (1) BR112018006271B1 (zh)
MX (1) MX2018003764A (zh)
MY (1) MY196371A (zh)
RU (1) RU2697726C1 (zh)
SG (1) SG11201801863QA (zh)
WO (1) WO2017054630A1 (zh)
ZA (1) ZA201801541B (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111107373A (zh) * 2018-10-29 2020-05-05 华为技术有限公司 基于仿射预测模式的帧间预测的方法及相关装置
CN112822514A (zh) * 2020-12-30 2021-05-18 北京大学 基于依赖关系的视频流分组传输方法、系统、终端及介质
CN113170210A (zh) * 2018-10-10 2021-07-23 交互数字Vc控股公司 视频编码和解码中的仿射模式信令
US11336907B2 (en) 2018-07-16 2022-05-17 Huawei Technologies Co., Ltd. Video encoder, video decoder, and corresponding encoding and decoding methods
RU2772813C1 (ru) * 2018-07-16 2022-05-26 Хуавей Текнолоджиз Ко., Лтд. Видеокодер, видеодекодер и соответствующие способы кодирования и декодирования
US11438578B2 (en) 2018-10-29 2022-09-06 Huawei Technologies Co., Ltd. Video picture prediction method and apparatus

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108965871B (zh) * 2015-09-29 2023-11-10 Huawei Technologies Co., Ltd. Image prediction method and apparatus
EP4072134A1 (en) * 2016-12-27 2022-10-12 Samsung Electronics Co., Ltd. Video affine mode encoding method and device therefor, and decoding method and device therefor
US10630994B2 (en) * 2017-06-28 2020-04-21 Agora Lab, Inc. Specific operation prediction in video compression
US10979718B2 (en) * 2017-09-01 2021-04-13 Apple Inc. Machine learning video processing systems and methods
US10609384B2 (en) * 2017-09-21 2020-03-31 Futurewei Technologies, Inc. Restriction on sub-block size derivation for affine inter prediction
EP3468195A1 (en) * 2017-10-05 2019-04-10 Thomson Licensing Improved predictor candidates for motion compensation
US20190208211A1 (en) * 2018-01-04 2019-07-04 Qualcomm Incorporated Generated affine motion vectors
US20190222834A1 (en) * 2018-01-18 2019-07-18 Mediatek Inc. Variable affine merge candidates for video coding
EP3788787A1 (en) 2018-06-05 2021-03-10 Beijing Bytedance Network Technology Co. Ltd. Interaction between ibc and atmvp
WO2019244117A1 (en) 2018-06-21 2019-12-26 Beijing Bytedance Network Technology Co., Ltd. Unified constrains for the merge affine mode and the non-merge affine mode
CN113115046A (zh) 2018-06-21 2021-07-13 Beijing Bytedance Network Technology Co., Ltd. Component-dependent sub-block dividing
KR20210024487A (ko) * 2018-07-01 2021-03-05 Beijing Bytedance Network Technology Co., Ltd. Efficient affine merge motion vector derivation
CN110677645B (zh) * 2018-07-02 2022-06-10 Huawei Technologies Co., Ltd. Image prediction method and apparatus
US10516885B1 (en) 2018-07-11 2019-12-24 Tencent America LLC Method and apparatus for video coding
CN117499672A 2024-02-02 Huawei Technologies Co., Ltd. Video picture prediction method and apparatus
CN117241039A (zh) * 2018-08-28 2023-12-15 Huawei Technologies Co., Ltd. Inter prediction method and apparatus, and video encoder and video decoder
JP7225375B2 (ja) * 2018-08-30 2023-02-20 Huawei Technologies Co., Ltd. Encoding device, decoding device, and corresponding methods using palette coding
BR112021004667A2 (pt) * 2018-09-12 2021-06-01 Huawei Technologies Co., Ltd. codificador de vídeo, decodificador de vídeo e métodos correspondentes
EP3837835A4 (en) 2018-09-18 2021-06-23 Huawei Technologies Co., Ltd. CODING PROCESS, DEVICE AND SYSTEM
PT3847818T (pt) 2018-09-18 2024-03-05 Huawei Tech Co Ltd Codificador de vídeo, um descodificador de vídeo e métodos correspondentes
GB2579763B (en) 2018-09-21 2021-06-09 Canon Kk Video coding and decoding
GB2577318B (en) * 2018-09-21 2021-03-10 Canon Kk Video coding and decoding
GB2597616B (en) * 2018-09-21 2023-01-18 Canon Kk Video coding and decoding
TWI818086B (zh) 2018-09-24 2023-10-11 大陸商北京字節跳動網絡技術有限公司 擴展Merge預測
WO2020070730A2 (en) * 2018-10-06 2020-04-09 Beijing Bytedance Network Technology Co., Ltd. Size restriction based on affine motion information
GB2578150C (en) 2018-10-18 2022-05-18 Canon Kk Video coding and decoding
GB2595054B (en) 2018-10-18 2022-07-06 Canon Kk Video coding and decoding
CN112997495B (zh) 2018-11-10 2024-02-20 Beijing Bytedance Network Technology Co., Ltd. Rounding in current picture referencing
CN112997487A (zh) 2018-11-15 2021-06-18 Beijing Bytedance Network Technology Co., Ltd. Harmonization between affine mode and other inter coding tools
CN113016185B (zh) 2018-11-17 2024-04-05 Beijing Bytedance Network Technology Co., Ltd. Control of Merge in motion vector differential mode
CN111263147B (zh) 2018-12-03 2023-02-14 Huawei Technologies Co., Ltd. Inter prediction method and related apparatus
EP3868107A4 (en) * 2018-12-21 2021-12-15 Beijing Bytedance Network Technology Co. Ltd. MOTION VECTOR ACCURACY IN INTERACTING WITH MOTION VECTOR DIFFERENCE MODE
CN111526362B (zh) * 2019-02-01 2023-12-29 Huawei Technologies Co., Ltd. Inter prediction method and apparatus
WO2020181428A1 (zh) 2019-03-08 2020-09-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Prediction method, encoder, decoder, and computer storage medium
WO2020181471A1 (zh) * 2019-03-11 2020-09-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Intra prediction method and apparatus, and computer storage medium
CN117692660A 2024-03-12 LG Electronics Inc. Image encoding/decoding method and method for transmitting data
WO2020186882A1 (zh) * 2019-03-18 2020-09-24 Huawei Technologies Co., Ltd. Processing method and apparatus based on triangle prediction unit mode
CN113853793B (zh) * 2019-05-21 2023-12-19 Beijing Bytedance Network Technology Co., Ltd. Syntax signaling for optical flow-based inter coding
CN113347434B (zh) * 2019-06-21 2022-03-29 Hangzhou Hikvision Digital Technology Co., Ltd. Prediction mode decoding and encoding method and apparatus
US11132780B2 (en) 2020-02-14 2021-09-28 Huawei Technologies Co., Ltd. Target detection method, training method, electronic device, and computer-readable medium
CN112801906B (zh) * 2021-02-03 2023-02-21 Fuzhou University Recurrent neural network-based iterative image denoising method
US20230412794A1 (en) * 2022-06-17 2023-12-21 Tencent America LLC Affine merge mode with translational motion vectors

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102377992A (zh) * 2010-08-06 2012-03-14 Huawei Technologies Co., Ltd. Method and apparatus for obtaining a motion vector predictor
CN102934440A (zh) * 2010-05-26 2013-02-13 LG Electronics Inc. Method and device for processing a video signal
EP2645720A2 (en) * 2010-11-23 2013-10-02 LG Electronics Inc. Method for encoding and decoding images, and device using same
CN104363451A (zh) * 2014-10-27 2015-02-18 Huawei Technologies Co., Ltd. Image prediction method and related apparatus
CN104539966A (zh) * 2014-09-30 2015-04-22 Huawei Technologies Co., Ltd. Image prediction method and related apparatus
CN104661031A (zh) * 2015-02-16 2015-05-27 Huawei Technologies Co., Ltd. Method, encoding device, and decoding device for video image encoding and decoding

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3654664B2 (ja) 1994-08-23 2005-06-02 Sharp Corporation Image encoding device and image decoding device
JP3681342B2 (ja) * 2000-05-24 2005-08-10 Samsung Electronics Co., Ltd. Video coding method
JP4245587B2 (ja) * 2005-06-22 2009-03-25 Sharp Corporation Motion compensation prediction method
US9258519B2 (en) 2005-09-27 2016-02-09 Qualcomm Incorporated Encoder assisted frame rate up conversion using various motion models
JP2012080151A (ja) 2009-02-09 2012-04-19 Toshiba Corp Method and apparatus for video encoding and video decoding using geometric-transform motion compensation prediction
US20100246675A1 (en) * 2009-03-30 2010-09-30 Sony Corporation Method and apparatus for intra-prediction in a video encoder
WO2011013253A1 (ja) * 2009-07-31 2011-02-03 Toshiba Corporation Prediction signal generation device, video encoding device, and video decoding device using geometric-transform motion compensation prediction
KR101611437B1 (ko) 2009-10-28 2016-04-26 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding an image with reference to a plurality of frames
US8179446B2 (en) * 2010-01-18 2012-05-15 Texas Instruments Incorporated Video stabilization and reduction of rolling shutter distortion
CN107071487B (zh) 2010-11-04 2020-09-15 GE Video Compression, LLC Picture coding supporting block merging and skip mode
US9319716B2 (en) * 2011-01-27 2016-04-19 Qualcomm Incorporated Performing motion vector prediction for video coding
CA2830242C (en) * 2011-03-21 2016-11-22 Qualcomm Incorporated Bi-predictive merge mode based on uni-predictive neighbors in video coding
US9282338B2 (en) * 2011-06-20 2016-03-08 Qualcomm Incorporated Unified merge mode and adaptive motion vector prediction mode candidates selection
CN110139108B (zh) * 2011-11-11 2023-07-18 GE Video Compression, LLC Apparatus and method for encoding a multi-view signal into a multi-view data stream
US9420285B2 (en) * 2012-04-12 2016-08-16 Qualcomm Incorporated Inter-layer mode derivation for prediction in scalable video coding
EP2683165B1 (en) * 2012-07-04 2015-10-14 Thomson Licensing Method for coding and decoding a block of pixels from a motion model
WO2016008157A1 (en) * 2014-07-18 2016-01-21 Mediatek Singapore Pte. Ltd. Methods for motion compensation using high order motion model
WO2016137149A1 (ko) * 2015-02-24 2016-09-01 LG Electronics Inc. Polygon unit-based image processing method and apparatus therefor
CN109005407B (zh) 2015-05-15 2023-09-01 Huawei Technologies Co., Ltd. Video image encoding and decoding method, encoding device, and decoding device
CN106331722B (zh) 2015-07-03 2019-04-26 Huawei Technologies Co., Ltd. Image prediction method and related device
CN107925758B (zh) * 2015-08-04 2022-01-25 LG Electronics Inc. Inter prediction method and device in video coding system
CN108965869B (zh) 2015-08-29 2023-09-12 Huawei Technologies Co., Ltd. Image prediction method and device
CN108965871B (zh) * 2015-09-29 2023-11-10 Huawei Technologies Co., Ltd. Image prediction method and apparatus


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HAN, HUANG ET AL.: "Affine Skip and Direct modes for efficient video coding", VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2012, 31 December 2012 (2012-12-31), pages 1 - 6, XP032309255 *
ZHANG, NA: "Research on MERGE Mode and Related Technologies of the Next Generation Video Coding Standard", CHINA MASTERS' THESES FULL-TEXT DATABASE (ELECTRONIC JOURNALS), 30 April 2014 (2014-04-30), pages 136 - 227, XP009503959 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11336907B2 (en) 2018-07-16 2022-05-17 Huawei Technologies Co., Ltd. Video encoder, video decoder, and corresponding encoding and decoding methods
RU2772813C1 (ru) * 2018-07-16 2022-05-26 Huawei Technologies Co., Ltd. Video encoder, video decoder, and corresponding encoding and decoding methods
CN113170210A (zh) * 2018-10-10 2021-07-23 InterDigital VC Holdings, Inc. Affine mode signaling in video encoding and decoding
CN111107373A (zh) * 2018-10-29 2020-05-05 Huawei Technologies Co., Ltd. Inter prediction method based on affine prediction mode and related apparatus
US11438578B2 (en) 2018-10-29 2022-09-06 Huawei Technologies Co., Ltd. Video picture prediction method and apparatus
CN111107373B (zh) * 2018-10-29 2023-11-03 Huawei Technologies Co., Ltd. Inter prediction method based on affine prediction mode and related apparatus
CN112822514A (zh) * 2020-12-30 2021-05-18 Peking University Dependency-based video stream packet transmission method, system, terminal, and medium
CN112822514B (zh) * 2020-12-30 2022-06-28 Peking University Dependency-based video stream packet transmission method, system, terminal, and medium

Also Published As

Publication number Publication date
JP6669859B2 (ja) 2020-03-18
JP2024020203A (ja) 2024-02-14
KR20180043830A (ko) 2018-04-30
EP3331243A4 (en) 2018-07-18
AU2016333221B2 (en) 2019-10-03
CN106559669A (zh) 2017-04-05
RU2697726C1 (ru) 2019-08-19
CN108965871A (zh) 2018-12-07
BR112018006271B1 (pt) 2021-01-26
KR20200057120A (ko) 2020-05-25
AU2016333221A1 (en) 2018-03-29
JP6882560B2 (ja) 2021-06-02
CN109274974A (zh) 2019-01-25
JP7368414B2 (ja) 2023-10-24
EP3331243A1 (en) 2018-06-06
JP2020092457A (ja) 2020-06-11
ZA201801541B (en) 2019-08-28
MY196371A (en) 2023-03-27
MX2018003764A (es) 2018-07-06
JP2018533261A (ja) 2018-11-08
KR102240141B1 (ko) 2021-04-13
CN108965871B (zh) 2023-11-10
KR102114764B1 (ko) 2020-05-25
JP2021119709A (ja) 2021-08-12
SG11201801863QA (en) 2018-04-27
US11323736B2 (en) 2022-05-03
CN106559669B (zh) 2018-10-09
EP3331243B1 (en) 2021-04-21
US20180205965A1 (en) 2018-07-19
US20220094969A1 (en) 2022-03-24
CN109274974B (zh) 2022-02-11
BR112018006271A2 (zh) 2018-10-16
RU2019124468A (ru) 2019-09-03

Similar Documents

Publication Publication Date Title
JP7368414B2 (ja) Image prediction method and apparatus
CN110024406B (zh) Linear model prediction mode with sample accessing for video coding
US11146788B2 (en) Grouping palette bypass bins for video coding
TWI666918B (zh) Determining palette size, palette entries, and filtering of palette-coded blocks in video coding
KR102478411B1 (ko) Palette mode for subsampling format
TWI524744B (zh) Signaling of picture order count to timing information relations for video timing in video coding
JP2018142972A (ja) Restriction of prediction units in B slices to unidirectional inter prediction
TW201838415A (zh) Determining neighboring samples for bilateral filtering in video coding
TW201830964A (zh) Deriving bilateral filter information based on a prediction mode in video coding
JP2018524906A (ja) Reference picture list construction in intra block copy mode
JP2017519447A (ja) Intra block copy block vector signaling for video coding
TW201603563A (zh) Palette predictor signaling with run-length code for video coding
TW201517599A (zh) Intra motion compensation extensions
JP2017523685A (ja) Block vector coding for intra block copying
TW201334544A (zh) Determining boundary strength values for deblocking filtering in video coding
JP2017525316A (ja) Method for palette mode coding
TW202126040A (zh) Simplified palette predictor update for video coding
TW202133619A (zh) History-based motion vector prediction constraint for merge estimation region
JP2018511238A (ja) Fast rate-distortion optimized quantization
RU2804871C2 (ru) Image prediction method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16850254

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2018511731

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 11201801863Q

Country of ref document: SG

WWE Wipo information: entry into national phase

Ref document number: MX/A/2018/003764

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 20187008717

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2016333221

Country of ref document: AU

Date of ref document: 20160908

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112018006271

Country of ref document: BR

WWE Wipo information: entry into national phase

Ref document number: 2018114921

Country of ref document: RU

ENP Entry into the national phase

Ref document number: 112018006271

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20180328