WO2022193868A1 - Decoding method, encoding method, decoder, and encoder for unmatched pixels - Google Patents

Decoding method, encoding method, decoder, and encoder for unmatched pixels

Info

Publication number
WO2022193868A1
Authority
WO
WIPO (PCT)
Prior art keywords
unmatched
target
entropy
target image
pixels
Application number
PCT/CN2022/075554
Other languages
English (en)
French (fr)
Inventor
王英彬
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2022193868A1
Priority to US17/970,462 (published as US20230042484A1)

Classifications

    • H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/182: Adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/593: Predictive coding involving spatial prediction techniques
    • H04N19/70: Characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/82: Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • H04N19/96: Tree coding, e.g. quad-tree coding
    • H04N21/2343: Processing of video elementary streams, involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/4402: Processing of video elementary streams, involving reformatting operations of video signals for household redistribution, storage or real-time display

Definitions

  • the embodiments of the present application relate to the technical field of video or image processing, and more particularly, to a decoding method, an encoding method, a decoder, and an encoder for unmatched pixels.
  • Entropy coding is an important video compression technology.
  • Commonly used entropy coding methods include context-based adaptive binary arithmetic coding (CABAC) and context-based adaptive variable length coding (CAVLC).
  • In CABAC, binary data can be encoded in either the regular encoding mode or the bypass encoding mode (Bypass Coding Mode).
  • The bypass encoding mode does not assign a dedicated probability model to each binary symbol (bin); instead, each input bin is encoded directly by a simple bypass coder, which speeds up encoding and decoding as a whole.
  • The context model corresponding to a syntax element can be located by the context index increment (ctxIdxInc) and the context index start (ctxIdxStart).
  • After a bin is encoded, the context model needs to be updated according to the bin value; this is the adaptive process in encoding.
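  • As an illustration of the context location and adaptive update described above, the following is a minimal sketch; the probability representation and the shift-based update rule are simplifying assumptions, not the normative CABAC state tables of any standard.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Toy context model: probability of the bin being 1, in 1/256 units.
// Real CABAC uses standard-defined probability state tables; this
// representation and update rule are simplifying assumptions.
struct ContextModel { uint8_t probOne = 128; };  // start near p(1) = 0.5

// Locate the context model as described above: start index plus increment.
ContextModel& locateContext(std::vector<ContextModel>& models,
                            int ctxIdxStart, int ctxIdxInc) {
    return models[ctxIdxStart + ctxIdxInc];
}

// Adaptive update after coding one bin: move the estimate toward the
// observed bin value (exponential smoothing with a shift of 5).
void updateContext(ContextModel& ctx, int bin) {
    if (bin) ctx.probOne += (255 - ctx.probOne) >> 5;
    else     ctx.probOne -= ctx.probOne >> 5;
}

int main() {
    std::vector<ContextModel> models(16);
    ContextModel& ctx = locateContext(models, 4, 1);   // ctxIdx = 4 + 1 = 5
    for (int i = 0; i < 8; ++i) updateContext(ctx, 1); // a run of 1s
    std::printf("p(1) ~ %d/256\n", ctx.probOne);       // drifts above 128
}
```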
  • In string copy intra prediction, the Y, Cb, and Cr components of unmatched pixels are directly coded rather than derived from predicted values.
  • Each component of an unmatched pixel is encoded according to its bit depth. For example, if the bit depth of the current image is 10 bits, 3*10 binary symbols are required to encode one unmatched pixel, and each symbol is encoded in bypass mode, which results in excessive coding overhead for unmatched pixels.
  • The present application provides a decoding method, an encoding method, a decoder, and an encoder for unmatched pixels, which can improve coding flexibility and help balance the coding performance and coding overhead of unmatched pixels.
  • In a first aspect, the present application provides a decoding method for unmatched pixels, comprising:
  • a target image frame in a target sequence is divided into multiple image blocks;
  • a target image block among the multiple image blocks is obtained.
  • In a second aspect, the present application provides a method for encoding unmatched pixels, comprising:
  • the binary symbol string of the unmatched pixels is encoded by at least two entropy encoding methods to obtain the code stream of the target sequence.
  • an embodiment of the present application provides a decoder for executing the method in the first aspect or each of its implementations.
  • the decoder includes a functional unit for performing the method in the above-mentioned first aspect or each of its implementations.
  • an embodiment of the present application provides an encoder for executing the method in the second aspect or each of its implementations.
  • the encoder includes a functional unit for executing the method in the second aspect or each of its implementations.
  • an encoding and decoding device including:
  • a processor, adapted to execute computer instructions; and
  • a computer-readable storage medium storing computer instructions adapted to be loaded by a processor and perform the method as in any one of the above-described first to second aspects or implementations thereof.
  • an embodiment of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when the computer instructions are read and executed by a processor of a computer device, the computer device is made to execute the method in any one of the above-mentioned first to second aspects or implementations thereof.
  • an embodiment of the present application provides a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, causing the computer device to perform the method in any one of the above-mentioned first to second aspects or implementations thereof.
  • In the embodiments of the present application, the code stream of the target sequence is decoded by at least two entropy decoding methods, that is, by multiple entropy decoding methods, which can improve coding flexibility and helps balance the coding performance and coding overhead of unmatched pixels.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system provided by an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a video encoder provided by an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a decoding framework provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an intra-frame string replication provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of an encoding method for unmatched pixels provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a decoding method for an unmatched pixel provided by an embodiment of the present application.
  • FIG. 7 is a schematic block diagram of an encoder provided by an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of a decoder provided by an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of an encoding and decoding device provided by an embodiment of the present application.
  • the technology of the present application is not limited to any coding standard or technology.
  • the solutions provided by the embodiments of the present application can be applied to the technical field of digital video coding, for example, the field of image coding and decoding, the field of video coding and decoding, the field of hardware video coding and decoding, the field of dedicated circuit video coding and decoding, and the field of real-time video coding and decoding.
  • the solutions provided in the embodiments of the present application may be combined with the Audio Video coding Standard (AVS), the second-generation AVS standard (AVS2), or the third-generation AVS standard (AVS3).
  • The solutions may also be applied to other standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC, H.264/Advanced Video Coding), including its Scalable Video Coding (SVC) and Multi-View Video Coding (MVC) extensions, as well as H.265/High Efficiency Video Coding (HEVC) and H.266/Versatile Video Coding (VVC).
  • The solutions provided by the embodiments of the present application can be used to perform lossy compression on images, and can also be used to perform lossless compression on images.
  • The lossless compression may be visually lossless compression or mathematically lossless compression.
  • For ease of understanding, the video coding and decoding system involved in the embodiments of the present application is first introduced with reference to FIG. 1.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system 100 according to an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video encoding and decoding systems in the embodiments of the present application include, but are not limited to, those shown in FIG. 1 .
  • the video codec system 100 includes an encoding device 110 and a decoding device 120 .
  • the encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream encoded by the encoding device to obtain decoded video data.
  • the encoding device 110 can be understood as a device with a video encoding function
  • the decoding device 120 can be understood as a device with a video decoding function.
  • the encoding device 110 may transmit the encoded video data (eg, a code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120.
  • channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time.
  • encoding apparatus 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding apparatus 120 .
  • the communication medium includes a wireless communication medium, such as a radio frequency spectrum, optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
  • channel 130 includes a storage medium that can store video data encoded by encoding device 110 . Storage media include a variety of locally accessible data storage media such as optical discs, DVDs, flash memory, and the like.
  • the decoding apparatus 120 may obtain the encoded video data from the storage medium.
  • channel 130 may include a storage server that may store video data encoded by encoding device 110 .
  • the decoding device 120 may download the stored encoded video data from the storage server.
  • the storage server may store the encoded video data and may transmit the encoded video data to the decoding device 120, such as a web server (eg, for a website), a file transfer protocol (FTP) server, and the like.
  • encoding apparatus 110 includes video encoder 112 and output interface 113 .
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • The encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113.
  • The video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, or a computer graphics system for generating video data.
  • the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
  • Video data may include one or more pictures or a sequence of pictures.
  • The code stream contains the encoding information of an image or image sequence in the form of a bitstream.
  • the encoded information may include encoded image data and associated data.
  • the associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short), and other syntax structures.
  • An SPS may contain parameters that apply to one or more sequences.
  • a PPS may contain parameters that apply to one or more images.
  • a syntax structure refers to a set of zero or more syntax elements in a codestream arranged in a specified order.
  • the video encoder 112 directly transmits the encoded video data to the decoding device 120 via the output interface 113 .
  • the encoded video data may also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
  • decoding device 120 includes input interface 121 and video decoder 122 .
  • the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
  • the input interface 121 includes a receiver and/or a modem.
  • the input interface 121 may receive the encoded video data through the channel 130 .
  • the video decoder 122 is configured to decode the encoded video data, obtain the decoded video data, and transmit the decoded video data to the display device 123 .
  • the display device 123 displays the decoded video data.
  • the display device 123 may be integrated with the decoding apparatus 120 or external to the decoding apparatus 120 .
  • the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
  • FIG. 1 is only an example of the present application, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 .
  • The technology of the present application may also be applied to encoding-only or decoding-only scenarios.
  • FIG. 2 is a schematic block diagram of a video encoder 200 provided by an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images and can also be used to perform lossless compression on images.
  • the lossless compression may be visually lossless compression (visually lossless compression) or mathematically lossless compression (mathematically lossless compression).
  • the video encoder 200 can be applied to image data in luminance chrominance (YCbCr, YUV) format.
  • the video encoder 200 reads video data, and for each frame of image in the video data, divides one frame of image into several coding tree units (CTUs).
  • A CTU may also be referred to as a "tree block", "largest coding unit" (LCU), or "coding tree block" (CTB).
  • Each CTU may be associated with a block of pixels of equal size within the image.
  • Each pixel may correspond to one luminance (luma) sample and two chrominance (chrominance or chroma) samples.
  • each CTU may be associated with one block of luma samples and two blocks of chroma samples.
  • the size of one CTU is, for example, 128 ⁇ 128, 64 ⁇ 64, 32 ⁇ 32, and so on.
  • a CTU can be further divided into several coding units (Coding Unit, CU) for coding, and the CU can be a rectangular block or a square block.
  • A CU can be further divided into prediction units (PUs) and transform units (TUs), so that coding, prediction, and transformation are separated, and processing is more flexible.
  • the CTU is divided into CUs in a quadtree manner, and the CUs are divided into TUs and PUs in a quadtree manner.
  • the video encoder 200 may support various PU sizes. Assuming the size of a particular CU is 2Nx2N, video encoders and video decoders may support PU sizes of 2Nx2N or NxN for intra prediction, and support 2Nx2N, 2NxN, Nx2N, NxN or similar sized symmetric PUs for inter prediction. Video encoders and video decoders may also support 2NxnU, 2NxnD, nLx2N, and nRx2N asymmetric PUs for inter prediction.
  • the video encoder 200 may include:
  • the prediction unit 210, the residual unit 220, the transform/quantization unit 230, the inverse transform/quantization unit 240, the reconstruction unit 250, the loop filtering unit 260, the decoded image buffer 270, and the entropy encoding unit 280. It should be noted that the video encoder 200 may include more, fewer, or different functional components.
  • the prediction unit 210 includes an inter prediction unit 211 and an intra prediction unit 212 . Since there is a strong correlation between adjacent pixels in a frame of a video, the method of intra-frame prediction is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Due to the strong similarity between adjacent frames in the video, the inter-frame prediction method is used in the video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving the coding efficiency.
  • The inter-frame prediction unit 211 can be used for inter-frame prediction. Inter-frame prediction refers to image information of different frames: it uses motion information to find a reference block in a reference frame and generates a prediction block from the reference block, so as to eliminate temporal redundancy;
  • Frames used for inter-frame prediction may be P frames and/or B frames, where P frames refer to forward predicted frames, and B frames refer to bidirectional predicted frames.
  • the motion information includes the reference frame list where the reference frame is located, the reference frame index, and the motion vector.
  • The motion vector can have whole-pixel or sub-pixel precision. If the motion vector has sub-pixel precision, interpolation filtering in the reference frame is needed to generate the required sub-pixel block.
  • The whole-pixel or sub-pixel block found in the reference frame according to the motion vector is called the reference block.
  • In some technologies, the reference block is directly used as the prediction block; in others, the prediction block is generated by further processing the reference block.
  • Further processing on the basis of the reference block can also be understood as taking the reference block as a prediction block and then processing it to generate a new prediction block.
  • The most commonly used inter-frame prediction methods include the geometric partitioning mode (GPM) in the VVC video codec standard and the angular weighted prediction (AWP) mode in the AVS3 video codec standard. These two inter prediction modes have something in common in principle.
  • The intra-frame prediction unit 212 refers only to information of the same frame image and predicts pixel information within the current coding image block, so as to eliminate spatial redundancy.
  • Frames used for intra prediction may be I-frames.
  • the intra-frame prediction modes used by HEVC include Planar mode, DC and 33 angle modes, for a total of 35 prediction modes.
  • the intra-frame modes used by VVC are Planar, DC, and 65 angular modes, for a total of 67 prediction modes.
  • the intra-frame modes used by AVS3 include DC, Plane, Bilinear, and 63 angle modes, totaling 66 prediction modes.
  • the intra-prediction unit 212 may be implemented using an intra-block copy technique and an intra-string copy technique.
  • Residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction blocks of the PUs of the CU. For example, residual unit 220 may generate a residual block such that each sample in the residual block has a value equal to the difference between a sample in the CU's pixel block and the corresponding sample in a prediction block of a PU of the CU.
  • Transform/quantization unit 230 may quantize transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of the CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with the CU by adjusting the QP value associated with the CU. Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from the quantized transform coefficients.
  • Reconstruction unit 250 may add the samples of the reconstructed residual block to corresponding samples of the one or more prediction blocks generated by prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the block of samples for each TU of the CU in this manner, video encoder 200 may reconstruct the block of pixels of the CU.
  • In-loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
  • The loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset/adaptive loop filtering (SAO/ALF) unit, where the deblocking filtering unit is used for deblocking and the SAO/ALF unit is used to remove ringing effects.
  • the decoded image buffer 270 may store the reconstructed pixel blocks.
  • Inter-prediction unit 211 may use the reference picture containing the reconstructed pixel block to perform inter-prediction on PUs of other pictures.
  • intra-prediction unit 212 may use the reconstructed pixel blocks in decoded picture buffer 270 to perform intra-prediction on other PUs in the same picture as the CU.
  • Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
  • FIG. 3 is a schematic block diagram of a decoding framework 300 provided by an embodiment of the present application.
  • the video decoder 300 includes:
  • the entropy decoding unit 310, the prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, the in-loop filtering unit 350, and the decoded image buffer 360. It should be noted that the video decoder 300 may include more, fewer, or different functional components.
  • the video decoder 300 may receive the code stream.
  • Entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream.
  • the prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, and the in-loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, ie, generate decoded video data.
  • the prediction unit 320 includes an intra prediction unit 321 and an inter prediction unit 322 .
  • Intra-prediction unit 321 may perform intra-prediction to generate prediction blocks for the PU. Intra-prediction unit 321 may use an intra-prediction mode to generate prediction blocks for a PU based on pixel blocks of spatially neighboring PUs. Intra-prediction unit 321 may also determine an intra-prediction mode for the PU from one or more syntax elements parsed from the codestream.
  • Inter-prediction unit 322 may construct a first reference picture list (List 0) and a second reference picture list (List 1) from the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter-prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter-prediction unit 322 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter-prediction unit 322 may generate a prediction block for the PU from one or more reference blocks of the PU.
  • The inverse quantization/transform unit 330 inversely quantizes (i.e., dequantizes) the transform coefficients associated with the TUs. Inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization. After inverse quantizing the transform coefficients, inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients to generate a residual block associated with the TU.
  • Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU, resulting in a reconstructed image block.
  • In-loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts for pixel blocks associated with the CU.
  • Video decoder 300 may store the reconstructed images of the CU in decoded image buffer 360 .
  • the video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
  • an image of one frame is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate a prediction block of the current block.
  • the residual unit 220 may calculate a residual block based on the predicted block and the original block of the current block, that is, the difference between the predicted block and the original block of the current block, and the residual block may also be referred to as residual information.
  • the residual block can be transformed and quantized by the transform/quantization unit 230 to remove information insensitive to human eyes, so as to eliminate visual redundancy.
  • The residual block before transformation and quantization by the transform/quantization unit 230 may be referred to as a time-domain residual block, and after transformation and quantization it may be referred to as a frequency residual block or a frequency-domain residual block.
  • The entropy coding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230 and may perform entropy coding on them to output a code stream. For example, the entropy encoding unit 280 may eliminate symbol redundancy according to the target context model and the probability information of the binary code stream.
  • a current block may be referred to as a current coding unit (CU) or a current prediction unit (PU), or the like.
  • a prediction block may also be referred to as a predicted image block or an image prediction block, and a reconstructed image block may also be referred to as a reconstructed block or an image reconstructed image block.
  • the entropy decoding unit 310 can parse the code stream to obtain prediction information, quantization coefficient matrix, etc. of the current block, and the prediction unit 320 uses intra prediction or inter prediction on the current block to generate the prediction block of the current block based on the prediction information.
  • The inverse quantization/transform unit 330 performs inverse quantization and inverse transformation on the quantized coefficient matrix obtained from the code stream to obtain a residual block.
  • the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block.
  • The reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image on an image basis or a block basis to obtain a decoded image.
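  • As a concrete illustration of the reconstruction step just described, the following is a minimal sketch; the flat block layout and the function name are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Minimal sketch of the reconstruction step: the reconstructed block is
// the prediction block plus the residual block, clipped to the valid
// sample range for the given bit depth.
std::vector<int> reconstructBlock(const std::vector<int>& pred,
                                  const std::vector<int>& resid, int bitDepth) {
    const int maxVal = (1 << bitDepth) - 1;
    std::vector<int> recon(pred.size());
    for (size_t i = 0; i < pred.size(); ++i)
        recon[i] = std::clamp(pred[i] + resid[i], 0, maxVal);
    return recon;
}

int main() {
    auto r = reconstructBlock({100, 200, 250}, {-5, 10, 20}, 8);
    for (int v : r) std::printf("%d ", v);  // 95 210 255 (last value clipped)
}
```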
  • The encoding side also needs operations similar to those of the decoding side to obtain the decoded image.
  • The decoded image may also be referred to as a reconstructed image, and the reconstructed image may serve as a reference frame for inter-frame prediction of subsequent frames.
  • The block division information determined by the encoding end, as well as mode information or parameter information such as prediction, transformation, quantization, entropy coding, and loop filtering, is carried in the code stream when necessary.
  • The decoding end determines the same block division information and the same prediction, transformation, quantization, entropy coding, loop filtering, and other mode or parameter information as the encoding end by parsing the code stream and analyzing the existing information, so as to ensure that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
  • video signals can be divided into two types: those captured by cameras and those generated by computers. Due to different statistical characteristics, the corresponding compression and encoding methods may also be different, but one or more of the following operations and processing will be performed on the input original video signal:
  • Block partition structure: the input image is divided into several non-overlapping processing units of equal size, each of which undergoes similar compression operations. This processing unit is called a CTU or LCU. Further down from the CTU, finer divisions can be made to obtain one or more basic coding units, called CUs. Each CU is the most basic element of an encoding pass. Described below are the various encoding methods that may be used for each CU.
  • Predictive coding: including intra-frame prediction and inter-frame prediction, etc. After the original video signal is predicted from a selected reconstructed video signal, a residual video signal is obtained.
  • the encoder needs to decide among many possible predictive coding modes for the current CU, select the most suitable one, and inform the decoder.
  • Intra-frame prediction: the predicted signal comes from an area that has already been coded and reconstructed within the same image.
  • Inter-frame prediction: the predicted signal comes from already encoded images other than the current image (called reference images).
  • transform coding and quantization (Transform & Quantization):
  • The residual video signal undergoes a transformation operation such as DFT or DCT, converting the signal into the transform domain, where it is referred to as transform coefficients.
  • The signal in the transform domain is further subjected to a lossy quantization operation, which discards some information so that the quantized signal is better suited to compressed representation.
  • the encoder also needs to select one transformation for the current encoding CU and inform the decoder.
  • the fineness of quantization is usually determined by the quantization parameter (QP).
  • The larger the QP value, the larger the range of coefficient values that will be quantized into the same output, which usually brings greater distortion and a lower code rate; conversely, the smaller the QP value, the smaller the range of coefficient values quantized into the same output, which usually brings less distortion and corresponds to a higher code rate.
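  • The effect of the quantization step can be seen in a small worked sketch; the linear quantizer below is an illustrative assumption, since real codecs derive the step size from QP through standard-specific tables.

```cpp
#include <cmath>
#include <cstdio>

// Illustrative scalar quantizer: a larger step (coarser QP) maps a wider
// range of coefficient values to the same level, giving more distortion
// but fewer bits to code.
int quantize(double coeff, double step)   { return (int)std::lround(coeff / step); }
double dequantize(int level, double step) { return level * step; }

int main() {
    const double steps[] = {2.0, 8.0};
    for (double step : steps) {
        int level = quantize(13.0, step);
        // step 2: level 7, reconstructed 14.0; step 8: level 2, reconstructed 16.0
        std::printf("step=%.0f: level=%d, reconstructed=%.1f\n",
                    step, level, dequantize(level, step));
    }
}
```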
  • Entropy coding or statistical coding: the quantized transform-domain signal undergoes statistical compression coding according to the frequency of occurrence of each value, and finally a binarized (0 or 1) compressed code stream is output. At the same time, other information produced by encoding, such as the selected mode and motion vectors, also requires entropy coding to reduce the code rate.
  • Statistical coding is a lossless coding method that can effectively reduce the code rate required to express the same signal. Common statistical coding methods include variable length coding (VLC) and context-based adaptive binary arithmetic coding (CABAC).
  • Loop filtering: a coded image can be reconstructed into a decoded image through inverse quantization, inverse transformation, and prediction compensation (the inverse of operations 2 to 4 above). Compared with the original image, part of the reconstructed image's information differs from the original because of quantization, resulting in distortion. Performing filtering operations on the reconstructed image, such as deblocking, SAO, or ALF, can effectively reduce the degree of distortion caused by quantization. Since these filtered reconstructed images are used as references for subsequently encoded images to predict future signals, the above filtering operations are also referred to as in-loop filtering, i.e., filtering operations within the encoding loop.
  • For each CU, after obtaining the compressed code stream, the decoder first performs entropy decoding to obtain the various mode information and quantized transform coefficients. The coefficients undergo inverse quantization and inverse transformation to obtain the residual signal. On the other hand, according to the known coding mode information, the prediction signal corresponding to the CU can be obtained; adding the two yields the reconstructed signal. Finally, the reconstructed values of the decoded image undergo a loop filtering operation to generate the final output signal.
  • Mainstream video coding standards, such as HEVC, VVC, and AVS3, use a block-based hybrid coding framework.
  • the original video data is divided into a series of coding blocks, and video coding methods such as prediction, transform and entropy coding are combined to realize the compression of video data.
  • motion compensation is a type of prediction method commonly used in video coding. Based on the redundancy characteristics of video content in the temporal or spatial domain, motion compensation derives the prediction value of the current coding block from the coded region.
  • Such prediction methods include: inter-frame prediction, block-copy intra-frame prediction, string-copy intra-frame prediction, etc., and in specific coding implementations, these prediction methods may be used alone or in combination.
  • Intra String Copy (ISC) technology divides a coding block into a series of pixel strings or unmatched pixels according to a certain scanning order (raster scan, round-trip scan, Zig-Zag scan, etc.). Similar to IBC, each string looks for a reference string of the same shape in the coded area of the current image and derives the predicted value of the current string from it. Encoding the residual between the pixel values of the current string and the predicted values, instead of directly encoding the pixel values, can effectively save bits.
  • Figure 4 presents a schematic diagram of string copy intra prediction.
  • the dark gray area is the encoded area
  • the 28 white pixels are string 1
  • the light gray 35 pixels are string 2
  • the black 1 pixel represents an unmatched pixel.
  • Unmatched pixels are also called outliers, and the raw values of unmatched pixels are encoded directly rather than derived from predicted values.
  • The intra-frame string copy technology needs to encode, for each string in the current coding block, the string vector (SV), the string length, and a flag indicating whether there is a matching string.
  • a string vector (SV) represents the displacement of the string to be encoded from its reference string.
  • String length indicates the number of pixels contained in the string.
  • the reference string of string 1 is on its left side
  • the displacement of string 1 to its corresponding reference string is represented by string vector 1 .
  • the reference string of string 2 is above it, and the displacement of string 2 to its corresponding reference string is represented by string vector 2.
  • Equal value string and unit basis vector string: a sub-mode of string copy intra prediction.
  • The equal value string and unit basis vector string mode is a sub-mode of string copy intra prediction, which was adopted into the AVS3 standard in October 2020. Similar to string copy intra prediction, this mode divides an encoding/decoding block into a series of pixel strings or unmatched pixels according to a certain scanning order. An equal value string is characterized in that all pixels in the pixel string have the same predicted value.
  • The unit vector string is also known as the unit basis vector string, the unit offset string, the copy-above string, etc.
  • The unit vector string is characterized in that its displacement vector is (0, -1), and each pixel of the string uses the pixel directly above it as its predicted value, as sketched below.
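  • A minimal sketch of this prediction rule, assuming a row-major image layout and illustrative names; the string must not start in the top row.

```cpp
#include <cstdio>
#include <vector>

// Unit basis vector string prediction: the displacement vector is (0, -1),
// so each pixel in the string takes the reconstructed pixel one row above
// as its predicted value.
void predictUnitVectorString(const std::vector<int>& recon, int width,
                             int startIdx, int length, std::vector<int>& pred) {
    for (int i = 0; i < length; ++i)
        pred[startIdx + i] = recon[startIdx + i - width];  // pixel above
}

int main() {
    std::vector<int> recon = {10, 11, 12, 13, 0, 0, 0, 0};  // 4x2 image
    std::vector<int> pred(recon.size(), 0);
    predictUnitVectorString(recon, 4, 4, 4, pred);          // predict row 2
    for (int i = 4; i < 8; ++i) std::printf("%d ", pred[i]); // 10 11 12 13
}
```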
  • The equal value string mode needs to encode the type, length, and predicted value information of each string of the current coding block in the code stream. As in the normal string prediction sub-mode, the values of unmatched pixels are encoded directly rather than derived from predicted values.
  • the Y, Cb and Cr components of unmatched pixels are directly coded rather than derived from predicted values.
  • Each component of an unmatched pixel is encoded according to its bit depth. For example, if the bit depth of the current image is 10 bits, 3*10 binary symbols are required to encode one unmatched pixel, and each symbol is encoded in bypass mode, which results in excessive coding overhead for unmatched pixels.
  • An embodiment of the present application proposes an encoding method for unmatched pixels.
  • The method adjusts the value of each component of the unmatched pixels at the encoding end and selects a suitable context model for encoding the unmatched pixels, thereby reducing the coding overhead of unmatched pixels and improving coding performance.
  • The method provided by the present application is applicable to codecs that use the string copy intra prediction mode or any other mode that needs to encode unmatched pixels (such as the palette mode); this is not specifically limited in the present application.
  • FIG. 5 is a schematic flowchart of a method 400 for encoding unmatched pixels according to an embodiment of the present application.
  • The execution body of the method 400 may be, but is not limited to, an encoder or a device for performing block vector encoding, such as a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a tablet computer, a set-top box, a handset such as a smartphone, a television, a camera, a display device, a digital media player, a video game console, an in-vehicle computer, or the like.
  • For example, the method may be performed by the encoding apparatus 110 shown in FIG. 1 or the video encoder 200 shown in FIG. 2.
  • the encoding method 400 of the unmatched pixels may include:
  • S410: Obtain a target sequence, and divide the target image frame in the target sequence into multiple image blocks, where the multiple image blocks include a target image block;
  • S440: Encode the binary symbol string of the unmatched pixels by using at least two entropy encoding methods to obtain a code stream of the target sequence.
  • In this embodiment, the code stream of the target sequence is encoded by multiple entropy encoding methods, which can improve coding flexibility and helps balance the coding performance and coding overhead of unmatched pixels.
  • the component value of the unmatched pixel involved in this application can be understood as the value of the color component of the unmatched pixel.
  • For example, the component value of the unmatched pixel may be the value of the Y component, the U component, or the V component of the unmatched pixel.
  • the length of the binary symbol string is M, where M is an integer greater than 0, and the at least two entropy encoding methods include a first entropy encoding method and a second entropy encoding method; wherein, the S440 may include:
  • the first entropy encoding method is used to encode the preset first target bits at the front of the binary symbol string, and the second entropy encoding method is used to encode the preset second target bits at the end of the binary symbol string, to obtain the coded representation of the unmatched pixels in the target image block.
  • The preset second target bits may be a preset number of bits coded using the second entropy encoding method, for example, the last N bits of the binary symbol string, where N is an integer greater than 0 and N is less than M.
  • The preset first target bits may be a preset number of bits coded using the first entropy encoding method, for example, the first M-N bits of the binary symbol string.
  • the encoding end encodes the binary symbol string of the unmatched pixels through two entropy encoding methods to obtain the code stream of the target sequence.
  • the first entropy coding mode is a bypass mode
  • the second entropy coding mode is a context-based adaptive binary arithmetic coding (CABAC) mode.
  • In this embodiment, the binary symbol string of the unmatched pixels is encoded by at least two entropy encoding methods, constructed so that the first M-N bits of the binary symbol string are entropy encoded using the bypass mode and the last N bits are entropy encoded using the context-based adaptive binary arithmetic coding (CABAC) mode, as illustrated in the sketch below. On the one hand, entropy encoding the first M-N bits in bypass mode helps guarantee coding performance; on the other hand, entropy encoding the last N bits in CABAC mode can reduce the coding overhead of unmatched pixels. Therefore, the scheme provided by this application reduces the coding overhead of unmatched pixels while guaranteeing coding performance.
  • The bypass mode can be understood as a CABAC mode with fixed probability information; because of this fixed probability setting, entropy coding in bypass mode can generate a larger coding overhead.
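  • The following sketch shows the bit split described above; the two entropy-coder calls are hypothetical stand-ins for a real CABAC engine, and the context-index choice for the low bits is an assumption.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical stand-ins for a real CABAC engine: a real implementation
// would write bits to the code stream; these just trace the calls.
void encodeBypass(int bin)               { std::printf("bypass bin %d\n", bin); }
void encodeWithContext(int bin, int ctx) { std::printf("context %d bin %d\n", ctx, bin); }

// Split described above: for an M-bit binary symbol string, the first
// M-N bits are coded in bypass mode, the last N bits with context models.
void encodeUnmatchedPixelBins(const std::vector<int>& bins, int N) {
    const int M = (int)bins.size();
    for (int i = 0; i < M - N; ++i)
        encodeBypass(bins[i]);                        // high bits: bypass
    for (int i = M - N; i < M; ++i)
        encodeWithContext(bins[i], i - (M - N));      // low bits: CABAC
}

int main() {
    encodeUnmatchedPixelBins({1, 0, 1, 1, 0, 1, 0, 0, 1, 1}, 2);  // M = 10, N = 2
}
```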
  • the method 400 may further include:
  • The first identifier is used to indicate whether the at least two entropy encoding methods are used for entropy encoding the unmatched pixels in the target sequence; the second identifier is used to indicate whether the at least two entropy encoding methods are used for entropy encoding the unmatched pixels in the target image frame, where the target image frame includes the target image block; the third identifier is used to indicate whether the at least two entropy encoding methods are used for entropy encoding the unmatched pixels in the target slice of the target image frame, where the target slice includes the target image block; the fourth identifier is used to indicate whether the at least two entropy encoding methods are used for entropy encoding the unmatched pixels in the target image block.
  • The specific values of the first identifier, the second identifier, the third identifier, or the fourth identifier can be obtained by querying the corresponding parameter set, or by other methods; this is not specifically limited in the embodiments of the present application.
  • The value of each identifier in the parameter set may be set in advance. For example, before encoding, the user can set the value of each identifier in the parameter set through the parameters of the encoder.
  • the at least two entropy encoding methods are default modes for unmatched pixels; wherein, the S440 may include:
  • the binary symbol string of the unmatched pixels is encoded by the default mode to obtain the code stream of the target sequence.
  • the encoder can use at least one of the following methods to determine whether the target image block uses the encoding method for unmatched pixels provided in this application:
  • a) The coding method for unmatched pixels provided by the present application is used by default; that is, no flag needs to be encoded in the code stream.
  • b) Encoding a sequence-level flag in the code stream, indicating that the unmatched pixels of all coding blocks in the target sequence use the encoding method for unmatched pixels provided in this application.
  • c) Encode an image-level (frame-level) flag in the code stream, indicating that all unmatched pixels in the target image frame use the encoding method for unmatched pixels provided in this application.
  • d) Encoding a slice-level flag in the code stream, indicating that all unmatched pixels in the target slice use the encoding method for unmatched pixels provided in this application.
  • e) Encoding a CU-level (ie, image block-level) flag in the code stream, indicating that the unmatched pixels in the target image block all use the encoding method for unmatched pixels provided in this application.
  • the S420 may include:
  • Shift operation is performed on the component values of the unmatched pixels to obtain the adjusted component values of the unmatched pixels.
  • The shift operation may use a shift operator on the component values of the unmatched pixels, and may be a left shift operation or a right shift operation.
  • The encoding end adjusts the component values of the unmatched pixels and selects an appropriate context model for encoding the adjusted component values; correspondingly, the decoding end selects the corresponding context model for decoding. A sketch of the adjustment follows.
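  • A minimal sketch of the shift-based adjustment, assuming a simple right-then-left shift as described:

```cpp
#include <cstdio>

// Shift-based adjustment: shifting right by N and then left by N clears
// the N least significant bits of the component value, making the low,
// context-coded bits predictable.
int adjustComponent(int val, int N) { return (val >> N) << N; }

int main() {
    // e.g. a 10-bit sample 725 (1011010101b) with N = 2 becomes 724 (1011010100b)
    std::printf("%d -> %d\n", 725, adjustComponent(725, 2));
}
```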
  • the method 400 may further include:
  • The quantization parameter is acquired, and the number of post-preset second target bits (i.e., the value of N) is determined based on the quantization parameter.
  • the quantization parameter is a parameter used when performing quantization transformation in encoding and decoding.
  • The post-preset second target bits refer to the preset number of bits coded using the second entropy encoding method.
  • the encoder may determine the value of N according to the quantization parameter for the target image block. For example, the larger the quantization parameter of the target image block is, the larger or smaller the value of N is. For example, the encoder may determine the value of N according to the corresponding relationship between at least one quantization parameter and at least one numerical value. Specifically, the encoder may determine the value corresponding to the target image block as the value of N.
  • The component value of the unmatched pixel is a chrominance component value or a luminance component value, and the value of N used for chrominance component values during encoding may differ from the value of N used for luminance component values during encoding.
  • For example, denote val_Y as the luminance component value of the unmatched pixel before adjustment; the encoder can adjust the luminance component value of the unmatched pixel to (val_Y >> N1) << N1.
  • For another example, denote val_U as the chrominance component value of the unmatched pixel before adjustment; the encoder can adjust the chrominance component value of the unmatched pixel to (val_U >> N2) << N2.
  • the value of N1 and the value of N2 may be the same or different, which is not specifically limited in this application.
  • Optionally, the value of N1 is smaller than the value of N2, so as to guarantee the coding performance of the luminance component to a certain extent; see the sketch below.
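  • Building on the previous sketch, the following shows separate shift amounts per component; the struct, the names, and the particular N1/N2 values are illustrative assumptions following the optional design above.

```cpp
#include <cstdio>

// Per-component adjustment with separate shift amounts: N1 for luma and
// N2 for chroma with N1 < N2, so the luma component keeps more precision.
struct UnmatchedPixel { int y, u, v; };

UnmatchedPixel adjustPixel(UnmatchedPixel p, int N1, int N2) {
    p.y = (p.y >> N1) << N1;  // luma: smaller shift, less information lost
    p.u = (p.u >> N2) << N2;  // chroma: larger shift, lower overhead
    p.v = (p.v >> N2) << N2;
    return p;
}

int main() {
    UnmatchedPixel q = adjustPixel({725, 333, 512}, 1, 3);
    std::printf("Y=%d U=%d V=%d\n", q.y, q.u, q.v);  // Y=724 U=328 V=512
}
```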
  • In some embodiments, S420 may include: performing a weighting operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel. For example, val / K * K may be used to perform the weighting operation, where val denotes the component value of the unmatched pixel before adjustment, / denotes an integer division operation, * denotes a multiplication operation, and K is a weighting coefficient.
  • It should be noted that the value of K is not specifically limited in the embodiments of this application; for example, K may or may not be equal to N, and K may be an integer greater than 0.
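  • A minimal sketch of the weighting operation follows; since / is integer division, the result is val rounded down to the nearest multiple of K, and when K is a power of two (K = 1 << N) this coincides with the shift form (val >> N) << N. The function name is illustrative:

```cpp
// val / K * K rounds val down to the nearest multiple of k (k > 0 assumed).
// For val = 695 and k = 4 this yields 692, matching (695 >> 2) << 2.
static int adjustComponentByWeight(int val, int k) {
    return (val / k) * k;
}
```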
  • In some embodiments, before the weighting operation is performed on the component value of the unmatched pixel to obtain the adjusted component value, the method 400 may further include: acquiring a quantization parameter, and determining the value of the weighting coefficient based on the quantization parameter.
  • In one implementation, the encoder may determine the value of K according to the quantization parameter of the target image block; for example, the value of K may increase or decrease as the quantization parameter of the target image block increases. The encoder may also determine the value of K according to a correspondence between at least one quantization parameter and at least one value; specifically, the encoder may determine the value corresponding to the quantization parameter of the target image block as the value of K.
  • In some embodiments, the component value of the unmatched pixel is a chrominance component value or a luminance component value, and the value of K used for the chrominance component value is different from the value of K used for the luminance component value.
  • In one implementation, denote val_Y as the luminance component value of the unmatched pixel before adjustment; the encoder may adjust the luminance component value of the unmatched pixel to val_Y / K1 * K1. As another example, denote val_U as the chrominance component value of the unmatched pixel before adjustment; the encoder may adjust the chrominance component value of the unmatched pixel to val_U / K2 * K2.
  • the value of K1 and the value of K2 may be the same or different, which is not specifically limited in this application.
  • For example, the value of K1 is smaller than the value of K2, so as to better preserve the coding performance of the luminance component.
  • the decoding method provided by the present application may be applied only to the luminance component, that is, N1 is not 0, and N2 is 0.
  • the decoding method provided in this application may be applied only to the chrominance component, that is, N2 is not 0, and N1 is 0.
  • It should be noted that, in the embodiments of this application, N and K may be used in combination; that is, K may be used to adjust the component value of the unmatched pixel, and N may be used to determine the bits of the binary symbol string that correspond to each of the at least two entropy encoding methods.
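  • The following sketch illustrates one way K and N might be combined as just described, with K coarsening the component value and N only partitioning the binarized string into bypass-coded and context-coded bins; all structure and function names are assumptions made for exposition:

```cpp
#include <vector>

struct BinPartition {
    std::vector<int> bypassBins;   // first M - N bins
    std::vector<int> contextBins;  // last N bins
};

// Adjusts val with K, binarizes it with a fixed-length code of m bins
// (MSB first), and splits the bins per the bypass/context partition.
// Assumes m >= n and k > 0.
static BinPartition prepareUnmatchedPixel(int val, int k, int n, int m) {
    int adjusted = (val / k) * k;          // value adjustment with K
    std::vector<int> bins(m);
    for (int i = 0; i < m; ++i)            // fixed-length binarization
        bins[i] = (adjusted >> (m - 1 - i)) & 1;
    BinPartition p;
    p.bypassBins.assign(bins.begin(), bins.end() - n);
    p.contextBins.assign(bins.end() - n, bins.end());
    return p;
}
```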
  • The encoding method provided by this application will be described below with reference to specific embodiments, taking the encoding of an input video with a bit depth of 10 bits as an example.
  • Method 1: a) Denote val as the luminance or chrominance component value of the current unmatched pixel; the encoder adjusts the value of val to (val >> 2) << 2. b) The encoder binarizes the adjusted val using a fixed-length code to derive a binary symbol string with a length of 10.
  • the first 8 bits of the binary symbol string are entropy encoded in the bypass mode, and the last 2 bits are entropy encoded in the CABAC mode.
  • Correspondingly, the decoder decodes the 10-bit binary symbol string from the code stream, wherein the first 8 bits are entropy decoded in the bypass mode and the last 2 bits are entropy decoded in the CABAC mode; the decoder then de-binarizes the string according to the fixed-length code to obtain the value of val.
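  • A toy roundtrip for Method 1 is sketched below, assuming a 10-bit input; the entropy-coding engines are elided, and only the adjustment, the fixed-length binarization, and the de-binarization are shown. Note that after (val >> 2) << 2 the last two bins are always zero, which is precisely why a context model can code them very cheaply:

```cpp
#include <array>
#include <cassert>

int main() {
    int val = 934;                       // example 10-bit component value
    int adjusted = (val >> 2) << 2;      // encoder-side adjustment (932)

    std::array<int, 10> bins{};          // fixed-length code, MSB first
    for (int i = 0; i < 10; ++i)
        bins[i] = (adjusted >> (9 - i)) & 1;
    // bins[0..7] would be bypass coded; bins[8..9] would be context coded
    // (they are always 0 after the adjustment above).

    int decoded = 0;                     // decoder-side de-binarization
    for (int i = 0; i < 10; ++i)
        decoded = (decoded << 1) | bins[i];
    assert(decoded == adjusted);
    return 0;
}
```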
  • Method 2: a) The encoder sets the value of N according to the quantization parameter; for quantization parameters 27, 32, 38, and 45, the value of N is set to 2, 2, 3, and 3, respectively. b) The encoder adjusts the value of val to (val >> N) << N. c) The encoder binarizes the adjusted val using a fixed-length code to derive a binary symbol string with a length of 10. The first 10-N bits of the binary symbol string are entropy encoded in the bypass mode, and the last N bits are entropy encoded in the CABAC mode.
  • Correspondingly, the decoder decodes the 10-bit binary symbol string from the code stream, wherein the first 10-N bits are entropy decoded in the bypass mode and the last N bits are entropy decoded in the CABAC mode; the decoder then de-binarizes the string according to the fixed-length code to obtain the value of val.
  • Method 3: a) The encoder sets the value of M according to the quantization parameter; for quantization parameters 27, 32, 38, and 45, the value of M is set to 2, 2, 3, and 3, respectively. b) Denote val as the luminance or chrominance component value of the current unmatched pixel; if val is a luminance component value, the encoder sets N = M; otherwise, if val is a chrominance component value, the encoder sets N = M + 1. c) The encoder adjusts the value of val to (val >> N) << N. d) The encoder binarizes the adjusted val using a fixed-length code to derive a binary symbol string with a length of 10. The first 10-N bits of the binary symbol string are entropy encoded in the bypass mode, and the last N bits are entropy encoded in the CABAC mode.
  • Correspondingly, the decoder decodes the 10-bit binary symbol string from the code stream, wherein the first 10-N bits are entropy decoded in the bypass mode and the last N bits are entropy decoded in the CABAC mode; the decoder then de-binarizes the string according to the fixed-length code to obtain the value of val.
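  • The component-dependent choice of N in Method 3 reduces to a one-line rule, sketched below with illustrative names:

```cpp
enum class Component { Luma, Chroma };

// Method 3, step b): N = M for the luminance component,
// N = M + 1 for the chrominance components.
static int deriveN(int m, Component c) {
    return c == Component::Luma ? m : m + 1;
}
```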
  • It should be understood that the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and shall not constitute any limitation on the implementation of the embodiments of this application.
  • the term "and/or" is only an association relationship for describing associated objects, indicating that there may be three kinds of relationships. Specifically, A and/or B can represent three situations: A exists alone, A and B exist at the same time, and B exists alone.
  • the character "/" in this document generally indicates that the related objects are an "or" relationship.
  • FIG. 6 is a schematic flowchart of a method 500 for decoding an unmatched pixel provided by an embodiment of the present application.
  • The execution body of the method 500 may be, but is not limited to, the following devices: a decoder, or a device for performing block-vector decoding, such as a desktop computer, a mobile computing device, a notebook (for example, laptop) computer, a tablet computer, a set-top box, a handheld device such as a smartphone, a television, a camera, a display device, a digital media player, a video game console, an in-vehicle computer, or the like.
  • For example, the execution body may be the decoding device 120 shown in FIG. 1 or the video decoder 300 shown in FIG. 3.
  • As shown in FIG. 6, the decoding method 500 for unmatched pixels may include:
  • S510: Obtain a code stream of a target sequence, and decode the code stream of the target sequence using at least two entropy decoding methods to obtain a binary symbol string of an unmatched pixel in a target image block, where the target image block is obtained by partitioning a target image frame in the target sequence;
  • S520: Obtain a component value of the unmatched pixel by performing inverse binarization on the binary symbol string;
  • S530: Obtain the target image block based on the component value of the unmatched pixel.
  • In some embodiments, the length of the binary symbol string is M, where M is an integer greater than 0, and the at least two entropy decoding methods include a first entropy decoding method and a second entropy decoding method. In this case, S510 may include:
  • decoding the first preset target bits of the binary symbol string using the first entropy decoding method, and decoding the second preset target bits of the binary symbol string using the second entropy decoding method, to obtain the binary symbol string of the unmatched pixel in the target image block, where the sum of the first preset target bits and the second preset target bits is the length of the binary symbol string.
  • In some embodiments, the first entropy decoding mode is a bypass mode, and the second entropy decoding mode is a context-based binary arithmetic coding (CABAC) mode.
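  • A decoder-side sketch of this split is given below; EntropyDecoder is a hypothetical stand-in for a real arithmetic-decoding engine, and the context-index handling is simplified to a per-position index:

```cpp
#include <vector>

// Hypothetical decoding interface; not the AVS3 reference decoder API.
struct EntropyDecoder {
    virtual int decodeBypassBin() = 0;
    virtual int decodeContextBin(int ctxIdx) = 0;
    virtual ~EntropyDecoder() = default;
};

// Reads the first M - N bins in bypass mode and the last N bins with
// context-coded CABAC, then de-binarizes the fixed-length string.
static int decodeUnmatchedComponent(EntropyDecoder& dec, int m, int n) {
    int val = 0;
    for (int i = 0; i < m - n; ++i)             // first M - N bins: bypass
        val = (val << 1) | dec.decodeBypassBin();
    for (int i = 0; i < n; ++i)                 // last N bins: context coded
        val = (val << 1) | dec.decodeContextBin(/*ctxIdx=*/i);
    return val;                                 // de-binarized component value
}
```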
  • In the embodiments of this application, the code stream of the target sequence is decoded using at least two entropy decoding methods, constructed such that the first M-N bits of the binary symbol string are entropy decoded in the bypass mode and the last N bits are entropy decoded in the CABAC mode. On the one hand, entropy decoding the first M-N bits in the bypass mode ensures decoding performance; on the other hand, entropy decoding the last N bits in the CABAC mode reduces the coding overhead of unmatched pixels. Based on this, the scheme provided in this application can reduce the coding overhead of unmatched pixels while ensuring coding performance.
  • In some embodiments, before the code stream of the target sequence is decoded using the at least two entropy decoding methods, the method 500 may further include: obtaining at least one of the following identifiers by parsing the code stream of the target sequence: a first identifier, a second identifier, a third identifier, or a fourth identifier; where the first identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target sequence; the second identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target image frame of the target sequence, the target image frame including the target image block; the third identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in a target slice of the target image frame, the target slice including the target image block; and the fourth identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target image block.
  • In this case, S510 may include: if the first identifier, the second identifier, the third identifier, or the fourth identifier indicates that the at least two entropy decoding methods are used for entropy decoding, decoding the code stream of the target sequence using the at least two entropy decoding methods to obtain the binary symbol string of the unmatched pixel in the target image block.
  • In some embodiments, the at least two entropy decoding methods are a default mode for unmatched pixels; in this case, S510 may include: decoding the code stream of the target sequence using the default mode to obtain the binary symbol string of the unmatched pixel in the target image block.
  • In other words, the decoder can use at least one of the following methods to determine whether the target image block uses the decoding method for unmatched pixels provided in this application (see the sketch after this list):
  • a) The decoding method for unmatched pixels provided in this application is used by default for the unmatched pixels in the target decoding block; that is, there is no need to decode a flag from the code stream.
  • b) Decode a sequence-level flag in the code stream, indicating that the unmatched pixels of all decoding blocks in the target sequence use the decoding method for unmatched pixels provided in this application.
  • c) Decode an image-level (frame-level) flag in the code stream, indicating that all unmatched pixels in the target image frame use the decoding method for unmatched pixels provided in this application.
  • d) Decode a slice-level flag in the code stream, indicating that all unmatched pixels in the target slice use the decoding method for unmatched pixels provided in this application.
  • e) Decode a CU-level (ie, image block-level) flag in the code stream, indicating that the unmatched pixels in the target image block all use the decoding method for unmatched pixels provided in this application.
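  • By way of illustration only, the hierarchy of flags in options a) through e) might be checked as in the following sketch; the structure, field names, and the simple OR combination are assumptions made for exposition, not the AVS3 bitstream syntax:

```cpp
// Flags corresponding to options a) through e) above; a real parser would
// read each flag from the corresponding parameter set or header.
struct FlagContext {
    bool defaultOn;   // a) method used by default, no flag coded
    bool seqFlag;     // b) sequence-level flag
    bool picFlag;     // c) picture-level (frame-level) flag
    bool sliceFlag;   // d) slice-level flag
    bool cuFlag;      // e) CU-level (image block-level) flag
};

static bool useUnmatchedPixelMethod(const FlagContext& f) {
    // Any one of the mechanisms may enable the method; they are simply
    // OR-ed together here for illustration.
    return f.defaultOn || f.seqFlag || f.picFlag || f.sliceFlag || f.cuFlag;
}
```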
  • In some embodiments, before the code stream of the target sequence is decoded using the at least two entropy decoding methods, the method 500 may further include: acquiring a quantization parameter, and determining the value of the second preset target bits based on the quantization parameter.
  • In one implementation, the decoder may determine the value of N according to the quantization parameter of the target image block; for example, the value of N may increase or decrease as the quantization parameter of the target image block increases. The decoder may also determine the value of N according to a correspondence between at least one quantization parameter and at least one value; specifically, the decoder may determine the value corresponding to the quantization parameter of the target image block as the value of N.
  • In some embodiments, the component value of the unmatched pixel is a chrominance component value or a luminance component value, and the value of N used in decoding when the component value of the unmatched pixel is a chrominance component value is different from the value of N used in decoding when the component value of the unmatched pixel is a luminance component value.
  • In one implementation, denote val_Y as the luminance component value of the unmatched pixel before adjustment; the encoder may adjust the luminance component value of the unmatched pixel to (val_Y >> N1) << N1. As another example, denote val_U as the chrominance component value of the unmatched pixel before adjustment; the encoder may adjust the chrominance component value of the unmatched pixel to (val_U >> N2) << N2.
  • the value of N1 and the value of N2 may be the same or different, which is not specifically limited in this application.
  • For example, the value of N1 is smaller than the value of N2, so as to better preserve the coding performance of the luminance component.
  • the decoding method provided by the present application may be applied only to the luminance component, that is, N1 is not 0, and N2 is 0.
  • the decoding method provided in this application may be applied only to the chrominance component, that is, N2 is not 0, and N1 is 0.
  • It should be noted that the decoding process is the reverse operation of the encoding process; for the decoding method 500, reference may be made to the relevant description of the encoding method 400, and details are not repeated here.
  • FIG. 7 is a schematic block diagram of an encoder 600 provided by an embodiment of the present application.
  • the encoder 600 may include:
  • a dividing unit 610 configured to obtain a target sequence and divide a target image frame in the target sequence into multiple image blocks, the multiple image blocks including a target image block;
  • an adjustment unit 620 configured to adjust a component value of an unmatched pixel in the target image block to obtain an adjusted component value of the unmatched pixel;
  • a binarization unit 630 configured to obtain a binary symbol string of the unmatched pixel by binarizing the adjusted component value of the unmatched pixel; and
  • an encoding unit 640 configured to encode the binary symbol string of the unmatched pixel using at least two entropy encoding methods to obtain a code stream of the target sequence.
  • In the embodiments of this application, the binary symbol string of the unmatched pixel is encoded using at least two entropy encoding methods, that is, the code stream of the target sequence is generated using multiple entropy encoding methods, which can improve the flexibility of encoding and helps balance coding performance against the coding overhead of unmatched pixels.
  • In some embodiments, the at least two entropy encoding methods include a first entropy encoding method and a second entropy encoding method, and the encoding unit 640 is specifically configured to: encode the first preset target bits of the binary symbol string using the first entropy encoding method, and encode the second preset target bits of the binary symbol string using the second entropy encoding method, to obtain the code stream of the target sequence, where the sum of the first preset target bits and the second preset target bits is the length of the binary symbol string.
  • In some embodiments, the first entropy encoding mode is a bypass mode, and the second entropy encoding mode is a context-based binary arithmetic coding mode.
  • In some embodiments, the encoding unit 640 is further configured to write at least one of the following identifiers into the code stream: a first identifier, a second identifier, a third identifier, or a fourth identifier; where the first identifier indicates whether the at least two entropy encoding methods are used for entropy encoding of unmatched pixels in the target sequence; the second identifier indicates whether the at least two entropy encoding methods are used for entropy encoding of unmatched pixels in the target image frame of the target sequence, the target image frame including the target image block; the third identifier indicates whether the at least two entropy encoding methods are used for entropy encoding of unmatched pixels in a target slice of the target image frame, the target slice including the target image block; and the fourth identifier indicates whether the at least two entropy encoding methods are used for entropy encoding of unmatched pixels in the target image block.
  • In some embodiments, the at least two entropy encoding methods are a default mode for unmatched pixels; in this case, the encoding unit 640 is specifically configured to: encode the binary symbol string of the unmatched pixel using the default mode to obtain the code stream of the target sequence.
  • In some embodiments, the adjustment unit 620 is specifically configured to: perform a shift operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel.
  • In some embodiments, before performing the shift operation on the component value of the unmatched pixel to obtain the adjusted component value, the adjustment unit 620 is further configured to: acquire a quantization parameter, and determine the value of the second preset target bits based on the quantization parameter.
  • In some embodiments, the component value of the unmatched pixel is a chrominance component value or a luminance component value.
  • In some embodiments, the adjustment unit 620 is specifically configured to: perform a weighting operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel.
  • FIG. 8 is a schematic block diagram of a decoder 700 provided by an embodiment of the present application.
  • the decoder 700 may include:
  • a decoding unit 710 configured to obtain a code stream of a target sequence, and decode the code stream of the target sequence using at least two entropy decoding methods to obtain a binary symbol string of an unmatched pixel in a target image block, where the target image block is obtained by partitioning a target image frame in the target sequence;
  • an inverse binarization unit 720 configured to obtain a component value of the unmatched pixel by performing inverse binarization on the binary symbol string; and
  • the processing unit 730 is configured to obtain the target image block based on the component values of the unmatched pixels.
  • In some embodiments, the at least two entropy decoding methods include a first entropy decoding method and a second entropy decoding method, and the decoding unit 710 is specifically configured to: decode the first preset target bits of the binary symbol string using the first entropy decoding method, and decode the second preset target bits of the binary symbol string using the second entropy decoding method, to obtain the binary symbol string of the unmatched pixel in the target image block, where the sum of the first preset target bits and the second preset target bits is the length of the binary symbol string.
  • In some embodiments, the first entropy decoding mode is a bypass mode, and the second entropy decoding mode is a context-based binary arithmetic coding mode.
  • In some embodiments, before decoding the code stream of the target sequence using the at least two entropy decoding methods, the decoding unit 710 is further configured to: obtain at least one of the following identifiers by parsing the code stream of the target sequence: a first identifier, a second identifier, a third identifier, or a fourth identifier; where the first identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target sequence; the second identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target image frame of the target sequence, the target image frame including the target image block; the third identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in a target slice of the target image frame, the target slice including the target image block; and the fourth identifier indicates whether the at least two entropy decoding methods are used for entropy decoding of unmatched pixels in the target image block.
  • In this case, the decoding unit 710 is specifically configured to: if the first identifier, the second identifier, the third identifier, or the fourth identifier indicates that the at least two entropy decoding methods are used for entropy decoding, decode the code stream of the target sequence using the at least two entropy decoding methods to obtain the binary symbol string of the unmatched pixel in the target image block.
  • In some embodiments, the at least two entropy decoding methods are a default mode for unmatched pixels; in this case, the decoding unit 710 is specifically configured to: decode the code stream of the target sequence using the default mode to obtain the binary symbol string of the unmatched pixel in the target image block.
  • In some embodiments, before decoding the code stream of the target sequence using the at least two entropy decoding methods, the decoding unit 710 is further configured to: acquire a quantization parameter, and determine the value of the second preset target bits based on the quantization parameter.
  • In some embodiments, the component value of the unmatched pixel is a chrominance component value or a luminance component value, and the value of N used for the chrominance component value is different from the value of N used for the luminance component value.
  • It should be understood that the apparatus embodiments and the method embodiments correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, details are not repeated here.
  • Specifically, the encoder 600 shown in FIG. 7 may correspond to the subject executing the method 400 of the embodiments of this application; that is, the foregoing and other operations and/or functions of the units in the encoder 600 are respectively intended to implement the corresponding processes of the method 400 and the other methods.
  • The decoder 700 shown in FIG. 8 may correspond to the subject executing the method 500 of the embodiments of this application, and the foregoing and other operations and/or functions of the units in the decoder 700 are respectively intended to implement the corresponding processes of the method 500 and the other methods.
  • It should also be understood that the units of the encoder 600 or the decoder 700 involved in the embodiments of this application may be separately or wholly combined into one or several additional units, or one or more of the units may be further split into multiple functionally smaller units, which can realize the same operations without affecting the technical effects of the embodiments of this application.
  • The above units are divided based on logical functions; in practical applications, the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of this application, the encoder 600 or the decoder 700 may also include other units.
  • According to another embodiment of this application, the encoder 600 or the decoder 700 involved in the embodiments of this application may be constructed, and the encoding method or decoding method for unmatched pixels of the embodiments of this application may be implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method on a general-purpose computing device that includes processing elements and storage elements such as a central processing unit (CPU), a random access memory (RAM), and a read-only memory (ROM).
  • the computer program may be recorded on, for example, a computer-readable storage medium, loaded into an electronic device through the computer-readable storage medium, and executed in the electronic device, so as to implement the corresponding methods of the embodiments of the present application.
  • In other words, the units mentioned above may be implemented in hardware, by instructions in the form of software, or by a combination of software and hardware.
  • Specifically, the steps of the method embodiments of this application may be completed by hardware integrated logic circuits in a processor and/or instructions in the form of software; the steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software in a decoding processor.
  • the software may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 9 is a schematic structural diagram of an encoding and decoding device 800 provided by an embodiment of the present application.
  • the codec device 800 includes at least a processor 710 and a computer-readable storage medium 720 .
  • the processor 710 and the computer-readable storage medium 720 may be connected through a bus or other means.
  • The computer-readable storage medium 720 is configured to store a computer program 721, the computer program 721 includes computer instructions, and the processor 710 is configured to execute the computer instructions stored in the computer-readable storage medium 720.
  • the processor 710 is the computing core and the control core of the encoding/decoding device 800, which is suitable for implementing one or more computer instructions, and is specifically suitable for loading and executing one or more computer instructions to implement corresponding method processes or corresponding functions.
  • the processor 710 may also be referred to as a central processing unit (Central Processing Unit, CPU).
  • the processor 710 may include, but is not limited to: a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field Programmable Gate Array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • The computer-readable storage medium 720 may be a high-speed RAM memory, or a non-volatile memory (Non-Volatile Memory), such as at least one magnetic disk memory; optionally, it may also be at least one computer-readable storage medium located remotely from the aforementioned processor 710.
  • the computer-readable storage medium 720 includes, but is not limited to, volatile memory and/or non-volatile memory.
  • The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory.
  • Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synch-link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
  • In one implementation, the codec device 800 may be an encoding end, an encoder, or an encoding framework involved in the embodiments of this application; the computer-readable storage medium 720 stores first computer instructions; the processor 710 loads and executes the first computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps of the encoding method for unmatched pixels provided in the embodiments of this application; in other words, the first computer instructions in the computer-readable storage medium 720 are loaded by the processor 710, which then executes the corresponding steps; to avoid repetition, details are not repeated here.
  • In one implementation, the codec device 800 may be a decoding end, a decoder, or a decoding framework involved in the embodiments of this application; the computer-readable storage medium 720 stores second computer instructions; the processor 710 loads and executes the second computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps of the decoding method for unmatched pixels provided in the embodiments of this application; in other words, the second computer instructions in the computer-readable storage medium 720 are loaded by the processor 710, which then executes the corresponding steps; to avoid repetition, details are not repeated here.
  • an embodiment of the present application further provides a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in the encoding and decoding device 800 for storing programs and data.
  • The computer-readable storage medium 720 may include a built-in storage medium in the codec device 800 and may, of course, also include an extended storage medium supported by the codec device 800.
  • the computer-readable storage medium provides storage space in which the operating system of the codec apparatus 800 is stored.
  • one or more computer instructions suitable for being loaded and executed by the processor 710 are also stored in the storage space, and these computer instructions may be one or more computer programs 721 (including program codes).
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • For example, the codec device 800 may be a computer; the processor 710 reads the computer instructions from the computer-readable storage medium 720 and executes them, so that the computer performs the encoding method for unmatched pixels or the decoding method for unmatched pixels provided in the various optional manners described above.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application provides a decoding method, an encoding method, a decoder, and an encoder for unmatched pixels, relating to the technical field of video and image processing. The decoding method for unmatched pixels includes: obtaining a code stream of a target sequence, and decoding the code stream of the target sequence using at least two entropy decoding methods to obtain a binary symbol string of an unmatched pixel in a target image block, where the target image block is obtained by partitioning a target image frame in the target sequence; obtaining a component value of the unmatched pixel by performing inverse binarization on the binary symbol string; and obtaining the target image block based on the component value of the unmatched pixel. The method can improve the flexibility of encoding and helps balance coding performance against the coding overhead of unmatched pixels.

Description

未匹配像素的解码方法、编码方法、解码器以及编码器
本申请要求于2021年03月13日提交中国专利局,申请号为202110272823X,申请名称为“未匹配像素的解码方法、编码方法、解码器以及编码器”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及视频或图像处理技术领域,并且更具体地,涉及未匹配像素的解码方法、编码方法、解码器以及编码器。
背景技术
数字视频压缩技术主要是将庞大的数字影像视频数据进行压缩,以便于传输以及存储等。随着互联网视频的激增以及人们对视频清晰度的要求越来越高,尽管已有的数字视频压缩标准能够实现视频解压缩,但目前仍然需要追求更好的数字视频压缩技术,以提升压缩性能。熵编码是一种重要的视频压缩技术,常用的熵编码方法包括基于上下文的二进制算术编码(Content Adaptive Binary Arithmetic Coding,CABAC)和基于上下文的变长编码(Content Adaptive Variable Length Coding,CAVLC)。CABAC过程主要包含3个步骤:二进制化、上下文建模和二进制算术编码。
在对输入的语法元素进行二值化后,可以通过常规编码模式和旁路编码模式(Bypass Coding Mode)对二元数据进行编码。旁路编码模式无须为每个二元位(Bin)分配特定的概率模型,输入的Bin直接用一个简单的旁路编码器进行编码,以加快整个编码以及解码的速度。
一般情况下,不同的语法元素之间并不是完全独立的,且相同语法元素自身也具有一定的记忆性。因此,根据条件熵理论,利用其他已编码的语法元素进行条件编码,相对于独立编码或者无记忆编码能够进一步提高编码性能。这些用来作为条件的已编码符号信息称为上下文。在常规编码模式中,语法元素的bin顺序地进入上下文模型器。编码器根据先前编码过的语法元素或bin的值,为每一个输入的bin分配合适的概率模型,该过程即为上下文建模。通过上下文索引增量(context index increment,ctxIdxInc)和上下文起始索引(context index Start,ctxIdxStart)即可定位到语法元素所对应的上下文模型。将bin和分配的概率模型一起送入二元算术编码器进行编码后,需要根据bin值更新上下文模型,也就是编码中的自适应过程。
然而,在AVS3标准中,对于串复制帧内预测模式编码块,其未匹配像素的Y,Cb和Cr分量被直接编码,而不是通过预测值导出。未匹配像素各分量均根据其位深进行编码,例如当前图像的位深为10bit,编码一个未匹配像素的Y,Cb,Cr分量则需要3*10bit位符号进行编码,即对于各符号位均采用旁路(bypass)模式进行编码,导致未匹配像素的编码开销过大。
发明内容
本申请提供了一种未匹配像素的解码方法、编码方法、解码器以及编码器,能够提升编码的灵活性,有利于均衡编码性能未匹配像素的编码开销。
一方面,本申请提供了一种未匹配像素的解码方法,包括:
获取目标序列的码流,通过至少两种熵解码方式对所述目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串,所述目标图像块是将所述目标序列中的目标图像帧划分得到的;
通过对该二元符号串进行反二值化,得到该未匹配像素的分量值;
基于该未匹配像素的分量值,得到该目标图像块。
另一方面,本申请提供了一种未匹配像素的编码方法,包括:
获取目标序列,将目标序列中的目标图像帧划分为多个图像块,该多个图像块包括目标图像块;
调整该目标图像块中的未匹配像素的分量值,得到所述未匹配像素的调整分量值;
通过二值化该未匹配像素的调整分量值,得到该未匹配像素的二元符号串;
通过至少两种熵编码方式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
另一方面,本申请实施例提供了一种解码器,用于执行上述第一方面或其各实现方式中的方法。具体地,该解码器包括用于执行上述第一方面或其各实现方式中的方法的功能单元。
另一方面,本申请实施例提供了一种编码器,用于执行上述第二方面或其各实现方式中的方法。具体地,该编码器包括用于执行上述第二方面或其各实现方式中的方法的功能单元。
另一方面,本申请实施例提供了一种编解码设备,包括:
处理器,适于实现计算机指令;以及,
计算机可读存储介质,计算机可读存储介质存储有计算机指令,计算机指令适于由处理器加载并执行如上述第一方面至第二方面中的任一方面或其各实现方式中的方法。
另一方面,本申请实施例提供了一种计算机可读存储介质,该计算机可读存储介质存储有计算机指令,该计算机指令被计算机设备的处理器读取并执行时,使得计算机设备执行上述第一方面至第二方面中的任一方面或其各实现方式中的方法。
另一方面,本申请实施例提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述第一方面至第二方面中的任一方面或其各实现方式中的方法。
本申请实施例中,通过至少两种熵解码方式对目标序列的码流进行解码,即通过多种熵解码方式对目标序列的码流进行解码,能够提升编编码的灵活性,有利于均衡编编码性能未匹配像素的编码开销。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本申请实施例提供的一种视频编解码系统的示意性框图;
图2是本申请实施例提供的视频编码器的示意性框图;
图3是本申请实施例提供的解码框架的示意性框图;
图4是本申请实施例提供的一种帧内串复制的示意图;
图5是本申请实施例提供的未匹配像素的编码方法的示意性流程图;
图6是本申请实施例提供的未匹配像素的解码方法的示意性流程图。
图7是本申请实施例提供的编码器的示意性框图。
图8是本申请实施例提供的解码器的示意性框图。
图9是本申请实施例提供的编解码设备的示意性框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的技术不限于任何编解码标准或技术。
本申请实施例提供的方案可应用于数字视频编码技术领域,例如,图像编解码领域、视频编解码领域、硬件视频编解码领域、专用电路视频编解码领域、实时视频编解码领域。或者说,本申请实施例提供的方案可结合至音视频编码标准(Audio Video coding Standard,AVS)、第二代AVS标准(AVS2)或第三代AVS标准(AVS3)。具体包括但不限于H.264/音视频编码(Audio Video coding,AVC)标准、H.265/高效视频编码(High Efficiency Video Coding,HEVC)标准以及H.266/多功能视频编码(Versatile Video Coding,VVC)标准。或者说,本申请实施例提供的方案可结合至其它专属或行业标准,示例性地,具体可以包含ITU-TH.261、ISO/IECMPEG-1Visual、ITU-TH.262或ISO/IECMPEG-2Visual、ITU-TH.263、ISO/IECMPEG-4Visual,ITU-TH.264(也可称为ISO/IECMPEG-4AVC),也包含可分级视频编解码(SVC)及多视图视频编解码(MVC)扩展。
此外,本申请实施例提供的方案可以用于对图像进行有损压缩(lossy compression),也可用于对图像进行无损压缩(lossless compression)。该无损压缩可以是视觉无损压缩(visually lossless compression),也可以是数学无损压缩(mathematically lossless compression)。
为了便于理解,首先结合图1对本申请实施例涉及的视频编解码系统进行介绍。
图1为本申请实施例涉及的一种视频编解码系统100的示意性框图。需要说明的是,图1只是一种示例,本申请实施例的视频编解码系统包括但不限于图1所示。如图1所示,该视频编解码系统100包含编码设备110和解码设备120。其中编码设备用于对视频数据进行编码(可以理解成压缩)产生码流,并将码流传输给解码设备。解码设备对编码设备编码产生的码流进行解码,得到解码后的视频数据。
编码设备110可以理解为具有视频编码功能的设备,解码设备120可以理解为具有视频解码功能的设备,即本申请实施例对编码设备110和解码设备120包括更广泛的装置,例如包含智能手机、台式计算机、移动计算装置、笔记本(例如,膝上型)计算机、平板计算机、机顶盒、电视、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机等。在一些实施例中,编码设备110可以经由信道130将编码后的视频数据(如码流)传输给解码设备120。信道130可以包括能够将编码后的视频数据从编码设备110传输到解码设备120的 一个或多个媒体和/或装置。
在一个实例中,信道130包括使编码设备110能够实时地将编码后的视频数据直接发射到解码设备120的一个或多个通信媒体。在此实例中,编码设备110可根据通信标准来调制编码后的视频数据,且将调制后的视频数据发射到解码设备120。其中通信媒体包含无线通信媒体,例如射频频谱,可选的,通信媒体还可以包含有线通信媒体,例如一根或多根物理传输线。在另一实例中,信道130包括存储介质,该存储介质可以存储编码设备110编码后的视频数据。存储介质包含多种本地存取式数据存储介质,例如光盘、DVD、快闪存储器等。在该实例中,解码设备120可从该存储介质中获取编码后的视频数据。在另一实例中,信道130可包含存储服务器,该存储服务器可以存储编码设备110编码后的视频数据。在此实例中,解码设备120可以从该存储服务器中下载存储的编码后的视频数据。可选的,该存储服务器可以存储编码后的视频数据且可以将该编码后的视频数据发射到解码设备120,例如web服务器(例如,用于网站)、文件传送协议(FTP)服务器等。
在一些实施例中,编码设备110包含视频编码器112及输出接口113。其中,输出接口113可以包含调制器/解调器(调制解调器)和/或发射器。
在一些实施例中,编码设备110除了包括视频编码器112和输入接口113外,还可以包括视频源111。视频源111可包含视频采集装置(例如,视频相机)、视频存档、视频输入接口、计算机图形系统中的至少一个,其中,视频输入接口用于从视频内容提供者处接收视频数据,计算机图形系统用于产生视频数据。
视频编码器112对来自视频源111的视频数据进行编码,产生码流。视频数据可包括一个或多个图像(picture)或图像序列(sequence of pictures)。码流以比特流的形式包含了图像或图像序列的编码信息。编码信息可以包含编码图像数据及相关联数据。相关联数据可包含序列参数集(sequence parameter set,简称SPS)、图像参数集(picture parameter set,简称PPS)及其它语法结构。SPS可含有应用于一个或多个序列的参数。PPS可含有应用于一个或多个图像的参数。语法结构是指码流中以指定次序排列的零个或多个语法元素的集合。
视频编码器112经由输出接口113将编码后的视频数据直接传输到解码设备120。编码后的视频数据还可存储于存储介质或存储服务器上,以供解码设备120后续读取。
在一些实施例中,解码设备120包含输入接口121和视频解码器122。在一些实施例中,解码设备120除包括输入接口121和视频解码器122外,还可以包括显示装置123。其中,输入接口121包含接收器及/或调制解调器。输入接口121可通过信道130接收编码后的视频数据。视频解码器122用于对编码后的视频数据进行解码,得到解码后的视频数据,并将解码后的视频数据传输至显示装置123。
显示装置123显示解码后的视频数据。显示装置123可与解码设备120整合或在解码设备120外部。显示装置123可包括多种显示装置,例如液晶显示器(LCD)、等离子体显示器、有机发光二极管(OLED)显示器或其它类型的显示装置。
需要说明的是,图1仅为本申请的一个实例,本申请实施例的技术方案不限于图1,例如本申请的技术还可以应用于单侧的视频编码或单侧的视频解码。
下面对本申请实施例涉及的视频编码框架进行介绍。
图2是本申请实施例提供的视频编码器200的示意性框图。应理解,该视频编码器200可用于对图像进行有损压缩(lossy compression),也可用于对图像进行无损压缩(lossless  compression)。该无损压缩可以是视觉无损压缩(visually lossless compression),也可以是数学无损压缩(mathematically lossless compression)。
该视频编码器200可应用于亮度色度(YCbCr,YUV)格式的图像数据上。
例如,该视频编码器200读取视频数据,针对视频数据中的每帧图像,将一帧图像划分成若干个编码树单元(coding tree unit,CTU),在一些例子中,CTB可被称作“树型块”、“最大编码单元”(Largest Coding unit,简称LCU)或“编码树型块”(coding tree block,简称CTB)。每一个CTU可以与图像内的具有相等大小的像素块相关联。每一像素可对应一个亮度(luminance或luma)采样及两个色度(chrominance或chroma)采样。因此,每一个CTU可与一个亮度采样块及两个色度采样块相关联。一个CTU大小例如为128×128、64×64、32×32等。一个CTU又可以继续被划分成若干个编码单元(Coding Unit,CU)进行编码,CU可以为矩形块也可以为正方形块。CU可以进一步划分为预测单元(prediction Unit,PU)和变换单元(transform unit,TU),进而使得编码、预测、变换分离,处理的时候更灵活。在一种示例中,CTU以四叉树方式划分为CU,CU以四叉树方式划分为TU、PU。
该视频编码器200可支持各种PU大小。假定特定CU的大小为2N×2N,视频编码器及视频解码器可支持2N×2N或N×N的PU大小以用于帧内预测,且支持2N×2N、2N×N、N×2N、N×N或类似大小的对称PU以用于帧间预测。视频编码器及视频解码器还可支持2N×nU、2N×nD、nL×2N及nR×2N的不对称PU以用于帧间预测。
如图2所示,该视频编码器200可包括:
预测单元210、残差单元220、变换/量化单元230、反变换/量化单元240、重建单元250、环路滤波单元260、解码图像缓存270和熵编码单元280。需要说明的是,视频编码器200可包含更多、更少或不同的功能组件。
预测单元210包括帧间预测单元211和帧内预测单元212。由于视频的一个帧中的相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测的方法消除相邻像素之间的空间冗余。由于视频中的相邻帧之间存在着很强的相似性,在视频编解码技术中使用帧间预测方法消除相邻帧之间的时间冗余,从而提高编码效率。
帧间预测单元211可用于帧间预测,帧间预测可以参考不同帧的图像信息,帧间预测使用运动信息从参考帧中找到参考块,根据参考块生成预测块,用于消除时间冗余;帧间预测所使用的帧可以为P帧和/或B帧,P帧指的是向前预测帧,B帧指的是双向预测帧。运动信息包括参考帧所在的参考帧列表,参考帧索引,以及运动矢量。运动矢量可以是整像素的或者是分像素的,如果运动矢量是分像素的,那么需要再参考帧中使用插值滤波做出所需的分像素的块,这里把根据运动矢量找到的参考帧中的整像素或者分像素的块叫参考块。有的技术会直接把参考块作为预测块,有的技术会在参考块的基础上再处理生成预测块。在参考块的基础上再处理生成预测块也可以理解为把参考块作为预测块然后再在预测块的基础上处理生成新的预测块。最常用的帧间预测方法包括:VVC视频编解码标准中的几何划分模式(geometric partitioning mode,GPM),以及AVS3视频编解码标准中的角度加权预测(angular weighted prediction,AWP)。这两种帧内预测模式在原理上有共通之处。
帧内预测单元212只参考同一帧图像的信息,预测当前码图像块内的像素信息,用于消除空间冗余。帧内预测所使用的帧可以为I帧。HEVC使用的帧内预测模式有平面模式(Planar)、DC和33种角度模式,共35种预测模式。VVC使用的帧内模式有Planar、DC和65种角度模式,共67种预测模式。AVS3使用的帧内模式有DC、Plane、Bilinear和63种角 度模式,共66种预测模式。在一些实施例中,帧内预测单元212可以采用帧内块复制技术和帧内串复制技术实现。
残差单元220可基于CU的像素块及CU的PU的预测块来产生CU的残差块。举例来说,残差单元220可产生CU的残差块,使得残差块中的每一采样具有等于以下两者之间的差的值:CU的像素块中的采样,及CU的PU的预测块中的对应采样。
变换/量化单元230可量化变换系数。变换/量化单元230可基于与CU相关联的量化参数(QP)值来量化与CU的TU相关联的变换系数。视频编码器200可通过调整与CU相关联的QP值来调整应用于与CU相关联的变换系数的量化程度。反变换/量化单元240可分别将逆量化及逆变换应用于量化后的变换系数,以从量化后的变换系数重建残差块。
重建单元250可将重建后的残差块的采样加到预测单元210产生的一个或多个预测块的对应采样,以产生与TU相关联的重建图像块。通过此方式重建CU的每一个TU的采样块,视频编码器200可重建CU的像素块。
环路滤波单元260可执行消块滤波操作以减少与CU相关联的像素块的块效应。在一些实施例中,环路滤波单元260包括去块滤波单元和样点自适应补偿/自适应环路滤波(SAO/ALF)单元,其中去块滤波单元用于去方块效应,SAO/ALF单元用于去除振铃效应。
解码图像缓存270可存储重建后的像素块。帧间预测单元211可使用含有重建后的像素块的参考图像来对其它图像的PU执行帧间预测。另外,帧内预测单元212可使用解码图像缓存270中的重建后的像素块来对在与CU相同的图像中的其它PU执行帧内预测。
熵编码单元280可接收来自变换/量化单元230的量化后的变换系数。熵编码单元280可对量化后的变换系数执行一个或多个熵编码操作以产生熵编码后的数据。
图3是本申请实施例提供的解码框架300的示意性框图。
如图3所示,视频解码器300包含:
熵解码单元310、预测单元320、反量化/变换单元330、重建单元340、环路滤波单元350及解码图像缓存360。需要说明的是,视频解码器300可包含更多、更少或不同的功能组件。
视频解码器300可接收码流。熵解码单元310可解析码流以从码流提取语法元素。作为解析码流的一部分,熵解码单元310可解析码流中的经熵编码后的语法元素。预测单元320、反量化/变换单元330、重建单元340及环路滤波单元350可根据从码流中提取的语法元素来解码视频数据,即产生解码后的视频数据。
预测单元320包括帧内预测单元321和帧间预测单元322。
帧内预测单元321可执行帧内预测以产生PU的预测块。帧内预测单元321可使用帧内预测模式以基于空间相邻PU的像素块来产生PU的预测块。帧内预测单元321还可根据从码流解析的一个或多个语法元素来确定PU的帧内预测模式。
帧间预测单元322可根据从码流解析的语法元素来构造第一参考图像列表(列表0)及第二参考图像列表(列表1)。此外,如果PU使用帧间预测编码,则熵解码单元310可解析PU的运动信息。帧间预测单元322可根据PU的运动信息来确定PU的一个或多个参考块。帧间预测单元322可根据PU的一个或多个参考块来产生PU的预测块。
反量化/变换单元330可逆量化(即,解量化)与TU相关联的变换系数。反量化/变换单元330可使用与TU的CU相关联的QP值来确定量化程度。在逆量化变换系数之后,反量化/变 换单元330可将一个或多个逆变换应用于逆量化变换系数,以便产生与TU相关联的残差块。
重建单元340使用与CU的TU相关联的残差块及CU的PU的预测块以重建CU的像素块。例如,重建单元340可将残差块的采样加到预测块的对应采样以重建CU的像素块,得到重建图像块。
环路滤波单元350可执行消块滤波操作以减少与CU相关联的像素块的块效应。
视频解码器300可将CU的重建图像存储于解码图像缓存360中。视频解码器300可将解码图像缓存360中的重建图像作为参考图像用于后续预测,或者,将重建图像传输给显示装置呈现。
视频编解码的基本流程如下:
在编码端,将一帧图像划分成块,针对当前块,预测单元210使用帧内预测或帧间预测产生当前块的预测块。残差单元220可基于预测块与当前块的原始块计算残差块,即预测块和当前块的原始块的差值,该残差块也可称为残差信息。该残差块经由变换/量化单元230变换与量化等过程,可以去除人眼不敏感的信息,以消除视觉冗余。可选的,经过变换/量化单元230变换与量化之前的残差块可称为时域残差块,经过变换/量化单元230变换与量化之后的时域残差块可称为频率残差块或频域残差块。熵编码单元280接收到变化量化单元230输出的量化后的变化系数,可对该量化后的变化系数进行熵编码,输出码流。例如,熵编码单元280可根据目标上下文模型以及二进制码流的概率信息消除字符冗余。可选的,在本申请中,当前块(current block)可以称为当前编码单元(CU)或当前预测单元(PU)等。预测块也可称为预测图像块或图像预测块,重建图像块也可称为重建块或图像重建图像块。
在解码端,熵解码单元310可解析码流得到当前块的预测信息、量化系数矩阵等,预测单元320基于预测信息对当前块使用帧内预测或帧间预测产生当前块的预测块。反量化/变换单元330使用从码流得到的量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块。重建单元340将预测块和残差块相加得到重建块。重建块组成重建图像,环路滤波单元350基于图像或基于块对重建图像进行环路滤波,得到解码图像。编码端同样需要和解码端类似的操作获得解码图像。该解码图像也可以称为重建图像,重建图像可以为后续的帧作为帧间预测的参考帧。
需要说明的是,编码端确定的块划分信息,以及预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息等在必要时携带在码流中。解码端通过解析码流及根据已有信息进行分析确定与编码端相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。
还应当理解,上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。
上文结合图1至图3对本申请实施例可适用的编解码系统,编码器以及解码器进行了描述,从信号的获取方式看,视频信号可以分为摄像机拍摄到的以及计算机生成的两种方式,由于统计特性的不同,其对应的压缩编码方式也可能有所区别,但都会对输入的原始视频信号,执行如下操作和处理中的一项或多项:
1)、块划分结构(block partition structure):输入图像根据一个的大小,划分成 若干个不重叠的处理单元,每个处理单元将进行类似的压缩操作。这个处理单元被称作CTU,或者LCU。CTU再往下,可以继续进行更加精细的划分,得到一个或多个基本编码的单元,称之为CU。每个CU是一个编码环节中最基本的元素。以下描述的是对每一个CU可能采用的各种编码方式。
2)、预测编码(Predictive Coding):包括了帧内预测和帧间预测等方式,原始视频信号经过选定的已重建视频信号的预测后,得到残差视频信号。编码端需要为当前CU决定在众多可能的预测编码模式中,选择最适合的一种,并告知解码端。
a、帧内预测:预测的信号来自于同一图像内已经编码重建过的区域
b、帧间预测:预测的信号来自已经编码过的,不同于当前图像的其他图像(称之为参考图像)
3)、变换编码及量化(Transform & Quantization):残差视频信号经过DFT,DCT等变换操作,将信号转换到变换域中,称之为变换系数。在变换域中的信号,进一步的进行有损的量化操作,丢失掉一定的信息,使得量化后的信号有利于压缩表达。在一些视频编码标准中,可能有多于一种变换方式可以选择,因此,编码端也需要为当前编码CU选择其中的一种变换,并告知解码端。量化的精细程度通常由量化参数(QP)来决定,QP取值较大大,表示更大取值范围的系数将被量化为同一个输出,因此通常会带来更大的失真,及较低的码率;相反,QP取值较小,表示较小取值范围的系数将被量化为同一个输出,因此通常会带来较小的失真,同时对应较高的码率。
4)、熵编码(Entropy Coding)或统计编码:量化后的变换域信号,将根据各个值出现的频率,进行统计压缩编码,最后输出二值化(0或者1)的压缩码流。同时,编码产生其他信息,例如选择的模式,运动矢量等,也需要进行熵编码以降低码率。统计编码是一种无损编码方式,可以有效的降低表达同样的信号所需要的码率。常见的统计编码方式有变长编码(VLC,Variable Length Coding)或者基于上下文的二进制算术编码(Content Adaptive Binary Arithmetic Coding,CABAC)。
5)、环路滤波(Loop Filtering):已经编码过的图像,经过反量化,反变换及预测补偿的操作(上述2~4的反向操作),可获得重建的解码图像。重建图像与原始图像相比,由于存在量化的影响,部分信息与原始图像有所不同,产生失真(Distortion)。对重建图像进行滤波操作,例如去块效应滤波(deblocking),SAO或者ALF等滤波器,可以有效的降低量化所产生的失真程度。由于这些经过滤波后的重建图像,将做为后续编码图像的参考,用于对将来的信号进行预测,所以上述的滤波操作也被称为环路滤波,及在编码环路内的滤波操作。
在解码过程中,对于每一个CU,解码器获得压缩码流后,先进行熵解码,获得各种模式信息及量化后的变换系数,各个系数经过反量化及反变换,得到残差信号;另一方面,根据已知的编码模式信息,可获得该CU对应的预测信号,两者相加之后,即可得到重建信号。最后,解码图像的重建值,需要经过环路滤波的操作,产生最终的输出信号。
针对主流的视频编码标准,如HEVC,VVC,AVS3,均采用基于块的混合编码框架。将原始的视频数据分成一系列的编码块,结合预测,变换和熵编码等视频编码方法,实现视频数据的压缩。其中,运动补偿是视频编码常用的一类预测方法,运动补偿基于视频内容在时域或空域的冗余特性,从已编码的区域导出当前编码块的预测值。这类预测方法包括:帧间预 测、块复制帧内预测、串复制帧内预测等,在具体的编码实现中,可能单独或组合使用这些预测方法。
下面对串复制帧内预测进行介绍。
1、串复制帧内预测中的普通串预测子模式。
串复制帧内预测技术(Intra String Copy,ISC)按照某种扫描顺序(光栅扫描、往返扫描和Zig-Zag扫描等)将一个编码块分成一系列像素串或未匹配像素。类似于IBC,每个串在当前图像已编码区域中寻找相同形状的参考串,导出当前串的预测值,通过编码当前串像素值与预测值之间残差,代替直接编码像素值,能够有效节省比特。
图4给出了串复制帧内预测的示意图。
如图4所示,深灰色的区域为已编码区域,白色的28个像素为串1,浅灰色的35个像素为串2,黑色的1个像素表示未匹配像素。未匹配像素也称为孤立点,未匹配像素的原始值被直接编码,而不是通过预测值导出。需要说明的是,帧内串复制技术需要编码当前编码块中各个串对应的串矢量(String Vector,SV)、串长度以及是否有匹配串的标志等。串矢量(SV)表示待编码串到其参考串的位移。串长度表示该串所包含的像素数量。如图4所示,串1的参考串在其左侧,通过串矢量1表示串1到其对应的参考串的位移。串2的参考串在其上方,通过串矢量2表示串2到其对应的参考串的位移。
2、串复制帧内预测中的等值串与单位基矢量串子模式。
等值串与单位矢量串模式是串复制帧内预测的一种子模式,在2020年10月被采纳至AVS3标准中。类似于串复制帧内预测,该模式将一个编码/解码块按照某种扫描顺序将划分为一系列的像素串或未匹配像素,像素串的类型可以为等值串或单位基矢量串。等值串的特点在于像素串中所有像素具有相同的预测值。单位矢量串(也称为单位基矢量串,单位偏移串,复制上方串等)的特点在于其位移矢量为(0,-1),该串的每个像素使用上方的像素作为当前像素的预测值。等值串模式需要在码流中对当前编码块各个串的类型,长度和预测值信息进行编码。与普通串预测值子模式一样,未匹配像素的预测值被直接编码,而不是通过预测值导出。
在AVS3标准中,对于串复制帧内预测模式编码块,其未匹配像素的Y,Cb和Cr分量被直接编码,而不是通过预测值导出。未匹配像素各分量均根据其位深进行编码,例如当前图像的位深为10bit,编码一个未匹配像素的Y,Cb,Cr分量则需要3*10bit位符号进行编码,即对于各符号位均采用旁路(bypass)模式进行编码,导致未匹配像素的编码开销过大。
本申请实施例提出一种未匹配像素的编码方法,该方法通过编码端对未匹配像素各分量值进行调整,并为未匹配像素的选择合适的上下文模型进行编码,从而降低未匹配像素的编码开销,提升编码性能。需要说明的是,本申请提供的方法适用于使用了串复制帧内预测模式或其他需要对未匹配像素编码的任意模式(如调色板模式)的编解码器中,本申请实施例对此不作具体限定。
图5是本申请实施例提供的未匹配像素的编码方法400的示意性流程图。该方法400的执行主体可以是如下设备,但不限于此:编码器、或者用于进行块矢量编码的设备,如台式计算机、移动计算装置、笔记本(例如,膝上型)计算机、平板计算机、机顶盒、智能电话等手持机、电视、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机,或其类似者等。例如,图1所示的编码设备110或图2所示的视频编码器200。
如图5所示,该未匹配像素的编码方法400可包括:
S410,获取目标序列,将目标序列中的目标图像帧划分为多个图像块,该多个图像块包括目标图像块;
S420,调整该目标图像块中的未匹配像素的分量值,得到未匹配像素的调整分量值;
S430,通过二值化该未匹配像素的调整分量值,得到该未匹配像素的二元符号串;
S440,通过至少两种熵编码方式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
本申请实施例中,通过多种熵编码方式对目标序列的码流进行编码,能够提升编码的灵活性,有利于均衡编码性能未匹配像素的编码开销。
需要说明的是,本申请涉及的未匹配像素的分量值可以理解为该未匹配像素颜色分量的数值,例如对于YUV图像,该未匹配像素的分量值可以是该未匹配像素的Y分量,U分量或V分量的值。
在一些实施例中,该二元符号串的长度为M,M为大于0的整数,该至少两种熵编码方式包括第一熵编码方式和第二熵编码方式;其中,该S440可包括:
通过对该二元符号串的前预设第一解码位采用该第一熵编码方式进行解码,以及通过对该二元符号串的后预设第二解码位采用该第二熵编码方式进行解码,得到目标图像块中的未匹配像素的二元符号串。其中,后预设第二目标位可以是预先设置好的使用第二熵编码方式进行编码的位数,比如,可以为二元符号串中的后N位,N为大于0的整数,N小于M,则前预设第一目标位可以是预先设置好的使用第一熵编码方式进行编解码的位数,比如,可以为二元符号串中的前M-N位。
换言之,编码端通过两种熵编码方式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
在一些实施例中,该第一熵编码方式为旁路模式,该第二熵编码方式为基于上下文的二进制算术编码CABAC模式。
本申请实施例中,通过至少两种熵编码方式对该未匹配像素的二元符号串进行编码;由于该至少两种熵编码方式构造为该二元符号串的前M-N位采用旁路模式进行熵编码,该二元符号串的后N位采用基于上下文的二进制算术编码CABAC模式进行熵编码;一方面,该二元符号串的前M-N位设计为采用旁路模式进行熵编码,能够保证编码性能,另一方面,该二元符号串的后N位采用基于上下文的二进制算术编码CABAC模式进行熵编码,能够降低未匹配像素的编码开销,基于此,本申请提供的方案能够在保证编码性能的基础上,降低未匹配像素的编码开销。
需要说明的是,旁路模式可以理解为特定概率信息下的CABAC模式,但由于针对该特定概率信息的设定,会导致采用旁路模式进行熵编码时,产生较大的编码开销。
在一些实施例中,该方法400还可包括:
将以下标识中的至少一项,写入该码流;
第一标识、第二标识、第三标识或第四标识;
其中,该第一标识用于指示针对该目标序列中的未匹配像素是否采用该至少两种熵编码方式进行熵编码;该第二标识用于指示针对该目标序列中的目标图像帧中的未匹配像素是否采用该至少两种熵编码方式进行熵编码,该目标图像帧包括该目标图像块;该第三标识用于指示针对该目标图像帧中的目标片中的未匹配像素是否采用该至少两种熵编码方式进行熵编码,该目标片包括该目标图像块;该第四标识用于指示针对该目标图像块中的未匹配像素是 否采用该至少两种熵编码方式进行熵编码。
需要说明的是,该第一标识、该第二标识、该第三标识或该第四标识的具体取值可以在相应的参数集中通过查询的方式获取,也可以通过其他方式获取,本申请实施例对此不作具体限定。例如,参数集中的各个标识的取值可以是提前设置好的参数。例如,在进行编码前,用户可通过设置编码器的参数设置参数集中的各个标识的取值。
在一些实施例中,该至少两种熵编码方式为针对未匹配像素的默认模式;其中,该S440可包括:
通过该默认模式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
换言之,编码器可以通过以下方法中的至少一项,确定目标图像块是否使用本申请提供的针对未匹配像素的编码方法:
a)、对于目标编码块中的未匹配像素默认使用本申请提供的针对未匹配像素的编码方法,即不需要在码流中编码标记。b)、在码流中编码一个序列级的标记,指示该目标序列中所有编码块的未匹配像素均使用本申请提供的针对未匹配像素的编码方法。c)、在码流中编码一个图像级(帧级)的标记,指示该目标图像帧中所有的未匹配像素均使用本申请提供的针对未匹配像素的编码方法。d)、在码流中编码一个片级的标记,指示该目标片中所有的未匹配像素均使用本申请提供的针对未匹配像素的编码方法。e)、在码流中编码一个CU级(即图像块级)的标记,指示该目标图像块中的未匹配像素均使用本申请提供的针对未匹配像素的编码方法。
在一些实施例中,该S420可包括:
将该未匹配像素的分量值进行位移运算,得到未匹配像素的调整分量值。
其中,位移运算可以是对未匹配像素的分量值使用位移运算符进行位移运算,可以是进行左移运算,也可以是进行右移运算。比如,可以使用(val>>N)<<N进行位移运算,得到未匹配像素的调整分量值;其中,val表示调整前的该未匹配像素的分量值,>>表示右移运算符,<<表示左移运算符。
换言之,编码端对未匹配像素的像素进行调整,并为调整后的未匹配像素的分量值的选择合适的上下文模型进行编码;相应的,解码端则选择对应的上下文模型进行解码。
在一些实施例中,该将该未匹配像素的分量值进行位移运算,得到未匹配像素的调整分量值之前,该方法400还可包括:
获取量化参数,基于量化参数,确定后预设第二目标位的取值。
其中,量化参数是进行编解码中进行量化变换时使用的参数。后预设第二目标位是指预先设置好的使用第二熵编码方式进行编码的位数。在一种实现方式中,编码器可以根据针对该目标图像块的量化参数,确定N的取值。例如,该目标图像块的量化参数越大,则N的取值越大或越小。例如,编码器可以根据至少一个量化参数和至少一个数值的对应关系确定N的取值。具体地,编码器可以将该目标图像块对应的数值,确定为该N的取值。
在一些实施例中,该未匹配像素的分量值为色度分量值或亮度分量值,当未匹配像素的分量值为色度分量值在编码时采用的N的取值和当未匹配像素的分量值为亮度分量值在编码时采用的N的取值不同。
在一种实现方式中,记val_Y为调整前的未匹配像素的亮度分量值,编码端可将该未匹配像素的亮度分量值调整为(val_Y>>N1)<<N1;再如,记val_U为调整前的未匹配像素的色度分量值,编码端可将该未匹配像素的色度分量值调整为(val_U>>N2)<<N2。可选的,N1的取 值和N2的取值可以相同,也可以不同,本申请对此不作具体限定。例如,N1的取值小于N2的取值,以有限保证亮度分量的编码性能。
在一些实施例中,该S420可包括:
将该未匹配像素的分量值进行加权运算,得到未匹配像素的调整分量值。
可以使用val/K*K进行加权运算,得到未匹配像素的调整分量值;
其中,val表示调整前的该未匹配像素的分量值,/表示整除运算,*表示乘法运算,K是加权系数。
需要说明的是,本申请实施例对K的取值不作具体限定。例如,K可以等于N,也可以不等于N。再如,K可以是大于0的正整数。
在一些实施例中,该将该未匹配像素的分量值进行加权运算,得到未匹配像素的调整分量值之前,该方法400还可包括:
获取量化参数,基于量化参数,确定加权系数的取值。
在一种实现方式中,编码器可以根据针对该目标图像块的量化参数,确定K的取值。例如,该目标图像块的量化参数越大,则K的取值越大或越小。例如,编码器可以根据至少一个量化参数和至少一个数值的对应关系确定K的取值。具体地,编码器可以将该目标图像块对应的数值,确定为该K的取值。
在一些实施例中,该未匹配像素的分量值为色度分量值或亮度分量值,该色度分量值采用的K的取值和该亮度分量值采用的K的取值不同。
在一种实现方式中,记val_Y为调整前的未匹配像素的亮度分量值,编码端可将该未匹配像素的亮度分量值调整为val_Y/K1*K1;再如,记val_U为调整前的未匹配像素的色度分量值,编码端可将该未匹配像素的色度分量值调整为val_U/K2*K2。可选的,K1的取值和K2的取值可以相同,也可以不同,本申请对此不作具体限定。例如,K1的取值小于K2的取值,以有限保证亮度分量的编码性能。作为一种示例,可以只对亮度分量应用本申请提供的解码方法,即N1不为0,N2为0。作为另一种示例,可以只对色度分量应用本申请提供的解码方法,即N2不为0,N1为0。
需要说明的是,本申请实施例中,N和K可以结合使用,即K可以用于调整该未匹配像素的分量值,N可以用于确定至少两个熵编码方式中每一个熵编码方式对应的二元符号串中的符号位。
下面以采用10bit位深对输入视频进行编码为例,结合具体实施例对本申请提供的编码方法进行说明。
方法1:
a)、记val为当前未匹配像素的亮度或色度分量值,编码端将val的值调整为(val>>2)<<2;
b)、编码端使用定长码的对调整后的val进行二值化,导出长度为10的二元符号串。二元符号串的前8位采用旁路模式的方式进行熵编码,后2位采用CABAC模式的方式进行熵编码。
相应的,解码端从码流中解码10位二元符号串,其中,前8位采用旁路模式的方式进行熵编码,后2位采用CABAC模式的方式进行熵编码;然后按照定长码的方式反二值化得到val的值。
方法2:
a)、编码端根据量化参数设置N的值,对于27,32,38以及45,N的值分别设置为2,2,3,3。
b)、编码端将val的值调整为(val>>N)<<N;
c)、编码端使用定长码的对调整后的val进行二值化,导出长度为10的二元符号串。二元符号串的前10-N位采用旁路模式的方式进行熵编码,后N位采用CABAC模式的方式进行熵编码。
相应的,解码端从码流中解码10位二元符号串,其中,前10-N位采用旁路模式的方式进行熵编码,后N位采用CABAC模式的方式进行熵编码;然后按照定长码的方式反二值化得到val的值。
方法3:
a)、编码端根据量化参数设置M的值,对于27,32,38,45,M的值分别设置为2,2,3,3。
b)、记val的值为当前未匹配像素的亮度或色度分量值,如果val为亮度分量值,编码端设置N=M,否则如果val为色度分量值,编码端设置N=M+1;
c)、编码端将val的值调整为(val>>N)<<N;
d)、编码端使用定长码的方式对调整后的val进行二值化,导出长度为10的二元符号串。二元符号串的前10-N位采用旁路模式的方式进行熵编码,后N位采用CABAC模式的方式进行熵编码。
相应的,解码端从码流中解码10位二元符号串,其中,前10-N位采用旁路模式的方式进行熵编码,后N位采用CABAC模式的方式进行熵编码;然后按照定长码的方式反二值化得到val的值。
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。
还应理解,在本申请的各种方法实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。另外,本申请实施例中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系。具体地,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
上文结合图5详细说明了本申请实施例提供的未匹配像素的编码方法,下面结合图6对本申请实施例提供的未匹配像素的解码方法进行说明。
图6是本申请实施例提供的未匹配像素的解码方法500的示意性流程图。该方法500的执行主体可以是如下设备,但不限于此:解码器、或者用于进行块矢量解码的设备,如台式计算机、移动计算装置、笔记本(例如,膝上型)计算机、平板计算机、机顶盒、智能电话等手持机、电视、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机,或其类 似者等。例如,图1所示的解码设备120或图3所示的视频解码器300。
如图6所示,该未匹配像素的解码方法500可包括:
S510,获取目标序列的码流,通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串,目标图像块是将目标序列中的目标图像帧划分得到的;
S520,通过对该二元符号串进行反二值化,得到该未匹配像素的分量值;
S530,基于该未匹配像素的分量值,得到该目标图像块。
在一些实施例中,该二元符号串的长度为M,M为大于0的整数,该至少两种熵解码方式包括第一熵解码方式和第二熵解码方式;其中,该S510可包括:
通过对该二元符号串的前预设第一目标位采用该第一熵解码方式进行解码,以及通过对该二元符号串的后预设第二目标位采用该第二熵解码方式进行解码,得到目标图像块中的未匹配像素的二元符号串,所述前预设第一目标位和所述后预设第二目标位的总和为所述二元符号串的长度。
在一些实施例中,该第一熵解码方式为旁路模式,该第二熵解码方式为基于上下文的二进制算术编码CABAC模式。
本申请实施例中,通过至少两种熵解码方式对目标序列的码流进行解码,由于该至少两种熵解码方式构造为该二元符号串的前M-N位采用旁路模式进行熵解码,该二元符号串的后N位采用基于上下文的二进制算术编码CABAC模式进行熵解码;一方面,该二元符号串的前M-N位设计为采用旁路模式进行熵解码,能够保证解码性能,另一方面,该二元符号串的后N位采用基于上下文的二进制算术编码CABAC模式进行熵解码,能够降低未匹配像素的编码开销,基于此,本申请提供的方案能够在保证编码性能的基础上,降低未匹配像素的编码开销。
在一些实施例中,该通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串之前,该方法500还可包括:
通过对目标序列的码流进行解析,得到以下标识中的至少一项:
第一标识、第二标识、第三标识或第四标识;
其中,该第一标识用于指示针对该目标序列中的未匹配像素是否采用该至少两种熵解码方式进行熵解码;该第二标识用于指示针对该目标序列中的目标图像帧中的未匹配像素是否采用该至少两种熵解码方式进行熵解码,该目标图像帧包括该目标图像块;该第三标识用于指示针对该目标图像帧中的目标片中的未匹配像素是否采用该至少两种熵解码方式进行熵解码,该目标片包括该目标图像块;该第四标识用于指示针对该目标图像块中的未匹配像素是否采用该至少两种熵解码方式进行熵解码;其中,该S510可包括:
若该第一标识、该第二标识、该第三标识或该第三标识用于指示采用该至少两种熵解码方式进行熵解码,则通过该至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串。
在一些实施例中,该至少两种熵解码方式为针对未匹配像素的默认模式;其中,该S510可包括:
通过该默认模式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串。
换言之,解码器可以通过以下方法中的至少一项,确定目标图像块是否使用本申请提供的针对未匹配像素的解码方法:
a)、对于目标解码块中的未匹配像素默认使用本申请提供的针对未匹配像素的解码方法,即不需要在码流中解码标记。b)、在码流中解码一个序列级的标记,指示该目标序列中所有解码块的未匹配像素均使用本申请提供的针对未匹配像素的解码方法。c)、在码流中解码一个图像级(帧级)的标记,指示该目标图像帧中所有的未匹配像素均使用本申请提供的针对未匹配像素的解码方法。d)、在码流中解码一个片级的标记,指示该目标片中所有的未匹配像素均使用本申请提供的针对未匹配像素的解码方法。e)、在码流中解码一个CU级(即图像块级)的标记,指示该目标图像块中的未匹配像素均使用本申请提供的针对未匹配像素的解码方法。
在一些实施例中,该通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串之前,该方法500还可包括:
获取量化参数,基于量化参数,确定后预设第二目标位的取值。
在一种实现方式中,解码器可以根据针对该目标图像块的量化参数,确定N的取值。例如,该目标图像块的量化参数越大,则N的取值越大或越小。例如,解码器可以根据至少一个量化参数和至少一个数值的对应关系确定N的取值。具体地,解码器可以将该目标图像块对应的数值,确定为该N的取值。
在一些实施例中,该未匹配像素的分量值为色度分量值或亮度分量值,当未匹配像素的分量值为色度分量值,在进行解码时使用的N的取值与当未匹配像素的分量值为亮度分量值时进行解码时使用的N的取值不同。
在一种实现方式中,记val_Y为调整前的未匹配像素的亮度分量值,编码端可将该未匹配像素的亮度分量值调整为(val_Y>>N1)<<N1;再如,记val_U为调整前的未匹配像素的色度分量值,编码端可将该未匹配像素的色度分量值调整为(val_U>>N2)<<N2。可选的,N1的取值和N2的取值可以相同,也可以不同,本申请对此不作具体限定。例如,N1的取值小于N2的取值,以有限保证亮度分量的编码性能。作为一种示例,可以只对亮度分量应用本申请提供的解码方法,即N1不为0,N2为0。作为另一种示例,可以只对色度分量应用本申请提供的解码方法,即N2不为0,N1为0。
需要说明的是,解码过程为编码过程的反向操作,解码方法500可参考编码方法400的相关描述,为避免重复,此处不再赘述。
上文结合图5至图6,详细描述了本申请的方法实施例,下文结合图7至图9,详细描述本申请的装置实施例。
图7是本申请实施例提供的编码器600的示意性流程图。
如图7所示,该编码器600可包括:
划分单元610,用于获取目标序列,将目标序列中的目标图像帧划分为多个图像块,该多个图像块包括目标图像块;
调整单元620,用于调整该目标图像块中的未匹配像素的分量值,得到所述未匹配像素的调整分量值;
二值化单元630,用于通过二值化该未匹配像素的调整分量值,得到该未匹配像素的二元符号串;
编码单元640,用于通过至少两种熵编码方式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
本申请实施例中,通过至少两种熵编码方式对目标序列的码流进行解码,即通过多种熵编码方式对目标序列的码流进行编码,能够提升编编码的灵活性,有利于均衡编码性能未匹配像素的编码开销。
在一些实施例中,该至少两种熵编码方式包括第一熵编码方式和第二熵编码方式;其中,该编码单元640具体用于:
通过对该二元符号串的前预设第一目标位采用该第一熵编码方式进行编码,以及通过对该二元符号串的后预设第二目标位采用该第二熵编码方式进行编码,得到该目标序列的码流,前预设第一目标位和后预设第二目标位的总和为该二元符号串的长度。
在一些实施例中,该第一熵编码方式为旁路模式,该第二熵编码方式为基于上下文的二进制算术编码模式。
在一些实施例中,该编码单元640还用于:
将以下标识中的至少一项,写入该码流;
第一标识、第二标识、第三标识或第四标识;
其中,该第一标识用于指示针对该目标序列中的未匹配像素是否采用该至少两种熵编码方式进行熵编码;该第二标识用于指示针对该目标序列中的目标图像帧中的未匹配像素是否采用该至少两种熵编码方式进行熵编码,该目标图像帧包括该目标图像块;该第三标识用于指示针对该目标图像帧中的目标片中的未匹配像素是否采用该至少两种熵编码方式进行熵编码,该目标片包括该目标图像块;该第四标识用于指示针对该目标图像块中的未匹配像素是否采用该至少两种熵编码方式进行熵编码。
在一些实施例中,该至少两种熵编码方式为针对未匹配像素的默认模式;其中,该编码单元640具体用于:
通过该默认模式对该未匹配像素的二元符号串进行编码,得到该目标序列的码流。
在一些实施例中,该调整单元620具体用于:
将该未匹配像素的分量值进行位移运算,得到所述未匹配像素的调整分量值。
在一些实施例中,该将该未匹配像素的分量值进行位移运算,得到所述未匹配像素的调整分量值之前,该调整单元620还用于:
获取量化参数,基于量化参数,确定后预设第二目标位的取值。
在一些实施例中,该未匹配像素的分量值为色度分量值或亮度分量值,
在一些实施例中,该S420可包括:
将该未匹配像素的分量值进行加权运算,得到未匹配像素的调整分量值。
图8是本申请实施例提供的解码器700的示意性流程图。
如图8所示,该解码器700可包括:
解码单元710,用于获取目标序列的码流,通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串,目标图像块是将目标序列中的目标图像帧划分得到的;
反二值化单元720,用于通过对该二元符号串进行反二值化,得到该未匹配像素的分量值;
处理单元730,用于基于该未匹配像素的分量值,得到该目标图像块。
在一些实施例中,该至少两种熵解码方式包括第一熵解码方式和第二熵解码方式;其中,该解码单元710具体用于:
通过对该二元符号串的前预设第一目标位采用该第一熵解码方式进行解码,以及通过对该二元符号串的后预设第二目标位采用该第二熵解码方式进行解码,得到目标图像块中的未匹配像素的二元符号串,前预设第一目标位和后预设第二目标位的总和为二元符号串的长度。
在一些实施例中,该第一熵解码方式为旁路模式,该第二熵解码方式为基于上下文的二进制算术编码模式。
在一些实施例中,该通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串之前,该解码单元710还用于:
通过对目标序列的码流进行解析,得到以下标识中的至少一项:
第一标识、第二标识、第三标识或第四标识;
其中,该第一标识用于指示针对该目标序列中的未匹配像素是否采用该至少两种熵解码方式进行熵解码;该第二标识用于指示针对该目标序列中的目标图像帧中的未匹配像素是否采用该至少两种熵解码方式进行熵解码,该目标图像帧包括该目标图像块;该第三标识用于指示针对该目标图像帧中的目标片中的未匹配像素是否采用该至少两种熵解码方式进行熵解码,该目标片包括该目标图像块;该第四标识用于指示针对该目标图像块中的未匹配像素是否采用该至少两种熵解码方式进行熵解码;其中,该解码单元710具体用于:
若该第一标识、该第二标识、该第三标识或该第三标识用于指示采用该至少两种熵解码方式进行熵解码,则通过该至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串。
在一些实施例中,该至少两种熵解码方式为针对未匹配像素的默认模式;其中,该解码单元710具体用于:
通过该默认模式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串。
在一些实施例中,该通过至少两种熵解码方式对目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串之前,该该解码单元710还用于:
获取量化参数,基于量化参数,确定后预设第二目标位的取值。
在一些实施例中,该未匹配像素的分量值为色度分量值或亮度分量值,该色度分量值采用的N的取值和该亮度分量值采用的N的取值不同。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图7所示的编码器600可以对应于执行本申请实施例的方法400中的相应主体,即编码器600中的各个单元的前述和其它操作和/或功能分别为了实现方法400等各个方法中的相应流程。图8所示的解码器700可以对应于执行本申请实施例的方法500中的相应主体,并且解码器700中的各个单元的前述和其它操作和/或功能分别为了实现方法500等各个方法中的相应流程。
还应当理解,本申请实施例涉及的编码器600或解码器700中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,该编码器600或解码器700也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。根据本申请的另一个实施例,可以通过在包括例如中央处理单 元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的通用计算机的通用计算设备上运行能够执行相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造本申请实施例涉及的编码器600或解码器700,以及来实现本申请实施例的未匹配像素的编码方法或未匹配像素的解码方法。计算机程序可以记载于例如计算机可读存储介质上,并通过计算机可读存储介质装载于电子设备中,并在其中运行,来实现本申请实施例的相应方法。
换言之,上文涉及的单元可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过软硬件结合的形式实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件组合执行完成。可选地,软件可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图9是本申请实施例提供的编解码设备800的示意结构图。
如图9所示,该编解码设备800至少包括处理器710以及计算机可读存储介质720。其中,处理器710以及计算机可读存储介质720可通过总线或者其它方式连接。计算机可读存储介质720用于存储计算机程序721,计算机程序721包括计算机指令,处理器710用于执行计算机可读存储介质720存储的计算机指令。处理器710是编解码设备800的计算核心以及控制核心,其适于实现一条或多条计算机指令,具体适于加载并执行一条或多条计算机指令从而实现相应方法流程或相应功能。
作为示例,处理器710也可称为中央处理器(CentralProcessingUnit,CPU)。处理器710可以包括但不限于:通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
作为示例,计算机可读存储介质720可以是高速RAM存储器,也可以是非不稳定的存储器(Non-VolatileMemory),例如至少一个磁盘存储器;可选的,还可以是至少一个位于远离前述处理器710的计算机可读存储介质。具体而言,计算机可读存储介质720包括但不限于:易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在一种实现方式中,该编解码设备800可以是本申请实施例涉及的编码端、编码器或编码框架;该计算机可读存储介质720中存储有第一计算机指令;由处理器710加载并执行计 算机可读存储介质720中存放的第一计算机指令,以实现本申请实施例提供的未匹配像素的编码方法中的相应步骤;换言之,计算机可读存储介质720中的第一计算机指令由处理器710加载并执行相应步骤,为避免重复,此处不再赘述。
在一种实现方式中,该编解码设备800可以是本申请实施例涉及的解码端、解码器或解码框架;该计算机可读存储介质720中存储有第二计算机指令;由处理器710加载并执行计算机可读存储介质720中存放的第二计算机指令,以实现本申请实施例提供的未匹配像素的解码方法中的相应步骤;换言之,计算机可读存储介质720中的第二计算机指令由处理器710加载并执行相应步骤,为避免重复,此处不再赘述。
根据本申请的另一方面,本申请实施例还提供了一种计算机可读存储介质(Memory),计算机可读存储介质是编解码设备800中的记忆设备,用于存放程序和数据。例如,计算机可读存储介质720。可以理解的是,此处的计算机可读存储介质720既可以包括编解码设备800中的内置存储介质,当然也可以包括编解码设备800所支持的扩展存储介质。计算机可读存储介质提供存储空间,该存储空间存储了编解码设备800的操作系统。并且,在该存储空间中还存放了适于被处理器710加载并执行的一条或多条的计算机指令,这些计算机指令可以是一个或多个的计算机程序721(包括程序代码)。
根据本申请的另一方面,提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。例如,计算机程序721。此时,编解码设备800可以是计算机,处理器710从计算机可读存储介质720读取该计算机指令,处理器710执行该计算机指令,使得该计算机执行上述各种可选方式中提供的未匹配像素的编码方法或未匹配像素的解码方法。
换言之,当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地运行本申请实施例的流程或实现本申请实施例的功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质进行传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元以及流程步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
最后需要说明的是,以上内容,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (20)

  1. 一种未匹配像素的解码方法,由处理器执行,其特征在于,包括:
    获取目标序列的码流,通过至少两种熵解码方式对所述目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串,所述目标图像块是将所述目标序列中的目标图像帧划分得到的;
    通过对所述二元符号串进行反二值化,得到所述未匹配像素的分量值;
    基于所述未匹配像素的分量值,得到所述目标图像块。
  2. 根据权利要求1所述的方法,其特征在于,所述至少两种熵解码方式包括第一熵解码方式和第二熵解码方式;
    其中,所述通过至少两种熵解码方式对所述目标序列的码流进行解码,得到目标图像块中的未匹配像素的二元符号串,包括:
    通过对所述二元符号串的前预设第一目标位采用所述第一熵解码方式进行解码,以及通过对所述二元符号串的后预设第二目标位采用所述第二熵解码方式进行解码,得到目标图像块中的未匹配像素的二元符号串,所述预设第一目标位和所述预设第二目标位的总和为所述二元符号串的长度。
  3. 根据权利要求2所述的方法,其特征在于,所述第一熵解码方式为旁路模式,所述第二熵解码方式为基于上下文的二进制算术编码模式。
  4. The method according to claim 1, wherein before the decoding the bitstream of the target sequence by using at least two entropy decoding modes to obtain a binary symbol string of an unmatched pixel in a target image block, the method further comprises:
    parsing the bitstream of the target sequence to obtain at least one of the following identifiers:
    a first identifier, a second identifier, a third identifier, or a fourth identifier;
    wherein the first identifier indicates whether the at least two entropy decoding modes are used for entropy decoding of unmatched pixels in the target sequence; the second identifier indicates whether the at least two entropy decoding modes are used for entropy decoding of unmatched pixels in a target image frame in the target sequence, the target image frame comprising the target image block; the third identifier indicates whether the at least two entropy decoding modes are used for entropy decoding of unmatched pixels in a target slice in the target image frame, the target slice comprising the target image block; and the fourth identifier indicates whether the at least two entropy decoding modes are used for entropy decoding of unmatched pixels in the target image block; and
    the decoding the bitstream of the target sequence by using at least two entropy decoding modes to obtain a binary symbol string of an unmatched pixel in a target image block comprises:
    in a case that the first identifier, the second identifier, the third identifier, or the fourth identifier indicates that the at least two entropy decoding modes are used for entropy decoding, decoding the bitstream of the target sequence by using the at least two entropy decoding modes to obtain the binary symbol string of the unmatched pixel in the target image block.
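[Editorial illustration, not part of the claims] Claim 4 only requires that at least one of the four identifiers be parsed; it does not say how they interact when several are present. The sketch below assumes, purely for illustration, that the most local parsed identifier takes precedence.

```python
# Hypothetical resolution of the sequence/frame/slice/block identifiers.
# The precedence order is an assumption, not claimed behavior.

def use_two_entropy_modes(seq_flag=None, pic_flag=None,
                          slice_flag=None, block_flag=None):
    for flag in (block_flag, slice_flag, pic_flag, seq_flag):
        if flag is not None:       # most local parsed identifier wins (assumed)
            return bool(flag)
    return False                   # no identifier parsed from the bitstream

print(use_two_entropy_modes(seq_flag=1, block_flag=0))   # False
```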
  5. The method according to claim 1, wherein the at least two entropy decoding modes are a default mode for unmatched pixels; and
    the decoding the bitstream of the target sequence by using at least two entropy decoding modes to obtain a binary symbol string of an unmatched pixel in a target image block comprises:
    decoding the bitstream of the target sequence by using the default mode to obtain the binary symbol string of the unmatched pixel in the target image block.
  6. The method according to claim 2 or 3, wherein before the decoding the bitstream of the target sequence by using at least two entropy decoding modes to obtain a binary symbol string of an unmatched pixel in a target image block, the method further comprises:
    obtaining a quantization parameter, and determining a value of the preset second target bits based on the quantization parameter.
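[Editorial illustration, not part of the claims] Claim 6 derives the value of the preset second target bits from a quantization parameter but does not fix the mapping; the threshold and values below are made-up examples.

```python
# Illustrative only: a hypothetical QP-to-bits mapping for claim 6.

def preset_second_target_bits(qp, string_length=8):
    k2 = 2 if qp < 32 else 3       # assumed threshold, not from the claims
    return min(k2, string_length)  # never exceed the string length

print(preset_second_target_bits(27), preset_second_target_bits(40))   # 2 3
```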
  7. The method according to any one of claims 1 to 5, wherein the component value of the unmatched pixel is a chroma component value or a luma component value.
  8. A coding method for an unmatched pixel, performed by a processor, the method comprising:
    obtaining a target sequence, and partitioning a target image frame in the target sequence into a plurality of image blocks, the plurality of image blocks comprising a target image block;
    adjusting a component value of an unmatched pixel in the target image block to obtain an adjusted component value of the unmatched pixel;
    binarizing the adjusted component value of the unmatched pixel to obtain a binary symbol string of the unmatched pixel; and
    encoding the binary symbol string of the unmatched pixel by using at least two entropy coding modes to obtain a bitstream of the target sequence.
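[Editorial illustration, not part of the claims] The encoding flow of claim 8 mirrors claim 1: partition, adjust, binarize, entropy-encode. In the sketch below the adjustment, binarization, and entropy-coding stages are caller-supplied placeholders, so only the pipeline shape is being illustrated.

```python
# Hypothetical pipeline for claim 8; each stage is a toy stand-in.

def encode_unmatched_pixels(component_values, adjust, binarize, entropy_encode):
    bitstream = []
    for value in component_values:          # unmatched pixels in the block
        adjusted = adjust(value)            # adjust the component value
        bins = binarize(adjusted)           # binarize into a binary symbol string
        bitstream += entropy_encode(bins)   # at least two entropy coding modes
    return bitstream

bits = encode_unmatched_pixels(
    [180, 45],
    adjust=lambda v: v >> 2,                                    # cf. claim 9
    binarize=lambda v: [(v >> (7 - i)) & 1 for i in range(8)],  # 8-bin example
    entropy_encode=lambda bins: bins)                           # coder stand-in
print(len(bits))   # 16
```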
  9. The method according to claim 8, wherein the adjusting a component value of an unmatched pixel in the target image block to obtain an adjusted component value of the unmatched pixel comprises:
    performing a shift operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel.
  10. The method according to claim 9, wherein before the performing a shift operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel, the method further comprises:
    obtaining a quantization parameter, and determining a value of preset second target bits of a binary symbol string based on the quantization parameter.
  11. The method according to claim 8, wherein the adjusting a component value of an unmatched pixel in the target image block to obtain an adjusted component value of the unmatched pixel comprises:
    performing a weighting operation on the component value of the unmatched pixel to obtain the adjusted component value of the unmatched pixel.
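[Editorial illustration, not part of the claims] Claims 9 and 11 recite two alternative adjustments, by shift and by weighting; the concrete shift amount and weight below are assumptions that the claims leave open.

```python
# Illustrative adjustments for claims 9 and 11 (parameters are assumed).

def adjust_by_shift(value, shift=2):
    return value >> shift              # claim 9: bit-shift operation

def adjust_by_weight(value, weight=0.75):
    return int(value * weight)         # claim 11: weighting operation

print(adjust_by_shift(200), adjust_by_weight(200))   # 50 150
```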
  12. The method according to claim 8, wherein the at least two entropy coding modes comprise a first entropy coding mode and a second entropy coding mode; and
    the encoding the binary symbol string of the unmatched pixel by using at least two entropy coding modes to obtain a bitstream of the target sequence comprises:
    encoding preset first target bits at the beginning of the binary symbol string of the unmatched pixel by using the first entropy coding mode, and encoding preset second target bits at the end of the binary symbol string of the unmatched pixel by using the second entropy coding mode, to obtain the bitstream of the target sequence.
  13. The method according to claim 12, wherein the first entropy coding mode is a bypass mode, and the second entropy coding mode is a context-based binary arithmetic coding mode.
  14. The method according to claim 8, further comprising:
    writing at least one of the following identifiers into the bitstream of the target sequence:
    a first identifier, a second identifier, a third identifier, or a fourth identifier;
    wherein the first identifier indicates whether the at least two entropy coding modes are used for entropy coding of unmatched pixels in the target sequence; the second identifier indicates whether the at least two entropy coding modes are used for entropy coding of unmatched pixels in the target image frame in the target sequence, the target image frame comprising the target image block; the third identifier indicates whether the at least two entropy coding modes are used for entropy coding of unmatched pixels in a target slice in the target image frame, the target slice comprising the target image block; and the fourth identifier indicates whether the at least two entropy coding modes are used for entropy coding of unmatched pixels in the target image block.
  15. The method according to claim 8, wherein the at least two entropy coding modes are a default mode for unmatched pixels; and
    the encoding the binary symbol string of the unmatched pixel by using at least two entropy coding modes to obtain a bitstream of the target sequence comprises:
    encoding the binary symbol string of the unmatched pixel by using the default mode for unmatched pixels to obtain the bitstream of the target sequence.
  16. A decoder, comprising:
    a decoding unit, configured to obtain a bitstream of a target sequence, and decode the bitstream of the target sequence by using at least two entropy decoding modes to obtain a binary symbol string of an unmatched pixel in a target image block, the target image block being obtained by partitioning a target image frame in the target sequence;
    an inverse binarization unit, configured to perform inverse binarization on the binary symbol string to obtain a component value of the unmatched pixel; and
    a processing unit, configured to obtain the target image block based on the component value of the unmatched pixel.
  17. An encoder, comprising:
    a partitioning unit, configured to obtain a target sequence, and partition a target image frame in the target sequence into a plurality of image blocks, the plurality of image blocks comprising a target image block;
    an adjustment unit, configured to adjust a component value of an unmatched pixel in the target image block to obtain an adjusted component value of the unmatched pixel;
    a binarization unit, configured to binarize the adjusted component value of the unmatched pixel to obtain a binary symbol string of the unmatched pixel; and
    an encoding unit, configured to encode the binary symbol string of the unmatched pixel by using at least two entropy coding modes to obtain a bitstream of the target sequence.
  18. An electronic device, comprising:
    a processor, adapted to execute a computer program; and
    a computer-readable storage medium storing a computer program, the computer program, when executed by the processor, implementing the decoding method according to any one of claims 1 to 7 or the coding method according to any one of claims 8 to 11.
  19. A computer-readable storage medium, configured to store a computer program, the computer program causing a computer to perform the decoding method according to any one of claims 1 to 7 or the coding method according to any one of claims 8 to 11.
  20. A computer program product or computer program, comprising computer instructions stored in a computer-readable storage medium, wherein a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the decoding method according to any one of claims 1 to 7 or the coding method according to any one of claims 8 to 11.
PCT/CN2022/075554 2021-03-13 2022-02-08 Decoding method and coding method for unmatched pixel, decoder, and encoder WO2022193868A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/970,462 US20230042484A1 (en) 2021-03-13 2022-10-20 Decoding method and coding method for unmatched pixel, decoder, and encoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110272823.XA 2021-03-13 2021-03-13 Decoding method and coding method for unmatched pixel, decoder, and encoder
CN202110272823.X 2021-03-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/970,462 Continuation US20230042484A1 (en) 2021-03-13 2022-10-20 Decoding method and coding method for unmatched pixel, decoder, and encoder

Publications (1)

Publication Number Publication Date
WO2022193868A1 (zh)

Family

ID=83240953

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075554 WO2022193868A1 (zh) 2021-03-13 2022-02-08 未匹配像素的解码方法、编码方法、解码器以及编码器

Country Status (3)

Country Link
US (1) US20230042484A1 (zh)
CN (1) CN115086664A (zh)
WO (1) WO2022193868A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116527904B (zh) * 2023-07-03 2023-09-12 Peng Cheng Laboratory Entropy coding method, entropy decoding method, and related apparatus

Citations (4)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130177069A1 (en) * 2012-01-09 2013-07-11 Texas Instruments Incorporated Context Adaptive Binary Arithmetic Coding (CABAC) with Scalable Throughput and Coding Efficiency
CN106416246A (zh) * 2014-06-20 2017-02-15 HFI Innovation Inc. Method and apparatus of binarization and context-adaptive coding of syntax in video coding
EP3709657A1 (en) * 2019-03-11 2020-09-16 InterDigital VC Holdings, Inc. Reducing the number of regular coded bins
WO2021025485A1 (ko) * 2019-08-06 2021-02-11 Hyundai Motor Company Entropy coding for video encoding and decoding

Also Published As

Publication number Publication date
CN115086664A (zh) 2022-09-20
US20230042484A1 (en) 2023-02-09

Similar Documents

Publication Publication Date Title
TWI755394B (zh) Binarization of secondary transform index
AU2016219428B2 (en) Restriction on palette block size in video coding
US9924175B2 (en) Determining application of deblocking filtering to palette coded blocks in video coding
EP2952000B1 (en) Unification of signaling lossless coding mode and pulse code modulation (pcm) mode in video coding
CN113115047B (zh) Video encoding and decoding method and device
JP2017532896A (ja) Reducing parsing dependencies for palette index coding
WO2016057929A1 (en) Palette run hiding in palette-based video coding
CA2952629A1 (en) Method for palette mode coding
EP3808084A1 (en) Trellis coded quantization coefficient coding
US11140418B2 (en) Block-based adaptive loop filter design and signaling
JP2022510145A (ja) Regular coded bin reduction for coefficient decoding using threshold and Rice parameter
WO2019246091A1 (en) Coefficient coding with grouped bypass bins
WO2022193868A1 (zh) Decoding method and coding method for unmatched pixel, decoder, and encoder
US20160150234A1 (en) Palette mode coding
EP4035371A1 (en) Arithmetic coder byte stuffing signaling for video coding
WO2023236113A1 (zh) Video encoding and decoding method, apparatus, device, and system, and storage medium
WO2022193389A1 (zh) Video encoding and decoding method and system, and video codec
WO2022174475A1 (zh) Video encoding and decoding method and system, and video encoder and video decoder
WO2022188239A1 (zh) Coefficient encoding and decoding method, encoder, decoder, and computer storage medium
WO2022193390A1 (zh) Video encoding and decoding method and system, and video codec
WO2022217447A1 (zh) Video encoding and decoding method and system, and video codec
WO2022116054A1 (zh) Image processing method and system, video encoder, and video decoder
WO2022193394A1 (zh) Coefficient encoding and decoding method, encoder, decoder, and computer storage medium
WO2023138562A1 (zh) Image decoding method, image encoding method, and corresponding apparatuses
JP2022537662A (ja) Clipping index coding for adaptive loop filter in video encoding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22770225

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.01.2024)