WO2024051329A1 - Image encoding and decoding method, apparatus, encoder, decoder and system
- Publication number
- WO2024051329A1 (PCT/CN2023/105320)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- bits
- coding unit
- bit stream
- image
- encoding
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Definitions
- the present application relates to the field of multimedia, and in particular, to an image encoding and decoding method, device, encoder, decoder and system.
- the encoder performs coding operations such as prediction, quantization, and entropy coding on image frames to obtain a bit stream.
- the decoder performs decoding operations such as entropy decoding, inverse quantization, and predictive reconstruction on the bit stream to obtain the reconstructed image of the image frame.
- the more redundant information about the image frame the bit stream contains, the more bits the bit stream has. Therefore, how to determine the quantization parameters used to encode and decode images, so as to reduce the number of encoding bits after encoding the image while ensuring the quality of the reconstructed image, is an urgent problem to be solved.
- the present application provides an image encoding and decoding method, device, encoder, decoder and system, whereby, by reasonably determining the quantization parameters used to encode and decode images, the number of encoding bits after encoding the image can be reduced while ensuring the quality of the reconstructed image.
- in a first aspect, an image decoding method includes: when decoding the bit stream of a coding unit to be decoded in the image bit stream, determining the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and decoding the bit stream of the coding unit according to the quantization parameter determined from the target number of bits.
- the image content is used to indicate the complexity of different pixel areas in the coding unit.
- the bit stream buffer is used to store the number of coded bits decoded by one or more coding units.
- the target number of bits of the coding unit is used to indicate the expected number of bits after lossy encoding of the coding unit with reference to the image content of the coding unit.
- if the complexity of the coding unit is higher, the image contains more information, that is, there is less repeated information; conversely, if the complexity of the coding unit is lower, the image contains less information, that is, there is more repeated information.
- two factors, the image content of the coding unit and the number of bits of data in the bit stream buffer, are taken into consideration. A smaller target number of bits tends to be set for coding units with lower complexity, and a larger target number of bits tends to be set for coding units with higher complexity. That is, by referring to the complexity of the image content expressed by the coding unit and the number of bits of the already coded coding units in the bit stream buffer, the number of encoding bits after encoding the image is reduced as much as possible on the premise of ensuring the quality of the reconstructed image.
- decoding is the inverse process of encoding a coding unit; that is, when decoding the bit stream of the coding unit, the two factors of the image content of the coding unit and the number of bits of data in the bit stream buffer are likewise taken into consideration, and the bit stream of the coding unit is decoded according to the quantization parameter determined from the target number of bits. By improving the accuracy of code rate control, the number of coding bits after encoding the image is reduced as much as possible while ensuring the quality of the reconstructed image.
- for example, when the bit stream of the coding unit has a small number of bits and the image content of the coding unit to be encoded is complex, then, on the premise of ensuring a constant code rate, the number of bits in the bit stream of the coding unit can be reasonably increased, and the value of the quantization parameter can be smaller to improve the quality of the reconstructed image.
- conversely, the value of the quantization parameter can be larger to reasonably reduce the number of bits in the bit stream of the coding unit.
- the method further includes: after determining the quantization parameter, decoding the bit stream of the coding unit according to the quantization parameter to obtain a reconstructed image of the coding unit.
- in a second aspect, an image coding method includes: when coding a coding unit to be coded in the current frame, determining the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and coding the coding unit according to the quantization parameter determined from the target number of bits of the coding unit.
- the image content is used to indicate the complexity of different pixel areas in the coding unit.
- the bitstream buffer is used to store the bitstream or part of the bitstream for one or more coding units.
- the target number of bits of the coding unit is used to indicate the expected number of bits after lossy encoding of the coding unit with reference to the image content of the coding unit.
- if the complexity of the coding unit is higher, the image contains more information, that is, there is less repeated information; conversely, if the complexity of the coding unit is lower, the image contains less information, that is, there is more repeated information.
- two factors, the image content of the coding unit and the number of bits of data in the bit stream buffer, are taken into consideration. A smaller target number of bits tends to be set for coding units with lower complexity, and a larger target number of bits tends to be set for coding units with higher complexity.
- determining the target number of bits of the coding unit based on the image content of the coding unit and the number of bits of data in the bit stream buffer includes: determining the number of lossless bits of the coding unit based on the image content of the coding unit, and determining the amount of information of the coding unit based on the number of bits of data in the bit stream buffer. The target number of bits is then determined based on the number of lossless bits of the coding unit and the amount of information of the coding unit.
- the number of lossless bits is used to indicate the expected number of bits after lossless encoding of the coding unit, that is, the expected number of bits under a coding method that can fully express the information of the coding unit.
- the amount of information is used to indicate the complexity of the image content expressed by the coding unit in the current frame.
- the number of lossless bits of the coding unit is weighted by the amount of information to determine the target number of bits of the coding unit. That is, on the premise of retaining as much of the information of the coding unit as the amount of information warrants, the number of coding bits after encoding the image is reduced, thereby improving the accuracy of determining the target number of bits of the coding unit.
- determining the amount of information of the coding unit in the current frame based on the number of bits of data in the bit stream buffer includes: determining the number of lossy bits of the coding unit based on the number of bits of data in the bit stream buffer, and determining the amount of information based on the number of lossy bits of the coding unit and the average number of lossless bits.
- the number of lossy bits of the coding unit is used to indicate the expected number of bits after lossy encoding of the coding unit without reference to the image content of the coding unit.
- the average number of lossless bits is used to indicate the average number of expected bits after lossless encoding of each coding unit in the current frame.
- the average expected number of bits is used to characterize the complexity of the image content expressed in the current frame, and the ratio of the number of lossy bits of the coding unit to the average expected number of bits is used to quantify the complexity of the image content expressed by the coding unit in the current frame, thereby improving the accuracy of rate control.
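The relationship between the number of lossless bits, the number of lossy bits and the amount of information can be sketched as follows. This is only an illustrative sketch: the text states that the amount of information is a ratio of the number of lossy bits to the average number of lossless bits, but it does not give the formula that combines it with the number of lossless bits. The multiplicative combination in `target_bits`, and all function and parameter names, are assumptions made for illustration.

```python
def information_ratio(lossy_bits: float, avg_lossless_bits: float) -> float:
    """Ratio of the CU's lossy bit budget to the frame's average lossless bits."""
    if avg_lossless_bits <= 0:
        raise ValueError("avg_lossless_bits must be positive")
    return lossy_bits / avg_lossless_bits


def target_bits(lossless_bits: float, lossy_bits: float,
                avg_lossless_bits: float) -> float:
    """Scale the CU's lossless bit estimate by the information ratio.

    Assumption: a simple product; the source does not specify the combination.
    """
    return lossless_bits * information_ratio(lossy_bits, avg_lossless_bits)


# Two coding units share the same buffer-derived lossy budget (256 bits) and the
# same frame average (800 lossless bits), but differ in content complexity.
simple_cu = target_bits(lossless_bits=400, lossy_bits=256, avg_lossless_bits=800)
complex_cu = target_bits(lossless_bits=1600, lossy_bits=256, avg_lossless_bits=800)
```

Under these assumptions the more complex coding unit receives the larger target, which matches the stated tendency to assign more bits to coding units with higher complexity.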
- determining the quantization parameter based on the target number of bits of the coding unit includes: clamping the target number of bits of the coding unit based on at least one of the number of lossless bits of the coding unit, the number of lossy bits of the coding unit, and the bit stream buffer fullness, to obtain the clamp value of the target number of bits; and determining the quantization parameter based on the number of lossless bits of the coding unit and the clamp value of the target number of bits.
- the bitstream buffer fullness is used to indicate the ratio of the number of bits of data in the bitstream buffer to the storage capacity of the bitstream buffer. Therefore, the target number of bits of the coding unit is clamped, further improving the accuracy of code rate control.
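A minimal sketch of the clamping step and the subsequent quantization parameter derivation is given below. Only the definition of buffer fullness (bits in the buffer divided by buffer capacity) comes from the text; the clamp bounds, the `qp_per_bit` scale, the `qp_max` limit and every function name are hypothetical choices used to make the flow concrete.

```python
def clamp(value, low, high):
    """Limit value to the range [low, high]."""
    return max(low, min(value, high))


def buffer_fullness(bits_in_buffer: int, capacity_bits: int) -> float:
    """Ratio of the bits currently in the bit stream buffer to its capacity."""
    return bits_in_buffer / capacity_bits


def clamp_target_bits(target: float, lossless_bits: float,
                      lossy_bits: float, fullness: float) -> float:
    """Clamp the target bits (hypothetical bounds, not taken from the source)."""
    if not 0.0 <= fullness <= 1.0:
        raise ValueError("fullness must be in [0, 1]")
    # Never spend more than the lossless estimate; tighten the ceiling as the
    # buffer fills so that a nearly full buffer cannot overflow.
    upper = min(lossless_bits, lossy_bits * (2.0 - fullness))
    lower = min(upper, lossy_bits * 0.5 * (1.0 - fullness))
    return clamp(target, lower, upper)


def quantization_parameter(lossless_bits: float, clamped_target: float,
                           num_samples: int, qp_per_bit: float = 8.0,
                           qp_max: int = 63) -> int:
    """Map the bits that must be dropped per sample to a QP (assumed linear)."""
    drop_per_sample = max(0.0, lossless_bits - clamped_target) / num_samples
    return int(clamp(round(drop_per_sample * qp_per_bit), 0, qp_max))


fullness = buffer_fullness(bits_in_buffer=4096, capacity_bits=8192)
clamped = clamp_target_bits(5000, lossless_bits=1600, lossy_bits=256,
                            fullness=fullness)
qp = quantization_parameter(1600, clamped, num_samples=256)
```

The design intent of the sketch is that the clamp keeps an over-optimistic target inside a range the buffer can absorb, and the quantization parameter grows with the gap between the lossless estimate and the clamped target.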
- the image content includes the complexity level of the coding unit.
- the complexity level of the coding unit includes at least one of a luminance complexity level and a chrominance complexity level.
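One possible way to derive a complexity level from a sample plane is sketched below. The text only says that the image content includes a luminance and/or chrominance complexity level; the gradient-based measure, the thresholds and the function names here are assumptions for illustration, not the method of this application.

```python
def gradient_complexity(plane):
    """Mean absolute difference between horizontally adjacent samples."""
    total = 0
    count = 0
    for row in plane:
        for left, right in zip(row, row[1:]):
            total += abs(left - right)
            count += 1
    return total / count if count else 0.0


def complexity_level(value, thresholds=(2, 8, 16, 32)):
    """Map a complexity value to a level 0..len(thresholds) (assumed thresholds)."""
    return sum(value >= t for t in thresholds)


flat_luma = [[128] * 4 for _ in range(4)]      # uniform area, no detail
busy_luma = [[0, 255, 0, 255] for _ in range(4)]  # strong alternating texture

flat_level = complexity_level(gradient_complexity(flat_luma))
busy_level = complexity_level(gradient_complexity(busy_luma))
```

The same functions could be applied to a chroma plane to obtain a chrominance complexity level.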
- in a third aspect, an image encoding and decoding device is provided, which includes modules for executing the method of the first aspect or any possible design of the first aspect, and modules for executing the method of the second aspect or any possible design of the second aspect.
- in a fourth aspect, an encoder is provided, which includes at least one processor and a memory, wherein the memory is used to store a computer program, so that when the computer program is executed by the at least one processor, the method described in the second aspect or any possible design of the second aspect is implemented.
- in a fifth aspect, a decoder is provided, which includes at least one processor and a memory, wherein the memory is used to store a computer program, so that when the computer program is executed by the at least one processor, the method described in the first aspect or any possible design of the first aspect is implemented.
- a sixth aspect provides a coding and decoding system, which includes the encoder as described in the fourth aspect and the decoder as described in the fifth aspect.
- a chip including: a processor and a power supply circuit; wherein the power supply circuit is used to supply power to the processor; and the processor is used to execute the first aspect or any possible implementation of the first aspect.
- a computer-readable storage medium including: computer software instructions; when the computer software instructions are run in a computing device, the computing device is caused to execute the method in the first aspect or any possible implementation of the first aspect.
- a computer program product is provided. When the computer program product is run on a computer, it causes the computing device to perform the operation steps of the method in the first aspect or any possible implementation of the first aspect, and to perform the operation steps of the method described in the second aspect.
- Figure 1 is a schematic structural diagram of a coding and decoding system provided by this application.
- Figure 2 is a schematic scene diagram of a coding and decoding system provided by this application.
- Figure 3 is a schematic structural diagram of an encoder and a decoder provided by this application.
- Figure 4 is a schematic flow chart of an image encoding and decoding method provided by this application.
- Figure 5 is a schematic flow chart of an image encoding method provided by this application.
- Figure 6 is a schematic flow chart of an image decoding method provided by this application.
- Figure 7 is a schematic diagram of a target bit number clamping method provided by this application.
- Figure 8 is a schematic structural diagram of a coding and decoding device provided by this application.
- Figure 9 is a schematic structural diagram of a coding and decoding system provided by this application.
- A video contains multiple consecutive images. When consecutive images change at more than 24 frames per second, then, according to the principle of persistence of vision, the human eye cannot distinguish the individual static images and perceives a smooth, continuous visual effect; such a sequence of consecutive images is a video.
- Video coding refers to the processing of sequences of pictures that form a video or video sequence.
- the terms "picture", "frame" or "image" may be used as synonyms.
- Video coding as used herein means video encoding or video decoding.
- Video encoding is performed on the source side and typically involves processing (e.g., compressing) the original video picture to reduce the amount of data required to represent the video picture, subject to a certain image quality, so that it can be stored and/or transmitted more efficiently.
- Video decoding is performed on the destination side and typically involves inverse processing relative to the encoder to reconstruct the video picture.
- the "coding" of video pictures referred to in the embodiments should be understood to refer to the "encoding" or "decoding" of the video sequence.
- the combination of encoding part and decoding part is also called codec (encoding and decoding).
- Video coding can also be called image coding or image compression.
- Image decoding refers to the reverse process of image encoding.
- a video sequence includes a series of images, which are further divided into slices, and slices are further divided into blocks.
- Video coding is performed in units of blocks.
- the concept of blocks is further expanded. For example, a macroblock (MB) can be further divided into multiple prediction blocks (partitions) that can be used for predictive coding.
- in high efficiency video coding (HEVC), basic concepts such as coding unit (CU), prediction unit (PU) and transform unit (TU) are used.
- a CU can be divided into smaller CUs according to a quadtree, and the smaller CUs can be further divided to form a quadtree structure.
- the CU is the basic unit for dividing and encoding the image to be encoded.
- PU can correspond to the prediction block and is the basic unit of predictive coding.
- the CU is further divided into multiple PUs according to the division mode.
- TU can correspond to a transformation block and is the basic unit for transforming prediction residuals.
- whether PU or TU, both essentially belong to the concept of a block (or coding unit).
- a CTU is split into multiple CUs by using a quadtree structure represented as a coding tree.
- the decision is made at the CU level whether to code a picture region using inter-picture (temporal) or intra-picture (spatial) prediction.
- Each CU can be further split into one, two or four PUs depending on the PU split type.
- the same prediction process is applied within a PU and the relevant information is transferred to the decoder on a PU basis.
- the CU is partitioned into transform units (TUs) according to other quadtree structures similar to the coding tree for the CU.
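The recursive quadtree division of a block into smaller CUs described above can be sketched as follows. The split criterion used here (sample variance above a threshold) is an assumption chosen for brevity; practical encoders typically decide splits by comparing rate-distortion costs. The function names and parameters are hypothetical.

```python
def block_variance(image, x, y, size):
    """Population variance of the size x size block whose top-left is (x, y)."""
    samples = [image[row][col]
               for row in range(y, y + size)
               for col in range(x, x + size)]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def quadtree_split(image, x, y, size, min_size, threshold):
    """Return leaf blocks as (x, y, size) tuples.

    A block is split into four equal quadrants while it is larger than
    min_size and its variance exceeds threshold (assumed criterion).
    """
    if size <= min_size or block_variance(image, x, y, size) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(image, x + dx, y + dy, half,
                                     min_size, threshold)
    return leaves


# An 8x8 block that is flat except for one bright sample in the top-left corner.
ctu = [[0] * 8 for _ in range(8)]
ctu[0][0] = 255
leaves = quadtree_split(ctu, 0, 0, 8, min_size=2, threshold=0.0)
```

In this example only the quadrant containing the bright sample is split further, so detail is coded with small blocks while flat areas stay large.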
- a quad-tree plus binary tree (QTBT) partitioning structure is used to partition coding blocks.
- CU can be in square or rectangular shape.
- the coding unit to be encoded in the currently encoded image may be called the current block.
- the reference block is a block that provides a reference signal for the current block, where the reference signal represents the pixel value within the coding unit.
- a block in a reference image that provides a prediction signal for the current block may be a prediction block, where the prediction signal represents a pixel value or a sample value or a sample signal within the prediction block. For example, after traversing multiple reference blocks, the best reference block is found. This best reference block will provide prediction for the current block. This block is called a prediction block.
- Lossless video coding means that the original video picture can be reconstructed, that is, the reconstructed video picture has the same quality as the original video picture (assuming there is no transmission loss or other data loss during storage or transmission).
- Lossy video coding refers to performing further compression, such as quantization, to reduce the number of bits required to represent a video picture; the decoder cannot completely reconstruct the video picture, i.e. the quality of the reconstructed video picture is lower or worse than the quality of the original video picture.
- Bit stream refers to the binary stream generated after encoding an image or video. A bit stream is also called a code stream.
- Code rate (bit rate) is the number of bits transmitted per unit time.
- Code rate control refers to the function of adjusting the code rate during the encoding and decoding process, hereafter abbreviated as code control.
- Rate control modes include constant bit rate (CBR) and variable bit rate (VBR).
- Constant bit rate ensures a stable bit rate within the bit rate statistics time.
- Quantization refers to the process of mapping the continuous values of a signal into multiple discrete amplitudes.
- Quantization parameter is used during the encoding process to quantize the residual values generated by the prediction operation or the coefficients generated by the transformation operation.
- during decoding, the syntax elements are inversely quantized to obtain residual values or coefficients.
- Quantization parameters are parameters used in the quantization process. Generally, the larger the value of the quantization parameter, the coarser the quantization, the worse the quality of the reconstructed image, and the lower the code rate; conversely, the smaller the value of the quantization parameter, the better the quality of the reconstructed image and the higher the code rate.
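The effect of the quantization parameter on reconstruction error can be illustrated with a uniform scalar quantizer. The mapping from quantization parameter to step size used below (the step doubles every 6 QP values, as in H.264/HEVC-style codecs) is an assumption for illustration; the text does not state which mapping this application uses, and the function names are hypothetical.

```python
def qp_to_step(qp: int) -> float:
    """Assumed mapping: step size 1 at QP 0, doubling every 6 QP values."""
    return 2.0 ** (qp / 6)


def quantize(residuals, step):
    """Map each residual to the index of the nearest multiple of step."""
    return [round(r / step) for r in residuals]


def dequantize(levels, step):
    """Inverse quantization: scale the indices back to residual values."""
    return [level * step for level in levels]


def max_error(residuals, qp):
    """Largest absolute reconstruction error after quantize + dequantize."""
    step = qp_to_step(qp)
    recon = dequantize(quantize(residuals, step), step)
    return max(abs(r - q) for r, q in zip(residuals, recon))


residuals = [-7, 3, 12, 0, -1]
errors = {qp: max_error(residuals, qp) for qp in (0, 12, 24)}
nonzero = {qp: sum(level != 0 for level in quantize(residuals, qp_to_step(qp)))
           for qp in (0, 12, 24)}
```

As the quantization parameter increases, fewer non-zero levels remain to be entropy coded (lower code rate) while the reconstruction error grows (lower quality), which is the trade-off described above.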
- Bitstream buffer fullness refers to the ratio of the number of bits of data in the bitstream buffer to the storage capacity of the bitstream buffer.
- on the encoder side, the number of bits of data in the bitstream buffer includes the number of encoded bits of the coding unit.
- on the decoder side, the number of bits of data in the bitstream buffer includes the number of decoded bits of the coding unit.
- Clamping refers to the operation of limiting a certain value to a specified range.
- FIG. 1 is a schematic structural diagram of a coding and decoding system provided by this application.
- the codec system 100 includes a source device 110 and a destination device 120.
- the source device 110 is used to compress and encode the video or image to obtain a bit stream, and transmit the bit stream to the destination device 120.
- the destination device 120 decodes the bit stream, reconstructs the video or image, and displays the reconstructed image.
- the source device 110 includes an image collector 111, a preprocessor 112, an encoder 113 and a communication interface 114.
- Image collector 111 is used to acquire original images.
- Image collector 111 may include or be any kind of image capture device, used for example to capture real world images, and/or any kind of image or comment generating device (for screen content encoding, some text on the screen is also considered to be part of an image to be encoded), such as a computer graphics processor for generating computer animated images, or any kind of device for acquiring and/or providing real world images or computer animated images (e.g., screen content or virtual reality (VR) images), and/or any combination thereof (e.g., augmented reality (AR) images).
- the image collector 111 may be a camera for capturing images or a memory for storing images; the image collector 111 may also include any kind of (internal or external) interface for storing previously captured or generated images and/or for acquiring or receiving images.
- when the image collector 111 is a camera, it may be, for example, a local camera or a camera integrated in the source device; when the image collector 111 is a memory, it may be, for example, a local memory or a memory integrated in the source device.
- when the image collector 111 includes an interface, the interface may be, for example, an external interface that receives images from an external video source, such as an external image capturing device (for example, a camera), an external memory, or an external image generating device.
- the external image generating device is, for example, an external computer graphics processor, computer or server. The interface may be any type of interface according to any proprietary or standardized interface protocol, such as a wired or wireless interface, or an optical interface.
- An image can be viewed as a two-dimensional array or matrix of pixels (picture elements).
- the pixels in the array can also be called sampling points.
- the number of sample points in the horizontal and vertical directions (or axes) of an array or image defines the size and/or resolution of the image.
- three color components are usually used, that is, an image can be represented as or contain three sample arrays.
- in the RGB format or color space, an image includes corresponding arrays of red, green, and blue samples.
- each pixel is usually represented in a brightness/chroma format or color space.
- for example, the YUV format includes a luminance component indicated by Y (sometimes also indicated by L) and two chrominance components indicated by U and V.
- the luminance (luma) component Y represents brightness or gray level intensity (for example, both are the same in a grayscale image), while the two chroma (chroma) components U and V represent chrominance or color information components.
- an image in YUV format includes a luma sample array of luma sample values (Y), and two chroma sample arrays of chroma values (U and V). Images in RGB format can be converted or transformed into YUV format and vice versa, this process is also called color transformation or conversion. If the image is black and white, the image may only include the luminance sample array.
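The color conversion between RGB and YUV mentioned above can be sketched as follows. The coefficients are the commonly used ITU-R BT.601 analog YUV weights; the text does not say which conversion matrix this application uses, so the choice of BT.601 and the function names are assumptions.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    """Convert one RGB pixel to YUV using BT.601 luma weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)   # blue-difference chroma
    v = 0.877 * (r - y)   # red-difference chroma
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float):
    """Inverse of rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


gray = rgb_to_yuv(128, 128, 128)          # a gray pixel carries no chroma
round_trip = yuv_to_rgb(*rgb_to_yuv(200, 50, 25))
```

For a gray pixel both chroma components are approximately zero, which is why a black and white image can be represented by the luminance sample array alone.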
- the image transmitted from the image collector 111 to the encoder 113 may also be called original image data.
- the preprocessor 112 is used to receive the original image collected by the image collector 111 and perform preprocessing on the original image to obtain a preprocessed image.
- the preprocessing performed by the preprocessor 112 includes trimming, color format conversion (e.g., conversion from RGB format to YUV format), color correction, or denoising, etc.
- the encoder 113 is configured to receive the preprocessed image generated by the preprocessor 112, and perform compression encoding on the preprocessed image to obtain a bit stream.
- the encoder 113 may include a code control unit 1131 and an encoding unit 1132.
- the code control unit 1131 is used to determine the quantization parameter used to encode each coding unit in the current frame, so that the encoding unit 1132 performs prediction, quantization and encoding on the preprocessed image according to the quantization parameter to obtain a bit stream.
- the encoder 113 can determine the target number of bits according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and encode the coding unit according to the quantization parameter determined from the target number of bits.
- the communication interface 114 is used to receive the bit stream generated by the encoder 113, and send the bit stream to the destination device 120 through the communication channel 130, so that the destination device 120 can reconstruct the original image according to the bit stream.
- the destination device 120 includes a display 121, a post-processor 122, a decoder 123 and a communication interface 124.
- the communication interface 124 is used to receive the bit stream sent by the communication interface 114 and transmit the bit stream to the decoder 123, so that the decoder 123 can reconstruct the original image according to the bit stream.
- the communication interface 114 and the communication interface 124 may be used to send or receive data related to the original image through a direct communication link between the source device 110 and the destination device 120, such as a direct wired or wireless connection, or through any type of network, such as a wired network, a wireless network or any combination thereof, or any type of private network, public network or any combination thereof.
- Both communication interface 114 and communication interface 124 may be configured as a one-way communication interface, as indicated by the arrow of the communication channel 130 in FIG. 1 pointing from the source device 110 to the destination device 120, or as a bi-directional communication interface, and may be used to send and receive messages, for example to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or data transmission such as encoded bit stream transmission.
- Decoder 123 is used to decode the bit stream and reconstruct the original image.
- the decoder 123 performs entropy decoding, inverse quantization and predictive reconstruction on the bit stream to obtain a reconstructed image.
- the decoder 123 may include a code control unit 1231 and a decoding unit 1232.
- the code control unit 1231 is used to determine the quantization parameter used to decode each coding unit in the current frame, so that the decoding unit 1232 performs decoding, inverse quantization and predictive reconstruction on the bit stream according to the quantization parameter to obtain the reconstructed image.
- the decoder 123 can determine the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and decode the bit stream of the coding unit according to the quantization parameter determined from the target number of bits.
- the post-processor 122 is configured to receive the reconstructed image generated by the decoder 123 and perform post-processing on the reconstructed image.
- the post-processing performed by the post-processor 122 includes color format conversion (e.g., from YUV format to RGB format), color correction, trimming or resampling, or any other processing.
- the display 121 is used to display the reconstructed image.
- Display 121 may be or may include any type of display device for presenting reconstructed pictures, such as an integrated or external display or monitor.
- the display may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro-LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP) or any other kind of display.
- both the encoder 113 and the decoder 123 may be implemented as any one of various suitable circuits, for example, one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware or any combination thereof.
- if the techniques are implemented partly in software, the device may store instructions for the software in a suitable non-transitory computer-readable storage medium, and may execute the instructions in hardware using one or more processors to perform the techniques of the present disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be regarded as one or more processors.
- the image collector 111 and the encoder 113 can be integrated on one physical device, or can be installed on different physical devices, without limitation.
- the source device 110 shown in FIG. 1 includes an image collector 111 and an encoder 113, which means that the image collector 111 and the encoder 113 are integrated on one physical device.
- the source device 110 is, for example, a mobile phone, a tablet, a computer, a laptop, a video camera, a camera, a wearable device, a vehicle-mounted device, a terminal device, a virtual reality (VR) device, an augmented reality (AR) device, a mixed reality (MR) device, an extended reality (XR) device or other image capture equipment.
- if the source device 110 does not include the image collector 111, it means that the image collector 111 and the encoder 113 are two different physical devices, and the source device 110 can obtain original images from other devices (such as image capture devices or image storage devices).
- the display 121 and the decoder 123 may be integrated on one physical device, or may be provided on different physical devices, without limitation.
- the destination device 120 shown in Figure 1 includes a display 121 and a decoder 123, which means that the display 121 and the decoder 123 are integrated on one physical device.
- the destination device 120 can also be called a playback device.
- the destination device 120 has a decoding function and a function of displaying the reconstructed image.
- the destination device 120 is, for example, a monitor, television, digital media player, video game console, vehicle-mounted computer, or other device that displays images. If the destination device 120 does not include the display 121, it means that the display 121 and the decoder 123 are two different physical devices. After the destination device 120 decodes the bit stream and reconstructs the original image, it transmits the reconstructed image to another display device (such as a television or a digital media player), and the reconstructed image is displayed by that display device.
- FIG. 1 shows that the source device 110 and the destination device 120 can be integrated on one physical device, or can be arranged on different physical devices, which is not limited.
- the source device 110 may be a camera, and the destination device 120 may be a display in various possible forms.
- the source device 110 can collect the video of the first scene, and transmit multiple frames of original images in the video to the codec device.
- the codec device performs codec processing on the original images to obtain reconstructed images, and the destination device 120 displays the reconstructed images and plays the video.
- the source device 110 and the destination device 120 are integrated in a virtual reality (VR) device, an augmented reality (AR) device, a mixed reality (MR) device or an extended reality (XR) device.
- the source device 110 can collect images of the real scene where the user is located and the destination device 120 can display the reconstructed image of the real scene in the virtual environment.
- source device 110 or its corresponding functions and destination device 120 or its corresponding functions may be implemented using the same hardware and/or software or by separate hardware and/or software or any combination thereof. According to the description, it will be obvious to a skilled person that the existence and division of different units or functions in the source device 110 and/or the destination device 120 shown in FIG. 1 may vary according to actual devices and applications.
- the encoding and decoding system may also include other devices.
- the encoding and decoding system may also include end-side devices or cloud-side devices.
- after the source device 110 collects the original image, it preprocesses the original image to obtain a preprocessed image and transmits the preprocessed image to the end-side device or the cloud-side device, and the end-side device or the cloud-side device implements the function of encoding and decoding the preprocessed image.
- the encoder 300 includes a prediction unit 310 , a code control unit 320 , a quantization unit 330 , a coding unit 340 and a block dividing unit 350 .
- the block dividing unit 350 is used to divide the original image into a plurality of coding units.
- the code control unit 320 is configured to determine the target number of bits of the coding unit based on the image content of the coding unit that currently needs to be coded output by the block dividing unit 350 and the number of bits of data in the bit stream buffer, and determine the quantization parameter based on the target number of bits.
- the data in the bit stream buffer includes the bit stream of the image frame, and the number of bits of data in the bit stream buffer includes the number of bits of the bit stream of the image frame.
- when the encoder 300 completes encoding a coding unit in a frame of image and transmits the bit stream of the coding unit to the decoder 400, the data in the bit stream buffer includes the bit stream of one or more coding units, and the number of bits of data in the bit stream buffer includes the number of bits of the bit stream of the one or more coding units. It can be understood that the bit stream of the one or more coding units may be the bit stream of the coding units encoded by the encoder 300 minus the bit stream of the encoded coding units already transmitted by the encoder 300 to the decoder 400.
- the prediction unit 310 is configured to perform intra prediction on the coding unit output by the block dividing unit 350 to obtain the number of predicted bits, and output the residual between the original number of bits of the coding unit and the number of predicted bits.
- intra prediction may refer to the intra prediction of HEVC.
- Intra-frame prediction is a common method to remove spatially redundant information in the original image, that is, the reconstructed pixels of adjacent coding blocks are used as reference values to predict the current coding unit. Because a coding unit in the original image is correlated with its surrounding coding blocks, the surrounding reconstructed coding units can be used to estimate the pixel values of the current coding unit. The estimated pixel values are the predicted values.
- the residual between the predicted value and the original value of the current coding unit is quantized and entropy encoded. What is encoded and transmitted is usually the prediction residual.
- the decoder performs the same prediction process to obtain the predicted value of the current coding unit, and then obtains the reconstructed value of the current coding unit by adding the predicted value and the decoded residual.
- the quantization unit 330 is configured to quantize the residual output by the prediction unit 310 according to the quantization parameter output by the code control unit 320 to obtain a quantized residual.
- the encoding unit 340 is configured to encode the quantized residual output by the quantization unit 330 to obtain a bit stream of the encoding unit.
- the quantized residual output by the quantization unit 330 is entropy encoded.
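The interplay of prediction, residual quantization and reconstruction can be sketched in a few lines. This is a toy illustration and not the intra prediction of HEVC or of this application: it assumes a one-dimensional row of samples, a left-neighbor predictor, a uniform quantization step `qstep`, and a hypothetical default predictor of 128 for the first sample. Entropy coding is omitted.

```python
# Toy sketch: left-neighbor prediction, uniform residual quantization,
# and decoder-side reconstruction. All names and constants are hypothetical.

FIRST_PREDICTOR = 128  # assumed predictor for the first sample in a row


def encode_row(pixels, qstep):
    """Return (quantized residuals, reconstructed samples) for one row."""
    quantized, recon = [], []
    prev = FIRST_PREDICTOR
    for p in pixels:
        residual = p - prev            # original value minus predicted value
        q = round(residual / qstep)    # quantized residual (what gets coded)
        quantized.append(q)
        prev = prev + q * qstep        # predict from the reconstructed value
        recon.append(prev)
    return quantized, recon


def decode_row(quantized, qstep):
    """Repeat the same prediction and add the dequantized residual."""
    recon = []
    prev = FIRST_PREDICTOR
    for q in quantized:
        prev = prev + q * qstep        # predicted value + decoded residual
        recon.append(prev)
    return recon
```

Because the encoder predicts from reconstructed samples rather than original ones, the decoder reproduces exactly the same reconstruction, and the per-sample error stays within half a quantization step.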
- the decoder 400 includes a decoding unit 410 , a code control unit 420 , an inverse quantization unit 430 and a prediction reconstruction unit 440 .
- the decoding unit 410 is used to decode the bit stream of the coding unit to obtain the quantized residual and image content.
- the code control unit 420 determines the target number of bits of the coding unit according to the image content of the coding unit that currently needs to be decoded, output by the decoding unit 410, and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits.
- when the encoder 300 completes encoding a frame of image and transmits the bit stream of the image frame to the decoder 400, the data in the bit stream buffer of the decoder 400 includes the decoded data of the image frame, and the number of bits of data in the bit stream buffer includes the number of bits of the decoded data of the image frame.
- when the encoder 300 completes encoding a coding unit in a frame of image and transmits the bit stream of the coding unit to the decoder 400, the data in the bit stream buffer of the decoder 400 includes the decoded data of one or more coding units, and the number of bits of data in the bit stream buffer includes the number of bits of the decoded data of the one or more coding units.
- the inverse quantization unit 430 is configured to inversely quantize the quantized residual output by the decoding unit 410 according to the quantization parameter output by the code control unit 420 to obtain a residual.
- the prediction and reconstruction unit 440 is configured to perform prediction and reconstruction according to the residual output by the inverse quantization unit 430 to obtain a reconstructed image, so that the display can display the reconstructed image.
- this application provides an image encoding and decoding method that takes into account two factors, namely the image content of the coding unit and the number of bits of data in the bit stream buffer. The target number of bits is set dynamically, and the coding unit is encoded using the quantization parameter determined from the target number of bits. This reduces the number of bits after the image is encoded while ensuring the quality of the reconstructed image.
- Figure 4 is a schematic flowchart of an image encoding and decoding method provided by this application.
- the image encoding and decoding process performed by the source device 110 and the destination device 120 in FIG. 1 is taken as an example for explanation.
- the method includes the following steps.
- Step 410: The source device 110 obtains the coding unit to be encoded in the current frame.
- the source device 110 can collect original images through the image collector 111.
- the source device 110 can also receive original images collected by other devices; or obtain original images from the memory in the source device 110 or other memories.
- the original image may include at least one of a real-world image collected in real time, an image stored by the device, and an image synthesized from multiple images. This embodiment does not limit the acquisition method of the original image and the type of the original image.
- the current frame refers to a frame of image or original image that is encoded and decoded at the current moment.
- the previous frame refers to a frame of image or original image that has been encoded and decoded before the current moment.
- the previous frame may be the frame at one time point before the current time point, or frames at multiple time points before the current time point.
- the source device 110 may divide the current frame to obtain multiple coding units, and encode the multiple coding units.
- Step 420: The source device 110 determines the target number of bits based on the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter based on the target number of bits.
- the image content of a coding unit indicates the complexity of different pixel regions in the coding unit, for example, the complexity of the color, texture and shape of a pixel region.
- the encoder 113 divides the coding unit into several sub-blocks, and for each sub-block calculates the differences between adjacent pixel values one by one in both the horizontal and vertical directions. Summing the absolute values of the differences gives the complexity value of the sub-block. The complexity value is compared with the threshold to obtain the complexity level of the sub-block. After a rule-based operation on the complexity levels of the sub-blocks, the complexity level k of the coding unit is obtained.
- the complexity levels of different coding units in the current frame can be different or the same.
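The complexity computation described for step 420 can be sketched as follows. The text does not give the thresholds or the rule that combines sub-block levels, so the threshold tuple and the use of the maximum sub-block level below are assumptions, and all function names are hypothetical.

```python
# Sketch of the sub-block complexity measure. Thresholds and the combining
# rule (maximum of the sub-block levels) are assumptions, not from the source.

def subblock_complexity(block):
    """Sum of absolute differences between horizontally and vertically
    adjacent pixels of a 2-D list of sample values."""
    rows, cols = len(block), len(block[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:                      # horizontal neighbor
                total += abs(block[r][c + 1] - block[r][c])
            if r + 1 < rows:                      # vertical neighbor
                total += abs(block[r + 1][c] - block[r][c])
    return total


def to_level(value, thresholds):
    """Complexity level = number of thresholds the value reaches."""
    return sum(value >= t for t in thresholds)


def coding_unit_level(subblocks, thresholds):
    """Combine sub-block levels into the coding unit level k
    (assumed rule: take the maximum)."""
    return max(to_level(subblock_complexity(b), thresholds) for b in subblocks)
```

With thresholds `(8, 32)`, a flat 2x2 sub-block has level 0 and a checkerboard sub-block `[[0, 10], [10, 0]]` (complexity value 40) has level 2, so a coding unit containing both gets level 2 under the assumed rule.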
- the number of bits of data in the bitstream buffer is used to indicate the number of bits of the bitstream of the encoded coding unit in the current frame stored in the bitstream buffer.
- a physical buffer for storing the bitstream can be pre-configured in memory before encoding.
- the number of bits of data in the bit stream buffer may be obtained based on the number of bits of the bit stream of the encoded coding units stored in the physical buffer.
- the source device 110 determines the target number of bits according to the complexity level indicated by the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits. For a specific explanation on determining the quantization parameter, refer to the description of steps 510 to 560 below.
- Step 430: The source device 110 encodes the coding unit according to the quantization parameter to obtain a bit stream of the coding unit.
- the source device 110 may perform encoding operations such as transformation, quantization, and entropy coding on the coding unit to generate a bit stream, thereby achieving the purpose of data compression on the coding unit to be encoded.
- the number of bits of the bit stream of the coding unit may be smaller than the target number of bits or larger than the target number of bits.
- Step 440: The source device 110 sends the bit stream to the destination device 120.
- the source device 110 may send the bit stream of the video to the destination device 120 after the entire encoding of the video is completed.
- the source device 110 may encode the original image in real time on a frame-by-frame basis, and send the bit stream of one frame after encoding one frame.
- the source device 110 performs encoding processing on a coding unit of the original image, and sends the bit stream of the coding unit after encoding the coding unit.
- Step 450: The destination device 120 obtains the bit stream of the coding unit to be decoded in the image bit stream.
- Step 460: The destination device 120 determines the target number of bits based on the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter based on the target number of bits of the coding unit.
- after receiving the bit stream of the coding unit, the destination device 120 decodes the bit stream of the coding unit to obtain the image content of the coding unit and the encoded data of the coding unit, and then determines the quantization parameter based on the number of bits of data in the bit stream buffer and the image content of the coding unit.
- the destination device 120 determines the target number of bits according to the complexity level indicated by the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits. For a specific explanation on determining the quantization parameter, refer to the description of steps 510 to 560 below.
- Step 470: The destination device 120 decodes the bit stream of the coding unit of the current frame according to the quantization parameter to obtain the reconstructed image.
- the destination device 120 decodes the encoded data of the coding unit according to the quantization parameter determined by the target number of bits of the coding unit to obtain a reconstructed image.
- the destination device 120 displays the reconstructed image.
- the destination device 120 transmits the reconstructed image to other display devices, and the other display devices display the reconstructed image.
- the two factors of the number of bits of data in the bitstream buffer and the image content of the coding unit are balanced, and the quantization parameters are dynamically set to ensure the quality of the reconstructed image.
- Figure 5 is a schematic flowchart of an image encoding method provided by this application.
- the process in which the encoder 300 in FIG. 3 determines the quantization parameter is taken as an example for explanation.
- the method flow in Figure 5 is an elaboration of the specific operation processes included in step 420 and step 460 in Figure 4 . As shown in Figure 5, the method includes the following steps.
- Step 510: The encoder 300 determines the number of lossless bits of the coding unit according to the image content of the coding unit.
- the image content of a coding unit is used to indicate the complexity of different pixel regions in a coding unit.
- the encoder 300 can determine the complexity level of the coding unit according to the image content of the coding unit, for example, as set forth in step 420.
- the number of lossless bits of the coding unit is used to indicate the expected number of bits after lossless encoding of the coding unit.
- the number of lossless bits may be an empirically configured default value.
- the encoder 300 sets the expected number of bits after lossless encoding of the uncoded coding unit according to the number of bits of the coded coding unit.
- the encoder 300 may look up the table according to the identification of the coding unit and the complexity level of the coding unit to determine the number of lossless bits of the coding unit.
- $B_{LL}$ represents the number of lossless bits.
- $B_{LL}$ is recorded as $B_{LL}[T][k]$, where $T$ represents the identifier of the coding unit, and $k$ represents the complexity level of the coding unit.
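The table lookup for the number of lossless bits can be sketched as a two-dimensional array indexed by the coding unit identifier T and the complexity level k. The table contents below are made-up placeholder values, and clamping an out-of-range level to the last column is an assumption added for robustness.

```python
# Sketch of the B_LL[T][k] lookup. Table values are hypothetical placeholders.

LOSSLESS_BITS_TABLE = [
    [2, 4, 6, 8],  # T = 0, complexity levels k = 0..3
    [3, 5, 7, 9],  # T = 1, complexity levels k = 0..3
]


def lossless_bits(unit_id, level, table=LOSSLESS_BITS_TABLE):
    """Return B_LL[T][k]; levels outside the table are clamped (assumption)."""
    row = table[unit_id]
    k = min(max(level, 0), len(row) - 1)
    return row[k]
```

In a real codec the table entries would be configured empirically or updated from the number of bits of already encoded coding units, as the preceding text describes.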
- Step 520: The encoder 300 determines the number of lossy bits of the coding unit according to the number of bits of data in the bit stream buffer.
- the number of lossy bits of a coding unit is used to indicate the expected number of bits after lossy encoding of a coding unit without reference to the content of the coding unit.
- $Bpp$ represents the number of lossy bits
- $Bpp_{INI}$ represents the initial value of the number of lossy bits
- $Bpp_{ADJ}$ represents the adjustment value of the number of lossy bits.
- the initial value of the number of lossy bits is determined based on the number of bits of the coding unit and the compression rate.
- the compression rate is determined based on the needs of the actual application scenario.
- the adjustment value of the number of lossy bits is proportional to $(RcBuf_{END} - RcBuf_T)$, where $RcBuf_{END}$ represents the expected number of bits in the bit stream buffer at the end of encoding or decoding of the current frame, and $RcBuf_T$ represents the number of bits of the encoded coding units in the bit stream buffer.
- if $RcBuf_{END} - RcBuf_T$ is greater than 0, the number of bits of the encoded coding units in the bit stream buffer does not exceed the expected number of bits in the bit stream buffer at the end of encoding or decoding of the current frame, and a larger target number of bits can be allocated to the uncoded coding units.
- if $RcBuf_{END} - RcBuf_T$ is less than 0, the number of bits of the encoded coding units in the bit stream buffer exceeds the expected number of bits in the bit stream buffer at the end of encoding or decoding of the current frame, and a smaller target number of bits can be allocated to the uncoded coding units.
- if $RcBuf_{END} - RcBuf_T$ is equal to 0, the number of bits of the encoded coding units in the bit stream buffer equals the expected number of bits in the bit stream buffer at the end of encoding or decoding of the current frame, and a smaller target number of bits can be allocated to the uncoded coding units.
- the number of bits of the encoded coding units in the bit stream buffer is a linear function of the number of bits of the encoded coding units in the physical buffer:
- $RcBuf_T = PhyBuf_T + X_0$
- $PhyBuf_T$ represents the number of bits of the encoded coding units in the physical buffer
- the physical buffer refers to the storage space in the memory used to store the bit stream of the encoded coding unit.
- the storage capacity of the physical buffer can be the number of bits of the bit stream of one or more coding units.
- $X_0$ represents an agreed parameter.
- the $RcBuf_{END}$ corresponding to different coding units in the current frame is the same.
- the $RcBuf_{END}$ corresponding to coding units in different frames can be the same or different.
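The buffer-driven adjustment of the number of lossy bits can be sketched as below. Two points are assumptions because the corresponding formula is not reproduced in the text: that the number of lossy bits is the sum of its initial value and its adjustment value, and that the proportionality factor is a single constant, here called `gain` (a hypothetical name).

```python
# Sketch of step 520. The sum Bpp = Bpp_INI + Bpp_ADJ and the constant
# proportionality factor `gain` are assumptions, not taken from the source.

def rate_control_buffer_bits(phy_buf_bits, x0):
    """RcBuf_T = PhyBuf_T + X_0 (linear mapping from the physical buffer)."""
    return phy_buf_bits + x0


def lossy_bits(bpp_ini, rc_buf_end, rc_buf_t, gain):
    """Number of lossy bits: initial value plus an adjustment proportional
    to (RcBuf_END - RcBuf_T)."""
    bpp_adj = gain * (rc_buf_end - rc_buf_t)
    return bpp_ini + bpp_adj
```

When the buffer holds fewer bits than expected at the end of the frame, the adjustment is positive and more bits can be spent; when it holds more, the adjustment is negative.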
- Step 530: The encoder 300 determines the amount of information based on the number of lossy bits and the average number of lossless bits.
- the amount of information represents the proportion of the information of the coding unit currently to be encoded to the information of the current frame, that is, the complexity of the content expressed by the coding unit in the current frame.
- the amount of information is the ratio of the number of lossy bits to the average number of lossless bits.
- the average number of lossless bits is used to indicate the expected number of bits after lossless encoding of the current frame.
- the average number of lossless bits may be the average expected number of bits after lossless encoding of each coding unit in the current frame.
- for example, the current frame includes coding unit 1 and coding unit 2. The expected number of bits after lossless encoding of coding unit 1 is 10 bits, and the expected number of bits after lossless encoding of coding unit 2 is 20 bits, so the average number of lossless bits of the current frame is 15 bits.
- $R$ represents the amount of information
- $Bpp$ represents the number of lossy bits
- $B_{AVG}$ represents the average number of lossless bits.
- $bitsOffset$ represents the offset
- $bitsOffset = BitsOffset - X_1 \times Bpp + X_2$
- $BitsOffset$ represents the initial value of the offset
- the initial value of the offset is related to the image bit depth.
- $X_1$, $X_2$, $X_3$, $X_4$ and $X_5$ represent agreed parameters.
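The quantities of step 530 can be sketched as follows. The complete formula for the amount of information, including the agreed parameters X3 to X5, is not reproduced in the text, so `information_amount` below implements only the plain ratio stated in the prose and should be read as a simplification. The function names are hypothetical.

```python
# Sketch of step 530. information_amount is the bare ratio from the prose;
# the agreed parameters X3..X5 of the full formula are not modeled.

def average_lossless_bits(lossless_bits_per_unit):
    """Average expected lossless bits over the coding units of a frame."""
    return sum(lossless_bits_per_unit) / len(lossless_bits_per_unit)


def information_amount(bpp, b_avg):
    """Ratio of the number of lossy bits to the average lossless bits."""
    return bpp / b_avg


def bits_offset(bits_offset_init, bpp, x1, x2):
    """bitsOffset = BitsOffset - X1 * Bpp + X2."""
    return bits_offset_init - x1 * bpp + x2
```

For the two coding units in the example above, `average_lossless_bits([10, 20])` gives 15.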
- Step 540: The encoder 300 determines the target number of bits based on the number of lossless bits and the information amount of the coding unit.
- the target number of bits is used to indicate the expected number of bits after lossy encoding of the coding unit when referring to the content of the coding unit, that is, the expected number of bits after the encoder 300 performs quantization encoding on the coding unit when referring to the content of the coding unit.
- $B_{TGT}$ represents the target number of bits
- $R$ represents the amount of information
- $B_{LL}$ represents the number of lossless bits
- $bitsOffset$ represents the offset
- $X_6$ and $X_7$ are agreed parameters.
- Step 550: The encoder 300 clamps the target number of bits based on at least one of the number of lossy bits, the number of lossless bits, and the bit stream buffer fullness, to obtain a clamp value of the target number of bits.
- the encoder 300 determines the minimum value $B_{MIN}$ of the target number of bits and the maximum value $B_{MAX}$ of the target number of bits according to the number of lossy bits $Bpp$, the number of lossless bits $B_{LL}$ and the bit stream buffer fullness $F$, and then clamps the target number of bits $B_{TGT}$ according to $B_{MIN}$ and $B_{MAX}$ to obtain the clamp value $B'_{TGT}$ of the target number of bits.
- the clamp value of the target number of bits satisfies the following formula (4).
- $B'_{TGT} = \min(\max(B_{MIN}, B_{TGT}), B_{MAX})$ Formula (4)
- if $B_{MIN} > B_{TGT}$ and $B_{MIN} \le B_{MAX}$, the clamp value of the target number of bits is $B_{MIN}$.
- if $B_{MIN} > B_{TGT}$ and $B_{MIN} > B_{MAX}$, the clamp value of the target number of bits is $B_{MAX}$.
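Formula (4) translates directly into code. The sketch below uses the hypothetical function name `clamp_target_bits`; note that, as written in the formula, the maximum takes precedence when the minimum exceeds the maximum.

```python
# Sketch of formula (4): B'_TGT = MIN(MAX(B_MIN, B_TGT), B_MAX).

def clamp_target_bits(b_tgt, b_min, b_max):
    """Clamp the target number of bits into [b_min, b_max]; if b_min is
    larger than b_max, the result is b_max because MIN is applied last."""
    return min(max(b_min, b_tgt), b_max)
```

For example, a target of 5 bits with bounds 8 and 20 is raised to 8, while the same target with an inconsistent lower bound of 30 and an upper bound of 20 yields 20.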
- bitstream buffer fullness is used to indicate the storage status of the bitstream of the encoded coding unit in the bitstream buffer.
- the fullness of the bit stream buffer satisfies the following formula (5).
- $F = RcBuf_T / RcBuf_{MAX}$ Formula (5)
- $F$ represents the fullness of the bit stream buffer
- $RcBuf_T$ represents the number of bits of the encoded coding units in the bit stream buffer
- $RcBuf_{MAX}$ represents the maximum number of bits allowed in the bit stream buffer.
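Formula (5) is a simple ratio. A minimal sketch, with a guard against a non-positive buffer capacity added as an assumption:

```python
# Sketch of formula (5): F = RcBuf_T / RcBuf_MAX.

def buffer_fullness(rc_buf_t, rc_buf_max):
    """Fraction of the bit stream buffer occupied by encoded coding units."""
    if rc_buf_max <= 0:
        # Guard added for the sketch; the source does not discuss this case.
        raise ValueError("rc_buf_max must be positive")
    return rc_buf_t / rc_buf_max
```

A buffer holding 2048 bits out of an allowed 8192 has a fullness of 0.25.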
- Step 560: The encoder 300 determines the quantization parameter according to the number of lossless bits and the clamp value of the target number of bits.
- the encoder 300 clamps the expected number of bits after quantization encoding of the coding unit when referring to the content of the coding unit, and then determines the quantization parameter based on the number of lossless bits and the clamp value of the target number of bits.
- the quantization parameter satisfies formula (6).
- $QP = ((B_{LL} - B'_{TGT} + X_8) \times X_9) \times X_{10}$ Formula (6)
- $B_{LL}$ represents the number of lossless bits
- $B'_{TGT}$ represents the clamp value of the target number of bits
- $X_8$, $X_9$ and $X_{10}$ are agreed parameters.
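Formula (6) can be sketched as below. The grouping of the parentheses is an assumption, because the formula in the text is partly garbled, and the parameter values used in any call are placeholders for the agreed parameters X8 to X10.

```python
# Sketch of formula (6), assuming the grouping
# QP = ((B_LL - B'_TGT + X8) * X9) * X10.

def quantization_parameter(b_ll, b_tgt_clamped, x8, x9, x10):
    """Quantization parameter from the lossless bits and the clamped target.
    The further the target falls below the lossless bits, the larger the QP."""
    return ((b_ll - b_tgt_clamped + x8) * x9) * x10
```

Under this reading, lowering the clamped target number of bits while the number of lossless bits stays fixed increases the quantization parameter, which corresponds to coarser quantization.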
- the expected number of bits after encoding a coding unit is determined by the complexity of the coding unit that currently needs to be encoded relative to the complexity of the entire frame, that is, it is derived from the complexity level of the coding unit and the average complexity level of the entire frame. Therefore, to obtain better coding performance and reconstructed image quality, the code control module allocates different expected numbers of coded bits to different coding units in the image, so that the specified total number of coded bits is used to the fullest and the decompressed image has the best possible quality.
- Figure 6 shows a schematic flow chart of code rate control in the decoding process.
- the difference between Figure 6 and Figure 5 is that the quantization parameters output by the code rate control are used in the inverse quantization process.
- for the process of determining the quantization parameters, refer to the explanation of Figure 5 above.
- the following is an example of a method of determining the clamp value of the target number of bits based on the minimum value $B_{MIN}$ of the target number of bits and the maximum value $B_{MAX}$ of the target number of bits.
- the method of determining the clamp value of the target number of bits includes the following steps.
- Step 710: Calculate the first minimum value $B_{MIN1}$.
- the first minimum value $B_{MIN1} = (Param_{Bpp} \times Bpp + K_1) \times (\min(B_{LL}, B_{LLMAX}) \times K_2 + K_3)$, where $B_{LL}$ represents the number of lossless bits, $Bpp$ represents the number of lossy bits, and $B_{LLMAX}$ represents the maximum number of lossless bits after lossless encoding of the coding unit under the current pixel bit depth.
- $Param_{Bpp}$ represents an agreed parameter under the current pixel bit depth.
- $K_1$, $K_2$ and $K_3$ represent agreed parameters.
- Step 720: Calculate the second minimum value $B_{MIN2}$.
- the second minimum value $B_{MIN2} = (K_4 - K_5 \times F) \times Bpp + K_6$, where $F$ represents the bit stream buffer fullness, $Bpp$ represents the number of lossy bits, and $K_4$, $K_5$ and $K_6$ represent agreed parameters.
- Step 730 Calculate the third minimum value B MIN3 .
- F limit represents the agreed upper limit of the bit stream buffer fullness
- Bpp represents the number of lossy bits
- F represents the bit stream buffer fullness
- Sr represents the parameters related to the image chroma sampling rate
- K 7 and K 8 represents the agreed parameters.
- Step 740 Calculate the minimum value B MIN .
- the minimum value of the target number of bits B MIN MAX(MAX(B MIN1 ,B MIN2 ),B MIN3 ).
- B MIN1 represents the first minimum value calculated in step 710
- B MIN2 represents the second minimum value calculated in step 720
- B MIN3 represents the third minimum value calculated in step 730 .
- Step 750 Calculate the offset value of the first maximum value B MAX1 .
- the offset value of the first maximum value bppOffset1 K9-(K10 ⁇ F+K11)-Sr ⁇ B relative , where F represents the bit stream buffer fullness, and B relative represents the lossless bit number difference calculated in step 730 value, Sr represents the parameters related to the image chroma sampling rate, and K9, K10 and K11 represent the agreed parameters.
- Step 760 Calculate the offset value of the second maximum value B MAX2 .
- Step 770 Calculate the offset value of the third maximum value B MAX3 .
- the offset value of the third maximum value bppOffset3 K14*(F limit -F), where F represents the input buffer fullness and K14 represents the agreed parameters.
- Step 780 Calculate the maximum value B MAX .
- Step 790 Clamp the target number of bits B TGT .
- the minimum value B MIN calculated in step 740 and the maximum value B MAX calculated in step 780 are used as the range for clamping the target number of bits to obtain the clamping value of the target number of bits, and the target number of bits B TGT is clamped within this range.
- B' TGT MIN(MAX(B MIN ,B TGT ),B MAX ).
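Steps 710 to 790 can be sketched in Python as follows. This is only an illustrative sketch: the text does not disclose the values of the agreed parameters, so the `ClampParams` container, its field names, and any numbers passed to it are hypothetical placeholders rather than values from the application.

```python
from dataclasses import dataclass


@dataclass
class ClampParams:
    """Hypothetical container for the 'agreed' constants (values are not given in the text)."""
    param_bpp: float   # Param_Bpp for the current pixel bit depth
    k: dict            # maps 1..14 to K1..K14
    f_limit: float     # agreed upper limit of the buffer fullness
    sr: float          # parameter related to the chroma sampling rate
    b_ll_max: float    # maximum lossless bits at the current bit depth


def clip(lo, hi, x):
    """Limit x to the range [lo, hi]."""
    return max(lo, min(hi, x))


def clamp_target_bits(b_tgt, bpp, b_ll, fullness, p):
    """Return (clamped target bits, B_MIN, B_MAX) following steps 710 to 790."""
    k = p.k
    # Step 710: first minimum
    b_min1 = (p.param_bpp * bpp + k[1]) * (min(b_ll, p.b_ll_max) * k[2] + k[3])
    # Step 720: second minimum
    b_min2 = (k[4] - k[5] * fullness) * bpp + k[6]
    # Step 730: lossless bit difference and third minimum
    b_relative = clip(0, 320, p.b_ll_max - b_ll)
    b_min3 = bpp - (k[7] * (fullness - p.f_limit) + k[8]) - p.sr * b_relative
    # Step 740: overall minimum
    b_min = max(max(b_min1, b_min2), b_min3)
    # Steps 750 to 770: the three offsets
    off1 = k[9] - (k[10] * fullness + k[11]) - p.sr * b_relative
    off2 = max(k[12] - bpp, k[13]) - b_relative
    off3 = k[14] * (p.f_limit - fullness)
    # Step 780: overall maximum, never below the minimum
    b_max = max(bpp + min(min(off1, off2), off3), b_min)
    # Step 790: clamp the target
    return min(max(b_min, b_tgt), b_max), b_min, b_max
```

Because step 780 takes the larger of `bpp + bppOffset` and `b_min`, the range is never empty, so the clamp in step 790 always returns a value between the two bounds.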
- the encoder and decoder include corresponding hardware structures and/or software modules that perform each function.
- the units and method steps of each example described in conjunction with the embodiments disclosed in this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software driving the hardware depends on the specific application scenarios and design constraints of the technical solution.
- the image encoding and decoding method provided according to this embodiment is described in detail above with reference to FIGS. 1 to 7 .
- the encoding and decoding device provided according to this embodiment will be described with reference to FIG. 8 .
- Figure 8 is a schematic structural diagram of a possible encoding and decoding device provided in this embodiment.
- These encoding and decoding devices can be used to implement the functions of the encoder and decoder in the above method embodiments, and therefore can also achieve the beneficial effects of the above method embodiments.
- the encoding and decoding device may be the encoder 300 and decoder 400 as shown in FIG. 3 , or may be a module (such as a chip) applied to a computing device.
- the encoding and decoding device 800 includes a communication module 810 , a code control module 820 , an encoding module 830 and a storage module 840 .
- the encoding and decoding device 800 is used to implement the functions of the encoder 300 and the decoder 400 in the method embodiment shown in FIG. 3 .
- the communication module 810 is used to obtain the coding unit to be encoded in the current frame. For example, the communication module 810 is used to perform step 410 in FIG. 4 .
- the code control module 820 is configured to determine the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and determine the quantization parameter according to the target number of bits of the coding unit. For example, the code control module 820 is used to execute step 420 in FIG. 4 .
- the encoding module 830 is configured to encode the coding unit according to the quantization parameter to obtain a bit stream of the coding unit. For example, the encoding module 830 is used to perform step 430 in FIG. 4 .
- the communication module 810 is used to obtain the bit stream of the coding unit to be decoded in the image bit stream. For example, the communication module 810 is used to perform step 450 in FIG. 4 .
- the code control module 820 is configured to determine the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and determine the quantization parameter according to the target number of bits of the coding unit. For example, the code control module 820 is used to execute step 460 in FIG. 4 .
- The encoding module 830 is configured to decode the bit stream of the coding unit according to the quantization parameter to obtain a reconstructed image of the coding unit. For example, the encoding module 830 is used to perform step 470 in FIG. 4 .
- The storage module 840 is used to store the number of bits of data in the bit stream buffer, so that the code control module 820 can determine the quantization parameter.
- the encoding and decoding device 800 in the embodiment of the present application can be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- The above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or generic array logic (GAL).
- The encoding and decoding device 800 may correspondingly perform the method described in the embodiments of the present application, and the above and other operations and/or functions of each unit in the encoding and decoding device 800 are respectively intended to implement the corresponding processes of the methods in Figure 3. For the sake of brevity, they are not repeated here.
- FIG. 9 is a schematic structural diagram of an image processing system provided by the present application.
- the image processing system is explained by taking a mobile phone as an example.
- The mobile phone or the chip system built into the mobile phone includes: a memory 910, a processor 920, a sensor component 930, a multimedia component 940, and an input/output interface 950.
- the following is a detailed introduction to each component of a mobile phone or a chip system built into a mobile phone with reference to Figure 9 .
- The memory 910 can be used to store data, software programs and modules. It mainly includes a program storage area and a data storage area. The program storage area can store software programs, including instructions in the form of code, including but not limited to an operating system and applications required for at least one function, such as a sound playback function or an image playback function. The data storage area can store data created through use of the mobile phone, such as audio data, image data, and the phone book.
- the memory 910 may be used to store the number of bits of data in the bit stream buffer, etc.
- The memory may include floppy disks, hard disks such as built-in hard disks and removable hard disks, magnetic disks, optical disks, magneto-optical disks such as CD-ROM and DVD-ROM, and storage devices such as RAM, ROM, PROM, EPROM, EEPROM, flash memory, or any other form of storage media known in the technical field.
- The processor 920 is the control center of the mobile phone. It uses various interfaces and lines to connect the parts of the entire device and, by running or executing software programs and/or software modules stored in the memory 910 and calling data stored in the memory 910, performs the various functions of the phone and processes data, thereby monitoring the phone as a whole.
- the processor 920 may be used to perform one or more steps in the method embodiment of the present application.
- For example, the processor 920 may be used to perform one or more of steps 420 to 470 in the method embodiments.
- The processor 920 may be a single-processor structure, a multi-processor structure, a single-threaded processor, a multi-threaded processor, etc. In some feasible embodiments, the processor 920 may include at least one of a central processing unit, a general-purpose processor, a digital signal processor, a neural network processor, an image processing unit, an image signal processor, a microcontroller, or a microprocessor. In addition, the processor 920 may further include other hardware circuits or accelerators, such as application-specific integrated circuits, field-programmable gate arrays or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof.
- the processor 920 may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and so on.
- Sensor component 930 includes one or more sensors for providing various aspects of status assessment for the phone.
- The sensor component 930 may include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications, that is, as an integral part of a camera or video camera.
- the sensor component 930 can be used to support the camera in the multimedia component 940 to acquire images, etc.
- The sensor component 930 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor, through which the acceleration/deceleration, orientation, and open/closed status of the mobile phone, the relative positioning of components, or temperature changes of the mobile phone can be detected.
- The multimedia component 940 provides a screen that serves as an output interface between the mobile phone and the user.
- the screen may be a touch panel, and when the screen is a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide action.
- the multimedia component 940 also includes at least one camera.
- the multimedia component 940 includes a front camera and/or a rear camera. When the mobile phone is in an operating mode, such as shooting mode or video mode, the front camera and/or the rear camera can sense external multimedia signals, which are used to form image frames.
- Each front-facing camera and rear-facing camera can be a fixed optical lens system or have a focal length and optical zoom capabilities.
- the input/output interface 950 provides an interface between the processor 920 and peripheral interface modules.
- the peripheral interface modules may include a keyboard, a mouse, or a USB (Universal Serial Bus) device.
- the input/output interface 950 may have only one input/output interface, or may have multiple input/output interfaces.
- The mobile phone may also include audio components and communication components. For example, the audio component includes a microphone, and the communication component includes a wireless fidelity (WiFi) module, a Bluetooth module, etc.
- the above-mentioned image processing system may be a general-purpose device or a special-purpose device.
- the image processing system may be an edge device (eg, a box carrying a chip with processing capabilities) or the like.
- the image processing system may also be a server or other device with computing capabilities.
- The image processing system may correspond to the encoding and decoding device 800 in this embodiment, and may correspond to the subject that executes any method according to FIG. 3. The above and other operations and/or functions of each module in the encoding and decoding device 800 are respectively intended to implement the corresponding processes of the methods in Figure 3. For the sake of brevity, they are not described again here.
- the method steps in this embodiment can be implemented by hardware or by a processor executing software instructions.
- Software instructions can be composed of corresponding software modules, and software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
- An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
- the storage medium can also be an integral part of the processor.
- the processor and storage media may be located in an ASIC. Additionally, the ASIC can be located in a computing device. Of course, the processor and storage medium may also exist as discrete components in a computing device.
- the computer program product includes one or more computer programs or instructions.
- the computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user equipment, or other programmable device.
- the computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.
- For example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means.
- the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
- The available media may be magnetic media, such as floppy disks, hard disks, and magnetic tapes; optical media, such as digital video discs (DVDs); or semiconductor media, such as solid state drives (SSDs).
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
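An image encoding and decoding method, apparatus, encoder, decoder, and system are disclosed, relating to the multimedia field. In the method, when the bit stream of a coding unit to be decoded in an image bit stream is decoded, or when a coding unit to be encoded in the current frame is encoded, the target number of bits is set dynamically by taking into account two factors: the image content of the coding unit and the number of bits of data in the bit stream buffer. At the encoding end, the coding unit is encoded using the quantization parameter determined from the target number of bits. At the decoding end, the bit stream of the coding unit is decoded according to the quantization parameter determined from the target number of bits. In this way, the number of coded bits after the image is encoded is reduced while the quality of the reconstructed image is ensured.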
Description
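This application claims priority to Chinese Patent Application No. 202211097577.X, filed with the China National Intellectual Property Administration on September 8, 2022 and entitled "Method for clamping a target number of bits", and to Chinese Patent Application No. 202211696765.4, filed with the China National Intellectual Property Administration on December 28, 2022 and entitled "Image encoding and decoding method, apparatus, encoder, decoder, and system", both of which are incorporated herein by reference in their entireties.
This application relates to the multimedia field, and in particular to an image encoding and decoding method, apparatus, encoder, decoder, and system.
Currently, an encoder performs encoding operations such as prediction, quantization, and entropy encoding on an image frame to obtain a bit stream. A decoder performs decoding operations such as entropy decoding, inverse quantization, and prediction reconstruction on the bit stream to obtain a reconstructed image of the image frame. The larger the value of the quantization parameter, the less valid information of the image frame the bit stream contains and the poorer the quality of the reconstructed image; conversely, the smaller the value of the quantization parameter, the higher the quality of the reconstructed image, but the more redundant information of the image frame the bit stream contains and the larger the number of bits of the bit stream. Therefore, how to determine the quantization parameter used for encoding and decoding an image, so as to reduce the number of coded bits while ensuring the quality of the reconstructed image, is a problem to be solved urgently.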
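Summary
This application provides an image encoding and decoding method, apparatus, encoder, decoder, and system that, by reasonably determining the quantization parameter used for encoding and decoding an image, reduce the number of coded bits after the image is encoded while ensuring the quality of the reconstructed image.
According to a first aspect, an image decoding method is provided. The method includes: when decoding the bit stream of a coding unit to be decoded in an image bit stream, determining a target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in a bit stream buffer, and decoding the bit stream of the coding unit according to a quantization parameter determined from the target number of bits. The image content indicates the complexity of different pixel regions in the coding unit. The bit stream buffer is used to store the number of coded bits of one or more coding units after decoding. The target number of bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit when the image content of the coding unit is taken into account.
A coding unit with higher complexity contains more information, that is, less repeated information; conversely, a coding unit with lower complexity contains less information, that is, more repeated information. When a coding unit is encoded, both the image content of the coding unit and the number of bits of data in the bit stream buffer are considered: a coding unit with lower complexity tends to be given a smaller target number of bits, and a coding unit with higher complexity tends to be given a larger target number of bits. In other words, with reference to the complexity of the image content expressed by the coding unit and the number of bits of the already encoded coding units in the bit stream buffer, the number of coded bits after the image is encoded is reduced as far as possible while the quality of the reconstructed image is ensured. Decoding the bit stream of a coding unit is the inverse of encoding the coding unit: the same two factors are considered, the target number of bits is set dynamically, and the bit stream of the coding unit is decoded using the quantization parameter determined from the target number of bits. Improving the precision of rate control in this way reduces the number of coded bits as far as possible while ensuring the quality of the reconstructed image.
For example, when a video is encoded under a constant bit rate control strategy, if the bit streams of the coding units contain relatively few bits and the image content of the coding unit to be encoded is relatively complex, the number of bits of the bit stream of the coding unit can be reasonably increased while the bit rate is kept constant, and a smaller quantization parameter can be used to improve the quality of the reconstructed image. As another example, when a video is encoded under a constant bit rate control strategy, if the bit streams of the coding units contain relatively many bits and the image content of the coding unit to be encoded is relatively simple, a larger quantization parameter can be used to reasonably reduce the number of bits of the bit stream of the coding unit while the bit rate is kept constant and the quality of the reconstructed image is ensured.
With reference to the first aspect, in another possible implementation, the method further includes: after the quantization parameter is determined, decoding the bit stream of the coding unit according to the quantization parameter to obtain a reconstructed image of the coding unit.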
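According to a second aspect, an image encoding method is provided. The method includes: when encoding a coding unit to be encoded in the current frame, determining a target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in a bit stream buffer, and encoding the coding unit according to a quantization parameter determined from the target number of bits of the coding unit. The image content indicates the complexity of different pixel regions in the coding unit. The bit stream buffer is used to store the bit streams, or parts of the bit streams, of one or more coding units. The target number of bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit when the image content of the coding unit is taken into account.
A coding unit with higher complexity contains more information, that is, less repeated information; conversely, a coding unit with lower complexity contains less information, that is, more repeated information. When a coding unit is encoded, both the image content of the coding unit and the number of bits of data in the bit stream buffer are considered: a coding unit with lower complexity tends to be given a smaller target number of bits, and a coding unit with higher complexity tends to be given a larger target number of bits. By setting the target number of bits dynamically and encoding the coding unit with the quantization parameter determined from it, the precision of rate control is improved, and the number of coded bits is reduced as far as possible while the quality of the reconstructed image is ensured.
In a possible implementation, determining the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bit stream buffer includes: determining the number of lossless bits of the coding unit according to the image content of the coding unit, determining the information amount of the coding unit according to the number of bits of data in the bit stream buffer, and then determining the target number of bits according to the number of lossless bits and the information amount of the coding unit. The number of lossless bits indicates the expected number of bits after lossless encoding of the coding unit, that is, the expected number of bits under a coding mode that fully expresses the information of the coding unit. The information amount indicates the complexity of the image content expressed by the coding unit relative to the image content expressed by the current frame. Weighing the number of lossless bits of the coding unit by the information amount to determine the target number of bits reduces the number of coded bits while the information of the coding unit is retained according to the information amount, and thus improves the precision with which the target number of bits is determined.
In another possible implementation, determining the information amount of the coding unit in the current frame according to the number of bits of data in the bit stream buffer includes: determining the number of lossy bits of the coding unit according to the number of bits of data in the bit stream buffer, and determining the information amount according to the number of lossy bits of the coding unit and the average number of lossless bits. The number of lossy bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit when the image content of the coding unit is not taken into account. The average number of lossless bits indicates the average expected number of bits after lossless encoding of each coding unit in the current frame. The average expected number of bits characterizes the complexity of the image content expressed by the current frame, and the ratio of the number of lossy bits of the coding unit to the average expected number of bits quantifies the complexity of the image content expressed by the coding unit relative to that of the current frame, which improves the precision of rate control.
In another possible implementation, determining the quantization parameter according to the target number of bits of the coding unit includes: clamping the target number of bits of the coding unit according to at least one of the number of lossless bits of the coding unit, the number of lossy bits of the coding unit, and the bit stream buffer fullness to obtain a clamp value of the target number of bits, and determining the quantization parameter according to the number of lossless bits of the coding unit and the clamp value of the target number of bits. The bit stream buffer fullness indicates the ratio of the number of bits of data in the bit stream buffer to the storage capacity of the bit stream buffer. Clamping the target number of bits of the coding unit further improves the precision of rate control.
The image content includes the complexity level of the coding unit. For example, the complexity level of the coding unit includes at least one of a luma complexity level and a chroma complexity level.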
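According to a third aspect, an image encoding and decoding apparatus is provided. The apparatus includes modules for performing the method of the first aspect or any possible design of the first aspect, and modules for performing the method of the second aspect or any possible design of the second aspect.
According to a fourth aspect, an encoder is provided. The encoder includes at least one processor and a memory, where the memory is used to store a computer program, so that the method described in the second aspect or any possible design of the second aspect is implemented when the computer program is executed by the at least one processor.
According to a fifth aspect, a decoder is provided. The decoder includes at least one processor and a memory, where the memory is used to store a computer program, so that the method described in the first aspect or any possible design of the first aspect is implemented when the computer program is executed by the at least one processor.
According to a sixth aspect, a codec system is provided. The codec system includes the encoder according to the fourth aspect and the decoder according to the fifth aspect.
According to a seventh aspect, a chip is provided, including a processor and a power supply circuit. The power supply circuit is used to supply power to the processor. The processor is used to perform the operation steps of the method in the first aspect or any possible implementation of the first aspect, and to perform the operation steps of the method in the second aspect or any possible implementation of the second aspect.
According to an eighth aspect, a computer-readable storage medium is provided, including computer software instructions. When the computer software instructions run in a computing device, the computing device is caused to perform the operation steps of the method in the first aspect or any possible implementation of the first aspect, and to perform the operation steps of the method in the second aspect or any possible implementation of the second aspect.
According to a ninth aspect, a computer program product is provided. When the computer program product runs on a computer, the computing device is caused to perform the operation steps of the method in the first aspect or any possible implementation of the first aspect, and to perform the operation steps of the method in the second aspect or any possible implementation of the second aspect.
On the basis of the implementations provided in the above aspects, this application may further combine them to provide more implementations.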
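Figure 1 is a schematic structural diagram of a codec system provided by this application;
Figure 2 is a schematic diagram of a scenario of a codec system provided by this application;
Figure 3 is a schematic structural diagram of an encoder and a decoder provided by this application;
Figure 4 is a schematic flow chart of an image encoding and decoding method provided by this application;
Figure 5 is a schematic flow chart of an image encoding method provided by this application;
Figure 6 is a schematic flow chart of an image decoding method provided by this application;
Figure 7 is a schematic diagram of a method for clamping the target number of bits provided by this application;
Figure 8 is a schematic structural diagram of an encoding and decoding device provided by this application;
Figure 9 is a schematic structural diagram of a codec system provided by this application.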
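The technical solutions of this application may be applied not only to video coding standards (such as H.264 and HEVC) but also to future video coding standards (such as H.266). The terms used in the embodiments are only intended to explain specific embodiments of this application, not to limit this application. Some concepts that may be involved in this application are briefly introduced first.
Video: a video contains multiple consecutive images. When the images change at more than 24 frames per second, the human eye cannot distinguish individual static pictures because of persistence of vision, and the sequence appears smooth and continuous, that is, as video.
Video coding refers to processing the sequence of pictures that forms a video or video sequence. In the field of video coding, the terms "picture", "frame" and "image" can be used as synonyms. Video coding as used herein means video encoding or video decoding. Video encoding is performed on the source side and usually includes processing (for example, compressing) the original video pictures, while meeting a certain image quality, to reduce the amount of data required to represent the video pictures, so that they can be stored and/or transmitted more efficiently. Video decoding is performed on the destination side and usually includes the inverse processing relative to the encoder to reconstruct the video pictures. "Coding" of video pictures in the embodiments should be understood as "encoding" or "decoding" of a video sequence. The combination of the encoding part and the decoding part is also called codec (encoding and decoding). Video encoding may also be called image coding or image compression. Image decoding is the inverse process of image coding.
A video sequence includes a series of pictures. A picture is further divided into slices, and a slice is further divided into blocks. Video coding is performed block by block, and some newer video coding standards extend the concept of a block. For example, the H.264 standard uses macroblocks (MB), which can be further divided into multiple prediction blocks (partitions) usable for predictive coding. The high efficiency video coding (HEVC) standard uses basic concepts such as the coding unit (CU), prediction unit (PU), and transform unit (TU), which divide blocks by function and describe them with a new tree-based structure. For example, a CU can be split into smaller CUs according to a quadtree, and the smaller CUs can be split further, forming a quadtree structure; the CU is the basic unit for partitioning and coding a coded picture. PUs and TUs have similar tree structures. A PU can correspond to a prediction block and is the basic unit of predictive coding; a CU is further divided into multiple PUs according to a partition mode. A TU can correspond to a transform block and is the basic unit for transforming the prediction residual. However, CUs, PUs, and TUs all essentially belong to the concept of a block (or coding unit).
For example, in HEVC, a CTU is split into multiple CUs by using a quadtree structure denoted as a coding tree. The decision whether to code a picture region using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two, or four PUs according to the PU split type. The same prediction process is applied within a PU, and the relevant information is transmitted to the decoder on a PU basis. After the residual block is obtained by applying the prediction process based on the PU split type, the CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree used for the CU. In the latest developments of video compression technology, quad-tree and binary tree (QTBT) partitioning is used to partition coding blocks. In the QTBT block structure, a CU can be square or rectangular.
Herein, for ease of description and understanding, the coding unit to be coded in the current coded picture is called the current block: in encoding, it is the block currently being encoded; in decoding, it is the block currently being decoded. A decoded coding unit in a reference picture that is used to predict the current block is called a reference block, that is, a reference block is a block that provides a reference signal for the current block, where the reference signal represents the pixel values within the coding unit. A block in the reference picture that provides a prediction signal for the current block is called a prediction block, where the prediction signal represents the pixel values, sample values, or sample signal within the prediction block. For example, after multiple reference blocks are traversed, the best reference block is found; this best reference block provides the prediction for the current block and is called the prediction block.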
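Lossless video coding means that the original video pictures can be reconstructed, that is, the reconstructed video pictures have the same quality as the original video pictures (assuming no transmission loss or other data loss during storage or transmission).
Lossy video coding means that further compression is performed, for example through quantization, to reduce the number of bits required to represent the video pictures, and the decoder side cannot completely reconstruct the video pictures, that is, the quality of the reconstructed video pictures is lower or poorer than that of the original video pictures.
Bit stream (data rate) refers to the binary stream generated after an image or video is encoded. The bit stream is also called the code stream or bit rate, that is, the number of bits transmitted per unit time. It is an important part of picture quality control in image coding. For images of the same resolution, the larger the bit stream of the image, the smaller the compression ratio and the better the picture quality.
Rate control refers to adjusting the bit rate during encoding and decoding, and is abbreviated below as code control. Rate control modes include constant bit rate (CBR) and variable bit rate (VBR).
Constant bit rate (CBR) means that the bit rate is kept stable within the bit rate statistics period.
Variable bit rate (VBR) means that the bit rate is allowed to fluctuate within the bit rate statistics period, so that the quality of the encoded image remains stable.
Quantization refers to the process of mapping the continuous values of a signal to multiple discrete amplitudes.
The quantization parameter (QP) is used in the encoding process to quantize the residual values produced by the prediction operation or the coefficients produced by the transform operation; in the decoding process, syntax elements are inverse-quantized to obtain the residual values or coefficients. The quantization parameter is the parameter used in the quantization process. Generally, the larger the value of the quantization parameter, the stronger the quantization, the poorer the quality of the reconstructed image, and the lower the bit rate; conversely, the smaller the value of the quantization parameter, the better the quality of the reconstructed image and the higher the bit rate.
Bit stream buffer fullness refers to the ratio of the number of bits of data in the bit stream buffer to the storage capacity of the bit stream buffer. At the encoding end, the number of bits of data in the bit stream buffer includes the number of coded bits of the coding units. At the decoding end, the number of bits of data in the bit stream buffer includes the number of decoded bits of the coding units.
Clamping refers to the operation of limiting a value to a specified range.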
下面结合附图对本申请的实施方式进行说明。
图1为本申请提供的一种编解码系统的结构示意图。编解码系统100包括源设备110和目的设备120。源设备110用于对视频或图像进行压缩编码得到位流,向目的设备120传输位流。目的设备120对位流进行解码,并重建视频或图像,显示重建后图像。
具体地,源设备110包括图像采集器111、预处理器112、编码器113和通信接口114。
图像采集器111用于获取原始图像。图像采集器111,可以包括或可以为任何类别的图像捕获设备,用于例如捕获现实世界图像,和/或任何类别的图像或评论(对于屏幕内容编码,屏幕上的一些文字也认为是待编码的图像或图像的一部分)生成设备,例如,用于生成计算机动画图像的计算机图形处理器,或用于获取和/或提供现实世界图像、计算机动画图像(例如,屏幕内容、虚拟现实(virtual reality,VR)图像)的任何类别设备,和/或其任何组合(例如,增强现实(Augmented Reality,AR)图像)。图像采集器111可以为用于捕获图像的相机或者用于存储图像的存储器,图像采集器111还可以包括存储先前捕获或产生的图像和/或获取或接收图像的任何类别的(内部或外部)接口。当图像采集器111为相机时,图像采集器111可例如为本地的或集成在源设备中的集成相机;当图像采集器111为存储器时,图像采集器111可为本地的或例如集成在源设备中的集成存储器。当所述图像采集器111包括接口时,接口可例如为从外部视频源接收图像的外部接口,外部视频源例如为外部图像捕获设备,比如相机、外部存储器或外部图像生成设备,外部图像生成设备例如为外部计算机图形处理器、计算机或服务器。接口可以为根据任何专有或标准
化接口协议的任何类别的接口,例如有线或无线接口、光接口。
图像可以视为像素点(picture element)的二维阵列或矩阵。阵列中的像素点也可以称为采样点。阵列或图像在水平和垂直方向(或轴线)上的采样点数目定义图像的尺寸和/或分辨率。为了表示颜色,通常采用三个颜色分量,即图像可以表示为或包含三个采样阵列。例如在RBG格式或颜色空间中,图像包括对应的红色、绿色及蓝色采样阵列。但是,在视频编码中,每个像素通常以亮度/色度格式或颜色空间表示,例如对于YUV格式的图像,包括Y指示的亮度分量(有时也可以用L指示)以及U和V指示的两个色度分量。亮度(luma)分量Y表示亮度或灰度水平强度(例如,在灰度等级图像中两者相同),而两个色度(chroma)分量U和V表示色度或颜色信息分量。相应地,YUV格式的图像包括亮度采样值(Y)的亮度采样阵列,和色度值(U和V)的两个色度采样阵列。RGB格式的图像可以转换或变换为YUV格式,反之亦然,该过程也称为色彩变换或转换。如果图像是黑白的,该图像可以只包括亮度采样阵列。本申请中,由图像采集器111传输至编码器113的图像也可称为原始图像数据。
预处理器112用于接收图像采集器111采集的原始图像,并对原始图像进行预处理,得到预处理后图像。例如,预处理器112执行的预处理包括整修、色彩格式转换(例如,从RGB格式转换为YUV格式)、调色或去噪等。
编码器113用于接收预处理器112生成的预处理后图像,对预处理后图像进行压缩编码得到位流。示例地,编码器113可以包括码控单元1131和编码单元1132,码控单元1131用于确定对当前帧中每个编码单元进行编码所使用的量化参数,以便于编码单元1132根据量化参数对预处理后图像进行预测、量化和编码得到位流,其中,编码器113可以根据编码单元的图像内容和位流缓冲区中数据的比特数确定目标比特数,根据目标比特数确定的量化参数对编码单元进行编码。
通信接口114用于接收编码器113生成的位流,通过通信信道130向目的设备120发送位流,以便于目的设备120根据位流重建原始图像。
目的设备120包括显示器121、后处理器122、解码器123和通信接口124。
通信接口124用于接收通信接口114发送的位流,并将位流传输给解码器123。以便于解码器123根据位流重建原始图像。
通信接口114和通信接口124可用于通过源设备110与目的设备120之间的直连通信链路,例如直接有线或无线连接等,或者通过任意类型的网络,例如有线网络、无线网络或其任意组合、任意类型的私网和公网或其任意类型的组合,发送或接收原始图像的相关数据。
通信接口114和通信接口124均可配置为如图1中从源设备110指向目的设备120的对应通信信道130的箭头所指示的单向通信接口,或双向通信接口,并且可用于发送和接收消息等,以建立连接,确认并交换与通信链路和/或例如编码后的位流传输等数据传输相关的任何其它信息,等等。
解码器123用于对位流进行解码,并重建原始图像。示例地,解码器123对位流进行熵解码、反量化和预测重建,得到重建后图像。其中,解码器123可以包括码控单元1231和解码单元1232,码控单元1231用于确定对当前帧中每个编码单元进行解码所使用的量化参数,以便于解码单元1232根据量化参数对位流进行解码、反量化、预测重建得到重建后图像,其中,解码器123可以根据编码单元的图像内容和位流缓冲区中数据的比特数确定编码单元的目标比特数,根据目标比特数确定的量化参数对编码单元的位流进行解码。
后处理器122用于接收解码器123生成的重建后图像,对重建后图像进行后处理。例如,后处理器122执行的后处理包括色彩格式转换(例如,从YUV格式转换为RGB格式)、调色、整修或重采样,或任何其它处理等。
显示器121用于显示重建后图像。显示器121可以为或可以包括任何类别的用于呈现经重构图片的显示设备,例如,集成的或外部的显示器或监视器。例如,显示器可以包括液晶显示器(liquid crystal display,LCD)、有机发光二极管(organic light emitting diode,OLED)显示器、等离子显示器、投影仪、微LED显示器、硅基液晶(liquid crystal on silicon,LCoS)、数字光处理器(digital light processor,DLP)或任何类别的其它显示器。
其中,编码器113和解码器123都可以实施为各种合适电路中的任一个,例如,一个或多个
微处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)、离散逻辑、硬件或其任何组合。如果部分地以软件实施所述技术,则设备可将软件的指令存储于合适的非暂时性计算机可读存储介质中,且可使用一或多个处理器以硬件执行指令从而执行本公开的技术。前述内容(包含硬件、软件、硬件与软件的组合等)中的任一者可视为一或多个处理器。
图像采集器111和编码器113可以集成在一个物理设备上,也可以设置在不同的物理设备上,不予限定。示例地,如图1所示的源设备110包括图像采集器111和编码器113,表示图像采集器111和编码器113集成在一个物理设备上,则源设备110也可称为采集设备。源设备110例如是手机、平板电脑、计算机、笔记本电脑、摄像机、照相机、可穿戴设备、车载设备、终端设备、虚拟现实(virtual reality,VR)设备、增强现实(Augmented Reality,AR)设备、混合现实(Mixed Reality,MR)设备或扩展现实(Extended Reality,XR)设备或者其他采集图像设备。若源设备110不包括图像采集器111,表示图像采集器111和编码器113是两个不同的物理设备,源设备110可以从其他设备(如:采集图像设备或存储图像设备)获取原始图像。
另外,显示器121和解码器123可以集成在一个物理设备上,也可以设置在不同的物理设备上,不予限定。示例地,如图1所示的目的设备120包括显示器121和解码器123,表示显示器121和解码器123集成在一个物理设备上,则目的设备120也可称为回放设备,目的设备120具有解码和显示重建后图像的功能。目的设备120例如是显示器、电视机、数字媒体播放器、视频游戏控制台、车载计算机或其他显示图像的设备。若目的设备120不包括显示器121,表示显示器121和解码器123是两个不同的物理设备,目的设备120对位流解码重建原始图像后,将重建后图像传输给其他显示设备(如:电视机、数字媒体播放器),由其他显示设备显示重建后图像。
此外,图1示出了源设备110和目的设备120可以集成在一个物理设备上,也可以设置在不同的物理设备上,不予限定。
示例地,如图2中的(a)所示,源设备110可以是摄像头,目的设备120可以是各种可能形态的显示器。源设备110可以采集第一场景的视频,将视频中多帧原始图像传输至编解码设备,编解码设备对原始图像进行编解码处理,得到重建后图像,由目的设备120显示重建后图像,播放视频。
又示例地,如图2中的(b)所示,源设备110和目的设备120集成在虚拟现实(virtual reality,VR)设备、增强现实(Augmented Reality,AR)设备、混合现实(Mixed Reality,MR)设备或扩展现实(Extended Reality,XR)设备中,则VR/AR/MR/XR设备具备采集原始图像、显示重建后图像和编解码的功能。源设备110可以采集用户所处现实场景的图像和目的设备120可以在虚拟环境中显示现实场景的重建后图像。
在这些实施例中,源设备110或其对应功能和目的设备120或其对应功能可以使用相同硬件和/或软件或通过单独的硬件和/或软件或其任意组合来实现。根据描述,图1所示的源设备110和/或目的设备120中的不同单元或功能的存在和划分可能根据实际设备和应用而有所不同,这对技术人员来说是显而易见的。
上述编解码系统的结构只是示意性说明,在一些可能的实现方式中,编解码系统还可以包括其他设备,例如,编解码系统还可以包括端侧设备或云侧设备。源设备110采集到原始图像后,对原始图像进行预处理,得到预处理后图像;并将预处理后图像传输至端侧设备或云侧设备,由端侧设备或云侧设备实现对预处理后图像进行编解码的功能。
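The image encoding and decoding method provided by this application is applied at the encoding end and the decoding end. The structures of the encoder and the decoder are described in detail with reference to Figure 3. As shown in Figure 3, the encoder 300 includes a prediction unit 310, a code control unit 320, a quantization unit 330, an encoding unit 340, and a block division unit 350.
The block division unit 350 is used to divide the original image into multiple coding units.
The code control unit 320 is used to determine the target number of bits of the coding unit according to the image content of the coding unit currently to be encoded, as output by the block division unit 350, and the number of bits of data in the bit stream buffer, and to determine the quantization parameter according to the target number of bits.
In some embodiments, when the encoder 300 transmits the bit stream of an image frame to the decoder 400 after finishing encoding the frame, the data in the bit stream buffer includes the bit stream of the image frame, and the number of bits of data in the bit stream buffer includes the number of bits of the bit stream of the image frame.
In other embodiments, when the encoder 300 transmits the bit stream of a coding unit to the decoder 400 after finishing encoding the coding unit in a frame, the data in the bit stream buffer includes the bit streams of one or more coding units, and the number of bits of data in the bit stream buffer includes the number of bits of the bit streams of the one or more coding units. Understandably, the bit streams of the one or more coding units may be the bit streams of the coding units already encoded by the encoder 300 minus the bit streams of the encoded coding units already transmitted by the encoder 300 to the decoder 400.
The prediction unit 310 is used to perform intra prediction on the coding unit output by the block division unit 350 to obtain a predicted number of bits, and to output the residual between the original number of bits and the predicted number of bits of the coding unit. For an explanation of intra prediction, refer for example to intra prediction in HEVC. Intra prediction is a common means of removing spatially redundant information from the original image: the reconstructed pixels of adjacent coding blocks are used as reference values to predict the current coding unit. Because a coding unit in the original image is correlated with its surrounding coding blocks, the pixel values of the current coding unit can be estimated from the surrounding reconstructed coding units. The estimated pixel values are the predicted values, and the residual between the predicted values and the original values of the current coding unit is quantized and entropy encoded. What is encoded and transmitted is usually the prediction residual. The decoding end performs the same prediction process to obtain the predicted values of the current coding unit, and then adds the predicted values to the decoded residual to obtain the reconstructed values of the current coding unit.
The quantization unit 330 is used to quantize the residual output by the prediction unit 310 according to the quantization parameter output by the code control unit 320 to obtain a quantized residual.
The encoding unit 340 is used to encode the quantized residual output by the quantization unit 330 to obtain the bit stream of the coding unit, for example by entropy encoding the quantized residual output by the quantization unit 330.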
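The structure of the decoder is described in detail with reference to Figure 3. As shown in Figure 3, the decoder 400 includes a decoding unit 410, a code control unit 420, an inverse quantization unit 430, and a prediction reconstruction unit 440.
The decoding unit 410 is used to decode the bit stream of the coding unit to obtain the quantized residual and the image content.
The code control unit 420 determines the target number of bits of the coding unit according to the image content of the coding unit currently to be decoded, as output by the decoding unit 410, and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits.
In some embodiments, when the encoder 300 transmits the bit stream of an image frame to the decoder 400 after finishing encoding the frame, the data in the bit stream buffer of the decoder 400 includes the decoded data of the image frame, and the number of bits of data in the bit stream buffer includes the number of bits of the decoded data of the image frame.
In other embodiments, when the encoder 300 transmits the bit stream of a coding unit to the decoder 400 after finishing encoding the coding unit in a frame, the data in the bit stream buffer of the decoder 400 includes the decoded data of one or more coding units, and the number of bits of data in the bit stream buffer includes the number of bits of the decoded data of the one or more coding units.
The inverse quantization unit 430 is used to inverse-quantize the quantized residual output by the decoding unit 410 according to the quantization parameter output by the code control unit 420 to obtain the residual.
The prediction reconstruction unit 440 is used to perform prediction reconstruction according to the residual output by the inverse quantization unit 430 to obtain the reconstructed image, so that the display can show the reconstructed image.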
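To solve the problem of how to determine the quantization parameter used for encoding and decoding an image so as to ensure the quality of the reconstructed image, this application provides an image encoding and decoding method in which the target number of bits is set dynamically by taking into account both the image content of the coding unit and the number of bits of data in the bit stream buffer. At the encoding end, the coding unit is encoded using the quantization parameter determined from the target number of bits. In this way, the number of coded bits after the image is encoded is reduced while the quality of the reconstructed image is ensured.
Next, the image encoding and decoding process is described with reference to the drawings. Figure 4 is a schematic flow chart of an image encoding and decoding method provided by this application. The description takes as an example the case in which the source device 110 and the destination device 120 in Figure 1 perform the image encoding and decoding process. As shown in Figure 4, the method includes the following steps.
Step 410: The source device 110 obtains the coding unit to be encoded in the current frame.
As described in the above embodiments, if the source device 110 has the image collector 111, the source device 110 can capture the original image through the image collector 111. Optionally, the source device 110 may also receive an original image captured by another device, or obtain the original image from the memory in the source device 110 or from another memory. The original image may include at least one of a real-world image captured in real time, an image stored by a device, and an image synthesized from multiple images. This embodiment does not limit how the original image is obtained or the type of the original image.
The current frame refers to a frame of image or original image that is being encoded or decoded at the current moment. A previous frame refers to a frame of image or original image that was encoded or decoded at a moment before the current moment. The previous frame may be the frame at the moment immediately before the current moment or at several earlier moments.
The source device 110 may divide the current frame into multiple coding units and encode the multiple coding units.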
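Step 420: The source device 110 determines the target number of bits according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits.
The image content of the coding unit indicates the complexity of different pixel regions in the coding unit, for example the complexity of the color, texture, or shape of a pixel region. In some embodiments, the encoder 113 divides the coding unit into several sub-blocks and, for each sub-block, computes the differences between adjacent pixel values step by step in both the horizontal and vertical directions. The absolute values of the differences are summed to obtain the complexity value of the sub-block, and this value is compared with thresholds to obtain the complexity level of the sub-block. The complexity levels of the sub-blocks are then combined by a rule-based operation to obtain the complexity level k of the coding unit. Different coding units in the current frame may have different or identical complexity levels.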
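The complexity-level computation described above can be sketched in Python. The text does not give the sub-block size, the thresholds, or the rule that combines sub-block levels, so `sub_w`, `thresholds`, and the use of an integer mean below are assumptions made only for illustration.

```python
from bisect import bisect_right


def subblock_activity(block):
    """Sum of absolute differences between horizontally and vertically adjacent pixels."""
    rows, cols = len(block), len(block[0])
    horiz = sum(abs(block[r][c] - block[r][c - 1])
                for r in range(rows) for c in range(1, cols))
    vert = sum(abs(block[r][c] - block[r - 1][c])
               for r in range(1, rows) for c in range(cols))
    return horiz + vert


def complexity_level(activity, thresholds):
    """Map an activity value to a level by comparing it with ascending thresholds."""
    return bisect_right(thresholds, activity)


def coding_unit_level(cu, sub_w, thresholds):
    """Split the coding unit into sub-blocks of width sub_w and combine their levels.

    Assumption: the combining rule is an integer mean; the source only says
    that a rule-based operation is applied.
    """
    cols = len(cu[0])
    levels = []
    for c0 in range(0, cols, sub_w):
        sub = [row[c0:c0 + sub_w] for row in cu]
        levels.append(complexity_level(subblock_activity(sub), thresholds))
    return sum(levels) // len(levels)
```

A flat sub-block yields an activity of zero and therefore the lowest level, while a checkerboard-like sub-block produces a large sum of absolute differences and a high level.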
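The number of bits of data in the bit stream buffer indicates the number of bits of the bit streams of the already encoded coding units of the current frame that are stored in the bit stream buffer. Before encoding, a physical buffer for storing the bit stream can be preconfigured in memory. The bit stream buffer can be derived from the number of bits of the bit streams of the encoded coding units stored in the physical buffer.
The source device 110 determines the target number of bits according to the complexity level indicated by the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits. For a detailed explanation of how the quantization parameter is determined, refer to the description of steps 510 to 560 below.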
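Step 430: The source device 110 encodes the coding unit according to the quantization parameter to obtain the bit stream of the coding unit.
The source device 110 may perform encoding operations such as transform or quantization and entropy encoding on the coding unit to generate the bit stream, thereby compressing the data of the coding unit to be encoded. The number of bits of the bit stream of the coding unit may be smaller or larger than the target number of bits. For the specific method of generating the bit stream, refer to the prior art and to the description of the encoding unit 340 in the above embodiments.
Step 440: The source device 110 sends the bit stream to the destination device 120.
The source device 110 may send the bit stream of the video to the destination device 120 after the entire video has been encoded. Alternatively, the source device 110 may encode the original images in real time frame by frame and send the bit stream of a frame after that frame has been encoded. Alternatively, the source device 110 encodes the coding units of the original image and sends the bit stream of a coding unit after that coding unit has been encoded. For the specific method of sending the bit stream, refer to the prior art and to the description of the communication interface 114 and the communication interface 124 in the above embodiments.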
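Step 450: The destination device 120 obtains the bit stream of the coding unit to be decoded in the image bit stream.
Step 460: The destination device 120 determines the target number of bits according to the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits of the coding unit.
After receiving the bit stream of the coding unit, the destination device 120 decodes it to obtain the image content of the coding unit and the encoded data of the coding unit, and then determines the quantization parameter according to the number of bits of data in the bit stream buffer and the image content of the coding unit. The destination device 120 determines the target number of bits according to the complexity level indicated by the image content of the coding unit and the number of bits of data in the bit stream buffer, and determines the quantization parameter according to the target number of bits. For a detailed explanation of how the quantization parameter is determined, refer to the description of steps 510 to 560 below.
Step 470: The destination device 120 decodes the bit stream of the coding unit of the current frame according to the quantization parameter to obtain the reconstructed image.
The destination device 120 decodes the encoded data of the coding unit according to the quantization parameter determined from the target number of bits of the coding unit to obtain the reconstructed image.
The destination device 120 displays the reconstructed image. Alternatively, the destination device 120 transmits the reconstructed image to another display device, which displays the reconstructed image.
In this way, to obtain better codec performance and reconstructed image quality, the quantization parameter is set dynamically by balancing the number of bits of data in the bit stream buffer against the image content of the coding unit, so that the number of coded bits after the image is encoded is reduced while the quality of the reconstructed image is ensured.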
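Next, the process of determining the quantization parameter is described in detail with reference to the drawings. Figure 5 is a schematic flow chart of an image encoding method provided by this application. The description takes as an example the case in which the encoder 300 in Figure 3 determines the quantization parameter. The method flow of Figure 5 elaborates the specific operations included in step 420 and step 460 of Figure 4. As shown in Figure 5, the method includes the following steps.
Step 510: The encoder 300 determines the number of lossless bits of the coding unit according to the image content of the coding unit.
The image content of the coding unit indicates the complexity of different pixel regions in the coding unit. The encoder 300 can determine the complexity level of the coding unit according to the image content of the coding unit, for example as described in step 420. The number of lossless bits of the coding unit indicates the expected number of bits after lossless encoding of the coding unit.
In some embodiments, the number of lossless bits may be a default value configured from experience. In other embodiments, the encoder 300 sets the expected number of bits after lossless encoding of the coding units not yet encoded according to the number of bits of the coding units already encoded. The encoder 300 may look up a table according to the identifier of the coding unit and the complexity level of the coding unit to determine the number of lossless bits of the coding unit. Let BLL denote the number of lossless bits. BLL=RecordBLL[T][k], where T denotes the identifier of the coding unit and k denotes the complexity level of the coding unit.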
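Step 520: The encoder 300 determines the number of lossy bits of the coding unit according to the number of bits of data in the bit stream buffer.
The number of lossy bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit when the content of the coding unit is not taken into account.
The encoder 300 determines the adjustment value of the number of lossy bits according to the number of bits of the encoded coding units in the bit stream buffer. The encoder 300 then determines the number of lossy bits from the initial value and the adjustment value of the number of lossy bits. The number of lossy bits satisfies Formula (1).
Bpp=BppINI+BppADJ    Formula (1)
Here, Bpp denotes the number of lossy bits, BppINI denotes the initial value of the number of lossy bits, and BppADJ denotes the adjustment value of the number of lossy bits. The initial value of the number of lossy bits is determined from the number of bits of the coding unit and the compression ratio. The compression ratio is determined by the requirements of the actual application scenario.
The adjustment value of the number of lossy bits is proportional to (RcBufEND-RcBufT), where RcBufEND denotes the expected number of bits in the bit stream buffer when encoding or decoding of the current frame ends, and RcBufT denotes the number of bits of the encoded coding units in the bit stream buffer. If RcBufEND-RcBufT is greater than 0, the number of bits of the encoded coding units in the bit stream buffer does not exceed the expected number of bits in the buffer at the end of encoding or decoding the current frame, and a larger target number of bits can be allocated to the coding units not yet encoded. If RcBufEND-RcBufT is less than 0, the number of bits of the encoded coding units in the bit stream buffer exceeds the expected number of bits in the buffer at the end of encoding or decoding the current frame, and a smaller target number of bits can be allocated to the coding units not yet encoded. If RcBufEND-RcBufT equals 0, the number of bits of the encoded coding units in the bit stream buffer equals the expected number of bits in the buffer at the end of encoding or decoding the current frame, and a smaller target number of bits can be allocated to the coding units not yet encoded.
The number of bits of the encoded coding units in the bit stream buffer is obtained linearly from the number of bits of the encoded coding units in the physical buffer. For example, RcBufT=PhyBufT+X0, where PhyBufT denotes the number of bits of the encoded coding units in the physical buffer, and the physical buffer refers to the storage space in memory used to store the bit streams of the encoded coding units. The storage capacity of the physical buffer may be the number of bits of the bit streams of one or more coding units. X0 denotes an agreed parameter.
Different coding units in the current frame correspond to the same RcBufEND. Coding units in different frames may correspond to the same or different RcBufEND.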
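Formula (1) and the relation RcBufT=PhyBufT+X0 can be sketched in Python as follows. The text only states that the adjustment value is proportional to (RcBufEND-RcBufT); the proportionality constant `gain` and all numeric values in the usage note are hypothetical.

```python
def virtual_buffer_bits(phy_buf_bits, x0):
    """RcBufT = PhyBufT + X0: linear mapping from the physical buffer level."""
    return phy_buf_bits + x0


def lossy_bits(bpp_ini, rc_buf_end, rc_buf_t, gain):
    """Formula (1): Bpp = BppINI + BppADJ.

    Assumption: BppADJ = gain * (RcBufEND - RcBufT), where 'gain' is a
    hypothetical constant standing in for the undisclosed proportionality factor.
    """
    bpp_adj = gain * (rc_buf_end - rc_buf_t)
    return bpp_ini + bpp_adj
```

For instance, with a buffer level below the end-of-frame expectation the adjustment is positive and the lossy bit budget grows; with a level above the expectation it shrinks.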
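Step 530: The encoder 300 determines the information amount according to the number of lossy bits and the average number of lossless bits.
The information amount represents the proportion of the information of the coding unit currently to be encoded within the information of the current frame, that is, the complexity of the content expressed by the coding unit relative to the content expressed by the current frame. For example, the information amount is the ratio of the number of lossy bits to the average number of lossless bits. The average number of lossless bits indicates the expected number of bits after lossless encoding of the current frame. It may be the average expected number of bits after lossless encoding of each coding unit in the current frame. For example, if the current frame includes coding unit 1 and coding unit 2, the expected number of bits after lossless encoding of coding unit 1 is 10 bits, and that of coding unit 2 is 20 bits, then the average number of lossless bits of the current frame is 15 bits. The information amount represents the complexity level of the coding unit. The information amount satisfies Formula (2).
R=(Bpp*InvTab[X3*BAVG-bitsOffset-1]+X4)*X5    Formula (2)
Here, R denotes the information amount, Bpp denotes the number of lossy bits, and BAVG denotes the average number of lossless bits. bitsOffset denotes the offset, bitsOffset=BitsOffset-X1*Bpp+X2, where BitsOffset denotes the initial value of the offset, which is related to the image bit depth. X1, X2, X3, X4 and X5 denote agreed parameters.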
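Formula (2) can be illustrated with a small Python sketch. The text does not define InvTab; the sketch assumes it is a fixed-point reciprocal lookup table, so that `InvTab[n-1]` approximates 1/n and R approximates the ratio of lossy bits to average lossless bits. The table size, the `SHIFT` precision, and all parameter values in the usage note are hypothetical.

```python
SHIFT = 10  # hypothetical fixed-point precision
# Assumed reciprocal table: INV_TAB[i] is about 2**SHIFT / (i + 1)
INV_TAB = [round((1 << SHIFT) / (i + 1)) for i in range(1024)]


def average_lossless_bits(per_cu_lossless_bits):
    """Average expected lossless bits over the coding units of the frame."""
    return sum(per_cu_lossless_bits) / len(per_cu_lossless_bits)


def information_amount(bpp, b_avg, bits_offset_init, x1, x2, x3, x4, x5):
    """Formula (2): R = (Bpp * InvTab[X3*BAVG - bitsOffset - 1] + X4) * X5."""
    bits_offset = bits_offset_init - x1 * bpp + x2
    index = int(x3 * b_avg - bits_offset - 1)
    return (bpp * INV_TAB[index] + x4) * x5
```

With the two coding units from the text (10 and 20 lossless bits), the average is 15. Taking Bpp = 12, a zero offset, X3 = 1, X4 = 0 and X5 = 1/1024 gives R close to 12/15 = 0.8, matching the description of the information amount as a ratio.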
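Step 540: The encoder 300 determines the target number of bits according to the number of lossless bits and the information amount of the coding unit.
The target number of bits indicates the expected number of bits after lossy encoding of the coding unit when the content of the coding unit is taken into account, that is, the expected number of bits after the encoder 300 quantizes and encodes the coding unit with reference to its content.
The larger the number of lossy bits, the larger the target number of bits; the smaller the number of lossy bits, the smaller the target number of bits. The target number of bits satisfies Formula (3).
BTGT=(R*(BLL-bitsOffset)+X6)*X7    Formula (3)
Here, BTGT denotes the target number of bits, R denotes the information amount, BLL denotes the number of lossless bits, bitsOffset denotes the offset, and X6 and X7 are agreed parameters.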
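Step 550: The encoder 300 clamps the target number of bits according to at least one of the number of lossy bits, the number of lossless bits, and the bit stream buffer fullness to obtain the clamp value of the target number of bits.
The encoder 300 determines the minimum value BMIN and the maximum value BMAX of the target number of bits according to the number of lossy bits Bpp, the number of lossless bits BLL, and the bit stream buffer fullness F, and then clamps the target number of bits BTGT according to BMIN and BMAX to obtain the clamp value B’TGT of the target number of bits. The clamp value of the target number of bits satisfies Formula (4).
B’TGT=MIN(MAX(BMIN,BTGT),BMAX)    Formula (4)
For example, if BMIN>BTGT and BMIN<BMAX, the target number of bits is BMIN. As another example, if BMIN>BTGT and BMIN>BMAX, the target number of bits is BMAX. As another example, if BMIN<BTGT and BTGT<BMAX, the target number of bits is BTGT.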
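The three cases above follow directly from Formula (4), as the following minimal Python sketch shows. The function name and the numeric values are illustrative only.

```python
def clamp_target(b_min, b_tgt, b_max):
    """Formula (4): B'TGT = MIN(MAX(BMIN, BTGT), BMAX)."""
    return min(max(b_min, b_tgt), b_max)


# Case 1: BMIN > BTGT and BMIN < BMAX, so the result is BMIN
case_1 = clamp_target(100, 80, 200)
# Case 2: BMIN > BTGT and BMIN > BMAX, so the result is BMAX
case_2 = clamp_target(100, 80, 90)
# Case 3: BMIN < BTGT and BTGT < BMAX, so the result is BTGT
case_3 = clamp_target(100, 150, 200)
```

Note that the outer MIN is applied last, so when the minimum exceeds the maximum (case 2) the maximum wins.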
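The bit stream buffer fullness indicates how much of the bit stream buffer is occupied by the bit streams of the encoded coding units. The bit stream buffer fullness satisfies Formula (5).
F=RcBufT/RcBufMAX    Formula (5)
Here, F denotes the bit stream buffer fullness, RcBufT denotes the number of bits of the encoded coding units in the bit stream buffer, and RcBufMAX denotes the maximum number of bits allowed in the bit stream buffer. To keep the number of bits in the physical buffer under control, if the bit stream buffer fullness is high, the expected number of bits after lossy encoding of the coding unit with reference to its content is reduced, so as to lower the number of bits in the physical buffer.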
步骤560、编码器300根据无损比特数和目标比特数的钳位值确定量化参数。
编码器300在参考编码单元的内容时对编码单元进行量化编码后的期望比特数进行钳位后,根据无损比特数和目标比特数的钳位值确定量化参数。量化参数满足公式(6)。
QP=(BLL-B’TGT+X8)*X9)*X10 公式(6)
QP=(BLL-B’TGT+X8)*X9)*X10 公式(6)
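where BLL denotes the number of lossless bits, B’TGT denotes the clamped value of the target number of bits, and X8, X9, and X10 are agreed parameters.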
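Formula (6) can be sketched as below. The source formula has an unbalanced parenthesis, so the grouping used here, ((BLL-B’TGT+X8)*X9)*X10, is an assumed reading; the function name and parameter values are placeholders.

```python
def quantization_parameter(b_ll, b_tgt_clamped, x8, x9, x10):
    """QP = ((BLL - B'TGT + X8) * X9) * X10 (assumed parenthesization)."""
    return ((b_ll - b_tgt_clamped + x8) * x9) * x10


# Placeholder parameters X8 = 0, X9 = 1, X10 = 1.
print(quantization_parameter(20, 12, 0, 1, 1))
print(quantization_parameter(20, 8, 0, 1, 1))
```

With positive X9 and X10, the quantization parameter grows as the clamped target falls further below the number of lossless bits, that is, a tighter bit budget leads to coarser quantization.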
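Understandably, the encoder determines the expected number of bits after encoding of a coding unit based on how complex the coding unit currently to be encoded is relative to the whole frame, that is, the value is derived from the complexity level of the coding unit and the average complexity level of the whole frame. Thus, to obtain better coding performance and reconstructed image quality, the rate control module allocates different expected post-encoding numbers of bits to different coding units within the image, so as to make maximal use of the specified total number of coding bits and make the decompressed image quality as good as possible.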
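Fig. 6 is a schematic flowchart of rate control in the decoding process. Fig. 6 differs from Fig. 5 in that the quantization parameter output by rate control is used in the dequantization process. For the process of determining the quantization parameter, refer to the explanation of Fig. 5 above.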
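The following describes, by way of example, a method for determining the clamped value of the target number of bits according to the minimum value BMIN and the maximum value BMAX of the target number of bits. As shown in Fig. 7, the method includes the following steps.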
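Step 710: Compute the first minimum value BMIN1. For example, BMIN1=(ParamBpp×Bpp+K1)×(MIN(BLL,BLLMAX)×K2+K3), where BLL denotes the number of lossless bits, Bpp denotes the number of lossy bits, BLLMAX denotes the maximum number of lossless bits of a coding unit after lossless encoding at the current pixel bit depth, ParamBpp denotes an agreed parameter at the current pixel bit depth, and K1, K2, and K3 denote agreed parameters.
Step 720: Compute the second minimum value BMIN2. For example, BMIN2=(K4-K5×F)×Bpp+K6, where F denotes the bitstream buffer fullness, Bpp denotes the number of lossy bits, and K4, K5, and K6 denote agreed parameters.
Step 730: Compute the third minimum value BMIN3. For example, BMIN3 is computed as follows: 1) Compute the lossless bit difference Brelative, Brelative=Clip(0,320,BLLMAX–BLL), where Clip limits the value of BLLMAX–BLL to the range 0 to 320, BLL denotes the number of lossless bits, and BLLMAX denotes the maximum number of lossless bits of a coding unit after lossless encoding at the current pixel bit depth. 2) Compute the third minimum value BMIN3, BMIN3=Bpp-(K7×(F-Flimit)+K8)-Sr×Brelative, where Flimit denotes the agreed upper limit of the bitstream buffer fullness, Bpp denotes the number of lossy bits, F denotes the bitstream buffer fullness, Sr denotes a parameter related to the image chroma sampling rate, and K7 and K8 denote agreed parameters.
Step 740: Compute the minimum value BMIN. The minimum value of the target number of bits is BMIN=MAX(MAX(BMIN1,BMIN2),BMIN3).
Here, BMIN1 denotes the first minimum value computed in step 710, BMIN2 denotes the second minimum value computed in step 720, and BMIN3 denotes the third minimum value computed in step 730.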
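Step 750: Compute the offset of the first maximum value BMAX1. For example, the offset of the first maximum value is bppOffset1=K9-(K10×F+K11)-Sr×Brelative, where F denotes the bitstream buffer fullness, Brelative denotes the lossless bit difference computed in step 730, Sr denotes a parameter related to the image chroma sampling rate, and K9, K10, and K11 denote agreed parameters.
Step 760: Compute the offset of the second maximum value BMAX2. For example, the offset of the second maximum value is bppOffset2=MAX(K12-Bpp,K13)-Brelative, where K12 and K13 denote agreed parameters.
Step 770: Compute the offset of the third maximum value BMAX3. For example, the offset of the third maximum value is bppOffset3=K14*(Flimit-F), where F denotes the input buffer fullness and K14 denotes an agreed parameter.
Step 780: Compute the maximum value BMAX. The maximum value of the target number of bits is computed as follows: 1) Compute the offset of the maximum value of the target number of bits, bppOffset=MIN(MIN(bppOffset1,bppOffset2),bppOffset3), where bppOffset1 denotes the offset of the first maximum value BMAX1 computed in step 750, bppOffset2 denotes the offset of the second maximum value BMAX2 computed in step 760, and bppOffset3 denotes the offset of the third maximum value BMAX3 computed in step 770. 2) Compute the maximum value of the target number of bits, BMAX=MAX(Bpp+bppOffset,BMIN), where BMIN denotes the minimum value computed in step 740.
Step 790: Clamp the target number of bits BTGT. The minimum value BMIN computed in step 740 and the maximum value BMAX computed in step 780 define the clamping range, and clamping BTGT to this range gives the clamped value of the target number of bits, B’TGT=MIN(MAX(BMIN,BTGT),BMAX).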
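Steps 710 to 790 can be combined into one routine, sketched below. The `ClampParams` container, the function names, and every numeric value in the usage example are hypothetical; the agreed parameters ParamBpp, K1 to K14, Flimit, Sr, and BLLMAX are not specified in the text, so the numbers only exercise the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ClampParams:
    """Agreed parameters; all values supplied here are placeholders."""
    param_bpp: float  # ParamBpp
    k: dict           # K1..K14, keyed by integer index 1..14
    f_limit: float    # Flimit
    s_r: float        # Sr
    b_ll_max: float   # BLLMAX


def clip(low, high, value):
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def target_bit_range(bpp, b_ll, fullness, p):
    """Return (BMIN, BMAX) following steps 710 to 780."""
    k = p.k
    # Steps 710-740: three candidate minimums, then their maximum.
    b_min1 = (p.param_bpp * bpp + k[1]) * (min(b_ll, p.b_ll_max) * k[2] + k[3])
    b_min2 = (k[4] - k[5] * fullness) * bpp + k[6]
    b_relative = clip(0, 320, p.b_ll_max - b_ll)
    b_min3 = bpp - (k[7] * (fullness - p.f_limit) + k[8]) - p.s_r * b_relative
    b_min = max(max(b_min1, b_min2), b_min3)
    # Steps 750-780: three candidate offsets, then their minimum.
    offset1 = k[9] - (k[10] * fullness + k[11]) - p.s_r * b_relative
    offset2 = max(k[12] - bpp, k[13]) - b_relative
    offset3 = k[14] * (p.f_limit - fullness)
    bpp_offset = min(min(offset1, offset2), offset3)
    b_max = max(bpp + bpp_offset, b_min)
    return b_min, b_max


def clamp_target_bits(b_tgt, b_min, b_max):
    """Step 790: B'TGT = MIN(MAX(BMIN, BTGT), BMAX)."""
    return min(max(b_min, b_tgt), b_max)


params = ClampParams(
    param_bpp=0,
    k={1: 1, 2: 0, 3: 2, 4: 1, 5: 1, 6: 0, 7: 0, 8: 2,
       9: 20, 10: 4, 11: 0, 12: 40, 13: 0, 14: 16},
    f_limit=0.75, s_r=0.5, b_ll_max=100,
)
b_min, b_max = target_bit_range(bpp=8, b_ll=90, fullness=0.5, p=params)
print(b_min, b_max, clamp_target_bits(20, b_min, b_max))
```

Because BMAX is computed as MAX(Bpp+bppOffset,BMIN), the range returned by this routine always satisfies BMIN≤BMAX, so the clamp in step 790 is well defined.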
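It can be understood that, to implement the functions in the foregoing embodiments, the encoder and the decoder include hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application scenario and design constraints of the technical solution.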
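The image encoding and decoding method provided in this embodiment has been described in detail above with reference to Fig. 1 to Fig. 7. The codec apparatus provided in this embodiment is described below with reference to Fig. 8.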
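Fig. 8 is a schematic structural diagram of a possible codec apparatus provided in this embodiment. Such codec apparatuses can be used to implement the functions of the encoder and the decoder in the foregoing method embodiments, and can therefore also achieve the beneficial effects of those method embodiments. In this embodiment, the codec apparatus may be the encoder 300 and the decoder 400 shown in Fig. 3, or may be a module (such as a chip) applied to a computing device.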
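As shown in Fig. 8, the codec apparatus 800 includes a communication module 810, a rate control module 820, a coding module 830, and a storage module 840. The codec apparatus 800 is used to implement the functions of the encoder 300 and the decoder 400 in the method embodiments shown in Fig. 3 above.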
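When the codec apparatus 800 is used to implement the functions of the encoder 300, the specific functions of the modules are as follows.
The communication module 810 is configured to obtain a coding unit to be encoded in the current frame. For example, the communication module 810 is configured to perform step 410 in Fig. 4.
The rate control module 820 is configured to determine the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bitstream buffer, and to determine the quantization parameter according to the target number of bits of the coding unit. For example, the rate control module 820 is configured to perform step 420 in Fig. 4.
The coding module 830 is configured to encode the coding unit according to the quantization parameter to obtain the bitstream of the coding unit. For example, the coding module 830 is configured to perform step 430 in Fig. 4.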
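When the codec apparatus 800 is used to implement the functions of the decoder 400, the specific functions of the modules are as follows.
The communication module 810 is configured to obtain the bitstream of a coding unit to be decoded in the image bitstream. For example, the communication module 810 is configured to perform step 450 in Fig. 4.
The rate control module 820 is configured to determine the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bitstream buffer, and to determine the quantization parameter according to the target number of bits of the coding unit. For example, the rate control module 820 is configured to perform step 460 in Fig. 4.
The coding module 830 is configured to decode the bitstream of the coding unit according to the quantization parameter to obtain the reconstructed image of the coding unit. For example, the coding module 830 is configured to perform step 470 in Fig. 4.
The storage module 840 is configured to store the number of bits of data in the bitstream buffer, so that the rate control module 820 can determine the quantization parameter.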
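It should be understood that the codec apparatus 800 of the embodiments of this application may be implemented by an application-specific integrated circuit (ASIC) or by a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. When the method shown in Fig. 3 is implemented in software, the codec apparatus 800 and its modules may also be software modules.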
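The codec apparatus 800 according to the embodiments of this application may correspond to performing the methods described in the embodiments of this application, and the foregoing and other operations and/or functions of the units in the codec apparatus 800 respectively implement the corresponding procedures of the methods in Fig. 3. For brevity, details are not repeated here.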
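Fig. 9 is a schematic structural diagram of an image processing system provided in this application. The image processing system is described using a mobile phone as an example. The mobile phone, or a chip system built into the mobile phone, includes a memory 910, a processor 920, a sensor component 930, a multimedia component 940, and an input/output interface 950. The components of the mobile phone or of the chip system built into the mobile phone are described in detail below with reference to Fig. 9.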
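The memory 910 may be used to store data, software programs, and modules. It mainly includes a program storage area and a data storage area. The program storage area may store software programs, including instructions in the form of code, including but not limited to an operating system and application programs required by at least one function, such as a sound playback function and an image playback function. The data storage area may store data created through use of the mobile phone, such as audio data, image data, and a phone book. In the embodiments of this application, the memory 910 may be used to store the number of bits of data in the bitstream buffer, among other things. In some feasible embodiments, there may be one memory or multiple memories. The memory may include a floppy disk, a hard disk such as a built-in hard disk or a removable hard disk, a magnetic disk, an optical disc, a magneto-optical disc such as a CD-ROM or DVD-ROM, a storage device such as RAM, ROM, PROM, EPROM, EEPROM, or flash memory, or any other form of storage medium known in the art.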
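The processor 920 is the control center of the mobile phone. It connects the parts of the entire device through various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or software modules stored in the memory 910 and invoking the data stored in the memory 910, thereby monitoring the mobile phone as a whole. In the embodiments of this application, the processor 920 may be configured to perform one or more steps in the method embodiments of this application; for example, the processor 920 may be configured to perform one or more of steps 420 to 470 in the method embodiments. In some feasible embodiments, the processor 920 may be a single-processor structure, a multi-processor structure, a single-threaded processor, a multi-threaded processor, or the like. In some feasible embodiments, the processor 920 may include at least one of a central processing unit, a general-purpose processor, a digital signal processor, a neural network processor, a graphics processing unit, an image signal processor, a microcontroller, or a microprocessor. In addition, the processor 920 may further include other hardware circuits or accelerators, such as an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor 920 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor.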
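The sensor component 930 includes one or more sensors and provides status assessments of various aspects of the mobile phone. The sensor component 930 may include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications, that is, as part of a camera. In the embodiments of this application, the sensor component 930 may be used to support the camera in the multimedia component 940 in acquiring images. In addition, the sensor component 930 may include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor. Through the sensor component 930, the acceleration/deceleration, orientation, and open/closed state of the mobile phone, the relative positioning of components, or temperature changes of the mobile phone can be detected.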
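The multimedia component 940 provides a screen serving as an output interface between the mobile phone and the user. The screen may be a touch panel, and when the screen is a touch panel, it may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In addition, the multimedia component 940 includes at least one camera; for example, the multimedia component 940 includes a front camera and/or a rear camera. When the mobile phone is in an operating mode, such as a photo mode or a video mode, the front camera and/or the rear camera may sense external multimedia signals, which are used to form image frames. Each front camera and rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.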
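The input/output interface 950 provides an interface between the processor 920 and peripheral interface modules. For example, the peripheral interface modules may include a keyboard, a mouse, or a USB (Universal Serial Bus) device. In a possible implementation, there may be only one input/output interface 950, or there may be multiple input/output interfaces.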
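Although not shown, the mobile phone may further include an audio component, a communication component, and the like. For example, the audio component includes a microphone, and the communication component includes a wireless fidelity (WiFi) module, a Bluetooth module, and the like. Details are not described in the embodiments of this application.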
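The image processing system may be a general-purpose device or a dedicated device. For example, the image processing system may be an edge device (for example, a box carrying a chip with processing capability). Optionally, the image processing system may also be a server or another device with computing capability.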
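It should be understood that the image processing system according to this embodiment may correspond to the codec apparatus 800 in this embodiment and to the corresponding entity that performs any of the methods in Fig. 3, and that the foregoing and other operations and/or functions of the modules in the codec apparatus 800 respectively implement the corresponding procedures of the methods in Fig. 3. For brevity, details are not repeated here.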
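The method steps in this embodiment may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, and the software modules may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to it. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a computing device. The processor and the storage medium may also exist as discrete components in a computing device.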
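The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer programs or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium, for example a floppy disk, a hard disk, or a magnetic tape; an optical medium, for example a digital video disc (DVD); or a semiconductor medium, for example a solid state drive (SSD). The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and such modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.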
Claims (18)
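- An image decoding method, comprising: obtaining a bitstream of a coding unit to be decoded in an image bitstream; determining a target number of bits of the coding unit according to image content of the coding unit and the number of bits of data in a bitstream buffer, and determining a quantization parameter according to the target number of bits of the coding unit, the quantization parameter being used to decode the bitstream of the coding unit; wherein the image content indicates the complexity of different pixel regions in the coding unit, the bitstream buffer is used to store the number of coded bits of one or more coding units after decoding, and the target number of bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit with reference to the image content of the coding unit.
- The method according to claim 1, wherein determining the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bitstream buffer comprises: determining a number of lossless bits of the coding unit according to the image content of the coding unit, the number of lossless bits of the coding unit indicating the expected number of bits after lossless encoding of the coding unit; determining an information amount of the coding unit in a current frame according to the number of bits of data in the bitstream buffer, the information amount indicating the complexity of the content expressed by the coding unit relative to the content expressed by the current frame; and determining the target number of bits according to the number of lossless bits of the coding unit and the information amount of the coding unit.
- The method according to claim 2, wherein determining the information amount of the coding unit in the current frame according to the number of bits of data in the bitstream buffer comprises: determining a number of lossy bits of the coding unit according to the number of bits of data in the bitstream buffer, the number of lossy bits of the coding unit indicating the expected number of bits after lossy encoding of the coding unit without reference to the image content of the coding unit; and determining the information amount according to the number of lossy bits of the coding unit and an average number of lossless bits, the average number of lossless bits indicating the average expected number of bits after lossless encoding of each coding unit in the current frame.
- The method according to claim 2 or 3, wherein determining the quantization parameter according to the target number of bits of the coding unit comprises: clamping the target number of bits of the coding unit according to at least one of the number of lossless bits of the coding unit, the number of lossy bits of the coding unit, and a bitstream buffer fullness, to obtain a clamped value of the target number of bits, the bitstream buffer fullness indicating the ratio of the number of bits of data in the bitstream buffer to the storage capacity of the bitstream buffer; and determining the quantization parameter according to the number of lossless bits of the coding unit and the clamped value of the target number of bits.
- The method according to any one of claims 1-4, wherein the image content comprises a complexity level of the coding unit.
- The method according to claim 5, wherein the complexity level of the coding unit comprises at least one of a luma complexity level and a chroma complexity level.
- The method according to any one of claims 1-6, further comprising: decoding the bitstream of the coding unit according to the quantization parameter to obtain a reconstructed image of the coding unit.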
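- An image encoding method, comprising: obtaining a coding unit to be encoded in a current frame; determining a target number of bits of the coding unit according to image content of the coding unit and the number of bits of data in a bitstream buffer, and determining a quantization parameter according to the target number of bits of the coding unit, the quantization parameter being used to encode the coding unit; wherein the image content indicates the complexity of different pixel regions in the coding unit, the bitstream buffer is used to store the bitstream or partial bitstream of one or more coding units, and the target number of bits of the coding unit indicates the expected number of bits after lossy encoding of the coding unit with reference to the image content of the coding unit.
- The method according to claim 8, wherein determining the target number of bits of the coding unit according to the image content of the coding unit and the number of bits of data in the bitstream buffer comprises: determining a number of lossless bits of the coding unit according to the image content of the coding unit, the number of lossless bits of the coding unit indicating the expected number of bits after lossless encoding of the coding unit; determining an information amount of the coding unit in the current frame according to the number of bits of data in the bitstream buffer, the information amount indicating the complexity of the content expressed by the coding unit relative to the content expressed by the current frame; and determining the target number of bits according to the number of lossless bits of the coding unit and the information amount of the coding unit.
- The method according to claim 9, wherein determining the information amount of the coding unit in the current frame according to the number of bits of data in the bitstream buffer comprises: determining a number of lossy bits of the coding unit according to the number of bits of data in the bitstream buffer, the number of lossy bits of the coding unit indicating the expected number of bits after lossy encoding of the coding unit without reference to the image content of the coding unit; and determining the information amount according to the number of lossy bits of the coding unit and an average number of lossless bits, the average number of lossless bits indicating the average expected number of bits after lossless encoding of each coding unit in the current frame.
- The method according to claim 9 or 10, wherein determining the quantization parameter according to the target number of bits of the coding unit comprises: clamping the target number of bits of the coding unit according to at least one of the number of lossless bits of the coding unit, the number of lossy bits of the coding unit, and a bitstream buffer fullness, to obtain a clamped value of the target number of bits, the bitstream buffer fullness indicating the ratio of the number of bits of data in the bitstream buffer to the storage capacity of the bitstream buffer; and determining the quantization parameter according to the number of lossless bits of the coding unit and the clamped value of the target number of bits.
- The method according to any one of claims 8-11, wherein the image content comprises a complexity level of the coding unit.
- The method according to claim 12, wherein the complexity level of the coding unit comprises at least one of a luma complexity level and a chroma complexity level.
- The method according to any one of claims 8-13, further comprising: encoding the coding unit according to the quantization parameter to obtain the bitstream of the coding unit.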
- A codec apparatus, wherein the codec apparatus comprises modules configured to perform the method according to any one of claims 1-14.
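- A decoder, wherein the decoder comprises at least one processor and a memory, the memory being configured to store a computer program such that, when the computer program is executed by the at least one processor, the method according to any one of claims 1-7 is implemented.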
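- An encoder, wherein the encoder comprises at least one processor and a memory, the memory being configured to store a computer program such that, when the computer program is executed by the at least one processor, the method according to any one of claims 8-14 is implemented.
- A codec system, wherein the codec system comprises the encoder according to claim 17 and the decoder according to claim 16, the encoder being configured to perform the operation steps of the method according to any one of claims 8-14, and the decoder being configured to perform the method according to any one of claims 1-7.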
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211097577.X | 2022-09-08 | ||
CN202211097577 | 2022-09-08 | ||
CN202211696765.4 | 2022-12-28 | ||
CN202211696765.4A CN117676140A (zh) | 2022-09-08 | 2022-12-28 | Image encoding and decoding method, apparatus, encoder, decoder, and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024051329A1 true WO2024051329A1 (zh) | 2024-03-14 |
Family
ID=90068807
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/105320 WO2024051329A1 (zh) | Image encoding and decoding method, apparatus, encoder, decoder, and system | 2022-09-08 | 2023-06-30 |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN118631995A (zh) |
WO (1) | WO2024051329A1 (zh) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0971542A2 (en) * | 1998-07-10 | 2000-01-12 | Tektronix, Inc. | Readjustment of bit rates when switching between compressed video streams |
WO2005084035A2 (en) * | 2004-02-06 | 2005-09-09 | Apple Computer, Inc. | Rate and quality controller for h.264/avc video coder and scene analyzer therefor |
CN101159867A (zh) * | 2007-03-31 | 2008-04-09 | 红杉树(杭州)信息技术有限公司 | Slice-based adaptive rate control method |
EP3496403A1 (en) * | 2017-12-06 | 2019-06-12 | V-Nova International Limited | Hierarchical data structure |
CN113132726A (zh) * | 2019-12-31 | 2021-07-16 | 上海海思技术有限公司 | Encoding method and encoder |
WO2021244341A1 (zh) * | 2020-06-05 | 2021-12-09 | 中兴通讯股份有限公司 | Image encoding method and apparatus, electronic device, and computer-readable storage medium |
2022
- 2022-12-28 CN CN202410645756.5A patent/CN118631995A/zh active Pending
- 2022-12-28 CN CN202211696765.4A patent/CN117676140A/zh active Pending
2023
- 2023-06-30 WO PCT/CN2023/105320 patent/WO2024051329A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN117676140A (zh) | 2024-03-08 |
CN118631995A (zh) | 2024-09-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10412413B2 (en) | Image processing device and image processing method | |
CN107547907B (zh) | Encoding and decoding method and device | |
US20220295071A1 (en) | Video encoding method, video decoding method, and corresponding apparatus | |
WO2020244579A1 (zh) | MPM list construction method, and method and apparatus for obtaining intra prediction mode of chroma block | |
WO2020125595A1 (zh) | Video coder and corresponding method | |
CN116828192A (zh) | Image reconstruction method and apparatus | |
WO2020253681A1 (zh) | Method and apparatus for constructing merge candidate motion information list, and codec | |
WO2020224476A1 (zh) | Image partitioning method, apparatus, and device | |
WO2024051329A1 (zh) | Image encoding and decoding method, apparatus, encoder, decoder, and system | |
EP3905692A1 (en) | Video coder, video decoder, and corresponding method | |
WO2024051328A1 (zh) | Image encoding and decoding method, apparatus, encoder, decoder, and system | |
WO2024051331A1 (zh) | Image encoding and decoding method, apparatus, encoder, decoder, and system | |
US10250899B1 (en) | Storing and retrieving high bit depth image data | |
WO2024188124A1 (zh) | Image encoding and decoding method, apparatus, encoder, decoder, and system | |
TW202315399A (zh) | Video encoding and decoding method and apparatus | |
US11272199B2 (en) | Device and method for coding video data | |
RU2784414C1 (ru) | Signaling of output picture size for reference picture resampling | |
WO2024183387A1 (zh) | Image encoding and decoding method, apparatus, and system | |
RU2787713C2 (ru) | Chroma block prediction method and apparatus | |
US12126836B2 (en) | Picture prediction method, encoder, decoder, and storage medium | |
CN113615191B (zh) | Method and apparatus for determining picture display order, and video encoding/decoding device | |
RU2820991C1 (ru) | Encoder, decoder, and corresponding methods for reducing complexity of intra prediction for planar mode | |
WO2023173916A1 (zh) | Encoding and decoding method and apparatus | |
US20220191548A1 (en) | Picture prediction method, encoder, decoder and storage medium | |
WO2020140889A1 (zh) | Quantization and dequantization method and apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23862030 Country of ref document: EP Kind code of ref document: A1 |