WO2024061055A1 - Image encoding method, image decoding method, apparatus and storage medium - Google Patents

Image encoding method, image decoding method, apparatus and storage medium Download PDF

Info

Publication number
WO2024061055A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
mode
coding
substream
block
Prior art date
Application number
PCT/CN2023/118293
Other languages
English (en)
French (fr)
Inventor
王岩
潘冬萍
孙煜程
陈方栋
Original Assignee
Hangzhou Hikvision Digital Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co., Ltd.
Publication of WO2024061055A1

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/15Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/18Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Definitions

  • the present application relates to the field of video encoding and decoding, and in particular, to an image encoding method, an image decoding method, a device and a storage medium.
  • Substream parallelism refers to using multiple entropy encoders to encode syntax elements of different channels to obtain multiple substreams, filling the multiple substreams into their respective corresponding substream buffers, and interleaving the substreams in the substream buffers into bit streams (or code streams) according to preset interleaving rules.
  • this application provides an image encoding method, image decoding method, apparatus, device and storage medium, which can perform encoding through a variety of improved encoding and decoding modes, reasonably configure the space of the substream buffer, and reduce hardware cost.
  • this application provides an image coding method, which method includes:
  • the coding unit includes coding blocks of multiple channels
  • the preset codeword is encoded in the target substream that meets the preset condition until the target substream no longer meets the preset condition; the target substream is a substream among multiple substreams; the multiple substreams are code streams obtained by encoding the coding blocks of the multiple channels.
  • this application provides an image coding method, which method includes:
  • the coding unit includes coding blocks of multiple channels
  • the encoding block of each channel is encoded based on a preset expansion rate, so that the current expansion rate is less than or equal to the preset expansion rate, wherein the preset expansion rate includes a first preset expansion rate;
  • the value of the current expansion rate is the quotient of the number of bits of the largest substream and the number of bits of the smallest substream; the largest substream is the substream with the largest number of bits among the multiple substreams obtained by encoding the coding blocks of the multiple channels; the smallest substream is the substream with the smallest number of bits among those substreams.
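The current expansion rate defined above can be sketched in a few lines; this is an illustrative sketch, and the function name and the example bit counts are assumptions, not part of the application:

```python
def current_expansion_rate(substream_bits):
    """Quotient of the bit count of the largest substream and the bit count
    of the smallest substream, per the definition above (function name and
    example values are illustrative)."""
    return max(substream_bits) / min(substream_bits)

# e.g. three substreams of 1200, 800 and 600 bits give a rate of 2.0
rate = current_expansion_rate([1200, 800, 600])
```

The encoder would then compare this value against the first preset expansion rate when deciding whether further padding codewords are needed.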
  • this application provides an image coding method, which method includes:
  • the coding unit includes coding blocks of multiple channels
  • the BV of the reference prediction block is used to indicate the position of the reference prediction block in the coded image block; the reference prediction block is used to represent the prediction value of the coding block encoded according to the IBC mode;
  • the BV of the reference prediction block is encoded in at least one substream obtained by encoding the coding block of the at least one channel in the IBC mode.
  • this application provides an image coding method, which method includes:
  • the coding unit is an image block in the image to be processed; the coding unit includes coding blocks of multiple channels;
  • the first total code length is the total code length of the first code stream obtained after the coding blocks of the multiple channels are encoded according to their respective corresponding target encoding modes; the target encoding mode includes a first encoding mode; the first encoding mode is a mode for encoding sample values in a coding block according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed;
  • the coding blocks of the multiple channels are encoded according to the fallback mode; the fallback mode has the same mode flag as the first encoding mode.
  • this application provides an image decoding method, which method includes:
  • the coding unit includes coding blocks of multiple channels;
  • the number of codewords is used to indicate the number of preset codewords encoded in the target substream that satisfies the preset condition; the preset codeword is encoded into the target substream when there is a target substream that satisfies the preset condition; based on the number of codewords, the code stream is decoded.
  • this application provides an image decoding method, which method includes:
  • the coding unit includes coding blocks of multiple channels;
  • the number of codewords is determined according to the current expansion rate and the first preset expansion rate; the number of codewords is used to indicate the number of preset codewords encoded in the target substream that meets the preset conditions; the preset codewords are encoded into the target substream when there is a target substream that meets the preset conditions;
  • the code stream is decoded.
  • this application provides an image decoding method, which method includes:
  • the position of the reference prediction block is determined based on the block vector (BV) of the reference prediction block parsed from at least one substream among the plurality of substreams; the reference prediction block is used to represent the prediction value of the decoded block decoded according to the intra block copy (IBC) mode; the BV of the reference prediction block is used to indicate the position of the reference prediction block in the reconstructed image block;
  • the decoded block decoded according to the IBC mode is reconstructed.
  • this application provides an image decoding method, which method includes:
  • the coding unit includes coding blocks of multiple channels;
  • if the mode flag is parsed from the substream obtained by encoding the coding blocks of the multiple channels, and the second total code length is greater than the remaining size of the code stream buffer, the target decoding mode of the substream is determined to be the fallback mode;
  • the mode flag is used to indicate whether the coding blocks of the multiple channels are encoded in the fallback mode; the second total code length is the total code length of the code stream obtained after the coding blocks of the multiple channels are encoded according to the first encoding mode; the first encoding mode is a mode in which sample values in the coding block are encoded according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to encode and store each sample in the image to be processed;
  • the target fallback sub-mode is one of the fallback modes;
  • the preset flag bit is used to indicate the type of fallback mode used when encoding the coding blocks of the multiple channels;
  • the fallback mode includes a first fallback mode and a second fallback mode;
  • the substream is decoded according to the target fallback mode.
  • this application provides an image decoding method, which method includes:
  • the decoding unit is an image block in the image to be processed, and the decoding unit includes decoding blocks of multiple channels;
  • the first total code length is the total code length of a first code stream obtained after decoding blocks of the multiple channels are decoded according to their respective corresponding target decoding modes;
  • the decoding blocks of the multiple channels are decoded according to the fallback mode.
  • this application provides an image decoding method, which method includes:
  • the decoding unit includes decoding blocks of multiple channels;
  • the BV of the reference prediction block is decoded from at least one substream obtained by encoding the decoding block of at least one channel in the IBC mode.
  • the present application provides an image coding device, the device including:
  • An acquisition module is used to acquire a coding unit; the coding unit includes coding blocks of multiple channels;
  • a processing module configured to encode preset codewords in a target substream that satisfies a preset condition until the target substream does not meet the preset condition; the target substream is a substream among multiple substreams; the multiple substreams are code streams obtained by encoding the coding blocks of the multiple channels.
  • the present application provides an image coding device, the device including:
  • An acquisition module is used to acquire a coding unit; the coding unit includes coding blocks of multiple channels;
  • the processing module is configured to encode the coding block of each channel based on a preset expansion rate, so that a current expansion rate is less than or equal to the preset expansion rate.
  • the present application provides an image coding device, the device including:
  • An acquisition module is used to acquire a coding unit; the coding unit includes coding blocks of multiple channels;
  • a processing module configured to encode the coding block of at least one channel among the multiple channels according to the intra block copy (IBC) mode; obtain the block vector (BV) of the reference prediction block, where the BV of the reference prediction block is used to indicate the position of the reference prediction block in the coded image block, and the reference prediction block is used to represent the prediction value of the coding block encoded according to the IBC mode; and encode the BV of the reference prediction block in the at least one substream obtained by encoding the coding block of the at least one channel in the IBC mode.
  • the present application provides an image coding device, the device including:
  • An acquisition module is used to acquire a coding unit; the coding unit is an image block in the image to be processed; the coding unit includes coding blocks of multiple channels;
  • a processing module configured to determine the first total code length; the first total code length is the total code length of the first code stream obtained after the encoding blocks of the multiple channels are encoded according to their respective corresponding target encoding modes;
  • the target encoding mode includes a first encoding mode; the first encoding mode is a mode for encoding sample values in the coding block according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed; when the first total code length is greater than or equal to the remaining size of the code stream buffer, the coding blocks of the multiple channels are encoded according to the fallback mode; the fallback mode has the same mode flag as the first encoding mode.
  • the present application provides an image decoding device, the device comprising:
  • a processing module is used to parse a code stream after encoding a coding unit; the coding unit includes coding blocks of multiple channels; determine the number of code words; the number of code words is used to indicate the number of preset code words encoded in a target substream that meets a preset condition; the preset code word is encoded into the target substream when there is a target substream that meets the preset condition; based on the number of code words, decode the code stream.
  • the present application provides an image decoding device, the device including:
  • a processing module configured to parse the code stream after encoding the coding unit; the coding unit includes coding blocks of multiple channels; determine the current expansion rate; determine the number of codewords according to the current expansion rate and the first preset expansion rate; the number of codewords is used to indicate the number of preset codewords encoded in the target substream that satisfies the preset conditions; the preset codewords are encoded into the target substream when there is a target substream that satisfies the preset conditions; and decode the code stream based on the number of codewords.
  • the present application provides an image decoding device, the device including:
  • a processing module configured to parse the code stream after encoding the coding unit; the coding unit includes coding blocks of multiple channels; the code stream includes multiple substreams in one-to-one correspondence with the coding blocks of the multiple channels; and determine the position of the reference prediction block based on the block vector (BV) of the reference prediction block parsed from at least one substream among the multiple substreams;
  • the reference prediction block is used to characterize the prediction value of the decoded block decoded according to the intra block copy (IBC) mode; the BV of the reference prediction block is used to indicate the position of the reference prediction block in the reconstructed image block; the prediction value of the decoded block decoded according to the IBC mode is determined based on the position information of the reference prediction block; and the decoded block decoded according to the IBC mode is reconstructed based on the prediction value.
  • the present application provides an image decoding device, the device including:
  • a processing module configured to parse the code stream after encoding the coding unit; the coding unit includes coding blocks of multiple channels; if the mode flag is parsed from the substream obtained by encoding the coding blocks of the multiple channels, and the second total code length is greater than the remaining size of the code stream buffer, the target decoding mode of the substream is determined to be the fallback mode; the mode flag is used to indicate whether the coding blocks of the multiple channels are encoded in the fallback mode; the second total code length is the total code length of the code stream obtained after the coding blocks of the multiple channels are encoded in the first encoding mode; the first encoding mode is a mode in which sample values in the coding block are encoded according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to represent the number of bits required to encode and store each sample in the image to be processed
  • the present application provides a video encoder, which includes a processor and a memory; the memory stores instructions executable by the processor; when the processor executes the instructions, the video encoder implements the image encoding methods in the above first to fourth aspects.
  • the present application provides a video decoder, which includes a processor and a memory; the memory stores instructions executable by the processor; when the processor executes the instructions, the video decoder implements the image decoding methods in the above fifth to eleventh aspects.
  • the present application provides a computer program product that, when run on an image encoding device, causes the image encoding device to execute the image encoding methods in the above first to fourth aspects.
  • the present application provides a readable storage medium.
  • the readable storage medium includes software instructions; when the software instructions are run in an image encoding device, the image encoding device implements the image encoding methods in the above first to fourth aspects; when the software instructions are run in an image decoding device, the image decoding device implements the image decoding methods in the above fifth to eleventh aspects.
  • the present application provides a chip.
  • the chip includes a processor and an interface.
  • the processor is coupled to the memory through the interface.
  • when the processor executes the computer program or instructions in the memory, the method described in any one of the above first to fourth aspects is executed.
  • Figure 1 is a schematic diagram of the framework of substream parallel technology
  • Figure 2 is a schematic diagram of the substream interleaving process at the encoding end
  • Figure 3 is a schematic diagram of the format of a substream interleaving unit
  • Figure 4 is a schematic diagram of the reverse sub-stream interleaving process at the decoding end
  • Figure 5 is another schematic diagram of the substream interleaving process at the encoding end
  • FIG6 is a schematic diagram of the composition of a video encoding and decoding system provided in an embodiment of the present application.
  • Figure 7 is a schematic diagram of the composition of a video encoder provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a video decoder provided by an embodiment of the present application.
  • Figure 9 is a schematic flow chart of video encoding and decoding provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of the composition of an image encoding device and an image decoding device provided by an embodiment of the present application;
  • Figure 11 is a schematic flowchart of an image encoding method provided by an embodiment of the present application.
  • Figure 12 is a schematic flow chart of an image decoding method provided by an embodiment of the present application.
  • Figure 13 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 14 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 15 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 16 is a schematic flowchart of another image encoding method provided by an embodiment of the present application.
  • Figure 17 is a schematic flow chart of yet another image encoding method provided by an embodiment of the present application.
  • Figure 18 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 19 is a schematic flow chart of another image encoding method provided by an embodiment of the present application.
  • Figure 20 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 21 is a schematic flow chart of another image encoding method provided by an embodiment of the present application.
  • Figure 22 is a schematic flow chart of yet another image encoding method provided by an embodiment of the present application.
  • Figure 23 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 24 is a schematic flow chart of yet another image encoding method provided by an embodiment of the present application.
  • Figure 25 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 26 is a schematic flow chart of another image encoding method provided by an embodiment of the present application.
  • Figure 27 is a schematic flow chart of another image decoding method provided by an embodiment of the present application.
  • Figure 28 is a schematic diagram of the composition of an image encoding device provided by an embodiment of the present application.
  • Substream parallelism is also called substream interleaving.
  • Substream parallelism means that the encoding end uses multiple entropy encoders to encode the syntax elements of the coding blocks (CBs) of different channels of a coding unit (CU), such as the luma channel, the first chroma channel, and the second chroma channel, to obtain multiple substreams, and interleaves the multiple substreams into a bit stream in fixed-size data packets.
  • substream parallelism means that the decoder uses different entropy decoders to decode different substreams in parallel.
  • Figure 1 is a schematic framework diagram of substream parallel technology. As shown in Figure 1, taking the encoding end as an example, the substream parallel technology is applied after the syntax elements (such as transform coefficients and quantization coefficients) are encoded.
  • For the encoding process of the other parts in Figure 1, refer to the video encoding and decoding system provided in the embodiments of the present application below, which will not be described again here.
  • Figure 2 is a schematic diagram of the substream interleaving process at the encoding end.
  • The encoding module includes, for example, a prediction module, a transform module, and a quantization module.
  • the encoding module can output the syntax elements and quantized transformation coefficients of the three channels.
  • Entropy encoder 1, entropy encoder 2, and entropy encoder 3 respectively encode the syntax elements and quantized transform coefficients of the three channels to obtain the substream corresponding to each channel, and push the substreams corresponding to the three channels into encoding substream buffer 1, encoding substream buffer 2, and encoding substream buffer 3.
  • The substream interleaving module can interleave the substreams in encoding substream buffer 1, encoding substream buffer 2, and encoding substream buffer 3, and finally outputs a bit stream (also called a code stream) obtained by interleaving the multiple substreams.
  • FIG. 3 is a schematic diagram of the format of a substream interleaving unit.
  • A substream can be composed of substream interleaving units, which can also be called substream slices.
  • The length of a substream slice is N bits; the substream slice includes an M-bit data header and an (N-M)-bit data body.
  • the data header is used to indicate the sub-stream to which the current sub-stream slice belongs.
  • For example, N can be 512 and M can be 2.
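The slice format described above (an M-bit header identifying the substream, followed by an (N-M)-bit data body) can be sketched as follows; this is an illustrative sketch in which strings of '0'/'1' characters stand in for real bit buffers, and the function names are assumptions:

```python
N = 512  # total slice length in bits (example value from the text)
M = 2    # data header length in bits

def build_slice(ss_idx, body_bits):
    """Concatenate an M-bit header carrying the substream index with an
    (N-M)-bit data body into one N-bit substream slice."""
    assert 0 <= ss_idx < 2 ** M and len(body_bits) == N - M
    return format(ss_idx, f'0{M}b') + body_bits

def parse_slice(slice_bits):
    """Split an N-bit slice back into (substream index, data body)."""
    assert len(slice_bits) == N
    return int(slice_bits[:M], 2), slice_bits[M:]
```

With M = 2, up to four substreams can be addressed by the header, which matches the three-channel example plus headroom.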
  • Figure 4 is a schematic diagram of the reverse sub-stream interleaving process at the decoding end.
  • The substream interleaving module at the decoding end can first perform a reverse substream interleaving process to decompose the bit stream into the substreams corresponding to the three channels, and push the substreams corresponding to the three channels into decoding substream buffer 1, decoding substream buffer 2, and decoding substream buffer 3.
  • The decoder can extract an N-bit data packet from the bit stream at a time. By parsing the M-bit data header, the decoder obtains the target substream to which the current substream slice belongs, and puts the remaining data body of the current substream slice into the decoding substream buffer corresponding to the target substream.
  • Entropy decoder 1 can decode the substream in decoding substream buffer 1 to obtain the syntax elements and quantized transform coefficients of one channel; entropy decoder 2 can decode the substream in decoding substream buffer 2 to obtain the syntax elements and quantized transform coefficients of another channel; entropy decoder 3 can decode the substream in decoding substream buffer 3 to obtain the syntax elements and quantized transform coefficients of the remaining channel. Finally, the syntax elements and quantized transform coefficients of the three channels are input to the subsequent decoding modules for decoding processing to obtain a decoded image.
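The reverse-interleaving step described above — read an N-bit packet, parse the M-bit header, and route the body to the decoding substream buffer it names — can be sketched as follows (an illustrative sketch; bit strings stand in for real buffers and the function name is an assumption):

```python
def deinterleave(bitstream, num_substreams, n=512, m=2):
    """Reverse substream interleaving (sketch): walk the interleaved bit
    stream in N-bit packets, parse each M-bit data header, and append the
    (N-M)-bit data body to the decoding substream buffer it names."""
    buffers = [''] * num_substreams       # decoding substream buffers, as bit strings
    for offset in range(0, len(bitstream), n):
        packet = bitstream[offset:offset + n]
        ss_idx = int(packet[:m], 2)       # M-bit data header -> target substream
        buffers[ss_idx] += packet[m:]     # keep only the data body
    return buffers
```

Each returned buffer would then be handed to its own entropy decoder, which is what allows the three channels to be decoded in parallel.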
  • the following uses the encoding end as an example to introduce the process of substream interleaving.
  • Each substream slice in the coding substream buffer includes coding bits generated by encoding at least one image block.
  • the image blocks corresponding to the initial bits of the data body of each substream slice can be sequentially marked, and in the substream interleaving process, different substreams are interleaved according to the sequential markings.
  • the sub-stream interleaving process may mark image blocks through a block count queue.
  • the block counting queue is implemented as a first-in-first-out queue.
  • The encoding end maintains a block count for the currently encoded image blocks, and sets a block count queue counter_queue[ss_idx] for each substream.
  • Initialize the block count to 0, initialize each counter_queue[ss_idx] to be empty, and then push a 0 into each counter_queue[ss_idx].
  • the block count queue is updated.
  • the update process is as follows:
  • Step 2 Select a substream ss_idx.
  • Step 3 Calculate the number of substream slices num_in_buffer[ss_idx] that can be constructed in the encoding substream buffer corresponding to the substream ss_idx. Let the encoding substream buffer corresponding to the substream ss_idx be buffer[ss_idx], and let the amount of data included in buffer[ss_idx] be buffer[ss_idx].fullness.
  • Step 4 Compare the current block count queue length num_in_queue[ss_idx] with the number of substream slices num_in_buffer[ss_idx] that can be constructed in the encoding substream buffer. If the two are equal, push the count of the currently encoded image block into the block count queue, that is, counter_queue[ss_idx].push(block_count).
  • Step 5 Return to step 2 and process the next sub-flow until all sub-flows are processed.
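The update pass above can be sketched as follows. This is an illustrative sketch: bit strings stand in for the encoding substream buffers, and the formula num_in_buffer = fullness // (N-M) is an assumption, since the text does not give the exact computation:

```python
from collections import deque

def update_block_count_queues(block_count, counter_queue, buffers, n=512, m=2):
    """One pass of the block count queue update (steps 2-5 above): for each
    substream, if the queue length equals the number of slices that can be
    constructed from its buffer, push the current block count."""
    for ss_idx, queue in enumerate(counter_queue):
        fullness = len(buffers[ss_idx])        # buffer[ss_idx].fullness, in bits
        num_in_buffer = fullness // (n - m)    # constructible slices (assumed formula)
        num_in_queue = len(queue)
        if num_in_queue == num_in_buffer:      # step 4: push the block count
            queue.append(block_count)          # counter_queue[ss_idx].push(block_count)
```

A first-in-first-out deque matches the text's requirement that the block count queue be a FIFO queue.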
  • the encoding end can interleave the substreams in each encoding substream buffer.
  • the interleaving process is as follows:
  • Step 1 Select a subflow ss_idx.
  • Step 2 Determine whether the amount of data buffer[ss_idx].fullness included in the encoded substream buffer buffer[ss_idx] corresponding to the substream ss_idx is greater than or equal to N-M. If yes, go to step 3; if not, go to step 6.
  • Step 3 Determine whether the value of the head element in the block count queue of the subflow ss_idx is the minimum value in the block count queues of all subflows. If yes, go to step 4; if not, go to step 6.
  • Step 4 Construct a substream slice with the data in the current encoding substream buffer. For example, extract N-M bits of data from the encoding substream buffer buffer[ss_idx], add an M-bit data header, the data in the header is ss_idx, concatenate the M-bit data header and the extracted N-M bits of data into an N-bit substream slice, and send the substream slice to the bit stream finally output by the encoding end.
  • Step 5 Pop (or delete) the head element of the block count queue of the subflow ss_idx, that is, counter_queue[ss_idx].pop().
  • Step 6 Return to step 1 and process the next sub-flow until all sub-flows are processed.
  • the encoding end may also perform the following steps after the above interleaving process to encode the remaining data in the substream buffer.
  • Step 1 Determine whether there is at least one non-empty buffer in all currently encoded substream buffers. If yes, go to step 2; if not, end.
  • Step 2 Select a subflow ss_idx.
  • Step 3 Determine whether the value of the head element of the block count queue of the subflow ss_idx is the minimum value in the block count queues of all subflows. If yes, go to step 4; if not, go to step 6.
  • Step 4 If the amount of data in the encoding substream buffer buffer[ss_idx] corresponding to the substream ss_idx is less than N-M bits, fill zeros into the encoding substream buffer buffer[ss_idx] until its data reaches N-M bits. At the same time, pop (or delete) the head element of the block count queue of the substream, that is, counter_queue[ss_idx].pop(), and push MAX_INT, which represents the maximum value of the data range, that is, counter_queue[ss_idx].push(MAX_INT).
  • Step 5 can refer to step 4 in the above-mentioned substream interleaving process, and will not be described again.
  • Step 6 Return to step 2 to process the next sub-flow. If all sub-flows have been processed, return to step 1.
  • FIG. 5 is another schematic diagram of the coding terminal stream interleaving process.
  • the encoding substream buffers may include encoding substream buffer 1, encoding substream buffer 2, and encoding substream buffer 3.
  • the sub-stream slices in the encoding sub-stream buffer 1 are 1_1, 1_2, 1_3, and 1_4 from front to back.
  • the marks in the block count queue 1 of the substream 1 corresponding to the encoding substream buffer 1 are 12, 13, 27, and 28 in sequence.
  • the sub-stream slices in the encoding sub-stream buffer 2 are 2_1 and 2_2 from front to back.
  • the marks in the block count queue 2 of the substream 2 corresponding to the encoding substream buffer 2 are 5 and 71 in sequence.
  • the sub-stream slices in the encoding sub-stream buffer 3 are 3_1, 3_2, and 3_3 from front to back.
  • the marks in the block count queue 3 of the substream 3 corresponding to the encoding substream buffer 3 are 6, 13, and 25 in sequence.
  • the substream interleaving module can interleave the substream slices in encoding substream buffer 1, encoding substream buffer 2, and encoding substream buffer 3 according to the order of the marks in block count queue 1, block count queue 2, and block count queue 3.
  • sub-stream slice 2_1, corresponding to the minimum mark 5;
  • sub-stream slice 3_1, corresponding to the mark 6;
  • sub-stream slice 1_1, corresponding to the mark 12;
  • sub-stream slice 1_2, corresponding to the mark 13;
  • sub-stream slice 3_2, also corresponding to the mark 13;
  • sub-stream slice 3_3, corresponding to the mark 25;
  • sub-stream slice 1_3, corresponding to the mark 27;
  • sub-stream slice 1_4, corresponding to the mark 28;
  • sub-stream slice 2_2, corresponding to the mark 71.
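The output order of this example can be checked with a simple sort by (mark, substream index): slices are emitted in ascending order of their marks, and the tie at mark 13 is broken in favor of the lower substream index, matching the scan order of the pass described above.

```python
# Reproduces the interleaved order of the Fig. 5 example.
queues = {1: [(12, '1_1'), (13, '1_2'), (27, '1_3'), (28, '1_4')],
          2: [(5, '2_1'), (71, '2_2')],
          3: [(6, '3_1'), (13, '3_2'), (25, '3_3')]}

# collect (mark, substream, slice) triples and sort; ties fall back to ss_idx
entries = [(mark, ss, name)
           for ss, slices in queues.items()
           for mark, name in slices]
order = [name for _, _, name in sorted(entries)]
# order == ['2_1', '3_1', '1_1', '1_2', '3_2', '3_3', '1_3', '1_4', '2_2']
```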
  • embodiments of the present application provide an image encoding method, an image decoding method, a device, and a storage medium, which can perform encoding through a variety of improved encoding modes, thereby controlling the number of bits of the largest coding block and the smallest coding block.
  • this can reduce the theoretical expansion rate of the coding block to be encoded and reduce the difference in filling speed between different substream buffers, thereby reducing the preset space size of the substream buffers and reducing hardware costs.
  • FIG6 is a schematic diagram of the composition of a video coding and decoding system provided in an embodiment of the present application.
  • the video coding and decoding system includes a source device 10 and a destination device 11 .
  • the source device 10 generates encoded video data.
  • the source device 10 may also be called an encoding end, a video encoding end, a video encoding device, or a video encoding device.
  • the destination device 11 may decode the encoded video data generated by the source device 10.
  • the destination device 11 may also be called a decoding end, a video decoding end, a video decoding device, or a video decoding apparatus.
  • Source device 10 and/or destination device 11 may include at least one processor and memory coupled to the at least one processor.
  • the memory may include, but is not limited to, read-only memory (ROM), random access memory (RAM), electrically erasable programmable read-only memory (EEPROM), flash memory, or any other medium that can store the required program code in the form of instructions or data structures and that can be accessed by a computer, which is not specifically limited in the embodiments of the present application.
  • Source device 10 and destination device 11 may include a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and other electronic devices.
  • Link 12 may include one or more media and/or devices capable of moving encoded video data from source device 10 to destination device 11 .
  • link 12 may include one or more communication media that enables source device 10 to transmit encoded video data directly to destination device 11 in real time.
  • the source device 10 may modulate the encoded video data according to a communication standard (eg, a wireless communication protocol), and may transmit the modulated video data to the destination device 11 .
  • the above one or more communication media may include wireless and/or wired communication media, such as: radio frequency (Radio Frequency, RF) spectrum, one or more physical transmission lines.
  • the above one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media mentioned above may include routers, switches, base stations, or other devices that implement communication from the source device 10 to the destination device 11 .
  • source device 10 may output the encoded video data from output interface 103 to storage device 13 .
  • the destination device 11 can access the encoded video data from the storage device 13 through the input interface 113 .
  • the storage device 13 may include a variety of locally accessed data storage media, such as Blu-ray Disc, high-density digital video disc (DVD), compact disc read-only memory (CD-ROM), flash memory, or other suitable digital storage media for storing encoded video data.
  • storage device 13 may correspond to a file server or another intermediate storage device that stores encoded video data generated by source device 10 .
  • destination device 11 may obtain its stored video data from storage device 13 via streaming or downloading.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to destination device 11 .
  • a file server may include a World Wide Web (Web) server (e.g., for a website), a File Transfer Protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive.
  • Destination device 11 may access the encoded video data through any standard data connection (eg, an Internet connection).
  • Example types of data connections include wireless channels, wired connections (eg, cable modems, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
  • the encoded video data can be transmitted from the file server through streaming, downloading, or a combination of both.
  • image encoding method and image decoding method provided by the embodiments of the present application are not limited to wireless application scenarios.
  • the image encoding method and image decoding method provided by the embodiments of the present application can be applied to video encoding and decoding supporting the following multimedia applications: over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • the video encoding and decoding system can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting and/or video telephony.
  • the video coding and decoding system shown in FIG. 6 is only an example of the video coding and decoding system, and is not a limitation of the video coding and decoding system in this application.
  • the image encoding method and image decoding method provided by this application can also be applied to scenarios where there is no data communication between the encoding device and the decoding device.
  • the video data to be encoded or the encoded video data may be retrieved from local storage, may also be streamed on a network, etc.
  • the video encoding device may encode the video data to be encoded and store the encoded video data in a memory, and the video decoding device may also obtain the encoded video data from the memory and decode the encoded video data.
  • source device 10 includes video source 101 , video encoder 102 and output interface 103 .
  • output interface 103 may include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 101 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer graphics system for generating the video data, or a combination of these sources of video data.
  • Video encoder 102 may encode video data from video source 101 .
  • source device 10 transmits the encoded video data directly to destination device 11 via output interface 103 .
  • the encoded video data may also be stored on the storage device 13 for later access by the destination device 11 for decoding and/or playback.
  • the destination device 11 includes a display device 111 , a video decoder 112 and an input interface 113 .
  • input interface 113 includes a receiver and/or modem.
  • Input interface 113 may receive encoded video data via link 12 and/or from storage device 13 .
  • Display device 111 may be integrated with destination device 11 or may be external to destination device 11 . Generally, the display device 111 displays the decoded video data.
  • the display device 111 may include a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or other types of display devices.
  • video encoder 102 and video decoder 112 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or in separate data streams.
  • the video encoder 102 and the video decoder 112 may include at least one microprocessor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), discrete logic, hardware, or any combination thereof. If the encoding method provided by the present application is implemented in software, the instructions for the software can be stored in a suitable non-volatile computer-readable storage medium, and at least one processor can be used to execute the instructions to implement the present application.
  • the video encoder 102 and the video decoder 112 in this application may operate according to a video compression standard (such as HEVC) or other industry standards, which is not specifically limited in this application.
  • FIG. 7 is a schematic diagram of the composition of the video encoder 102 provided by the embodiment of the present application.
  • the video encoder 102 includes the prediction module 21, the transform module 22, the quantization module 23, the entropy coding module 24, the encoding substream buffer 25, and the substream interleaving module 26, which respectively perform prediction, transform, quantization, entropy coding, buffering, and substream interleaving.
  • the prediction module 21, the transformation module 22, and the quantization module 23 are also the encoding modules in the above-mentioned Figure 1.
  • the video encoder 102 also includes a preprocessing module 20 and a summer 202, where the preprocessing module 20 includes a segmentation module and a code rate control module.
  • the video encoder 102 also includes an inverse quantization module 27, an inverse transform module 28, a summer 201 and a reference image memory 29.
  • the video encoder 102 receives video data, and the preprocessing module 20 obtains the input parameters of the video data.
  • the input parameters include the resolution of the image in the video data, the sampling format of the image, pixel depth (bits per pixel, BPP), bit width (or can also be called image bit width) and other information.
  • BPP refers to the number of bits occupied by a unit pixel.
  • the segmentation module in the preprocessing module 20 segments the image into original blocks (or may also be called coding units (CUs)).
  • the original block (or may also be referred to as a coding unit (CU)) may include coding blocks for multiple channels.
  • the multiple channels may be RGB channels or YUV channels.
  • this partitioning may also include partitioning into slices, image blocks or other larger units, and video block partitioning according to the largest coding unit (LCU) and the quadtree structure of the CU.
  • video encoder 102 encodes components of video blocks within a video slice to be encoded.
  • a slice may be divided into a plurality of original blocks (and possibly into collections of original blocks called image blocks).
  • the size of CU, PU and TU is usually determined in the partitioning module.
  • the segmentation module is used to determine the size of the rate control unit.
  • the code rate control unit refers to the basic processing unit in the code rate control module. For example, the code rate control module calculates complexity information for the original block based on the code rate control unit, and then calculates the quantization parameters of the original block based on the complexity information.
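As an illustration of where the rate control unit fits, the sketch below uses a mean-absolute-gradient complexity measure and a clamped linear QP mapping; both the metric and the mapping are assumptions introduced here for illustration, since the text does not define them.

```python
# Hypothetical complexity metric and QP mapping for one rate control unit.
def block_complexity(block):
    """Sum of absolute horizontal differences over a 2-D pixel block."""
    return sum(abs(row[i + 1] - row[i])
               for row in block
               for i in range(len(row) - 1))

def quantization_parameter(block, qp_min=0, qp_max=51):
    """Map complexity to a QP, clamped to an assumed [0, 51] range."""
    return max(qp_min, min(qp_max, block_complexity(block) // 8))
```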
  • the segmentation strategy of the segmentation module can be preset, or it can be continuously adjusted based on the image during the encoding process.
  • if the segmentation strategy is a preset strategy, the same segmentation strategy is also preset in the decoder, thereby obtaining the same image processing unit.
  • the image processing unit is any one of the above image blocks, and corresponds one-to-one with the encoding side.
  • the segmentation strategy can be directly or indirectly encoded into the code stream.
  • the decoder obtains the corresponding parameters from the code stream, obtains the same segmentation strategy, and obtains the same image processing unit.
  • the code rate control module in the preprocessing module 20 is used to generate quantization parameters so that the quantization module 23 and the inverse quantization module 27 perform correlation calculations.
  • the code rate control module can obtain the image information of the original block for calculation, such as the above-mentioned input information; it can also obtain the reconstructed value produced by the summer 201 for calculation, which is not limited in this application.
  • the prediction module 21 may provide the prediction block to the summer 202 to generate a residual block, and provide the prediction block to the summer 201 for reconstruction to obtain a reconstructed block, which is used as a reference pixel for subsequent prediction.
  • the video encoder 102 forms a pixel difference by subtracting the pixel value of the prediction block from the pixel value of the original block.
  • the pixel difference is the residual block, and the data in the residual block may include brightness difference and chrominance difference.
  • the summer 201 represents one or more components that perform this subtraction operation.
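The subtraction performed by the summer can be sketched per pixel; the block shapes are illustrative.

```python
# Residual block = original block minus prediction block, element-wise,
# as computed by the summer described above.
def residual_block(original, prediction):
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, prediction)]
```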
  • the prediction module 21 may also send related syntax elements to the entropy coding module 24 for merging into the bitstream.
  • Transform module 22 may divide the residual block into one or more TUs for transformation. Transform module 22 may transform the residual block from the pixel domain to the transform domain (e.g., frequency domain). For example, the residual block is transformed using discrete cosine transform (DCT) or discrete sine transform (DST) to obtain the transform coefficients. Transform module 22 may send the resulting transform coefficients to quantization module 23.
  • the quantization module 23 may perform quantization based on quantization units.
  • the quantization unit may be the same as the above-mentioned CU, TU, and PU, or may be further divided in the segmentation module.
  • the quantization module 23 quantizes the transform coefficients to further reduce the code rate to obtain quantized coefficients.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients.
  • the degree of quantization can be modified by adjusting the quantization parameters.
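A uniform scalar quantizer illustrates how the quantization parameter controls the degree of quantization; deriving the step size as 2^QP is an assumption for illustration, not a scheme mandated by the text.

```python
# Uniform quantization: a larger QP gives a larger step, discarding more
# precision (the bit-depth reduction described above).
def quantize(coeffs, qp):
    step = 1 << qp                       # assumed: step doubles per QP increment
    return [c // step if c >= 0 else -((-c) // step) for c in coeffs]

def dequantize(levels, qp):
    step = 1 << qp
    return [level * step for level in levels]
```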
  • quantization module 23 may then perform a scan of the matrix containing the quantized transform coefficients.
  • alternatively, entropy encoding module 24 may perform the scan.
  • entropy encoding module 24 may entropy encode the quantized coefficients.
  • the entropy coding module 24 may perform context-adaptive variable-length coding (CAVLC), context-based adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique.
  • after entropy encoding is performed by the entropy encoding module 24, a substream is obtained. After entropy coding by multiple entropy coding modules 24, multiple substreams can be obtained.
  • the multiple substreams are substream interleaved through the coding substream buffer 25 and the substream interleaving module 26 to obtain the code stream.
  • the codestream is transmitted to video decoder 112 or archived for later transmission or retrieval by video decoder 112 .
  • the process of sub-stream interleaving may refer to the above-mentioned FIG. 1 to FIG. 5 , and will not be described in detail here.
  • the inverse quantization module 27 and the inverse transform module 28 respectively apply inverse quantization and inverse transform.
  • the summer 201 adds the inverse-transformed residual block to the prediction block to generate a reconstructed block, which is used as a reference pixel for the prediction of subsequent original blocks.
  • the reconstructed block is stored in the reference image memory 29.
  • FIG. 8 is a schematic structural diagram of the video decoder 112 provided by the embodiment of the present application.
  • the video decoder 112 includes a substream interleaving module 30, a decoded substream buffer 31, an entropy decoding module 32, a prediction module 33, an inverse quantization module 34, an inverse transform module 35, a summer 301, and a reference image memory 36.
  • the entropy decoding module 32 includes a parsing module and a code rate control module.
  • video decoder 112 may perform a decoding process that is generally reciprocal to the encoding process described with respect to video encoder 102 in FIG. 7.
  • video decoder 112 receives a codestream of encoded video from video encoder 102 .
  • the substream interleaving module 30 performs reverse substream interleaving on the code stream to obtain multiple substreams, and the multiple substreams pass through their corresponding decoding substream buffers and flow into their corresponding entropy decoding modules 32 .
  • the parsing module in the entropy decoding module 32 of the video decoder 112 entropy decodes the substream to generate quantization coefficients and syntax elements.
  • Entropy decoding module 32 passes the syntax elements to prediction module 33.
  • Video decoder 112 may receive syntax elements at the video slice level and/or the video block level.
  • the code rate control module in the entropy decoding module 32 generates quantization parameters based on the information of the image to be decoded obtained by the analysis module, so that the inverse quantization module 34 performs related calculations.
  • the code rate control module can also calculate the quantization parameter based on the reconstructed block reconstructed by the summer 301.
  • Inverse quantization module 34 inversely quantizes (i.e., dequantizes) the quantization coefficients, decoded from the substream by entropy decoding module 32, using the generated quantization parameters.
  • the inverse quantization process may include determining the degree of quantization using quantization parameters calculated by video encoder 102 for each video block in the video slice, and likewise determining the degree of inverse quantization applied.
  • the inverse transform module 35 applies inverse transform (for example, DCT, DST and other transform methods) to the inversely quantized transform coefficients, and generates inversely transformed residual blocks in the pixel domain according to the inverse transform units based on the inversely quantized transform coefficients.
  • the size of the inverse transformation unit is the same as the size of the TU, and the inverse transformation method and the transformation method adopt the corresponding forward transformation and inverse transformation in the same transformation method.
  • the inverse transform corresponding to DCT or DST is the inverse DCT, the inverse DST, or a conceptually similar inverse transform process.
  • video decoder 112 forms a decoded video block by summing the inverse-transformed residual block from inverse-transform module 35 with the prediction block.
  • Summer 301 represents one or more components that perform this summation operation.
  • a deblocking filter may also be applied to filter the decoded blocks in order to remove blocking artifacts. Decoded image blocks in a given frame or image are stored in reference image memory 36 as reference pixels for subsequent predictions.
  • FIG. 9 is a schematic flow chart of a video coding and decoding provided by an embodiment of the present application.
  • the video coding and decoding implementation includes processes 1 to 5, which may be executed by any one or more of the above-mentioned source device 10, video encoder 102, destination device 11, or video decoder 112.
  • Process 1 Divide a frame of image into one or more parallel coding units that do not overlap with each other. There is no dependency between the one or more parallel coding units, and they can be completely parallel/independently encoded and decoded, such as parallel coding unit 1 and parallel coding unit 2.
  • each parallel coding unit can be divided into one or more independent coding units that do not overlap with each other.
  • the independent coding units may not depend on each other, but they can share some parallel coding unit header information.
  • the independent coding unit may include three channels of brightness Y, first chroma Cb, and second chroma Cr, or three channels of RGB, or may include only one of the channels. If the independent coding unit contains three channels, the sizes of the three channels can be exactly the same or different, depending on the input format of the image.
  • the independent coding unit can also be understood as one or more processing units formed by N channels included in each parallel coding unit.
  • the above three channels of Y, Cb, and Cr are the three channels that constitute the parallel coding unit, and each of them can be an independent coding unit; or Cb and Cr can be collectively called the chroma channels, in which case the parallel coding unit includes an independent coding unit composed of the luma channel and an independent coding unit composed of the chroma channels.
  • each independent coding unit can be divided into one or more non-overlapping coding units.
  • Each coding unit within the independent coding unit can be interdependent.
  • multiple coding units can reference one another during encoding and decoding.
  • if the coding unit and the independent coding unit have the same size (that is, the independent coding unit is divided into only one coding unit), its size can be any of the sizes described in process 2.
  • the encoding unit may include three channels of brightness Y, first chroma Cb, and second chroma Cr (or three RGB channels), or may include only one of the channels. If it contains three channels, the sizes of several channels can be exactly the same or different, depending on the image input format.
  • process 3 is an optional step in the video encoding and decoding method.
  • the video encoder/decoder can encode/decode the residual coefficients (or residual values) of the independent coding units obtained in process 2.
  • Process 4 Divide the coding unit into one or more non-overlapping prediction groups (PG).
  • PG can also be referred to as Group for short.
  • Each PG is encoded and decoded according to the selected prediction mode.
  • the predicted value of the PG is obtained to form the predicted value of the entire coding unit.
  • the residual value of the coding unit is obtained.
  • Process 5 Based on the residual values of the coding unit, group the coding unit to obtain one or more non-overlapping residual blocks (RB).
  • the residual coefficients of each RB are encoded and decoded according to the selected mode, forming a residual coefficient stream. Specifically, the modes can be divided into two categories: transforming the residual coefficients and not transforming them.
  • the selected mode of the residual coefficient encoding and decoding method in process 5 may include, but is not limited to any of the following: semi-fixed length encoding method, exponential Golomb encoding method, Golomb-Rice encoding method, truncated unary code Encoding method, run length encoding method, direct encoding of original residual value, etc.
  • the video encoder may directly encode the coefficients within the RB.
  • the video encoder can also transform the residual block, such as DCT, DST, Hadamard transform, etc., and then encode the transformed coefficients.
  • the video encoder can directly quantize each coefficient in the RB uniformly, and then perform binary encoding. If the RB is large, it can be further divided into multiple coefficient groups (CG), and then each CG is uniformly quantized and then binary encoded. In some embodiments of the present application, the coefficient group (CG) and the quantization group (QG) may be the same.
  • the maximum value of the absolute value of the residual within an RB block is defined as the modified maximum (mm).
  • the number of coded bits of the residual coefficient in the RB block is determined (the number of coded bits of the residual coefficient in the same RB block is consistent). For example, if the critical limit (CL) of the current RB block is 2 and the current residual coefficient is 1, then 2 bits are needed to encode the residual coefficient 1, which is expressed as 01. If the CL of the current RB block is 7, it means encoding an 8-bit residual coefficient and a 1-bit sign bit.
  • the determination of CL is to find the minimum M value such that all residuals of the current sub-block are within the range [-2^(M-1), 2^(M-1)]. If both boundary values -2^(M-1) and 2^(M-1) are present at the same time, M should be increased by 1, that is, M+1 bits are needed to encode all the residuals of the current RB block; if only one of the two boundary values -2^(M-1) and 2^(M-1) is present, a trailing bit needs to be encoded to determine whether the boundary value is -2^(M-1) or 2^(M-1); if neither -2^(M-1) nor 2^(M-1) is present among the residuals, the trailing bit does not need to be encoded.
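The CL rule can be sketched directly; the return convention (CL plus whether a trailing bit is needed) is an illustrative packaging of the three cases, not a signature from the text.

```python
# Determine CL for an RB block: smallest M with every residual inside
# [-2^(M-1), 2^(M-1)]; widen by 1 if both boundary values occur; signal a
# single occurring boundary value with a trailing bit.
def critical_limit(residuals):
    m = 1
    while not all(-(1 << (m - 1)) <= r <= (1 << (m - 1)) for r in residuals):
        m += 1
    lo, hi = -(1 << (m - 1)), 1 << (m - 1)
    has_lo, has_hi = lo in residuals, hi in residuals
    if has_lo and has_hi:
        return m + 1, False          # both boundaries: M+1 bits, no trailing bit
    return m, (has_lo or has_hi)     # one boundary: trailing bit needed
```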
  • the video encoder can also directly encode the original value of the image instead of the residual value.
  • FIG. 10 is a schematic diagram of the composition of a codec device provided by an embodiment of the present application.
  • the codec device may be part of the device in the above-mentioned video encoder 102 , or may be part of the device in the above-mentioned video decoder 112 .
  • the encoding and decoding device can be applied to the encoding side (or encoding end) or the decoding side (or decoding end).
  • the encoding and decoding device includes a processor 41 and a memory 42 .
  • the processor 41 and the memory 42 are connected to each other (for example, via a bus 43).
  • the codec device may also include a communication interface 44, which is connected to the processor 41 and the memory 42 for receiving/transmitting data.
  • the processor 41 is configured to execute instructions stored in the memory 42 to implement the image encoding method and image decoding method provided in the following embodiments of the application.
  • the processor 41 may be a central processing unit (CPU), a general-purpose processor, a network processor (NP), a digital signal processor (DSP), a microprocessor, a microcontroller, a programmable logic device (PLD), or any combination thereof.
  • the processor 41 can also be any other device with processing functions, such as a circuit, a device or a software module, which is not limited in the embodiment of the present application.
  • the processor 41 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 10 .
  • the electronic device may include multiple processors.
  • the codec device may also include a processor 45 (shown as an example with a dotted line in FIG. 10).
  • Memory 42 is used to store instructions.
  • the instructions may be a computer program.
  • the memory 42 may be a read-only memory (ROM) or other type of static storage device that can store static information and/or instructions, or a random access memory (RAM) or other type of dynamic storage device that can store information and/or instructions; it may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compressed optical discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium, or another magnetic storage device; the embodiments of this application do not limit this.
  • the memory 42 may exist independently of the processor 41 or may be integrated with the processor 41 .
  • the memory 42 may be located within the codec device or outside the codec device, and this is not limited in the embodiment of the present application.
  • Bus 43 is used to transmit information between various components included in the encoding and decoding device.
  • the bus 43 may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, etc.
  • the bus 43 can be divided into an address bus, a data bus, a control bus, etc. For ease of presentation, only one solid line is used in Figure 10, but this does not mean that there is only one bus or only one type of bus.
  • Communication interface 44 is used to communicate with other devices or other communication networks.
  • the other communication network may be Ethernet, wireless access network (radio access network, RAN), wireless local area networks (wireless local area networks, WLAN), etc.
  • Communication interface 44 may be a module, a circuit, a transceiver, or any device capable of communicating. The embodiments of the present application do not limit this.
  • the structure shown in Figure 10 does not constitute a limitation on the encoding and decoding device.
  • the encoding and decoding device may include more or fewer components than shown in the figure, combine certain components, or use a different component arrangement.
  • the execution subject of the image encoding method and image decoding method provided by the embodiments of the present application may be the above-mentioned encoding and decoding device; an application (APP) installed in the encoding and decoding device that provides encoding and decoding functions; a CPU in the codec device; or a functional module in the codec device for executing the image encoding method and the image decoding method.
  • the embodiments of the present application do not limit this. For simplicity of description, the following description will be uniformly based on the encoding end or the decoding end.
  • the embodiment of this application proposes a series of improved coding methods (such as coding mode/prediction mode, complexity information transmission, coefficient grouping, code stream arrangement, etc.) to reduce the expansion rate of the coding unit, reduce the speed difference when encoding the coding blocks of each channel of the coding unit, and allow the space of the sub-stream buffers to be configured reasonably to reduce hardware costs.
  • the expansion rate of the coding unit may include a theoretical expansion rate and a current (CU) expansion rate (actual expansion rate).
  • the theoretical expansion rate can be obtained through theoretical derivation after the codec is determined, and its value is greater than 1. If the number of bits of the substream encoded by the coding block of a certain channel in the coding unit is 0, this situation needs to be eliminated when calculating the current expansion rate.
  • Theoretical expansion rate = (number of bits of the CB with the largest theoretical number of bits in the CU) / (number of bits of the CB with the smallest theoretical number of bits in the CU).
  • Current (CU) expansion rate = (number of bits of the CB with the largest actual number of bits in the current CU) / (number of bits of the CB with the smallest actual number of bits in the current CU).
  • the associated expansion rate may also include the current sub-stream expansion rate.
  • Current substream expansion rate = (number of data bits in the substream buffer holding the most data bits among the current multiple substream buffers) / (number of data bits in the substream buffer holding the fewest data bits among the current multiple substream buffers).
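The three ratios above can be sketched as follows (a minimal illustration; the helper names are ours, and excluding zero-bit coding blocks follows the note above about eliminating that case):

```python
def theoretical_expansion_rate(cb_bits):
    """cb_bits: theoretical bit counts of the coding blocks (CBs) in one CU.
    CBs whose substream encoded to 0 bits are excluded, as stated above."""
    nonzero = [b for b in cb_bits if b > 0]
    return max(nonzero) / min(nonzero)

def current_substream_expansion_rate(buffer_bits):
    """buffer_bits: data bits currently held in each substream buffer."""
    return max(buffer_bits) / min(buffer_bits)
```

The current CU expansion rate follows the same max/min pattern, applied to the actual bit counts of the current CU's coding blocks.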
  • the coding end can select based on the following strategies.
  • Each improved encoding mode in the following embodiments can be selected and determined according to the following strategies.
  • the bit consumption cost refers to the number of bits required to encode/decode the CU.
  • the bit consumption cost mainly includes the code length of the mode flag (or mode codeword), the code length of the encoding/decoding tool information codeword, and the code length of the residual codeword.
  • Distortion is used to indicate the difference between the reconstructed value and the original value. Distortion can be calculated using any one or more of: the sum of squared differences (SSD), the mean squared error (MSE), the sum of absolute differences (SAD, time domain), the sum of absolute transformed differences (SATD, frequency domain), and the peak signal-to-noise ratio (PSNR). The embodiments of the present application do not limit this.
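Minimal pure-Python versions of several of the distortion measures listed above (SSD, MSE, SAD, PSNR); the function names are illustrative, and SATD is omitted since it additionally requires a transform such as the Hadamard:

```python
import math

def ssd(orig, recon):
    # Sum of squared differences between original and reconstructed samples.
    return sum((o - r) ** 2 for o, r in zip(orig, recon))

def mse(orig, recon):
    # Mean squared error.
    return ssd(orig, recon) / len(orig)

def sad(orig, recon):
    # Sum of absolute differences (time domain).
    return sum(abs(o - r) for o, r in zip(orig, recon))

def psnr(orig, recon, bitdepth=8):
    # Peak signal-to-noise ratio in dB, relative to the peak sample value.
    peak = (1 << bitdepth) - 1
    return 10 * math.log10(peak * peak / mse(orig, recon))
```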
  • Strategy 3: Calculate the rate-distortion cost and select the encoding mode with the smallest rate-distortion cost.
  • the rate-distortion cost refers to the weighted sum of the bit consumption cost and distortion.
  • the weight coefficient of the bit consumption cost and the weight coefficient of the encoded distortion can be preset at the encoding end.
  • the embodiments of this application do not limit the specific numerical value of the weight coefficient.
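Strategy 3 can be sketched as follows; the weight values are placeholders, since the embodiments do not limit the specific numerical values of the weight coefficients:

```python
def rd_cost(bits, distortion, w_bits=1.0, w_dist=1.0):
    # Rate-distortion cost: weighted sum of bit consumption cost and distortion.
    return w_bits * bits + w_dist * distortion

def select_mode(candidates, w_bits=1.0, w_dist=1.0):
    """candidates: {mode_name: (bit_cost, distortion)}.
    Returns the mode with the smallest rate-distortion cost."""
    return min(candidates, key=lambda m: rd_cost(*candidates[m], w_bits, w_dist))
```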
  • the encoding end encodes each pixel value in the encoding blocks of multiple channels of the encoding unit using the image bit width as a fixed-length code.
  • the encoding end encodes each pixel value in the encoding block of multiple channels of the encoding unit with a fixed-length code that is less than or equal to the image bit width.
  • FIG. 11 is a schematic flowchart of an image encoding method provided by an embodiment of the present application. As shown in Figure 11, the image encoding method includes S101 to S102.
  • the encoding end obtains the encoding unit.
  • the encoding end may be the source device 10 in Figure 6 above, the video encoder 102 in the source device 10, the encoding and decoding device in Figure 10 above, etc.
  • the embodiment of the present application is not limited to this.
  • the encoding unit is an image block in the image to be processed (that is, the original block mentioned above).
  • the encoding unit includes encoding blocks of multiple channels; the multiple channels include a first channel; the first channel is any one of the multiple channels.
  • the size of the encoding unit can be 16 ⁇ 2 ⁇ 3, then the size of the encoding block of the first channel of the encoding unit is 16 ⁇ 2.
  • the encoding end encodes the encoding block of the first channel according to the first encoding mode.
  • the first encoding mode is a mode in which the sample values in the encoding block of the first channel are encoded according to the first fixed-length code.
  • the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed. The image bit width characterizes the number of bits required to store each sample in the image to be processed.
  • the first fixed-length code may be preset in the encoding/decoding end, or may be determined by the encoding end and written into the code stream header information and transmitted to the decoding end.
  • the anti-expansion mode refers to a mode that directly encodes the original pixel values in the coding blocks of multiple channels. Compared with other encoding modes, the number of bits consumed by directly encoding the original pixel values is usually larger; therefore, the coding block with the largest theoretical number of bits in the theoretical expansion rate usually comes from a coding block coded in the anti-expansion mode. Embodiments of the present application reduce the theoretical number of bits of that coding block (that is, the numerator in the fraction) by reducing the code length of the fixed-length code in the anti-expansion mode, thereby reducing the theoretical expansion rate.
  • the encoding end can convert it into YUV format for encoding, or into RGB format for encoding; the embodiments of the present application do not limit this.
  • FIG 12 is a schematic flowchart of an image decoding method provided by an embodiment of the present application. As shown in Figure 12, the image decoding method includes S201 to S202.
  • the decoding end obtains the code stream after encoding the coding unit.
  • the bitstream after encoding the coding unit may include multiple substreams corresponding to the multiple channels after the coding blocks of the multiple channels are encoded.
  • the multiple substreams may include the substream corresponding to the first channel (that is, the substream after the coding block of the first channel is encoded).
  • the decoding end decodes the substream corresponding to the first channel according to the first decoding mode.
  • the first decoding mode is a mode in which sample values are parsed from the substream corresponding to the first channel according to the first fixed-length code.
  • S202 may also specifically include: when the first fixed-length code is equal to the image bit width, the decoding end directly decodes the substream corresponding to the first channel according to the first decoding mode; when the first fixed-length code is smaller than the image bit width, the decoding end performs inverse quantization on the parsed pixel values in the coding block of the first channel.
  • the quantization step size is 1 ⁇ (bitdepth-fixed_length).
  • bitdepth represents the image bit width.
  • fixed_length represents the code length of the first fixed-length code, and << denotes a left bit shift.
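The inverse quantization above can be sketched as follows; the encoder-side right shift is our assumption of the matching forward quantization, since the text only states the step size 1 << (bitdepth - fixed_length):

```python
def encode_fixed_length(sample, bitdepth, fixed_length):
    # Assumed forward step: drop (bitdepth - fixed_length) low bits so the
    # sample fits in fixed_length bits.
    return sample >> (bitdepth - fixed_length)

def decode_fixed_length(code, bitdepth, fixed_length):
    # When fixed_length equals the image bit width, the parsed value is used
    # directly; otherwise inverse-quantize with step 1 << (bitdepth - fixed_length).
    shift = bitdepth - fixed_length
    return code << shift if shift > 0 else code
```

Note that when fixed_length is smaller than the image bit width the round trip is lossy: the low bits removed by the forward quantization are not recovered.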
  • Optional implementation 1: If the current code stream buffer cannot guarantee that all channels of the coding unit can be encoded in the anti-expansion mode, the anti-expansion mode is turned off, and the encoding end selects a mode other than the original-value mode. Even if encoding with another mode causes expansion (the number of bits of the substream after encoding a certain channel's coding block is too large), the anti-expansion mode cannot be used.
  • Improvement plan 1: Ensure that the anti-expansion mode is turned on.
  • FIG. 13 is a schematic flowchart of another image encoding method provided by an embodiment of the present application. As shown in Figure 13, the image encoding method includes S301 to S303.
  • the encoding end obtains a coding unit.
  • the coding unit is an image block in the image to be processed, and the coding unit includes coding blocks of multiple channels.
  • the encoding end determines the first total code length.
  • the first total code length is the total code length of the first code stream obtained by encoding the coding blocks of multiple channels according to their respective corresponding target coding modes.
  • the target encoding mode includes a first encoding mode.
  • the first encoding mode is a mode for encoding pixel values in the encoding block according to a first fixed-length code.
  • the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed.
  • the image bit width is used to characterize the number of bits required to store each sample in the image to be processed.
  • the encoding end can determine the rate distortion cost of each coding mode according to the above strategy 3, and determine the mode with the lowest rate distortion cost as the target coding mode corresponding to the coding block of the channel.
  • the fallback mode includes a first fallback mode and a second fallback mode.
  • the first fallback mode refers to using the IBC mode to obtain the block vector of the reference prediction block, then calculating the residual and quantizing the residual.
  • the quantization step size is determined based on the remaining size of the code stream buffer and the target pixel depth (bits per pixel, BPP).
  • the second fallback mode refers to directly quantizing pixels, and the quantization step size is determined based on the remaining size of the code stream buffer and the target BPP.
  • the mode flags for fallback mode and first encoding mode are the same.
  • the anti-expansion mode uses fixed-length code encoding.
  • the final total code length is fixed, and the anti-expansion mode can be used to avoid the above situation where the residual error is too large.
  • the embodiment of the present application provides a solution for notifying the decoder of the adopted encoding mode by judging the remaining size of the code stream buffer when the mode flags of the fallback mode and the first encoding mode (anti-expansion mode) are the same.
  • the image encoding method may further include: the encoding end encodes mode flags in the multiple substreams obtained by encoding the coding blocks of the multiple channels.
  • the mode flag is used to indicate the coding mode adopted by each coding block of multiple channels, and the mode flag of the first coding mode and the mode flag of the fallback mode are the same.
  • each substream obtained by encoding the coding blocks of the multiple channels can carry its own mode flag. Take the first channel among the multiple channels as an example; the first channel is any one of the multiple channels.
  • the encoding end encoding the mode flags in the multiple substreams obtained by encoding the coding blocks of the multiple channels may include: encoding a sub-mode flag in the substream obtained by encoding the coding block of the first channel.
  • the sub-mode flag is used to indicate the type of fallback mode adopted by the coding block of the first channel, or used to indicate the type of fallback mode adopted by the coding blocks of multiple channels.
  • the fallback mode may include a first fallback mode and a second fallback mode. No further details will be given here.
  • the above-mentioned first component is also taken as an example.
  • the encoding end encoding the mode flags in the multiple substreams obtained by encoding the coding blocks of the multiple channels may include: encoding a first flag, a second flag, and a third flag in the substream obtained by encoding the coding block of the luminance channel.
  • the first flag is used to indicate that the coding blocks of multiple channels are coded in the first coding mode or the fallback mode
  • the second flag is used to indicate that the coding blocks of the multiple channels are coded in the target mode
  • the target mode is any one of the first coding mode and the fallback mode; when the second flag indicates that the target mode adopted by the coding blocks of the multiple channels is the fallback mode, the third flag is used to indicate the type of fallback mode adopted by the coding blocks of the multiple channels.
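The first/second/third flag hierarchy above can be sketched as follows; the concrete 0/1 codeword values are illustrative only, since the text does not specify the codewords:

```python
def write_mode_flags(bits, mode):
    """Append the flag hierarchy described above to `bits` (a list of 0/1).
    `mode` is 'anti_expansion', 'fallback1', 'fallback2', or any other mode
    name; the bit values chosen here are illustrative, not normative."""
    if mode in ("anti_expansion", "fallback1", "fallback2"):
        bits.append(1)                                     # first flag: shared codeword
        bits.append(0 if mode == "anti_expansion" else 1)  # second flag: which target mode
        if mode != "anti_expansion":
            bits.append(0 if mode == "fallback1" else 1)   # third flag: fallback type
    else:
        bits.append(0)  # some other coding mode: no further flags here
    return bits
```

The decoder mirrors this: after reading the shared first-flag codeword it reads the second flag, and only when that indicates the fallback mode does it read the third flag.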
  • FIG 14 is a schematic flowchart of another image decoding method provided by an embodiment of the present application. As shown in Figure 14, the image decoding method includes S401 to S404.
  • the decoding end analyzes the code stream after encoding the coding unit.
  • the coding unit includes coding blocks of multiple channels.
  • the plurality of channels include a first channel, and the first channel is any one of the plurality of channels.
  • the decoding end determines that the target decoding mode of the substream is the fallback mode.
  • the second total code length is a total code length of a first code stream obtained after coding blocks of multiple channels are encoded according to the first coding mode or the fallback mode.
  • the decoding end parses the preset flag bit in the substream and determines the target fallback mode.
  • the target fallback mode is one of the fallback modes.
  • the sub-mode flag is used to indicate the type of fallback mode used when encoding the encoding blocks of multiple channels.
  • the fallback mode includes a first fallback mode and a second fallback mode.
  • the decoding end decodes the substream according to the target fallback mode.
  • FIG 15 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 15, the image decoding method includes S501 to S505.
  • the decoding end analyzes the code stream after encoding the coding unit.
  • the decoding end parses the first flag from the substream obtained by encoding the encoding block of the first channel.
  • the first channel is any one of the plurality of channels.
  • the first flag may refer to the above-mentioned encoding method, and will not be described again here.
  • the decoding end parses the second flag from the substream obtained by encoding the encoding block of the first channel.
  • the second flag can refer to the description in the above encoding method, which will not be repeated here.
  • the decoding end parses the third flag from the substream obtained by encoding the encoding blocks of the first channel.
  • the decoding end determines target decoding modes of the multiple channels according to the type of the fallback mode indicated by the third flag, and decodes substreams obtained by encoding the coding blocks of the multiple channels according to the target decoding mode.
  • the decoding end may use the type of fallback mode indicated by the third flag as the target decoding mode of multiple channels.
  • the method may further include: the decoding end determines the target decoding code length of the coding unit based on the remaining size of the code stream buffer and the target pixel depth (BPP), where the target decoding code length is used to indicate the code length required to decode the code stream of the coding unit; the decoding end determines the allocated code length of the multiple channels based on the target decoding code length, where the allocated code length is used to indicate the code length required to decode the residual code streams of the coding blocks of the multiple channels; and the decoding end determines the decoding code length allocated to each of the multiple channels based on the average value of the allocated code length across the multiple channels.
  • Encoding end: The encoding end ensures that, at any time, the bit consumption cost of encoding in the anti-expansion mode (original-value mode) is the largest; the encoding end cannot choose a mode whose bit consumption cost is greater than that of the original-value mode. Therefore, an encoding mode is required for all components: even in the fallback mode, not only the substream corresponding to the Y channel (first substream) requires an encoding mode, but the substreams corresponding to the U/V channels (second substream and third substream) also require an encoding mode. The fallback mode codeword is kept the same as the original-value mode codeword. The specific type of fallback mode (first fallback mode or second fallback mode) may be encoded in a certain channel.
  • Decoding end: First parse the encoding modes of the three channels. When the Y/U/V channels all carry target encoding modes selected based on the rate-distortion cost, judge the remaining size of the code stream buffer to determine whether it allows all three channels to be decoded according to their respective target encoding modes. If not, the current decoding mode is the fallback mode. If it is the fallback mode, a flag bit can be parsed in a certain channel to indicate whether the current fallback mode is the first fallback mode or the second fallback mode (the three channels use the same fallback mode). Therefore, when any channel does not select the anti-expansion mode, the current CU will not select the fallback mode.
  • Encoding end: The encoding end ensures that, at any time, the bit consumption cost of encoding in the anti-expansion mode (original-value mode) is the largest; the encoding end cannot choose a mode whose bit consumption cost is greater than that of the original-value mode.
  • the original-value mode and the fallback mode still use the same mode flag (that is, the first flag mentioned above), but when encoding the anti-expansion mode or the fallback mode, an additional flag (that is, the second flag mentioned above) is encoded to indicate whether the current encoding mode is the anti-expansion mode or the fallback mode.
  • if it is the fallback mode, a further flag bit (that is, the above-mentioned third flag) is encoded to indicate whether the current fallback mode is the first fallback mode or the second fallback mode.
  • the fallback mode types of multiple channels remain consistent.
  • the mode flag is encoded in the substream corresponding to the channel, but there is no need to encode the flag of whether it is the fallback mode (that is, the above-mentioned second flag).
  • Decoding end: Parse the mode flag from the bitstream. If the mode flag is the codeword shared by the anti-expansion mode and the fallback mode, continue to parse a flag (that is, the second flag mentioned above) indicating whether the current mode is the fallback mode. If it is the fallback mode, further parse a flag (that is, the third flag mentioned above) indicating whether the current fallback mode is the first fallback mode or the second fallback mode.
  • once the type of fallback mode of one channel has been parsed, other channels that use the fallback mode use the same type of fallback mode as that channel; channels that do not use the fallback mode each parse their own mode.
  • Improvement Scheme 1 can reduce the upper limit of the numerator in the expansion rate formula through the above-mentioned Embodiment 2.
  • Optional implementation 2: When it is determined to use the fallback mode for encoding, the encoding end sets a target BPP. In the fallback mode, the sum of the numbers of bits of the substreams corresponding to the coding blocks of the three channels must be less than or equal to the target BPP. Therefore, when allocating the code rate, the common information (block vector and mode flag) is encoded in the luminance channel; after subtracting the code length occupied by the common information from the allocated code length, the remaining code length is used to encode the residuals of the multiple channels, so the remaining code length needs to be allocated.
  • the current allocation method divides the remaining code length by the number of pixels in the multiple channels (that is, the code length allocated to each channel must be an integer multiple of the number of pixels of the coding unit); if it cannot be divided evenly, the part of the code length that cannot be divided evenly is allocated to the luminance channel.
  • when the remaining code length is small and cannot be divided by the number of pixels in the multiple channels, no code length can be allocated to the substreams corresponding to the first chroma channel and the second chroma channel, resulting in a smaller number of bits in those substreams and a larger expansion rate.
  • Improvement plan 2: Evenly distribute the remaining code length to the multiple channels in units of bits to improve allocation accuracy.
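Improvement plan 2 can be sketched as follows (allocation counted in bits; the rule that the indivisible remainder goes to the luminance channel is carried over from the description above):

```python
def allocate_code_length(remaining_bits, num_channels):
    # Evenly split the remaining code length, counted in bits, across the
    # channels; any bits that cannot be divided evenly go to the luminance
    # channel (index 0), so every channel receives a nonzero share whenever
    # remaining_bits >= num_channels.
    base = remaining_bits // num_channels
    alloc = [base] * num_channels
    alloc[0] += remaining_bits - base * num_channels
    return alloc
```

Compared with dividing by the number of pixels, this per-bit split never leaves a chroma substream with zero allocated bits, which is what keeps the expansion rate's denominator from collapsing.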
  • Figure 16 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 16, based on the above S301 to S303, the image encoding method may also include S601 to S603.
  • the encoding end determines the target encoding code length of the encoding unit based on the remaining size of the code stream buffer and the target BPP.
  • the target coding code length is used to indicate the code length required for coding the coding unit.
  • the target BPP can be obtained by the encoding end, for example, receiving the target BPP input by the user.
  • the allocated code length is used to indicate the code length required to encode the residual of the coding block of multiple channels.
  • the encoding end can subtract the code length occupied by the common information from the target encoding code length to obtain the allocated code length of the multiple channels.
  • S603: Determine the encoding code length allocated to each of the multiple channels according to the average value of the allocated code length across the multiple channels.
  • the encoding end may also encode the above block vectors and mode flags in substreams corresponding to multiple channels.
  • the mode flag can refer to the above and will not be described again here.
  • the block vector can be described with reference to the IBC mode below and will not be described again here.
  • when encoding the residual, the encoding end can group the pixels of the channels, and each group encodes only one residual value.
  • the current code length allocation method based on the number of pixels of the coding unit may leave the chroma channels with no allocated code length, so the number of bits encoded for the coding blocks of the chroma channels becomes smaller and the expansion rate becomes larger.
  • the image encoding method provided by the embodiment of the present application changes the allocation unit of the code length from the number of pixels to the number of bits, so that each channel can be allocated a code length for encoding, which reduces the theoretical expansion rate.
  • the embodiment of the present application also provides an image decoding method.
  • the method may also include: the decoding end determines the target decoding code length of the coding unit based on the remaining size of the code stream buffer and the target pixel depth (BPP), where the target decoding code length is used to indicate the code length required to decode the code stream of the coding unit; the decoding end determines the allocated code length of the multiple channels based on the target decoding code length, where the allocated code length is used to indicate the code length required to decode the residuals of the code streams of the coding blocks of the multiple channels; and the decoding end determines the decoding code length allocated to each of the multiple channels based on the average value of the allocated code length across the multiple channels.
  • the mode identifier is CU level
  • BV: Block Vector.
  • Both the mode identifier and the BV are transmitted in the luminance channel.
  • Improvement plan: Change the mode identifier to CB level (the BV is still CU level); each channel transmits the mode identifier, and each channel transmits BVs.
  • Embodiment 4:
  • Figure 17 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 17, the method includes S701 to S704.
  • the encoding end obtains the encoding unit.
  • the coding unit includes coding blocks of multiple channels.
  • the encoding end encodes the encoding block of at least one channel among the multiple channels according to the IBC mode.
  • S702 may specifically include: the encoding end determines the target encoding mode of the at least one channel using a rate-distortion optimization decision.
  • the target encoding mode includes IBC mode.
  • the rate-distortion optimization decision can be made with reference to the above-mentioned strategy 3, and will not be described again here.
  • the encoding end obtains the BV of the reference prediction block.
  • the BV of the reference prediction block is used to indicate the position of the reference prediction block in the encoded image block.
  • the reference prediction block is used to represent the prediction value of the coding block encoded in the IBC mode.
  • the encoding end encodes the BV of the reference prediction block in at least one substream obtained by encoding the encoding block of at least one channel in IBC mode.
  • the BV may be all encoded in the substream corresponding to the luma channel.
  • the BVs can also all be encoded in the substream corresponding to one of the chroma channels (for example, the first chroma channel or the second chroma channel).
  • the embodiments of the present application do not limit the specific numerical value of the preset ratio.
  • the above-mentioned S702 may specifically include: encoding the coding blocks of the multiple channels according to the IBC mode.
  • the above-mentioned S704 may specifically include: encoding the BV of the reference prediction block in the substreams obtained by encoding the coding blocks of each of the multiple channels in the IBC mode.
  • the BV of the reference prediction block includes multiple BVs.
  • encoding the BV of the reference prediction block in the substreams obtained by encoding the coding blocks of each of the multiple channels in the IBC mode may include: distributing the multiple BVs among the substreams of the multiple channels according to a preset ratio for encoding.
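Distributing the CU's BVs over the per-channel substreams by a preset ratio might look like the sketch below; the default ratio value is an assumption for illustration, since the embodiments do not limit the specific numerical value of the preset ratio:

```python
def distribute_bvs(bvs, ratio=(0, 1, 1)):
    """Split the CU's BV list across the (Y, Cb, Cr) substreams according to
    `ratio`; the last listed channel absorbs any indivisible remainder,
    matching the "one more BV in one chroma substream" behaviour described
    in the text. The default ratio (all BVs in the chroma substreams) is
    illustrative only."""
    total = sum(ratio)
    out, start = [], 0
    for i, r in enumerate(ratio):
        count = len(bvs) - start if i == len(ratio) - 1 else len(bvs) * r // total
        out.append(bvs[start:start + count])
        start += count
    return out
```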
  • the embodiment of the present application can increase the data transmitted in the substream corresponding to the chroma channel, which has a smaller number of bits, thereby increasing the denominator in the above formula for calculating the theoretical expansion rate and thus reducing the theoretical expansion rate.
  • a CU generates header information during the encoding process, and the header information of the CU can also be allocated and encoded in the substream according to the above-mentioned method of allocating BVs.
  • FIG 18 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 18, the method includes S801 to S804.
  • the decoding end analyzes the code stream after encoding the coding unit.
  • the encoding unit includes encoding blocks of multiple channels; the code stream includes multiple sub-streams encoded by encoding blocks of multiple channels and corresponding to multiple channels one-to-one.
  • the decoding end determines the position of the reference prediction block based on the block vector (BV) of the reference prediction block parsed from at least one of the multiple substreams.
  • the decoding end determines the prediction value of the decoding block decoded according to the IBC mode based on the position information of the reference prediction block.
  • the decoding end reconstructs the decoding block decoded according to the IBC mode based on the prediction value.
  • S803 and S804 may refer to the above description of the video encoding and decoding system, and will not be described again here.
  • the decoding end can parse all the BVs from the substream corresponding to the luma channel. The BVs can also all be parsed from the substream corresponding to one of the chroma channels (for example, the first chroma channel or the second chroma channel), or the number of BVs can be divided equally between the substreams corresponding to the two chroma channels for parsing; if it cannot be divided equally, the decoding end can parse one more BV from the substream corresponding to either chroma channel.
  • the embodiments of the present application do not limit the specific numerical value of the preset ratio.
  • the coding blocks encoded according to the IBC mode include coding blocks of at least two channels; the coding blocks of at least two channels share the BV of the reference prediction block.
  • the method may further include: when the IBC mode identifier is parsed from any one of the multiple substreams, the decoding end determines that the target decoding mode corresponding to the multiple substreams is the IBC mode.
  • the method may further include: the decoding end parses the IBC mode identifiers from the multiple substreams one by one, and determines that the target decoding mode corresponding to the substream from which the IBC mode identifier is parsed is the IBC mode.
  • the improvement plan for the IBC mode mainly introduces two options:
  • Step 1: Obtain the reference prediction block based on multiple channels. That is, for a multi-channel coding unit, the multi-channel block at a certain position in the search area is used as the reference prediction block of the current multi-channel coding unit, and this position is recorded as the BV; in other words, the multiple channels share one BV. If the input image is YUV400 there is only one channel; otherwise there are 3 channels. After obtaining the prediction block, the residual of each channel is calculated, the residual is quantized (transformed), and reconstruction is completed after inverse quantization (inverse transformation).
  • Step 2: On the luminance channel, encode the auxiliary information (including the coding block complexity level and other information), the mode information, and the quantized coefficients of the luminance channel.
  • Step 3 If a chroma channel exists, encode auxiliary information (including coding block complexity and other information) on the chroma channel, and encode the quantized coefficients of the chroma channel.
  • the BV of the reference prediction block of each coding unit can all be encoded on the luma channel.
  • the BV of the reference prediction block of each coding unit can also be encoded entirely on one of the chroma channels, or the BVs can be divided equally between the two chroma channels for encoding. If they cannot be divided equally, one of the chroma channels encodes one more BV than the other.
  • the BVs can also be divided equally and encoded on different components. If they cannot be divided equally, they are distributed according to the preset ratio.
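The BV allocation options above (all BVs on one channel, or split across the two chroma substreams with any remainder going to one of them) can be sketched as follows; the substream names and the `scheme` parameter are illustrative, not taken from the specification:

```python
def allocate_bvs(num_bvs: int, scheme: str = "luma"):
    """Assign each BV index of a coding unit to the substream that carries it.

    Returns a dict mapping substream name -> list of BV indices.
    """
    slots = {"Y": [], "Cb": [], "Cr": []}
    if scheme == "luma":                 # all BVs in the luma substream
        slots["Y"] = list(range(num_bvs))
    elif scheme == "chroma_split":       # split evenly over the two chroma substreams
        half = num_bvs // 2
        # when the count is odd, one chroma channel carries one more BV
        slots["Cb"] = list(range(half + num_bvs % 2))
        slots["Cr"] = list(range(half + num_bvs % 2, num_bvs))
    return slots
```

The same allocation rule is mirrored on the decoding side, so the parser knows in which substream to expect each BV.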
  • Step 1: Parse the auxiliary information and mode information of the coding block of the luminance channel, and parse the quantized coefficients of the luminance channel.
  • Step 2: If a chroma channel exists, parse the auxiliary information of the coding block of the U channel. If the prediction mode of the luma channel is the IBC mode, the prediction mode of the current chroma channel does not need to be parsed and directly defaults to the IBC mode. Then parse the quantized coefficients of the current chroma channel.
  • Step 3: The parsing of BVs is consistent with the encoding end, that is:
  • the BVs of the reference prediction block of each coding unit can all be parsed on the luma channel.
  • the BVs of the reference prediction block of each coding unit can also be parsed entirely on one of the chroma channels, or the BVs can be divided equally between the two chroma channels for parsing. If they cannot be divided equally, one of the chroma channels parses one more BV than the other.
  • the BVs can also be divided equally and parsed on different components. If they cannot be divided equally, they are distributed according to the preset ratio.
  • Step 4: According to the BV shared by the three channels, obtain the predicted value of each coding block under each channel. Inverse quantize (inverse transform) the coefficients parsed under each channel to obtain the residual values. Reconstruct each coding block according to the residual values and predicted values.
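Step 4 can be sketched as follows for a single channel. A plain scalar inverse quantizer stands in for the inverse quantization (inverse transform) step, and the list-of-lists sample layout is an assumption made for illustration:

```python
def reconstruct_block(search_area, bv, coeffs, qp_step):
    """Reconstruct one channel's coding block in IBC mode.

    search_area: 2-D list of previously reconstructed samples of this channel
    bv:          (row, col) position of the reference prediction block,
                 shared by all channels of the coding unit
    coeffs:      quantized residual coefficients parsed for this channel
    qp_step:     quantization step (a simple scalar quantizer is assumed)
    """
    r0, c0 = bv
    recon = []
    for r, row in enumerate(coeffs):
        # reconstruction = prediction (sample at the BV position) + residual
        recon.append([search_area[r0 + r][c0 + c] + coeff * qp_step
                      for c, coeff in enumerate(row)])
    return recon
```

Because the BV is shared, the same `bv` argument is passed for the Y, U, and V coding blocks of the coding unit.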
  • Step 1: For the same coding unit, the IBC mode of the three channels only needs to be trained to obtain one set of BVs.
  • This set of BVs can be obtained based on the search areas and original values of the three channels, based on the search area and original value of one channel, or calculated based on the search areas and original values of any two channels.
  • Step 2: For the coding block of each channel, the target coding mode is determined using the rate-distortion cost, where the BV of the three channels in the IBC mode uses the BV calculated in Step 1. After one component selects the IBC mode, the other components are allowed not to select the IBC mode.
  • Step 3: The coding block of each channel needs to encode its own optimal mode. If the target mode of one or more channels is the IBC mode, then the BV can be encoded based on one channel that selects the IBC mode, based on two channels that select the IBC mode with the number of BVs divided equally, or based on all channels that select the IBC mode.
  • Each channel parses a target mode. If the target mode of one or more channels is parsed as the IBC mode (only the same IBC mode can be selected), the BV parsing is consistent with the BV encoding allocation scheme at the encoding end.
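Steps 2 and 3 of this option amount to an independent rate-distortion decision per channel, with every IBC candidate reusing the shared BV derived in Step 1. A minimal sketch (the mode names and cost values are illustrative):

```python
def select_modes(costs, shared_bv):
    """Pick the target mode per channel by rate-distortion cost.

    costs:     dict of channel -> {mode_name: rd_cost}; every "IBC" entry
               is evaluated with the single shared BV
    shared_bv: the one BV trained for the whole coding unit

    A channel is free not to choose IBC even if another channel does.
    """
    decisions = {}
    for channel, per_mode in costs.items():
        best = min(per_mode, key=per_mode.get)          # lowest RD cost wins
        decisions[channel] = (best, shared_bv if best == "IBC" else None)
    return decisions
```

Only the channels whose decision is `"IBC"` then participate in the BV-encoding allocation described in Step 3.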
  • Optional implementation: as shown in the flow chart in Figure 1, there may be a residual skip mode during the encoding process. If the encoding end selects the skip mode during encoding, there is no need to encode the residual; only 1 bit of data needs to be encoded to indicate residual skipping.
  • Improvement plan: group the processing coefficients (residual coefficients and/or transform coefficients).
  • FIG 19 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 19, the image encoding method includes S901 to S902.
  • the encoding end obtains the processing coefficient corresponding to the encoding unit.
  • the processing coefficients include one or more of residual coefficients and transformation coefficients.
  • the encoding end divides the processing coefficients into multiple groups according to the number threshold.
  • the number threshold can be preset at the encoding end.
  • the number threshold is related to the size of the coding block. For example, for a 16×2 coding block, the number threshold can be set to 16.
  • the number of each group of processing coefficients in the multiple groups of processing coefficients is less than or equal to the number threshold.
  • the image coding method provided by the embodiment of the present application can group the processing coefficients during encoding; each group of processing coefficients is transmitted after grouping, and header information must be added for each group to describe that group of processing coefficients. Compared with the current use of 1 bit of data to represent residual skipping, this increases the denominator in the calculation formula of the theoretical expansion rate, which reduces the theoretical expansion rate.
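The grouping in S902 is a simple chunking of the coefficient sequence by the number threshold; a sketch, assuming the coding unit's coefficients are held in a flat list:

```python
def group_coefficients(coeffs, number_threshold=16):
    """Split a coding unit's processing coefficients (residual and/or
    transform coefficients) into groups, each holding at most
    `number_threshold` coefficients.

    Each resulting group is later transmitted with its own header
    information. The default of 16 follows the 16x2 coding-block
    example given above.
    """
    return [coeffs[i:i + number_threshold]
            for i in range(0, len(coeffs), number_threshold)]
```

Every group satisfies the constraint that its size is less than or equal to the number threshold, matching what the decoding side (S1001) expects.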
  • FIG 20 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 20, the image decoding method includes S1001 to S1002.
  • the decoding end parses the code stream obtained by encoding the coding unit and determines the processing coefficients corresponding to the coding unit.
  • the processing coefficients include one or more of residual coefficients and transformation coefficients; the processing coefficients include multiple groups; the number of processing coefficients in each group is less than or equal to the number threshold.
  • the decoding end decodes the code stream based on the processing coefficient.
  • S1002 may refer to the above description of the video encoding and decoding system, and will not be described again here.
  • the coding unit includes a coding block of a luminance channel, a coding block of a first chroma channel, and a coding block of a second chroma channel.
  • the substream corresponding to the coding block of the luminance channel is the first substream
  • the substream corresponding to the coding block of the first chroma channel is the second substream
  • the substream corresponding to the coding block of the second chroma channel is the third substream.
  • the first substream uses 1 or 3 bits to transmit the complexity level of the luma channel
  • the second substream uses 1 or 3 bits to transmit the average complexity level of the two chroma channels
  • the third substream does not transmit the complexity level.
  • BiasInit is obtained by looking up the table, and then the quantization parameter Qp[0] of the luminance channel and the quantization parameters Qp[1] and Qp[2] of the two chroma channels are calculated.
  • the lookup table can be referred to in Table 1 below, which will not be described again here.
  • the specific implementation of the first subflow is as follows:
  • complexity_level_flag[0] is the complexity level update flag of the luminance channel, which is a binary variable.
  • a value of '1' indicates that the luminance channel of the coding unit does not need to update the complexity level; a value of '0' indicates that the luminance channel of the coding unit needs to update the complexity level.
  • the value of ComplexityLevelFlag[0] is equal to the value of complexity_level_flag[0].
  • delta_level[0] is the change amount of the luminance channel complexity level, which is a 2-bit unsigned integer that determines how much the luminance complexity level changes.
  • the value of DeltaLevel[0] is equal to the value of delta_level[0]. If delta_level[0] does not exist in the code stream, the value of DeltaLevel[0] is equal to 0.
  • PrevComplexityLevel represents the complexity level of the brightness channel of the previous coding unit; ComplexityLevel[0] represents the complexity level of the brightness channel.
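The luma-complexity syntax above can be decoded as sketched below. The bit-reader callbacks are placeholders, and treating DeltaLevel[0] as a non-negative additive change on top of PrevComplexityLevel is an assumption for illustration:

```python
def parse_luma_complexity(read_bit, read_u2, prev_complexity_level):
    """Decode ComplexityLevel[0] from the first substream.

    read_bit: callback returning complexity_level_flag[0] (1 bit)
    read_u2:  callback returning delta_level[0] as a 2-bit unsigned integer
    prev_complexity_level: PrevComplexityLevel of the previous coding unit
    """
    if read_bit() == 1:          # flag '1': no update, 1 bit total
        return prev_complexity_level
    delta = read_u2()            # flag '0': read delta_level[0], 3 bits total
    return prev_complexity_level + delta
```

This matches the "1 or 3 bits" budget stated for the first substream: one bit when the level is unchanged, three bits when it is updated.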
  • Improvement plan: transmit complexity information in the third substream.
  • FIG 21 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 21, the image encoding method includes S1101 to S1103.
  • the encoding end obtains the encoding unit.
  • the coding unit includes coding blocks of P channels, and P is an integer greater than or equal to 2.
  • the encoding end obtains the complexity information of the encoding block of each channel in the P channels.
  • the complexity information is used to characterize the degree of difference in pixel values of the coding blocks of each channel.
  • taking P channels that include the luminance channel, the first chrominance channel, and the second chrominance channel as an example: the complexity information of the coding block of the luminance channel is used to characterize the degree of difference in pixel values of the coding block of the luminance channel; the complexity information of the coding block of the first chroma channel is used to characterize the degree of difference in pixel values of the coding block of the first chroma channel; and the complexity information of the coding block of the second chroma channel is used to characterize the degree of difference in pixel values of the coding block of the second chroma channel.
  • the encoding end encodes the complexity information of the encoding blocks of each channel in the substream obtained by encoding the encoding blocks of P channels.
  • S1103 may specifically include: the encoding end encoding the complexity level of the encoding blocks of each channel in the substreams obtained by encoding the encoding blocks of P channels.
  • the second substream is specifically implemented as follows:
  • complexity_level_flag[1] represents the first chroma channel complexity level update flag, which is a binary variable.
  • a value of '1' indicates that the complexity level of the coding block of the first chroma channel of the coding unit is consistent with the complexity level of the coding block of the luminance channel; a value of '0' indicates that the complexity level of the coding block of the first chroma channel of the coding unit is inconsistent with the complexity level of the coding block of the luminance channel.
  • the value of ComplexityLevelFlag[1] is equal to the value of complexity_level_flag[1].
  • delta_level[1] represents the change amount of the complexity level of the first chroma channel.
  • DeltaLevel[1] is equal to the value of delta_level[1]. If delta_level[1] does not exist in the code stream, the value of DeltaLevel[1] is equal to 0.
  • ComplexityLevel[1] indicates the complexity level of the coding block of the first chroma channel.
  • the third substream is specifically implemented as follows:
  • complexity_level_flag[2] represents the second chroma channel complexity level update flag, which is a binary variable.
  • a value of '1' indicates that the complexity level of the coding block of the second chroma channel of the coding unit is consistent with the complexity level of the coding block of the first chroma channel; a value of '0' indicates that the complexity level of the coding block of the second chroma channel of the coding unit is inconsistent with the complexity level of the coding block of the first chroma channel.
  • the value of ComplexityLevelFlag[2] is equal to the value of complexity_level_flag[2].
  • delta_level[2] represents the change amount of the complexity level of the second chroma channel.
  • DeltaLevel[2] is a 2-bit unsigned integer and determines the change amount of the complexity level of the encoding block of the second chroma channel.
  • the value of DeltaLevel[2] is equal to the value of delta_level[2]. If delta_level[2] does not exist in the code stream, the value of DeltaLevel[2] is equal to 0.
  • ComplexityLevel[2] indicates the complexity level of the coding block of the second chroma channel.
  • the complexity information includes a complexity level and a first reference coefficient
  • the first reference coefficient is used to represent a proportional relationship between complexity levels of coding blocks of different channels.
  • S1103 may specifically include: the encoding end encodes the complexity levels of the coding blocks of Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels, where Q is an integer less than P; and the encoding end encodes the first reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels.
  • for example, the complexity information of the coding block of the luminance channel is the complexity level of the coding block of the luminance channel; the complexity information of the coding block of the first chroma channel is the complexity level of the coding block of the first chroma channel; and the complexity information of the coding block of the second chroma channel is the reference coefficient, which is used to characterize the size relationship between the complexity level of the coding block of the second chroma channel and that of the first chroma channel.
  • a value of '1' indicates that the complexity level of the coding block of the second chroma channel of the coding unit is greater than the complexity level of the coding block of the first chroma channel.
  • a value of '0' indicates that the complexity level of the coding block of the second chroma channel of the coding unit is smaller than or equal to the complexity level of the coding block of the first chroma channel.
  • the value of ComplexityLevelFlag[2] is equal to the value of complexity_level_flag[2].
  • optionally, the complexity information includes a complexity level, a reference complexity level, and a second reference coefficient. The reference complexity level includes any of the following: a first complexity level, a second complexity level, and a third complexity level.
  • the first complexity level is the maximum value among the complexity levels of the coding blocks of the P-Q channels among the P channels, where Q is an integer less than P;
  • the second complexity level is the minimum value among the complexity levels of the coding blocks of the P-Q channels among the P channels;
  • the third complexity level is the average value of the complexity levels of the coding blocks of the P-Q channels among the P channels;
  • the second reference coefficient is used to characterize the size relationship and/or proportional relationship among the complexity levels of the coding blocks of the P-Q channels among the P channels.
  • the above-mentioned S1103 may specifically include: the encoding end encodes the complexity levels of the coding blocks of Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels; and the encoding end encodes the reference complexity level and the second reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels.
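The reference complexity level carried for the P-Q channels is simply the maximum, minimum, or average of their coding blocks' complexity levels; a sketch (integer averaging is an assumption):

```python
def reference_complexity_level(levels, kind="max"):
    """Compute the reference complexity level for the P-Q channels.

    levels: complexity levels of the coding blocks of the P-Q channels
    kind:   "max" -> first complexity level,
            "min" -> second complexity level,
            "avg" -> third complexity level
    """
    if kind == "max":
        return max(levels)
    if kind == "min":
        return min(levels)
    return sum(levels) // len(levels)   # "avg": integer average assumed
```

Together with the second reference coefficient, this single value lets the decoding end recover (or bound) the individual levels of the P-Q channels without transmitting each one.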
  • for example, the complexity information of the coding block of the luminance channel is the complexity level of the coding block of the luminance channel; the complexity information of the coding block of the first chroma channel is the reference complexity level, where the reference complexity level includes any of the following: a first complexity level, a second complexity level, and a third complexity level. The first complexity level is the maximum of the complexity level of the coding block of the first chroma channel and the complexity level of the coding block of the second chroma channel; the second complexity level is the minimum of the two; and the third complexity level is the average of the two. The complexity information of the coding block of the second chroma channel is not transmitted in the third substream corresponding to the coding block of the second chroma channel.
  • the image coding method provided by the embodiment of the present application, by additionally encoding complexity information in the third substream, increases the number of bits in the substream with a smaller number of bits, thereby increasing the denominator in the above theoretical expansion rate calculation formula and reducing the theoretical expansion rate.
  • the embodiment of the present application further provides an image decoding method.
  • Figure 2 shows another image decoding method provided by the embodiment of the present application. As shown in Figure 2, the decoding method includes S1201 to S1204.
  • the decoding end parses the code stream obtained by encoding the coding unit.
  • the coding unit includes coding blocks of P channels; P is an integer greater than or equal to 2; the code stream includes multiple substreams encoded by the coding blocks of P channels and corresponding to the P channels one-to-one.
  • the decoding end parses the complexity information of the encoding blocks of each channel in the substream obtained by encoding the encoding blocks of P channels.
  • S1202 may specifically include: the decoding end separately parses the complexity level of the encoding blocks of each channel in the substreams obtained by encoding the encoding blocks of the P channels.
  • the decoding end determines the quantization parameters of the encoding blocks of each channel based on the complexity information of the encoding blocks of each channel.
  • the decoding end decodes the code stream based on the quantization parameters of the encoding blocks of each channel.
  • the complexity information includes a complexity level and a first reference coefficient; the first reference coefficient is used to characterize the proportional relationship between complexity levels of coding blocks of different channels.
  • S1202 may specifically include: the decoding end respectively parses the complexity levels of the coding blocks of Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels, where Q is an integer less than P; the decoding end parses the first reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels; and the decoding end determines the complexity levels of the coding blocks of the P-Q channels based on the first reference coefficient and the complexity levels of the coding blocks of the Q channels.
  • ChromaComplexityLevel represents the complexity level of the chroma channel.
  • ChromaComplexityLevel needs to be calculated separately on the encoding side (the decoding side obtains it directly from the code stream).
  • optionally, the complexity information includes a complexity level, a reference complexity level, and a second reference coefficient. The reference complexity level includes any one of the following: a first complexity level, a second complexity level, and a third complexity level.
  • the first complexity level is the maximum value among the complexity levels of the coding blocks of the P-Q channels among the P channels, where Q is an integer less than P;
  • the second complexity level is the minimum value among the complexity levels of the coding blocks of the P-Q channels among the P channels;
  • the third complexity level is the average value of the complexity levels of the coding blocks of the P-Q channels among the P channels;
  • the second reference coefficient is used to characterize the size relationship and/or the proportional relationship among the complexity levels of the coding blocks of the P-Q channels among the P channels.
  • the above-mentioned S1202 may specifically include: the decoding end parses the complexity levels of the coding blocks of Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels; the decoding end parses the reference complexity level and the second reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels; and the decoding end determines the complexity levels of the coding blocks of the P-Q channels based on the complexity levels of the coding blocks of the Q channels, the reference complexity level, and the second reference coefficient.
  • the decoding end can use the complexity level ComplexityLevel[0] of the coding block of the luminance channel and the complexity level ComplexityLevel[1] of the coding block of the first chroma channel to look up Table 2 below and obtain BiasInit1.
  • BiasInit2 is obtained by looking up Table 2 below, and then the quantization parameter Qp[0] of the luminance channel, the quantization parameter Qp[1] of the first chroma channel, and the quantization parameter Qp[2] of the second chroma channel are calculated.
  • the decoding end can also look up Table 2 to obtain BiasInit based on the complexity level ComplexityLevel[0] of the coding block of the luminance channel and the complexity level ChromaComplexityLevel of the coding blocks of the chroma channels, and then calculate Qp[0], Qp[1], and Qp[2] through the following process.
  • Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
  • Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias) >> 7)
  • Qp[2] = Qp[1]
  • Qp[1] = Clip3(0, MaxQp[1], Qp[1] + ComplexityLevel[1] - ChromaComplexityLevel)
  • Qp[2] = Clip3(0, MaxQp[2], Qp[2] + ComplexityLevel[2] - ChromaComplexityLevel)
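The Qp derivation chain above can be written out directly. `tmp` and `Bias` come from the BiasInit table lookups (Table 2), so the values passed in below are placeholders, not values from the specification:

```python
def clip3(lo, hi, v):
    """Clamp v to the range [lo, hi], as Clip3 does in the formulas above."""
    return max(lo, min(hi, v))

def derive_qps(master_qp, tmp, bias, max_qp, levels, chroma_level):
    """Follow the Qp[0..2] update chain above.

    max_qp:       [MaxQp[0], MaxQp[1], MaxQp[2]]
    levels:       (ComplexityLevel[1], ComplexityLevel[2])
    chroma_level: ChromaComplexityLevel from the second substream
    """
    qp = [0, 0, 0]
    qp[0] = clip3(0, max_qp[0], (master_qp - tmp) >> 7)
    qp[1] = clip3(0, max_qp[1], (master_qp + bias) >> 7)
    qp[2] = qp[1]
    qp[1] = clip3(0, max_qp[1], qp[1] + levels[0] - chroma_level)
    qp[2] = clip3(0, max_qp[2], qp[2] + levels[1] - chroma_level)
    return qp
```

Note the ordering matters: Qp[2] is seeded from Qp[1] before either value is adjusted by its channel's complexity offset.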
  • the substream corresponding to the encoding block of the first chroma channel can transmit the first complexity level and the second complexity level.
  • the decoding end can look up Table 2 to obtain BiasInit according to the complexity level ComplexityLevel[0] of the coding block of the luminance channel and the complexity level ChromaComplexityLevel of the coding blocks of the chroma channels (taking the first complexity level / the complexity level of the coding block of the first chroma channel), and then calculate Qp[0], Qp[1], and Qp[2] through the following process.
  • the substream corresponding to the coding block of the first chroma channel can transmit the third complexity level.
  • the above-mentioned ChromaComplexityLevel can take the third complexity level, and Qp[0], Qp[1], and Qp[2] are calculated through the following process.
  • Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
  • Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias) >> 7)
  • Qp[2] = Clip3(0, MaxQp[2], Qp[2] + 1)
  • Chroma channels share substreams.
  • a total of three substreams are transmitted, the first substream, the second substream, and the third substream.
  • the first substream includes the syntax elements and transformation coefficients/residual coefficients of the luminance channel
  • the second substream includes the syntax elements and transformation coefficients/residual coefficients of the first chroma channel
  • the third substream includes the syntax elements and transform coefficients/residual coefficients of the second chroma channel.
  • Improvement plan: for an image to be processed in YUV420/YUV422 format, transmit the syntax elements and transform coefficients/residual coefficients of the first chroma channel together with the syntax elements and transform coefficients/residual coefficients of the second chroma channel in the second substream, and cancel the third substream.
  • Embodiment 7:
  • FIG 22 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 22, the image encoding method includes S1201 to S1203.
  • the encoding end obtains the encoding unit.
  • the coding unit is an image block in the image to be processed.
  • a coding unit includes coding blocks for multiple channels.
  • when the image format of the image to be processed is a preset format, the encoding end merges the substreams obtained by encoding the coding blocks of at least two preset channels among the multiple channels into one merged substream.
  • the preset format may be preset in the encoding/decoding end.
  • the preset format may be YUV420 or YUV422, etc.
  • the preset channels may be the first chroma channel and the second chroma channel.
  • the definition of the coding block of the first substream may be as follows:
  • the definition of the coding block of the second sub-stream may be as follows:
  • the number of bits obtained by encoding the syntax elements and quantized transform coefficients of the two chroma channels is usually smaller than that of the luminance channel. Fusing the syntax elements of the two chroma channels and transmitting them in the second substream can reduce the difference in bit count between the first substream and the second substream, reducing the theoretical expansion rate.
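The merging rule can be sketched as follows, treating each substream as an opaque bit/byte payload; the format strings and dictionary keys are illustrative:

```python
def build_substreams(y_bits, cb_bits, cr_bits, image_format):
    """Assemble the substreams of a coding unit.

    For YUV420/YUV422, the two chroma channels' syntax elements and
    coefficients travel together in the second substream and the third
    substream is cancelled; otherwise three substreams are kept.
    """
    if image_format in ("YUV420", "YUV422"):
        return {"substream1": y_bits,
                "substream2": cb_bits + cr_bits}   # merged chroma payload
    return {"substream1": y_bits,
            "substream2": cb_bits,
            "substream3": cr_bits}
```

The decoding side applies the same format test to decide whether to expect two or three substreams before demultiplexing.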
  • FIG. 23 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 23, the method includes S1301 to S1303.
  • the decoding end parses the code stream obtained by encoding the coding unit.
  • the decoding end determines the merged substreams.
  • the merged substream is obtained by merging the substreams encoded by coding blocks of at least two preset channels among multiple channels when the image format of the image to be processed is a preset format.
  • the decoding end decodes the code stream based on the merged substream.
  • Embodiment 8:
  • Figure 24 is a schematic flowchart of yet another image encoding method provided by an embodiment of the present application. As shown in Figure 24, the method includes S1401 to S1402.
  • the encoding end obtains the encoding unit.
  • the coding unit includes coding blocks of multiple channels.
  • S1402: the encoding end encodes a preset codeword in a target substream that meets a preset condition, until the target substream no longer meets the preset condition.
  • the target substream is a substream in the multiple substreams.
  • the multiple substreams are code streams obtained by encoding the coding blocks of multiple channels.
  • the preset codeword may be "0" or other codewords, etc. This embodiment of the application does not limit this.
  • the preset condition includes: the number of substream bits is less than a preset first bit number threshold.
  • the first bit number threshold can be preset at the encoding/decoding end, or transmitted in the code stream by the encoding/decoding end. The embodiments of the present application do not limit this.
  • the first bit number threshold is used to indicate the minimum number of bits allowed for a substream among the multiple substreams.
  • the preset condition includes: there is an encoded coding block whose number of bits is less than a preset second bit number threshold in the code stream of the coding unit.
  • the second bit number threshold can also be preset at the encoding/decoding end, or transmitted in the code stream by the encoding/decoding end. The embodiments of the present application do not limit this.
  • the second bit number threshold is used to indicate the minimum number of bits allowed for an encoded coding block in the code stream of the coding unit.
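The codeword-stuffing step (S1402) under the first preset condition can be sketched as follows, modeling the substream as a bit string purely for illustration:

```python
def stuff_substream(bits: str, first_bit_threshold: int, pad: str = "0") -> str:
    """Append the preset codeword to a target substream until it no longer
    meets the preset condition (fewer bits than the first bit-number
    threshold).

    The preset codeword "0" follows the example in the text; it could be
    another codeword.
    """
    while len(bits) < first_bit_threshold:   # condition still met: keep padding
        bits += pad
    return bits
```

The decoding end, knowing the same threshold, can compute how many padded codewords were inserted, delete them, and decode the remaining code stream.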
  • FIG. 25 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 25, the image decoding method includes S1501 to S1503.
  • the decoding end parses the bit stream after encoding the encoding unit.
  • the coding unit includes coding blocks of multiple channels.
  • the decoding end determines the number of code words.
  • the number of codewords is used to indicate the number of preset codewords encoded in the target substream that satisfies the preset condition.
  • the preset codeword is encoded into the target substream when there is a target substream that satisfies the preset conditions.
  • the decoding end decodes the code stream based on the number of code words.
  • the decoding end may delete the preset codeword based on the number of codewords, and decode the code stream after the preset codeword is deleted.
  • Improvement plan: use the preset expansion rate as a threshold to control the current actual expansion rate.
  • Embodiment 9:
  • Figure 26 is a schematic flowchart of yet another image coding method provided by an embodiment of the present application. As shown in Figure 26, the image coding method includes S1601 to S1603.
  • the encoding end obtains the encoding unit.
  • the coding unit includes coding blocks of multiple channels.
  • the encoding end determines the target encoding mode corresponding to each of the encoding blocks of the multiple channels based on the preset expansion rate.
  • the preset expansion rate can be preset at the encoding end and the decoding end.
  • the preset expansion rate can also be encoded into a sub-stream by the encoding end and transmitted to the decoding end.
  • the embodiments of the present application do not limit this.
  • the encoding end encodes the encoding block of each channel according to the target encoding mode, so that the current expansion rate is less than or equal to the preset expansion rate.
  • the preset expansion rate includes a first preset expansion rate; the current expansion rate is equal to the quotient of the number of bits of the maximum substream and the number of bits of the minimum substream, where the maximum substream and the minimum substream are the substreams with the largest and smallest numbers of bits, respectively, among the multiple substreams obtained by encoding the coding blocks of the multiple channels.
  • the encoder may preset the first preset expansion rate as the maximum threshold allowed within a coding unit. Before encoding the coding unit, it obtains the states of all current substreams, identifying the substream in the maximum state (the one whose total of already-sent fixed-length code streams plus the current substream is largest) and the substream in the minimum state (the one whose total is smallest).
  • the encoder may then stop using rate-distortion optimization as the sole criterion for selecting the target mode, instead selecting a coding mode with a lower code rate for the substream in the maximum state and a mode with a higher code rate for the substream in the minimum state.
  • the coding blocks of the three channels Y/U/V are encoded to obtain three sub-streams.
  • the encoding end can obtain the states of the three sub-streams.
  • the encoding end can select a mode with a larger bit rate for the coding block of the Y channel and a mode with a smaller bit rate for the coding block of the U channel.
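The mode-selection rule above can be sketched greedily; the candidate rate lists, the ordering convention, and the names are illustrative assumptions:

```python
def choose_modes(substream_bits, mode_rates):
    """Greedy sketch of the first-expansion-rate rule: the channel whose
    substream is currently largest gets its cheapest candidate mode, the
    channel whose substream is smallest gets its most expensive one, and
    every other channel keeps its rate-distortion-optimal choice
    (assumed to be listed first in `mode_rates`)."""
    n = len(substream_bits)
    largest = max(range(n), key=lambda i: substream_bits[i])
    smallest = min(range(n), key=lambda i: substream_bits[i])
    return [min(mode_rates[ch]) if ch == largest
            else max(mode_rates[ch]) if ch == smallest
            else mode_rates[ch][0]
            for ch in range(n)]
```

With current substream sizes `[900, 300, 500]` bits, the first channel (largest substream) is steered to its cheapest mode and the second (smallest substream) to its most expensive one.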
  • the preset expansion rate includes a second preset expansion rate; the current expansion rate is equal to the quotient of the number of bits of the coding block with the largest number of bits and the number of bits of the coding block with the smallest number of bits among the coding blocks of the multiple channels.
  • the encoding end may preset the second preset expansion rate as the maximum threshold allowed within a coding unit, and obtain the code rate cost of each mode before encoding a given coding block. If the coding mode with the optimal rate cost obtained by rate-distortion optimization keeps the actual expansion rate of each encoded substream below the preset expansion rate, that mode is used as the target coding mode of the coding block and is encoded into the substream corresponding to the coding block; if it cannot keep the actual expansion rate of each encoded substream below the preset expansion rate, the encoding end changes the mode with the largest code rate to a mode with a lower code rate, or changes the mode with the smallest code rate to a mode with a higher code rate.
  • the coding blocks of the three Y/U/V channels are encoded to obtain three sub-streams.
  • the optimal code rate costs of the three channels Y/U/V are rate_y, rate_u, and rate_v respectively, with rate_y being the largest and rate_u the smallest.
  • if rate_y/rate_u ≥ A_th, the encoding end can modify the target coding mode of the Y channel so that its rate cost is less than rate_y, or modify the target coding mode of the U channel so that its rate cost is greater than rate_u, so that rate_y/rate_u < A_th and rate_y/rate_v < A_th.
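The A_th adjustment can be sketched as an iterative repair of the per-channel rate costs; the candidate mode sets and the repair order are assumptions made for illustration:

```python
def enforce_ratio(rates, a_th, candidates):
    """Sketch of the second-expansion-rate rule: while the ratio between the
    largest and smallest per-block rate cost reaches the threshold `a_th`,
    replace the most expensive block's mode with a cheaper candidate, or the
    cheapest block's mode with a costlier one. `candidates[ch]` lists the
    admissible rate costs for channel ch."""
    rates = list(rates)
    while max(rates) / min(rates) >= a_th:
        hi = rates.index(max(rates))
        cheaper = [r for r in candidates[hi] if r < rates[hi]]
        if cheaper:
            rates[hi] = max(cheaper)  # smallest admissible reduction
            continue
        lo = rates.index(min(rates))
        costlier = [r for r in candidates[lo] if r > rates[lo]]
        if not costlier:
            raise ValueError("no admissible mode keeps the ratio below a_th")
        rates[lo] = min(costlier)
    return rates
```

With costs `[120, 30, 60]` and `A_th = 3`, the Y-channel mode is swapped for its cheaper candidate (cost 80), bringing the ratio under threshold.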
  • the image encoding method provided by the embodiments of the present application can also use the preset expansion rate to steer the target coding mode selected for the coding block of each channel, so that when the encoding end encodes each channel according to the target coding mode, the actual expansion rate of the coding unit is less than the preset expansion rate, thereby reducing the actual expansion rate.
  • the image encoding method may further include: the encoding end determines the current expansion rate; when the current expansion rate is greater than the first preset expansion rate, the preset codeword is encoded into the minimum substream so that the current expansion rate is less than or equal to the first preset expansion rate.
  • the encoding end can fill preset codewords at the end of the minimum substream.
  • the image encoding method may further include: the encoding end determines the current expansion rate; when the current expansion rate is greater than the second preset expansion rate, the encoding end encodes the preset codeword in the coding block with the smallest number of bits, so that the current expansion rate is less than or equal to the second preset expansion rate.
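The padding remedy can be sketched for the first-expansion-rate case as follows; the bit counts and the codeword width are illustrative assumptions:

```python
def pad_min_substream(substream_bits, codeword_bits, max_ratio):
    """Append preset codewords to whichever substream is currently smallest
    until max_bits / min_bits no longer exceeds `max_ratio`. Returns the
    per-substream bit counts and how many codewords were added."""
    bits = list(substream_bits)
    added = 0
    while max(bits) / min(bits) > max_ratio:
        bits[bits.index(min(bits))] += codeword_bits
        added += 1
    return bits, added
```

Starting from `[100, 20, 60]` bits with 8-bit codewords and a threshold of 2, four codewords are appended to the smallest substream before the ratio drops to 100/52 ≤ 2.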
  • Figure 27 is a schematic flowchart of yet another image decoding method provided by an embodiment of the present application. As shown in Figure 27, the method includes S1701 to S1703.
  • the decoding end analyzes the code stream after encoding the coding unit.
  • the decoding end determines the number of preset codewords based on the code stream.
  • the decoding end decodes the code stream based on the number of preset codewords.
  • for S1701 to S1703, reference may be made to the above descriptions of S1501 to S1503, which will not be repeated here.
  • the embodiment of the present application also provides an image encoding device, and any of the above image encoding methods can be executed by the image encoding device.
  • the image encoding device provided by the embodiment of the present application may be the above-mentioned source device 10 or the video encoder 102.
  • Figure 28 is a schematic diagram of the composition of an image encoding device provided by an embodiment of the present application. As shown in Figure 28, the image encoding device includes an acquisition module 2801 and a processing module 2802.
  • the acquisition module 2801 is used to acquire a coding unit; the coding unit is an image block in the image to be processed; the coding unit includes coding blocks of multiple channels; the multiple channels include a first channel; the first channel is any one of the multiple channels.
  • the processing module 2802 is used to encode the coding block of the first channel according to a first coding mode; the first coding mode is a mode for encoding sample values in the coding block of the first channel according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to represent the number of bits required to store each sample in the image to be processed.
  • the acquisition module 2801 is also used to acquire the code stream after encoding the encoding unit; the encoding unit is an image block in the image to be processed; the encoding unit includes encoding blocks of multiple channels; the multiple channels include the first channel; the first channel is any one of the multiple channels; the code stream includes multiple sub-streams encoded by the encoding blocks of the multiple channels and corresponding to the multiple channels one-to-one.
  • the processing module 2802 is also configured to decode the substream corresponding to the first channel according to a first decoding mode; the first decoding mode is a mode in which sample values are parsed from the substream corresponding to the first channel according to the first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed.
  • the acquisition module 2801 is also used to acquire a coding unit; the coding unit is an image block in the image to be processed; the coding unit includes coding blocks of multiple channels.
  • the processing module 2802 is also used to determine the first total code length; the first total code length is the total code length of the first code stream obtained by encoding the coding blocks of the multiple channels according to their respective corresponding target coding modes; the target coding mode includes a first coding mode; the first coding mode is a mode for encoding sample values in a coding block according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed; when the first total code length is greater than or equal to the remaining size of the code stream buffer, the coding blocks of the multiple channels are encoded in fallback mode; the mode flags of the fallback mode and the first coding mode are the same.
  • the processing module 2802 is also configured to encode mode flags in multiple substreams obtained by encoding encoding blocks of multiple channels; the mode flag is used to indicate the encoding mode adopted by each encoding block of multiple channels.
  • the multiple channels include a first channel; the first channel is any one of the multiple channels; the processing module 2802 is specifically configured to encode a sub-mode flag in the substream obtained by encoding the coding block of the first channel; the sub-mode flag is used to indicate the kind of fallback mode used by the coding blocks of the multiple channels.
  • the plurality of channels include a first channel; the first channel is any one of the plurality of channels; the processing module 2802 is specifically configured to encode a first flag, a second flag, and a third flag in the substream obtained by encoding the coding block of the first channel; the first flag is used to indicate that the coding blocks of the multiple channels are encoded in the first coding mode or the fallback mode; the second flag is used to indicate that the coding blocks of the multiple channels are encoded in a target mode; the target mode is either the first coding mode or the fallback mode; when the second flag indicates that the target mode adopted by the coding blocks of the multiple channels is the fallback mode, the third flag is used to indicate the kind of fallback mode adopted by the coding blocks of the multiple channels.
  • the processing module 2802 is also used to determine the target coding code length of the coding unit based on the remaining size of the code stream buffer and the target pixel depth BPP; the target coding code length is used to indicate the code length required to encode the coding unit; determine the allocated code length of the multiple channels based on the coding code length; the allocated code length is used to indicate the code length required to encode the residuals of the coding blocks of the multiple channels; and determine the coding code length allocated to each channel based on the average value of the allocated code length across the multiple channels.
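One possible reading of this budget rule can be sketched as follows; the cap on the rate target, the header reservation, and the even split across channels are all assumptions made for illustration:

```python
def allocate_code_length(buffer_remaining_bits, bpp, num_pixels,
                         num_channels, header_bits):
    """Hedged sketch of the code-length budget: the coding unit's target
    code length is capped by both the rate target (bpp * pixels) and the
    remaining buffer space; after reserving header bits, the residual
    budget is split evenly across the channels."""
    target = min(buffer_remaining_bits, bpp * num_pixels)
    residual_budget = max(target - header_bits, 0)
    per_channel = residual_budget // num_channels
    return target, per_channel
```

For a 64-pixel unit at 4 bits per pixel with 500 buffer bits remaining and 16 header bits, the target is 256 bits and each of three channels receives 80 residual bits.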
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; if the mode flag is parsed from the substreams obtained by encoding the coding blocks of the multiple channels, and the first total code length is greater than the remaining size of the code stream buffer, the target decoding mode of the substream is determined to be the fallback mode; the mode flag is used to indicate that the coding blocks of the multiple channels are encoded in the first coding mode or the fallback mode; the first total code length is the total code length of the first code stream obtained after the coding blocks of the multiple channels are encoded according to their respective corresponding target coding modes; the target coding mode includes the first coding mode; the first coding mode is a mode of encoding the sample values in a coding block according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed.
  • the processing module 2802 is also used to determine the target decoding code length of the coding unit based on the remaining size of the code stream buffer and the target pixel depth BPP; the target decoding code length is used to indicate the code length required to decode the code stream of the coding unit; determine the allocated code length of the multiple channels based on the decoding code length; the allocated code length is used to indicate the code length required to decode the residual code streams of the coding blocks of the multiple channels; and determine the decoding code length allocated to each channel based on the average value of the allocated code length across the multiple channels.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; the multiple channels include a first channel; the first channel is any one of the multiple channels; a first flag is parsed from the substream obtained by encoding the coding block of the first channel; the first flag is used to indicate that the coding blocks of the multiple channels are encoded in the first coding mode or the fallback mode; the first coding mode is a mode of encoding the sample values in the coding block of the first channel according to a first fixed-length code; the code length of the first fixed-length code is less than or equal to the image bit width of the image to be processed; the image bit width is used to characterize the number of bits required to store each sample in the image to be processed; a second flag is parsed from the substream obtained by encoding the coding block of the first channel; the second flag is used to indicate the target mode in which the coding blocks of the multiple channels are encoded.
  • the processing module 2802 is also used to determine the target decoding code length of the coding unit based on the remaining size of the code stream buffer and the target pixel depth BPP; the target decoding code length is used to indicate the code length required to decode the code stream of the coding unit; determine the allocated code length of the multiple channels based on the decoding code length; the allocated code length is used to indicate the code length required to decode the residual code streams of the coding blocks of the multiple channels; and determine the decoding code length allocated to each channel based on the average value of the allocated code length across the multiple channels.
  • the acquisition module 2801 is also used to acquire a coding unit; the coding unit includes coding blocks of multiple channels.
  • the processing module 2802 is also configured to encode the coding block of at least one channel among the multiple channels according to the intra block copy (IBC) mode; obtain the block vector (BV) of a reference prediction block; the BV of the reference prediction block is used to indicate the position of the reference prediction block in an already-encoded image block; the reference prediction block is used to represent the prediction value of a coding block encoded according to the IBC mode; and encode the BV of the reference prediction block in at least one substream obtained by encoding the coding block of the at least one channel in the IBC mode.
  • the processing module 2802 is specifically configured to use the rate-distortion optimization decision to determine the target coding mode of at least one channel; the target coding mode includes the IBC mode; and encode the coding block of at least one channel according to the target coding mode.
  • the processing module 2802 is specifically configured to encode the coding blocks of the multiple channels according to the IBC mode, and to encode the BV of the reference prediction block in the substream obtained by encoding the coding block of each of the multiple channels in the IBC mode.
  • the processing module 2802 is specifically configured to encode the BV of the reference prediction block in the code stream obtained by IBC mode encoding of the encoding block of each channel in the plurality of channels according to a preset ratio.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; the code stream includes multiple substreams obtained by encoding the coding blocks of the multiple channels and corresponding one-to-one to the multiple channels; the position of the reference prediction block is determined in the multiple substreams based on the block vector BV of the reference prediction block parsed from at least one of the multiple substreams; the reference prediction block is used to represent the prediction value of a decoded block decoded according to the intra block copy IBC mode; the BV of the reference prediction block is used to indicate the position of the reference prediction block in an already-encoded image block; based on the position information of the reference prediction block, the prediction value of the decoded block decoded according to the IBC mode is determined; and based on the prediction value, the decoded block decoded according to the IBC mode is reconstructed.
  • the BV of the reference prediction block is encoded in multiple sub-streams; the BV of the reference prediction block is obtained when encoding blocks of multiple channels are encoded according to the IBC mode.
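The IBC reconstruction path above can be illustrated with a minimal 2-D copy; the coordinate convention, the row-major layout, and the names are assumptions for illustration only:

```python
def ibc_predict(reconstructed, bv, block_w, block_h, x, y):
    """Minimal intra-block-copy sketch: the prediction for the block at
    (x, y) is copied from the already-reconstructed area at the position
    displaced by the block vector bv = (dx, dy)."""
    dx, dy = bv
    rx, ry = x + dx, y + dy
    return [row[rx:rx + block_w] for row in reconstructed[ry:ry + block_h]]
```

For a 2x2 block at (2, 0) with BV (-2, 0), the prediction is copied from the top-left 2x2 region of the reconstructed samples.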
  • the acquisition module 2801 is also used to acquire the processing coefficient corresponding to the coding unit; the processing coefficient includes one or more of a residual coefficient and a transformation coefficient.
  • the processing module 2802 is also configured to divide the processing coefficients into multiple groups according to the number threshold, and the number of processing coefficients in each group of the multiple groups of processing coefficients is less than or equal to the number threshold.
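The grouping step can be sketched as plain fixed-size chunking, assuming the threshold acts as a simple maximum group size:

```python
def group_coefficients(coeffs, threshold):
    """Split the residual/transform coefficients of a coding unit into
    consecutive groups whose size never exceeds `threshold`."""
    return [coeffs[i:i + threshold] for i in range(0, len(coeffs), threshold)]
```

Seven coefficients with a threshold of 3 yield groups of sizes 3, 3, and 1, each satisfying the constraint.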
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit and determine the processing coefficients corresponding to the coding unit; the processing coefficients include one or more of residual coefficients and transform coefficients; the processing coefficients include multiple groups; the number of processing coefficients in each group is less than or equal to the number threshold; and the code stream is decoded based on the processing coefficients.
  • the acquisition module 2801 is further used to acquire a coding unit; the coding unit includes coding blocks of P channels; P is an integer greater than or equal to 2; the complexity information of the coding block of each channel in the P channels is acquired; the complexity information is used to characterize the degree of difference of the pixel values of the coding block of each channel.
  • the processing module 2802 is further used to encode the complexity information of the coding block of each channel in the substream obtained by encoding the coding blocks of the P channels.
  • the complexity information includes a complexity level; the processing module 2802 is specifically configured to separately encode the complexity level of the coding block of each channel in the substream obtained by encoding the coding blocks of the P channels.
  • the complexity information includes a complexity level and a first reference coefficient; the first reference coefficient is used to characterize the proportional relationship between the complexity levels of the coding blocks of different channels; the processing module 2802 is specifically used to encode the complexity level of the coding block of each of Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels, where Q is an integer less than P, and to encode the first reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels.
  • the complexity information includes a complexity level, a reference complexity level, and a second reference coefficient; the reference complexity level includes any of the following: a first complexity level, a second complexity level, and a third complexity level; the first complexity level is the maximum value among the complexity levels of the coding blocks of the P-Q channels among the P channels; Q is an integer less than P; the second complexity level is the minimum value among the complexity levels of the coding blocks of the P-Q channels among the P channels; the third complexity level is the average value of the complexity levels of the coding blocks of the P-Q channels among the P channels; the second reference coefficient is used to characterize the size relationship and/or the proportional relationship between the complexity levels of the coding blocks of the P-Q channels among the P channels; the processing module 2802 is specifically used to encode the complexity level of the coding block of each of the Q channels in the substreams obtained by encoding the coding blocks of the Q channels among the P channels, and to encode the reference complexity level and the second reference coefficient in the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of P channels; P is an integer greater than or equal to 2; the code stream includes multiple substreams obtained by encoding the coding blocks of the P channels and corresponding one-to-one to the P channels; the complexity information of the coding block of each channel is parsed from the substreams obtained by encoding the coding blocks of the P channels; the complexity information is used to characterize the degree of difference of the pixel values of the coding block of each channel; the quantization parameter of the coding block of each channel is determined based on the complexity information of the coding block of that channel; and the code stream is decoded based on the quantization parameters of the coding blocks of the channels.
  • the complexity information includes a complexity level; the processing module 2802 is also configured to parse the complexity level of the coding block of each channel from the substreams obtained by encoding the coding blocks of the P channels.
  • the complexity information includes a complexity level and a first reference coefficient; the first reference coefficient is used to characterize the proportional relationship between the complexity levels of the coding blocks of different channels; the processing module 2802 is specifically used to parse the complexity level of the coding block of each of the Q channels from the substreams obtained by encoding the coding blocks of the Q channels among the P channels, where Q is an integer less than P; parse the first reference coefficient from the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels; and determine the complexity levels of the coding blocks of the P-Q channels based on the first reference coefficient and the complexity levels of the coding blocks of the Q channels.
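A hypothetical sketch of how a decoder might derive the untransmitted complexity levels from the first reference coefficient; the exact mapping (here, a single scale applied to the mean of the transmitted levels) is an assumption, not specified by the disclosure:

```python
def derive_levels(q_levels, first_ref_coeff, num_missing):
    """Derive the complexity levels of the P-Q channels from the Q
    transmitted levels via a proportional coefficient: scale the mean
    of the transmitted levels and round to an integer level."""
    base = sum(q_levels) / len(q_levels)
    return [round(base * first_ref_coeff)] * num_missing
```

With transmitted levels `[2, 4]` and a coefficient of 2, the two remaining channels would each be assigned level 6 under this mapping.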
  • the complexity information includes a complexity level, a reference complexity level, and a second reference coefficient; the reference complexity level includes any of the following: a first complexity level, a second complexity level, and a third complexity level; the first complexity level is the maximum value among the complexity levels of the coding blocks of the P-Q channels among the P channels; Q is an integer less than P; the second complexity level is the minimum value among the complexity levels of the coding blocks of the P-Q channels among the P channels; the third complexity level is the average value of the complexity levels of the coding blocks of the P-Q channels among the P channels; the second reference coefficient is used to characterize the size relationship and/or the proportional relationship between the complexity levels of the coding blocks of the P-Q channels among the P channels; the processing module 2802 is specifically used to parse the complexity level of the coding block of each of the Q channels from the substreams obtained by encoding the coding blocks of the Q channels among the P channels; parse the reference complexity level and the second reference coefficient from the substreams obtained by encoding the coding blocks of the P-Q channels among the P channels; and determine the complexity levels of the coding blocks of the P-Q channels based on the complexity levels of the coding blocks of the Q channels, the reference complexity level, and the second reference coefficient.
  • the acquisition module 2801 is also used to acquire a coding unit; the coding unit includes coding blocks of multiple channels; the coding unit is an image block in the image to be processed.
  • the processing module 2802 is also configured to combine the substreams obtained by encoding the coding blocks of at least two preset channels among the multiple channels into one merged substream when the image format of the image to be processed is a preset format.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; the coding unit is an image block in the image to be processed; determine a merged substream; the merged substream is obtained by merging the substreams encoded from the coding blocks of at least two preset channels among the multiple channels when the image format of the image to be processed is a preset format; and decode the code stream based on the merged substream of the at least two substreams to be merged.
  • the acquisition module 2801 is also used to acquire a coding unit; the coding unit includes coding blocks of multiple channels.
  • the processing module 2802 is also used to encode preset codewords in a target substream that satisfies the preset condition until the target substream no longer satisfies the preset condition; the target substream is a substream among multiple substreams; the multiple substreams are code streams obtained by encoding the coding blocks of the multiple channels.
  • the preset condition includes: the number of substream bits is less than a preset first bit number threshold.
  • the preset condition includes: there is an encoded coding block whose number of bits is less than a preset second bit number threshold in the code stream of the coding unit.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; determine the number of codewords; the number of codewords is used to indicate the number of preset codewords encoded in the target substream that satisfies the preset condition; the preset codewords are encoded into the target substream when there is a target substream that satisfies the preset condition; and the code stream is decoded based on the number of codewords.
  • the acquisition module 2801 is also used to acquire a coding unit; the coding unit includes coding blocks of multiple channels.
  • the processing module 2802 is also used to determine the target coding mode corresponding to the coding block of each channel among the coding blocks of the multiple channels based on the preset expansion rate, and to encode the coding block of each channel according to the target coding mode so that the current expansion rate is less than or equal to the preset expansion rate.
  • the preset expansion rate includes a first preset expansion rate; the current expansion rate is equal to the quotient of the number of bits of the maximum substream and the number of bits of the minimum substream; the maximum substream is the substream with the largest number of bits among the multiple substreams obtained by encoding the coding blocks of the multiple channels; the minimum substream is the substream with the smallest number of bits among the multiple substreams obtained by encoding the coding blocks of the multiple channels.
  • the preset expansion rate includes a second preset expansion rate; the current expansion rate is equal to the quotient of the number of bits of the coding block with the largest number of bits and the number of bits of the coding block with the smallest number of bits among the coding blocks of the multiple channels.
  • the processing module 2802 is also used to determine the current expansion rate; when the current expansion rate is greater than the first preset expansion rate, the preset codeword is encoded in the minimum substream so that the current expansion rate is less than or equal to the first preset expansion rate.
  • the processing module 2802 is also used to determine the current expansion rate; when the current expansion rate is greater than the second preset expansion rate, the preset codeword is encoded in the coding block with the smallest number of bits, so that the current expansion rate is less than or equal to the second preset expansion rate.
  • the processing module 2802 is also used to parse the code stream obtained by encoding the coding unit; the coding unit includes coding blocks of multiple channels; the number of preset codewords is determined based on the code stream; and the code stream is decoded based on the number of preset codewords.
  • the present application also provides a readable storage medium, including execution instructions, which, when executed on an image coding and decoding device, enable the image coding and decoding device to execute any one of the methods provided in the above embodiments.
  • the present application also provides a computer program product including execution instructions, which, when executed on an image coding and decoding device, enables the image coding and decoding device to execute any one of the methods provided in the above embodiments.
  • the embodiment of the present application also provides a chip, including: a processor and an interface.
  • the processor is coupled to the memory through the interface. When the processor executes the computer program or execution instructions in the memory, any one of the methods provided in the above embodiments is executed.
  • the computer program product includes one or more computer-executable instructions.
  • the computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer-executable instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer-executable instructions can be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state disks (SSDs)), etc.

Abstract

This application provides an image encoding method, an image decoding method, an apparatus, and a storage medium. The encoding method includes: obtaining a coding unit, the coding unit being an image block in an image to be processed and including coding blocks of multiple channels, the multiple channels including a first channel, and the first channel being any one of the multiple channels; and encoding the coding block of the first channel according to a first coding mode. The method is applicable to image encoding and decoding processes and is used to solve the problem of substream buffers being set too large.

Description

Image encoding method, image decoding method, apparatus, and storage medium
Related Application
This application claims priority to Chinese Patent Application No. 202211146464.4, filed on September 20, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of video encoding and decoding, and in particular to an image encoding method, an image decoding method, an apparatus, and a storage medium.
Background
To improve encoder performance, a technique called substream parallelism has been proposed.
Substream parallelism means using multiple entropy encoders to encode the syntax elements of different channels to obtain multiple substreams, filling the multiple substreams into their respective substream buffers, and interleaving the substreams in the substream buffers into a bitstream (which may also be called a code stream) according to a preset interleaving rule.
However, because of the dependencies between substreams, different substream buffers fill at different rates: at any given moment, a faster-filling substream buffer holds more substream bits than a slower-filling one. To guarantee the integrity of the filled data, all substream buffers must be made relatively large, which increases hardware cost.
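As a toy illustration of the substream parallelism described above, the following sketch drains per-channel substream buffers round-robin in fixed-size chunks into one bitstream; the chunk size and the round-robin rule are assumptions for illustration, as the real interleaving rule is preset by the codec and not specified here:

```python
from collections import deque

def interleave(substream_buffers, chunk_bytes):
    """Toy substream multiplexer: each entropy coder fills its own buffer,
    and the multiplexer drains the non-empty buffers round-robin in
    fixed-size chunks into a single bitstream."""
    queues = [deque(buf) for buf in substream_buffers]
    out = bytearray()
    while any(queues):
        for q in queues:
            for _ in range(min(chunk_bytes, len(q))):
                out.append(q.popleft())
    return bytes(out)
```

Note how uneven buffer fill levels surface directly in the output: the buffer holding four bytes still contributes after the two-byte buffer is exhausted, which is why slower-filling buffers force larger worst-case sizing.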
发明内容
基于上述技术问题,本申请提供一种图像编码方法和图像解码方法、装置、设备及存储介质,可以通过多种改进后的编解码模式来进行编码,合理配置子流缓冲区的空间,降低硬件成本。
第一方面,本申请提供一种图像编码方法,所述方法包括:
获取编码单元;所述编码单元包括多个通道的编码块;
在满足预设条件的目标子流中编码预设码字,直至所述目标子流不满足所述预设条件;所述目标子流是多个子流中的子流;所述多个子流是对所述多个通道的编码块进行编码得到的码流。
第二方面,本申请提供一种图像编码方法,所述方法包括:
获取编码单元;所述编码单元包括多个通道的编码块;
基于预设膨胀率对所述每个通道的编码块进行编码,以使得当前膨胀率小于或等于所述预设膨胀率,其中,所述预设膨胀率包括第一预设膨胀率;所述当前膨胀率的值由最大子流的比特数与最小子流的比特数之商导出;所述最大子流为对所述多个通道的编码块进行编码得到的多个子流中,比特数最大的子流;所述最小子流为对所述多个通道的编码块进行编码得到的多个子流中,比特数最小的子流。
第三方面,本申请提供一种图像编码方法,所述方法包括:
获取编码单元;所述编码单元包括多个通道的编码块;
按照帧内块复制IBC模式对所述多个通道中的至少一个通道的编码块进行编码;
获取参考预测块的块向量BV;所述参考预测块的BV用于指示所述参考预测块在已编码的图像块中的位置;所述参考预测块用于表征按照所述IBC模式编码的编码块的预测值;
在所述至少一种通道的编码块经所述IBC模式编码得到的至少一个子流中编码所述参考预测块的BV。
第四方面,本申请提供一种图像编码方法,所述方法包括:
获取编码单元;所述编码单元为待处理图像中的图像块;所述编码单元包括多个通道的编码块;
确定第一总码长；所述第一总码长为所述多个通道的编码块均按照各自对应的目标编码模式编码后得到的第一码流的总码长；所述目标编码模式包括第一编码模式；所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式；所述第一定长码的码长小于或等于所述待处理图像的图像位宽；所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数；
当所述第一总码长大于或等于码流缓冲区的剩余大小时,将所述多个通道的编码块按照回退模式进行编码;所述回退模式和所述第一编码模式的模式标志相同。
第五方面,本申请提供一种图像解码方法,所述方法包括:
解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;基于所述码字数量,对所述码流进行解码。
第六方面,本申请提供一种图像解码方法,所述方法包括:
解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
确定所述当前膨胀率;
根据所述当前膨胀率和第一预设膨胀率确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;
基于所述码字数量,对所述码流进行解码。
第七方面,本申请提供一种图像解码方法,所述方法包括:
解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;所述码流包括所述多个通道的编码块编码后的、与所述多个通道一一对应的多个子流;
基于从所述多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在所述多个子流中确定出所述参考预测块的位置;所述参考预测块用于表征按照帧内块复制IBC模式进行解码的解码块的预测值;所述参考预测块的BV用于指示所述参考预测块在已重建的图像块中的位置;
基于所述参考预测块的位置信息,确定所述按照IBC模式进行解码的解码块的预测值;
基于所述预测值,对所述按照IBC模式进行解码的解码块进行重建。
第八方面,本申请提供一种图像解码方法,所述方法包括:
解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
若从所述多个通道的编码块编码得到的子流中解析出模式标志，且第二总码长大于码流缓冲区的剩余大小，则确定所述子流的目标解码模式为回退模式；所述模式标志用于指示所述多个通道的编码块是否采用回退模式进行编码；所述第二总码长为所述多个通道的编码块均按照第一编码模式进行编码后得到的码流的总码长；所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式；所述第一定长码的码长小于或等于所述待处理图像的图像位宽；所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数；
解析所述子流中的预设标志位,确定目标回退子模式;所述目标回退子模式为所述回退模式中的一种;所述预设标志位用于指示所述多个通道的编码块编码时所采用的回退模式的种类;所述回退模式包括第一回退模式和第二回退模式;
按照所述目标回退子模式对所述子流进行解码。
第九方面,本申请提供一种图像解码方法,所述方法包括:
获取对编码单元编码后的码流;
按照第一解码模式对第一通道对应的子流进行解码。
第十方面,本申请提供一种图像解码方法,所述方法包括:
获取解码单元;所述解码单元为待处理图像中的图像块,所述解码单元包括多个通道的编码块;
确定第一总码长;所述第一总码长为所述多个通道的解码块均按照各自对应的目标解码模式进行解码后得到的第一码流的总码长;
当所述第一总码长大于或等于码流缓冲区的剩余大小时，将所述多个通道的解码块按照回退模式进行解码。
第十一方面,本申请提供一种图像解码方法,所述方法包括:
获取解码单元;所述解码单元包括多个通道的编码块;
按照IBC模式对所述多个通道中的至少一个通道的解码块进行解码;
获取参考预测块的BV;
在至少一种通道的解码块经IBC模式编码得到的至少一个子流中解码参考预测块的BV。
第十二方面,本申请提供一种图像编码装置,所述装置包括:
获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
处理模块,用于在满足预设条件的目标子流中编码预设码字,直至所述目标子流不满足所述预设条件;所述目标子流是多个子流中的子流;所述多个子流是对所述多个通道的编码块进行编码得到的码流。
第十三方面,本申请提供一种图像编码装置,所述装置包括:
获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
处理模块,用于基于预设膨胀率对所述每个通道的编码块进行编码,以使得当前膨胀率小于或等于所述预设膨胀率。
第十四方面,本申请提供一种图像编码装置,所述装置包括:
获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
处理模块,用于按照帧内块复制IBC模式对所述多个通道中的至少一个通道的编码块进行编码;获取参考预测块的块向量BV;所述参考预测块的BV用于指示所述参考预测块在已编码的图像块中的位置;所述参考预测块用于表征按照所述IBC模式编码的编码块的预测值;在所述至少一种通道的编码块经所述IBC模式编码得到的至少一个子流中编码所述参考预测块的BV。
第十五方面,本申请提供一种图像编码装置,所述装置包括:
获取模块,用于获取编码单元;所述编码单元为待处理图像中的图像块;所述编码单元包括多个通道的编码块;
处理模块，用于确定第一总码长；所述第一总码长为所述多个通道的编码块均按照各自对应的目标编码模式编码后得到的第一码流的总码长；所述目标编码模式包括第一编码模式；所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式；所述第一定长码的码长小于或等于所述待处理图像的图像位宽；所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数；当所述第一总码长大于或等于码流缓冲区的剩余大小时，将所述多个通道的编码块按照回退模式进行编码；所述回退模式和所述第一编码模式的模式标志相同。
第十六方面,本申请提供一种图像解码装置,所述装置包括:
处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;基于所述码字数量,对所述码流进行解码。
第十七方面,本申请提供一种图像解码装置,所述装置包括:
处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;确定所述当前膨胀率;根据所述当前膨胀率和第一预设膨胀率确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;基于所述码字数量,对所述码流进行解码。
第十八方面,本申请提供一种图像解码装置,所述装置包括:
处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;所述码流包括所述多个通道的编码块编码后的、与所述多个通道一一对应的多个子流;基于从所述多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在所述多个子流中确定出所述参考预测块的位置;所述参考预测块用于表征按照帧内块复制IBC模式进行解码的解码块的预测值;所述参考预测块的BV用于指示所述参考预测块在已重建的图像块中的位置;基于所述参考预测块的位置信息,确定所述按照IBC模式进行解码的解码块的预测值;基于所述预测值,对所述按照IBC模式进行解码的解码块进行重建。
第十九方面,本申请提供一种图像解码装置,所述装置包括:
处理模块，用于解析对编码单元编码后的码流；所述编码单元包括多个通道的编码块；若从所述多个通道的编码块编码得到的子流中解析出模式标志，且第二总码长大于码流缓冲区的剩余大小，则确定所述子流的目标解码模式为回退模式；所述模式标志用于指示所述多个通道的编码块是否采用回退模式进行编码；所述第二总码长为所述多个通道的编码块均按照第一编码模式进行编码后得到的码流的总码长；所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式；所述第一定长码的码长小于或等于所述待处理图像的图像位宽；所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数；解析所述子流中的预设标志位，确定目标回退子模式；所述目标回退子模式为所述回退模式中的一种；所述预设标志位用于指示所述多个通道的编码块编码时所采用的回退模式的种类；所述回退模式包括第一回退模式和第二回退模式；按照所述目标回退子模式对所述子流进行解码。
第二十方面,本申请提供一种视频编码器,该视频编码器包括处理器和存储器;存储器存储有处理器可执行的指令;处理器被配置为执行指令时,使得视频编码器实现上述第一方面至第四方面中的图像编码方法。
第二十一方面,本申请提供一种视频解码器,该视频解码器包括处理器和存储器;存储器存储有处理器可执行的指令;处理器被配置为执行指令时,使得视频解码器实现如上述第五方面至第十一方面中的图像解码方法。
第二十二方面,本申请提供一种计算机程序产品,当该计算机程序产品在图像编码装置上运行时,使得图像编码装置执行上述第一方面至第四方面中的图像编码方法。
第二十三方面,本申请提供一种可读存储介质,该可读存储介质包括:软件指令;当软件指令在图像编码装置中运行时,使得图像编码装置实现上述第一方面至第四方面中的图像编码方法,当软件指令在图像解码装置中运行时,使得图像解码装置实现如上述第五方面至第十一方面中的图像解码方法。
第二十四方面,本申请提供一种芯片,该芯片包括处理器和接口,处理器通过接口与存储器耦合,当处理器执行存储器中的计算机程序或图像编码装置执行指令时,使得上述第一方面至第四方面任意一个方面所述的方法被执行。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为子流并行技术的框架示意图;
图2为编码端子流交织过程示意图;
图3为子流交织单元的格式的示意图;
图4为解码端反向子流交织过程示意图;
图5为编码端子流交织过程的另一种示意图;
图6为本申请实施例提供的视频编解码系统的组成示意图;
图7为本申请实施例提供的视频编码器的组成示意图;
图8为本申请实施例提供的视频解码器的结构示意图;
图9为本申请实施例提供的一种视频编解码的流程示意图;
图10为本申请实施例提供的图像编码装置和图像解码装置的组成示意图;
图11为本申请实施例提供的一种图像编码方法的流程示意图;
图12为本申请实施例提供的一种图像解码方法的流程示意图;
图13为本申请实施例提供的另一种图像解码方法的流程示意图;
图14为本申请实施例提供的另一种图像解码方法的流程示意图;
图15为本申请实施例提供的又一种图像解码方法的流程示意图;
图16为本申请实施例提供的又一种图像编码方法的流程示意图;
图17为本申请实施例提供的又一种图像编码方法的流程示意图;
图18为本申请实施例提供的又一种图像解码方法的流程示意图;
图19为本申请实施例提供的又一种图像编码方法的流程示意图;
图20为本申请实施例提供的又一种图像解码方法的流程示意图;
图21为本申请实施例提供的又一种图像编码方法的流程示意图;
图22为本申请实施例提供的又一种图像编码方法的流程示意图;
图23为本申请实施例提供的又一种图像解码方法的流程示意图;
图24为本申请实施例提供的又一种图像编码方法的流程示意图;
图25为本申请实施例提供的又一种图像解码方法的流程示意图;
图26为本申请实施例提供的又一种图像编码方法的流程示意图;
图27为本申请实施例提供的又一种图像解码方法的流程示意图;
图28为本申请实施例提供的图像编码装置的组成示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
在本申请的描述中,除非另有说明,“/”表示“或”的意思,例如,A/B可以表示A或B。本文中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。此外,“至少一个”是指一个或多个,“多个”是指两个或两个以上。“第一”、“第二”等字样并不对数量和执行次序进行限定,并且“第一”、“第二”等字样也并不限定一定不同。
需要说明的是,本申请中,“示例性地”或者“例如”等词用于表示作例子、例证或说明。本申请中被描述为“示例性地”或者“例如”的任何实施例或设计方案不应被解释为比其他实施例或设计方案更优选或更具优势。确切而言,使用“示例性地”或者“例如”等词旨在以具体方式呈现相关概念。
为了提升编码器的性能,提出了一种名为子流并行(或者也可以称作子流交织)的技术。
对于编码端来说,子流并行是指编码端将编码单元(coding unit,CU)的不同通道(例如亮度通道、第一色度通道、以及第二色度通道等)的编码块(coding block,CB)的语法元素使用多个熵编码器编码得到多个子流,并以固定大小的数据包将多个子流交织成比特流。相对应地,对于解码端来说,子流并行是指解码端使用不同的熵解码器并行解码不同的子流。
示例性地，图1为子流并行技术的框架示意图。如图1所示，以编码端为例，子流并行技术的具体应用时序在对语法元素(例如变换系数、以及量化系数等)进行编码之后。图1中的其他部分的编码流程可以参照下述本申请实施例提供的视频编解码系统中所述，此处不再赘述。
示例性地,图2为编码端子流交织过程示意图。如图2所示,以待编码的图像块包括三个通道为例,则编码模块(例如预测模块、变换模块、以及量化模块等)可以输出该三个通道的语法元素和量化后的变换系数,之后由熵编码器1、熵编码器2、以及熵编码器3分别对该三个通道的语法元素和量化后的变换系数进行编码,得到每个通道对应的子流,并将三个通道各自对应的子流对应压入编码子流缓冲区1、编码子流缓冲区2、以及编码子流缓冲区3。子流交织模块可以对编码子流缓冲区1、编码子流缓冲区2、以及编码子流缓冲区3中的子流进行交织,最终输出多个子流交织后的比特流(或者也可以称作码流)。
示例性地,图3为子流交织单元的格式的示意图。如图3所示,子流可以由子流交织单元组成,子流交织单元又可以称作子流片(substream segment)。子流片的长度为N比特,子流片包括M比特的数据头和N-M比特的数据主体。
其中,数据头用于指示当前子流片隶属的子流。N可以取512,M可以取2。
示例性地,图4为解码端反向子流交织过程示意图。如图4所示,同样以编码端进行编码的待编码的图像块包括三个通道为例,则编码端输出的比特流输入解码端后,解码端中的子流交织模块可以先对比特流进行反向子流交织过程,将比特流分解为三个通道各自对应的子流,并将三个通道各自对应的子流对应压入解码子流缓冲区1、解码子流缓冲区2、以及解码子流缓冲区3。例如,以上述图3所示的子流片为例,解码端可以从比特流中每次提取N比特长度的数据包。通过解析其中M比特的数据头,得到当前子流片隶属的目标子流,并将当前子流片中余下的数据主体放入目标子流对应的解码子流缓冲区。
熵解码器1可以对解码子流缓冲区1中的子流进行解码,得到一个通道的语法元素和量化后的变换系数;熵解码器2可以对解码子流缓冲区2中的子流进行解码,得到另一个通道的语法元素和量化后的变换系数;熵解码器3可以对解码子流缓冲区3中的子流进行解码,得到又一个通道的语法元素和量化后的变换系数,最后将该三个通道各自的语法元素和量化后的变换系数输入后续解码模块进行解码处理,得到经解码的图像。
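上述按N比特子流片(M比特数据头+N-M比特数据主体)进行打包与解包的过程，可用如下Python示意代码理解(N、M取上文示例值512与2；pack_segment、unpack_segment等名称仅为本示例假设，并非标准定义)：

```python
N, M = 512, 2  # 子流片总长与数据头长度(比特)，取自上文示例

def pack_segment(ss_idx: int, payload_bits: str) -> str:
    """编码端：将子流编号与N-M比特数据主体拼接为N比特子流片。"""
    assert 0 <= ss_idx < (1 << M) and len(payload_bits) == N - M
    header = format(ss_idx, f"0{M}b")   # M比特数据头，指示当前子流片隶属的子流
    return header + payload_bits

def unpack_segment(segment_bits: str):
    """解码端：解析M比特数据头得到目标子流，返回(子流编号, 数据主体)。"""
    assert len(segment_bits) == N
    return int(segment_bits[:M], 2), segment_bits[M:]

seg = pack_segment(2, "1" * (N - M))
idx, body = unpack_segment(seg)
```

解码端据此仅凭数据头即可将数据主体放入对应的解码子流缓冲区。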
以下以编码端为例对子流交织的过程进行介绍。
在编码子流缓冲区中的每个子流片均包括了至少1个图像块编码产生的编码比特。子流交织时,可以先对每个子流片的数据主体的最初比特所对应的图像块进行顺序标记,在子流交织过程中,按照该顺序标记对不同子流进行交织。
在一实施例中,该子流交织过程可以通过块计数队列对图像块进行标记。
例如，块计数队列通过先进先出队列的方式实现。编码端设置对当前已编码的图像块的计数block count，并对每个子流设置一个块计数队列counter queue[ss_idx]。在每个条带开始编码时，初始化block count为0，初始化各个counter queue[ss_idx]为空，然后在各个counter queue[ss_idx]各自压入一个0。
在每个图像块(或者说编码单元(coding unit,CU))完成编码操作后，对块计数队列进行更新，更新过程如下：
步骤1、令当前已编码的图像块的计数+1,也即block count+=1。
步骤2、选择一个子流ss_idx。
步骤3、计算该子流ss_idx对应的编码子流缓冲区中可构建的子流片的个数num_in_buffer[ss_idx]。设该子流ss_idx对应的编码子流缓冲区为buffer[ss_idx],buffer[ss_idx]中包括的数据量为buffer[ss_idx].fullness,同样以上述图3所示的子流片的大小为N比特,子流片中包括M比特的数据头为例,则num_in_buffer[ss_idx]可以根据下述公式(1)计算得到:
num_in_buffer[ss_idx]=buffer[ss_idx].fullness/(N–M)      公式(1)
其中,“/”表示整除。
步骤4、比较当前块计数队列长度num_in_queue[ss_idx]和编码子流缓冲区中可构建的子流片的个数num_in_buffer[ss_idx]，若两者相等，则将当前已编码的图像块的计数压入该块计数队列，也即counter_queue[ss_idx].push(block_count)。
步骤5、返回步骤2,处理下一个子流,直至所有子流均处理完毕。
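上述块计数队列的更新步骤1至步骤5可用如下Python片段示意(以各编码子流缓冲区的数据量fullness与普通列表模拟队列，函数与变量名为本示例假设)：

```python
N, M = 512, 2  # 子流片总长与数据头长度(比特)

def update_counter_queues(block_count, buffers, counter_queues):
    """每个图像块完成编码后，按上文步骤1至步骤5更新各子流的块计数队列。"""
    block_count += 1                                 # 步骤1：已编码图像块计数+1
    for ss_idx, fullness in enumerate(buffers):      # 步骤2/5：逐个子流处理
        num_in_buffer = fullness // (N - M)          # 步骤3：公式(1)，整除
        if len(counter_queues[ss_idx]) == num_in_buffer:   # 步骤4：相等则压入计数
            counter_queues[ss_idx].append(block_count)
    return block_count

queues = [[0], [0], [0]]                 # 条带开始时各队列先压入一个0
count = update_counter_queues(0, [510, 1020, 100], queues)
```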
在对块计数队列更新完毕后,编码端可以对各个编码子流缓冲区中的子流进行交织,交织过程如下:
步骤1、选择一个子流ss_idx。
步骤2、判断该子流ss_idx对应的编码子流缓冲区buffer[ss_idx]中包括的数据量buffer[ss_idx].fullness是否大于或等于N-M。若是,则执行步骤3;若否,则执行步骤6。
步骤3、判断该子流ss_idx的块计数队列中的队首元素值是否为所有子流的块计数队列中的最小值。若是,则执行步骤4;若否,则执行步骤6。
步骤4、以当前编码子流缓冲区中的数据构建一个子流片。例如,从编码子流缓冲区buffer[ss_idx]中取出长度为N-M比特的数据,并添加M比特的数据头,数据头中的数据为ss_idx,将M比特的数据头和取出的N-M比特的数据拼接为N比特的子流片,将子流片送入编码端最终输出的比特流。
步骤5、弹出(或者说删除)该子流ss_idx的块计数队列的队首元素,也即counter_queue[ss_idx].pop()。
步骤6、返回步骤1,处理下一个子流,直至所有子流均处理完毕。
在一实施例中,若当前图像块为一个条带(slice)的最后一个图像块,则编码端在上述交织过程之后,还可以执行下述步骤,将编码子流缓冲区中剩余的数据进行打包:
步骤1、判断目前所有的编码子流缓冲区是否存在至少一个非空。若是,则执行步骤2;若否,则结束。
步骤2、选择一个子流ss_idx。
步骤3、判断该子流ss_idx的块计数队列的队首元素值是否为所有子流的块计数队列的最小值。若是，则执行步骤4；若否，则执行步骤6。
步骤4、若该子流ss_idx对应的编码子流缓冲区buffer[ss_idx]中的数据量不足N-M比特,则向该编码子流缓冲区buffer[ss_idx]中填入0,直至该编码子流缓冲区buffer[ss_idx]中的数据达到N-M比特。同时,弹出(或者说删除)该子流的块计数队列的队首元素值,也即counter_queue[ss_idx].pop(),并压入一个表示数据范围内最大值的MAX_INT,也即counter_queue[ss_idx].push(MAX_INT)。
步骤5、构建一个子流片。此处的步骤5可以参照上述子流交织过程中的步骤4,不再赘述。
步骤6、返回步骤2,处理下一个子流。若所有子流均处理完毕,则返回步骤1。
示例性地,图5为编码端子流交织过程的另一种示意图。如图5所示,同样以待编码的图像块包括三个通道为例,则编码子流缓冲区可以分别包括编码子流缓冲区1、编码子流缓冲区2、以及编码子流缓冲区3。
编码子流缓冲区1中的子流片从前至后依次为1_1、1_2、1_3、以及1_4。编码子流缓冲区1对应的子流1的块计数队列1中的标记依次为12、13、27、以及28。编码子流缓冲区2中的子流片从前至后依次为2_1和2_2。编码子流缓冲区2对应的子流2的块计数队列2中的标记依次为5和71。编码子流缓冲区3中的子流片从前至后依次为3_1、3_2、以及3_3。编码子流缓冲区3对应的子流3的块计数队列3中的标记依次为6、13、以及25。则子流交织模块可以根据块计数队列1、块计数队列2、以及块计数队列3中的标记顺序对编码子流缓冲区1、编码子流缓冲区2、以及编码子流缓冲区3中的子流片进行交织。
交织后的码流中的子流片的顺序依次为:最小标记5对应的子流片2_1、标记6对应的子流片3_1、标记12对应的子流片1_1、标记13对应的子流片1_2、标记13对应的子流片3_2、标记25对应的子流片3_3、标记27对应的子流片1_3、标记28对应的子流片1_4、以及标记71对应的子流片2_2。
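图5所示“按各块计数队列队首最小标记出片”的交织顺序，可用如下示意代码复现(假设队首标记相等时按子流编号从小到大出片，与图5中标记13先出子流1再出子流3的顺序一致；数据结构为本示例假设)：

```python
from collections import deque

def interleave(queues, segments):
    """queues[i]为子流i的块计数队列，segments[i]为其子流片名列表，按队首最小标记交织。"""
    qs = [deque(q) for q in queues]
    segs = [deque(s) for s in segments]
    order = []
    while any(segs):
        # 在仍有子流片的子流中，选队首标记最小者(并列时取编号较小的子流)
        ss = min((i for i in range(len(segs)) if segs[i]),
                 key=lambda i: (qs[i][0], i))
        order.append(segs[ss].popleft())
        qs[ss].popleft()
    return order

order = interleave(
    [[12, 13, 27, 28], [5, 71], [6, 13, 25]],
    [["1_1", "1_2", "1_3", "1_4"], ["2_1", "2_2"], ["3_1", "3_2", "3_3"]],
)
```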
但是,多个子流交织时存在依赖性,以编码端为例,不同的子流缓冲区中子流的填充速度不同,同一时间填充速度较快的子流缓冲区比填充速度较慢的子流缓冲区中填充的子流位数多,填充速度较快的子流缓冲区在等待填充速度较慢的子流缓冲区填充够一个子流片的过程中同时还在继续填充数据,为了保证填充数据的完整性,需要将所有的子流缓冲区设置得较大,增加了硬件成本。
在这种情况下,本申请实施例提供一种图像编码方法和图像解码方法、装置及存储介质,可以通过多种改进后的编码模式进行编码,从而控制最大编码块的比特数和最小编码块的比特数,以使得经过编码后的待编码块的理论膨胀率降低,缩小不同子流缓冲区之间的填充速度差异,从而减小子流缓冲区的预设空间大小,降低硬件成本。
以下结合附图进行介绍。
图6为本申请实施例提供的视频编解码系统的组成示意图。如图6所示,视频编解码系统包括源装置10和目的装置11。
源装置10产生经过编码后的视频数据,源装置10也可以被称为编码端、视频编码端、视频编码装置、或视频编码设备等,目的装置11可以对源装置10产生的经过编码后的视频数据进行解码,目的装置11也可以被称为解码端、视频解码端、视频解码装置、或视频解码设备等。源装置10和/或目的装置11可包含至少一个处理器以及耦合到所述至少一个处理器的存储器。该存储器可包含但不限于只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、带电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、快闪存储器或可用于以可由计算机存取的指令或数据结构的形式存储所要的程序代码的任何其它媒体,本申请实施例对此不作具体限定。
源装置10和目的装置11可以包括各种装置,包含桌上型计算机、移动计算装置、笔记本(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机或其类似者等电子设备。
目的装置11可经由链路12从源装置10接收经编码视频数据。链路12可包括能够将经编码视频数据从源装置10移动到目的装置11的一个或多个媒体和/或装置。在一个实例中,链路12可包括使得源装置10能够实时地将编码后的视频数据直接发射到目的装置11的一个或多个通信媒体。在此实例中,源装置10可根据通信标准(例如:无线通信协议)来调制编码后的视频数据,并且可以将调制后的视频数据发射到目的装置11。上述一个或多个通信媒体可包含无线和/或有线通信媒体,例如:射频(Radio Frequency,RF)频谱、一个或多个物理传输线。上述一个或多个通信媒体可形成基于分组的网络的一部分,基于分组的网络例如为局域网、广域网或全球网络(例如,因特网)等。上述一个或多个通信媒体可以包含路由器、交换器、基站,或者实现从源装置10到目的装置11的通信的其它设备。
在另一实例中,源装置10可将编码后的视频数据从输出接口103输出到存储装置13。类似地,目的装置11可通过输入接口113从存储装置13存取编码后的视频数据。存储装置13可包含多种本地存取式数据存储媒体,例如蓝光光盘、高密度数字视频光盘(Digital Video Disc,DVD)、只读光盘(Compact Disc Read-Only Memory,CD-ROM)、快闪存储器,或用于存储经编码视频数据的其它合适数字存储媒体。
在另一实例中,存储装置13可对应于文件服务器或存储由源装置10产生的编码后的视频数据的另一中间存储装置。在此实例中,目的装置11可经由流式传输或下载从存储装置13获取其存储的视频数据。文件服务器可为任何类型的能够存储经编码的视频数据并且将经编码的视频数据发射到目的装置11的服务器。例如,文件服务器可以包含全球广域网(World Wide Web,Web)服务器(例如,用于网站)、文件传送协议(File Transfer Protocol,FTP)服务器、网络附加存储(Network Attached Storage,NAS)装置以及本地磁盘驱动器。
目的装置11可通过任何标准数据连接(例如,因特网连接)存取编码后的视频数据。数据连接的实例类型包含适合于存取存储于文件服务器上的编码后的视频数据的无线信道、有线连接(例如,缆线调制解调器等),或两者的组合。编码后的视频数据从文件服务器发射的方式可为流式传输、下载传输或两者的组合。
需要说明的是,本申请实施例提供的图像编码方法和图像解码方法不限于无线应用场景。
示例性地,本申请实施例提供的图像编码方法和图像解码方法可以应用于支持以下多种多媒体应用的视频编解码:空中电视广播、有线电视发射、卫星电视发射、流式传输视频发射(例如,经由因特网)、存储于数据存储媒体上的视频数据的编码、存储于数据存储媒体上的视频数据的解码,或其它应用。在一些实例中,视频编解码系统可经配置,以支持单向或双向视频发射,以支持例如视频流式传输、视频播放、视频广播及/或视频电话等应用。
需要说明的是,图6示出的视频编解码系统仅仅是视频编解码系统的示例,并不是对本申请中视频编解码系统的限定。 本申请提供的图像编码方法和图像解码方法还可适用于编码装置与解码装置之间无数据通信的场景。在其它实施例中,待编码视频数据或编码后的视频数据可以从本地存储器检索,也可以在网络上流式传输等。视频编码装置可对待编码视频数据进行编码并且将编码后的视频数据存储到存储器,视频解码装置也可从存储器中获取编码后的视频数据并且对该编码后的视频数据进行解码。
在图6的实施例中,源装置10包含视频源101、视频编码器102和输出接口103。在一些实施例中,输出接口103可包含调制器/解调器(调制解调器)和/或发射器。视频源101可包括视频捕获装置(例如,摄像机)、含有先前捕获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频输入接口,和/或用于产生视频数据的计算机图形系统,或视频数据的这些来源的组合。
视频编码器102可对来自视频源101的视频数据进行编码。在一些实施例中,源装置10经由输出接口103将编码后的视频数据直接发射到目的装置11。在其它实施例中,编码后的视频数据还可存储到存储装置13上,供目的装置11稍后存取来用于解码和/或播放。
在图6的实施例中,目的装置11包含显示装置111、视频解码器112以及输入接口113。在一些实施例中,输入接口113包含接收器和/或调制解调器。输入接口113可经由链路12和/或从存储装置13接收编码后的视频数据。显示装置111可与目的装置11集成或可在目的装置11外部。一般来说,显示装置111显示解码后的视频数据。显示装置111可包括多种显示装置,例如,液晶显示器、等离子显示器、有机发光二极管显示器或其它类型的显示装置。
在一实施例中,视频编码器102和视频解码器112可各自与音频编码器和解码器集成,且可包含适当的多路复用器-多路分用器单元或其它硬件和软件,以处理共同数据流或单独数据流中的音频和视频两者的编码。
视频编码器102和视频解码器112可以包括至少一个微处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)、离散逻辑、硬件或其任何组合。若本申请提供的编码方法采用软件实现,则可将用于软件的指令存储在合适的非易失性计算机可读存储媒体中,且可使用至少一个处理器执行所述指令从而实施本申请。
本申请中的视频编码器102和视频解码器112可以根据视频压缩标准(例如HEVC)操作,也可根据其它业界标准操作,本申请对此不作具体限定。
图7为本申请实施例提供的视频编码器102的组成示意图。如图7所示,视频编码器102可以在预测模块21、变换模块22、量化模块23、熵编码模块24、编码子流缓冲区25、以及子流交织模块26分别进行预测、变换、量化、熵编码、以及子流交织的过程。其中,预测模块21、变换模块22、量化模块23也即上述图1中的编码模块。视频编码器102中还包括预处理模块20和求和器202,其中预处理模块20包括分割模块和码率控制模块。对于视频块重构建,视频编码器102也包括反量化模块27、反变换模块28、求和器201和参考图像存储器29。
如图7所示，视频编码器102接收视频数据，预处理模块20获取视频数据的输入参数。其中，该输入参数包括该视频数据中图像的分辨率、图像的采样格式、像素深度(bits per pixel,BPP)、位宽(或者也可以称作图像位宽)等信息。其中，BPP是指单位像素所占用的比特数。位宽是指单位像素中一个像素通道所占用的比特数。例如，以YUV三个像素通道的值表示一个像素，若每个像素通道占用8比特(bits)，则该像素的位宽为8，并且该像素的BPP为3×8=24bits。
预处理模块20中的分割模块将图像分割成原始块(或者也可以称作编码单元(CU))。该原始块(或者也可以称作编码单元(CU))可以包括多个通道的编码块。例如该多个通道可以是RGB通道或者YUV通道等。本申请实施例对此不作限制。在一实施例中,此分割也可包含分割成条带(slice)、图像块或其它较大单元,以及根据最大编码单元(Largest Coding Unit,LCU)及CU的四叉树结构进行视频块分割。示例性的,视频编码器102编码在待编码的视频条带内的视频块的组件。一般的,条带可划分成多个原始块(且可能划分成称作图像块的原始块的集合)。通常在分割模块中确定CU、PU以及TU的尺寸。此外,分割模块还用于确定码率控制单元的尺寸。该码率控制单元是指码率控制模块中的基本处理单元,例如在码率控制模块基于码率控制单元,为原始块计算复杂度信息,再根据复杂度信息计算原始块的量化参数。其中,分割模块的分割策略可以是预设的,也可以是编码过程中基于图像不断调整的。当分割策略是预设策略时,相应地,解码端中也预设相同的分割策略,从而获取相同的图像处理单元。该图像处理单元为上述任意一种图像块,且与编码侧一一对应。当分割策略在编码过程中基于图像不断调整时,该分割策略可以直接或间接地编入码流,相应地,解码端从码流中获取相应参数,得到相同的分割策略,获取相同的图像处理单元。
预处理模块20中的码率控制模块用于生成量化参数以使得量化模块23和反量化模块27进行相关计算。其中,码率控制模块在计算量化参数过程中,可以获取原始块的图像信息进行计算,例如上述输入信息;还可以获取求和器201经重构得到的重建值进行计算,本申请对此不作限制。
预测模块21可将预测块提供到求和器202以产生残差块,且将该预测块提供到求和器201经重构得到重建块,该重建块用于后续进行预测的参考像素。其中,视频编码器102通过原始块的像素值减去预测块的像素值来形成像素差值,该像素差值即为残差块,该残差块中的数据可包含亮度差及色度差。求和器201表示执行此减法运算的一个或多个组件。预测模块21还可将相关的语法元素发送至熵编码模块24用于合并至码流。
变换模块22可将残差块划分为一个或多个TU进行变换。变换模块22可将残差块从像素域转换到变换域(例如，频域)。例如，使用离散余弦变换(discrete cosine transform,DCT)或离散正弦变换(discrete sine transform,DST)将残差块经变换得到变换系数。变换模块22可将所得变换系数发送到量化模块23。
量化模块23可基于量化单元进行量化。其中,量化单元可以与上述CU、TU、PU相同,也可以在分割模块中进一步地划分。量化模块23对变换系数进行量化以进一步减小码率得到量化系数。其中,量化过程可减少与系数中的一些或全部相关联的比特深度。可通过调整量化参数来修改量化的程度。在一些可行的实施方式中,量化模块23可接着执行包含经量化变换系数的矩阵的扫描。替代的,熵编码模块24可执行扫描。
在量化之后，熵编码模块24可熵编码量化系数。例如，熵编码模块24可执行上下文自适应性可变长度编码(context-adaptive variable-length coding,CAVLC)、上下文自适应性二进制算术编码(context-based adaptive binary arithmetic coding,CABAC)、基于语法的上下文自适应性二进制算术编码(syntax-based context-adaptive binary arithmetic coding,SBAC)、概率区间分割熵(probability interval partitioning entropy,PIPE)编码或另一熵编码方法或技术。在通过熵编码模块24进行熵编码之后得到子流，通过多个熵编码模块24进行熵编码后可以得到多个子流，该多个子流经过编码子流缓冲区25、以及子流交织模块26进行子流交织过后得到码流，该码流传输到视频解码器112或存档以供稍后传输或由视频解码器112检索。
其中,子流交织的过程可以参照上述图1至图5处所述,此处不再赘述。
反量化模块27及反变换模块28分别应用反量化与反变换，求和器201将反变换后的残差块和预测块相加以产生重建块，该重建块用作后续原始块进行预测的参考像素。该重建块存储于参考图像存储器29中。
图8为本申请实施例提供的视频解码器112的结构示意图。如图8所示,视频解码器112包括子流交织模块30、解码子流缓冲区31、熵解码模块32、预测模块33、反量化模块34、反变换模块35、求和器301和参考图像存储器36。
其中,熵解码模块32包括解析模块和码率控制模块。在一些可行的实施方式中,视频解码器112可执行与关于来自图7的视频编码器102描述的编码流程的示例性地互逆的解码流程。
在解码过程期间,视频解码器112从视频编码器102接收经编码的视频的码流。并通过子流交织模块30对该码流进行反向子流交织,得到多个子流,并该多个子流经过各自对应的解码子流缓冲区,流入各自对应的熵解码模块32中。以一个子流为例,视频解码器112的熵解码模块32中的解析模块对该子流进行熵解码,以产生量化系数和语法元素。熵解码模块32将语法元素转递到预测模块33。视频解码器112可在视频条带层级和/或视频块层级处接收语法元素。
熵解码模块32中的码率控制模块根据解析模块得到的待解码图像的信息,生成量化参数以使得反量化模块34进行相关计算。码率控制模块还可以根据求和器301经重构得到的重建块,以计算量化参数。
反量化模块34对子流中所提供且通过熵解码模块32所解码的量化系数以及所生成的量化参数进行反量化(例如,解量化)。反量化过程可包含使用通过视频编码器102针对视频条带中的每一视频块所计算的量化参数确定量化的程度,且同样地确定应用的反量化的程度。反变换模块35将反变换(例如,DCT、DST等变换方法)应用于反量化后的变换系数,将反量化后的变换系数按照反变换单元在像素域中产生反变换后的残差块。其中,反变换单元的尺寸与TU的尺寸相同,反变换方法与变换方法采用同样的变换方法中相应的正变换与反变换,例如,DCT、DST的反变换为反DCT、反DST或概念上类似的反变换过程。
预测模块33生成预测块后，视频解码器112将来自反变换模块35的反变换后的残差块与预测块求和来形成经解码视频块。求和器301表示执行此求和运算的一个或多个组件。在需要时，也可应用解块滤波器来对经解码块进行滤波以便去除块效应伪影。给定帧或图像中的经解码的图像块存储于参考图像存储器36中，作为后续进行预测的参考像素。
本申请实施例提供一种可能的视频(图像)编解码的实现方式,参见图9,图9为本申请实施例提供的一种视频编解码的流程示意图,该视频编解码实现方式包括过程①至过程⑤,过程①至过程⑤可以由上述的源装置10、视频编码器102、目的装置11或视频解码器112中的任意一个或多个执行。
过程①:将一帧图像分成一个或多个互相不重叠的并行编码单元。该一个或多个并行编码单元间无依赖关系,可完全并行/独立编码和解码,比如并行编码单元1和并行编码单元2。
过程②:对于每个并行编码单元,可再将其分成一个或多个互相不重叠的独立编码单元,各个独立编码单元间可相互不依赖,但可以共用一些并行编码单元头信息。
独立编码单元既可以是包括亮度Y、第一色度Cb、第二色度Cr三个通道,或RGB三个通道,也可以仅包含其中的某一个通道。若独立编码单元包含三个通道,则这三个通道的尺寸可以完全一样,也可以不一样,具体与图像的输入格式相关。该独立编码单元也可以理解为每个并行编码单元所包含N个通道形成的一个或多个处理单元。例如上述Y、Cb、Cr三个通道即为构成该并行编码单元的三个通道,其分别可以为一个独立编码单元,或者Cb和Cr可以统称为色度通道,则该并行编码单元包括亮度通道构成的独立编码单元,以及色度通道构成的独立编码单元。
过程③:对于每个独立编码单元,可再将其分成一个或多个互相不重叠的编码单元,独立编码单元内的各个编码单元可相互依赖,如多个编码单元可以进行相互参考预编解码。
若编码单元与独立编码单元尺寸相同(即独立编码单元仅分成一个编码单元),则其尺寸可为过程②所述的所有尺寸。
编码单元既可以是包括亮度Y、第一色度Cb、第二色度Cr三个通道(或RGB三通道),也可以仅包含其中的某一个通道。若包含三个通道,几个通道的尺寸可以完全一样,也可以不一样,具体与图像输入格式相关。
值得注意的是，过程③是视频编解码方法中一个可选的步骤，视频编/解码器可以对过程②获得的独立编码单元的残差系数(或残差值)进行编/解码。
过程④：对于编码单元，可再将其分成一个或多个互相不重叠的预测组(prediction group,PG)，PG也可简称为Group，各个PG按照选定预测模式进行编解码，得到PG的预测值，组成整个编码单元的预测值，基于预测值和编码单元的原始值，获得编码单元的残差值。
过程⑤：基于编码单元的残差值，对编码单元进行分组，获得一个或多个互不重叠的残差小块(residual block,RB)，各个RB的残差系数按照选定模式进行编解码，形成残差系数流。具体的，可分为对残差系数进行变换和不进行变换两类。
其中,过程⑤中残差系数编解码方法的选定模式可以包括,但不限于下述任一种:半定长编码方式、指数哥伦布(Golomb)编码方法、Golomb-Rice编码方法、截断一元码编码方法、游程编码方法、直接编码原始残差值等。
例如,视频编码器可直接对RB内的系数进行编码。
又如,视频编码器也可对残差块进行变换,如DCT、DST、Hadamard变换等,再对变换后的系数进行编码。
作为一种可能的示例,当RB较小时,视频编码器可直接对RB内的各个系数进行统一量化,再进行二值化编码。若RB较大,可进一步划分为多个系数组(coefficient group,CG),再对各个CG进行统一量化,再进行二值化编码。在本申请的一些实施例中,系数组(CG)和量化组(QG)可以相同。
下面以半定长编码方式对残差系数编码的部分进行示例性说明。首先，将一个RB块内残差绝对值的最大值定义为修整最大值(modified maximum,mm)。其次，确定该RB块内残差系数的编码比特数(同一个RB块内残差系数的编码比特数一致)。例如，若当前RB块的关键限值(critical limit,CL)为2，当前残差系数为1，则编码残差系数1需要2个比特，表示为01。若当前RB块的CL为7，则表示编码8-bit的残差系数和1-bit的符号位。CL的确定是去找满足当前子块所有残差都在[-2^(M-1),2^(M-1)]范围之内的最小M值。若同时存在-2^(M-1)和2^(M-1)两个边界值，则M应增加1，即需要M+1个比特编码当前RB块的所有残差；若仅存在-2^(M-1)和2^(M-1)两个边界值中的一个，则需要编码一个Trailing位来确定该边界值是-2^(M-1)还是2^(M-1)；若所有残差均不存在-2^(M-1)和2^(M-1)中的任何一个，则无需编码该Trailing位。
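上段确定CL(最小M值)与Trailing位的规则可示意如下(critical_limit为本示例假设的函数名，返回编码比特数与是否需要Trailing位)：

```python
def critical_limit(residuals):
    """按上文半定长编码规则返回(编码比特数, 是否需要Trailing位)。"""
    m = 1
    # 找到满足所有残差都在[-2^(M-1), 2^(M-1)]范围之内的最小M值
    while not all(-(1 << (m - 1)) <= r <= (1 << (m - 1)) for r in residuals):
        m += 1
    lo, hi = -(1 << (m - 1)), 1 << (m - 1)
    has_lo, has_hi = lo in residuals, hi in residuals
    if has_lo and has_hi:
        return m + 1, False   # 两个边界值同时出现：需M+1比特，无Trailing位
    if has_lo or has_hi:
        return m, True        # 只出现一个边界值：M比特并编码1个Trailing位
    return m, False           # 无边界值：M比特即可

bits, trailing = critical_limit([1, -1, 0])   # 对应上文CL为2的例子
```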
另外,对于某些特殊的情况,视频编码器也可以直接编码图像的原始值,而不是残差值。
上述视频编码器102以及视频解码器112也可以通过另外一种实现形态来实现，例如，采用通用的数字处理系统实现。参见图10，图10为本申请实施例提供的编解码装置的组成示意图，该编解码装置可以是上述视频编码器102中的部分装置，也可以是上述视频解码器112中的部分装置。该编解码装置可以应用于编码侧(或者说编码端)，也可以应用于解码侧(或者说解码端)。如图10所示，该编解码装置包括处理器41和存储器42。处理器41与存储器42相连接(例如通过总线43互相连接)。在一实施例中，该编解码装置还可以包括通信接口44，通信接口44连接处理器41和存储器42，用于接收/发送数据。
处理器41,用于执行存储器42中存储的指令,以实现本申请下述实施例提供的图像编码方法和图像解码方法。处理器41可以是中央处理器(central processing unit,CPU)、通用处理器网络处理器(network processor,NP)、数字信号处理器(digital signal processing,DSP)、微处理器、微控制器、可编程逻辑器件(programmable logic device,PLD)或它们的任意组合。处理器41还可以是其它任意具有处理功能的装置,例如电路、器件或软件模块,本申请实施例对此不作限制。在一种示例中,处理器41可以包括一个或多个CPU,例如图10中的CPU0和CPU1。作为一种可选的实现方式,电子设备可以包括多个处理器,例如,除处理器41之外,还可以包括处理器45(图10中以虚线为例示出)。
存储器42，用于存储指令。例如，指令可以是计算机程序。在一实施例中，存储器42可以是只读存储器(read-only memory,ROM)或可存储静态信息和/或指令的其他类型的静态存储设备，也可以是随机存取存储器(random access memory,RAM)或者可存储信息和/或指令的其他类型的动态存储设备，还可以是电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备等，本申请实施例对此不作限制。
需要说明的是,存储器42可以独立于处理器41存在,也可以和处理器41集成在一起。存储器42可以位于编解码装置内,也可以位于编解码装置外,本申请实施例对此不作限制。
总线43,用于在编解码装置所包括的各个部件之间传送信息。总线43可以是工业标准体系结构(industry standard architecture,ISA)线路、外部设备互连(peripheral component interconnect,PCI)线路或扩展工业标准体系结构(extended industry standard architecture,EISA)线路等。总线43可以分为地址线路、数据线路、控制线路等。为便于表示,图10中仅用一条实线表示,但并不表示仅有一根线路或一种类型的线路。
通信接口44,用于与其他设备或其它通信网络进行通信。该其它通信网络可以为以太网,无线接入网(radio access network,RAN),无线局域网(wireless local area networks,WLAN)等。通信接口44可以是模块、电路、收发器或者任何能够实现通信的装置。本申请实施例对此不作限制。
需要说明的是,图10中示出的结构并不构成对编解码装置的限定,除图10所示的部件之外,编解码装置可以包括比图示更多或更少的部件,或者某些部件的组合,或者不同的部件布置。
需要说明的是,本申请实施例提供的图像编码方法和图像解码方法的执行主体可以是上述编解码装置,或者,编解码装置中安装的提供编解码功能的应用程序(application,APP);或者,编解码装置中的CPU;又或者,编解码装置中用于执行图像编码方法和图像解码方法的功能模块。本申请实施例对此不作限制。为了描述简单,以下统一以编码端或解码端进行描述。
下面结合附图对本申请实施例提供的图像编码方法和图像解码方法进行介绍。
如背景技术、以及上述图1至图5处所述,不同的子流缓冲区填充子流的速度不同,同一时间填充速度较快的子流缓冲区比填充速度较慢的子流缓冲区中填充的子流位数多,为了保证填充数据的完整性,需要将所有的子流缓冲区设置得较大,增加了硬件成本。
为了对子流缓冲区进行合理配置,本申请实施例提出了一系列改进后的编码方式(例如编码模式/预测模式、复杂度信息传输、系数分组、码流排布等)来减小编码单元的膨胀率,从而缩小编码单元各个通道的编码块进行编码时的速度差异,可以合理配置子流缓冲区的空间,降低硬件成本。
其中,编码单元的膨胀率可以包括理论膨胀率和当前(CU)膨胀率(实际膨胀率)。理论膨胀率在编解码器确定后,通过理论推导即可得出,其值大于1。若编码单元存在某一通道的编码块编码后的子流的比特数为0,则计算当前膨胀率时需要除去该种情况。理论膨胀率=CU中理论比特数最大的CB的比特数/CU中理论比特数最小的CB的比特数。当前(CU)膨胀率=当前CU中实际比特数最大的CB的比特数/当前CU中实际比特数最小的CB的比特数。
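上述膨胀率的计算可示意如下(按上文约定，计算当前膨胀率时除去子流比特数为0的通道；函数名为本示例假设)：

```python
def inflation_rate(cb_bits):
    """膨胀率 = 比特数最大的CB的比特数 / 比特数最小的CB的比特数。"""
    nonzero = [b for b in cb_bits if b > 0]   # 比特数为0的通道不参与计算
    return max(nonzero) / min(nonzero)

rate = inflation_rate([96, 24, 0])   # 第三个通道子流比特数为0，计算时除去
```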
在一实施例中,相关的膨胀率还可以包括当前子流膨胀率。当前子流膨胀率=当前多个子流缓冲区中数据的比特数最大的子流缓冲区中的数据的比特数/当前多个子流缓冲区中数据的比特数最小的子流缓冲区中的数据的比特数。
需要说明的是,针对各个通道的编码块或者整个编码单元,编码端在选择编码模式(或者说预测模式)时,可以基于以下几种策略进行选择。以下实施例中的各个改进后的编码模式均可以按照下述几种策略选择确定。
策略1、计算比特消耗代价,并选择比特消耗代价最小的编码模式。
其中,比特消耗代价是指编/解码CU所需的比特数。比特消耗代价主要包括模式标志(或者说模式码字)的码长、编/解码工具信息码字的码长、残差码字的码长。
策略2、计算编码后的失真,并选择失真最小的编码模式。
其中,失真用于指示重建值和原始值之间的差异情况。计算失真可以采用平方误差和(sum of squared difference,SSD)、均方误差(mean squared error,MSE)、绝对误差和(时域)(sum of absolute difference,SAD)、绝对误差和(频域)(sum of absolute transformed difference,SATD)、以及峰值信噪比(peak signal to noise ratio,PSNR)等中的任意一项或多项计算。本申请实施例对此不作限制。
策略3、计算率失真代价,并选择率失真代价最小的编码模式。
其中,率失真代价是指比特消耗代价和失真的加权和,进行加权计算时,比特消耗代价的权重系数和编码后的失真的权重系数均可以预设在编码端中。本申请实施例对权重系数的具体数值不作限制。
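策略3的率失真代价选择可用如下片段示意(lam为比特消耗代价与失真之间的加权系数，具体取值由编码端预设，此处数值与候选模式仅为示例)：

```python
def select_mode(candidates, lam=1.0):
    """candidates: {模式名: (比特消耗代价, 失真)}；返回率失真代价最小的模式(策略3)。"""
    return min(candidates,
               key=lambda m: candidates[m][0] + lam * candidates[m][1])

# 示例数值：IBC代价40+10.0=50，原始值128+0.0=128，变换60+2.0=62
best = select_mode({"IBC": (40, 10.0), "原始值": (128, 0.0), "变换": (60, 2.0)})
```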
以下将通过一系列的实施例对本申请实施例提供的图像编码方法和图像解码方法中改进的编码模式进行介绍。
一、防膨胀模式修改。
可选的实施方案:编码端对编码单元的多个通道的编码块中的每个像素值以图像位宽作为定长码进行编码。
修改方案:编码端对编码单元的多个通道的编码块中的每个像素值以小于或等于图像位宽的定长码进行编码。
实施例一:
示例性地,图11为本申请实施例提供的一种图像编码方法的流程示意图。如图11所示,该图像编码方法包括S101至S102。
S101、编码端获取编码单元。
其中,编码端也即上述图6中的源装置10,或者源装置10中的视频编码器102,或者上述图10中的编解码装置等。本申请实施例对此不作限制。编码单元为待处理图像中的图像块(也即上述原始块)。编码单元包括多个通道的编码块;多个通道包括第一通道;第一通道为多个通道中的任意一个通道。例如,编码单元的尺寸可以为16×2×3,则该编码单元的第一通道的编码块的尺寸即为16×2。
S102、编码端按照第一编码模式对第一通道的编码块进行编码。
其中,第一编码模式为按照第一定长码对第一通道的编码块中的采样值进行编码的模式。第一定长码的码长小于或等于待处理图像的图像位宽。图像位宽用于表征存储待处理图像中每个样本所需的比特位数。第一定长码可以预设在编/解码端中,或者,由编码端确定后并写入码流头信息中传输至解码端。
应理解,防膨胀模式是指对多个通道的编码块中的原始像素值直接进行编码的模式,与其他编码模式相比,直接编码原始像素值所消耗的比特数通常较大,因此,理论膨胀率中的理论比特数最大的编码块通常来自于按照防膨胀模式进行编码的编码块。本申请实施例通过降低防膨胀模式中定长码的码长,来降低理论膨胀率中理论比特数最大的编码块的理论比特数(也即分式中的分子),从而降低理论膨胀率。理论膨胀率较低,各个通道的编码块编码子流的速度差异较小,因此设置子流缓冲区时,既不会将子流缓冲区较大,也不会导致子流缓冲区空间浪费,从而实现对子流缓冲区的合理配置。
在一实施例中,若待处理的图像为RGB图像,则编码端可以将其转化为YUV格式进行编码,或者,若待处理的图像为YUV图像,编码端也可以将其转化成RGB格式进行编码。本申请实施例对此不作限制。
相对应地,本申请实施例还提供一种图像解码方法。图12为本申请实施例提供的一种图像解码方法的流程示意图。如图12所示,该图像解码方法包括S201至S202。
S201、解码端获取对编码单元编码后的码流。
其中,对编码单元编码后的码流可以包括多个通道的编码块编码后的、与多个通道一一对应的多个子流。例如,该多个子流可以包括上述第一通道对应的子流(也即第一通道的编码块编码后的子流)。
S202、解码端按照第一解码模式对第一通道对应的子流进行解码。
其中,第一解码模式为按照第一定长码从第一通道对应的子流中解析样本值的模式。
在一实施例中,S202还可以具体包括:当第一定长码等于图像位宽时,解码端直接按照第一解码模式对第一通道对应的子流进行解码;当第一定长码小于图像位宽时,解码端对解析出来的第一通道的编码块中的像素值进行反量化。
其中，量化步长为1<<(bitdepth-fixed_length)。bitdepth表示图像位宽。fixed_length表示第一定长码的长度。<<表示左移运算。
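当第一定长码小于图像位宽时，上述量化步长1<<(bitdepth-fixed_length)对应的量化/反量化过程可示意如下(编码端以右移近似量化仅为本示例假设)：

```python
def quantize(sample, bitdepth, fixed_length):
    """编码端(示例假设)：以第一定长码编码前，按步长2^(bitdepth-fixed_length)量化样本值。"""
    return sample >> (bitdepth - fixed_length)

def dequantize(code, bitdepth, fixed_length):
    """解码端：当第一定长码小于图像位宽时，按量化步长1<<(bitdepth-fixed_length)反量化。"""
    return code << (bitdepth - fixed_length)

q = quantize(200, 8, 6)      # 位宽8、定长码6：量化步长为4
r = dequantize(q, 8, 6)
```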
二、回退模式修改。
可选的实施方案1:若当前码流缓冲区无法满足编码单元的所有通道均采用防膨胀模式进行编码时,此时防膨胀模式关闭,编码端选择原始值以外的模式,此时即使采用其他编码模式编码会发生膨胀(某一通道的编码块编码后的子流的比特数过大),也不能采用防膨胀模式。
改进方案1:保障防膨胀模式开启。
实施例二:
图13为本申请实施例提供的另一种图像编码方法的流程示意图。如图13所示，该图像编码方法包括S301至S303。
S301、编码端获取编码单元。
其中,编码单元为待处理图像中的图像块,编码单元包括多个通道的编码块。
S302、编码端确定第一总码长。
其中,第一总码长为多个通道的编码块均按照各自对应的目标编码模式进行编码后得到的第一码流的总码长。目标编码模式包括第一编码模式,第一编码模式为按照第一定长码对编码块中的像素值进行编码的模式,第一定长码的码长小于或等于待处理图像的图像位宽,图像位宽用于表征存储待处理图像中每个样本所需的比特数。
例如,对多个通道中的每个通道,编码端均可以按照上述策略3确定每个编码模式的率失真代价,并确定率失真代价最低的模式作为通道的编码块对应的目标编码模式。
S303、当第一总码长大于或等于码流缓冲区的剩余大小时,编码端将多个通道的编码块按照回退模式进行编码。
其中，回退模式包括第一回退模式和第二回退模式。第一回退模式是指使用IBC模式获得参考预测块的块向量，然后计算残差并对残差量化，量化的步长基于码流缓冲区的剩余大小和目标像素深度(bits per pixel,BPP)确定。第二回退模式是指直接对像素点进行量化，量化的步长基于码流缓冲区的剩余大小和目标BPP确定。回退模式和第一编码模式的模式标志相同。
应理解,当采用其他编码模式编码时可能会出现编码后的残差过大的情况,导致采用该其他编码模式的编码块的比特数过大,而防膨胀模式是采用定长码编码,编码后的总码长固定,采用防膨胀模式可以避免上述残差过大的情况。本申请实施例提供了一种在回退模式和第一编码模式(防膨胀模式)的模式标志相同的情况下,通过判断码流缓冲区的剩余大小来向解码端告知采用的编码模式的方案,保障了防膨胀模式一直可以被选用,从而降低了理论膨胀率,理论膨胀率较低,各个通道的编码块编码子流的速度差异较小,因此设置子流缓冲区时,既不会导致子流缓冲区较大,也不会导致子流缓冲区空间浪费,从而实现对子流缓冲区的合理配置。
在一实施例中,该图像解码方法还可以包括:编码端在多个通道的编码块编码得到的多个子流中编码模式标志。
其中,模式标志用于指示多个通道的编码块各自采用的编码模式,第一编码模式的模式标志和回退模式的模式标志相同。
在一实施例中，多个通道的编码块编码后的子流中可以编码自身的模式标志。以多个通道中包括第一通道，第一通道为多个通道中的任意一个通道为例，在这种情况下，编码端在多个通道的编码块编码得到的多个子流中编码模式标志，可以包括：在第一通道的编码块编码得到的子流中编码子模式标志。
其中,子模式标志用于指示第一通道的编码块所采用的回退模式的种类,或者用于指示多个通道的编码块所采用的回退模式的种类。如上所述,回退模式可以包括第一回退模式和第二回退模式。此处不再赘述。
在一实施例中，同样以多个通道包括亮度通道为例，在这种情况下，编码端在多个通道的编码块编码得到的多个子流中编码模式标志，可以包括：在亮度通道的编码块编码得到的子流中编码第一标志、第二标志、以及第三标志。
其中,第一标志用于指示多个通道的编码块采用第一编码模式或者回退模式编码;第二标志用于指示多个通道的编码块采用目标模式编码,目标模式为第一编码模式和回退模式中的任意一种;当第二标志指示多个通道的编码块采用的目标模式为回退模式时,第三标志用于指示多个通道的编码块采用的回退模式的种类。
相对应地,本申请实施例还提供两种图像解码方法。图14为本申请实施例提供的另一种图像解码方法的流程示意图。如图14所示,该图像解码方法包括S401至S404。
S401、解码端解析对编码单元编码后的码流。
其中,编码单元包括多个通道的编码块。多个通道包括第一通道,第一通道为多个通道中的任意一个通道。
S402、若从多个通道的编码块编码得到的子流中解析出模式标志,且第二总码长大于码流缓冲区的剩余大小,则解码端确定该子流的目标解码模式为回退模式。
其中，第二总码长为多个通道的编码块均按照第一编码模式或者回退模式进行编码后得到的码流的总码长。
S403、解码端解析所述子流中的预设标志位,确定目标回退模式。
其中，目标回退模式为回退模式中的一种。预设标志位用于指示多个通道的编码块编码时所采用的回退模式的种类。回退模式包括第一回退模式和第二回退模式。
S404、解码端按照目标回退模式对该子流进行解码。
图15为本申请实施例提供的又一种图像解码方法的流程示意图。如图15所示,该图像解码方法包括S501至S505。
S501、解码端解析对编码单元编码后的码流。
S502、解码端从第一通道的编码块编码得到的子流中解析第一标志。
其中,第一通道为多个通道中的任意一个通道。第一标志可以参照上述编码方法中所述,此处不再赘述。
S503、解码端从第一通道的编码块编码得到的子流中解析第二标志。
其中,第二标志可以参照上述编码方法中所述,此处不再赘述。
S504、当第二标志指示多个通道的编码块采用的目标模式为回退模式时,解码端从第一通道的编码块编码得到的子流中解析第三标志。
其中,第三标志可以参照上述编码方法中所述,此处不再赘述。
S505、解码端根据第三标志指示的回退模式的种类确定多个通道的目标解码模式,并按照目标解码模式对多个通道的编码块编码得到的子流进行解码。
例如,解码端可以将第三标志指示的回退模式的种类作为多个通道的目标解码模式。
在一实施例中,该方法还可以包括:解码端基于码流缓冲区的剩余大小、以及目标像素深度BPP,确定编码单元的目标解码码长,目标解码码长用于指示解码编码单元的码流所需的码长;解码端基于解码码长确定多个通道的分配码长,分配码长用于指示解码多个通道的编码块的码流的残差所需的码长;解码端根据分配码长在多个通道的平均值,确定多个通道各自分配的解码码长。
基于上述实施例二的理解,以多个通道包括亮度(Y)通道、第一色度(U)通道、以及第二色度(V)通道为例,对回退模式的改进方案1中主要包括的两个方案进行介绍:
方案1:
编码端:编码端控制在任何时候,防膨胀模式(原始值模式)下的编码的比特消耗代价都是最大的,编码端不能选择比特消耗代价大于原始值的模式。因此所有分量下都需要编码模式,即使在回退模式下,不仅Y通道对应的子流(第一子流)需要编码模式,U/V通道对应的子流(第二子流和第三子流)也需要编码模式。保持回退模式码字与原始值模式码字相同。回退模式的具体种类(第一回退模式和第二回退模式)可以基于某一个通道编码。
解码端：首先解析三个通道的编码模式，当Y/U/V三个通道均为基于率失真代价选择的目标编码模式时，再根据码流缓冲区的剩余大小判断码流缓冲区是否允许三个通道均按照各自的目标编码模式解码。若不允许，则当前的解码模式为回退模式。若为回退模式，则可以基于某一个分量解析一个标志位，表示当前回退模式属于第一回退模式还是第二回退模式(三通道使用同一种回退模式)。因此当解码端有一个通道没有选择防膨胀模式时，当前CU就不会选择回退模式。
方案2:
编码端:编码端控制在任何时候,防膨胀模式(原始值模式)下的编码的比特消耗代价都是最大的,编码端不能选择比特消耗代价大于原始值的模式。原始值模式与回退模式依然使用相同的模式标志(也即上述第一标志),但在编码防膨胀模式和回退模式时,额外编码一个标志(也即上述第二标志),表示当前编码模式为防膨胀模式或回退模式中的一种,当该额外的标志指示当前编码模式为回退模式时,进一步编码一个标志位(也即上述第三标志),表示当前的回退模式为第一回退模式或者第二回退模式中的一种。当多个通道均选择回退模式时,多个通道的回退模式的种类保持一致。当有通道选择防膨胀模式时,在该通道对应的子流中编码模式标志,但无需编码是否为回退模式的标志(也即上述第二标志)。
解码端:从码流中解析出模式标志,若模式标志为防膨胀模式和回退模式共用的模式标志,则继续解析一个标志(也即上述第二标志),表示当前是否是回退模式。若是回退模式,则进一步解析一个标志(也即上述第三标志),表示当前的回退模式为第一回退模式以及第二回退模式中的一种。当解析出一个通道的回退模式的种类时,其他采用回退模式的通道与该通道采用相同的回退模式的种类。未采用回退模式的通道各自解析自身的模式。
如上所述，改进方案1通过上述实施例二可以降低膨胀率公式中的分子上限。在一实施例中，对于回退模式，还可以提高膨胀率公式中的分母下限。以下通过可选的实施方案2和改进方案2进行介绍。
可选的实施方案2：当确定使用回退模式进行编码时，编码端会给定一个目标BPP，在回退模式下三个通道的编码块对应的子流的比特数之和必须小于等于目标BPP。因此在进行码率分配时，将公共信息(块向量和模式标志)放在亮度通道上编码，分配的码长减去公共信息占用的码长之后，剩余的码长用于对多个通道的残差进行编码，因此需要对剩余的码长进行分配，分配的方法是将剩余的码长对多个通道的像素数进行整除(也即每个通道分配到的码长需要是编码单元的像素数的整数倍)，若无法整除，就将无法整除的部分码长分配给亮度通道。当剩余的码长较小时，无法被多个通道的像素数整除，第一色度通道和第二色度通道对应的子流中就无法分配到码长，导致对应的子流比特数较小，膨胀率较大。
改进方案2:将剩余的码长按照比特数平均分配到多个通道,提高分配精度。
实施例三:
图16为本申请实施例提供的又一种图像编码方法的流程示意图。如图16所示，在上述S301至S303的基础上，该图像编码方法还可以包括S601至S603。
S601、编码端基于码流缓冲区的剩余大小以及目标BPP,确定编码单元的目标编码码长。
其中,目标编码码长用于指示编码编码单元所需的码长。目标BPP可以由编码端获取,例如接收用户输入的目标BPP。
S602、基于目标编码码长,确定多个通道的分配码长。
其中,分配码长用于指示编码多个通道的编码块的残差所需的码长。
例如,如上所述,编码端可以分配码长减去公共信息所占用的码长,得到多个通道的分配码长。
S603、根据分配码长在多个通道的平均值,确定多个通道各自分配的编码码长。
在一实施例中,编码端还可以将上述块向量和模式标志在多个通道对应的子流中进行编码。
其中,模式标志可以参照上述,此处不再赘述。块向量可以参照下述IBC模式所述,此处不再赘述。
在一实施例中,若编码残差时,存在一个通道分配到的比特无法整除该通道的像素个数,也即无法对每个像素的残差使用定长码编码,则编码端可以将该通道的像素进行分组,每组仅编码一个残差值。
应理解,当剩余的码长较小时,目前以编码单元的像素个数为单位的码长分配模式可能导致色度通道无法分配到码长,从而导致色度通道的编码块编码后的比特数较小,使膨胀率变大。本申请实施例提供的图像编码方法可以通过将分配的单位由像素个数转化为分配码长的比特数进行分配,每个通道均可以分配到码长进行编码,降低了理论膨胀率。
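可选的实施方案2与改进方案2的码长分配差异可用如下片段对比(三通道、剩余码长20比特的数值仅为示例；两个函数名与具体取整方式均为本示例假设)：

```python
def allocate_by_pixels(remaining_bits, pixels):
    """可选方案：每通道分到的码长须为该通道像素数的整数倍，除不尽的部分给亮度通道。"""
    per = remaining_bits // sum(pixels)          # 每像素可用的整比特数
    alloc = [per * p for p in pixels]
    alloc[0] += remaining_bits - sum(alloc)      # 余数全部分给亮度通道
    return alloc

def allocate_by_bits(remaining_bits, num_channels):
    """改进方案2：按比特数在多个通道间平均分配，提高分配精度。"""
    base = remaining_bits // num_channels
    alloc = [base] * num_channels
    alloc[0] += remaining_bits - base * num_channels
    return alloc

old = allocate_by_pixels(20, [32, 16, 16])   # 剩余码长较小：色度通道分不到码长
new = allocate_by_bits(20, 3)                # 按比特平均分配后每通道均有码长
```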
相对应地，本申请实施例还提供一种图像解码方法，在上述S401至S404或者S501至S505的基础上，该方法还可以包括：解码端基于码流缓冲区的剩余大小、以及目标像素深度BPP，确定编码单元的目标解码码长；目标解码码长用于指示解码编码单元的码流所需的码长；解码端基于目标解码码长，确定多个通道的分配码长；分配码长用于指示解码多个通道的编码块的码流的残差所需的码长；解码端根据分配码长在多个通道的平均值，确定多个通道各自分配的解码码长。
三、帧内块复制(Intra block copy,IBC)模式修改。
可选的实施方案：模式标识为CU级，BV(Block Vector)也是CU级，模式标识和BV均在亮度通道传输。
改进方案:模式标识改为CB级别,BV依然是CU级别,各通道均传输模式标识,各通道均传输BV。
实施例四:
图17为本申请实施例提供的又一种图像编码方法的流程示意图。如图17所示,该方法包括S701至S704。
S701、编码端获取编码单元。
其中,编码单元包括多个通道的编码块。
S702、编码端按照IBC模式对多个通道中的至少一个通道的编码块进行编码。
在一实施例中,S702可以具体包括:编码端使用率失真优化决策确定出该至少一个通道的目标编码模式。
其中,目标编码模式包括IBC模式。率失真优化决策可以参照上述策略3所述,此处不再赘述。
S703、编码端获取参考预测块的BV。
其中,参考预测块的BV用于指示参考预测块在已编码的图像块中的位置。参考预测块用于表征按照IBC模式编码的编码块的预测值。
S704、编码端在至少一种通道的编码块经IBC模式编码得到的至少一个子流中编码参考预测块的BV。
在一实施例中,对于每个通道的编码块的参考预测块的BV,该BV可以全部在亮度通道对应的子流中编码。
在一实施例中,对于每个通道的编码块的参考预测块的BV,该BV也可以全部在色度的某一个通道(例如第一色度通道或者第二色度通道)对应的子流中编码,或者,在两个色度通道对应的子流中平分该BV的个数进行编码,若无法平分,则编码端可以在任意一个色度通道对应的子流中多编码一个BV。
在一实施例中，对于每个通道的编码块的参考预测块的BV，该BV也可以平分在不同通道的编码块对应的子流中编码，若无法平分，则编码端可以按照预设比例进行分配。例如，以BV的个数为8个为例，则预设比例可以是Y通道:U通道:V通道=2:3:3，或者Y通道:U通道:V通道=4:2:2等。本申请实施例对预设比例的具体数值不作限制。在这种情况下，上述S702可以具体包括：按照IBC模式对多个通道的编码块进行编码；上述S704可以具体包括：在多个通道中的每一种通道的编码块经IBC模式编码得到的子流中编码参考预测块的BV。
在一实施例中,参考预测块的BV包括多个。上述在多个通道中的每一个通道的编码块经IBC模式编码得到的子流中编码参考预测块的BV,可以包括:按照预设比例在多个通道中的每一个通道的编码块经IBC模式编码得到的码流中编码参考预测块的BV。
应理解,亮度通道对应的子流的比特数通常较大,色度通道对应的子流的比特数通常较小。本申请实施例通过在采用IBC模式进行编码时,在至少一个通道的编码块对应的子流中进行传输参考预测块的BV,可以在比特数较小的色度通道对应的子流中增加传输数据,从而提高上述计算理论膨胀率的公式中的分母,从而减小理论膨胀率。
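按预设比例(例如Y通道:U通道:V通道=2:3:3)在各通道对应的子流中分配BV的过程可示意如下(除不尽的部分并入第一个通道仅为本示例的一种假设处理)：

```python
def split_bvs(bvs, ratio=(2, 3, 3)):
    """按预设比例将参考预测块的BV分配到各通道对应的子流中编码。"""
    total = sum(ratio)
    counts = [len(bvs) * r // total for r in ratio]   # 各通道应分得的BV个数
    counts[0] += len(bvs) - sum(counts)               # 除不尽的部分并入第一个通道(假设)
    out, start = [], 0
    for c in counts:
        out.append(bvs[start:start + c])
        start += c
    return out

parts = split_bvs(list(range(8)))    # 8个BV按Y:U:V=2:3:3分配
```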
在一实施例中,CU在编码过程中会产生头信息,该CU的头信息也可以按照上述分配BV的方式分配编码在子流中。
相对应地,本申请实施例还提供一种图像解码方法。图18为本申请实施例提供的又一种图像解码方法的流程示意图。如图18所示,该方法包括S801至S804。
S801、解码端解析对编码单元编码后的码流。
其中,编码单元包括多个通道的编码块;码流包括多个通道的编码块编码后的、与多个通道一一对应的多个子流。
S802、解码端基于从多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在多个子流中确定出参考预测块的位置。
S803、解码端基于参考预测块的位置信息,确定按照IBC模式进行解码的解码块的预测值。
S804、解码端基于预测值,对按照IBC模式进行解码的解码块进行重建。
S803和S804可以参照上述视频编解码系统处所述,此处不再赘述。
在一实施例中,对于每个通道的编码块的参考预测块的BV,编码端可以全部在亮度通道对应的子流中解析。
在一实施例中,对于每个通道的编码块的参考预测块的BV,该BV也可以全部在色度的某一个通道(例如第一色度通道或者第二色度通道)对应的子流中解析,或者,在两个色度通道对应的子流中平分该BV的个数进行解析,若无法平分,则编码端可以在任意一个色度通道对应的子流中多解析一个BV。
在一实施例中,对于每个通道的编码块的参考预测块的BV,该BV也可以平分在不同通道的编码块对应的子流中解析,若无法平分,则解码端可以按照预设比例进行分配。例如,以BV的个数为8个为例,则预设比例可以是Y通道:U通道:V通道=2:3:3,或者Y通道:U通道:V通道=4:2:2等。本申请实施例对预设比例的具体数值不作限制。
在一实施例中，按照IBC模式进行编码的编码块包括至少两个通道的编码块；至少两个通道的编码块共用参考预测块的BV。
在一实施例中,该方法还可以包括:当从多个子流中的任意一个子流中解析出IBC模式标识时,解码端确定该多个子流对应的目标解码模式为IBC模式。
在一实施例中,该方法还可以包括:解码端从多个子流中逐个解析IBC模式标识,确定多个子流中解析出IBC模式标识的子流所对应的目标解码模式为IBC模式。
基于上述实施例四的理解,以多个通道包括亮度(Y)通道、第一色度(U)通道、以及第二色度(V)通道为例,对IBC模式的改进方案中主要包括的两个方案进行介绍:
方案1:
编码端:
步骤1、基于多通道来获取参考预测块,即对于一个多通道的编码单元而言,将搜索区域中某一个位置的多通道作为当前多通道的编码单元下的参考预测块,将这个位置记为BV,即多通道共享一个BV。若输入图像为YUV400则只有一个通道,否则有3个通道。得到预测块后,计算每个通道下的残差,对残差进行(变换)量化,反量化(反变换)后完成重建。
步骤2、在亮度通道上编码辅助信息(包括编码块复杂度等级等信息),模式信息,以及编码亮度通道量化后的系数。
步骤3、若存在色度通道,在色度通道上编码辅助信息(包括编码块复杂度等信息),以及编码色度通道量化后的系数。
3.1、对于每个编码单元的参考预测块的BV,可以全部在亮度通道上编码。
3.2、对于每个编码单元的参考预测块的BV,也可以全部在色度某一个通道上编码,或者在色度两个通道上平分BV进行编码,若无法平分,则某一个色度通道多编部分BV。
3.3、对于每个编码单元的参考预测块的BV,也可以平分BV分别在不同分量上编码。若无法平分,则按照预设比例进行分配。
解码端:
步骤1、解析亮度通道的编码块的辅助信息以及模式信息,解析亮度通道下量化后的系数。
步骤2、若存在色度通道,解析U通道的编码块的辅助信息,若亮度通道预测模式为IBC模式,则当前色度通道的预测模式不需要解析直接默认为IBC模式,解析当前色度通道下量化后的系数。
步骤3、对于BV的解析，与编码端一致，也即：
3.1、对于每个编码单元的参考预测块的BV,可以全部在亮度通道上解析。
3.2、对于每个解码单元的参考预测块的BV，也可以全部在色度某一个通道上解析，或者在色度两个通道上平分BV进行解析，若无法平分，则某一个色度通道多解析部分BV。
3.3、对于每个解码单元的参考预测块的BV，也可以平分BV分别在不同分量上解析。若无法平分，则按照预设比例进行分配。
步骤4、根据三通道共享的BV,得到每个通道下每个编码块的预测值,将每个通道下解析得到的系数进行反量化(反变换)得到残差值,根据残差值和预测值对每个编码块完成重建。
方案2:
编码端:
步骤1、同一个编码单元,即三通道下每种IBC模式只需要训练得到一组BV,这组BV可以是基于三通道的搜索区域以及原始值得到,可以基于其中一个通道的搜索区域以及原始值得到,或基于其中任意两个通道的搜索区域以及原始值计算得到。
步骤2、对于每个通道的编码块,使用率失真代价决策出目标编码模式,其中IBC模式下三通道的BV使用步骤1中计算得到的BV。可以允许在某一个分量选择IBC模式后,其他分量不选择IBC模式。
步骤3、对于每个通道的编码块,都需要编码自身的最优模式,若有一个或多个通道的目标模式选择了IBC模式,那么BV可以基于某一个选择IBC模式的通道进行编码,也可以基于某两个选择IBC的通道平分BV个数进行编码,也可以基于所有选择IBC模式的通道平分BV个数进行编码,对于无法整除BV个数时,按照预设比例分配。
解码端:
每个通道都解析一个目标模式,若解析到有一个或多个通道的目标模式选择了IBC模式(只能选择同一种IBC模式),那么BV解析与编码端BV编码的分配方案一致。
四、系数分组。
可选的实施方案:如图1中的流程图所示,编码过程中可能存在残差跳过模式,若编码端在编码时选择该跳过模式,编码时就不需要编码残差了,只需要编码1比特的数据来表示残差跳过即可。
改进方案:对处理系数(残差系数,和/或,变换系数)进行分组。
实施例五:
图19为本申请实施例提供的又一种图像编码方法的流程示意图。如图19所示,该图像编码方法包括S901至S902。
S901、编码端获取编码单元对应的处理系数。
其中,处理系数包括残差系数和变换系数中的一项或多项。
S902、编码端按照个数阈值将处理系数分为多组。
其中,个数阈值可以预设在编码端中。个数阈值与编码块的尺寸相关。例如,对于16×2的编码块来说,个数阈值可以设置为16。多组处理系数中的每组处理系数的个数小于或等于个数阈值。
应理解,与可选的实施方案中的残差跳过模式相比,本申请实施例提供的图像编码方法在编码时可以将处理系数进行分组,分组后的每一组处理系数在传输时均需要添加该组处理系数的头部信息来描述该组处理系数的详细情况,与目前的利用1比特的数据来表示残差跳过相比,增加了理论膨胀率的计算公式中的分母,降低了理论膨胀率。
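按个数阈值对处理系数分组的过程可示意如下(阈值16对应上文16×2编码块的示例；函数名为本示例假设)：

```python
def group_coefficients(coeffs, threshold=16):
    """按个数阈值将处理系数分为多组，每组的系数个数小于或等于阈值。"""
    return [coeffs[i:i + threshold] for i in range(0, len(coeffs), threshold)]

groups = group_coefficients(list(range(32)), 16)   # 16×2编码块的32个系数分为2组
```

传输时每组系数均附带该组的头部信息，以描述该组处理系数的详细情况。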
相对应地,本申请实施例还提供一种图像解码方法。图20为本申请实施例提供的又一种图像解码方法的流程示意图。如图20所示,该图像解码方法包括S1001至S1002。
S1001、解码端解析对编码单元编码后的码流,确定编码单元对应的处理系数。
其中,处理系数包括残差系数和变换系数中的一项或多项;处理系数包括多组;每组处理系数的个数小于或等于个数阈值。
S1002、解码端基于处理系数对码流进行解码。
S1002可以参照上述视频编解码系统处所述,此处不再赘述。
五、复杂度传输。
可选的实施方案：编码单元包括亮度通道的编码块、第一色度通道的编码块、以及第二色度通道的编码块。亮度通道的编码块对应的子流为第一子流，第一色度通道的编码块对应的子流为第二子流，第二色度通道的编码块对应的子流为第三子流。第一子流用1或3比特传输亮度通道的复杂度等级，第二子流中用1或3比特传输两个色度通道的平均值，第三子流中不传输复杂度等级。
例如,CU级复杂度的计算方式为:
if (image_format == ‘000’) {        /* YUV400 */
    CuComplexityLevel = ComplexityLevel[0]
} else if (image_format == ‘001’) { /* YUV420 */
    CuComplexityLevel = ComplexityDivide3Table[ComplexityLevel[1] + (ComplexityLevel[0] << 1)]
} else if (image_format == ‘010’) { /* YUV422 */
    CuComplexityLevel = (ComplexityLevel[0] + ComplexityLevel[1]) >> 1
} else if (image_format == ‘011’ || image_format == ‘100’) { /* YUV444 or RGB444 */
    CuComplexityLevel = ComplexityDivide3Table[ComplexityLevel[0] + (ComplexityLevel[1] << 1)]
}
根据亮度通道的复杂度等级ComplexityLevel[0]、色度通道的复杂度等级ComplexityLevel[1]查表得到BiasInit,然后计算亮度通道的量化参数Qp[0]、以及两个色度通道的量化参数Qp[1],Qp[2]。
其中,查表的表格可以参照下述表1所示,此处不再赘述。
以第一子流为例,第一子流的具体实现如下:
PrevComplexityLevel=ComplexityLevel[0]
其中,complexity_level_flag[0]为亮度通道复杂度的等级更新标志,是一个二值变量。值为‘1’表示编码单元的亮度通道需要更新复杂度等级;值为‘0’表示编码单元的亮度通道不需要更新复杂度等级。ComplexityLevelFlag[0]的值等于complexity_level_flag[0]的值。delta_level[0]为亮度通道复杂度等级变化量,是2位无符号整数,用于确定亮度通道复杂度等级的变化量。DeltaLevel[0]的值等于delta_level[0]的值。如果码流中不存在delta_level[0],DeltaLevel[0]的值等于0。PrevComplexityLevel表示前一个编码单元的亮度通道的复杂度等级;ComplexityLevel[0]表示亮度通道的复杂度等级。
改进方案:在第三子流中增加传输复杂度信息。
实施例六
图21为本申请实施例提供的又一种图像编码方法的流程示意图。如图21所示,该图像编码方法包括S1101至S1103。
S1101、编码端获取编码单元。
其中,编码单元包括P个通道的编码块,P为大于或等于2的整数。
S1102、编码端获取P个通道中每个通道的编码块的复杂度信息。
其中,复杂度信息用于表征每个通道的编码块的像素值的差异程度。例如,以P个通道包括亮度通道、第一色度通道、以及第二色度通道为例,则亮度通道的编码块的复杂度信息用于表征亮度通道的编码块的像素值的差异程度;第一色度通道的编码块的复杂度信息用于表征第一色度通道的编码块的像素值的差异程度;第二色度通道的编码块的复杂度信息用于表征第二色度通道的编码块的像素值的差异程度。
S1103、编码端在P个通道的编码块编码得到的子流中编码每个通道的编码块的复杂度信息。
在一实施例中,S1103可以具体包括:编码端在P个通道的编码块编码得到的子流中各自编码每个通道的编码块的复杂度等级。
示例性地,以第一色度分量对应的第二子流为例,第二子流具体实现如下:
其中,complexity_level_flag[1]表示第一色度通道复杂度等级更新标志,是二值变量,值为‘1’表示编码单元的第一色度通道的编码块的复杂度等级与亮度通道的编码块的复杂度等级一致;值为‘0’表示编码单元的第一色度通道的编码块的复杂度等级与亮度通道的编码块的复杂度等级不一致。ComplexityLevelFlag[1]的值等于complexity_level_flag[1]的值。delta_level[1]表示第一色度通道复杂度等级变化量,是2位无符号整数,确定第一色度通道的编码块的复杂度等级的变化量。DeltaLevel[1]的值等于delta_level[1]的值。如果码流中不存在delta_level[1],DeltaLevel[1]的值等于0。ComplexityLevel[1]表示第一色度通道的编码块的复杂度等级。
示例性地,以第二色度分量对应的第三子流为例,第三子流具体实现如下:
其中,complexity_level_flag[2]表示第二色度通道复杂度等级更新标志,是二值变量,值为‘1’表示编码单元的第二色度通道的编码块的复杂度等级与第一色度通道的编码块的复杂度等级一致;值为‘0’表示编码单元的第二色度通道的编码块的复杂度等级与第一色度通道的编码块的复杂度等级不一致。ComplexityLevelFlag[2]的值等于complexity_level_flag[2]的值。delta_level[2]表示第二色度通道复杂度等级变化量,是2位无符号整数,确定第二色度通道的编码块的复杂度等级的变化量。DeltaLevel[2]的值等于delta_level[2]的值。如果码流中不存在delta_level[2],DeltaLevel[2]的值等于0。ComplexityLevel[2]表示第二色度通道的编码块的复杂度等级。
在一实施例中,复杂度信息包括复杂度等级和第一参考系数,第一参考系数用于表征不同通道的编码块的复杂度等级之间的比例关系。S1103可以具体包括:编码端在P个通道中的Q个通道的编码块编码得到的子流中各自编码Q个通道的编码块的复杂度等级;Q为小于P的整数;编码端在P个通道中的P-Q个通道的编码块编码得到的子流中编码第一参考系数。
例如,以P个通道包括亮度通道、第一色度通道、以及第二色度通道为例,则亮度通道的编码块的复杂度信息为亮度通道的编码块的复杂度等级;第一色度通道的编码块的复杂度信息为第一色度通道的编码块的复杂度等级;第二色度通道的编码块的复杂度信息为参考系数;参考系数用于表征第一色度通道的编码块的复杂度等级和第二色度通道的编码块的复杂度等级之间的比例关系。
例如,将上述变量complexity_level_flag[2]的赋值含义进行改变,值为‘1’表示编码单元的第二色度通道的编码块的复杂度等级比第一色度通道的编码块的复杂度等级大;值为‘0’表示编码单元的第二色度通道的编码块的复杂度等级比第一色度通道的编码块的复杂度等级小或者相等。ComplexityLevelFlag[2]的值等于complexity_level_flag[2]的值。
在一实施例中,复杂度信息包括复杂度等级、参考复杂度等级、以及第二参考系数;参考复杂度等级包括以下任意一项:第一复杂度等级、第二复杂度等级、以及第三复杂度等级;第一复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的最大值;Q为小于P的整数;第二复杂度等级为P个通道中P-Q个通道的编码块的复杂度等级中的最小值;第三复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的平均值;第二参考系数用于表征P个通道中的P-Q个通道的编码块的复杂度等级之间的大小关系,和/或,比例关系。在这种情况下,上述S1103可以具体包括:编码端在P个通道中的Q个通道的编码块编码得到的子流中各自编码Q个通道的编码块的复杂度等级;编码端在P个通道中的P-Q个通道的编码块编码得到的子流中编码参考复杂度等级和第二参考系数。
例如,以P个通道包括亮度通道、第一色度通道、以及第二色度通道为例,则亮度通道的编码块的复杂度信息为亮度通道的编码块的复杂度等级;第一色度通道的编码块的复杂度信息为参考复杂度等级;参考复杂度等级包括以下任意一项: 第一复杂度等级、第二复杂度等级、以及第三复杂度等级;第一复杂度等级为第一色度通道的编码块的复杂度等级和第二色度通道的编码块的复杂度等级中的最大值;第二复杂度等级为第一色度通道的编码块的复杂度等级和第二色度通道的编码块的复杂度等级中的最小值;第三复杂度等级为第一色度通道的编码块的复杂度等级和第二色度通道的编码块的复杂度等级的平均值;第二色度通道的编码块的复杂度信息为参考系数;参考系数用于表征第一色度通道的编码块的复杂度等级和第二色度通道的编码块的复杂度等级之间的大小关系,和/或,比例关系。
应理解,可选的实施方案中第二色度通道的编码块对应的第三子流中并不传输复杂度信息。本申请实施例提供的图像编码方法通过在第三子流中增加编码复杂度信息,增加了比特数较小的子流中的比特数,从而提高了上述理论膨胀率计算公式中的分母,降低了理论膨胀率。
相对应地,本申请实施例还提供了一种图像解码方法。图2为本申请实施例提供的又一种图像解码方法。如图2所示,该解码方法包括S1201至S1204。
S1201、解码端解析对编码单元编码后的码流。
其中,编码单元包括P个通道的编码块;P为大于或等于2的整数;码流包括P个通道的编码块编码后的、与P个通道一一对应的多个子流。
S1202、解码端在P个通道的编码块编码得到的子流中解析每个通道的编码块的复杂度信息。
在一实施例中,S1202可以具体包括:解码端在P个通道的编码块编码得到的子流中各自解析每个通道的编码块的复杂度等级。
S1203、解码端基于每个通道的编码块的复杂度信息,确定每个通道的编码块的量化参数。
S1204、解码端基于每个通道的编码块的量化参数,对码流进行解码。
在一实施例中,当每个通道对应的子流中均传输了各个通道的复杂度等级时,CU级别的复杂度等级(CuComplexityLevel)可以按照下述过程进行计算:
if (image_format == ‘000’) {        /* YUV400 */
    CuComplexityLevel = ComplexityLevel[0]
} else if (image_format == ‘001’) { /* YUV420 */
    CuComplexityLevel = ComplexityDivide3Table[((ComplexityLevel[0] << 2) + ComplexityLevel[1] + ComplexityLevel[2]) >> 1]
} else if (image_format == ‘010’) { /* YUV422 */
    CuComplexityLevel = ((ComplexityLevel[0] << 1) + ComplexityLevel[1] + ComplexityLevel[2]) >> 2
} else if (image_format == ‘011’ || image_format == ‘100’) { /* YUV444 or RGB444 */
    CuComplexityLevel = ComplexityDivide3Table[ComplexityLevel[0] + ComplexityLevel[1] + ComplexityLevel[2]]
}
其中,ComplexityDivide3Table的定义为:ComplexityDivide3Table={0,0,0,1,1,1,2,2,2,3,3,3,4};image_format表示编码单元所在的待处理图像的图像格式。
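上述CU级复杂度等级的推导过程可以写成如下可运行的C语言草图(仅为便于理解的示例:码流中image_format为3位二进制串‘000’~‘100’,这里用整数0~4示意;并假设各通道复杂度等级取值范围为0~4,以保证查表下标不越界):

```c
#include <assert.h>

/* ComplexityDivide3Table 的定义,与正文一致 */
static const int kComplexityDivide3Table[13] = {0,0,0,1,1,1,2,2,2,3,3,3,4};

/* 由三个通道的复杂度等级 l0/l1/l2 推导 CU 级复杂度等级 */
static int cu_complexity_level(int image_format, int l0, int l1, int l2) {
    if (image_format == 0)              /* '000' YUV400 */
        return l0;
    if (image_format == 1)              /* '001' YUV420 */
        return kComplexityDivide3Table[((l0 << 2) + l1 + l2) >> 1];
    if (image_format == 2)              /* '010' YUV422 */
        return ((l0 << 1) + l1 + l2) >> 2;
    /* '011'/'100' YUV444 或 RGB444 */
    return kComplexityDivide3Table[l0 + l1 + l2];
}
```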
在一实施例中,如上所述,复杂度信息包括复杂度等级和第一参考系数;第一参考系数用于表征不同通道的编码块的复杂度等级之间的比例关系。在这种情况下,S1202可以具体包括:解码端在P个通道中的Q个通道的编码块编码得到的子流中各自解析Q个通道的编码块的复杂度等级;Q为小于P的整数;解码端在P个通道中的P-Q个通道的编码块编码得到的子流中解析第一参考系数;解码端基于第一参考系数、以及Q个通道的编码块的复杂度等级,确定P-Q个通道的编码块的复杂度等级。
在一实施例中,当存在子流中没有传输复杂度等级时(也即上述第三子流中传输大小关系时),CU级别的复杂度等级(CuComplexityLevel)可以按照下述过程进行计算:
if (image_format == ‘000’) {        /* YUV400 */
    CuComplexityLevel = ComplexityLevel[0]
} else if (image_format == ‘001’) { /* YUV420 */
    CuComplexityLevel = ComplexityDivide3Table[ChromaComplexityLevel + (ComplexityLevel[0] << 1)]
} else if (image_format == ‘010’) { /* YUV422 */
    CuComplexityLevel = (ComplexityLevel[0] + ChromaComplexityLevel) >> 1
} else if (image_format == ‘011’ || image_format == ‘100’) { /* YUV444 or RGB444 */
    CuComplexityLevel = ComplexityDivide3Table[ComplexityLevel[0] + (ChromaComplexityLevel << 1)]
}
其中ChromaComplexityLevel表示色度通道的复杂度等级。
ChromaComplexityLevel在编码端需要单独计算(解码端直接从码流中获取),可以按照下述任意一种方式计算:
ChromaComplexityLevel=(ComplexityLevel[1]+ComplexityLevel[2])>>1;
或者ChromaComplexityLevel=max(ComplexityLevel[1],ComplexityLevel[2]);
或者ChromaComplexityLevel=min(ComplexityLevel[1],ComplexityLevel[2]);
或者ChromaComplexityLevel=ComplexityLevel[1];
或者ChromaComplexityLevel=ComplexityLevel[2]。
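上述几种ChromaComplexityLevel的取法可以统一写成如下C语言草图(仅为示意:mode编号是本文假设的选择参数,实际采用哪一种由编码端实现决定):

```c
#include <assert.h>

/* 由两个色度通道的复杂度等级 l1/l2 得到色度复杂度等级。
 * mode 为本文假设的取法编号:0=平均值,1=最大值,2=最小值,
 * 3=直接取第一色度通道,其余=直接取第二色度通道 */
static int chroma_complexity_level(int mode, int l1, int l2) {
    switch (mode) {
    case 0:  return (l1 + l2) >> 1;     /* 平均值 */
    case 1:  return l1 > l2 ? l1 : l2;  /* 最大值 */
    case 2:  return l1 < l2 ? l1 : l2;  /* 最小值 */
    case 3:  return l1;                 /* 第一色度通道 */
    default: return l2;                 /* 第二色度通道 */
    }
}
```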
在一实施例中,如上所述,复杂度信息包括复杂度等级、参考复杂度等级、以及第二参考系数;参考复杂度等级包括以下任意一项:第一复杂度等级、第二复杂度等级、以及第三复杂度等级;第一复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的最大值;Q为小于P的整数;第二复杂度等级为P个通道中P-Q个通道的编码块的复杂度等级中的最小值;第三复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的平均值;第二参考系数用于表征P个通道中的P-Q个通道的编码块的复杂度等级之间的大小关系,和/或,比例关系。在这种情况下,上述S1202可以具体包括:解码端在P个通道中的Q个通道的编码块编码得到的子流中各自解析Q个通道的编码块的复杂度等级;解码端在P个通道中的P-Q个通道的编码块编码得到的子流中解析参考复杂度等级和第二参考系数;解码端基于Q个通道的编码块的复杂度等级、以及第二参考系数,确定P-Q个通道的编码块的复杂度等级。
在一实施例中,当每个通道对应的子流中均传输了各个通道的复杂度等级时,解码端可以根据亮度通道的编码块的复杂度等级ComplexityLevel[0]与第一色度通道的编码块的复杂度等级ComplexityLevel[1]查下述表2得到BiasInit1,根据亮度通道的编码块的复杂度等级ComplexityLevel[0]与第二色度通道的编码块的复杂度等级ComplexityLevel[2]查下述表2得到BiasInit2,然后计算亮度通道的量化参数Qp[0]、第一色度通道的量化参数Qp[1]、以及第二色度通道的量化参数Qp[2]。
例如,解码端可以按照下述过程计算量化参数:
Bias1 = (BiasInit1 × FormatBias) >> 1
Bias2 = (BiasInit2 × FormatBias) >> 1
tmp = (ChromaSampleRate >> 1) × (Bias1 + Bias2)
tmp = ((tmp << 7) + 256) >> 9
Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias1) >> 7)
Qp[2] = Clip3(0, MaxQp[2], (MasterQp + Bias2) >> 7)
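上述量化参数推导过程可以写成如下可运行的C语言草图(仅为示意:BiasInit1/BiasInit2来自表2,FormatBias、ChromaSampleRate来自表1,MasterQp、MaxQp[]为码控状态,表1/表2的具体取值本节未给出,下面测试用的入参均为本文假设值):

```c
#include <assert.h>

/* 将 v 裁剪到 [lo, hi] 区间,对应正文中的 Clip3 */
static int Clip3(int lo, int hi, int v) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* 按正文过程由两组 BiasInit 推导三个通道的量化参数 Qp[0..2] */
static void derive_qp(int BiasInit1, int BiasInit2, int FormatBias,
                      int ChromaSampleRate, int MasterQp,
                      const int MaxQp[3], int Qp[3]) {
    int Bias1 = (BiasInit1 * FormatBias) >> 1;
    int Bias2 = (BiasInit2 * FormatBias) >> 1;
    int tmp = (ChromaSampleRate >> 1) * (Bias1 + Bias2);
    tmp = ((tmp << 7) + 256) >> 9;
    Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7);
    Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias1) >> 7);
    Qp[2] = Clip3(0, MaxQp[2], (MasterQp + Bias2) >> 7);
}
```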
表1 ChromaSampleRate,FormatBias与ImageFormat的对应关系
表2 BiasInit的定义
在一实施例中,解码端还可以根据亮度通道的编码块的复杂度等级ComplexityLevel[0]与色度通道的编码块的复杂度等级ChromaComplexityLevel查表2得到BiasInit,然后通过下述过程计算得到Qp[0]、Qp[1]、以及Qp[2]。
Bias = (BiasInit × FormatBias) >> 1
tmp = ChromaSampleRate × Bias
tmp = ((tmp << 7) + 128) >> 8
Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias) >> 7)
Qp[2] = Qp[1]
Qp[1] = Clip3(0, MaxQp[1], Qp[1] + ComplexityLevel[1] - ChromaComplexityLevel)
Qp[2] = Clip3(0, MaxQp[2], Qp[2] + ComplexityLevel[2] - ChromaComplexityLevel)
在一实施例中,如上所述,第一色度通道的编码块对应的子流可以传输第一复杂度等级和第二复杂度等级,在这种情况下,解码端可以根据亮度通道的编码块的复杂度等级ComplexityLevel[0]、以及色度通道的编码块的复杂度等级ChromaComplexityLevel(取第一复杂度等级/第一色度通道的编码块的复杂度等级)查表2得到BiasInit,然后通过下述过程计算得到Qp[0]、Qp[1]、以及Qp[2]。
Bias = (BiasInit × FormatBias) >> 1
tmp = ChromaSampleRate × Bias
tmp = ((tmp << 7) + 128) >> 8
Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
if (0 == complexity_level_flag[2]) {
    Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias) >> 7)
    Qp[2] = Clip3(0, MaxQp[2], (MasterQp + Bias - 32) >> 7)
} else if (1 == complexity_level_flag[2]) {
    Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias - 32) >> 7)
    Qp[2] = Clip3(0, MaxQp[2], (MasterQp + Bias) >> 7)
}
在一实施例中,如上所述,第一色度通道的编码块对应的子流可以传输第三复杂度等级,在这种情况下,上述ChromaComplexityLevel可以取第三复杂度等级,并通过下述过程计算得到Qp[0]、Qp[1]、以及Qp[2]。
Bias = (BiasInit × FormatBias) >> 1
tmp = ChromaSampleRate × Bias
tmp = ((tmp << 7) + 128) >> 8
Qp[0] = Clip3(0, MaxQp[0], (MasterQp - tmp) >> 7)
Qp[1] = Clip3(0, MaxQp[1], (MasterQp + Bias) >> 7)
Qp[2] = Qp[1]
if (0 == complexity_level_flag[2]) {
    Qp[1] = Clip3(0, MaxQp[1], Qp[1] + 2)
} else if (1 == complexity_level_flag[2]) {
    Qp[2] = Clip3(0, MaxQp[2], Qp[2] + 1)
}
六、色度通道共用子流。
可选的实施方案:对于YUV420/YUV422格式的待处理图像,一共传输三个子流,第一子流、第二子流、以及第三子流。其中,第一子流中包括亮度通道的语法元素和变换系数/残差系数,第二子流中包括第一色度通道的语法元素和变换系数/残差系数,第三子流中包括第二色度通道的语法元素和变换系数/残差系数。
改进方案:对于YUV420/YUV422格式的待处理图像,将第一色度通道的语法元素和变换系数/残差系数、以及第二色度通道的语法元素和变换系数/残差系数放在第二子流中传输,取消第三子流。
实施例七:
图22为本申请实施例提供的又一种图像编码方法的流程示意图。如图22所示,该图像编码方法包括S1201至S1203。
S1201、编码端获取编码单元。
其中,编码单元为待处理图像中的图像块。编码单元包括多个通道的编码块。
S1202、当待处理图像的图像格式为预设格式时,则编码端将多个通道中的至少两个预设通道的编码块编码得到的子流合并为一个合并子流。
其中,预设格式可以预设在编/解码端中,例如,预设格式可以为YUV420或者YUV422等,预设通道可以为第一色度通道和第二色度通道。示例性地,第一子流的编码块的定义可以如下述所示:
示例性地,第二子流的编码块的定义可以如下述所示:

应理解,两个色度通道的语法元素和量化后的变换系数编码后的比特数通常小于亮度通道,将两个色度通道的语法元素融合在第二子流中传输,可以降低第一子流和第二子流之间的比特数差异,降低理论膨胀率。
相对应地,本申请实施例还提供一种图像解码方法,图23为本申请实施例提供的又一种图像解码方法的流程示意图,如图23所示,该方法包括S1301至S1303。
S1301、解码端解析对编码单元编码后的码流。
S1302、解码端确定合并子流。
其中,合并子流是当待处理图像的图像格式为预设格式时,将多个通道中的至少两个预设通道的编码块编码得到的子流合并得到的。
S1303、解码端基于所述合并子流,对码流进行解码。
七、子流填充。
改进方案:在比特数小于比特数阈值的子流中填充预设码字。
实施例八:
图24为本申请实施例提供的又一种图像编码方法的流程示意图,如图24所示,该方法包括S1401至S1402。
S1401、编码端获取编码单元。
其中,编码单元包括多个通道的编码块。
S1402、编码端在满足预设条件的目标子流中编码预设码字,直至目标子流不满足预设条件。
其中,目标子流是多个子流中的子流。多个子流是对多个通道的编码块进行编码得到的码流。预设码字可以是“0”或者其他码字等。本申请实施例对此不作限制。
在一实施例中,预设条件包括:子流比特数小于预设的第一比特数阈值。
其中,第一比特数阈值可以预设在编/解码端,或者,由编/解码端在码流中传输。本申请实施例对此不作限制。第一比特数阈值用于指示多个子流中允许的最小子流的比特数。
在一实施例中,预设条件包括:编码单元的码流中存在比特数小于预设的第二比特数阈值的经编码的编码块。
其中,第二比特数阈值也可以预设在编/解码端,或者,由编/解码端在码流中传输。本申请实施例对此不作限制。第二比特数阈值用于指示CU中允许的最小CB(编码块)的比特数。
应理解,在比特数较小的子流中填充预设码字,可以直接增加膨胀率的分母部分,从而降低编码单元的实际膨胀率。
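上述"在满足预设条件的目标子流中编码预设码字,直至不满足预设条件"的过程,可以用如下C语言草图示意(仅为示例:这里假设预设码字的码长为codeword_bits比特,函数只统计需要填充的码字个数,码流写入操作从略):

```c
#include <assert.h>

/* 子流当前比特数为 cur_bits,阈值为 min_bits:
 * 返回需要填充的预设码字个数,使填充后比特数不再小于阈值 */
static int num_padding_codewords(int cur_bits, int min_bits, int codeword_bits) {
    int n = 0;
    while (cur_bits < min_bits) {    /* 预设条件:子流比特数小于阈值 */
        cur_bits += codeword_bits;   /* 向目标子流编码一个预设码字 */
        n++;
    }
    return n;
}
```

解码端按同样的规则即可确定码字数量,将预设码字删除后再对码流解码。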
相对应地,本申请实施例还提供一种图像解码方法。图25为本申请实施例提供的又一种图像解码方法的流程示意图,如图25所示,该图像解码方法包括S1501至S1503。
S1501、解码端解析对编码单元编码后的码流。
其中,编码单元包括多个通道的编码块。
S1502、解码端确定码字数量。
其中,码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量。预设码字是当存在满足预设条件的目标子流时,编码入目标子流中的。
S1503、解码端基于码字数量,对码流进行解码。
例如,解码端可以基于码字数量,将预设码字删除,对删除了预设码字之后的码流进行解码。
八、预设膨胀率。
改进方案:通过预设膨胀率作为阈值来控制当前的实际膨胀率。
实施例九:
图26为本申请实施例提供的又一种图像编码方法的流程示意图,如图26所示,该图像编码方法包括S1601至S1603。
S1601、编码端获取编码单元。
其中,编码单元包括多个通道的编码块。
S1602、编码端基于预设膨胀率,确定多个通道的编码块中每个通道的编码块各自对应的目标编码模式。
其中,预设膨胀率可以预设在编码端和解码端中。或者,预设膨胀率也可以由编码端编码入子流中传输至解码端。本申请实施例对此不作限制。
S1603、编码端按照目标编码模式对每个通道的编码块进行编码,以使得当前膨胀率小于或等于预设膨胀率。
在一实施例中,预设膨胀率包括第一预设膨胀率;当前膨胀率等于最大子流的比特数与最小子流的比特数之商;最大子流为对多个通道的编码块进行编码得到的多个子流中,比特数最大的子流;最小子流为对多个通道的编码块进行编码得到的多个子流中,比特数最小的子流。
在一实施例中,如上所述,编码端可以预设一个第一预设膨胀率作为编码单元内允许的最大阈值,在对编码单元编码前可以获取当前所有子流的状态,其中子流状态为已发送的固定长度码流与当前子流中剩余量之和,从中得到总量最大的子流状态、以及总量最小的子流状态。若最大状态与最小状态的比值或者差值大于预设的差值阈值,则编码端可以不再使用率失真优化作为目标模式的选择准则,仅为最大状态的子流选择码率较低的编码模式,为最小状态的子流选择码率较高的模式。
示例性地,以预设膨胀率为B_th,同时设置B_delta作为辅助阈值,Y/U/V三个通道的编码块编码得到三个子流,在编码一个编码单元时,编码端可以获取该三个子流的状态,若第一子流(Y通道对应的子流)对应的码率最大,设为bit_stream1,且第二子流(U通道对应的子流)对应的码率最小,设为bit_stream2,则当bit_stream1/bit_stream2>B_th-B_delta时,编码端可以为Y通道的编码块选择码率较小的模式,为U通道的编码块选择码率较大的模式。
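上述膨胀率逼近阈值的判断可以用如下C语言草图示意(仅为示例:函数只判断是否需要放弃率失真准则进行干预,具体为各子流改选何种编码模式由编码端实现决定;函数名与入参均为本文假设):

```c
#include <assert.h>

/* max_bits/min_bits 为当前最大、最小子流状态的比特数,
 * B_th 为预设膨胀率,B_delta 为辅助阈值。
 * 返回 1 表示两者之比已逼近预设膨胀率,需要干预模式选择 */
static int need_rate_override(int max_bits, int min_bits,
                              double B_th, double B_delta) {
    if (min_bits <= 0)
        return 1;  /* 最小子流为空时必然需要干预 */
    return (double)max_bits / (double)min_bits > B_th - B_delta;
}
```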
在一实施例中,预设膨胀率包括第二预设膨胀率;当前膨胀率等于经编码的多个通道的编码块中,比特数最大的编码块的比特数和比特数最小的编码块的比特数之商。
在一实施例中,如上所述,编码端可以预设一个第二预设膨胀率作为编码单元内允许的最大阈值,在对某个编码块进行编码之前,得到每个模式下的码率代价。若根据率失真代价得到的最优码率代价所对应的编码模式满足编码每个子流的实际膨胀率小于预设膨胀率,则将该最优码率代价对应的编码模式作为该编码块的目标编码模式,并将目标编码模式编入该编码块对应的子流中;若最优码率代价对应的编码模式无法满足编码每个子流的实际膨胀率小于预设膨胀率,则编码端将码率最大的模式改为一个码率小于该码率最大的模式的码率的模式,或者将码率最小的模式改为一个码率大于该码率最小的模式的码率的模式,或者同时将码率最大的模式改为一个码率小于该码率最大的模式的码率的模式,将码率最小的模式改为一个码率大于该码率最小的模式的码率的模式。
示例性地,以预设膨胀率为A_th为例,Y/U/V三个通道的编码块编码得到三个子流。假设Y/U/V三个通道最优码率代价分别为rate_y、rate_u、以及rate_v,且rate_y的码率最大,rate_u的码率最小。若rate_y/rate_u≥A_th,则编码端可以修改Y通道的目标编码模式使其满足码率代价小于rate_y,或者修改U通道的目标编码模式使其满足码率代价大于rate_u,以使得rate_y/rate_u<A_th,且rate_y/rate_v<A_th。
本申请实施例提供的图像编码方法还可以通过预设膨胀率来干预为每个通道的编码块选择的目标编码模式,从而使得编码端按照目标编码模式编码每个通道时,编码单元的实际膨胀率小于预设膨胀率,从而减小了实际膨胀率。
在一实施例中,该图像编码方法还可以包括:编码端确定当前膨胀率;当当前膨胀率大于第一预设膨胀率时,在最小子流中编码预设码字,以使得当前膨胀率小于或等于第一预设膨胀率。例如,编码端可以在最小子流的末尾填充预设码字。
在一实施例中,该图像编码方法还可以包括:编码端确定当前膨胀率;当当前膨胀率大于第二预设膨胀率时,编码端在比特数最小的编码块中编码预设码字,以使得当前膨胀率小于或等于第二预设膨胀率。
相对应地,本申请实施例还提供一种图像解码方法。图27为本申请实施例提供的又一种图像解码方法的流程示意图。如图27所示,该方法包括S1701至S1703。
S1701、解码端解析对编码单元编码后的码流。
S1702、解码端基于码流,确定预设码字的数量。
S1703、解码端基于预设码字的数量,对码流进行解码。
S1701至S1703可以参照上述S1501至S1503所述,此处不再赘述。
需要说明的是,上述以单独的一系列实施例对本申请实施例提供的图像编码方法和图像解码方法进行了介绍。实际使用过程中,上述一系列实施例之间、以及实施例中的可选的方案之间等可以互相组合使用。本申请实施例对具体的组合不作限制。
上述主要从方法的角度对本申请实施例提供的方案进行了介绍。为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在示例性的实施例中,本申请实施例还提供一种图像编码装置,上述任一种图像编码方法都可以由该图像编码装置执行。本申请实施例提供的图像编码装置可以是上述源装置10或视频编码器102。
图28为本申请实施例提供的图像编码装置的组成示意图。如图28所示,该图像编码装置包括获取模块2801和处理模块2802。
在一实施例中,获取模块2801,用于获取编码单元;编码单元为待处理图像中的图像块;编码单元包括多个通道的编码块;多个通道包括第一通道;第一通道为多个通道中的任意一个通道。处理模块2802,用于按照第一编码模式对第一通道的编码块进行编码;第一编码模式为按照第一定长码对第一通道的编码块中的样本值进行编码的模式;第一定长码的码长 小于或等于待处理图像的图像位宽;图像位宽用于表征存储待处理图像中每个样本所需的比特位数。
在一实施例中,获取模块2801,还用于获取对编码单元编码后的码流;编码单元为待处理图像中的图像块;编码单元包括多个通道的编码块;多个通道包括第一通道;第一通道为多个通道中的任意一个通道;码流包括多个通道的编码块编码后的、与多个通道一一对应的多个子流。处理模块2802,还用于按照第一解码模式对第一通道对应的子流进行解码;第一解码模式为按照第一定长码从第一通道对应的子流解析样本值的模式;第一定长码的码长小于或等于待处理图像的图像位宽;图像位宽用于表征存储待处理图像中每个样本所需的比特数。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元为待处理图像中的图像块;编码单元包括多个通道的编码块。处理模块2802,还用于确定第一总码长;第一总码长为多个通道的编码块均按照各自对应的目标编码模式编码后得到的第一码流的总码长;目标编码模式包括第一编码模式;第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;第一定长码的码长小于或等于待处理图像的图像位宽;图像位宽用于表征存储待处理图像中每个样本所需的比特位数;当第一总码长大于或等于码流缓冲区的剩余大小时,将多个通道的编码块按照回退模式进行编码;回退模式和第一编码模式的模式标志相同。
在一实施例中,处理模块2802,还用于在多个通道的编码块编码得到的多个子流中编码模式标志;模式标志用于指示多个通道的编码块各自采用的编码模式。
在一实施例中,多个通道包括第一通道;第一通道为多个通道中的任意一个通道;处理模块2802,具体用于在第一通道的编码块编码得到的子流中编码子模式标志;子模式标志用于指示多个通道的编码块所采用的回退模式的种类。
在一实施例中,多个通道包括第一通道;第一通道为多个通道中的任意一个通道;处理模块2802,具体用于在第一通道的编码块编码得到的子流中编码第一标志、第二标志、以及第三标志;第一标志用于指示多个通道的编码块采用第一编码模式或者回退模式编码;第二标志用于指示多个通道的编码块采用目标模式编码,目标模式为第一编码模式和回退模式中的任意一种;当第二标志指示多个通道的编码块采用的目标模式为回退模式时,第三标志用于指示多个通道的编码块采用的回退模式的种类。
在一实施例中,处理模块2802,还用于基于码流缓冲区的剩余大小、以及目标像素深度BPP,确定编码单元的目标编码码长;目标编码码长用于指示编码编码单元所需的码长;基于目标编码码长,确定多个通道的分配码长;分配码长用于指示编码多个通道的编码块的残差所需的码长;根据分配码长在多个通道的平均值,确定多个通道各自分配的编码码长。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;若从多个通道的编码块编码得到的子流中解析出模式标志,且第一总码长大于码流缓冲区的剩余大小,则确定子流的目标解码模式为回退模式;模式标志用于指示多个通道的编码块采用第一编码模式或者回退模式进行编码;第一总码长为多个通道的编码块均按照各自对应的目标编码模式进行编码后得到的第一码流的总码长;目标编码模式包括第一编码模式;第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;第一定长码的码长小于或等于待处理图像的图像位宽;图像位宽用于表征存储待处理图像中每个样本所需的比特位数;解析子流中的预设标志位,确定目标回退模式;目标回退模式为回退模式中的一种;预设标志位用于指示子模式标志的位置;子模式标志用于指示多个通道的编码块编码时所采用的回退模式的种类;回退模式包括第一回退模式和第二回退模式;按照目标回退模式对子流进行解码。
在一实施例中,处理模块2802,还用于基于码流缓冲区的剩余大小、以及目标像素深度BPP,确定编码单元的目标解码码长;目标解码码长用于指示解码编码单元的码流所需的码长;基于目标解码码长,确定多个通道的分配码长;分配码长用于指示解码多个通道的编码块的码流的残差所需的码长;根据分配码长在多个通道的平均值,确定多个通道各自分配的解码码长。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;多个通道包括第一通道;第一通道为多个通道中的任意一个通道;从第一通道的编码块编码得到的子流中解析第一标志;第一标志用于指示多个通道的编码块采用第一编码模式或回退模式编码;第一编码模式为按照第一定长码对第一通道的编码块中的样本值进行编码的模式;第一定长码的码长小于或等于待处理图像的图像位宽;图像位宽用于表征存储待处理图像中每个样本所需的比特位数;从第一通道的编码块编码得到的子流中解析第二标志;第二标志用于指示多个通道的编码块采用目标模式编码;目标模式为第一编码模式和回退模式中的任意一种;当第二标志指示多个通道的编码块采用的目标模式为回退模式时,从第一通道的编码块编码得到的子流中解析第三标志;根据第三标志指示的回退模式的种类确定多个通道的目标解码模式,并按照目标解码模式对多个通道的编码块编码得到的子流进行解码。
在一实施例中,处理模块2802,还用于基于码流缓冲区的剩余大小、以及目标像素深度BPP,确定编码单元的目标解码码长;目标解码码长用于指示解码编码单元的码流所需的码长;基于目标解码码长,确定多个通道的分配码长;分配码长用于指示解码多个通道的编码块的码流的残差所需的码长;根据分配码长在多个通道的平均值,确定多个通道各自分配的解码码长。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元包括多个通道的编码块。处理模块2802,还用于按照帧内块复制IBC模式对多个通道中的至少一个通道的编码块进行编码;获取参考预测块的块向量BV;参考预测块的BV用于指示参考预测块在已编码的图像块中的位置;参考预测块用于表征按照IBC模式编码的编码块的预测值;在至少一种通道的编码块经IBC模式编码得到的至少一个子流中编码参考预测块的BV。
在一实施例中,处理模块2802,具体用于使用率失真优化决策确定出至少一个通道的目标编码模式;目标编码模式包括IBC模式;按照目标编码模式对至少一个通道的编码块进行编码。
在一实施例中,处理模块2802,具体用于按照IBC模式对多个通道的编码块进行编码;在多个通道中的每一个通道的编码块经IBC模式编码得到的子流中编码参考预测块的BV。
在一实施例中,处理模块2802,具体用于按照预设比例在多个通道中的每一个通道的编码块经IBC模式编码得到的码流中编码参考预测块的BV。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;码流包括多个通道的编码块编码后的、与多个通道一一对应的多个子流;基于从多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在多个子流中确定出参考预测块的位置;参考预测块用于表征按照帧内块复制IBC模式进行编码的编码块的预测 值;参考预测块的BV用于指示参考预测块在已编码的图像块中的位置;基于参考预测块的位置信息,确定按照IBC模式进行解码的解码块的预测值;基于预测值,对按照IBC模式进行解码的解码块进行重建。
在一实施例中,参考预测块的BV编码于多个子流中;参考预测块的BV是按照IBC模式对多个通道的编码块进行编码时得到的。
在一实施例中,获取模块2801,还用于获取编码单元对应的处理系数;处理系数包括残差系数和变换系数中的一项或多项。处理模块2802,还用于按照个数阈值将处理系数分为多组,多组处理系数中的每组处理系数的个数小于或等于个数阈值。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流,确定编码单元对应的处理系数;处理系数包括残差系数和变换系数中的一项或多项;处理系数包括多组;每组处理系数的个数小于或等于个数阈值;基于处理系数对码流进行解码。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元包括P个通道的编码块;P为大于或等于2的整数;获取P个通道中每个通道的编码块的复杂度信息;复杂度信息用于表征每个通道的编码块的像素值的差异程度。处理模块2802,还用于在P个通道的编码块编码得到的子流中编码每个通道的编码块的复杂度信息。
在一实施例中,复杂度信息包括复杂度等级;处理模块2802,具体用于在P个通道的编码块编码得到的子流中各自编码每个通道的编码块的复杂度等级。
在一实施例中,复杂度信息包括复杂度等级和第一参考系数;第一参考系数用于表征不同通道的编码块的复杂度等级之间的比例关系;处理模块2802,具体用于在P个通道中的Q个通道的编码块编码得到的子流中各自编码Q个通道的编码块的复杂度等级;Q为小于P的整数;在P个通道中的P-Q个通道的编码块编码得到的子流中编码第一参考系数。
在一实施例中,复杂度信息包括复杂度等级、参考复杂度等级、以及第二参考系数;参考复杂度等级包括以下任意一项:第一复杂度等级、第二复杂度等级、以及第三复杂度等级;第一复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的最大值;Q为小于P的整数;第二复杂度等级为P个通道中P-Q个通道的编码块的复杂度等级中的最小值;第三复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的平均值;第二参考系数用于表征P个通道中的P-Q个通道的编码块的复杂度等级之间的大小关系,和/或,比例关系;处理模块2802,具体用于在P个通道中的Q个通道的编码块编码得到的子流中各自编码Q个通道的编码块的复杂度等级;在P个通道中的P-Q个通道的编码块编码得到的子流中编码参考复杂度等级和第二参考系数。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括P个通道的编码块;P为大于或等于2的整数;码流包括P个通道的编码块编码后的、与P个通道一一对应的多个子流;在P个通道的编码块编码得到的子流中解析每个通道的编码块的复杂度信息;复杂度信息用于表征每个通道的编码块的像素值的差异程度;基于每个通道的编码块的复杂度信息,确定每个通道的编码块的量化参数;基于每个通道的编码块的量化参数,对码流进行解码。
在一实施例中,复杂度信息包括复杂度等级;处理模块2802,还用于在P个通道的编码块编码得到的子流中各自解析每个通道的编码块的复杂度等级。
在一实施例中,复杂度信息包括复杂度等级和第一参考系数;第一参考系数用于表征不同通道的编码块的复杂度等级之间的比例关系;处理模块2802,具体用于在P个通道中的Q个通道的编码块编码得到的子流中各自解析Q个通道的编码块的复杂度等级;Q为小于P的整数;在P个通道中的P-Q个通道的编码块编码得到的子流中解析第一参考系数;基于第一参考系数、以及Q个通道的编码块的复杂度等级,确定P-Q个通道的编码块的复杂度等级。
在一实施例中,复杂度信息包括复杂度等级、参考复杂度等级、以及第二参考系数;参考复杂度等级包括以下任意一项:第一复杂度等级、第二复杂度等级、以及第三复杂度等级;第一复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的最大值;Q为小于P的整数;第二复杂度等级为P个通道中P-Q个通道的编码块的复杂度等级中的最小值;第三复杂度等级为P个通道中的P-Q个通道的编码块的复杂度等级中的平均值;第二参考系数用于表征P个通道中的P-Q个通道的编码块的复杂度等级之间的大小关系,和/或,比例关系;处理模块2802,具体用于在P个通道中的Q个通道的编码块编码得到的子流中各自解析Q个通道的编码块的复杂度等级;在P个通道中的P-Q个通道的编码块编码得到的子流中解析参考复杂度等级和第二参考系数;基于Q个通道的编码块的复杂度等级、以及第二参考系数,确定P-Q个通道的编码块的复杂度等级。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元包括多个通道的编码块;编码单元为待处理图像中的图像块。处理模块2802,还用于当待处理图像的图像格式为预设格式时,则将多个通道中的至少两个预设通道的编码块编码得到的子流合并为一个合并子流。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;编码单元为待处理图像中的图像块;确定合并子流;合并子流是当待处理图像的图像格式为预设格式时,将多个通道中的至少两个预设通道的编码块编码得到的子流合并得到的;基于该合并子流,对码流进行解码。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元包括多个通道的编码块。处理模块2802,还用于在满足预设条件的目标子流中编码预设码字,直至目标子流不满足预设条件;目标子流是多个子流中的子流;多个子流是对多个通道的编码块进行编码得到的码流。
在一实施例中,预设条件包括:子流比特数小于预设的第一比特数阈值。
在一实施例中,预设条件包括:编码单元的码流中存在比特数小于预设的第二比特数阈值的经编码的编码块。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;确定码字数量;码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;预设码字是当存在满足预设条件的目标子流时,编码入目标子流中的;基于码字数量,对码流进行解码。
在一实施例中,获取模块2801,还用于获取编码单元;编码单元包括多个通道的编码块。处理模块2802,还用于基于预设膨胀率,确定多个通道的编码块中每个通道的编码块各自对应的目标编码模式;按照目标编码模式对每个通道的编码块进行编码,以使得当前膨胀率小于或等于预设膨胀率。
在一实施例中,预设膨胀率包括第一预设膨胀率;当前膨胀率等于最大子流的比特数与最小子流的比特数之商;最大子 流为对多个通道的编码块进行编码得到的多个子流中,比特数最大的子流;最小子流为对多个通道的编码块进行编码得到的多个子流中,比特数最小的子流。
在一实施例中,预设膨胀率包括第二预设膨胀率;当前膨胀率等于经编码的多个通道的编码块中,比特数最大的编码块的比特数和比特数最小的编码块的比特数之商。
在一实施例中,处理模块2802,还用于确定当前膨胀率;当当前膨胀率大于第一预设膨胀率时,在最小子流中编码预设码字,以使得当前膨胀率小于或等于第一预设膨胀率。
在一实施例中,处理模块2802,还用于确定当前膨胀率;当当前膨胀率大于第二预设膨胀率时,在比特数最小的编码块中编码预设码字,以使得当前膨胀率小于或等于第二预设膨胀率。
在一实施例中,处理模块2802,还用于解析对编码单元编码后的码流;编码单元包括多个通道的编码块;基于码流,确定预设码字的数量;基于预设码字的数量,对码流进行解码。
需要说明的是,图28中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。例如,还可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
在示例性的实施例中,本申请实施例还提供了一种可读存储介质,包括执行指令,当其在图像编解码装置上运行时,使得图像编解码装置执行上述实施例提供的任意一种方法。
在示例性的实施例中,本申请实施例还提供了一种包含执行指令的计算机程序产品,当其在图像编解码装置上运行时,使得图像编解码装置执行上述实施例提供的任意一种方法。
在示例性的实施例中,本申请实施例还提供了一种芯片,包括:处理器和接口,处理器通过接口与存储器耦合,当处理器执行存储器中的计算机程序或图像编解码装置执行指令时,使得上述实施例提供的任意一种方法被执行。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件程序实现时,可以全部或部分地以计算机程序产品的形式来实现。该计算机程序产品包括一个或多个计算机执行指令。在计算机上加载和执行计算机执行指令时,全部或部分地产生按照本申请实施例的流程或功能。计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。计算机执行指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,计算机执行指令可以从一个网站站点、计算机、服务器或者数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可以用介质集成的服务器、数据中心等数据存储设备。可用介质可以是磁性介质(例如,软盘、硬盘、磁带),光介质(例如,DVD)、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
尽管在此结合各实施例对本申请进行了描述,然而,在实施所要求保护的本申请过程中,本领域技术人员通过查看附图、公开内容、以及所附权利要求书,可理解并实现公开实施例的其他变化。在权利要求中,“包括”(Comprising)一词不排除其他组成部分或步骤,“一”或“一个”不排除多个的情况。单个处理器或其他单元可以实现权利要求中列举的若干项功能。相互不同的从属权利要求中记载了某些措施,但这并不表示这些措施不能组合起来产生良好的效果。
尽管结合具体特征及其实施例对本申请进行了描述,显而易见的,在不脱离本申请的精神和范围的情况下,可对其进行各种修改和组合。相应地,本说明书和附图仅仅是所附权利要求所界定的本申请的示例性说明,且视为已覆盖本申请范围内的任意和所有修改、变化、组合或等同物。显然,本领域的技术人员可以对本申请进行各种改动和变型而不脱离本申请的精神和范围。这样,倘若本申请的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应该以权利要求的保护范围为准。

Claims (40)

  1. 一种图像编码方法,其中,所述方法包括:
    获取编码单元;所述编码单元包括多个通道的编码块;
    基于预设膨胀率对所述每个通道的编码块进行编码,以使得当前膨胀率小于或等于所述预设膨胀率,
    其中,所述预设膨胀率包括第一预设膨胀率;所述当前膨胀率的值由最大子流的比特数与最小子流的比特数之商导出;所述最大子流为对所述多个通道的编码块进行编码得到的多个子流中,比特数最大的子流;所述最小子流为对所述多个通道的编码块进行编码得到的多个子流中,比特数最小的子流。
  2. 根据权利要求1所述的方法,其中,所述方法还包括:
    获取第一子流和第二子流,所述第一子流为已发送的最多固定长度码流中剩余量的总量最大的子流,所述第二子流为已发送的最少固定长度码流中剩余量的总量最小的子流;
    当所述第一子流和所述第二子流的比值或差值大于预设的差值阈值时,确定所述第一子流对应的编码块的目标编码模式为低码率编码模式,所述第二子流对应的编码块的目标编码模式为高码率编码模式。
  3. 根据权利要求1所述的方法,其中,所述方法还包括:
    确定所述当前膨胀率;
    当所述当前膨胀率大于所述第一预设膨胀率时,在所述最小子流中编码预设码字,以使得所述当前膨胀率小于或等于所述第一预设膨胀率。
  4. 根据权利要求3所述的方法,其中,所述预设膨胀率还包括第二预设膨胀率,所述方法还包括:
    确定所述当前膨胀率;
    当所述当前膨胀率大于所述第二预设膨胀率时,在所述比特数最小的编码块中编码所述预设码字,以使得所述当前膨胀率小于或等于所述第二预设膨胀率。
  5. 根据权利要求3或4所述的方法,其中,所述在所述最小子流中编码预设码字包括:
    在所述最小子流的末尾填充所述预设码字。
  6. 一种图像编码方法,其中,所述方法包括:
    获取编码单元;所述编码单元包括多个通道的编码块;
    按照帧内块复制IBC模式对所述多个通道中的至少一个通道的编码块进行编码;
    获取参考预测块的块向量BV;所述参考预测块的BV用于指示所述参考预测块在已编码的图像块中的位置;所述参考预测块用于表征按照所述IBC模式编码的编码块的预测值;
    在所述多个通道中的每一个通道的编码块经所述IBC模式编码得到的子流中编码所述参考预测块的BV。
  7. 根据权利要求6所述的方法,其中,所述在所述多个通道中的每一个通道的编码块经所述IBC模式编码得到的子流中编码所述参考预测块的BV,包括:
    按照预设比例在所述多个通道中的每一个通道的编码块经所述IBC模式编码得到的码流中编码所述参考预测块的BV。
  8. 根据权利要求6所述的方法,其中,所述方法还包括:
    获取编码单元对应的残差系数;
    按照个数阈值将所述残差系数分为多组,所述多组残差系数中的每组残差系数的个数小于或等于所述个数阈值。
  9. 一种图像编码方法,其中,所述方法包括:
    获取编码单元;所述编码单元为待处理图像中的图像块;所述编码单元包括多个通道的编码块;
    确定第一总码长;所述第一总码长为所述多个通道的编码块均按照各自对应的目标编码模式编码后得到的第一码流的总码长;所述目标编码模式包括第一编码模式;所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;所述第一定长码的码长小于或等于所述待处理图像的图像位宽;所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数;
    当所述第一总码长大于或等于码流缓冲区的剩余大小时,将所述多个通道的编码块按照回退模式进行编码;所述回退模式和所述第一编码模式的模式标志相同。
  10. 根据权利要求9所述的方法,其中,所述方法还包括:
    在所述多个通道的编码块编码得到的多个子流中编码模式标志;所述模式标志用于指示所述多个通道的编码块各自采用的编码模式。
  11. 根据权利要求10所述的方法,其中,所述多个通道包括第一通道;所述第一通道为所述多个通道中的任意一个通道;所述在所述多个通道的编码块编码得到的子流中编码模式标志,包括:
    在所述第一通道的编码块编码得到的子流中编码子模式标志;所述子模式标志用于指示所述多个通道的编码块所采用的回退模式的种类,所述回退模式包括第一回退模式和第二回退模式。
  12. 根据权利要求10所述的方法,其中,所述多个通道包括第一通道;所述第一通道为所述多个通道中的任意一个通道;所述在所述多个通道的编码块编码得到的子流中编码模式标志,包括:
    在所述第一通道的编码块编码得到的子流中编码第一标志和第二标志;所述第一标志用于指示所述多个通道的编码块采用所述第一编码模式或者所述回退模式编码。
  13. 根据权利要求10-12任一项所述的方法,其中,所述方法还包括:
    基于所述码流缓冲区的剩余大小、以及目标像素深度BPP,确定所述编码单元的目标编码码长;所述目标编码码长用于指示编码所述编码单元所需的码长;
    基于所述目标编码码长,确定多个通道的分配码长;所述分配码长用于指示编码所述多个通道的编码块的残差所需的码长;
    根据所述分配码长在所述多个通道的平均值,确定所述多个通道各自分配的编码码长。
  14. 根据权利要求13所述的方法,其中,所述基于所述目标编码码长,确定多个通道的分配码长,包括:
    将所述目标编码码长减去公共信息所占用的码长,得到多个通道的分配码长。
  15. 一种图像解码方法,其中,所述方法包括:
    解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
    确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;
    基于所述码字数量,对所述码流进行解码。
  16. 根据权利要求15所述的方法,其中,所述基于所述码字数量,对所述码流进行解码包括:
    基于所述码字数量将预设码字删除,对删除了预设码字后的码流进行解码。
  17. 一种图像解码方法,其中,所述方法包括:
    解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
    确定所述当前膨胀率;
    根据所述当前膨胀率和第一预设膨胀率确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;
    基于所述码字数量,对所述码流进行解码。
  18. 一种图像解码方法,其中,所述方法包括:
    解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;所述码流包括所述多个通道的编码块编码后的、与所述多个通道一一对应的多个子流;
    基于从所述多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在所述多个子流中确定出所述参考预测块的位置;所述参考预测块用于表征按照帧内块复制IBC模式进行解码的解码块的预测值;所述参考预测块的BV用于指示所述参考预测块在已重建的图像块中的位置;
    基于所述参考预测块的位置信息,确定所述按照IBC模式进行解码的解码块的预测值;
    基于所述预测值,对所述按照IBC模式进行解码的解码块进行重建。
  19. 根据权利要求18所述的方法,其中,所述按照IBC模式进行解码的解码块包括至少两个通道的解码块;所述至少两个通道的解码块共用所述参考预测块的BV。
  20. 根据权利要求18所述的方法,其中,所述方法还包括:
    当从所述多个子流中的任意一个子流中解析出IBC模式标识时,确定所述多个子流对应的目标解码模式为所述IBC模式。
  21. 根据权利要求18所述的方法,其中,所述方法还包括:
    确定所述解码单元对应的残差系数;所述残差系数包括多组;每组残差系数的个数小于或等于个数阈值;
    基于所述残差系数对码流进行解码。
  22. 一种图像解码方法,其中,所述方法包括:
    解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;
    若从所述多个子流中解析出模式标志,且第二总码长大于码流缓冲区的剩余大小,则确定所述子流的目标解码模式为回退模式;所述模式标志用于指示所述多个通道的解码块是否采用回退模式进行编码或第一通道是否采用第一解码模式;所述第二总码长为所述多个通道的编码块均按照第一编码模式进行编码后得到的码流的总码长;所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;所述第一定长码的码长小于或等于所述待处理图像的图像位宽;所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数;
    解析所述子流中的预设标志位,确定目标回退子模式;所述目标回退子模式为所述回退模式中的一种;所述预设标志位用于指示所述多个通道的编码块编码时所采用的回退模式的种类;所述回退模式包括第一回退模式和第二回退模式;
    按照所述目标回退模式对所述子流进行解码。
  23. 根据权利要求22所述的方法,其中,所述方法还包括:
    基于所述码流缓冲区的剩余大小、以及目标像素深度BPP,确定所述编码单元的目标解码码长;所述目标解码码长用于指示解码所述编码单元的码流所需的码长;
    基于所述目标解码码长,确定多个通道的分配码长;所述分配码长用于指示解码所述多个通道的编码块的码流的残差所需的码长;
    根据所述分配码长在所述多个通道的平均值,确定所述多个通道各自分配的解码码长。
  24. 一种图像解码方法,其中,所述方法包括:
    获取对编码单元编码后的码流;
    按照第一解码模式对第一通道对应的子流进行解码。
  25. 根据权利要求24所述的方法,其中,所述按照第一解码模式对第一通道对应的子流进行解码包括:
    当第一定长码等于图像位宽时,直接按照所述第一解码模式对所述第一通道对应的子流进行解码;或者
    当第一定长码小于图像位宽时,对解析出来的所述第一通道的编码块中的像素值进行反量化。
  26. 一种图像解码方法,其中,所述方法包括:
    获取解码单元;所述解码单元为待处理图像中的图像块,所述解码单元包括多个通道的解码块;
    确定第一总码长;所述第一总码长为所述多个通道的解码块均按照各自对应的目标解码模式进行解码后得到的第一码流的总码长;
    当所述第一总码长大于或等于码流缓冲区的剩余大小时,将所述多个通道的解码块按照回退模式进行解码。
  27. 根据权利要求26所述的方法,其中,所述方法还包括:
    在所述多个通道的解码块解码得到的多个子流中解码模式标志。
  28. 一种图像解码方法,其中,所述方法包括:
    解析对编码单元编码后的码流,确定所述编码单元对应的处理系数;
    基于所述处理系数对所述码流进行解码。
  29. 根据权利要求28所述的方法,其中,所述处理系数包括残差系数和变换系数中的一项或多项;所述处理系数包括多组;每组处理系数的个数小于或等于个数阈值。
  30. 一种图像编码装置,其中,所述装置包括:
    获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
    处理模块,用于在满足预设条件的目标子流中编码预设码字,直至所述目标子流不满足所述预设条件;所述目标子流是多个子流中的子流;所述多个子流是对所述多个通道的编码块进行编码得到的码流。
  31. 一种图像编码装置,其中,所述装置包括:
    获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
    处理模块,用于基于预设膨胀率对所述每个通道的编码块进行编码,以使得当前膨胀率小于或等于所述预设膨胀率。
  32. 一种图像编码装置,其中,所述装置包括:
    获取模块,用于获取编码单元;所述编码单元包括多个通道的编码块;
    处理模块,用于按照帧内块复制IBC模式对所述多个通道中的至少一个通道的编码块进行编码;获取参考预测块的块向量BV;所述参考预测块的BV用于指示所述参考预测块在已编码的图像块中的位置;所述参考预测块用于表征按照所述IBC模式编码的编码块的预测值;在所述至少一种通道的编码块经所述IBC模式编码得到的至少一个子流中编码所述参考预测块的BV。
  33. 一种图像编码装置,其中,所述装置包括:
    获取模块,用于获取编码单元;所述编码单元为待处理图像中的图像块;所述编码单元包括多个通道的编码块;
    处理模块,用于确定第一总码长;所述第一总码长为所述多个通道的编码块均按照各自对应的目标编码模式编码后得到的第一码流的总码长;所述目标编码模式包括第一编码模式;所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;所述第一定长码的码长小于或等于所述待处理图像的图像位宽;所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数;当所述第一总码长大于或等于码流缓冲区的剩余大小时,将所述多个通道的编码块按照回退模式进行编码;所述回退模式和所述第一编码模式的模式标志相同。
  34. 一种图像解码装置,其中,所述装置包括:
    处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;基于所述码字数量,对所述码流进行解码。
  35. 一种图像解码装置,其中,所述装置包括:
    处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;确定所述当前膨胀率;根据所述当前膨胀率和第一预设膨胀率确定码字数量;所述码字数量用于指示在满足预设条件的目标子流中编码的预设码字的数量;所述预设码字是当存在满足预设条件的目标子流时,编码入所述目标子流中的;基于所述码字数量,对所述码流进行解码。
  36. 一种图像解码装置,其中,所述装置包括:
    处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;所述码流包括所述多个通道的编码块编码后的、与所述多个通道一一对应的多个子流;基于从所述多个子流中的至少一个子流中解析到的参考预测块的块向量BV,在所述多个子流中确定出所述参考预测块的位置;所述参考预测块用于表征按照帧内块复制IBC模式进行解码的解码块的预测值;所述参考预测块的BV用于指示所述参考预测块在已重建的图像块中的位置;基于所述参考预测块的位置信息,确定所述按照IBC模式进行解码的解码块的预测值;基于所述预测值,对所述按照IBC模式进行解码的解码块进行重建。
  37. 一种图像解码装置,其中,所述装置包括:
    处理模块,用于解析对编码单元编码后的码流;所述编码单元包括多个通道的编码块;若从所述多个通道的编码块编码得到的子流中解析出模式标志,且第二总码长大于码流缓冲区的剩余大小,则确定所述子流的目标解码模式为回退模式;所述模式标志用于指示所述多个通道的编码块是否采用回退模式进行编码;所述第二总码长为所述多个通道的编码块均按照第一编码模式进行编码后得到的码流的总码长;所述第一编码模式为按照第一定长码对编码块中的样本值进行编码的模式;所述第一定长码的码长小于或等于所述待处理图像的图像位宽;所述图像位宽用于表征存储所述待处理图像中每个样本所需的比特位数;解析所述子流中的预设标志位,确定目标回退子模式;所述目标回退子模式为所述回退模式中的一种;所述预设标志位用于指示所述多个通道的编码块编码时所采用的回退模式的种类;所述回退模式包括第一回退模式和第二回退模式;按照所述目标回退模式对所述子流进行解码。
  38. 一种视频编码器,其中,所述视频编码器包括:处理器和存储器;
    所述存储器存储有所述处理器可执行的指令;
    所述处理器被配置为执行所述指令时,使得所述视频编码器实现如权利要求1-14中任一项所述的方法。
  39. 一种视频解码器,其中,所述视频解码器包括:处理器和存储器;
    所述存储器存储有所述处理器可执行的指令;
    所述处理器被配置为执行所述指令时,使得所述视频解码器实现如权利要求15-29中任一项所述的方法。
  40. 一种可读存储介质,其中,所述可读存储介质包括:软件指令;
    当所述软件指令在图像编码装置和图像解码装置中运行时,使得所述图像编码装置实现如权利要求1-14中任一项所述的方法和所述图像解码装置实现如权利要求15-29中任一项所述的方法。
PCT/CN2023/118293 2022-09-20 2023-09-12 图像编码方法和图像解码方法、装置及存储介质 WO2024061055A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211146464.4 2022-09-20
CN202211146464.4A CN116132685A (zh) 2022-09-20 2022-09-20 图像编解码方法、装置及存储介质

Publications (1)

Publication Number Publication Date
WO2024061055A1 true WO2024061055A1 (zh) 2024-03-28

Family

ID=86303357

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/118293 WO2024061055A1 (zh) 2022-09-20 2023-09-12 图像编码方法和图像解码方法、装置及存储介质

Country Status (2)

Country Link
CN (3) CN116132685A (zh)
WO (1) WO2024061055A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116132685A (zh) * 2022-09-20 2023-05-16 杭州海康威视数字技术股份有限公司 图像编解码方法、装置及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110248872A1 (en) * 2010-04-13 2011-10-13 Research In Motion Limited Methods and devices for load balancing in parallel entropy coding and decoding
CN104159107A (zh) * 2014-09-04 2014-11-19 上海航天电子通讯设备研究所 多通道视频信号的静态图像编码方法
CN111327901A (zh) * 2020-03-10 2020-06-23 北京达佳互联信息技术有限公司 视频编码方法、装置、存储介质及编码设备
CN116132685A (zh) * 2022-09-20 2023-05-16 杭州海康威视数字技术股份有限公司 图像编解码方法、装置及存储介质

Also Published As

Publication number Publication date
CN116437095A (zh) 2023-07-14
CN116248881A (zh) 2023-06-09
CN116132685A (zh) 2023-05-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23867344

Country of ref document: EP

Kind code of ref document: A1