WO2023138532A1 - Video decoding method and apparatus, video decoder and storage medium - Google Patents

Video decoding method and apparatus, video decoder and storage medium

Info

Publication number
WO2023138532A1
Authority
WO
WIPO (PCT)
Prior art keywords
current block
image
decoded
bits
rate control
Prior art date
Application number
PCT/CN2023/072358
Other languages
English (en)
Chinese (zh)
Inventor
王岩
陈方栋
曹小强
孙煜程
Original Assignee
杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司
Publication of WO2023138532A1


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 - Quantisation
    • H04N19/40 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream

Definitions

  • The present application relates to the technical field of video decoding, and in particular to a video decoding method and apparatus, a video decoder and a storage medium.
  • Video codec technology plays an important role in the field of video transmission and storage.
  • Quantization is a key step that determines image quality in the process of video encoding and decoding.
  • Quantization mainly reduces data redundancy through quantization parameters, but at the same time it carries a risk of image distortion.
  • Conventionally, the quantization parameters used for quantization are written into the code stream at the video encoding stage, and the video decoder performs decoding by parsing the quantization parameters from the code stream.
  • These quantization parameters occupy considerable resources in the code stream, which affects transmission efficiency.
  • Embodiments of the present application provide a video decoding method, device, and storage medium, which help save resources in code streams and improve transmission efficiency of code streams.
  • In a first aspect, an embodiment of the present application provides a video decoding method, applied to a video decoding device or a chip of a video decoding device. The method includes: acquiring one or more rate control parameters of a current block in an image to be decoded; determining a quantization parameter of the current block according to the one or more rate control parameters; and decoding the current block based on the quantization parameter of the current block.
  • In this way, the decoding end obtains the rate control parameters itself and calculates the quantization parameter from them to perform decoding, which saves resources in the code stream and improves the efficiency of code stream transmission.
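The decoding flow of the first aspect (acquire rate control parameters, derive the quantization parameter, decode the block) can be sketched as follows. This is a hypothetical illustration: the parameter names and the QP formula below are assumptions made for exposition, not the formula claimed by the application.

```python
def derive_qp(rc, qp_min=0, qp_max=51):
    """Derive the current block's QP from decoder-side rate control state.

    Assumed example formula: start from the average QP of decoded blocks
    and push the QP up or down depending on how full the bit budget is
    (fullness 0.0 = budget empty, 1.0 = budget exhausted).
    """
    qp = round(rc["avg_qp"] + rc["qp_delta"] * (rc["fullness"] - 0.5) * 2)
    # Clip to a legal QP range so downstream inverse quantization stays valid.
    return max(qp_min, min(qp_max, qp))

# A block whose budget is three-quarters used gets a slightly larger QP
# than the running average, coarsening quantization to save bits.
qp = derive_qp({"avg_qp": 30, "fullness": 0.75, "qp_delta": 4})  # 32
```

No quantization parameter is read from the code stream here; everything is computed from state the decoder already has, which is the resource saving the embodiment describes.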
  • In a possible implementation, acquiring the one or more rate control parameters of the current block in the image to be decoded includes: acquiring information of the image to be decoded, where this information includes basic information of the image to be decoded and information of image blocks in the image to be decoded; and determining the one or more rate control parameters of the current block according to part or all of that information.
  • This possible implementation provides a specific way of acquiring the rate control parameters.
  • The rate control parameters used to generate the quantization parameter are calculated from features of the image, so the calculation of the quantization parameter is completed using those features.
  • In a possible implementation, the rate control parameters include the resource adequacy of the current block, the complexity of the current block, the number of predicted bits of the current block, the average quantization parameter of the current block, and the change value of the quantization parameter of the current block. Here, the average quantization parameter represents the average level of the quantization parameters of decoded image blocks in the image to be decoded; the resource adequacy refers to how much of the resources used to store the image to be decoded remain available to store the current block; the complexity refers to the complexity of the decoded image blocks in the image to be decoded; the number of predicted bits refers to the resources the decoded image blocks are estimated to occupy; and the change value of the quantization parameter refers to the change in quantization parameter between decoded image blocks in the image to be decoded.
  • This possible implementation provides various rate control parameters, and these parameters can generate the quantization parameter in different ways, thereby improving the implementability of the solution.
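The five rate control parameters listed above can be gathered into one structure for illustration. The field names and types here are assumptions made for the sketch, not syntax from the application.

```python
from dataclasses import dataclass

@dataclass
class RateControlParams:
    """Illustrative container for the decoder-side rate control state."""
    fullness: float       # resource adequacy: share of the bit budget consumed
    complexity: float     # complexity of already-decoded image blocks
    predicted_bits: int   # bits the decoded blocks are estimated to occupy
    avg_qp: float         # average QP over the decoded blocks
    qp_delta: int         # QP change between consecutive decoded blocks

params = RateControlParams(fullness=0.6, complexity=1.2,
                           predicted_bits=512, avg_qp=30.0, qp_delta=2)
```

Grouping the state this way makes it easy to update all five values together after each block is reconstructed, as the later update step requires.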
  • In a possible implementation, determining the quantization parameter of the current block according to the one or more rate control parameters includes: acquiring the rate control parameters of decoded image blocks; and determining the quantization parameter of the current block according to those rate control parameters.
  • This possible implementation provides a specific way of generating the quantization parameter from the rate control parameters.
  • Determining the quantization parameter of the current block from the rate control parameters of decoded image blocks helps establish correlation between image blocks.
  • In another possible implementation, determining the quantization parameter of the current block according to the one or more rate control parameters includes: acquiring the quantization parameters of decoded image blocks; and determining the quantization parameter of the current block according to those quantization parameters.
  • This possible implementation likewise provides a specific way of generating the quantization parameter from the rate control parameters.
  • Determining the quantization parameter of the current block from the quantization parameters of decoded image blocks helps establish correlation between image blocks.
  • In a possible implementation, the method further includes: calculating the total number of bits of the current block according to information of the current block, where the total number of bits is the number of bits the current block occupies after decoding; and determining the number of available bits from the total number of bits, where the available bits are used to store decoded image data of the current block and to determine the one or more rate control parameters of the current block.
  • This possible implementation provides a specific way of determining the rate control parameters from information of the image to be decoded. Calculating the number of available bits of an image block helps improve the accuracy of the rate control parameters and the implementability of the scheme.
  • In a possible implementation, determining the number of available bits according to the total number of bits includes: removing, from the total number of bits, the bits used to store non-image data.
  • This possible implementation provides a specific way of determining the number of available bits: by removing the bits used to store non-image data, the number of bits available to store image data is obtained, which helps improve the accuracy of the rate control parameters and the implementability of the solution.
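The available-bits computation described above amounts to subtracting non-image overhead from the total bit budget. A minimal sketch follows; the particular overhead terms (header and padding bits) are illustrative assumptions, since the application only requires that bits storing non-image data be removed.

```python
def available_bits(total_bits, header_bits, padding_bits=0):
    """Bits left for decoded image data after removing non-image overhead.

    `header_bits` and `padding_bits` stand in for whatever non-image data
    the code stream carries; the split is hypothetical.
    """
    return total_bits - header_bits - padding_bits

# A 1024-bit block with 64 header bits and 16 padding bits leaves
# 944 bits available for image data.
avail = available_bits(total_bits=1024, header_bits=64, padding_bits=16)
```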
  • In a possible implementation, the method further includes: initializing an intermediate parameter, where the intermediate parameter is used together with the one or more rate control parameters to determine the quantization parameter of the current block.
  • This possible implementation provides an initialization process for the intermediate parameter used in determining the quantization parameter, which facilitates generation of the quantization parameter.
  • In a possible implementation, the method further includes: modifying the quantization parameter based on the one or more rate control parameters.
  • This provides a way to modify the quantization parameter after it is determined, which helps improve the quality of the decoded image.
  • In a possible implementation, the method further includes: updating the one or more rate control parameters.
  • This provides a way to update the rate control parameters, adaptively adjusting them for the image blocks in the image to be decoded, which helps improve the flexibility of the scheme.
  • In a second aspect, an embodiment of the present application provides a video decoding device, which has the function of implementing the video decoding method in any one of the foregoing first aspects.
  • This function may be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software includes one or more modules corresponding to the above function.
  • In a third aspect, a video decoding device is provided, including a processor and a memory. The memory is used to store computer-executable instructions, and when the video decoding device is running, the processor executes the computer-executable instructions stored in the memory, so that the video decoding device executes the video decoding method in any one of the above first aspects.
  • In a fourth aspect, a computer-readable storage medium is provided, storing instructions that, when run on a computer, enable the computer to execute the video decoding method in any one of the above first aspects.
  • In a fifth aspect, a computer program product including instructions is provided, which, when run on a computer, enables the computer to execute the video decoding method in any one of the above first aspects.
  • In a sixth aspect, an electronic device is provided, including a video decoding apparatus whose processing circuit is configured to execute the video decoding method in any one of the above first aspects.
  • In a seventh aspect, a chip is provided, including a processor coupled to a memory that stores program instructions; when the program instructions stored in the memory are executed by the processor, the video decoding method in any one of the above first aspects is implemented.
  • FIG. 1 is a system architecture diagram of a codec system provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a video encoder provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a video decoder provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of video decoding provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a video decoder provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a video decoding method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of image block location information provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the sigmoid function provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of determining parameter initialization provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an encoding device provided by an embodiment of the present application.
  • Video sequences have a series of redundant information such as spatial redundancy, temporal redundancy, visual redundancy, information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy.
  • Therefore, video coding technology is proposed to reduce storage space and save transmission bandwidth.
  • Video coding technology is also called video compression technology.
  • Video compression coding standards include, but are not limited to: MPEG-2 and Advanced Video Coding (AVC, MPEG-4 Part 10) formulated by the Moving Picture Experts Group (MPEG), and H.263, H.264, and H.265 (also known as the High Efficiency Video Coding (HEVC) standard) developed by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T).
  • the basic processing unit in the process of video compression encoding is an image block, which is obtained by dividing a frame/image at the encoding end.
  • HEVC defines Coding Tree Unit (CTU), Coding Unit (CU), Prediction Unit (PU) and Transform Unit (TU).
  • CTUs, CUs, PUs and TUs can all be used as image blocks obtained after division. Both PU and TU are divided based on CU.
  • A pixel is the smallest complete sample of a video or image, so the data processing of an image block is performed in units of pixels. That is, each pixel records color information. For example, color can be represented by RGB, comprising three image channels: R represents red, G represents green, and B represents blue. As another example, color can be represented by YUV, also comprising three image channels: Y represents luminance, U represents the first chrominance Cb, and V represents the second chrominance Cr. Since people are more sensitive to luminance than to chrominance, storage space can be reduced by storing more luminance samples and fewer chrominance samples. Specifically, in video coding and decoding, video sampling is usually performed in YUV format, including the 420 sampling format, the 444 sampling format, and the like. The sampling format determines the number of samples for the two chrominance channels based on the number of luminance samples. For example, assume a CU has 4×2 pixels, with the first row of samples denoted Y0-Y3/U0-U3/V0-V3 and the second row denoted Y4-Y7/U4-U7/V4-V7.
  • The 420 sampling format means that YUV is sampled in a 4:2:0 pattern: luminance and the first or second chrominance are selected in a 4:2 ratio, and the first and second chrominance are selected alternately row by row.
  • For the CU above, sampling selects the luminance Y0-Y3 of the first row together with the first chrominance U0 and U2, and the luminance Y4-Y7 of the second row together with the second chrominance V4 and V6.
  • After sampling, the CU consists of a luma coding unit, Y0 Y1 Y2 Y3 / Y4 Y5 Y6 Y7,
  • and chroma coding units U0 U2 and V4 V6.
  • The 444 sampling format means that YUV is sampled in a 4:4:4 pattern: luminance, the first chrominance, and the second chrominance are all selected in a 4:4:4 ratio. The sampled luma coding unit of the CU above is again Y0 Y1 Y2 Y3 / Y4 Y5 Y6 Y7,
  • and the chroma coding units are U0 U1 U2 U3 / U4 U5 U6 U7 and V0 V1 V2 V3 / V4 V5 V6 V7.
  • The luma coding unit and chroma coding units obtained through sampling are used as the data units for subsequent coding processing.
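The 4:2:0 and 4:4:4 selections described above for the 4×2 CU can be reproduced in a short sketch. The helper names are hypothetical; only the selection pattern (all luma kept; chroma rows alternating between Cb and Cr and subsampled 2:1 horizontally for 4:2:0) follows the text.

```python
def sample_420(luma, cb, cr):
    """Select samples per the 4:2:0 pattern described in the text:
    every luma sample, Cb from the first row (every other column),
    Cr from the second row (every other column)."""
    out_luma = [row[:] for row in luma]
    out_cb = [cb[0][0::2]]   # U0, U2 from the first row
    out_cr = [cr[1][0::2]]   # V4, V6 from the second row
    return out_luma, out_cb, out_cr

def sample_444(luma, cb, cr):
    """4:4:4 keeps every luma and chroma sample unchanged."""
    return luma, cb, cr

Y = [[f"Y{i}" for i in range(4)], [f"Y{i}" for i in range(4, 8)]]
U = [[f"U{i}" for i in range(4)], [f"U{i}" for i in range(4, 8)]]
V = [[f"V{i}" for i in range(4)], [f"V{i}" for i in range(4, 8)]]
y, cb, cr = sample_420(Y, U, V)  # cb keeps U0/U2, cr keeps V4/V6
```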
  • FIG. 1 shows the structure of a video codec system.
  • the video codec system includes a source device 10 and a destination device 11 .
  • the source device 10 generates encoded video data.
  • The source device 10 may also be called a video encoding device or a video encoding apparatus.
  • the destination device 11 may decode the encoded video data generated by the source device 10.
  • The destination device 11 may also be called a video decoding device or a video decoding apparatus.
  • Source device 10 and/or destination device 11 may include at least one processor and memory coupled to the at least one processor.
  • The memory may include, but is not limited to, a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), a flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures accessible by a computer; the present application does not specifically limit this.
  • Source device 10 and destination device 11 may include a variety of devices, including electronic devices such as desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, or the like.
  • Link 12 may include one or more media and/or devices capable of moving encoded video data from source device 10 to destination device 11 .
  • link 12 may include one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 11 in real-time.
  • the source device 10 can modulate the encoded video data, and the modulated video data can be transmitted to the destination device 11.
  • the above one or more communication media may include wireless and/or wired communication media, for example: radio frequency (Radio Frequency, RF) spectrum, one or more physical transmission lines.
  • The one or more communication media described above may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the Internet).
  • the one or more communication media described above may include routers, switches, base stations, or other devices that enable communication from source device 10 to destination device 11 .
  • the encoded video data may be output from the output interface 103 to the storage device 13 .
  • encoded video data may be accessed from the storage device 13 through the input interface 113 .
  • the storage device 13 may include various local access data storage media, such as Blu-ray Disc, Digital Video Disc (DVD), Compact Disc Read-Only Memory (CD-ROM), flash memory, or other suitable digital storage media for storing encoded video data.
  • storage device 13 may correspond to a file server or another intermediate storage device that stores encoded video data generated by source device 10 .
  • destination device 11 may obtain its stored video data from storage device 13 via streaming or downloading.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to destination device 11 .
  • A file server may include a World Wide Web (Web) server (e.g., for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, or a local disk drive.
  • Destination device 11 may access the encoded video data through any standard data connection, such as an Internet connection.
  • Example types of data connections include wireless channels suitable for accessing encoded video data stored on a file server, wired connections (e.g., a cable modem), or a combination of both.
  • the encoded video data can be transmitted from the file server by streaming, download transmission or a combination of both.
  • the decoding method of the present application is not limited to wireless application scenarios.
  • the decoding method of the present application can be applied to video codecs supporting the following multiple multimedia applications: over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • a video codec system can be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • the video codec system shown in FIG. 1 is only an example of a video codec system, and is not a limitation of the video codec system in this application.
  • the codec method provided in this application is also applicable to a scenario where there is no data communication between the encoding device and the decoding device.
  • the video data to be encoded or the encoded video data may be retrieved from local storage, streamed over a network, or the like.
  • the video encoding device can encode the video data to be encoded and store the encoded video data in the memory, and the video decoding device can also obtain the encoded video data from the memory and decode the encoded video data.
  • As shown in FIG. 1, the source device 10 includes a video source 101, a video encoder 102, and an output interface 103.
  • The output interface 103 may include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 101 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer graphics system to generate video data, or a combination of such sources of video data.
  • Video encoder 102 may encode video data from video source 101 .
  • source device 10 transmits the encoded video data directly to destination device 11 via output interface 103 .
  • the encoded video data may also be stored on the storage device 13 for later access by the destination device 11 for decoding and/or playback.
  • the destination device 11 includes a display device 111 , a video decoder 112 and an input interface 113 .
  • input interface 113 includes a receiver and/or a modem.
  • Input interface 113 may receive encoded video data via link 12 and/or from storage device 13 .
  • the display device 111 may be integrated with the destination device 11 or may be external to the destination device 11 . In general, the display device 111 displays the decoded video data.
  • the display device 111 may include various display devices, for example, a liquid crystal display, a plasma display, an organic light emitting diode display, or other types of display devices.
  • video encoder 102 and video decoder 112 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams.
  • the video encoder 102 and the video decoder 112 may include a microprocessor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA), discrete logic circuits, hardware or any combination thereof.
  • the video encoder 102 and the video decoder 112 in this application may operate according to a video compression standard (such as HEVC), or may also operate according to other industry standards, which is not specifically limited in this application.
  • FIG. 2 is a schematic block diagram of the video encoder 102 in the embodiment of the present application.
  • the video encoder 102 can respectively perform prediction, transformation, quantization and entropy coding processes in the prediction module 21 , the transformation module 22 , the quantization module 23 and the entropy coding module 24 .
  • the video encoder 102 also includes a preprocessing module 20 and a summer 202, wherein the preprocessing module 20 includes a segmentation module and a code rate control module.
  • the video encoder 102 also includes an inverse quantization module 25 , an inverse transformation module 26 , a summer 201 and a reference image memory 27 .
  • video encoder 102 receives video data, and a segmentation module in pre-processing module 20 segments the data into raw blocks.
  • This partitioning may also include partitioning into slices, image blocks, or other larger units, as well as video block partitioning based on, for example, largest coding units (LCUs) and a quadtree structure of CUs.
  • video encoder 102 encodes components of video blocks within a video slice to be encoded.
  • a slice may be divided into a number of original blocks (and possibly into a collection of original blocks called tiles).
  • the sizes of CUs, PUs, and TUs are typically determined in the partitioning module.
  • The code rate control module in the preprocessing module 20 obtains the sizes of the CUs, PUs, and TUs, and the input parameters of the video data, where the input parameters include the resolution of the images in the video data, the image format, the bit width, the input bpp (bits per pixel, i.e., pixel depth), and the target bpp.
  • The code rate control module generates code rate control parameters according to the input parameters and the image block sizes, and the code rate control parameters are used to generate quantization parameters so that the quantization module 23 and the inverse quantization module 25 can perform the related calculations.
  • The code rate control module may also update the code rate control parameters according to the reconstructed block obtained by the summer 201 through reconstruction.
  • the prediction module 21 may provide the predicted block to the summer 202 to generate a residual block, and provide the predicted block to the summer 201 to obtain a reconstructed block, which is used as a reference pixel for subsequent prediction.
  • Specifically, the video encoder 102 subtracts the pixel values of the prediction block from the pixel values of the original block to form pixel difference values, which constitute the residual block; the data in the residual block may include luminance differences and chrominance differences.
  • The summer 202 represents the one or more components that perform this operation.
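Residual formation as described above (original pixel values minus predicted pixel values, element by element) can be sketched as follows; the function name is illustrative.

```python
def residual_block(original, prediction):
    """Residual = original pixel values minus predicted pixel values,
    computed per pixel over a 2D block."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]

# A 2x2 example: small differences remain where prediction was close.
res = residual_block([[100, 102], [98, 97]], [[99, 100], [100, 95]])
# res == [[1, 2], [-2, 2]]
```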
  • the prediction module 21 can also send the related syntax elements to the entropy encoding module 24 for merging the syntax elements into the code stream.
  • Transform module 22 may divide the residual block into one or more TUs for transformation. Transform module 22 may convert the residual block from the pixel domain to a transform domain (e.g., the frequency domain); for example, the residual block is transformed into transform coefficients using the discrete cosine transform (DCT) or the discrete sine transform (DST). Transform module 22 may then send the resulting transform coefficients to the quantization module 23.
  • The quantization module 23 quantizes the transform coefficients to obtain quantized coefficients, further reducing the code rate.
  • The quantization process may reduce the bit depth associated with some or all of the coefficients.
  • The degree of quantization can be modified by adjusting the quantization parameter.
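As a concrete illustration of how the quantization parameter controls the degree of quantization, the sketch below uses the HEVC-style step size Qstep = 2^((QP-4)/6). The application itself does not fix a particular quantizer, so this mapping is an assumption made only for the example.

```python
def quantize(coeffs, qp):
    """Uniform quantization sketch: divide by the step size and round.
    Larger QP -> larger step -> coarser coefficients and fewer bits."""
    qstep = 2 ** ((qp - 4) / 6)
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qp):
    """Inverse quantization: multiply levels back by the same step size."""
    qstep = 2 ** ((qp - 4) / 6)
    return [l * qstep for l in levels]

# At QP 22 the step size is exactly 8, so a coefficient of 64 maps to
# level 8 and reconstructs losslessly, while smaller coefficients lose
# precision (the source of quantization distortion mentioned above).
levels = quantize([64, -17, 5], 22)
```

The rounding loss in `levels` is exactly the image-distortion risk the background section attributes to quantization.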
  • The quantization module 23 may then scan the matrix comprising the quantized transform coefficients.
  • Alternatively, the entropy encoding module 24 may perform the scan.
  • The entropy encoding module 24 may entropy encode the quantized coefficients.
  • The entropy coding module 24 may perform Context-Adaptive Variable-Length Coding (CAVLC), Context-based Adaptive Binary Arithmetic Coding (CABAC), Syntax-based Context-Adaptive Binary Arithmetic Coding (SBAC), Probability Interval Partitioning Entropy (PIPE) coding, or other entropy coding methods or techniques.
  • The inverse quantization module 25 and the inverse transformation module 26 apply inverse quantization and inverse transformation respectively, and the summer 201 adds the inverse-transformed residual block to the prediction block to generate a reconstructed block, which serves as reference pixels for subsequent prediction of original blocks.
  • This reconstructed block is stored in the reference image memory 27 .
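The reconstruction performed by the summers 201 and 301 (prediction plus inverse-transformed residual, per pixel) can be sketched as follows; the function name is illustrative.

```python
def reconstruct(prediction, inv_residual):
    """Reconstructed block = prediction block + inverse-transformed
    residual block, computed per pixel (the role of summers 201/301)."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, inv_residual)]

# Undoing the earlier subtraction recovers the original values exactly
# when no quantization loss occurred in between.
recon = reconstruct([[99, 100], [100, 95]], [[1, 2], [-2, 2]])
# recon == [[100, 102], [98, 97]]
```

Both encoder and decoder run this same summation, which is why the reconstructed block can serve as identical reference pixels on both sides.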
  • FIG. 3 is a schematic structural diagram of the video decoder 112 in the embodiment of the present application.
  • The video decoder 112 contains an entropy decoding module 30, a prediction module 31, an inverse quantization module 32, an inverse transformation module 33, a summer 301, and a reference image memory 34.
  • the entropy decoding module 30 includes an analysis module and a code rate control module.
  • The video decoder 112 may perform a decoding process that is generally reciprocal to the encoding process described for the video encoder 102 in FIG. 2.
  • video decoder 112 receives a codestream of encoded video from video encoder 102 .
  • the parsing module in the entropy decoding module 30 of the video decoder 112 performs entropy decoding on the code stream to generate quantization coefficients and syntax elements.
  • the entropy decoding module 30 forwards the syntax elements to the prediction module 31 .
  • Video decoder 112 may receive syntax elements at the video slice level and/or the video block level.
  • the code rate control module in the entropy decoding module 30 generates code rate control parameters according to the information of the image to be decoded obtained by the analysis module, and the code rate control parameters are used to generate quantization parameters so that the inverse quantization module 32 performs correlation calculations.
  • the code rate control module can also update the code rate control parameters according to the reconstructed block obtained by the summer 301 through reconstruction.
  • the inverse quantization module 32 inverse quantizes (i.e., de-quantizes) the quantized coefficients provided in the code stream and decoded by the entropy decoding module 30, using the generated quantization parameters.
  • the inverse quantization process may include determining the degree of quantization using quantization parameters calculated by video encoder 102 for each video block in the video slice, and likewise determining the degree of inverse quantization applied.
  • the inverse transform module 33 applies inverse transform (for example, an inverse transform method corresponding to transform methods such as DCT and DST) to the inversely quantized transform coefficients, and generates inversely transformed residual blocks in the pixel domain according to the inversely transformed transform coefficients.
  • the size of the inverse transformation unit is the same as that of the TU, and the inverse transformation corresponds to the forward transformation used in the same transformation method; for example, the inverse transformation for DCT or DST is inverse DCT, inverse DST, or a conceptually similar inverse transformation process.
  • video decoder 112 forms a decoded video block by summing the inverse transformed residual block from inverse transformation module 33 with the prediction block.
  • Summer 301 represents one or more components that perform this summation operation.
  • a deblocking filter may also be applied to filter the decoded blocks in order to remove blocking artifacts.
  • the decoded video blocks in a given frame or picture are stored in reference picture memory 34 as reference pixels for subsequent prediction.
  • FIG. 4 is a schematic flowchart of a video encoding/decoding provided by this application.
  • the implementation of video encoding/decoding includes process 1 to process 5, and process 1 to process 5 can be executed by any one or more of the above-mentioned source device 10, video encoder 102, destination device 11, or video decoder 112.
  • Process 1 Divide a frame of image into one or more non-overlapping parallel units.
  • the one or more parallel units have no dependencies, and can be completely parallel/independently encoded and decoded, such as parallel unit 1 and parallel unit 2 shown in FIG. 4 .
  • Process 2 For each parallel unit, it can be divided into one or more independent units that do not overlap with each other. Each independent unit can be independent of each other, but can share some parallel unit header information.
  • the independent unit may include three components of luma Y, first chroma Cb, and second chroma Cr, or three RGB components, or only one of them. If the independent unit contains three components, the sizes of the three components can be exactly the same or different, depending on the input format of the image.
  • each independent unit can be divided into one or more non-overlapping coding units.
  • Each coding unit in an independent unit can depend on each other. For example, multiple coding units can perform mutual reference precoding and decoding.
  • if the size of the coding unit is the same as that of the independent unit (that is, the independent unit is divided into only one coding unit), its size can be any of the sizes described in Process 2.
  • the coding unit may include three components of luma Y, first chroma Cb, and second chroma Cr (or RGB three components), or may only include one of them. If it contains three components, the sizes of several components can be exactly the same or different, depending on the image input format.
  • process 3 is an optional step in the video encoding and decoding method, and the video encoder/decoder can encode/decode the residual coefficient (or residual value) of the independent unit obtained in the process 2.
  • PG Prediction Group
  • PG can also be referred to as Group
  • each PG is encoded and decoded according to the selected prediction mode to obtain the predicted value of the PG, and the predicted values of all PGs form the predicted value of the entire coding unit;
  • based on the predicted value and the original value of the coding unit, the residual value of the coding unit is obtained.
  • Process 5 Based on the residual value of the coding unit, the coding unit is grouped to obtain one or more non-overlapping residual blocks (residual block, RB).
  • the residual coefficients of each RB are encoded and decoded according to the selected mode to form a residual coefficient stream. Specifically, it can be divided into two types: transforming the residual coefficients and not transforming them.
  • the selection mode of the residual coefficient encoding and decoding method in Process 5 may include, but is not limited to, any of the following: semi-fixed-length encoding, exponential-Golomb encoding, Golomb-Rice encoding, truncated unary code encoding, run-length encoding, direct encoding of the original residual value, etc.
  • a video encoder may directly encode coefficients within RBs.
  • the video encoder may also perform transformation on the residual block, such as DCT, DST, Hadamard transformation, etc., and then encode the transformed coefficients.
  • the video encoder may directly uniformly quantize each coefficient in the RB, and then perform binarization coding. If the RB is large, it can be further divided into multiple coefficient groups (coefficient groups, CG), and then each CG is uniformly quantized, and then binarized and encoded.
  • CG coefficient group
  • QG quantization group
  • the maximum value of the absolute value of the residual within an RB block is defined as the modified maximum value (mm).
  • the number of coded bits of the residual coefficients in the RB block is determined from the CL (the number of coded bits is the same for all residual coefficients within one RB block). For example, if the critical limit (CL) of the current RB block is 2 and the current residual coefficient is 1, then 2 bits are required to encode the residual coefficient 1, expressed as 01. If the CL of the current RB block is 7, an 8-bit residual coefficient and a 1-bit sign bit are encoded.
  • the determination of CL is to find the minimum value M such that all residuals of the current sub-block lie within the range [-2^(M-1), 2^(M-1)]. If both boundary values -2^(M-1) and 2^(M-1) are present, M is increased by 1, that is, M+1 bits are required to encode all residuals of the current RB block; if only one of the two boundary values -2^(M-1) and 2^(M-1) is present, a trailing bit needs to be encoded to determine whether the boundary value is -2^(M-1) or 2^(M-1); if neither -2^(M-1) nor 2^(M-1) is present among the residuals, the trailing bit does not need to be encoded.
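As an illustrative sketch (not part of the patent text), the CL search described above can be written as follows in Python; the function name and the returned (bits, trailing-bit) convention are assumptions made here for clarity:

```python
def determine_cl(residuals):
    """Sketch of the CL (critical limit) search for semi-fixed-length coding.

    Finds the minimum M such that every residual of the block lies in
    [-2^(M-1), 2^(M-1)], then applies the boundary-value adjustments
    described in the text. Returns (bits_per_coefficient, needs_trailing_bit).
    """
    m = 1
    while any(r < -(2 ** (m - 1)) or r > 2 ** (m - 1) for r in residuals):
        m += 1
    lo, hi = -(2 ** (m - 1)), 2 ** (m - 1)
    has_lo = lo in residuals
    has_hi = hi in residuals
    if has_lo and has_hi:
        return m + 1, False      # both boundaries occur: widen to M+1 bits
    if has_lo or has_hi:
        return m, True           # one boundary occurs: signal it with a trailing bit
    return m, False              # no boundary value: no trailing bit needed
```

For example, the block [1, 0, -1] contains both boundary values of the M=1 range [-1, 1], so M is increased and 2 bits per coefficient are used.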
  • the video encoder can also directly encode the original value of the image instead of the residual value.
  • the above-mentioned video encoder 102 and video decoder 112 can also be realized in another form of implementation, for example, by using a general-purpose digital processor system, such as the codec device 50 shown in FIG. 5 .
  • the codec device 50 may be applied on the encoding side or on the decoding side.
  • the codec device 50 includes a processor 501 and a memory 502 .
  • the processor 501 is connected to the memory 502 (for example, connected to each other through a bus 504 ).
  • the codec device 50 may further include a communication interface 503 connected to the processor 501 and the memory 502 for receiving/sending data.
  • the memory 502 may be a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM) or a compact disc (Compact Disc Read-Only Memory, CD-ROM).
  • RAM Random Access Memory
  • ROM read-only memory
  • EPROM erasable programmable read-only memory
  • CD-ROM Compact disc Read-Only Memory
  • Processor 501 may be one or more central processing units (Central Processing Unit, CPU), such as CPU 0 and CPU 1 shown in FIG. 5 .
  • CPU Central Processing Unit
  • the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 501 is configured to read the program codes stored in the memory 502, and execute operations of any one of the implementations corresponding to FIG. 6 and various feasible implementations thereof.
  • FIG. 6 is a flowchart of a video decoding method provided by this application. The method includes the following steps.
  • the video decoder acquires one or more rate control parameters of a current block in an image to be decoded.
  • the image to be decoded is composed of multiple image blocks, and the current block is any one of the multiple image blocks.
  • the rate control parameters are used to generate a quantization parameter, and the quantization parameter is applied to the inverse quantization process during decoding, as described above for the inverse quantization module 32 in FIG. 3 .
  • the code rate control parameters include the resource adequacy of the current block, the complexity of the current block, the number of predicted bits of the current block, the average quantization parameter of the current block, and the variation value of the quantization parameter.
  • Resource adequacy refers to the degree of adequacy of the resources used to store the current block among the resources used to store the image to be decoded.
  • the complexity of the current block refers to the complexity of the decoded image block in the image to be decoded.
  • the number of predicted bits of the current block refers to an estimate of the resources that the current block will occupy after decoding.
  • the average quantization parameter of the current block is used to represent the average degree of the quantization parameters of the decoded image blocks in the image to be decoded.
  • the variation value of the quantization parameter refers to the variation value of the quantization parameter between decoded image blocks in the image to be decoded.
  • the method further includes that the video decoder obtains information of the image to be decoded, and the information of the image to be decoded includes basic information of the image to be decoded and information of image blocks in the image to be decoded.
  • the basic information includes one or more of the following information.
  • the resolution of the image to be decoded is used to indicate the number of pixels contained in the image to be decoded, usually composed of horizontal pixels and vertical pixels.
  • 640×480 means 640 pixels horizontally and 480 pixels vertically.
  • the resolution of such an image is 307,200 pixels, which can also be expressed as approximately 300,000 pixels.
  • the format of the image to be decoded refers to the sampling format of the image to be decoded, and reference may be made to the above description about video sampling.
  • the bit width of the image to be decoded refers to the depth of each pixel used to store the image to be decoded.
  • each pixel of a color image is represented by three components of R, G, and B. If each component is represented by 8 bits, then one pixel is represented by 24 bits, that is to say, the depth of the pixel is 24.
  • the information of the image block in the image to be decoded includes one or more of the following information.
  • the input pixel depth (bpp) of the image to be decoded refers to the depth of each pixel of the image to be decoded. For example, it is expected that the image is transmitted in the code stream with each pixel occupying 18 bits. If the image is represented by three RGB components, each component occupies 6 bits, which saves 2 bits compared to the above-mentioned bit width for storing the image. It should be noted that the bpp is usually less than or equal to the bit width, and the bpp is usually determined in the encoding stage and written into the code stream for acquisition by the decoding end.
  • the size information (block_size) of the image blocks in the image to be decoded, where the sizes of the image blocks can be the same or different; correspondingly, the size information of all image blocks in the image to be decoded can be calculated based on this information, or obtained directly from the code stream, which is not limited in this application.
  • for the size of the image block, refer to the above description about FIG. 2 , for example, one or more sizes of CU, PU, and TU, which may specifically be 8×8, 16×8, or other sizes.
  • the location information of the current block refers to the location information of the current block in the image to be encoded.
  • the size of the image to be encoded is 8×4, and it can be divided into 8 small blocks according to the image block size of 2×2.
  • the position of the current block in the image to be encoded can be expressed as (1, 1) in the form of coordinates.
  • a reconstructed image block which may also be called a reconstructed block, refers to an image block formed by reconstructing a plurality of pixels from the inversely transformed residual block and the predicted residual block, such as the reconstructed block shown in FIG. 3 .
  • the resources of the image to be decoded and the resources occupied by each image block in the image to be decoded can be obtained according to the above basic information and the information of each image block, and the resources can be represented by part or all of the following information: the number of remaining bits (remain_bits) of the current block refers to the bit resources that are not occupied by the decoded image blocks in the resources storing the image to be decoded;
  • the number of remaining blocks (remain_blocks) of the current block refers to the number of undecoded image blocks in the image to be decoded
  • the actual number of bits refers to the bit resource occupied by each decoded image block in the resource for storing the image to be decoded.
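The bookkeeping behind the number of remaining bits (remain_bits), the number of remaining blocks (remain_blocks), and the actual number of bits (real_bits) can be sketched as below; the class and method names are hypothetical, chosen only to show how the three quantities evolve as blocks are decoded:

```python
class ResourceTracker:
    """Illustrative tracker for the bit budget of the image to be decoded."""

    def __init__(self, total_bits, total_blocks):
        self.remain_bits = total_bits      # bits not yet consumed by decoded blocks
        self.remain_blocks = total_blocks  # image blocks not yet decoded
        self.real_bits = []                # actual bits of each decoded block

    def on_block_decoded(self, bits_used):
        # record the actual number of bits and shrink the remaining budget
        self.real_bits.append(bits_used)
        self.remain_bits -= bits_used
        self.remain_blocks -= 1
```

For example, an image budgeted at 1000 bits over 10 blocks has 920 bits and 9 blocks remaining after the first block consumes 80 bits.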
  • a video decoder can obtain the basic information of the image to be decoded by obtaining the header information of the code stream.
  • the basic information refers to information such as the format, resolution, and size of the image file.
  • the code stream also includes information about the image block in the image to be decoded, such as the size of the image block and bpp.
  • the video decoder determines the code rate control parameters according to part or all of the image information to be decoded. Specifically, the following 5 situations are included.
  • the video decoder determines the resource sufficiency (sufficiency) of the current block according to the information of the image to be decoded.
  • the video decoder determines resource adequacy according to the number of remaining bits (remain_bits), the number of remaining blocks (remain_blocks), the size information of the current block (block_size) and the input pixel depth bpp.
  • the video decoder determines resource adequacy according to the number of remaining bits, the number of remaining blocks and the position information of the current block.
  • the video decoder determines the position of the current block in the image to be decoded and, further, the position of the current block within a parallel unit of the image to be decoded, where the parallel unit may be a slice and/or a group processing unit. Specifically, if the current block is in the first 1/n position of the parallel unit, the first weight coefficient w1 is set to 1+a; if the current block is in the middle 1/n position, w1 is set to 1; if the current block is in the last 1/n position, w1 is set to 1-a, where 0 < a < 1.
  • a increases as the distance between the front 1/n and the back 1/n relative to the middle 1/n increases.
  • the video decoder may use the product of the weight coefficients of the respective parallel units as the first weight coefficient.
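A minimal sketch of the position-dependent first weight coefficient w1, assuming n = 3 (thirds of the parallel unit) and a concrete value of a; both choices are illustrative, since the text only constrains 0 < a < 1:

```python
def position_weight(block_index, blocks_in_unit, a=0.2):
    """Sketch of w1: 1+a in the first third of the parallel unit,
    1 in the middle third, 1-a in the last third (n=3 assumed)."""
    third = blocks_in_unit / 3
    if block_index < third:
        return 1 + a
    if block_index < 2 * third:
        return 1.0
    return 1 - a
```

When both the slice position and the group-processing-unit position are considered, the two weights can be multiplied, as stated above.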
  • Case 2 The video decoder determines the complexity of the current block according to the reconstructed image block in the information of the decoded image block.
  • the video decoder performs weighting according to the transformation coefficients of the first n reconstructed blocks of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • the number of decoded image blocks can be obtained by parsing the code stream, or calculated according to the number of remaining blocks and the number of image blocks of the image to be decoded.
  • the transformation method may be DCT transformation, Hadamard transformation or DST transformation.
  • C_i represents the transformation coefficient of the i-th reconstructed block
  • W_i represents the weight coefficient of the transformation coefficient of the i-th reconstructed block.
  • for the DCT transformation, the result includes a DC coefficient and AC coefficients
  • the DC coefficient represents the transformation coefficient of the first image block in the decoded image block
  • the AC coefficient represents the transformation coefficient of other image blocks in the decoded image block.
  • the weight coefficient of the DC coefficient is set to 0, and the weight coefficient of other AC coefficients is set to 1. The above selection method helps to simplify the calculation process.
  • n is a preset fixed value, or n may be set with reference to the resource adequacy mentioned above: when resources are sufficient, n takes a smaller value, and when resources are insufficient, n takes a larger value.
  • the video decoder takes the variance of the pixel values of the first n reconstructed blocks of the current block as the complexity, where 1 ≤ n ≤ the number of decoded image blocks.
  • the video decoder performs weighting according to the gradients of the first n reconstructed blocks of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • Horizontal gradient (gradx) = reconstruction of column t - reconstruction of column t-1;
  • Horizontal complexity (complexity1) = sum of horizontal gradients (gradx) / grad_block_size
  • "reconstruction of column t - reconstruction of column t-1" means that the sample values of column t and column t-1 are subtracted pairwise
  • "reconstruction of row s - reconstruction of row s-1" means that the sample values of row s and row s-1 are subtracted pairwise
  • grad_block_size is the number of pixels of the reconstruction block involved in the gradient calculation.
  • for example, the current reconstruction block is 4×2.
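The gradient-based complexity can be sketched as below for a small reconstruction block; absolute differences are assumed here, since the text only states "sum of gradients / grad_block_size":

```python
def gradient_complexity(block):
    """Sketch of the gradient complexity measure.

    block is a list of rows of reconstructed pixel values (e.g. a 4x2
    reconstruction block). Returns (horizontal, vertical) complexity.
    """
    rows, cols = len(block), len(block[0])
    # horizontal gradient: column t minus column t-1, summed over the block
    gradx = sum(abs(block[r][c] - block[r][c - 1])
                for r in range(rows) for c in range(1, cols))
    # vertical gradient: row s minus row s-1, summed over the block
    grady = sum(abs(block[r][c] - block[r - 1][c])
                for r in range(1, rows) for c in range(cols))
    grad_block_size = rows * cols  # pixels involved in the gradient computation
    return gradx / grad_block_size, grady / grad_block_size
```

A block whose columns alternate in value has nonzero horizontal complexity and zero vertical complexity.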
  • the video decoder performs sharpening according to the first n reconstructed blocks of the current block and weights the sharpening results, where 1 ≤ n ≤ the number of decoded image blocks.
  • Common sharpening algorithms include Robert, Prewitt, Sobel, Laplacian, and Kirsch.
  • P represents all pixel values of the reconstructed block
  • A1 and A2 represent the horizontal operator and vertical operator of any of the above sharpening algorithms, respectively
  • A1*P represents the application of the A1 operator on P.
  • Case 3 The video decoder determines the number of predicted bits (pred_bits) of the current block according to the information of the image to be decoded.
  • the video decoder obtains the number of predicted bits according to the number of remaining bits and the number of remaining blocks.
  • pred_bits = remain_bits / remain_blocks.
  • the video decoder obtains the number of predicted bits by weighting according to the actual number of bits (real_bits) of the decoded image block.
  • the video decoder performs weighting according to the actual number of bits of the first n decoded image blocks, where 1 ≤ n ≤ the number of decoded image blocks.
  • pred_bits = B1×real_bits1 + B2×real_bits2 + … + Bn×real_bitsn;
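The weighted prediction pred_bits = B1×real_bits1 + … + Bn×real_bitsn can be sketched as follows; a uniform weighting (a simple average) is assumed when no coefficients are supplied, since the patent leaves B1..Bn open:

```python
def predicted_bits(real_bits_history, weights=None):
    """Sketch of pred_bits as a weighted sum of the actual bit counts
    of the first n decoded image blocks. Uniform weights are assumed
    by default (illustrative choice)."""
    n = len(real_bits_history)
    if weights is None:
        weights = [1.0 / n] * n  # simple average
    return sum(w * b for w, b in zip(weights, real_bits_history))
```

With two previous blocks of 100 and 200 bits and uniform weights, the prediction is their average, 150 bits.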
  • a code rate control parameter may also be calculated according to at least one other code rate control parameter.
  • for example, the following three situations are included:
  • the video decoder determines resource adequacy based on the number of predicted bits and complexity
  • the video decoder determines the resource adequacy according to the information of the current block (the number of remaining bits, the number of remaining blocks) and the number of predicted bits.
  • the video decoder determines resource adequacy based on the complexity of the current block, the number of predicted bits, and the maximum complexity (max_complexity) of the first n decoded image blocks, combined with the size information of the current block and the input pixel depth, where 1 ≤ n ≤ the number of decoded image blocks.
  • the video decoder determines the complexity according to the number of predicted bits and resource adequacy
  • the video decoder performs weighting according to the complexity of the first n decoded image blocks, where 1 ≤ n ≤ the number of decoded image blocks;
  • the video decoder determines the complexity of the current block according to the complexity (pre_complexity) of the previous n decoded image blocks and the location information of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • the video decoder determines the position of the current block in the image to be decoded and, further, the position of the current block within a parallel unit of the image to be decoded, where the parallel unit may be a slice and/or a group processing unit. If the current block is in the first 1/n position of the parallel unit, the second weight coefficient w2 is set to 1+a; if the current block is in the middle 1/n position, w2 is set to 1; if the current block is in the last 1/n position, w2 is set to 1-a, where 0 < a < 1.
  • a increases as the distance between the front 1/n and the back 1/n relative to the middle 1/n increases.
  • the video decoder may use the product of the respective weight coefficients as the second weight coefficient.
  • the video decoder determines the complexity of the current block according to the number of predicted bits (pre_pred_bits) of the first n decoded image blocks, the resource adequacy (pre_sufficiency) of the first n decoded image blocks, and the number of predicted bits and resource adequacy of the current block.
  • if pred_bits > pre_pred_bits and sufficiency > pre_sufficiency, the current block is considered less complex;
  • if pred_bits < pre_pred_bits and sufficiency < pre_sufficiency, the current block is considered more complex.
  • the video decoder determines the number of predicted bits according to resource adequacy and complexity
  • the video decoder weights the number of predicted bits of the current block according to the number of predicted bits of the first n decoded image blocks, where 1 ≤ n ≤ the number of decoded image blocks.
  • pred_bits = E1×pred_bits1 + E2×pred_bits2 + … + En×pred_bitsn;
  • the video decoder obtains the predicted bit number of the current block by weighting the predicted bit numbers and the actual bit numbers of the first n decoded image blocks, where 1 ≤ n ≤ the number of decoded image blocks.
  • pred_bits = F1×pred_bits1 + … + Fn×pred_bitsn + G1×real_bits1 + … + Gn×real_bitsn
  • the video decoder performs weighting according to the overall complexity (pre_complexity) of the first n decoded image blocks, the number of predicted bits (pre_pred_bits) of the first n decoded image blocks, and the complexity (complexity) of the current block to obtain the number of predicted bits of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • pre_complexity is a variable value to satisfy the calculation of the following formula:
  • pred_bits = clip(complexity/pre_complexity, 1-a, 1+a) × pre_pred_bits;
  • clip(complexity/pre_complexity, 1-a, 1+a) indicates that the value of complexity/pre_complexity is kept within the range (1-a, 1+a) by adjusting pre_complexity, where 0 < a < 1.
  • pre_complexity can be the average complexity of the first n decoded image blocks; or, any one of the complexities of the first n decoded image blocks; or, the maximum value of the complexity of the first n decoded image blocks and other coefficients representing the overall complexity.
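The clip-based prediction above can be sketched in Python as follows; the default a = 0.2 is an assumed value, since the text only constrains 0 < a < 1:

```python
def clip(x, lo, hi):
    """Clamp x into the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def predicted_bits_from_complexity(complexity, pre_complexity, pre_pred_bits, a=0.2):
    """Sketch of pred_bits = clip(complexity/pre_complexity, 1-a, 1+a) * pre_pred_bits.

    The complexity ratio is clamped so the prediction never deviates from
    the previous prediction by more than a factor of (1 +/- a)."""
    ratio = clip(complexity / pre_complexity, 1 - a, 1 + a)
    return ratio * pre_pred_bits
```

For example, with a = 0.5, a current block three times as complex as the history still only scales the prediction by the clamped factor 1.5.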
  • the video decoder obtains the predicted number of bits of the current block according to the overall complexity of the first n decoded image blocks, the actual number of bits, and the complexity of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • pre_complexity is a variable value to satisfy the calculation of the following formula
  • pred_bits = clip(complexity/pre_complexity, 1-a, 1+a) × real_bits
  • the video decoder obtains an empirical value according to the position information and the complexity of the current block, and weights the number of predicted bits of the previous n decoded image blocks to obtain the number of predicted bits of the current block, where 1 ≤ n ≤ the number of decoded image blocks.
  • the video decoder determines the position of the current block in the image to be decoded, and further, determines the position of the current block in the parallel unit in the image to be decoded, and the parallel unit may be a slice and/or a group processing unit.
  • if the current block is in the first 1/n position of the parallel unit and the complexity is smaller than the first preset complexity, the third weight coefficient w3 is set to 1+a; if the current block is in the middle 1/n position of the parallel unit, w3 is set to 1; if the current block is in the last 1/n position of the parallel unit and the complexity is greater than the second preset complexity, w3 is set to 1-a, where 0 < a < 1 and the first preset complexity is smaller than the second preset complexity.
  • a is 0.2
  • the first preset complexity is 2000
  • the second preset complexity is 10000.
  • a increases as the distance between the front 1/n and the back 1/n relative to the middle 1/n increases.
  • the third weight coefficient is different for different parallel units, and when the position of the slice of the current block in the image to be decoded and the position of the group processing unit are considered at the same time, the video decoder can use the product of the respective weight coefficients as the third weight coefficient.
  • Case 4 The video decoder determines the average quantization parameter (average_quantization) according to the information of the image to be decoded.
  • the video decoder adopts any one of the above three situations to obtain the quantization parameters of the first n decoded image blocks, where 1 ≤ n ≤ the number of decoded image blocks; for the quantization parameters of the first n decoded image blocks, refer to the description of step S602 below.
  • average_quantization = sum of the quantization parameters of the first n blocks (total_sum) / n (total_blocks)
  • the video decoder calculates the average quantization parameter according to the position information corresponding to different weights:
  • average_quantization_t = average_quantization_{t-1} × H1 + H2 × pre_quantization
  • t represents the current block
  • t-1 represents the previous decoded image block of the current block
  • average_quantization t-1 represents the average quantization parameter of the previous decoded image block
  • pre_quantization represents the quantization parameter of the previous decoded image block.
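The recursive update of the average quantization parameter can be sketched as below; H1 and H2 are assumed here to sum to 1, which makes the update an exponential moving average, although the patent does not fix their values:

```python
def update_average_qp(prev_average_qp, prev_qp, h1=0.875, h2=0.125):
    """Sketch of average_quantization_t = average_quantization_{t-1} * H1
    + H2 * pre_quantization, with illustrative H1=0.875, H2=0.125."""
    return prev_average_qp * h1 + prev_qp * h2
```

For example, a running average of 16 updated with a previous-block QP of 24 yields 16 × 0.875 + 24 × 0.125 = 17.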
  • the video decoder determines the average quantization parameter according to the complexity of the previous n decoded image blocks and the complexity of the current block.
  • the video decoder divides the complexity of the first n decoded image blocks into k levels, determines the position of the complexity of the current block in the k levels, and calculates the average value of the quantization parameters of this level as the quantization parameter of the current block.
  • level 1 is complexity > 10000
  • level 2 is 1000 < complexity ≤ 10000
  • level 3 is complexity ≤ 1000
  • the video decoder determines the average quantization parameter according to the resource adequacy of the previous n decoded image blocks and the resource adequacy of the current block.
  • the video decoder determines the average quantization parameter according to the number of predicted bits of the previous n decoded image blocks and the number of predicted bits of the current block.
  • methods 4) and 5) are similar to method 3) above: divide the levels and calculate the average value of the quantization parameters of the corresponding level, which will not be repeated here.
  • the video decoder uses the predicted number of bits of the current block and the actual number of bits of the previous block to perform clustering.
  • the above methods 3)-6) mainly use clustering based on certain information of the current block and certain information of the previous N blocks, and average the quantization parameters of blocks in similar classes, wherein different coefficients can also be used for weighted averaging for different blocks in the level to obtain the average quantization parameter.
  • the information used for clustering may be a combination of various information, such as at least two of complexity, resource adequacy, and number of predicted bits.
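The level-based clustering of method 3) above can be sketched as follows: previously decoded blocks are bucketed by complexity, and the quantization parameters of the blocks in the current block's level are averaged. The thresholds follow the example levels given in the text; a plain average is used, though weighted averaging is also possible as noted above:

```python
def average_qp_by_complexity(history, current_complexity, thresholds=(1000, 10000)):
    """Sketch of complexity-level clustering for the average quantization parameter.

    history is a list of (complexity, qp) pairs for decoded blocks.
    Level 1: complexity > 10000; level 2: 1000 < complexity <= 10000;
    level 3: complexity <= 1000. Returns the average QP of the current
    block's level, or None if no decoded block falls in that level.
    """
    lo, hi = thresholds

    def level(c):
        if c > hi:
            return 1
        if c > lo:
            return 2
        return 3

    target = level(current_complexity)
    qps = [qp for c, qp in history if level(c) == target]
    return sum(qps) / len(qps) if qps else None
```

For example, a current block of complexity 900 falls in level 3 and is assigned the average QP of the previously decoded level-3 blocks.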
  • Case 5 The video decoder determines the change value (Δquantization) of the quantization parameter according to the information of the image to be decoded.
  • the video decoder obtains the change value of the quantization parameter according to the accumulation of the difference between the predicted bit number and the actual bit number of the first n decoded image blocks, and the size information of the current block.
  • pred_real_sum represents the accumulation of the differences between the predicted number of bits and the actual number of bits of the first n decoded image blocks.
  • pred_real_sum may be replaced by target_real_sum
  • target_real_sum represents the accumulation of the difference between the target bit number and the actual bit number of the first n decoded image blocks.
  • target bit number refers to the product of the target pixel depth (target bpp) and the size information of the current block.
  • the target pixel depth is in units of block processing units, and is used to represent the pixel depth of the current block to be transmitted, that is, a further defined parameter based on the input bpp of the image to be decoded.
  • target_bits = target_bpp × block_size
  • pred_real_sum may be replaced by target_pred_sum, and the target_pred_sum represents the accumulation of differences between the target number of bits and the number of predicted bits of the first n decoded image blocks.
  • the maximum quantization parameter is represented by max_quantization
  • J1 is -max_quantization/3
  • J2 is max_quantization/3
  • J1 is -4
  • J2 is 4.
  • the video decoder obtains the change value of the quantization parameter by mapping, according to the accumulation of the differences between the target number of bits and the number of predicted bits of the first n decoded image blocks.
  • Δquantization = sigmoid(target_pred_sum/block_size) × J3, where J3 is a parameter.
  • target_pred_sum can also be replaced by pred_real_sum or target_real_sum.
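The sigmoid mapping above can be sketched as follows; J3 = 4.0 is an assumed parameter value, and the plain logistic function is used (a shifted or scaled variant is equally possible under the text):

```python
import math

def delta_qp(target_pred_sum, block_size, j3=4.0):
    """Sketch of delta_quantization = sigmoid(target_pred_sum/block_size) * J3.

    target_pred_sum is the accumulated difference between target bits and
    predicted bits over the first n decoded blocks; block_size normalizes
    it to a per-pixel scale before the sigmoid is applied."""
    x = target_pred_sum / block_size
    return (1.0 / (1.0 + math.exp(-x))) * j3
```

A zero accumulated difference maps to sigmoid(0) = 0.5, i.e. half of J3.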
  • the video decoder determines the change value of the quantization parameter according to the complexity of the current block, the complexity of the previous decoded image block and the maximum quantization parameter.
  • Δpre_quantization represents the change value of the quantization parameter of the previous decoded image block
  • max_Δquantization represents the maximum change value of the quantization parameter of the image blocks in the image to be decoded
  • the maximum value may be a preset fixed value
  • the video decoder sets the change value of the quantization parameter to a fixed value.
  • the video decoder adjusts the change value of the quantization parameter according to the information of the image to be decoded and other code control parameters of the current block.
  • the video decoder can also adjust the fixed value according to other input parameters.
  • the video decoder sets the change value of the quantization parameter to a fixed value according to resource adequacy in other code control parameters of the current block.
  • the method further includes that the video decoder initializes the code rate control parameters, as shown in FIG. 9 , including the following steps.
  • the video decoder calculates the total number of bits of the current block according to the information of the image to be decoded.
  • the total number of bits refers to the number of bits occupied by the current block, and the total number of bits is calculated according to the sampling format, resolution, and bpp in the basic information of the image to be decoded.
  • the total number of bits can be calculated according to the sampling format of the image. For example, if the image is in YUV format, the total number of bits of the Y channel, the total number of bits of the U channel and the total number of bits of the V channel can be calculated respectively. Among them, multiple channels can also form different sets to calculate the total number of bits.
  • an N-channel image is divided into M sets (M ⁇ N).
  • the Y channel is a first set
  • the U channel and V channel are a second set.
  • the parameters of each set are independent of each other.
  • the parameters include initialization parameters, code rate control parameters, quantization parameters and all variable parameters in the decoding process.
  • the parameters may also include some of the above variable parameters, that is, some or all of the parameters in each set are independent of each other.
  • alternatively, the parameters of all sets are shared.
  • the parameters may include some or all of the above variable parameters, some or all of which are common to all sets.
  • the common parameters of the first set and the second set include initialized parameters, updated parameter thresholds, and quantization parameters
  • the independent parameters include intermediate variables and rate control parameters.
  • the intermediate variable refers to an intermediate parameter generated during the calculation of the quantization parameter, for example, the number of image blocks decoded before the current block.
  • the above parameters are updated according to M sets. Specifically, after each set is decoded, the parameters are updated; or, after all sets are decoded, the parameters are updated.
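As an illustration of dividing a YUV image into a luma set and a chroma set, the sketch below allocates a total bit budget in proportion to sample counts; the sampling ratios and the proportional allocation rule are assumptions of this illustration, since the text leaves the ratio of the M sets open:

```python
def total_bits_per_set(width, height, bpp, sampling="4:2:0"):
    """Split the total bit budget between the Y set and the U+V set.

    bpp is taken per luma pixel here for simplicity; in 4:2:0, U and V each
    hold a quarter of the luma samples (half of them in total).
    """
    y_samples = width * height
    uv_samples = {"4:2:0": y_samples // 2,
                  "4:2:2": y_samples,
                  "4:4:4": 2 * y_samples}[sampling]
    total_bits = width * height * bpp
    # allocate proportionally to sample counts (integer split)
    y_bits = total_bits * y_samples // (y_samples + uv_samples)
    return {"Y": y_bits, "UV": total_bits - y_bits}
```

After the split, the parameters of each set (rate control parameters, intermediate variables) can be maintained independently, as the text describes.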
  • the video decoder determines the number of available bits according to the total number of bits, and the number of available bits is used to store the image data of the current block to be decoded.
  • the video decoder removes other bits used to store non-image data from the total bits to obtain usable bits.
  • other bits include the following types of data, for example, reserved bits for storing image header information, and inter-block offset bits for determining the position of data in the code stream.
  • the video decoder takes the total number of bits as the available number of bits, because in some applications, the total number of bits does not include other bits.
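A trivial sketch of this step, under the assumption that the non-image bits are the two kinds named in the text (an image-header reserve and inter-block offset fields); with the defaults, the total is returned unchanged, matching the case where the total number of bits already excludes other bits:

```python
def available_bits(total_bits, header_reserved_bits=0, offset_bits=0):
    """Available bits = total bits minus bits reserved for non-image data.

    header_reserved_bits: reserved bits for storing image header information.
    offset_bits: inter-block offset bits used to locate data in the stream.
    """
    return total_bits - header_reserved_bits - offset_bits
```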
  • a branching step may also be included: the video decoder divides and combines the channels into sets and calculates the total number of available bits of each set; the total resource is allocated to the sets according to the ratio of the M sets.
  • the ratio of the M sets may be determined with reference to the number of image blocks in each set.
  • the video decoder allocates the total number of available bits of the current block according to the proportion of the set.
  • this branching step can also be applied.
  • the video decoder calculates the number of available bits in each of the M sets.
  • the other non-image bits in step S902 are respectively removed after the sets are divided.
  • target bpp is used to determine the target number of bits and other parameters related to the target number of bits.
  • the video decoder initializes an intermediate parameter, where the intermediate parameter is used in conjunction with one or more rate control parameters to determine a quantization parameter of the current block.
  • for example, the intermediate parameters include target_bits, target_real_sum and pred_real_sum.
  • these parameters can be set as:
  • target_bits = total_bits/total_blocks
  • target_real_sum = target_bits × total_blocks − total_bits
  • some intermediate variables in calculating the rate control parameters may also be pre-defined in this step, such as the maximum quantization parameter, the maximum allowed complexity, and the like.
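The initialization formulas above can be sketched as follows; integer division and a zero start for pred_real_sum are assumptions of this illustration:

```python
def init_rate_control(total_bits, total_blocks):
    """Initialize the intermediate parameters of this step (names from the text).

    target_bits is the per-block bit budget; target_real_sum starts at
    target_bits * total_blocks - total_bits, the rounding remainder.
    """
    target_bits = total_bits // total_blocks
    target_real_sum = target_bits * total_blocks - total_bits
    pred_real_sum = 0  # assumed zero start before any block is decoded
    return target_bits, target_real_sum, pred_real_sum
```

With integer division, target_real_sum starts at a non-positive remainder, so the per-block budgets account exactly for total_bits over the whole image.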
  • the video decoder determines the quantization parameter of the current block according to one or more code rate control parameters.
  • Solution 1: the average quantization parameter is used as the quantization parameter.
  • Solution 2: the quantization parameter is obtained by weighting the average quantization parameter with the change value of the quantization parameter, for example:
  • quantization = clip(average_quantization + Δquantization, 0, max_quantization).
  • Solution 3: determine the quantization parameter according to the complexity of the current block, the complexity of the previous decoded image block, and the quantization parameter of the previous decoded image block, for example:
  • quantization = clip(complexity/pre_complexity, 1−a, 1+a) × pre_quantization
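Solutions 2 and 3 each reduce to a single clipped expression; a sketch, where clip denotes clamping to a range and the bound a = 0.25 is an assumed example value:

```python
def clip(v, lo, hi):
    """Clamp v to the closed interval [lo, hi]."""
    return max(lo, min(hi, v))

def qp_solution2(average_q, delta_q, max_q):
    """Solution 2: average QP plus the QP change value, clipped to [0, max_q]."""
    return clip(average_q + delta_q, 0, max_q)

def qp_solution3(complexity, pre_complexity, pre_q, a=0.25):
    """Solution 3: scale the previous block's QP by the complexity ratio,
    with the ratio clipped to [1 - a, 1 + a] (0 < a < 1 bounds the change)."""
    ratio = clip(complexity / pre_complexity, 1 - a, 1 + a)
    return ratio * pre_q
```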
  • Solution 4: determine the quantization parameter of the current block according to the position information and complexity of the current block and the quantization parameter of the previous decoded image block.
  • for example, in the first 1/2 of the parallel unit, the fourth weight coefficient w4 is 1−a; in the last 1/2 of the parallel unit, w4 is 1+a, where 0 < a < 1;
  • Solution 5: determine the quantization parameter of the current block according to the information of the previous decoded image blocks.
  • for example, if pred_real_sum is greater than a preset threshold, the quantization parameter increases; otherwise, the quantization parameter decreases.
  • Solution 6: determine the quantization parameter of the current block according to the location information of the current block.
  • Solution 7: determine the quantization parameter of the current block according to multiple complexities of the current block.
  • N1 and N2 can be calibrated by bit width, and both N1 and N2 are greater than 0.
  • the specific implementation in which the video encoder generates quantization parameters based on the rate control parameters is an encoding process reciprocal to the above decoding-side embodiment, and will not be described again.
  • the video decoder modifies the quantization parameter.
  • the first possible implementation is to adjust according to the resource adequacy of the current block: if the resource adequacy is out of range, a corresponding adjustment is made.
  • the adjustment is performed based on the above number of bits.
  • the third possible implementation is to adjust according to the actual number of bits of the previous n blocks and the size of the code-controlled buffer (buffer_size) of the current block.
  • target_real_sum is actually the same as buffer_size.
  • in a fourth possible implementation, the quantization parameter is adjusted according to the position of the current block.
  • for example, quantization = quantization/2; the quantization parameter difference between the left and right parallel units at a parallel unit boundary cannot be greater than 3; and the quantization parameter difference between the current block and its upper, lower, left and right neighbor blocks in the same parallel unit cannot be greater than max_quantization/4.
  • a fifth possible implementation manner is to adjust based on the smoothness of the rate control.
  • Constant bit rate (CBR): this rate control method maintains a buffer_size, and each block updates the buffer_size according to the actual number of bits and the target number of bits.
  • the buffer_size must not exceed the first preset threshold and cannot be less than 0.
  • for example, buffer_size is target_real_sum, or target_real_sum is calculated using the input bpp as the target bpp.
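A sketch of the per-block CBR buffer update described above; the update direction (actual minus target bits) and the clamp to a non-negative level follow the text, while treating the first preset threshold as max_buffer_size is an assumption:

```python
def update_buffer(buffer_size, real_bits, target_bits, max_buffer_size):
    """Per-block CBR buffer update.

    The buffer grows by the excess of actual over target bits and is kept
    within [0, max_buffer_size], matching the constraint that buffer_size
    must not exceed the preset threshold and cannot be less than 0.
    """
    buffer_size += real_bits - target_bits
    return max(0, min(buffer_size, max_buffer_size))
```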
  • two levels of buffers are used, and the two levels of buffers are parallel and independent to perform CBR.
  • the video decoder restricts the modification of quantization parameters.
  • first-level buffering can also be applied to the video encoder for correction after the quantization parameter is calculated.
  • if the real_bits of the currently selected optimal mode would make buffer_size ≤ max_buffer_size × U1, a mode with the smallest distortion is selected; if the real_bits of the currently selected optimal mode would make buffer_size ≤ max_buffer_size × U2, the mode without distortion or the mode with the smallest distortion is selected; if the real_bits of the currently selected optimal mode would make buffer_size > max_buffer_size × U3, a mode that will not exceed max_buffer_size × U3 and has smaller distortion is selected; and if the real_bits of the currently selected optimal mode would make buffer_size > max_buffer_size × U4, a new mode with the smallest cost is selected.
  • U1-U4 can be calculated according to the cost of the minimum cost mode in max_buffer_size, or different fixed values can be preset.
  • buffer_size is required to be within an ideal range, that is, max_buffer_size × V1 ≤ buffer_size ≤ max_buffer_size × V2, where 0 ≤ V1 ≤ V2 ≤ 1.
  • the encoding end or the decoding end may reserve resources during parameter initialization, and the reserved resources are max_buffer_size × V3, where V3 < 1.
  • since the reserved resources affect the number of available bits during parameter initialization, they belong to the reserved bits and need to be deducted, so that resources can be supplemented later.
  • the reserved resources can also be used for encoding and decoding the last few image blocks, or for the boundaries of parallel units, and the reserved resources can be released in places where resources are relatively insufficient.
  • steps S601-S603 are implemented in different parallel units.
  • Embodiment 1: block-, row-, group-, and slice-level parallelism
  • the encoding end records the offset information of the data of the current block.
  • the offset information refers to the position of the current block in the parallel unit, and is represented by the sum of the actual bit numbers from the first block of the parallel unit to the previous image block of the current block.
  • the video encoder writes the offset information into the code stream, which is used for the decoding end to analyze the code stream and obtain the offset information, so as to locate the position of the current block in the parallel unit.
  • the position in the code stream written by the encoding end is the beginning of the entire parallel unit code stream.
  • the video decoder obtains the offset information in the code stream, locates the position of the current block according to the offset information, and performs decoding.
  • the offset information can be removed as other bits for storing non-image data in the parameter initialization stage, or it can be included in the actual number of bits as the consumption of parallel units during the encoding and decoding process, or it can not be included in the consumption of actual bits during the encoding and decoding process.
  • the calculation methods are different, which will have an impact on the calculation of some parameters in the above-mentioned scheme, and the selection should be made after comprehensive consideration.
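The offset information of Embodiment 1 can be sketched as a prefix sum of the actual bit counts of the blocks preceding the current block in its parallel unit:

```python
def block_offset_bits(real_bits_per_block, block_index):
    """Offset of a block inside its parallel unit.

    real_bits_per_block: actual bit counts of the unit's blocks in order.
    Returns the sum of actual bits from the first block up to (but not
    including) block_index; the first block has offset 0.
    """
    return sum(real_bits_per_block[:block_index])
```

The encoder writes this value at the beginning of the parallel unit's code stream so the decoder can seek directly to the block.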
  • Embodiment 2: row-level parallelism
  • the encoding end fills in the code stream according to the rules.
  • each row is divided into several units, and each unit is aligned according to a given parameter (such as 128 bits); if the number of bits in a unit is not a multiple of 128 bits, the unit is padded to a multiple of 128 bits.
  • the parameters can be: empirical parameters, or obtained from one or more information in the line width, height and bpp.
  • the maximum number of bits between slice rows is the number of bits consumed by the longest row among all slice rows at the same position. Alignment is performed according to the maximum number of bits between slice rows; if the number of bits consumed by a row is not a multiple of the maximum number of bits between slice rows, the consumed bits are padded to a multiple of it.
  • the third possible implementation is to fill −buffer_size bits after the encoding of the slice is completed (CBR guarantees that buffer_size ≤ 0)
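The first filling rule, alignment of each unit to a given parameter such as 128 bits, amounts to computing a padding remainder:

```python
def pad_to_multiple(consumed_bits, align_bits=128):
    """Number of filler bits needed so the unit ends on a multiple of
    align_bits (128 in the text's example); zero when already aligned."""
    remainder = consumed_bits % align_bits
    return 0 if remainder == 0 else align_bits - remainder
```

The decoder-side check mirrors this: if consumed_bits is already a multiple of align_bits there are no filler bits to read, otherwise exactly this many padding bits are skipped.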
  • the decoder includes three possible implementations:
  • when the end of the unit is read, it is judged whether the currently consumed bits satisfy the given parameter-bit alignment; if so, the padding bits need not be read; otherwise, the number of bits padded to a multiple of the given parameter is read.
  • the second possible implementation method is to judge whether the currently consumed bits meet the alignment of the maximum number of bits between slice lines when decoding to the end of the line. If so, there is no need to read the padding bits, otherwise read the number of bits filled by a multiple of the maximum number of bits between slice lines.
  • the third possible implementation is to read -buffer_size padding bits.
  • An exemplary code stream arrangement, taking 2 slices as an example (slices are only divided vertically): the first unit of slice1 (with padding), the first unit of slice2 (with padding), the second unit of slice1, the second unit of slice2 (with padding) ... the last unit of the first row of slice1, the last unit of the first row of slice2 (with padding).
  • the maximum number of bits between the first lines of the slice is the number of bits consumed by slice1 to encode one line.
  • the number of bits consumed by the padding is included in the actual bit consumption of the corresponding block, or the number of bits consumed by the padding is not included in the actual bit consumption of the corresponding block.
  • Embodiment 3: slice-level parallelism
  • the coding end calculates the number of unit bits and writes it into the code stream.
  • the number of bits of each unit may be calculated according to one or more of the width and height of the slice and bpp (or target_bpp).
  • for example, width × height × bpp of the slice = 10000 bits
  • then 10000 bits is the number of bits of each unit in the slice.
  • the number of bits of the unit may also be a preset fixed value.
  • An exemplary code stream arrangement, taking 2 slices as an example (slices are only divided vertically): the first unit of slice1, the first unit of slice2, the second unit of slice1, the second unit of slice2 ... the last unit of slice1, the last unit of slice2.
  • the number of bits of each unit is a constant size determined by any one of the above two optional manners.
  • the decoder parses the code stream in the order of the first unit of the first slice, the first unit of the second slice, the second unit of the first slice, and the second unit of the second slice.
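The interleaved arrangement for two slices can be sketched as follows; equal unit counts per slice are an assumption of this illustration:

```python
def interleave_units(slice1_units, slice2_units):
    """Arrange the code stream by alternating units of the two slices
    (first unit of slice1, first unit of slice2, second unit of slice1, ...),
    matching the parse order described for the decoder."""
    stream = []
    for u1, u2 in zip(slice1_units, slice2_units):
        stream.extend([u1, u2])
    return stream
```

Because each unit has a constant, known bit count, the decoder can deinterleave the stream without any per-unit length signaling.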
  • An embodiment of the present application provides a decoding device, and the decoding device may be a video decoder. Specifically, the decoding device is configured to perform the steps performed by the video decoder in the above decoding method.
  • the decoding device provided in the embodiment of the present application may include modules corresponding to corresponding steps.
  • the functional modules of the decoding device may be divided according to the above method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 10 shows a possible structural diagram of the decoding device involved in the above embodiment.
  • the decoding device 100 includes an acquisition module 1001, a determination module 1002 and a decoding module 1003.
  • the acquisition module 1001 is configured to acquire one or more code rate control parameters of the current block in the image to be decoded, such as the above step S601.
  • the determination module 1002 is configured to determine the quantization parameter of the current block according to one or more code rate control parameters; for example, the above step S602.
  • the decoding module 1003 is configured to decode the current block based on the quantization parameter of the current block, such as the above step S603.
  • the acquiring module 1001 is specifically configured to acquire information of the image to be decoded, the information of the image to be decoded includes basic information of the image to be decoded and information of image blocks in the image to be decoded; determine one or more code rate control parameters of the current block according to part or all of the information of the image to be decoded.
  • the code rate control parameters include the resource adequacy of the current block, the complexity of the current block, the number of predicted bits of the current block, the average quantization parameter of the current block, and the change value of the quantization parameter of the current block; wherein, the average quantization parameter is used to represent the average degree of the quantization parameters of the decoded image blocks in the image to be decoded, the resource adequacy refers to the adequacy of the resources used to store the current block in the resources used to store the image to be decoded, the complexity refers to the complexity of the decoded image block in the image to be decoded, and the number of predicted bits refers to the estimated resources occupied by the current block after decoding.
  • the variation value of the quantization parameter refers to the variation value of the quantization parameter between decoded image blocks in the image to be decoded.
  • the determining module 1002 is specifically configured to acquire a code rate control parameter of the decoded image block; and determine a quantization parameter of the current block according to the code rate control parameter of the decoded image block.
  • the determining module 1002 is specifically configured to acquire the quantization parameter of the decoded image block; and determine the quantization parameter of the current block according to the quantization parameter of the decoded image block.
  • the determining module 1002 is further configured to calculate the total number of bits of the current block according to the information of the current block, the total number of bits of the current block is the number of bits occupied by the current block after decoding; the number of available bits is determined according to the total number of bits, the number of available bits is used to store decoded image data of the current block, and the number of available bits is used to determine one or more code rate control parameters of the current block.
  • the determining module 1002 is specifically configured to remove other bits used to store non-image data from the total bits.
  • the determining module 1002 is further configured to initialize an intermediate parameter, and the intermediate parameter is used in conjunction with one or more code rate control parameters to determine a quantization parameter of the current block.
  • the determining module 1002 is further configured to modify the quantization parameter based on one or more code rate control parameters.
  • the determining module 1002 is further configured to update one or more code rate control parameters.
  • the embodiment of the present application also provides an electronic device, the electronic device includes the above-mentioned decoding apparatus 100, and the decoding apparatus 100 executes any method performed by the video decoder provided above.
  • the embodiment of the present application also provides a communication system, the communication system includes the above-mentioned decoding device 100 and an encoding device, the decoding device 100 performs the method performed by any video decoder provided above, and the encoding device performs the method performed by any video encoder provided above.
  • the embodiment of the present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run on a computer, the computer is made to perform the method performed by any one of the video decoders provided above.
  • the embodiment of the present application also provides a chip.
  • the chip integrates a control circuit and one or more ports for realizing the functions of the decoding device 100 described above.
  • the functions supported by the chip can refer to the above, and will not be repeated here.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a random access memory, and the like.
  • the above-mentioned processing unit or processor can be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • the embodiments of the present application also provide a computer program product containing instructions, which, when the instructions are run on a computer, cause the computer to execute any one of the methods in the foregoing embodiments.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • a computer can be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center through wired (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or may contain one or more data storage devices such as servers and data centers that can be integrated with the medium.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, SSD), etc.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • When implemented using a software program, it may be implemented in whole or in part in the form of a computer program product.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of the present application provide a video decoding method and apparatus, a video decoder and a storage medium, which belong to the technical field of video decoding and help to save resources in code streams and improve the transmission efficiency of code streams. The method comprises: obtaining one or more code rate control parameters of a current block in an image to be decoded; determining a quantization parameter of the current block according to the one or more code rate control parameters; and decoding the current block based on the quantization parameter of the current block.
PCT/CN2023/072358 2022-01-21 2023-01-16 Procédé et appareil de décodage vidéo, décodeur vidéo et support de stockage WO2023138532A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210074669.XA CN116095335A (zh) 2022-01-21 2022-01-21 一种视频解码方法、装置及存储介质
CN202210074669.X 2022-01-21

Publications (1)

Publication Number Publication Date
WO2023138532A1 true WO2023138532A1 (fr) 2023-07-27

Family

ID=86197950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/072358 WO2023138532A1 (fr) 2022-01-21 2023-01-16 Procédé et appareil de décodage vidéo, décodeur vidéo et support de stockage

Country Status (3)

Country Link
CN (2) CN116095335A (fr)
TW (1) TWI838089B (fr)
WO (1) WO2023138532A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109076212A (zh) * 2017-12-27 2018-12-21 深圳市大疆创新科技有限公司 码率控制的方法与编码装置
CN112690000A (zh) * 2018-09-21 2021-04-20 华为技术有限公司 用于进行反量化的装置和方法
US20210227221A1 (en) * 2018-06-25 2021-07-22 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image using quantization parameter, and recording medium storing bitstream
CN113784126A (zh) * 2021-09-17 2021-12-10 Oppo广东移动通信有限公司 图像编码方法、装置、设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109076212A (zh) * 2017-12-27 2018-12-21 深圳市大疆创新科技有限公司 码率控制的方法与编码装置
US20210227221A1 (en) * 2018-06-25 2021-07-22 Electronics And Telecommunications Research Institute Method and apparatus for encoding/decoding image using quantization parameter, and recording medium storing bitstream
CN112690000A (zh) * 2018-09-21 2021-04-20 华为技术有限公司 用于进行反量化的装置和方法
CN113784126A (zh) * 2021-09-17 2021-12-10 Oppo广东移动通信有限公司 图像编码方法、装置、设备及存储介质

Also Published As

Publication number Publication date
TW202337212A (zh) 2023-09-16
CN117221564A (zh) 2023-12-12
TWI838089B (zh) 2024-04-01
CN116095335A (zh) 2023-05-09

Similar Documents

Publication Publication Date Title
TWI666920B (zh) 用於視訊寫碼之具有執行長度碼之調色盤預測器信令
TW201841501A (zh) 用於視訊寫碼之多種類型樹架構
CN115152223A (zh) 对用于视频编解码的高级语法的输出层集数据和一致性窗口数据进行编解码
TW201513639A (zh) 於視訊寫碼程序中用於係數層級寫碼之萊斯(rice)參數初始化
TW201334543A (zh) 判定用於視訊寫碼之解塊濾波的量化參數
WO2023231866A1 (fr) Procédé et appareil de décodage de vidéo, et support de stockage
WO2024061055A1 (fr) Procédé et appareil de codage d'image, procédé et appareil de décodage d'image, et support de stockage
WO2024104382A1 (fr) Procédé et appareil de codage d'image, procédé et appareil de décodage d'image, et support de stockage
TWI847806B (zh) 視訊圖像解碼方法、編碼方法、裝置及存儲介質
WO2024022359A1 (fr) Procédé et dispositif de codage d'image et procédé et dispositif de décodage d'image
TW202341739A (zh) 圖像解碼方法、圖像編碼方法及相應的裝置
WO2023138532A1 (fr) Procédé et appareil de décodage vidéo, décodeur vidéo et support de stockage
WO2022174475A1 (fr) Procédé et système de codage vidéo, procédé et système de décodage vidéo, codeur vidéo et décodeur vidéo
CN116156167A (zh) 一种残差跳过编解码方法及装置
CN116366847B (en) Video image decoding method, device and storage medium
TWI821013B (zh) 視頻編解碼方法及裝置
TWI829424B (zh) 解碼方法、編碼方法及裝置
WO2022193389A1 (fr) Procédé et système de codage vidéo, procédé et système de décodage vidéo, et codeur et décodeur vidéo
TW202435613A (zh) 圖像編碼方法和圖像解碼方法、裝置及存儲介質
TW202435612A (zh) 圖像編碼方法和圖像解碼方法、裝置及存儲介質
TW202435614A (zh) 圖像編碼裝置和圖像解碼方法、裝置及存儲介質

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23742847

Country of ref document: EP

Kind code of ref document: A1

WD Withdrawal of designations after international publication
NENP Non-entry into the national phase

Ref country code: DE