WO2023231866A1 - A video decoding method, apparatus, and storage medium

A video decoding method, apparatus, and storage medium

Info

Publication number
WO2023231866A1
Authority
WO
WIPO (PCT)
Prior art keywords
current block
image
complexity information
complexity
block
Prior art date
Application number
PCT/CN2023/096070
Other languages
English (en)
French (fr)
Inventor
Wang Yan
Sun Yucheng
Chen Fangdong
Original Assignee
Hangzhou Hikvision Digital Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co., Ltd.
Publication of WO2023231866A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: ... using adaptive coding
    • H04N 19/102: ... characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/124: ... Quantisation
    • H04N 19/169: ... characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: ... the unit being an image region, e.g. an object
    • H04N 19/176: ... the region being a block, e.g. a macroblock
    • H04N 19/182: ... the unit being a pixel
    • H04N 19/42: ... characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/60: ... using transform coding
    • H04N 19/61: ... using transform coding in combination with predictive coding
    • H04N 19/70: ... characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N 19/90: ... using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/91: ... Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • the present application relates to the field of video decoding technology, and in particular, to a video decoding method, device and storage medium.
  • Video decoding technology plays an important role in the field of video processing.
  • video decoding technology includes video encoding and decoding.
  • the process of quantizing or dequantizing images in a video is key to determining image quality. Quantization mainly uses quantization parameters to replace part of the original data in the code stream, reducing the redundancy of that data. However, the quantization process brings a risk of image distortion. Therefore, in order to improve video decoding efficiency while taking image quality into account, determining more accurate quantization parameters for the images in a video is an urgent problem to be solved.
  • Embodiments of the present application provide a video decoding method, device, and storage medium, which help improve video decoding efficiency.
  • embodiments of the present application provide a video decoding method, which method is applied to a video encoding device or a video decoding device or a chip of a video encoding and decoding device.
  • the method includes: obtaining the complexity information of the current block in the image to be processed, where the complexity information of the current block is obtained by calculating at least one angular gradient of the current block based on at least the pixel values of the current block; determining the quantization parameter of the current block based on the complexity information of the current block; and decoding the current block based on the quantization parameter.
  • Quantization parameters play an important role in the video encoding and decoding process.
  • the video decoding device obtains the complexity information of the current block in the image to be processed.
  • the complexity information is calculated based on the information of the current block itself, and the quantization parameter of the current block is determined from this complexity information for decoding. Taking the angular gradient information of the current block into account helps determine a more accurate quantization parameter for the current block, thereby improving video decoding efficiency while taking image quality into account.
  • the decoder obtains the complexity information of the current block to determine the quantization parameter, which helps reduce the bits that quantization parameters occupy in the code stream, so that the code stream can carry more payload data and transmission efficiency is improved.
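  • As an illustration of this flow, the following is a minimal sketch in Python; the function names, the gradient-based complexity measure, the linear complexity-to-QP mapping, and the HEVC-like QP-to-step rule are all assumptions made for illustration and are not taken from this application.

```python
# Hypothetical sketch: complexity information -> quantization parameter -> dequantization.
# All names and mappings below are illustrative assumptions.

def complexity_of(block):
    # Mean absolute horizontal gradient as a stand-in complexity measure.
    n = len(block) * (len(block[0]) - 1)
    return sum(abs(row[t] - row[t - 1]) for row in block for t in range(1, len(row))) / n

def qp_from_complexity(c, qp_min=0, qp_max=51):
    # Assumed monotone mapping: more complex blocks tolerate coarser quantization.
    return max(qp_min, min(qp_max, int(4 * c)))

def dequantize(coeffs, qp):
    step = 1 << (qp // 6)  # assumed HEVC-like rule: step roughly doubles every 6 QP
    return [c * step for c in coeffs]

block = [[10, 12, 15], [11, 13, 16]]           # pixel values of the current block
qp = qp_from_complexity(complexity_of(block))   # steps 1 and 2: complexity -> QP
print(qp, dequantize([3, -1, 0, 2], qp))        # step 3: dequantize parsed coefficients
```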
  • obtaining the complexity information of the current block in the image to be processed includes: calculating at least one angular gradient of the current block based on the pixel values of the current block and the reconstructed values of the decoded pixels of the current block; and obtaining the complexity information of the current block based on the at least one angular gradient of the current block.
  • calculating the complexity of the current block from the pixel values and reconstructed values of the current block helps determine more accurate quantization parameters for the current block and improves video decoding efficiency while taking image quality into account.
  • obtaining the complexity information of the current block in the image to be processed includes: calculating at least one angular gradient of the current block based on the pixel values of the current block and the pixel values adjacent to the current block in the image to be processed; and obtaining the complexity information of the current block based on the at least one angular gradient of the current block.
  • calculating the complexity of the current block from the pixel values of the current block and the pixel values adjacent to it helps determine more accurate quantization parameters for the current block and improves video decoding efficiency while taking image quality into account.
  • obtaining the complexity information of the current block in the image to be processed includes: obtaining the prediction angle used in the angle prediction mode of the current block; calculating the angular gradient based on the prediction angle to obtain the corresponding complexity information; and using the corresponding complexity information as the complexity information of the current block.
  • the corresponding complexity information is determined through the angle prediction mode of the current block, which helps determine more accurate quantization parameters for the current block. For the decoder, it also helps save bits in the code stream and improve video decoding efficiency.
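  • A minimal sketch of this implementation, assuming only two prediction directions for brevity; the angle values, the fallback rule, and all names are illustrative, not taken from this application.

```python
# Hypothetical sketch: reuse the intra prediction angle of the current block to
# choose which angular gradient supplies the complexity information.

def horizontal_complexity(block):
    # mean |p[y][t] - p[y][t-1]| over the block
    n = len(block) * (len(block[0]) - 1)
    return sum(abs(r[t] - r[t - 1]) for r in block for t in range(1, len(r))) / n

def vertical_complexity(block):
    # mean |p[s][x] - p[s-1][x]| over the block
    n = (len(block) - 1) * len(block[0])
    return sum(abs(block[s][x] - block[s - 1][x])
               for s in range(1, len(block)) for x in range(len(block[0]))) / n

def complexity_for_prediction_angle(block, angle_deg):
    if angle_deg == 0:    # horizontal prediction: use the horizontal gradient
        return horizontal_complexity(block)
    if angle_deg == 90:   # vertical prediction: use the vertical gradient
        return vertical_complexity(block)
    # other angles: fall back to the smaller of the two (illustrative choice)
    return min(horizontal_complexity(block), vertical_complexity(block))

print(complexity_for_prediction_angle([[10, 12, 15], [11, 13, 16]], 90))
```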
  • when the current block is an N-channel image block, obtaining the complexity information of the current block in the image to be processed includes: obtaining the complexity information of each channel image block based on the pixel values of each of the N channel image blocks, where N is an integer greater than zero; and determining the complexity information of the current block based on the complexity information of each channel image block.
  • This possible implementation determines the complexity of the current block from the complexities of multiple channel image blocks, improving the implementability of the solution. Splitting the image into multiple channels and calculating each separately helps improve the accuracy of the determined complexity information.
  • obtaining the complexity information of each channel image block based on the pixel values of each of the N channel image blocks includes: dividing each channel image block into at least two sub-blocks; determining the complexity information of the at least two sub-blocks of each channel image block; and determining the complexity information of the corresponding channel image block based on the complexity information of its at least two sub-blocks.
  • This possible implementation determines the complexity of the current block from the complexities of multiple channel image blocks; determining complexity by further dividing the channel image blocks helps improve the accuracy of the resulting complexity information.
  • determining the complexity information of the corresponding channel image block based on the complexity information of the at least two sub-blocks of each channel image block includes: determining the minimum value among the complexity information of the at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
  • This possible implementation provides an implementation method for determining the complexity of multiple channel image blocks based on the complexity of the divided multiple channel image blocks, thereby improving the implementability of the solution.
  • determining the complexity information of the current block based on the complexity information of each channel image block includes: determining the minimum value among the complexity information of the channel image blocks as the complexity information of the current block.
  • This possible implementation method provides an implementation method for determining the complexity information of multiple channel image blocks based on the complexity information of the multiple channel image blocks, thereby improving the implementability of the solution.
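  • A minimal sketch combining the two "minimum" rules above; the sub-block split rule and the complexity measure are illustrative assumptions, not taken from this application.

```python
# Hypothetical sketch: per-channel complexity = minimum over its sub-blocks,
# and current-block complexity = minimum over the N channel image blocks.

def _complexity(block):
    # mean absolute horizontal gradient, as in the earlier sketches
    n = len(block) * (len(block[0]) - 1)
    return sum(abs(r[t] - r[t - 1]) for r in block for t in range(1, len(r))) / n

def subblock_complexities(channel_block, split=2):
    # Split the rows of a channel block into `split` sub-blocks (illustrative rule).
    h = max(1, len(channel_block) // split)
    subs = [channel_block[i:i + h] for i in range(0, len(channel_block), h)]
    return [_complexity(s) for s in subs]

def block_complexity(channel_blocks):
    per_channel = [min(subblock_complexities(cb)) for cb in channel_blocks]
    return min(per_channel)  # minimum across the N channels

y = [[10, 12, 15, 15], [11, 13, 16, 18]]   # luma channel block
u = [[60, 61], [60, 62]]                    # first chroma channel block
v = [[90, 90], [91, 92]]                    # second chroma channel block
print(block_complexity([y, u, v]))
```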
  • determining the complexity information of the current block based on the complexity information of each channel image block includes: determining the complexity level of each channel image block based on its complexity information; and determining the complexity information of the current block based on the complexity level of each channel image block.
  • This possible implementation provides an implementation method for determining the complexity of multiple channel image blocks based on the complexity of the divided multiple channel image blocks, thereby improving the implementability of the solution.
  • determining the quantization parameter of the current block based on the complexity information of the current block includes: determining the reference quantization parameter of the current block based on the complexity information of the current block; and determining the quantization parameter of the current block based on the reference quantization parameter of the current block.
  • This possible implementation provides a method of determining quantization parameters based on reference quantization parameters to improve the accuracy of the determined quantization parameters.
  • determining the reference quantization parameter of the current block according to the complexity information of the current block includes: obtaining the buffer status of the image to be processed, where the buffer status characterizes the number of bits that the encoded image blocks of the image to be processed occupy in the buffer, and the buffer is used to output the code stream of the image to be processed at a constant rate; and determining the reference quantization parameter of the current block based on the correspondence between the buffer status and the complexity information of the current block.
  • the decoding process can simulate the situation of buffering the code stream in the buffer area during the above encoding process, so as to determine the reference quantization parameters based on the simulation results.
  • This possible implementation method provides an implementation method for determining the reference quantization parameters of the current block based on the buffer area status and the complexity information of the current block, thereby improving the implementability of the solution.
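  • The following is a sketch of one possible correspondence, assuming a buffer that drains at a constant rate per block; the fullness-to-QP mapping and the complexity adjustment are assumptions for illustration, not the mapping of this application.

```python
# Hypothetical buffer-fullness model: both the encoder and (by simulation) the
# decoder track the bits that coded blocks occupy in a buffer draining at a
# constant rate, and map fullness plus complexity to a reference QP.

class RateBuffer:
    def __init__(self, capacity_bits, drain_bits_per_block):
        self.capacity = capacity_bits
        self.drain = drain_bits_per_block
        self.fullness = 0

    def update(self, bits_of_coded_block):
        # bits flow in for each coded block and drain out at a constant rate
        self.fullness = max(0, self.fullness + bits_of_coded_block - self.drain)

    def reference_qp(self, complexity, qp_min=0, qp_max=51):
        ratio = self.fullness / self.capacity        # 0.0 (empty) .. 1.0 (full)
        qp = qp_min + ratio * (qp_max - qp_min)      # fuller buffer -> coarser QP
        qp += min(complexity, 8)                     # assumption: complex content masks distortion
        return int(max(qp_min, min(qp_max, qp)))

buf = RateBuffer(capacity_bits=8192, drain_bits_per_block=256)
buf.update(bits_of_coded_block=400)   # simulated size of the previous coded block
print(buf.reference_qp(complexity=5))
```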
  • determining the reference quantization parameter of the current block according to the complexity information of the current block includes: determining the complexity level of the current block; determining the corresponding target bits according to the complexity level of the current block, where the target bits refer to the number of bits the current block occupies in the code stream; and obtaining the reference quantization parameter of the current block based on the target bits.
  • This possible implementation method provides an implementation method of determining the reference quantization parameters of the current block based on the target bits, thereby improving the implementability of the solution.
  • determining the quantization parameter of the current block according to the reference quantization parameter of the current block includes: determining a weighting coefficient according to the complexity information of the current block, where the weighting coefficient is used to adjust the quantization parameter of the current block according to its complexity; and determining the quantization parameter of the current block based on the weighting coefficient and the reference quantization parameter of the current block.
  • This possible implementation method provides an implementation method of determining the quantization parameter of the current block based on the reference quantization parameter, thereby improving the implementability of the solution.
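  • A minimal sketch of this chain (complexity level, target bits, reference quantization parameter, weighting coefficient); every threshold and constant below is an illustrative assumption.

```python
# Hypothetical sketch: complexity level -> target bits -> reference QP,
# then a complexity-dependent weighting coefficient yields the final QP.

LEVEL_TARGET_BITS = {0: 512, 1: 384, 2: 256}  # assumed bit budgets per level

def complexity_level(complexity):
    if complexity < 2:
        return 0   # flat block
    if complexity < 8:
        return 1   # moderately complex block
    return 2       # complex block

def reference_qp_from_target(target_bits, bits_per_qp_step=16, qp_base=40):
    # Toy model: each QP step is assumed to save about bits_per_qp_step bits.
    return max(0, qp_base - target_bits // bits_per_qp_step)

def final_qp(complexity):
    level = complexity_level(complexity)
    ref_qp = reference_qp_from_target(LEVEL_TARGET_BITS[level])
    weight = 1.0 + 0.1 * level   # weighting coefficient grows with complexity (assumption)
    return int(ref_qp * weight)

print(final_qp(complexity=5.0))  # level 1 -> 384 target bits -> ref QP 16 -> final QP 17
```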
  • the complexity information of the current block is calculated based on the code rate control unit of the current block, where the code rate control unit is the basic processing unit for calculating the complexity information of the current block.
  • the quantization parameter of the current block is the quantization parameter of the code rate control unit of the current block, and decoding the current block based on the quantization parameter includes: determining the quantization parameter of the decoding unit of the current block based on the quantization parameter of the code rate control unit; and decoding the current block according to the quantization parameter of the decoding unit.
  • This possible implementation provides a way of determining the quantization parameter of the current block in which, when the size of the code rate control unit is smaller than the size of the quantization unit, the multiple quantization parameters calculated accordingly are used to quantize a single quantization unit.
  • the above method provides corresponding solutions to improve the implementability of the solution.
  • embodiments of the present application provide a video decoding device, which has the function of implementing any of the video decoding methods in the first aspect.
  • This function can be implemented by hardware, or it can be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • embodiments of the present application provide a video encoder, which is used to perform the video decoding method in any one of the above-mentioned first aspects.
  • embodiments of the present application provide another video encoder, including: a processor and a memory, where the memory is used to store computer-executable instructions; when the video encoder is running, the processor executes the computer-executable instructions stored in the memory to cause the video encoder to perform the video decoding method of any one of the above first aspects.
  • embodiments of the present application provide a video decoder, which is used to perform the video decoding method in any one of the above-mentioned first aspects.
  • embodiments of the present application provide another video decoder, including: a processor and a memory, where the memory is used to store computer-executable instructions; when the video decoder is running, the processor executes the computer-executable instructions stored in the memory to cause the video decoder to perform the video decoding method of any one of the above first aspects.
  • embodiments of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium stores a program that, when run on a computer, enables the computer to execute the video decoding method of any one of the above first aspects.
  • embodiments of the present application provide a computer program product containing instructions that, when run on a computer, enable the computer to execute the video decoding method of any one of the above first aspects.
  • embodiments of the present application provide an electronic device.
  • the electronic device includes a video decoding apparatus whose processing circuit is configured to perform the video decoding method of any one of the above first aspects.
  • embodiments of the present application provide a chip.
  • the chip includes a processor.
  • the processor is coupled to a memory.
  • the memory stores program instructions; when the program instructions stored in the memory are executed by the processor, the video decoding method of any one of the above first aspects is implemented.
  • An eleventh aspect provides a video coding and decoding system.
  • the system includes a video encoder and a video decoder.
  • the video encoder is configured to perform the video decoding method of any one of the above first aspects, and the video decoder is configured to perform the video decoding method of any one of the above first aspects.
  • Figure 1 is a system architecture diagram of a video encoding and decoding system provided by an embodiment of the present application
  • Figure 2 is a schematic structural diagram of a video encoder provided by an embodiment of the present application.
  • Figure 3 is a schematic structural diagram of a video decoder provided by an embodiment of the present application.
  • Figure 4 is a schematic flow chart of video encoding and decoding provided by an embodiment of the present application.
  • Figure 5 is a schematic structural diagram of a video encoding and decoding device provided by an embodiment of the present application.
  • Figure 6 is a flow chart of a video decoding method provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of calculating an angle gradient provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of another method of calculating angle gradient provided by the embodiment of the present application.
  • Figure 9 is a schematic diagram of yet another calculation of angle gradient provided by the embodiment of the present application.
  • Figure 10 is a schematic diagram of dividing image blocks and calculating angular gradients provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of the relationship between a dynamic threshold and an absolute threshold provided by an embodiment of the present application.
  • Figure 12a is a graph of the reference quantization parameter as a function of complexity information and buffer status, provided by an embodiment of the present application.
  • Figure 12b is a graph of the reference quantization parameter as a function of buffer status, provided by an embodiment of the present application.
  • Figure 12c is a graph of the reference quantization parameter as a function of complexity information, provided by an embodiment of the present application.
  • Figure 13a is a schematic diagram of an image boundary provided by an embodiment of the present application.
  • Figure 13b is a schematic diagram of a slice provided by an embodiment of the present application.
  • Figure 14a is a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of the present application
  • Figure 14b is a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of the present application
  • Figure 15 is an interleaving schematic diagram of fragments based on the code stream grouping method provided by the embodiment of the present application.
  • Figure 16a is a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of the present application
  • Figure 16b is a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of the present application
  • Figure 17 is an interleaving schematic diagram of fragments based on the code stream grouping method provided by the embodiment of the present application.
  • Figure 18 is a schematic diagram of the composition of a video decoding device provided by an embodiment of the present application.
  • Video decoding technology includes video encoding technology and video decoding technology, which can also be collectively referred to as video coding and decoding technology.
  • video sequences have a series of redundant information such as spatial redundancy, temporal redundancy, visual redundancy, information entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy.
  • video coding technology is proposed to reduce storage space and save transmission bandwidth.
  • Video encoding technology is also called video compression technology.
  • video compression coding standards are used to standardize video encoding and decoding methods, for example: MPEG-2 and MPEG-4 Part 10 Advanced Video Coding (AVC), developed by the Moving Picture Experts Group (MPEG), and H.263, H.264, and H.265 (also known as the High Efficiency Video Coding standard, HEVC), formulated by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T).
  • the basic processing unit in the video encoding and decoding process is the image block, which is obtained by dividing a frame/image at the encoding end.
  • the image blocks obtained after division are usually processed row by row, one by one. The image block currently being processed is called the current block, and an image block that has already been processed is called an encoded image block or a decoded image block.
  • HEVC defines coding tree unit (Coding Tree Unit, CTU), coding unit (Coding Unit, CU), prediction unit (Prediction Unit, PU) and transformation unit (Transform Unit, TU).
  • CTU, CU, PU and TU can all be used as image blocks obtained after division. Both PU and TU are divided based on CU.
  • a pixel is the smallest complete sample of a video or image, so image blocks are processed in units of pixels, and each pixel records color information.
  • One sampling method is to represent color through RGB, which includes three image channels, R represents red, G represents green, and B represents blue.
  • Another sampling method is to represent color through YUV, which includes three image channels: Y represents brightness (luminance), U represents the first chrominance Cb, and V represents the second chrominance Cr. Because the human eye is more sensitive to brightness than to chrominance, storage space can be reduced by storing more data representing brightness and less data representing chroma.
  • YUV format is usually used for video sampling, including the 420 sampling format, the 422 sampling format, etc. The sampling format determines the number of samples of the two chroma components based on the number of luminance samples. For example, assuming a CU has 4×2 pixels, the format is as follows:
  • the 420 sampling format means that YUV is sampled in a 4:2:0 format, that is, the brightness and the first chroma or the second chroma are selected in a 4:2 ratio, where the first chroma and the second chroma are selected in alternate rows.
  • the above-mentioned CU sampling selects the brightness Y0-Y3 of the first row, and the first chroma U0 and U2, and selects the brightness Y4-Y7 of the second row, and the second chroma V4 and V6.
  • the sampled CU is composed of a brightness coding unit and chroma coding units, where the brightness coding unit is the 4×2 block of samples Y0-Y7;
  • the first chroma coding unit is the 2×1 block of samples U0 and U2;
  • the second chroma coding unit is the 2×1 block of samples V4 and V6.
  • the size of the image blocks sampled by the above sampling format has changed.
  • the block size of the brightness coding unit remains unchanged and is still 4×2, while the block size of the first chroma coding unit becomes 2×1, and the block size of the second chroma coding unit also becomes 2×1. Therefore, if it is assumed that the CU size is X×Y, the size of a chroma coding unit block sampled based on the 420 sampling format is X/2×Y/2.
  • the 422 sampling format means that YUV is sampled in a 4:2:2 format, that is, the brightness, first chroma and second chroma are selected in a 4:2:2 ratio, which again yields a brightness coding unit, a first chroma coding unit and a second chroma coding unit for the above CU.
  • the block size of the brightness coding unit remains unchanged and is still 4×2, while the block size of the first chroma coding unit becomes 2×2, and the block size of the second chroma coding unit also becomes 2×2. Therefore, if it is assumed that the CU size is X×Y, the size of a chroma coding unit block sampled based on the 422 sampling format is X/2×Y.
  • the above-sampled brightness coding unit, first chroma coding unit and second chroma coding unit are used as data units of each channel that are subsequently processed for the current block.
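  • The size relationships above can be summarized in a short sketch; the CU dimensions used are those of the 4×2 example.

```python
# Chroma coding-unit sizes implied by the sampling formats described above:
# 4:2:0 halves the chroma block in both dimensions, 4:2:2 halves only the width.

def chroma_size(width, height, sampling):
    if sampling == "420":
        return width // 2, height // 2   # X/2 x Y/2
    if sampling == "422":
        return width // 2, height        # X/2 x Y
    raise ValueError("unsupported sampling format: " + sampling)

print(chroma_size(4, 2, "420"))  # (2, 1), matching the 4x2 CU example above
print(chroma_size(4, 2, "422"))  # (2, 2)
```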
  • the decoding method provided by this application is suitable for video encoding and decoding systems.
  • the video encoding and decoding system may also be called a video decoding system.
  • Figure 1 shows the structure of a video encoding and decoding system.
  • the video coding and decoding system includes a source device 10 and a destination device 11 .
  • the source device 10 generates encoded video data.
  • the source device 10 may also be called a video encoding device or a video encoding apparatus.
  • the destination device 11 may decode the encoded video data generated by the source device 10.
  • the destination device 11 may also be called a video decoding device or a video decoding apparatus.
  • the source device 10 and/or the destination device 11 may include at least one processor and a memory coupled to the at least one processor.
  • the above memory may include but is not limited to Read-Only Memory (ROM), Random Access Memory (RAM), Electrically Erasable Programmable Read-Only Memory (EEPROM) , flash memory or any other media that can be used to store the required program code in the form of instructions or data structures that can be accessed by a computer, which is not specifically limited by this application.
  • Source device 10 and destination device 11 may include a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, TVs, cameras, display devices, digital media players, video game consoles, vehicle computers, or the like.
  • Link 12 may include one or more media and/or devices capable of moving encoded video data from source device 10 to destination device 11 .
  • link 12 may include one or more communication media that enables source device 10 to transmit encoded video data directly to destination device 11 in real time.
  • the source device 10 may modulate the encoded video data according to a communication standard (eg, a wireless communication protocol), and may transmit the modulated video data to the destination device 11 .
  • the above one or more communication media may include wireless and/or wired communication media, such as: radio frequency (Radio Frequency, RF) spectrum, one or more physical transmission lines.
  • the above one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media mentioned above may include routers, switches, base stations, or other devices that implement communication from the source device 10 to the destination device 11 .
  • the encoded video data may be output from the output interface 103 to the storage device 13 .
  • encoded video data may be accessed from storage device 13 via input interface 113 .
  • the storage device 13 may include a variety of local access data storage media, such as Blu-ray Disc, High Density Digital Video Disc (DVD), Compact Disc Read-Only Memory (CD-ROM), Flash memory, or other suitable digital storage medium for storing encoded video data.
  • storage device 13 may correspond to a file server or another intermediate storage device that stores encoded video data generated by source device 10 .
  • destination device 11 may obtain its stored video data from storage device 13 via streaming or downloading.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to destination device 11 .
  • a file server may include a World Wide Web (Web) server (e.g., for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, or a local disk drive.
  • Destination device 11 may access the encoded video data through any standard data connection (eg, an Internet connection).
  • Example types of data connections include wireless channels, wired connections (eg, cable modems, etc.), or a combination of both, suitable for accessing encoded video data stored on a file server.
  • the encoded video data can be transmitted from the file server through streaming, downloading, or a combination of both.
  • the decoding method of this application is not limited to wireless application scenarios.
  • the decoding method of this application can be applied to video codecs that support the following multimedia applications: over-the-air TV broadcasting, cable TV transmission, satellite TV transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • a video codec system may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • Figure 1 is a system architecture diagram of a video coding and decoding system provided by an embodiment of the present application.
  • Figure 1 is only an example of a video coding and decoding system and does not limit the video coding and decoding system in this application.
  • the decoding method provided by this application can also be applied to scenarios where there is no data communication between the encoding device and the decoding device.
  • the video data to be encoded or the encoded video data may be retrieved from local storage, may be streamed over a network, etc.
  • the video encoding device may encode the video data to be encoded and store the encoded video data in a memory, and the video decoding device may also obtain the encoded video data from the memory and decode the encoded video data.
  • source device 10 includes a video source 101, a video encoder 102 and an output interface 103.
  • output interface 103 may include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 101 may include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources of video data.
  • Video encoder 102 may encode video data from video source 101 .
  • source device 10 transmits the encoded video data directly to destination device 11 via output interface 103 .
  • the encoded video data may also be stored on the storage device 13 for later access by the destination device 11 for decoding and/or playback.
  • destination device 11 includes display device 111 , video decoder 112 and input interface 113 .
  • input interface 113 includes a receiver and/or modem.
  • Input interface 113 may receive encoded video data via link 12 and/or from storage device 13 .
  • Display device 111 may be integrated with destination device 11 or may be external to destination device 11 . Generally, the display device 111 displays the decoded video data.
  • the display device 111 may include a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or other types of display devices.
  • video encoder 102 and video decoder 112 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams.
  • the video encoder 102 and the video decoder 112 may include at least one microprocessor, digital signal processor (Digital Signal Processor, DSP), application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), field-programmable gate array (Field-Programmable Gate Array, FPGA), discrete logic, hardware, or any combination thereof. If the decoding method provided by this application is implemented in software, the instructions for the software can be stored in a suitable non-volatile computer-readable storage medium, and at least one processor can be used to execute the instructions to implement this application.
  • the video encoder 102 and the video decoder 112 in this application may operate according to a video compression standard (such as HEVC) or other industry standards, which is not specifically limited in this application.
  • FIG. 2 is a schematic structural diagram of a video encoder 102 provided by an embodiment of the present application.
  • the video encoder 102 may perform prediction, transformation, quantization and entropy coding processes in the prediction module 21, the transformation module 22, the quantization module 23 and the entropy coding module 24 respectively.
  • the video encoder 102 also includes a preprocessing module 20 and a summer 202, where the preprocessing module 20 includes a segmentation module and a code rate control module.
  • the video encoder 102 also includes an inverse quantization module 25, an inverse transform module 26, a summer 201 and a reference image memory 27.
  • the video encoder 102 receives video data, and the pre-processing module 20 is used to obtain input parameters of the video data.
  • the input parameters include the resolution of the image in the video data, the sampling format of the image, pixel depth (bits per pixel, bpp), bit width and other information.
  • bpp refers to the number of bits occupied by one pixel component in a unit pixel.
  • the segmentation module in the preprocessing module 20 segments the image into original blocks.
  • This partitioning may also include partitioning into slices, image blocks, or other larger units, and video block partitioning, for example, based on a Largest Coding Unit (LCU) and a quadtree structure of a CU.
  • video encoder 102 is a component for encoding video blocks located in a video slice to be encoded.
  • a slice may be divided into a plurality of original blocks (and possibly into collections of original blocks called image blocks).
  • the size of CU, PU and TU is usually determined in the partitioning module.
  • the segmentation module is used to determine the size of the rate control unit.
  • the code rate control unit refers to the basic processing unit in the code rate control module.
  • the code rate control unit can be used to calculate the quantization parameters of the current block.
  • the code rate control module calculates complexity information for the current block based on the code rate control unit, and then calculates the quantization parameters of the current block based on the complexity information.
  • the segmentation strategy of the segmentation module can be preset, or it can be continuously adjusted based on the image during the encoding process.
  • if the segmentation strategy is a preset strategy, the same segmentation strategy is correspondingly preset in the decoder, thereby obtaining the same image processing unit.
  • the image processing unit is any one of the above image blocks, and corresponds one-to-one with the encoding side.
  • the segmentation strategy can be directly or indirectly encoded into the code stream.
  • the decoder obtains the corresponding parameters from the code stream, obtains the same segmentation strategy, and obtains the same image processing unit.
  • the code rate control module in the preprocessing module 20 is used to generate quantization parameters so that the quantization module 23 and the inverse quantization module 25 perform correlation calculations.
  • the code rate control module can obtain the image information of the current block for calculation, such as the above-mentioned input information; it can also obtain the reconstructed value produced by the summer 201 for calculation, which is not limited in this application.
  • the prediction module 21 may provide the prediction block to the summer 202 to generate a residual block, and provide the prediction block to the summer 201 for reconstruction to obtain a reconstruction block, which is used as a reference pixel for subsequent prediction.
  • the video encoder 102 forms pixel difference values by subtracting the pixel values of the prediction block from the pixel values of the original block.
  • the pixel difference values form a residual block.
  • the data in the residual block may include brightness differences and chroma differences.
  • Summer 202 represents one or more components that perform this subtraction operation.
  • the prediction module 21 may also send the relevant syntax elements to the entropy encoding module 24 for merging into the code stream.
  • Transform module 22 may divide the residual block into one or more TUs for transformation. Transform module 22 may transform the residual block from the pixel domain to the transform domain (eg, frequency domain). For example, the residual block is transformed using Discrete Cosine Transform (DCT) or Discrete Sine Transform (DST) to obtain transformation coefficients. Transform module 22 may send the resulting transform coefficients to quantization module 23.
  • DCT Discrete Cosine Transform
  • DST Discrete Sine Transform
  • the quantization module 23 may perform quantization based on quantization units.
  • the quantization unit may be the same as the above-mentioned CU, TU, and PU, or may be further divided in the segmentation module.
  • the quantization module 23 quantizes the transform coefficients to further reduce encoding bits to obtain quantized coefficients.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients.
  • the degree of quantization can be modified by adjusting the quantization parameters.
  • quantization module 23 may then perform a scan of the matrix containing the quantized transform coefficients.
  • alternatively, entropy encoding module 24 may perform the scan.
  • entropy encoding module 24 may entropy encode the quantized coefficients.
  • the entropy coding module 24 can perform Context-Adaptive Variable-Length Coding (CAVLC), Context-based Adaptive Binary Arithmetic Coding (CABAC), Syntax-based context-adaptive Binary Arithmetic Coding (SBAC), Probability Interval Partitioning Entropy (PIPE) coding, or another entropy coding method or technique.
  • the code stream may be transmitted to the video decoder 112 or archived for later transmission or retrieval by the video decoder 112 .
  • the inverse quantization module 25 and the inverse transform module 26 apply inverse quantization and inverse transform, respectively.
  • the summer 201 adds the inverse-transformed residual block to the prediction block to generate a reconstructed block, which is used as reference pixels for the prediction of subsequent original blocks.
  • the reconstructed block is stored in the reference image memory 27 .
  • FIG. 3 is a schematic structural diagram of a video decoder 112 provided by an embodiment of the present application.
  • the video decoder 112 includes an entropy decoding module 30 , a prediction module 31 , an inverse quantization module 32 , an inverse transform module 33 , a summer 301 and a reference image memory 34 .
  • the entropy decoding module 30 includes a parsing module and a code rate control module.
  • video decoder 112 may perform a decoding process that is reciprocal to the encoding process described for video encoder 102 of FIG. 2 .
  • video decoder 112 receives a codestream of encoded video from video encoder 102 .
  • the parsing module in the entropy decoding module 30 of the video decoder 112 entropy decodes the code stream to generate quantization coefficients and syntax elements.
  • Entropy decoding module 30 passes the syntax elements to prediction module 31.
  • Video decoder 112 may receive syntax elements at the video slice level and/or the video block level.
  • the code rate control module in the entropy decoding module 30 generates quantization parameters based on the information of the image to be decoded obtained by the analysis module, so that the inverse quantization module 32 performs related calculations.
  • the code rate control module can also calculate the quantization parameter based on the reconstructed block reconstructed by the summer 301.
  • the inverse quantization module 32 performs inverse quantization (i.e., dequantization) on the quantization coefficients decoded from the code stream by the entropy decoding module 30, using the generated quantization parameters.
  • the inverse quantization process may include a process of determining a degree of quantization using quantization parameters calculated by video encoder 102 for each video block in the video slice. Likewise, the inverse quantization process may also include a process of determining a degree of inverse quantization to be applied.
  • the inverse transform module 33 applies an inverse transform (for example, the inverse of DCT, DST or other transform methods) to the inverse-quantized transform coefficients to obtain an inverse transform unit, that is, a residual block.
  • the size of the inverse transformation unit can be the same as the size of the TU.
  • the inverse transform corresponds to the forward transform used at the encoding side: the inverse of DCT or DST is inverse DCT, inverse DST, or a conceptually similar inverse transform process.
  • video decoder 112 forms a decoded image block by summing the inverse-transformed residual block from inverse-transform module 33 with the prediction block.
  • Summer 301 represents one or more components that perform this summation operation.
  • a deblocking filter may also be applied to filter the image of the decoded blocks in order to remove blocking artifacts. Decoded image blocks in a given frame or image are stored in reference image memory 34 as reference pixels for subsequent predictions.
  • FIG. 4 is a schematic flow chart of a video encoding/decoding method provided by this application.
  • the video encoding/decoding implementation method includes process 1 to process 5.
  • Processes 1 to 5 may be executed by any one or more of the above-mentioned source device 10, video encoder 102, destination device 11, or video decoder 112.
  • Process 1 Divide a frame of image into one or more parallel coding units that do not overlap with each other. There is no dependency between the one or more parallel coding units, and they can be completely parallel/independently encoded and decoded, as shown in Figure 4, parallel coding unit 1 and parallel coding unit 2.
  • each parallel coding unit can be divided into one or more independent coding units that do not overlap with each other.
  • the independent coding units may not depend on each other, but they can share some parallel coding unit header information.
  • the independent coding unit may include three components of brightness Y, first chroma Cb, and second chroma Cr, or three components of RGB, or may include only one of the components. If the independent coding unit contains three components, the sizes of the three components can be exactly the same or different, depending on the input format of the image.
  • the independent coding unit can also be understood as one or more processing units formed by N channels included in each parallel coding unit.
  • the above three components Y, Cb, and Cr are the three channels that constitute the parallel coding unit; each of them can be an independent coding unit, or Cb and Cr can be collectively referred to as the chroma channel, in which case the parallel coding unit includes a brightness channel and a chroma channel.
  • each independent coding unit can be divided into one or more non-overlapping coding units.
  • Each coding unit within the independent coding unit can be interdependent.
  • the multiple coding units can refer to one another for prediction during encoding and decoding.
  • if the coding unit and the independent coding unit have the same size (that is, the independent coding unit is divided into only one coding unit), the size can be any of the sizes described in process 2.
  • the coding unit may include three components of brightness Y, first chroma Cb, and second chroma Cr (or three RGB components), or may include only one of the components. If it contains three components, the sizes of the components can be exactly the same or different, depending on the image input format.
  • process 3 is an optional step in the video encoding and decoding method.
  • the video encoder/decoder can encode/decode the residual coefficients (or residual values) of the independent coding units obtained in process 2.
  • Process 4: divide the coding unit into one or more non-overlapping prediction groups (PG); PG can also be referred to simply as Group.
  • Each PG is encoded and decoded according to the selected prediction mode to obtain the predicted value of the PG; the predicted values of all PGs constitute the predicted value of the entire coding unit, and the residual value of the coding unit is obtained based on the predicted value and the original value. For example, in Figure 4, one coding unit in the independent coding unit is divided into PG-1, PG-2 and PG-3.
  • Process 5: based on the residual values of the coding unit, group them to obtain one or more non-overlapping residual blocks (RB).
  • the residual coefficients of each RB are encoded and decoded according to the selected mode, forming a residual coefficient stream. Specifically, the modes can be divided into two categories: those that transform the residual coefficients and those that do not. As shown in Figure 4, one coding unit is grouped to obtain RB-1 and RB-2.
  • the selected mode of the residual coefficient encoding and decoding method in process 5 may include, but is not limited to any of the following: semi-fixed length encoding method, exponential Golomb encoding method, Golomb-Rice encoding method, truncated unary code Encoding method, run length encoding method, direct encoding of original residual value, etc.
  • for example, if the exponential Golomb coding method is selected to encode the residual coefficients of each RB, then when decoding the residual coefficients of each RB it is also necessary to select the decoding method corresponding to the exponential Golomb coding method.
  • the video encoder may directly encode the coefficients within the RB.
  • the video encoder can also transform the residual block, such as DCT, DST, Hadamard transform, etc., and then encode the transformed coefficients.
  • the video encoder can directly quantize each coefficient in the RB uniformly, and then perform binary encoding. If the RB is large, it can be further divided into multiple coefficient groups (CG), and then each CG is uniformly quantized and then binary encoded. In some embodiments of the present application, the size of the coefficient group (CG) and the quantization group (QG) may be the same.
  • the maximum value of the absolute value of the residual within an RB is defined as the modified maximum value (modified maximum, mm).
  • the way to determine the coding length (CL) is to find the smallest value M such that all residuals of the current RB lie within the range [-2^(M-1), 2^(M-1)], and to use the found M as the CL of the current RB. If both boundary values -2^(M-1) and 2^(M-1) exist in the current RB, M is increased by 1, that is, M+1 bits are needed to encode all the residuals of the current RB; if only one of the two boundary values -2^(M-1) and 2^(M-1) exists in the current RB, a trailing bit is encoded to determine whether that boundary value is -2^(M-1) or 2^(M-1); if neither -2^(M-1) nor 2^(M-1) exists among the residuals of the current RB, the trailing bit does not need to be encoded.
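  • As a sketch, one literal reading of this rule is the following; the convention of returning the CL together with a trailing-bit flag is an assumption for illustration.

```python
# Hypothetical sketch of the coding-length (CL) rule described above: find the
# smallest M with all residuals in [-2^(M-1), 2^(M-1)], then handle boundaries.

def coding_length(residuals):
    M = 1
    while not all(-(1 << (M - 1)) <= r <= (1 << (M - 1)) for r in residuals):
        M += 1
    lo, hi = -(1 << (M - 1)), 1 << (M - 1)
    has_lo, has_hi = lo in residuals, hi in residuals
    if has_lo and has_hi:
        return M + 1, False   # both boundary values present: M+1 bits, no trailing bit
    if has_lo or has_hi:
        return M, True        # exactly one boundary value: M bits plus a trailing bit
    return M, False           # no boundary value: M bits, no trailing bit

print(coding_length([-3, 0, 2, 4]))  # mm = 4 -> M = 3; 4 = 2^(3-1) is the only boundary
```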
  • the video encoder can also directly encode the original value of the image instead of the residual value.
  • FIG. 5 provides a schematic structural diagram of a video codec device. As shown in Figure 5, the codec device 50 may be a part of the video encoder 102 or a part of the video decoder 112.
  • the codec device 50 can be applied on the encoding side or the decoding side.
  • the encoding and decoding device 50 includes a processor 501 and a memory 502.
  • the processor 501 is connected to the memory 502 (eg, connected to each other through a bus 504).
  • the encoding and decoding device 50 may also include a communication interface 503, which is connected to the processor 501 and the memory 502 for receiving/transmitting data.
  • the memory 502 may be a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), or a portable read-only memory (Compact Disc Read-Only Memory, CD-ROM).
  • the processor 501 may be one or more central processing units (Central Processing Units, CPUs), such as CPU 0 and CPU 1 shown in Figure 5.
  • the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 501 is used to read the program code stored in the memory 502 and perform the operations of any one of the implementations corresponding to Figure 6 and its various feasible implementations.
  • the decoding method provided by this application can be used for the video encoder 102 and the video decoder 112 .
  • the video encoder 102 may not use the decoding method of this application to perform encoding, and may not transmit the quantization parameter information to the video decoder 112. In this case, the video decoder 112 may still use the decoding method provided by this application.
  • the video encoder 102 can use the decoding method of the present application to perform encoding and transmit the quantization parameter information to the video decoder 112. In this case, the video decoder 112 can obtain the quantization parameter information from the code stream to decode.
  • FIG. 6 is a flow chart of a video decoding method provided by this application.
  • the method includes:
  • the video decoder obtains the complexity information of the current block in the image to be processed.
  • the complexity information of the current block is used to represent the degree of difference in pixel values of the current block.
  • the complexity information of the current block is obtained by calculating at least one angular gradient of the current block based on at least the pixel values of the current block.
  • the information of an image block is usually represented by the pixels contained in the image block.
  • if the pixel values of the pixels of an image block differ little, that is, the complexity is low, the color of the image block changes little, and the image block is considered relatively simple.
  • if the pixel values of the pixels of an image block differ greatly, that is, the complexity is high, the color of the image block changes greatly, and the image block is considered relatively complex.
  • the complexity information (block_complexity) of the current block is obtained by calculating at least one angular gradient of the current block based on at least the pixel value of the current block.
  • the angular gradient of the current block refers to calculating the difference in pixel values of the current block based on the gradient direction of a certain angle.
  • Angle gradients include horizontal gradients, vertical gradients, and other angular gradients.
  • the horizontal gradient of the current block refers to the set of differences calculated between the pixel value in column t and the pixel value in column t-1 of the current block based on the horizontal left or right gradient direction.
  • t is an integer greater than 1. The formula is as follows:
  • horizontal gradient gradH = pixel value in column t - pixel value in column t-1;
  • Figure 7 provides a schematic diagram for calculating the angle gradient, as shown in (a) in Figure 7.
  • the image block is calculated according to the above formula along the direction shown in the figure, and 3×2 difference values can be obtained; the horizontal gradient of the image block is then the set of these 3×2 difference values.
  • complexity_hor = sum of the elements in the horizontal gradient (gradH) / number of elements (grad_block_size);
  • the complexity_hor of the current block = sum of the 6 differences / 6.
  • the elements in the above horizontal gradient may be each horizontal gradient calculated in the horizontal direction of the image block.
  • the vertical gradient of the current block refers to the set of differences calculated between the pixel value of the sth row of the current block and the pixel value of the s-1th row based on the vertical upward or downward gradient direction.
  • s is an integer greater than 1. The formula is as follows:
  • vertical gradient V = pixel value in row s - pixel value in row s-1;
  • the image block is calculated by the above formula along the direction shown in the figure, and 4×1 differences can be obtained; the vertical gradient of the image block is then the set of these 4×1 differences.
  • similarly, the vertical complexity (complexity_ver) is calculated:
  • complexity_ver = sum of the 4 differences / 4.
  • the elements in the above-mentioned vertical gradient can be each vertical gradient calculated in the vertical direction of the image block.
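  • As a non-normative illustration, the horizontal and vertical complexity computations described above can be sketched in Python as follows; taking the absolute value of each difference, and the example 4-column by 2-row block (matching the Figure 7 counts of 3×2 horizontal and 4×1 vertical differences), are assumptions of this sketch.

    def complexity_hor(block):
        """Horizontal complexity: mean of the column-wise differences.

        block: 2-D list, block[row][col]. Each element of the horizontal
        gradient is (pixel in column t) - (pixel in column t-1); using the
        absolute value of each difference is an assumption of this sketch.
        """
        grads = [abs(row[t] - row[t - 1])
                 for row in block
                 for t in range(1, len(row))]
        return sum(grads) / len(grads)  # sum of elements / grad_block_size

    def complexity_ver(block):
        """Vertical complexity: mean of the row-wise differences."""
        grads = [abs(block[s][c] - block[s - 1][c])
                 for s in range(1, len(block))
                 for c in range(len(block[0]))]
        return sum(grads) / len(grads)

    # A 4-column x 2-row block as in the Figure 7 example:
    # 3x2 horizontal differences and 4x1 vertical differences.
    block = [[10, 12, 15, 19],
             [11, 13, 16, 20]]
    print(complexity_hor(block))  # sum of 6 differences / 6
    print(complexity_ver(block))  # sum of 4 differences / 4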
  • other angular gradients for the current block may include a 45° gradient, a 135° gradient, a 225° gradient, or a 315° gradient.
  • the directions of the other angle gradients are shown in (a) to (d) in Figure 8 respectively.
  • calculating the complexity information of the current block based on the pixel values of the current block is conducive to determining more accurate coding parameters, such as quantization parameters, for the current block, thereby improving the image quality of video decoding and the decoding efficiency of the image.
  • the video decoder may also refer to the reconstructed value of the decoded pixel value of the current block to determine the complexity information of the current block.
  • the first possible implementation is that the video decoder calculates at least one angular gradient of the current block based on the pixel values of the current block and the reconstructed values of the decoded pixel values of the current block, and obtains the complexity information of the current block based on the at least one angular gradient of the current block.
  • FIG. 9 provides another schematic diagram for calculating the angular gradient, as shown in (a) to (f) in Figure 9, in which the blank part represents the original pixels, that is, the pixel values of the current block, and the shaded part represents the reconstructed pixels, that is, the reconstructed values of the decoded pixel values of the current block.
  • Figure 9 is a schematic diagram of calculating the gradient of the current block line by line based on the pixels in the current block and the reconstructed values of the edge pixel values in the gradient direction of the pixel.
  • the gradient of the current block can be calculated by finding, along the gradient direction of the current block, the reconstructed value of the corresponding edge pixel value or another original pixel value in the current block, which is then used to calculate the elements contained in the current block in that gradient direction.
  • the calculated gradient of the current block can include: the gradient obtained by subtracting pixel 1-1 from pixel 2-1, the gradient obtained by subtracting pixel 1-2 from pixel 2-2, ..., the gradient obtained by subtracting pixel 1-16 from pixel 2-16, the gradient obtained by subtracting pixel 2-2 from pixel 3-1, the gradient obtained by subtracting pixel 2-3 from pixel 3-2, and so on.
  • the implementation method of obtaining the complexity information of the current block may refer to the following embodiments, which will not be described in detail here.
  • in this way, an element can be calculated for each pixel in the same gradient direction, and the number of elements contained in each gradient of the current block is equal to the number of pixels in the current block.
  • each of the above elements can represent the change along the gradient direction of the pixel in the current block corresponding to that element, and the elements correspond one-to-one to the pixels in the current block, so each element can uniformly represent the changes of the pixels in the current block along the gradient direction; the elements obtained in this way therefore help to obtain more accurate complexity information for the current block.
  • the video decoder calculates the complexity information of the current block based on the pixel values of the current block and the pixel values adjacent to the current block in the image to be processed.
  • the first possible implementation method uses the reconstructed values of the decoded pixel values of the current block, that is, reconstructed values of pixel values within the current block, whereas this implementation uses the pixel values adjacent to the current block in the image to be processed, that is, values related to the pixels in the blocks adjacent to the current block; these values can be reconstructed values or original values.
  • the method of calculating the complexity information is similar to the above method based on the reconstructed values of the decoded pixel values of the current block. Still as shown in Figure 9, the blank part in Figure 9 can be regarded as the pixels of the 16×2 current block; the difference is that the shaded part in Figure 9 is now regarded as the pixels adjacent to the current block, whose pixel values are the adjacent pixel values.
  • the current block includes 16 ⁇ 2 pixels.
  • the gradient elements may include: the reconstructed values of the pixels in the column preceding the current block minus the pixel values of the first column of the current block, the pixel values of the first column of the current block minus the pixel values of the second column of the current block, ..., and the pixel values of the fifteenth column of the current block minus the pixel values of the sixteenth column of the current block.
  • the above optional methods are only examples.
  • there can be multiple ways of selecting the reconstructed pixels and the adjacent pixels.
  • for example, the reconstructed pixel may be the average of the first n rows, the first n columns, or the first n reconstructed values of the current block, and the adjacent pixel may be the average of the first n rows, the first n columns, or the first n pixel values adjacent to the current block.
  • the complexity information of the current block is obtained based on at least one angular gradient of the current block.
  • the video decoder uses the minimum value among the complexity information obtained based on the at least one angular gradient as the complexity information of the current block. That is to say, if the complexity information of the current block calculated based on a certain angular gradient is the smallest, that minimum complexity information is used as the complexity information of the current block.
  • in some embodiments, the current block is composed of multiple (N) channel image blocks. The video decoder can determine the complexity information of each channel image block based on the pixel values of each channel image block among the N channel image blocks, and determine the complexity information of the current block based on the complexity information of each channel image block. For example, the minimum value among the complexity information of the N channel image blocks is determined as the complexity information of the current block. In this way, the complexity information calculated for each individual channel's image block is more accurate, which helps to improve the accuracy of determining the complexity information of the current block and the accuracy of video decoding.
  • for each channel image block, at least one angular gradient can be calculated. Based on each angular gradient, one piece of complexity information of the channel image block can be obtained, and the complexity information of the channel image block is then determined based on the obtained complexity information, for example, by determining the minimum value among the obtained complexity information as the complexity information of the channel image block.
  • in another implementation, the video decoder divides each channel image block among the multiple channel image blocks that constitute the current block, and determines the complexity information of each channel image block based on its divided sub-blocks. Specifically, the video decoder divides each channel image block into at least two sub-blocks, obtains the complexity information of the at least two sub-blocks of each channel image block, and determines the complexity information of the corresponding channel image block based on the complexity information of its at least two sub-blocks.
  • for example, a 16×2 channel image block of the current block is divided in the vertical direction into two 8×2 sub-image blocks, and the angular gradient of each sub-block is calculated to obtain the complexity information of that channel image block.
  • the implementation method of determining the complexity information of at least two sub-blocks of each channel image block may refer to the implementation method provided in Figure 7 above, and will not be described in detail here.
  • the video decoder determines the minimum value of the complexity information of at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
  • the dividing rules for the above-mentioned sub-blocks for each channel image block may be the same or different.
  • the size of each channel image block may be different due to the sampling format of the image.
  • for example, when the YUV format is sampled based on the 420 sampling format, the block sizes after luminance, first chroma, and second chroma sampling are different. Therefore, a relatively large channel image block can be divided into sub-blocks to determine the complexity information of that channel image block.
  • the complexity information calculated for the smaller sub-blocks of a channel image block is more accurate. Therefore, the complexity information of the channel image block obtained based on the complexity information of its sub-blocks is also more accurate, which helps to improve the accuracy of determining the complexity information of the current block and the accuracy of video decoding.
  • the multiple channels in this application are not limited to the aforementioned three RGB channels; there can also be more channels.
  • for example, when the image sensor is a four-channel sensor, the corresponding image to be processed includes four channels of image information; when the image sensor is a five-channel sensor, the corresponding image to be processed includes five channels of image information.
  • the multiple channels in this application may include at least one or more of the following channels: Y channel, U channel, V channel, Co channel, Cg channel, R channel, G channel, B channel, alpha channel, IR channel, D channel, W channel.
  • for example, the multiple channels may include the Y channel, U channel, and V channel; or the R channel, G channel, and B channel; or the R channel, G channel, B channel, and alpha channel; or the R channel, G channel, B channel, and IR channel; or the R channel, G channel, B channel, and W channel.
  • in addition to the RGB color photosensitive channels, there may also be IR channels (infrared or near-infrared photosensitive channels), D channels (dark-light channels, which mainly pass infrared or near-infrared light), and W channels (full-color photosensitive channels).
  • different sensors have different channels; for example, the sensor type can be an RGB sensor, RGBIR sensor, RGBW sensor, RGBIRW sensor, RGBD sensor, RGBDW sensor, etc.
  • the video decoder obtains the prediction angle used in the angle prediction mode of the current block, calculates the angle gradient based on the prediction angle to obtain the corresponding complexity information, and uses the corresponding complexity information as the current block complexity information.
  • the angle prediction mode is a common prediction mode, which is used to determine the residual between the pixel values of the current block and the reconstructed values based on a specified angle.
  • the prediction angle can also be called the prediction direction, which is similar to the angles involved in other angle gradients mentioned above, such as 45°, 135°, etc.
  • the implementation method of calculating the angle gradient based on the predicted angle and obtaining the corresponding complexity information can refer to the implementation method described above, and will not be described again here.
  • in this way, the prediction mode of the current block is associated with the calculation of the complexity information, so that the calculated complexity information of the current block is more targeted, which improves the accuracy of the complexity information and the accuracy of video decoding.
  • the prediction mode also includes multiple prediction modes such as mean (DC) prediction mode and planar (Planar) prediction mode.
  • the video coder may set multiple complexity information for the current block to determine the complexity information of the current block according to the prediction mode.
  • the plurality of complexity information can be calculated based on the above-mentioned plurality of angular gradients.
  • the plurality of complexity information can be preset in the video decoder. Wherein, the plurality of preset complexity information has a corresponding relationship with the prediction mode.
  • for example, the plurality of complexity information includes first complexity information and second complexity information, where the first complexity information corresponds to the angle prediction mode (which may include one or multiple angle prediction modes), and the second complexity information corresponds to the DC prediction mode and the Planar prediction mode.
  • the optimal prediction mode is determined as the prediction mode of the current block based on the rate-distortion (Rate-Distortion Optimized, RDO) cost.
  • the complexity information of the current block can be obtained based on the correspondence between the prediction mode determined by the RDO cost and the preset complexity information. Combined with the above example, if the prediction mode of the current block is determined to be the DC prediction mode based on the RDO cost, according to the above correspondence relationship, the complexity information of the current block should be the second complexity information.
  • the above prediction mode is divided into multiple categories, where each category can correspond to a piece of complexity information.
  • the multiple categories include an intra mode category, a point prediction mode category, a screen content coding (Screen Content Coding, SCC) mode category, a raw value mode category and a fallback (fallback) mode category.
  • angle prediction mode belongs to the intra mode class.
  • the categories to which the above-mentioned prediction modes belong can also be further divided according to whether transformation is performed.
  • the intra mode class can be divided into the intra mode + transformation class and the intra mode + no transformation class.
  • the corresponding relationship between complexity information and categories is similar to the above-mentioned first complexity information corresponding to one or more prediction modes.
  • the corresponding complexity information is used to calculate the quantization parameter.
  • the video decoder can determine the complexity information of the current block based on at least one angular gradient of the N-channel image blocks of the current block through the following steps S11-S12.
  • the video decoder obtains the complexity information of each channel image block based on the pixel value of each channel image block in the N-channel image block; N is an integer greater than zero.
  • the video decoder determines the complexity information of the current block based on the complexity information of each channel image block.
  • for the current block, at least one angular gradient can be obtained, so that the complexity information of the current block can be obtained based on the obtained angular gradients; for a channel image block, at least one angular gradient can be obtained, so that the complexity information of the channel image block is obtained according to the obtained angular gradients; for the sub-blocks of a channel image block, at least one angular gradient can also be obtained, so that the complexity information of the sub-block is obtained according to the obtained angular gradients.
  • Implementation method 1 Use the minimum value among the complexity information calculated based on the angle gradient as the complexity information of the current block.
  • block_complexity = min(complexity_ver, complexity_hor, complexity_45, complexity_135, complexity_225, complexity_315)
  • complexity_45 is the complexity information calculated based on the 45° angle gradient
  • complexity_225 is the complexity information calculated based on the 225° angle gradient
  • complexity_135 is the complexity information calculated based on the 135° angle gradient
  • complexity_315 is the complexity information calculated based on the 315° angle gradient.
  • Implementation method 2 Obtain the complexity information of the current block according to the weighted value of each complexity information calculated based on the angle gradient.
  • w_v represents the weighting value corresponding to the vertical gradient, w_h represents the weighting value corresponding to the horizontal gradient, w_225 represents the weighting value corresponding to the 225° gradient, and w_315 represents the weighting value corresponding to the 315° gradient.
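  • A minimal sketch of implementation methods 1 and 2, assuming the six per-angle complexities have already been computed; normalizing the weighted sum by the total weight is an assumption of this sketch, not a requirement of this application.

    def block_complexity_min(c_ver, c_hor, c_45, c_135, c_225, c_315):
        # Implementation method 1: the minimum of the per-angle complexities.
        return min(c_ver, c_hor, c_45, c_135, c_225, c_315)

    def block_complexity_weighted(complexities, weights):
        # Implementation method 2: weighted combination of the per-angle
        # complexities (weights w_v, w_h, w_45, w_135, w_225, w_315).
        return sum(c * w for c, w in zip(complexities, weights)) / sum(weights)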
  • Implementation method 3 Calculate the complexity level (complexity_level) of the original image block of each channel, and determine the block_complexity of the current block based on the complexity_level of each channel's image block, including the following steps S21-S23.
  • Step S21 The video decoder determines the complexity_level of each channel image block based on the complexity information (complexity) of each channel image block.
  • Case 1 Considering the subjective model classification of the human eye, A-1 absolute thresholds are set, and the image blocks are divided into A levels from simple to complex.
  • for example, the complexity_level of each channel image block is divided based on the following method: complexity < 4 belongs to level 1, 4 ≤ complexity < 16 belongs to level 2, and complexity ≥ 16 belongs to level 3.
  • alternatively, the complexity_level of each channel image block is divided based on the following method: complexity < 8 belongs to level 1, 8 ≤ complexity < 64 belongs to level 2, and complexity ≥ 64 belongs to level 3.
  • alternatively, the complexity_level of each channel image block is divided based on the following method: complexity < 16 belongs to level 1, 16 ≤ complexity < 256 belongs to level 2, and complexity ≥ 256 belongs to level 3.
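  • The three example threshold sets above can be expressed as a single classification routine; this sketch assumes that level numbering starts at 1 and that each boundary value belongs to the higher level.

    def complexity_level(complexity, thresholds=(4, 16)):
        # A-1 absolute thresholds divide image blocks into A levels.
        # With the default (4, 16): complexity < 4 -> level 1,
        # 4 <= complexity < 16 -> level 2, complexity >= 16 -> level 3.
        # Pass (8, 64) or (16, 256) for the other two example sets.
        level = 1
        for t in thresholds:
            if complexity >= t:
                level += 1
        return level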
  • Case 2 Considering the subjective model of the human eye, B-1 absolute thresholds are set, then C dynamic thresholds that are updated with the image blocks are set, and the image blocks are divided into B+C levels.
  • FIG. 11 is a schematic diagram of the relationship between dynamic thresholds and absolute thresholds provided by an embodiment of the present application.
  • for example, if absolute threshold 1 is 4 and absolute threshold 2 is 16, then 0 < dynamic threshold (thread) 1 < 4 and 4 < thread2 < 16.
  • complexity ≤ thread1 belongs to level 1; thread1 < complexity ≤ 4 belongs to level 2; 4 < complexity ≤ thread2 belongs to level 3; thread2 < complexity ≤ 16 belongs to level 4; and complexity > 16 belongs to level 5.
  • the above dynamic threshold is updated with the image block.
  • for example, if the weighted complexity information of the current block is less than the weighted complexity information of the previous several image blocks, the dynamic threshold becomes smaller; if the weighted values are equal, the dynamic threshold remains unchanged; and if the weighted complexity information of the current block is greater, the dynamic threshold becomes larger.
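  • A sketch of the five-level classification with dynamic thresholds and of the update rule just described; the update step size and the clamping bounds are assumptions of this sketch.

    def level_with_dynamic_thresholds(complexity, thread1, thread2):
        # B-1 absolute thresholds (4, 16) plus C dynamic thresholds
        # (0 < thread1 < 4, 4 < thread2 < 16) give B + C = 5 levels.
        if complexity <= thread1:
            return 1
        if complexity <= 4:
            return 2
        if complexity <= thread2:
            return 3
        if complexity <= 16:
            return 4
        return 5

    def update_dynamic_threshold(thread, current_cx, window_cx,
                                 step=0.5, lo=0.0, hi=4.0):
        # The threshold moves down when the current block's weighted
        # complexity is below that of the previous blocks, up when it is
        # above, and stays unchanged when they are equal; the step size
        # and the (lo, hi) clamping interval are assumptions.
        if current_cx < window_cx:
            thread -= step
        elif current_cx > window_cx:
            thread += step
        return min(max(thread, lo), hi)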
  • Step S22 The video decoder determines the complexity level (block_complexity_level) of the current block based on the complexity_level of each channel image block.
  • among the channel image blocks, the channels to which the human eye is sensitive can be given a larger weight, and conversely a smaller weight.
  • the complexity level of each channel image block is weighted to obtain the complexity level of the current block.
  • block_complexity_level = complexity_level1 × w1 + complexity_level2 × w2 + complexity_level3 × w3
  • any of w1, w2, or w3 can be 0; w1, w2, and w3 respectively represent the weights of the complexity levels of the three channel image blocks, and complexity_level1, complexity_level2, and complexity_level3 respectively represent the complexity levels of the three channel image blocks.
  • Step S23 The video decoder represents the complexity information of the current block based on block_complexity_level.
  • the weights of the complexity levels of the image blocks of different channels can be flexibly adjusted to flexibly adjust the complexity information of the current block.
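  • Steps S21-S23 can be sketched end to end as follows; the threshold set and the weight values are placeholders, with a larger weight placed on the channel assumed to be the one the human eye is most sensitive to.

    def block_complexity_level(channel_complexities,
                               weights=(0.5, 0.25, 0.25),
                               thresholds=(4, 16)):
        # S21: map each channel's complexity to a complexity_level.
        def level(c):
            return 1 + sum(c >= t for t in thresholds)
        levels = [level(c) for c in channel_complexities]
        # S22: weight the per-channel levels (any weight may be 0).
        # S23: the result represents the complexity information of the block.
        return sum(l * w for l, w in zip(levels, weights))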
  • in other embodiments, the video decoder may skip the above step S21 and directly determine the complexity information (block_complexity) of the current block based on the complexity of each channel image block, that is, the following implementation method 4.
  • Implementation method 4 The video decoder directly weights the complexity of each channel image block to obtain the complexity information of the current block.
  • block_complexity = complexity1 × w4 + complexity2 × w5 + complexity3 × w6
  • any of w4, w5, or w6 can be 0; w4, w5, and w6 respectively represent the weights of the complexity information of the three channel image blocks, and complexity1, complexity2, and complexity3 respectively represent the complexity information of the three channels.
  • the weights represented by w4, w5 and w6 can be related to the subjective model of the human eye. For example, channels sensitive to the human eye can be given greater weight.
  • optionally, step S22 can also be used to classify block_complexity to obtain block_complexity_level.
  • the video decoder can still use the above method to determine the complexity information of the N-channel image block, and then determine the complexity information of the current block.
  • for example, if the complexity information of the two sub-blocks is sub_complexity1 and sub_complexity2 respectively, the complexity information (complexity1) of the channel image block is determined based on the complexity information of the two sub-blocks.
  • the following implementation methods 5 to 7 are included.
  • Implementation method 6 Weight the complexity information of each sub-block to obtain the complexity information of the channel image block.
  • complexity1 = sub_complexity1 × w7 + sub_complexity2 × w8, where w7 and w8 respectively represent the weights of the complexity information of the two sub-blocks, and 0 < w7 < 1, 0 < w8 < 1.
  • Implementation method 7 Determine the complexity information of the channel image block based on the complexity level of each sub-block.
  • the following (1) and (2) are explained by taking the complexity levels of the two sub-blocks as sub_complexity_level1 and sub_complexity_level2 respectively, the complexity information of the channel image block as complexity1, and the complexity level of the channel image block as complexity_level1 as an example.
  • the complexity level of each sub-block may be determined by: determining the complexity level of each sub-block based on the complexity information of each sub-block. Specifically, reference may be made to the description of step S21 above. The difference lies in the names of each sub-block and each channel image block, which will not be described in detail here.
  • the complexity level of the channel image block can be determined based on the complexity level of each sub-block.
  • the complexity levels of the sub-blocks (such as sub_complexity_level1 and sub_complexity_level2) can be weighted to obtain the complexity level of the channel image block (such as complexity_level1).
  • the expression is as follows:
  • complexity_level1 = sub_complexity_level1 × w9 + sub_complexity_level2 × w10, where w9 and w10 respectively represent the weights of the complexity levels of the two sub-blocks, and 0 < w9 < 1, 0 < w10 < 1.
  • the minimum value among the complexity levels of each sub-block (such as: sub_complexity_level1 and sub_complexity_level2) is used as the complexity level (such as: complexity_level 1) of the channel image block.
  • complexity_level1 = min(sub_complexity_level1, sub_complexity_level2).
  • this step is similar to S23 and will not be described in detail here.
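  • Implementation methods 6 and 7 for a channel image block split into two sub-blocks might look as follows; the equal default weights are assumptions of this sketch.

    def channel_complexity_weighted(sub1, sub2, w7=0.5, w8=0.5):
        # Implementation method 6:
        # complexity1 = sub_complexity1 * w7 + sub_complexity2 * w8,
        # with 0 < w7 < 1 and 0 < w8 < 1.
        return sub1 * w7 + sub2 * w8

    def channel_level_min(sub_level1, sub_level2):
        # Implementation method 7, variant (2):
        # complexity_level1 = min(sub_complexity_level1, sub_complexity_level2).
        return min(sub_level1, sub_level2)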
  • another implementation manner of step S601 is described below.
  • in other embodiments, the video encoder 102 can obtain the complexity information and transmit the complexity information to the video decoder 112. In this case, when the decoder is the video decoder 112, an optional implementation of the above step S601 includes: parsing the code stream, and obtaining the complexity information of the current block in the image to be processed from the code stream.
  • the video decoder determines the quantization parameter of the current block based on the complexity information of the current block.
  • the video decoder determines the quantization parameter of the current block according to the complexity information, including the following steps S31-S32.
  • the video decoder determines the reference quantization parameter (ref_qp) of the current block based on the complexity information of the current block.
  • the reference quantization parameters are used to guide the generation of quantization parameters.
  • in one implementation, the video decoder obtains the buffer area status of the image to be processed and the correspondence between the buffer area status and the complexity information of the current block, and determines the reference quantization parameter of the current block based on the correspondence between the buffer area status and the complexity information.
  • the video encoder also includes a buffer module, and the buffer area in the buffer module is used to control the constant speed output of the code stream.
  • the constant speed output of the code stream refers to the constant speed output of the bits occupied by the code stream, which means that the buffer area will flow out the encoded code stream at a constant speed to achieve stable output.
  • the buffer area does not allow overflow, where overflow includes upward overflow and underflow: an upward overflow is defined as exceeding the maximum value (max_buffer) of the buffer area status, and an underflow is defined as falling below the minimum value (0) of the buffer area status.
  • the above buffer area status may also be called physical buffer status (physical_buffer).
  • the video decoder can directly read the status information of the corresponding buffer to obtain the buffer area status of the image to be processed.
  • the video decoder obtains the corresponding relationship between the buffer area status of the image to be processed and the complexity information of the current block through the following steps S41-S42.
  • the video decoder determines the fullness according to the buffer area status; the fullness can be obtained as a piecewise linear mapping of physical_buffer, i.e., fullness = physical_buffer × a + b.
  • a and b are the piecewise linear mapping parameters of physical_buffer: a represents the scaling ratio applied to physical_buffer, and b represents the offset applied to physical_buffer. These parameters can be adjusted based on the buffer area status, image information, and complexity information, as described in the following cases.
  • Case 1 Determine parameters a and b based on image information.
  • block_size represents the size of the block.
  • for example, when the buffer area status is small, the fullness can be further reduced through the above a and b to obtain a smaller ref_qp and improve decoding accuracy.
  • Case 2 Determine parameters a and b based on image information and buffer area status.
  • Case 3 Determine parameters a and b based on image information and complexity information.
  • the level of the complexity information can correspond to the complexity information levels described above.
  • Case 4 Determine parameters a and b according to the buffer area status.
  • the video decoder calculates ref_qp based on fullness.
  • ref_qp = fullness × c + d
  • c and d are parameters that can be adjusted according to the buffer area status, image information and complexity information.
  • the parameter c can be determined based on max_qp, where max_qp is the maximum quantization parameter.
  • c and d are the piecewise linear mapping parameters that map fullness to ref_qp: c represents the scaling ratio applied to fullness, and d represents the offset applied to fullness. These parameters can be adjusted based on image information, complexity information, and fullness, as described in the following cases.
  • Case 1 Determine parameters c and d based on image information.
  • Case 2 Determine parameters c and d based on fullness and image information.
  • for example, parameters c and d are determined based on the bit width and target pixel depth (target_bpp) in the image information.
  • for example, one set of values of c and d may be used when target_bpp is 8 bits and fullness < 0.1, and another set when target_bpp is 8 bits and fullness > 0.8.
  • c and d can be updated with image blocks.
  • target_bpp is a parameter specified by the encoding end, indicating the average number of bits required per pixel after compression. For example, for an original image with a bit width of 10 bits and the sampling format YUV444, the bpp of the original image is 30 bits; a target_bpp of 5 bits then means 6× compression.
  • ref_qp can also be determined based on the bit width and pixel depth in the image information. Specifically, a range can be determined for ref_qp based on the image information, where different image information corresponds to a different range of ref_qp. Specifically, the range is the interval formed by the minimum reference quantization parameter (min_ref_qp) and the maximum reference quantization parameter (max_ref_qp).
  • Case 3 Determine parameters c and d according to the status of the buffer area.
  • parameters c and d are determined according to the relationship between physical_buffer and max_buffer.
  • Case 4 Determine parameters c and d based on complexity information and fullness.
  • simple blocks can be understood as blocks whose complexity information is less than or equal to a first preset value, complex blocks can be understood as blocks whose complexity information is greater than or equal to a second preset value, and ordinary blocks can be understood as blocks whose complexity information lies between the first preset value and the second preset value.
  • Case 5 Determine parameter c based on complexity information.
  • when the computing resources consumed by encoding, such as the number of bits in the buffer area, are low, the sensitivity of qp to changes in fullness can be appropriately reduced, that is, the parameter c can be appropriately reduced; this prevents qp from being adjusted too small due to changes in fullness and ensures that the encoding quality of blocks with lower complexity will not be too low. Conversely, the sensitivity of qp to changes in fullness can be appropriately increased, that is, the parameter c can be appropriately increased, which improves the ability to control the code rate.
  • that is, for blocks with higher complexity, a higher parameter c can be determined, and for blocks with lower complexity, a lower parameter c can be determined.
  • Case 6 Determine parameters e and f based on complexity information, maximum complexity (max_complexity), and maximum quantization parameter (max_qp), where e and f are parameters related to fullness and buffer status and image information.
  • ref_qp = complexity / max_complexity × max_qp × e + f.
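  • The mappings of steps S41 and S42, and the Case 6 variant, in sketch form; treating a as a simple normalization of physical_buffer is an assumption of this sketch.

    def fullness_from_buffer(physical_buffer, a, b):
        # S41: fullness is a piecewise linear mapping of the buffer status,
        # fullness = physical_buffer * a + b. A common normalization
        # (an assumption here) is a = 1 / max_buffer, b = 0.
        return physical_buffer * a + b

    def ref_qp_from_fullness(fullness, c, d):
        # S42: ref_qp = fullness * c + d, where c and d are adjusted per
        # the cases above (e.g., c tied to max_qp).
        return fullness * c + d

    def ref_qp_case6(complexity, max_complexity, max_qp, e, f):
        # Case 6: ref_qp = complexity / max_complexity * max_qp * e + f.
        return complexity / max_complexity * max_qp * e + f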
  • the above determination of the reference quantization parameter based on complexity information and fullness or buffer area status may also refer to the function graph of the reference quantization parameter, complexity, and buffer area status provided in Figure 12a.
  • as the buffer area status increases, the reference quantization parameter increases, and the complexity information can have different effects on the growth of the reference quantization parameter depending on the buffer area status.
  • when the buffer area status of an image block with higher complexity increases, the reference quantization parameter also increases; for an image block with lower complexity, the increase of the reference quantization parameter with the buffer area status can be smaller than that of an image block with higher complexity.
  • the buffer area status has the main impact on the reference quantization parameter; that is, when the buffer area status is relatively empty or relatively full, the complexity has a small impact, and within the interval [max_buffer × 0.15, max_buffer × 0.85] the complexity has an even smaller impact.
  • since the impact of complexity is small, the reference quantization parameter can remain unchanged within this interval, corresponding to curve L3.
  • alternatively, the reference quantization parameter can grow slowly within this interval, corresponding to curve L2 or L4.
  • when the buffer area status is relatively empty or relatively full, the reference quantization parameter may change abruptly, corresponding to curve L1 or L5.
  • the starting point based on complexity changes is determined based on the complexity information itself.
  • in this way, the input and output of the code stream in the buffer area of the image to be processed can be dynamically controlled, resulting in a more stable code stream output.
  • the video decoder determines the complexity level of the current block, determines the target bits (target_cost) based on the complexity level of the current block, and obtains the reference quantization parameter of the current block based on the target bits.
  • the target bits refer to the number of predicted bits after decoding the current block.
  • the actual number of decoded bits of the current block may be greater than, less than, or equal to the target bits.
  • the above target bits refer to the number of bits occupied by the current block in the code stream.
  • the video decoder determines the target bits according to the complexity level of the current block through the following situations.
  • g and h are piecewise linear mapping parameters that map ref_cost to target_cost, i.e., target_cost = ref_cost × g + h, where g represents the scaling ratio applied to ref_cost and h represents the offset applied to ref_cost.
  • g and h are used to modify ref_cost according to ave_cost to obtain target_cost.
  • Case 2 Determine target_cost based on complexity information.
  • for example, there are I kinds of complexity information and J types of modes, which are divided into I × J categories, and each category corresponds to a target_cost.
  • for example, the complexity information indicates whether the current block is simple or complex, and the modes include intra block copy (IBC) mode and non-IBC mode; each of the resulting categories corresponds to a target_cost.
  • pred_cost_{t-1} is the number of predicted bits corresponding to the previous image block.
  • Image information includes bit width, image sampling format, or other information.
  • for example, there are K bit widths and L image sampling formats, which are divided into K × L categories, and each category corresponds to a target_cost.
  • Case 4 Obtain the fullness, and determine target_cost based on the fullness combined with the complexity information of the current block, the buffer area status, or the image information.
  • the process of obtaining the fullness is the same as step S41 above.
  • Case 4.1 Determine target_cost based on fullness and buffer area status.
  • target_cost = m × ref_cost + n × physical_buffer + o, where m, n, and o are parameters.
  • if fullness > 0.85, the setting of target_cost is dominated by fullness, that is, the value of n is larger than m; if fullness < 0.25, m is larger than n.
  • Case 4.2 Determine target_cost based on fullness, image information and complexity information.
  • min_target_cost = bpp × fullness × p1 + q1; max_target_cost = bpp × fullness × p2 + q2. Here p1 and q1 make min_target_cost smaller when fullness < 0.25 and larger when fullness > 0.75, and p2 and q2 make max_target_cost smaller when fullness < 0.25 and larger when fullness > 0.75.
  • Case 4.3 Determine target_cost based on fullness and complexity information.
  • for the constant segments of simple blocks and ordinary blocks, when the actual coded bits are greater than the target bits, the value of n increases, and vice versa it decreases; that is, n is adjusted so that the actual bit consumption is less than or equal to target_cost. For complex blocks, n is adjusted so that the actual bit consumption is greater than or equal to target_cost (if simple blocks and ordinary blocks do not save bits, then complex blocks have no extra bits available, and the actual bit consumption must be strictly less than or equal to target_cost).
  • u and v are parameters.
  • in this way, the reference quantization parameter obtained based on the target bits corresponding to the complexity level can be flexibly adjusted according to the size of the target bits, so that the quantization parameter can be adjusted more flexibly.
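  • A sketch of the Case 4.1 target-bit computation and the Case 4.2 clamping; the parameters m, n, o, p1, q1, p2, and q2 are supplied by the caller as in the text above.

    def target_cost_case_4_1(ref_cost, physical_buffer, m, n, o):
        # Case 4.1: target_cost = m * ref_cost + n * physical_buffer + o.
        # When fullness > 0.85 the buffer term dominates (n > m);
        # when fullness < 0.25, m is larger than n.
        return m * ref_cost + n * physical_buffer + o

    def clamp_target_cost(target_cost, bpp, fullness, p1, q1, p2, q2):
        # Case 4.2: clamp target_cost into [min_target_cost, max_target_cost],
        # where min_target_cost = bpp * fullness * p1 + q1 and
        # max_target_cost = bpp * fullness * p2 + q2.
        min_tc = bpp * fullness * p1 + q1
        max_tc = bpp * fullness * p2 + q2
        return max(min_tc, min(target_cost, max_tc))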
  • the implementation of step S32 is described below.
  • the video decoder determines the quantization parameter of the current block based on the reference quantization parameter of the current block.
  • x and y are piecewise linear mapping parameters that map ref_qp to qp, i.e., qp = ref_qp × x + y, where x represents the scaling ratio applied to ref_qp and y represents the offset applied to ref_qp.
  • the video decoder determines the quantization parameter based on the current reference quantization parameter through the following situations.
  • Case 1 Determine parameters x and y based on image information.
  • in a special case (e.g., when x is 1 and y is 0), the reference quantization parameter is directly used as the quantization parameter.
  • Case 2 Determine parameters x and y based on complexity information.
  • in one implementation, the video decoder determines a weighting coefficient according to the complexity information of the current block, where the weighting coefficient is used to adjust the quantization parameter of the current block according to the complexity of the current block, and determines the quantization parameter of the current block based on the weighting coefficient and the reference quantization parameter of the current block.
  • w represents the weight, including w11, w12, and w13, with 0 < w11, w12, w13 < 1.
  • block_complexity1, block_complexity2 and block_complexity3 respectively represent the complexity information of the three channels of the current block.
  • in this way, the quantization parameter of the current block is adjusted using the weighting coefficient determined by the complexity information of the current block, so that the quantization parameter of the current block can be adaptively adjusted according to its complexity information.
  • Case 3 Determine the weighting coefficient based on the complexity information of the M decoded image blocks and the complexity information of the current block; determine the quantization parameter of the current block based on the weighting coefficient and the reference quantization parameter of the current block.
  • the weighting coefficient can be regarded as the following x, and the weighting coefficient can be determined by the following expression:
  • window_complexity represents the complexity information of the image blocks included in the sliding window. As the image blocks in the sliding window change, the window_complexity is updated accordingly.
  • the details can be calculated according to the following formula.
  • window_complexity_z = window_complexity_{z-1} × 0.75 + block_complexity × 0.25;
  • x = window_complexity1_z × w14 / (window_complexity1_z × w14 + window_complexity2_z × w15 + window_complexity3_z × w16)
  • window_complexity_z represents the complexity information of the previous M decoded image blocks starting from block z, and window_complexity_{z-1} represents the complexity information of the previous M decoded image blocks starting from block z-1.
  • window_complexity1_z, window_complexity2_z, and window_complexity3_z respectively represent the window complexity information of the three channels of the image blocks included in the sliding window.
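  • The sliding-window update and the weighting coefficient can be sketched as follows; the linear mapping qp = ref_qp × x + y follows the mapping above, and the default offset y = 0 is an assumption of this sketch.

    def update_window_complexity(prev_window_cx, block_cx):
        # window_complexity_z = window_complexity_{z-1} * 0.75
        #                       + block_complexity * 0.25
        return prev_window_cx * 0.75 + block_cx * 0.25

    def weighting_coefficient(wc1, wc2, wc3, w14, w15, w16):
        # x = wc1 * w14 / (wc1 * w14 + wc2 * w15 + wc3 * w16),
        # where wc1..wc3 are the per-channel window complexities.
        return wc1 * w14 / (wc1 * w14 + wc2 * w15 + wc3 * w16)

    def qp_from_ref_qp(ref_qp, x, y=0.0):
        # qp = ref_qp * x + y (the additive offset y is an assumption).
        return ref_qp * x + y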
  • the video decoder decodes the current block based on the quantization parameter.
  • the complexity information of the current block is calculated based on the code rate control unit of the current block.
  • the quantization parameter of the current block is the quantization parameter of the code rate control unit of the current block.
  • in this case, the video decoder decoding the current block based on the quantization parameter of the current block includes: determining the quantization parameter of the decoding unit of the current block according to the quantization parameter of the code rate control unit, and decoding the current block according to the quantization parameter of the decoding unit.
  • on the encoding side, the above-mentioned decoding unit is an encoding unit; on the decoding side, it is a decoding unit.
  • when the code rate control module determines the quantization parameter, it performs the calculation based on the code rate control unit.
  • in some cases, the size of the code rate control unit is larger than the size of the basic coding unit (quantization unit). If the size of the code rate control unit is equal to the size of the quantization unit, one-to-one corresponding quantization parameters can be obtained. If the size of the code rate control unit is smaller than the size of the quantization unit, one quantization unit corresponds to multiple quantization parameters; in this case, a certain strategy needs to be adopted to determine the final quantization parameter of the quantization unit based on the multiple quantization parameters.
  • in a first possible implementation, the video decoder divides the quantization units based on the code rate control units, that is, the multiple quantization parameters correspond one-to-one to multiple quantization units.
  • in a second possible implementation, the multiple quantization parameters are weighted, or the minimum value among them is selected, to obtain one quantization parameter corresponding to the quantization unit.
  • in another possible implementation, multiple quantization parameters are merged based on the complexity information and the buffer area status; for example, quantization parameters with similar complexity information are merged into one.
  • the similar complexity information can be multiple complexity information that meet a certain difference range.
  • in this way, the quantization parameter of the coding unit of the current block is determined based on the quantization parameter of the code rate control unit, which makes the quantization parameter of the coding block match the code rate control strategy, so that the decoding result balances the code rate control requirement, the image quality, and the decoding efficiency.
  • the above merging method can weight similar quantization parameters to obtain a quantization parameter, or select the minimum value among similar quantization parameters as the merged quantization parameter.
  • the merged one or more quantization parameters are then used in the first possible implementation above, obtaining multiple quantization units with corresponding quantization parameters, or in the second possible implementation above, obtaining one quantization parameter corresponding to one quantization unit. There are no restrictions on this.
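  • When one quantization unit covers several code rate control units, reducing the multiple quantization parameters to one might look like the following; the weighting and minimum strategies are the two named above, while the merge tolerance is an assumption of this sketch.

    def qp_for_quantization_unit(rc_qps, strategy="min", weights=None):
        # Reduce the multiple rate-control-unit QPs of one quantization
        # unit to a single QP, either by taking the minimum or by weighting.
        if strategy == "min":
            return min(rc_qps)
        weights = weights or [1.0 / len(rc_qps)] * len(rc_qps)
        return sum(q * w for q, w in zip(rc_qps, weights))

    def merge_similar_qps(qps, complexities, tolerance=2):
        # Merge QPs whose blocks have similar complexity information
        # (difference within `tolerance`); each group is replaced by its
        # minimum QP. The tolerance and the min rule are assumptions.
        merged, used = [], [False] * len(qps)
        for i, (q, c) in enumerate(zip(qps, complexities)):
            if used[i]:
                continue
            group = [q]
            for j in range(i + 1, len(qps)):
                if not used[j] and abs(complexities[j] - c) <= tolerance:
                    group.append(qps[j])
                    used[j] = True
            merged.append(min(group))
        return merged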
  • step S603 includes the video decoder encoding or decoding the current block based on the quantization parameter.
  • the video encoder encodes the complexity information of the current block into the code stream, or encodes the quantization parameters of the current block into the code stream. Accordingly, the decoding end obtains the complexity information in the code stream to calculate the quantization parameters for decoding, or the decoding end obtains the quantization parameters in the code stream for decoding.
  • the video encoder can also encode both of the above information into the code stream.
  • for example, when the video encoder encodes the complexity information of the current block into the code stream, the video decoder accordingly obtains the complexity information and calculates the quantization parameter, but the video decoder does not need to use the complexity information to update other parameters.
  • the above determination method of target_cost involves updating based on the complexity information of the current block.
  • the update based on the complexity may be based on historical information (such as the number of bits occupied by decoded image blocks), which differs from the result of updating the quantization parameter of a decoded image block; at this time, the parameters can be updated without using the complexity information, and the original parameter update method can still be retained.
  • Calculating the complexity information of the current block through the above method is conducive to determining more accurate decoding parameters, such as quantization parameters, for the current block, thereby improving the decoding efficiency of the image.
  • the following code stream grouping method can be performed on the code stream obtained based on the above decoding method.
  • image_width is a 16-bit unsigned integer used to specify the width of the image luminance component, that is, the number of samples in the horizontal direction.
  • the unit of image_width should be the number of samples per row of the image.
  • the upper left sample of the displayable area should be aligned with the upper left sample of the decoded image.
  • the value of ImageWidth is equal to the value of image_width.
  • the value of ImageWidth should not be 0 and should be an integer multiple of 16.
  • image_height is a 16-bit unsigned integer used to specify the height of the image luminance component, that is, the number of scanning lines in the vertical direction.
  • the unit of image_height should be the number of rows of image samples.
  • the value of ImageHeight is equal to the value of image_height.
  • the value of ImageHeight should not be 0 and should be an integer multiple of 2.
  • a strip is a fixed rectangular area in the image; therefore, a strip can also be called a rectangular strip. It contains parts of several coding units within the image, and strips do not overlap.
  • the division method is not limited and CUs can be further divided based on stripes.
  • the current image width or image height may be adjusted.
  • real_width is the actual image width and real_height is the actual image height, that is, they define the boundary of the displayable area of the image. In some cases, the width and height of the image are adaptively increased, resulting in the image_width and image_height shown in the figure.
  • a strip has a width (slice_width) and a height (slice_height).
  • SliceNum X can represent the number of slices in the horizontal direction of an image
  • SliceNum Y can represent the number of slices in the vertical direction of an image.
  • Figure 13b A schematic diagram of a strip is provided in Figure 13b.
  • if the code stream length of each slice is fixed, the lengths of the first R-1 segments (chunks) are fixed, and the length of the last one is not fixed.
  • Figure 14a and Figure 14b are respectively flowcharts of the code stream grouping method at the encoding end and at the decoding end.
  • FIG. 14a is a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of the present application, including steps S1401a-S1406a.
  • S1402a Calculate the total resources and determine chunkNum, the number of fragments in each strip.
  • total_resource refers to the resources occupied by the strip, calculated based on the number of bits required per pixel.
  • total_resource = ((slice_width × slice_height × target pixel depth (target_bpp) + 7) >> 3) × 3.
  • chunkNum = total_resource / size + n
  • block_num is the default configuration parameter, and block_num is an integer multiple of 4.
  • S1403a Sequentially encode sliceNumX slices of each slice row to generate sliceNumX bit stream buffers.
  • the bit stream buffers can be represented by slicebuffer[sliceNumX], and each bit stream buffer can also be zero-padded for byte alignment.
  • encoding each slice can be understood as encoding each image block in the slice according to the solution provided in the aforementioned embodiment, and obtaining the encoded data of each image block in the slice.
  • divide each bit stream buffer into N bit segments, where the length of the first N-1 segments is a first value, and the length of the last segment is a second value.
  • the above-mentioned bit fragments may also be called code stream fragments, and the above-mentioned N represents the number of fragments, that is, the aforementioned chunkNum.
  • each bitstream buffer is further divided into chunkNum chunk fragments.
  • for example, the first value can be size1 and the second value can be size2.
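  • A sketch of the resource computation and of the fixed/variable chunk split; the meanings of size and n in the chunkNum formula follow the text above, and treating the buffer as a Python bytes object is an assumption of this sketch.

    def total_resource(slice_width, slice_height, target_bpp):
        # total_resource = ((slice_width * slice_height * target_bpp + 7) >> 3) * 3
        return ((slice_width * slice_height * target_bpp + 7) >> 3) * 3

    def split_into_chunks(buffer, chunk_num, size1):
        # The first chunk_num - 1 chunks have the fixed length size1;
        # the last chunk (size2) takes whatever remains.
        chunks = [buffer[i * size1:(i + 1) * size1]
                  for i in range(chunk_num - 1)]
        chunks.append(buffer[(chunk_num - 1) * size1:])
        return chunks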
  • S1406a Determine whether the encoding of the slice has ended. If not, return to S1403a to encode the next slice line.
  • FIG. 14b is a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of the present application, including steps S1401b-S1406b.
  • This step is similar to the above step S1401a, and can obtain the first number of horizontal strips and the second number of vertical strips in the image. Similarly, the above image in this step is the aforementioned image to be processed.
  • specifically, the actual image width and actual image height of the image to be processed can be obtained from the video header information or image header information in the received code stream, and the above sliceNumX (the first number of slices) and sliceNumY (the second number of slices) are then calculated based on the actual image width and actual image height.
  • sliceNumX and sliceNumY can also be obtained directly from the video header information or image header information in the code stream.
  • S1402b Calculate the total resources and determine chunkNum, the number of fragments in each strip.
  • This step is the same as step S1402a described above.
  • S1403b Receive the code stream and parse the code stream segments of the sliceNumX slices, where the length of the first N-1 segments is the first value and the length of the last segment is the second value.
  • the decoding end can continuously receive the code stream, and of course can also directly obtain all the code streams to be decoded.
  • the number of code stream fragments in each slice is the above chunkNum.
  • specifically, the code stream segments of each strip are deinterleaved in units of strips, and the deinterleaved results are stored in the corresponding bit stream buffer areas.
  • S1406b Determine whether the slice code stream parsing is completed. If not, return to S1403b and parse the next slice line in sequence.
  • FIG. 15 a schematic diagram of interleaving of slices based on the code stream grouping method is shown.
  • sliceNumX is 2
  • a schematic diagram of interleaving of chunk segments based on the above code stream grouping method is shown.
  • fragment R in Figure 15 represents the R-th chunk.
  • the length of the R-th chunk is not fixed, and the lengths of the other R-1 chunks are fixed.
  • size 1 represents the length of other R-1 chunks
  • size 2 represents the length of the R-th chunk.
  • the code streams represented by (c) in Figure 15 are grouped according to size 1 and size 2 respectively.
  • size 1 is smaller than size 2
  • the interleaving schematic diagram of the grouped segments is shown in (a) in Figure 15.
  • size 1 is larger than size 2
  • the interleaving schematic diagram of the grouped segments is shown in (b) in Figure 15.
  • if the code stream length of each slice is fixed, the length of the r-th chunk is not fixed, and the lengths of the other R-1 chunks are fixed.
  • Figure 16a and Figure 16b are respectively flowcharts of the code stream grouping method at the encoding end and at the decoding end.
  • FIG. 16a is a schematic flowchart of a code stream grouping method at the encoding end, including steps S1601a-S1606a.
  • This step is the same as step S1401a described above.
  • S1602a Calculate the total resources and determine chunkNum, the number of fragments in each strip.
  • This step is the same as step S1402a described above.
  • S1603a Sequentially encode the sliceNumX slices of each slice row to generate sliceNumX bitstream buffers.
  • the above-mentioned bit fragments may also be called code stream fragments, and the above-mentioned N represents the number of fragments, that is, the aforementioned chunkNum.
  • each bitstream buffer can be divided into chunkNum chunk segments, and the length (chunksize) of each chunk is not fixed.
  • the first value can be size1 and the second value can be size2.
  • S1606a Determine whether the encoding of the slice has ended. If not, return to S1603a to encode the next slice line.
  • FIG 16b it is a schematic flow chart of a code stream grouping method at the decoding end, including steps S1601b-S1606b.
  • This step is the same as step S1401b described above.
  • S1602b Calculate the total resources and determine chunkNum, the number of fragments in each strip.
  • This step is the same as step S1402b described above.
  • S1606b Determine whether the slice code stream parsing is completed. If not, return to S1603b and parse the next slice line in sequence.
  • FIG. 17 shows a schematic diagram of fragment interleaving based on this code stream grouping method: when sliceNumX is 2, the interleaving of chunk fragments under the other code stream grouping method above is illustrated.
  • Fragment r in FIG. 17 represents the r-th chunk and fragment R represents the R-th chunk. The length of the r-th chunk is not fixed, while the lengths of the other R-1 chunks are fixed: size 1 represents the length of the other R-1 chunks and size 2 represents the length of the r-th chunk.
  • The code streams represented by (c) in FIG. 17 are grouped according to size 1 and size 2 respectively. When size 1 is smaller than size 2, the interleaving of the grouped fragments is shown in FIG. 17(a); when size 1 is larger than size 2, it is shown in FIG. 17(b).
  • Embodiments of the present application provide a video decoding device.
  • the video decoding device may be a video codec, a video encoder, or a video decoder. Specifically, the video decoding device is used to perform the steps performed by the video codec in the above video decoding method.
  • the video decoding device provided by the embodiment of the present application may include modules corresponding to corresponding steps.
  • Embodiments of the present application can divide the video decoding device into functional modules according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or software function modules.
  • the division of modules in the embodiments of this application is schematic and is only a logical function division. There may be other division methods in actual implementation.
  • FIG. 18 is a schematic diagram of the composition of a video decoding device provided by an embodiment of the present application.
  • the video decoding device 180 includes an acquisition module 1801 , a determination module 1802 and a decoding module 1803 .
  • the acquisition module 1801 is used to obtain the complexity information of the current block in the image to be processed.
  • the complexity information of the current block is obtained by calculating at least one angular gradient of the current block based on at least the pixel value of the current block, such as the above step S601.
  • the determination module 1802 is used to determine the quantization parameter of the current block according to the complexity information of the current block; for example, the above-mentioned step S602.
  • the decoding module 1803 is used to decode the current block based on the quantization parameters, such as the above step S603.
  • the acquisition module 1801 is specifically configured to calculate at least one angular gradient of the current block based on the pixel values of the current block and the reconstructed values of the already-decoded pixel values of the current block, and to obtain the complexity information of the current block according to the at least one angular gradient.
  • the acquisition module 1801 is specifically configured to calculate at least one angular gradient of the current block based on the pixel values of the current block and the pixel values adjacent to the current block in the image to be processed, and to obtain the complexity information of the current block according to the at least one angular gradient.
  • the acquisition module 1801 is specifically configured to acquire the prediction angle used in the angular prediction mode of the current block, calculate the angular gradient based on the prediction angle to obtain the corresponding complexity information, and use the corresponding complexity information as the complexity information of the current block.
  • the current block is an N-channel image block, and the acquisition module 1801 is specifically configured to obtain the complexity information of each channel image block based on the pixel values of each channel image block in the N-channel image block, where N is an integer greater than zero, and to determine the complexity information of the current block based on the complexity information of each channel image block.
  • the acquisition module 1801 is specifically configured to divide each channel image block into at least two sub-blocks, determine the complexity information of the at least two sub-blocks of each channel image block, and determine the complexity information of the corresponding channel image block based on the complexity information of its at least two sub-blocks.
  • the acquisition module 1801 is specifically configured to determine the minimum value of the complexity information of at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
  • the acquisition module 1801 is specifically configured to determine the minimum value in the complexity information of each channel image block as the complexity information of the current block.
  • the acquisition module 1801 is specifically configured to determine the complexity level of each channel image block based on the complexity information of each channel image block, and to determine the complexity information of the current block based on the complexity level of each channel image block.
  • the determination module 1802 is specifically configured to determine the reference quantization parameter of the current block based on the complexity information of the current block; and determine the quantization parameter of the current block based on the reference quantization parameter of the current block.
  • the determination module 1802 is specifically used to obtain the buffer state of the image to be processed, where the buffer state characterizes the number of bits occupied in the buffer by the image blocks of the image to be processed that have been encoded, and the buffer is used to control the constant-rate output of the code stream of the image to be processed; and to determine the reference quantization parameter of the current block according to the correspondence between the buffer state and the complexity information of the current block.
  • the determination module 1802 is specifically used to determine the complexity level of the current block; determine the corresponding target bits according to the complexity level of the current block, where the target bits refer to the number of bits occupied by the current block in the code stream; according to The target bit obtains the reference quantization parameter of the current block.
  • the determination module 1802 is specifically configured to determine a weighting coefficient according to the complexity information of the current block, where the weighting coefficient is used to adjust the quantization parameter of the current block according to the complexity of the current block, and to determine the quantization parameter of the current block according to the weighting coefficient and the reference quantization parameter of the current block.
  • the complexity information of the current block is calculated based on the rate control unit of the current block, where the rate control unit is the basic processing unit for calculating the complexity information of the current block, and the quantization parameter of the current block is the quantization parameter of this rate control unit; the decoding module 1803 is specifically used to determine the quantization parameter of the decoding unit of the current block according to the quantization parameter of the rate control unit, and to decode the current block according to the quantization parameter of the decoding unit.
  • the video decoding device provided by the embodiment of the present application includes but is not limited to the above-mentioned modules.
  • the video decoding device may also include a storage module 1804.
  • the storage module 1804 may be used to store program codes and data of the video decoding device.
  • An embodiment of the present application also provides a video decoder, including a processor and a memory; the memory stores instructions executable by the processor, and when the processor executes the instructions, the video decoder implements the video image decoding method in the above embodiments.
  • An embodiment of the present application also provides a video encoder, including a processor and a memory; the memory stores instructions executable by the processor, and when the processor executes the instructions, the video encoder implements the video image encoding method in the above embodiments.
  • Embodiments of the present application also provide a video coding and decoding system, including a video encoder and a video decoder.
  • the video encoder is used to perform any of the video coding methods provided in the above embodiments.
  • the video decoder is used to perform any of the video coding methods provided in the above embodiments.
  • An embodiment of the present application also provides an electronic device.
  • the electronic device includes the above-mentioned video decoding device 180.
  • the video decoding device 180 performs the method performed by any of the video decoders provided above.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program. When the computer program is run on the computer, it causes the computer to perform any of the methods performed by the video decoder provided above.
  • An embodiment of the present application also provides a chip.
  • the chip integrates a control circuit and one or more ports for realizing the functions of the video decoding device 180 described above.
  • the functions supported by this chip can be referred to above and will not be described again here.
  • the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a random access memory, etc.
  • the above-mentioned processing unit or processor can be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • Embodiments of the present application also provide a computer program product containing instructions. When the instructions are run on a computer, they cause the computer to perform any of the methods in the above embodiments.
  • the computer program product includes one or more computer instructions. When computer program instructions are loaded and executed on a computer, processes or functions according to embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
  • A computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. Available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., a solid state disk (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Embodiments of this application provide a video coding method, apparatus, and storage medium, relating to the field of video coding technology, which help improve coding efficiency. The method includes: obtaining complexity information of a current block in an image to be processed, where the complexity information of the current block is obtained by computing at least one angular gradient of the current block based on at least the pixel values of the current block; determining a quantization parameter of the current block according to the complexity information of the current block; and coding the current block based on the quantization parameter.

Description

Video coding method and apparatus, and storage medium
This application claims priority to Chinese Patent Application No. 202210612716.1, entitled "Video coding method and apparatus, and storage medium", filed with the Chinese Patent Office on May 31, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of video coding technology, and in particular to a video coding method, apparatus, and storage medium.
Background
Video coding technology plays an important role in the field of video processing. Video coding technology here covers both video encoding and video decoding. In the encoding and decoding process, the quantization and inverse quantization applied to the images in a video are key to the resulting image quality. Quantization reduces redundancy mainly by replacing part of the original data in the bitstream with quantization parameters, but the quantization process risks distorting the image. Therefore, determining more accurate quantization parameters for the images in a video, so as to improve coding efficiency while preserving image quality, is a problem that urgently needs to be solved.
Summary
Embodiments of this application provide a video coding method, apparatus, and storage medium, which help improve video coding efficiency.
To achieve the above object, the embodiments of this application adopt the following technical solutions:
In a first aspect, an embodiment of this application provides a video coding method, applied to a video encoding device, a video decoding device, or a chip of a video codec device. The method includes: obtaining complexity information of a current block in an image to be processed, where the complexity information of the current block is obtained by computing at least one angular gradient of the current block based on at least the pixel values of the current block; determining a quantization parameter of the current block according to the complexity information of the current block; and coding the current block based on the quantization parameter.
Quantization parameters play an important role in video encoding and decoding. With the video coding method proposed in this application, the video coding apparatus obtains complexity information of the current block in the image to be processed, which is computed from the current block's own information, determines the quantization parameter of the current block from that complexity information, and performs coding. Because the angular-gradient information of the current block is taken into account, this helps determine a more accurate quantization parameter for the current block, improving coding efficiency while preserving image quality. In addition, if the above method is used for video decoding, the decoding end determines the quantization parameter from the complexity information of the current block obtained from the bitstream, which helps reduce the resources occupied by quantization parameters in the bitstream, so that more payload data can be carried and transmission efficiency improves.
In a possible implementation, obtaining the complexity information of the current block in the image to be processed includes: computing at least one angular gradient of the current block based on the pixel values of the current block and reconstructed values of already-coded pixel values of the current block; and obtaining the complexity information of the current block from the at least one angular gradient.
In this implementation, computing the complexity of the current block from its pixel values and reconstructed values helps determine a more accurate quantization parameter for the current block, improving coding efficiency while preserving image quality.
In a possible implementation, obtaining the complexity information of the current block in the image to be processed includes: computing at least one angular gradient of the current block based on the pixel values of the current block and the pixel values adjacent to the current block in the image to be processed; and obtaining the complexity information of the current block from the at least one angular gradient.
In this implementation, computing the complexity of the current block from its own pixel values and the adjacent pixel values helps determine a more accurate quantization parameter for the current block, improving coding efficiency while preserving image quality.
In a possible implementation, obtaining the complexity information of the current block in the image to be processed includes: obtaining the prediction angle used by the angular prediction mode of the current block; computing an angular gradient based on the prediction angle to obtain the corresponding complexity information; and using that complexity information as the complexity information of the current block.
In this implementation, determining the corresponding complexity from the angular prediction mode of the current block helps determine a more accurate quantization parameter; for the decoding end, it also helps save bitstream resources and improves coding efficiency.
In a possible implementation, the current block is an N-channel image block, and obtaining the complexity information of the current block in the image to be processed includes: obtaining the complexity information of each channel image block based on the pixel values of each of the N channel image blocks, where N is an integer greater than zero; and determining the complexity information of the current block based on the complexity information of each channel image block.
This implementation provides a way to determine the complexity of the current block from the complexities of multiple channel image blocks, improving the feasibility of the solution. In addition, splitting the image into multiple channels and computing each separately helps improve the accuracy of the resulting complexity information.
In a possible implementation, obtaining the complexity information of each channel image block based on the pixel values of each of the N channel image blocks includes: dividing each channel image block into at least two sub-blocks; determining the complexity information of the at least two sub-blocks of each channel image block; and determining the complexity information of the corresponding channel image block based on the complexity information of its at least two sub-blocks.
This implementation provides a way to determine the complexity of the current block from the complexities of multiple channel image blocks, in which further dividing the channel image blocks before determining complexity helps improve the precision of the resulting complexity information.
In a possible implementation, determining the complexity information of the corresponding channel image block based on the complexity information of the at least two sub-blocks of each channel image block includes: determining the minimum value among the complexity information of the at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
This implementation provides a way to determine the complexities of the channel image blocks from the complexities of their divided sub-blocks, improving the feasibility of the solution.
In a possible implementation, determining the complexity information of the current block based on the complexity information of each channel image block includes: determining the minimum value among the complexity information of the channel image blocks as the complexity information of the current block.
This implementation provides a way to determine the complexity information of the current block from the complexity information of multiple channel image blocks, improving the feasibility of the solution.
In a possible implementation, determining the complexity information of the current block based on the complexity information of each channel image block includes: determining the complexity level of each channel image block based on its complexity information; and determining the complexity information of the current block based on the complexity level of each channel image block.
This implementation provides another way to determine the complexity of the current block from the complexities of the channel image blocks, improving the feasibility of the solution.
In a possible implementation, determining the quantization parameter of the current block according to its complexity information includes: determining a reference quantization parameter of the current block according to the complexity information of the current block; and determining the quantization parameter of the current block according to the reference quantization parameter of the current block.
This implementation provides a method for determining the quantization parameter from a reference quantization parameter, improving the accuracy of the resulting quantization parameter.
In a possible implementation, when the video coding method is a video encoding method, determining the reference quantization parameter of the current block according to its complexity information includes: obtaining the buffer state of the image to be processed, where the buffer state characterizes the number of bits occupied in the buffer by image blocks of the image to be processed that have been encoded, and the buffer is used to control constant-rate output of the code stream of the image to be processed; and determining the reference quantization parameter of the current block according to the correspondence between the buffer state and the complexity information of the current block. Naturally, when the video coding method is a video decoding method, the decoding process can simulate how the buffer buffers the code stream during the above encoding process and determine the reference quantization parameter from the simulation result.
This implementation provides a way to determine the reference quantization parameter of the current block from the buffer state and the complexity information of the current block, improving the feasibility of the solution.
In a possible implementation, determining the reference quantization parameter of the current block according to its complexity information includes: determining the complexity level of the current block; determining the corresponding target bits according to the complexity level of the current block, where the target bits refer to the number of bits the current block occupies in the code stream; and obtaining the reference quantization parameter of the current block from the target bits.
This implementation provides a way to determine the reference quantization parameter of the current block from the target bits, improving the feasibility of the solution.
In a possible implementation, determining the quantization parameter of the current block according to its reference quantization parameter includes: determining a weighting coefficient according to the complexity information of the current block, where the weighting coefficient is used to adjust the quantization parameter of the current block according to its degree of complexity; and determining the quantization parameter of the current block according to the weighting coefficient and the reference quantization parameter of the current block.
This implementation provides a way to determine the quantization parameter of the current block from the reference quantization parameter, improving the feasibility of the solution.
In a possible implementation, the complexity information of the current block is computed based on the rate control unit of the current block, where the rate control unit is the basic processing unit for computing the complexity information of the current block; the quantization parameter of the current block is the quantization parameter of its rate control unit, and coding the current block based on the quantization parameter of the current block includes: determining the quantization parameter of the coding unit of the current block according to the quantization parameter of the rate control unit; and coding the current block according to the quantization parameter of the coding unit.
This implementation provides a way to determine the quantization parameter of the current block. When the size of the rate control unit is smaller than the size of the quantization unit, the multiple quantization parameters computed accordingly would be used to quantize a single quantization unit; the above method provides a corresponding solution to this problem, improving the feasibility of the solution.
In a second aspect, an embodiment of this application provides a video coding apparatus having the function of implementing the video coding method of any one of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function.
In a third aspect, an embodiment of this application provides a video encoder configured to perform the video coding method of any one of the first aspect.
In a fourth aspect, an embodiment of this application provides another video encoder, including a processor and a memory; the memory is used to store computer-executable instructions, and when the video encoder runs, the processor executes the computer-executable instructions stored in the memory, so that the video encoder performs the video coding method of any one of the first aspect.
In a fifth aspect, an embodiment of this application provides a video decoder configured to perform the video coding method of any one of the first aspect.
In a sixth aspect, an embodiment of this application provides another video decoder, including a processor and a memory; the memory is used to store computer-executable instructions, and when the video decoder runs, the processor executes the computer-executable instructions stored in the memory, so that the video decoder performs the video coding method of any one of the first aspect.
In a seventh aspect, an embodiment of this application provides a computer-readable storage medium storing a program that, when run on a computer, enables the computer to perform the video coding method of any one of the first aspect.
In an eighth aspect, an embodiment of this application provides a computer program product containing instructions that, when run on a computer, enable the computer to perform the video coding method of any one of the first aspect.
In a ninth aspect, an embodiment of this application provides an electronic device including a video coding apparatus, the processing circuit of which is configured to perform the video coding method of any one of the first aspect.
In a tenth aspect, an embodiment of this application provides a chip including a processor coupled to a memory, where the memory stores program instructions that, when executed by the processor, implement the video coding method of any one of the first aspect.
In an eleventh aspect, a video encoding and decoding system is provided, including a video encoder and a video decoder, where the video encoder is configured to perform the video coding method of any one of the first aspect, and the video decoder is configured to perform the video coding method of any one of the first aspect.
For the technical effects of any implementation of the second through eleventh aspects, refer to the technical effects of the corresponding implementation in the first aspect or in the detailed description below; they are not repeated here.
Brief Description of the Drawings
The drawings described here are intended to provide a further understanding of this application and form part of this application. The illustrative embodiments of this application and their descriptions are used to explain this application and do not constitute an improper limitation of it.
FIG. 1 is a system architecture diagram of a video encoding and decoding system provided by an embodiment of this application;
FIG. 2 is a schematic structural diagram of a video encoder provided by an embodiment of this application;
FIG. 3 is a schematic structural diagram of a video decoder provided by an embodiment of this application;
FIG. 4 is a schematic flowchart of video encoding and decoding provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of a video encoding and decoding apparatus provided by an embodiment of this application;
FIG. 6 is a flowchart of a video coding method provided by an embodiment of this application;
FIG. 7 is a schematic diagram of computing an angular gradient provided by an embodiment of this application;
FIG. 8 is a schematic diagram of another way of computing an angular gradient provided by an embodiment of this application;
FIG. 9 is a schematic diagram of yet another way of computing an angular gradient provided by an embodiment of this application;
FIG. 10 is a schematic diagram of dividing an image block and computing angular gradients provided by an embodiment of this application;
FIG. 11 is a schematic diagram of the relationship between dynamic thresholds and absolute thresholds provided by an embodiment of this application;
FIG. 12a is a plot of the reference quantization parameter as a function of complexity information and buffer state provided by an embodiment of this application;
FIG. 12b is a plot of the reference quantization parameter as a function of buffer state provided by an embodiment of this application;
FIG. 12c is a plot of the reference quantization parameter as a function of complexity information provided by an embodiment of this application;
FIG. 13a is a schematic diagram of image boundaries provided by an embodiment of this application;
FIG. 13b is a schematic diagram of slices provided by an embodiment of this application;
FIG. 14a is a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of this application;
FIG. 14b is a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of this application;
FIG. 15 is a schematic diagram of fragment interleaving based on a code stream grouping method provided by an embodiment of this application;
FIG. 16a is a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of this application;
FIG. 16b is a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of this application;
FIG. 17 is a schematic diagram of fragment interleaving based on a code stream grouping method provided by an embodiment of this application;
FIG. 18 is a schematic diagram of the composition of a video coding apparatus provided by an embodiment of this application.
Detailed Description of Embodiments
To make the objects, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. Obviously, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application fall within the protection scope of this application.
In the description of this application, unless otherwise stated, "/" means "or"; for example, A/B can mean A or B. "And/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B can mean: A alone, both A and B, or B alone. In addition, "at least one" means one or more, and "multiple" means two or more. Words such as "first" and "second" do not limit the quantity or the execution order, and do not necessarily indicate a difference.
It should be noted that in this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferable or advantageous over other embodiments or designs. Rather, such words are intended to present related concepts in a concrete manner.
First, the technical terms involved in the embodiments of this application are introduced:
1. Video coding technology
Video coding technology includes video encoding technology and video decoding technology, which may also be collectively referred to as video encoding and decoding (codec) technology.
A video sequence contains a series of redundancies: spatial redundancy, temporal redundancy, visual redundancy, information-entropy redundancy, structural redundancy, knowledge redundancy, importance redundancy, and so on. To remove as much of the redundant information in a video sequence as possible and reduce the amount of data representing the video, video encoding technology was proposed, achieving reduced storage space and saved transmission bandwidth. Video encoding technology is also called video compression technology.
Correspondingly, video decoding technology is needed to retrieve the data stored or transmitted based on the above video compression technology.
Internationally, video compression coding standards specify video encoding and decoding methods, for example: the MPEG-2 standard and the Advanced Video Coding (AVC) in Part 10 of the MPEG-4 standard, formulated by the Motion Picture Experts Group (MPEG), and H.263, H.264, and H.265 (also known as the High Efficiency Video Coding standard (HEVC)), formulated by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T).
It should be noted that in coding algorithms based on a hybrid coding architecture, the above compression coding methods may be used in combination.
The basic processing unit in video encoding and decoding is the image block, which is obtained by the encoding end partitioning a frame/picture. The partitioned image blocks are usually processed one by one, row by row. The image block being processed is called the current block, and an already-processed image block is called an encoded image block, a decoded image block, or a coded image block. Taking HEVC as an example, HEVC defines the Coding Tree Unit (CTU), Coding Unit (CU), Prediction Unit (PU), and Transform Unit (TU). A CTU, CU, PU, or TU can each serve as a partitioned image block, where PUs and TUs are both partitioned from CUs.
2. Video sampling
A pixel is the smallest complete sample of a video or image, so data processing of an image block is performed in units of pixels, with each pixel recording color information. One sampling scheme represents color by RGB, with three image channels: R for red, G for green, and B for blue. Another sampling scheme represents color by YUV, also with three image channels: Y for luminance, U for the first chrominance Cb, and V for the second chrominance Cr. Because people are more sensitive to luminance than to chrominance, storage space can be reduced by storing more data characterizing luminance and less data characterizing chrominance. Specifically, video encoding and decoding usually samples video in the YUV format, including the 420 sampling format, the 422 sampling format, and so on. These sampling formats determine the number of samples of the two chrominances from the number of luminance samples. For example, suppose a CU has 4×2 pixels, in the following format:
[Y0,U0,V0][Y1,U1,V1][Y2,U2,V2][Y3,U3,V3];
[Y4,U4,V4][Y5,U5,V5][Y6,U6,V6][Y7,U7,V7];
The 420 sampling format means YUV is sampled at a ratio of 4:2:0, i.e., luminance and the first or second chrominance are selected at a ratio of 4:2, with the first and second chrominance selected on alternating rows. The above CU then samples luminance Y0-Y3 and the first chrominances U0 and U2 from the first row, and luminance Y4-Y7 and the second chrominances V4 and V6 from the second row. After sampling, the CU consists of a luminance coding unit and chrominance coding units, where the luminance coding unit is:
[Y0][Y1][Y2][Y3];
[Y4][Y5][Y6][Y7];
the first chrominance coding unit is:
[U0][U2];
and the second chrominance coding unit is:
[V4][V6];
It can be seen that the image block sizes change after sampling in this format: the luminance coding unit remains 4×2, while the first chrominance coding unit becomes 2×1 and the second chrominance coding unit also becomes 2×1. Therefore, for a CU of size X×Y, the chrominance coding units after 420 sampling have size X/2×Y/2.
Similarly, the 422 sampling format means YUV is sampled at a ratio of 4:2:2, i.e., luminance, the first chrominance, and the second chrominance are selected at a ratio of 4:2:2. The sampled luminance coding unit of the above CU is:
[Y0][Y1][Y2][Y3];
[Y4][Y5][Y6][Y7];
the first chrominance coding unit is:
[U0][U2];
[U4][U6];
and the second chrominance coding unit is:
[V1][V3];
[V5][V7];
Here the luminance coding unit remains 4×2, while the first chrominance coding unit becomes 2×2 and the second chrominance coding unit also becomes 2×2. Therefore, for a CU of size X×Y, the chrominance coding units after 422 sampling have size X/2×Y.
The luminance coding unit, first chrominance coding unit, and second chrominance coding unit obtained by sampling serve as the per-channel data units for the subsequent processing of the current block.
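To make the size bookkeeping above concrete, here is a minimal sketch (an illustration only; the function name and format strings are assumptions, not part of the patent text) that computes the per-channel chroma block size for a given CU size and sampling format:

def chroma_block_size(width, height, sampling):
    """Return (chroma_width, chroma_height) for one chroma channel."""
    if sampling == "444":   # no subsampling
        return width, height
    if sampling == "422":   # halve horizontally only
        return width // 2, height
    if sampling == "420":   # halve horizontally and vertically
        return width // 2, height // 2
    raise ValueError("unknown sampling format")

# A 4x2 CU as in the example above:
print(chroma_block_size(4, 2, "422"))  # (2, 2)
print(chroma_block_size(4, 2, "420"))  # (2, 1)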
The coding method provided by this application is applicable to a video encoding and decoding system, which may also be called a video coding system. FIG. 1 shows the structure of the video encoding and decoding system.
As shown in FIG. 1, the video encoding and decoding system includes a source device 10 and a destination device 11. The source device 10 generates encoded video data and may also be called a video encoding apparatus or video encoding device; the destination device 11 can decode the encoded video data generated by the source device 10 and may also be called a video decoding apparatus or video decoding device. The source device 10 and/or the destination device 11 may include at least one processor and a memory coupled to the at least one processor. The memory may include, but is not limited to, Read-Only Memory (ROM), Random Access Memory (RAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures accessible by a computer; this application does not specifically limit this.
The source device 10 and the destination device 11 may comprise various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or similar electronic devices.
The destination device 11 may receive encoded video data from the source device 10 via a link 12. The link 12 may include one or more media and/or devices capable of moving the encoded video data from the source device 10 to the destination device 11. In one example, the link 12 may include one or more communication media that enable the source device 10 to transmit the encoded video data directly to the destination device 11 in real time. In this example, the source device 10 may modulate the encoded video data according to a communication standard (e.g., a wireless communication protocol) and transmit the modulated video data to the destination device 11. The one or more communication media may include wireless and/or wired communication media, such as the radio frequency (RF) spectrum or one or more physical transmission lines, and may form part of a packet-based network such as a local area network, a wide area network, or a global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other devices enabling communication from the source device 10 to the destination device 11.
In another example, the encoded video data may be output from an output interface 103 to a storage device 13. Similarly, the encoded video data may be accessed from the storage device 13 through an input interface 113. The storage device 13 may include a variety of locally accessible data storage media, such as Blu-ray discs, Digital Video Discs (DVD), Compact Disc Read-Only Memory (CD-ROM), flash memory, or other suitable digital storage media for storing encoded video data.
In another example, the storage device 13 may correspond to a file server or another intermediate storage device that stores the encoded video data generated by the source device 10. In this example, the destination device 11 may obtain the stored video data from the storage device 13 via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting it to the destination device 11; for example, it may include a World Wide Web (Web) server (e.g., for a website), a File Transfer Protocol (FTP) server, a Network Attached Storage (NAS) device, and a local disk drive.
The destination device 11 may access the encoded video data through any standard data connection (e.g., an Internet connection). Example types of data connections include wireless channels suitable for accessing encoded video data stored on a file server, wired connections (e.g., cable modems), or a combination of both. The encoded video data may be transmitted from the file server by streaming, download, or a combination of both.
The coding method of this application is not limited to wireless application scenarios. For example, the coding method of this application may be applied to video encoding and decoding supporting a variety of multimedia applications: over-the-air television broadcasts, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, the video encoding and decoding system may be configured to support one-way or two-way video transmission for applications such as video streaming, video playback, video broadcasting, and/or video telephony.
It should be noted that FIG. 1 is a system architecture diagram of the video encoding and decoding system provided by an embodiment of this application; FIG. 1 is merely an example and is not a limitation of the video encoding and decoding system of this application. The coding method provided by this application is also applicable to scenarios in which there is no data communication between the encoding apparatus and the decoding apparatus. In other examples, the video data to be encoded or the encoded video data may be retrieved from local memory, streamed over a network, and so on. The video encoding apparatus may encode the video data to be encoded and store the encoded video data in memory, and the video decoding apparatus may retrieve the encoded video data from memory and decode it.
In FIG. 1, the source device 10 includes a video source 101, a video encoder 102, and an output interface 103. In some examples, the output interface 103 may include a modulator/demodulator (modem) and/or a transmitter. The video source 101 may include a video capture device (e.g., a camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources.
The video encoder 102 may encode video data from the video source 101. In some examples, the source device 10 transmits the encoded video data directly to the destination device 11 via the output interface 103. In other examples, the encoded video data may also be stored on the storage device 13 for later access by the destination device 11 for decoding and/or playback.
In the example of FIG. 1, the destination device 11 includes a display device 111, a video decoder 112, and an input interface 113. In some examples, the input interface 113 includes a receiver and/or a modem. The input interface 113 may receive the encoded video data via the link 12 and/or from the storage device 13. The display device 111 may be integrated with the destination device 11 or external to it. Generally, the display device 111 displays the decoded video data and may comprise a variety of display devices, such as a liquid crystal display, a plasma display, an organic light-emitting diode display, or other types of display devices.
Optionally, the video encoder 102 and the video decoder 112 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams.
The video encoder 102 and the video decoder 112 may include at least one microprocessor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), discrete logic, hardware, or any combination thereof. If the coding method provided by this application is implemented in software, the instructions for the software may be stored in a suitable non-volatile computer-readable storage medium, and at least one processor may be used to execute the instructions to implement this application.
The video encoder 102 and the video decoder 112 in this application may operate according to a video compression standard (e.g., HEVC) or according to other industry standards; this application does not specifically limit this.
FIG. 2 is a schematic structural diagram of a video encoder 102 provided by an embodiment of this application. The video encoder 102 performs prediction, transform, quantization, and entropy encoding in a prediction module 21, a transform module 22, a quantization module 23, and an entropy encoding module 24, respectively. The video encoder 102 also includes a preprocessing module 20 and a summer 202, where the preprocessing module 20 includes a partitioning module and a rate control module. For video block reconstruction, the video encoder 102 also includes an inverse quantization module 25, an inverse transform module 26, a summer 201, and a reference picture memory 27.
As shown in FIG. 2, the video encoder 102 receives video data, and the preprocessing module 20 is used to obtain the input parameters of the video data. The input parameters include the resolution of the images in the video data, the image sampling format, the pixel depth (bits per pixel, bpp), the bit width, and other information. Here bpp refers to the number of bits occupied by one pixel component of a unit pixel, and bit width refers to the number of bits occupied by a unit pixel. For example, if a pixel is represented by the values of three RGB pixel components and each component occupies 8 bits, the pixel depth of the pixel is 8 and its bit width is 3×8=24 bits.
The partitioning module in the preprocessing module 20 partitions an image into original blocks. This partitioning may also include partitioning into slices, image blocks, or other larger units, as well as video block partitioning according to, for example, the quadtree structure of Largest Coding Units (LCU) and CUs. Exemplarily, the video encoder 102 is a component for encoding video blocks in a video slice to be encoded. Generally, a slice may be divided into multiple original blocks (and possibly into sets of original blocks called image blocks). The sizes of CUs, PUs, and TUs are usually determined in the partitioning module. In addition, the partitioning module determines the size of the rate control unit, which is the basic processing unit of the rate control module. The rate control unit can be used to compute the quantization parameter of the current block; for example, the rate control module computes complexity information for the current block based on the rate control unit and then computes the quantization parameter of the current block from the complexity information. The partitioning strategy of the partitioning module may be preset or continuously adjusted based on the images during encoding. When the partitioning strategy is preset, the same partitioning strategy is correspondingly preset at the decoding end, so that the same image processing units are obtained; an image processing unit is any of the above image blocks and corresponds one-to-one with the encoding side. When the partitioning strategy is continuously adjusted based on the images during encoding, the strategy can be written into the bitstream directly or indirectly; the decoding end correspondingly obtains the relevant parameters from the bitstream, derives the same partitioning strategy, and obtains the same image processing units.
The rate control module in the preprocessing module 20 is used to generate quantization parameters so that the quantization module 23 and the inverse quantization module 25 can perform the related computations. When computing the quantization parameter, the rate control module may use the image information of the current block (such as the above input information) and may also use reconstructed values produced by the reconstruction at the summer 201; this application does not limit this.
The prediction module 21 may provide a prediction block to the summer 202 to generate a residual block, and provide the prediction block to the summer 201, where reconstruction yields a reconstructed block used as reference pixels for subsequent prediction. The video encoder 102 forms pixel difference values by subtracting the pixel values of the prediction block from the pixel values of the original block; these differences constitute the residual block, whose data may include luminance differences and chrominance differences. The summer 202 represents one or more components performing this subtraction. The prediction module 21 may also send related syntax elements to the entropy encoding module 24 for merging into the bitstream.
The transform module 22 may divide the residual block into one or more TUs for transformation, converting the residual block from the pixel domain to a transform domain (e.g., the frequency domain). For example, the residual block is transformed using a Discrete Cosine Transform (DCT) or Discrete Sine Transform (DST) to obtain transform coefficients. The transform module 22 may send the resulting transform coefficients to the quantization module 23.
The quantization module 23 may quantize based on quantization units. A quantization unit may coincide with the above CU, TU, or PU, or be further divided in the partitioning module. The quantization module 23 quantizes the transform coefficients to further reduce coding bits and obtain quantized coefficients. The quantization process may reduce the bit depth associated with some or all of the coefficients, and the degree of quantization can be modified by adjusting the quantization parameter. In some feasible implementations, the quantization module 23 may then scan the matrix containing the quantized transform coefficients; alternatively, the entropy encoding module 24 may perform the scan.
After quantization, the entropy encoding module 24 may entropy-encode the quantized coefficients, for example using Context-Adaptive Variable-Length Coding (CAVLC), Context-based Adaptive Binary Arithmetic Coding (CABAC), Syntax-Based context-adaptive Binary Arithmetic Coding (SBAC), Probability Interval Partitioning Entropy (PIPE) coding, or another entropy coding method or technique. The bitstream obtained after entropy encoding by the entropy encoding module 24 may be transmitted to the video decoder 112 or archived for later transmission or retrieval by the video decoder 112.
The inverse quantization module 25 and the inverse transform module 26 apply inverse quantization and inverse transform, respectively; the summer 201 adds the inverse-transformed residual block to the prediction block to produce a reconstructed block, which serves as reference pixels for predicting subsequent original blocks. The reconstructed block is stored in the reference picture memory 27.
FIG. 3 is a schematic structural diagram of a video decoder 112 provided by an embodiment of this application. As shown in FIG. 3, the video decoder 112 includes an entropy decoding module 30, a prediction module 31, an inverse quantization module 32, an inverse transform module 33, a summer 301, and a reference picture memory 34. The entropy decoding module 30 includes a parsing module and a rate control module. In some feasible implementations, the video decoder 112 may perform a decoding flow that is the inverse of the encoding flow described for the video encoder 102 of FIG. 2.
During decoding, the video decoder 112 receives the encoded video bitstream from the video encoder 102. The parsing module in the entropy decoding module 30 entropy-decodes the bitstream to produce quantized coefficients and syntax elements. The entropy decoding module 30 forwards the syntax elements to the prediction module 31. The video decoder 112 may receive syntax elements at the video-slice level and/or the video-block level.
The rate control module in the entropy decoding module 30 generates quantization parameters from the information of the image to be decoded obtained by the parsing module, so that the inverse quantization module 32 can perform the related computations. The rate control module may also compute quantization parameters from reconstructed blocks produced by the summer 301.
The inverse quantization module 32 inverse-quantizes (e.g., de-quantizes) the quantized coefficients provided in the bitstream and decoded by the entropy decoding module 30, using the generated quantization parameters. The inverse quantization process may include using the quantization parameter computed by the video encoder 102 for each video block in the video slice to determine the degree of quantization, and likewise may include determining the degree of inverse quantization to apply. The inverse transform module 33 applies an inverse transform (e.g., of transform methods such as DCT or DST) to the inverse-quantized transform coefficients, inverse-transforming them to obtain an inverse transform unit, i.e., a residual block. The size of the inverse transform unit may be the same as the TU size; the inverse transform method uses the inverse transform corresponding to the forward transform of the same transform method, e.g., the inverses of DCT and DST are the inverse DCT, inverse DST, or a conceptually similar inverse transform process.
After the prediction module 31 generates a prediction block, the video decoder 112 forms a decoded image block by summing the inverse-transformed residual block from the inverse transform module 33 with the prediction block. The summer 301 represents one or more components performing this summation. When needed, a deblocking filter may also be applied to filter the decoded blocks to remove blocking artifacts. The decoded image blocks in a given frame or image are stored in the reference picture memory 34 as reference pixels for subsequent prediction.
This application provides a possible video encoding/decoding implementation. As shown in FIG. 4, a schematic flowchart of video encoding and decoding provided by this application, the implementation includes processes ① through ⑤, which may be performed by any one or more of the source device 10, the video encoder 102, the destination device 11, or the video decoder 112 described above.
The processes ① through ⑤ are described below taking video encoding as an example.
Process ①: divide a frame of image into one or more parallel coding units that do not overlap with each other. The parallel coding units have no dependency on each other and can be encoded and decoded completely in parallel/independently, such as parallel coding unit 1 and parallel coding unit 2 shown in FIG. 4.
Process ②: each parallel coding unit can be further divided into one or more independent coding units that do not overlap with each other; the independent coding units may be mutually independent but may share some parallel-coding-unit header information.
An independent coding unit may include the three components of luminance Y, first chrominance Cb, and second chrominance Cr, or the three RGB components, or only one of them. If an independent coding unit contains three components, the sizes of the three components may be exactly the same or different, depending on the input format of the image. The independent coding unit can also be understood as one or more processing units formed by the N channels of each parallel coding unit. For example, the Y, Cb, and Cr components are the three channels constituting the parallel coding unit; each of them can be an independent coding unit, or Cb and Cr may be collectively called the chrominance channel, so that the parallel coding unit includes an independent coding unit formed by the luminance channel and an independent coding unit formed by the chrominance channels.
Process ③: each independent coding unit can be further divided into one or more coding units that do not overlap with each other; the coding units within an independent coding unit may depend on each other, e.g., multiple coding units may mutually reference each other for pre-encoding/pre-decoding.
If a coding unit has the same size as the independent coding unit (i.e., the independent coding unit is divided into only one coding unit), its size may be any of the sizes described in process ②.
A coding unit may include the three components of luminance Y, first chrominance Cb, and second chrominance Cr (or the three RGB components), or only one of them. If it contains three components, their sizes may be exactly the same or different, depending on the image input format.
It is worth noting that process ③ is an optional step in the video encoding and decoding method; the video encoder/decoder may encode/decode the residual coefficients (or residual values) of the independent coding units obtained in process ②.
Process ④: a coding unit can be further divided into one or more non-overlapping Prediction Groups (PG), PG for short as Group. Each PG is encoded and decoded according to the selected prediction mode to obtain its prediction values, which make up the prediction values of the entire coding unit; the residual values of the coding unit are obtained from the prediction values and the original values of the coding unit. For example, in FIG. 4, one coding unit of an independent coding unit is divided into PG-1, PG-2, and PG-3.
Process ⑤: based on the residual values of the coding unit, group the coding unit to obtain one or more non-overlapping residual blocks (RB). The residual coefficients of each RB are encoded and decoded according to the selected mode to form a residual coefficient stream. Specifically, there are two classes: transforming the residual coefficients and not transforming them. As shown in FIG. 4, grouping one coding unit yields RB-1 and RB-2.
The selected mode of the residual-coefficient codec method in process ⑤ may include, but is not limited to, any of the following: semi-fixed-length coding, exponential Golomb coding, Golomb-Rice coding, truncated unary coding, run-length coding, direct coding of the original residual values, and so on. For example, if the exponential Golomb coding method is selected to encode the residual coefficients of each RB, the decoding method corresponding to exponential Golomb coding must also be selected when decoding them.
For example, the video encoder may directly encode the coefficients within an RB.
As another example, the video encoder may transform the residual block, e.g., by DCT, DST, or the Hadamard transform, and then encode the transformed coefficients.
As a possible example, when the RB is small, the video encoder may directly apply uniform quantization to the coefficients within the RB, followed by binarized encoding. If the RB is large, it may be further divided into multiple coefficient groups (CG), and each CG is uniformly quantized and then binarize-encoded. In some embodiments of this application, the coefficient group (CG) and the quantization group (QG) may have the same size.
The residual-coefficient coding part is exemplified below with semi-fixed-length coding. First, the maximum absolute value of the residuals within one RB is defined as the modified maximum (mm). Second, the number of coding bits for the residual coefficients in the RB is determined from mm (the number of coding bits is the same for all residual coefficients within an RB). For example, if the code length (CL) of the current RB is 2 and the current residual coefficient is 1, encoding residual coefficient 1 requires 2 bits, expressed as 01. In a special case, if the CL of the current RB is 7, this means an 8-bit residual coefficient and a 1-bit sign bit are encoded. CL is determined by finding the smallest M such that all residuals of the current RB lie within the range [-2^(M-1), 2^(M-1)]; the found M is the CL of the current RB. If both boundary values -2^(M-1) and 2^(M-1) exist in the current RB, M is increased by 1, i.e., M+1 bits are needed to encode all residuals of the current RB; if only one of the two boundary values exists, a trailing bit is encoded to determine whether the boundary value is -2^(M-1) or 2^(M-1); if neither boundary value exists among the residuals of the current RB, the trailing bit need not be encoded.
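The CL rule above can be illustrated with the following hedged sketch (the helper name and return convention are assumptions; the patent text defines only the rule itself): it finds the smallest M covering all residuals and reports whether a trailing bit is needed.

def residual_code_length(residuals):
    # Smallest M with every residual in [-2^(M-1), 2^(M-1)].
    m = 1
    while not all(-(1 << (m - 1)) <= r <= (1 << (m - 1)) for r in residuals):
        m += 1
    lo, hi = -(1 << (m - 1)), (1 << (m - 1))
    has_lo, has_hi = lo in residuals, hi in residuals
    if has_lo and has_hi:   # both boundary values present: widen by one bit
        return m + 1, False
    if has_lo or has_hi:    # exactly one boundary present: signal a trailing bit
        return m, True
    return m, False         # no boundary value: no trailing bit needed

print(residual_code_length([1, 0, -2]))  # (2, True): -2 is a boundary value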
In addition, in some special cases, the video encoder may also directly encode the original values of the image rather than the residual values.
The above video encoder 102 and video decoder 112 can also be realized in another implementation form, for example with a general-purpose digital processor system. FIG. 5 provides a schematic structural diagram of a video encoding and decoding apparatus: the codec apparatus 50 shown in FIG. 5 may be part of the above video encoder 102 or part of the above video decoder 112.
The codec apparatus 50 can be applied on the encoding side or on the decoding side. The codec apparatus 50 includes a processor 501 and a memory 502. The processor 501 is connected to the memory 502 (e.g., via a bus 504). Optionally, the codec apparatus 50 may also include a communication interface 503 connecting the processor 501 and the memory 502, for receiving/sending data.
The memory 502 may be Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), or Compact Disc Read-Only Memory (CD-ROM). The memory 502 is used to store the related program code and video data.
The processor 501 may be one or more Central Processing Units (CPU), such as CPU 0 and CPU 1 shown in FIG. 5. When the processor 501 is one CPU, the CPU may be single-core or multi-core.
The processor 501 is used to read the program code stored in the memory 502 and perform the operations of any implementation corresponding to FIG. 6 and its various feasible embodiments.
The coding method provided by this application can be used in the video encoder 102 or in the video decoder 112. For example: in one case, the video encoder 102 may encode without using the coding method of this application and without transmitting quantization-parameter information to the video decoder 112; in this case, the video decoder 112 may decode using the coding method provided by this application. In another case, the video encoder 102 may encode using the coding method of this application and transmit quantization-parameter information to the video decoder 112; in this case, the video decoder 112 may obtain the quantization-parameter information from the bitstream for decoding.
The coding method provided by this application is described in detail below with reference to the video encoding and decoding system shown in FIG. 1, the video encoder 102 shown in FIG. 2, and the video decoder 112 shown in FIG. 3.
As shown in FIG. 6, a flowchart of a video coding method provided by this application. The method includes:
S601: The video coder obtains complexity information of the current block in the image to be processed. The complexity information of the current block characterizes the degree of variation of the pixel values of the current block, and is obtained by computing at least one angular gradient of the current block based on at least the pixel values of the current block.
It can be understood that the information of an image block is usually represented by the pixels it contains. When the pixel values within an image block differ little, i.e., the complexity is low, the color of the block changes little and the block is considered relatively simple. Similarly, when the pixel values differ greatly, i.e., the complexity is high, the color of the block changes a lot and the block is considered relatively complex.
Specifically, the complexity information of the current block (block_complexity) is obtained by computing at least one angular gradient of the current block based on at least its pixel values. The angular gradient of the current block refers to the variation of its pixel values computed along the gradient direction of some angle. Angular gradients include the horizontal gradient, the vertical gradient, and gradients at other angles.
The horizontal gradient of the current block is the set of differences between the pixel values of column t and the pixel values of column t-1 of the current block, computed along the leftward or rightward horizontal gradient direction, where t is an integer greater than 1. The formula is as follows:
horizontal gradient H = pixel values of column t - pixel values of column t-1;
FIG. 7 provides a schematic diagram of computing the angular gradient. As shown in (a) of FIG. 7, taking a 4×2 image block as an example, computing the above formula along the illustrated direction yields 3×2 differences; the horizontal gradient of the block is the set of these 3×2 differences.
Based on the above horizontal gradient, the horizontal complexity is computed by the following formula, denoted complexity_hor below:
complexity_hor = sum of the elements of the horizontal gradient (gradH) / number of elements (grad_block_size);
Following the above example, with 3×2 differences, complexity_hor of the current block = sum of the 6 differences / 6. The elements of the horizontal gradient may be the individual horizontal gradients of the image block computed along the horizontal direction.
Similarly, the vertical gradient of the current block is the set of differences between the pixel values of row s and the pixel values of row s-1, computed along the upward or downward vertical gradient direction, where s is an integer greater than 1. The formula is as follows:
vertical gradient V = pixel values of row s - pixel values of row s-1;
As shown in (b) of FIG. 7, taking a 4×2 image block as an example, computing the above formula along the illustrated direction yields 4×1 differences; the vertical gradient of the block is the set of these 4×1 differences.
Based on the above vertical gradient, the vertical complexity is computed by the following formula, denoted complexity_ver below:
complexity_ver = sum of the elements of the vertical gradient (gradV) / number of elements (grad_block_size);
Following the above example, complexity_ver = sum of the 4 differences / 4. The elements of the vertical gradient may be the individual vertical gradients of the image block computed along the vertical direction.
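As an illustration of the two formulas above, the following minimal sketch computes complexity_hor and complexity_ver for a 4×2 block. It assumes absolute differences, which the text leaves implicit, and the function names are illustrative assumptions:

import numpy as np

def complexity_hor(block):
    grads = np.abs(block[:, 1:] - block[:, :-1])   # column t minus column t-1
    return grads.sum() / grads.size

def complexity_ver(block):
    grads = np.abs(block[1:, :] - block[:-1, :])   # row s minus row s-1
    return grads.sum() / grads.size

blk = np.array([[10, 12, 11, 13],
                [10, 12, 11, 13]], dtype=np.int32)  # a 4x2 block (4 wide, 2 high)
print(complexity_hor(blk))  # averaged over 3x2 = 6 differences, as in the text
print(complexity_ver(blk))  # averaged over 4x1 = 4 differences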
Similarly, the other angular gradients of the current block may include the 45° gradient, the 135° gradient, the 225° gradient, and the 315° gradient. Referring to the schematic diagram of another way of computing angular gradients provided in FIG. 8, the directions of these other angular gradients are shown in (a) through (d) of FIG. 8.
Computing the complexity information of the current block from its pixel values in this way facilitates determining more accurate coding parameters for the block, such as the quantization parameter, thereby improving the image quality of video coding and the coding efficiency of the image.
The implementations for determining the complexity information of the current block are described below through the first through fourth possible implementations.
In the above scheme, the number of elements contained in the gradient of the current block is smaller than the number of pixels of the current block. For a more precise computation of the current block's complexity, the video coder may also refer to reconstructed values of the block's coded pixel values when determining the complexity information.
In a first possible implementation, the video coder computes at least one angular gradient of the current block based on its pixel values and reconstructed values of its already-coded pixel values, and obtains the complexity information of the current block from the at least one angular gradient. Refer to the schematic diagram of yet another way of computing angular gradients provided in FIG. 9, shown in a) through f): the blank portions represent original pixels, i.e., the pixel values of the current block; the shaded portions represent reconstructed pixels, i.e., reconstructed values of the block's already-coded pixel values. Part a) of FIG. 9 illustrates computing the gradient of the current block row by row, based on the pixels in the current block and the reconstructed edge pixel values in each pixel's gradient direction. In this way, for each original pixel, the corresponding reconstructed edge pixel value or another original pixel value in the current block can be found along the gradient direction and used to compute the elements of the current block in that gradient direction. For example, label the shaded pixels of the first row from left to right as pixel 1-1, pixel 1-2, ..., pixel 1-16; label the pixels of the second row from left to right as pixel 2-1, pixel 2-2, ..., pixel 2-17; and label the pixels of the third row from left to right as pixel 3-1, pixel 3-2, ..., pixel 3-16. Then the computed gradients of the current block may include: the gradient obtained by subtracting pixel 1-1 from pixel 2-1, the gradient obtained by subtracting pixel 1-2 from pixel 2-2, ..., the gradient obtained by subtracting pixel 1-16 from pixel 2-16, the gradient obtained by subtracting pixel 2-2 from pixel 3-1, the gradient obtained by subtracting pixel 2-3 from pixel 3-2, ..., the gradient obtained by subtracting pixel 2-17 from pixel 3-16. Thus, for 2×16 original pixels, 32 gradient values are obtained. Likewise, the angular gradients for the gradient directions shown in b) through f) of FIG. 9 can be computed by reference to the method for a) of FIG. 9, the difference being the positions of the pixels used and the angular directions, which is not detailed here.
For implementations of obtaining the complexity information of the current block from its at least one angular gradient, refer to the embodiments below; they are not detailed here.
In this way, one element can be computed per original pixel in a given gradient direction, so the number of elements in each gradient of the current block equals the number of pixels of the current block. Each element characterizes the variation of its corresponding pixel of the current block along the gradient direction, and the elements correspond one-to-one with the pixels of the current block, so the elements uniformly characterize the variation of the block's pixels along the gradient direction; the elements obtained above therefore help obtain more precise complexity information for the current block.
In a second possible implementation, the video coder computes the complexity information of the current block based on its pixel values and the pixel values adjacent to the current block in the image to be processed. Unlike the first possible implementation, which uses reconstructed values of the current block's own already-coded pixel values, this implementation uses pixel values adjacent to the current block in the image to be processed, i.e., values related to pixels in blocks neighboring the current block; these values may be reconstructed values or original values.
It can be understood that this is similar to the above method of computing complexity information from reconstructed values of already-coded pixel values. Still referring to FIG. 9: the blank portion can be regarded as the pixels of a 16×2 current block; the difference is that the shaded portion can be regarded as pixels adjacent to the current block, representing pixel values adjacent to the pixel values of the current block.
For example, when the horizontal gradient is computed as in e) of FIG. 9, the current block contains 16×2 pixels. By subtracting the pixel values of the first column of the current block from the reconstructed values of the column preceding the current block, subtracting the pixel values of the second column from the reconstructed values of the first column, ..., and subtracting the pixel values of the sixteenth column from the reconstructed values of the fifteenth column, the computed horizontal gradient includes 16×2 differences, so complexity_hor of the current block = sum of the 32 differences / 32. The specific computation is not detailed here.
It should be noted that the above options are only examples. In fact, multiple reconstructed and adjacent pixels may be selected; e.g., the reconstructed pixels may be the average of the reconstructed values of the first n rows, first n columns, or first n reconstructed values of the current block, and the adjacent pixels may be the average of the first n rows, first n columns, or first n pixel values of the current block; this application does not limit this.
Optionally, the complexity information of the current block is obtained from at least one angular gradient of the current block. Specifically, the video coder takes the minimum among the complexity information values obtained from the at least one angular gradient as the complexity information of the current block. That is, if the complexity information of the current block computed based on a certain angular gradient is the smallest, this smallest complexity information is used as the complexity information of the current block.
In a third possible implementation, multi-channel sampling is applied to the image to be processed during video coding, so the current block consists of multiple channel image blocks. When the current block is an N-channel image block, where N is an integer greater than zero, the video coder may determine the complexity information of each channel image block based on the pixel values of each of the N channel image blocks, and determine the complexity information of the current block from the complexity information of each channel image block, e.g., taking the minimum among the complexity information of the N channel image blocks as the complexity information of the current block. In this way, the complexity information computed for the image block of an individual channel is more accurate, which helps improve the accuracy of the current block's complexity information and the accuracy of video coding.
In one possible implementation, at least one angular gradient can be computed for each channel image block, one complexity information value being obtained per angular gradient; the complexity information of the channel image block is then determined from the obtained values, e.g., by taking the minimum among them as the channel image block's complexity information.
In another possible implementation, the video coder divides each of the channel image blocks constituting the current block and determines the complexity information of each channel image block from its divided sub-blocks. Specifically, the video coder divides each channel image block into at least two sub-blocks, obtains the complexity information of the at least two sub-blocks of each channel image block, and determines the complexity information of the corresponding channel image block from the complexity information of its at least two sub-blocks.
Exemplarily, as shown in the schematic diagram of dividing an image block and computing angular gradients provided in FIG. 10, a 4×2 channel image block of the current block is divided along the vertical direction to obtain 8×2 sub-image blocks, and the angular gradient is then computed to obtain the complexity information of that channel image block.
For implementations of determining the complexity information of the at least two sub-blocks of each channel image block, refer to the implementation provided with FIG. 7 above; they are not detailed here.
Optionally, the video coder determines the minimum among the complexity information of the at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
It should be noted that the sub-block division rules of the channel image blocks may be the same or different. The sampling format of the image may cause the channel image blocks to differ in size; e.g., when sampling in YUV format with the 420 sampling format, the sampled block sizes of luminance, first chrominance, and second chrominance differ. Therefore, a relatively large channel image block can be divided before determining its complexity information. The complexity information computed for smaller sub-blocks within a channel image block is more accurate, so the channel image block's complexity information obtained from the sub-blocks' complexity information is more accurate, which helps improve the accuracy of the current block's complexity information and the accuracy of video coding.
The multiple channels in this application are not limited to the aforementioned three RGB channels; there can be more channels. For example, when the image sensor is a four-channel sensor, the corresponding image to be processed includes four channels of image information; when the image sensor is a five-channel sensor, the corresponding image to be processed includes five channels of image information.
The multiple channels in this application may include at least one or several of the following: the Y channel, U channel, V channel, Co channel, Cg channel, R channel, G channel, B channel, alpha channel, IR channel, D channel, and W channel. For example, the multiple channels include the Y, U, and V channels; or the R, G, and B channels; or the R, G, B, and alpha channels; or the R, G, B, and IR channels; or the R, G, B, and W channels; or the R, G, B, IR, and W channels; or the R, G, B, and D channels; or the R, G, B, D, and W channels. Besides the RGB color-sensitive channels, there may also be an IR channel (infrared or near-infrared sensitive channel), a D channel (dark-light channel, mainly passing infrared or near-infrared light), and a W channel (panchromatic sensitive channel). Different sensors have different channels; for example, the sensor type may be an RGB sensor, RGBIR sensor, RGBW sensor, RGBIRW sensor, RGBD sensor, RGBDW sensor, and so on.
In a fourth possible implementation, the video coder obtains the prediction angle used by the angular prediction mode of the current block, computes the angular gradient based on the prediction angle to obtain the corresponding complexity information, and uses that complexity information as the complexity information of the current block.
The angular prediction mode is a common prediction mode used to determine the residual between the pixel values and the reconstructed values of the current block according to a specified angle. The prediction angle may also be called the prediction direction; it is similar to the angles involved in the other angular gradients above, e.g., 45°, 135°, and so on.
For implementations of computing the angular gradient based on the prediction angle and obtaining the corresponding complexity information, refer to the implementations described above; they are not repeated here.
In this way, the prediction mode of the current block is associated with the computation of its complexity information, making the computation of the current block's complexity information more targeted, improving the accuracy of the complexity information and the accuracy of video coding.
Optionally, the prediction modes also include the mean (DC) prediction mode, the planar (Planar) prediction mode, and other prediction modes. The video coder may configure multiple complexity information values for the current block, used to determine the complexity information of the current block according to the prediction mode. In one case, the multiple complexity information values can be computed from the multiple angular gradients above; in another case, multiple complexity information values can be preset in the video coder. The preset complexity information values correspond to prediction modes; for example, the multiple values include first complexity information and second complexity information, where the first corresponds to the angular prediction modes (which may include one or more angular prediction modes) and the second corresponds to the DC and Planar prediction modes. When predicting the current block, multiple prediction results may be obtained based on multiple prediction modes, and the optimal prediction mode is determined as the current block's prediction mode based on the rate-distortion optimized (RDO) cost. The complexity information of the current block can then be obtained from the correspondence between the RDO-determined prediction mode and the preset complexity information. Following the above example, if the current block's prediction mode is determined by RDO cost to be the DC prediction mode, then according to the above correspondence the block's complexity information should be the second complexity information.
Optionally, the above prediction modes are divided into multiple categories, each of which may correspond to one complexity information value. The categories include the intra mode class, the point prediction mode class, the Screen Content Coding (SCC) mode class, the original value mode class, and the fallback mode class. For example, the angular prediction modes belong to the intra mode class. In addition, the categories of the prediction modes can be further divided according to whether a transform is applied; e.g., the intra mode class may be divided into intra mode + transform and intra mode + no transform. The correspondence between complexity information and categories is similar to the above first complexity information corresponding to one or more prediction modes. When the category of the current block's prediction mode is determined, the corresponding complexity information is used to compute the quantization parameter.
Optionally, as can be seen from the third possible implementation above, the video coder may determine the complexity information of the current block from at least one angular gradient of its N channel image blocks through the following steps S11-S12.
S11: The video coder obtains the complexity information of each channel image block based on the pixel values of each of the N channel image blocks; N is an integer greater than zero.
S12: The video coder determines the complexity information of the current block based on the complexity information of each channel image block.
As can be seen from the foregoing description, at least one angular gradient can be obtained for the current block, so that the block's complexity information is obtained from the resulting gradients; at least one angular gradient can be obtained for a channel image block, so that the channel image block's complexity information is obtained from the resulting gradients; and at least one angular gradient can also be obtained for a sub-block of a channel image block, so that the sub-block's complexity information is obtained from the resulting gradients. Besides the implementations already described, there are other ways of obtaining complexity information; taking the complexity information of the current block as an example, they are described through implementations 1-4 below.
Implementation 1: take the minimum among the complexity information values computed from the angular gradients as the complexity information of the current block. The expression is as follows:
block_complexity = min(complexity_ver, complexity_hor, complexity_45, complexity_135, complexity_225, complexity_315)
where complexity_45 is the complexity information computed from the 45° angular gradient, complexity_225 from the 225° angular gradient, complexity_135 from the 135° angular gradient, and complexity_315 from the 315° angular gradient.
Implementation 2: obtain the complexity information of the current block from a weighted sum of the complexity information values computed from the angular gradients. The expression is as follows:
block_complexity = complexity_ver×wv + complexity_hor×wh + complexity_225×w225 + complexity_315×w315
where wv is the weight for the vertical gradient, wh the weight for the horizontal gradient, w225 the weight for the 225° gradient, and w315 the weight for the 315° gradient, with 0 ≤ wv, wh, w225, w315 ≤ 1 and wv + wh + w225 + w315 = 1.
Implementation 3: compute the complexity level (complexity_level) of the original image block of each channel, and determine the block_complexity of the current block from the complexity_level of each channel image block, comprising the following steps S21-S23.
Step S21: The video coder determines the complexity_level of each channel image block from its complexity information (complexity).
Case 1: considering the subjective human-vision model, set A-1 absolute thresholds hierarchically, dividing image blocks from simple to complex into A levels.
For example, for an 8-bit image block with YUV444 sampling format, the complexity_level of each channel image block is divided as follows: complexity ≤ 4 belongs to level 1, 4 < complexity < 16 belongs to level 2, and complexity ≥ 16 belongs to level 3. As another example, for a 10-bit image block with YUV444 sampling format: complexity ≤ 8 belongs to level 1, 8 < complexity < 64 belongs to level 2, and complexity ≥ 64 belongs to level 3. As yet another example, for a 12-bit image block with YUV444 sampling format: complexity ≤ 16 belongs to level 1, 16 < complexity < 256 belongs to level 2, and complexity ≥ 256 belongs to level 3.
Case 2: considering the subjective human-vision model, set B-1 absolute thresholds hierarchically, then set C dynamic thresholds that update with the image blocks, dividing image blocks into B+C levels in total.
As shown in FIG. 11, a schematic diagram of the relationship between dynamic thresholds and absolute thresholds provided by an embodiment of this application: with absolute threshold 1 equal to 4 and absolute threshold 2 equal to 16, we have 0 < dynamic threshold (thread) 1 < 4 and 4 < thread2 < 16. Suppose complexity ≤ thread1 belongs to level 1, thread1 < complexity ≤ 4 to level 2, 4 < complexity ≤ thread2 to level 3, thread2 < complexity ≤ 16 to level 4, and complexity > 16 to level 5. The dynamic thresholds update with the image blocks: if the current block's complexity information is smaller than the weighted complexity information of the preceding several image blocks, the dynamic threshold decreases; if equal, it stays unchanged; if larger, it increases.
Step S22: The video coder determines the complexity level of the current block (block_complexity_level) based on the complexity_level of each channel image block.
Among the complexity levels of the channel image blocks, channels to which the human eye is sensitive can be given larger weights and the others smaller weights; the weighted complexity levels of the channel image blocks yield the complexity level of the current block.
Taking an image block with 3 channels as an example, the complexity level of the current block can be determined by the following expression:
block_complexity_level = complexity_level1×w1 + complexity_level2×w2 + complexity_level3×w3
where w1, w2, or w3 may be 0; w1, w2, and w3 are the weights of the complexity levels of the three channel image blocks, and complexity_level1, complexity_level2, and complexity_level3 are the complexity levels of the three channel image blocks. For example, for an image block with YUV444 sampling format, the Y-channel coefficient can be given a larger weight, e.g., w1 = 2 and w2 = w3 = 1.
Step S23: The video coder uses block_complexity_level to represent the complexity information of the current block.
Through steps S21-S23, the weights of the complexity levels of the image blocks of different channels can be adjusted flexibly, to flexibly adjust the complexity information of the current block.
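The following minimal sketch illustrates steps S21-S22 for an 8-bit YUV444 block, using the absolute thresholds 4 and 16 from Case 1 and the example weights w1 = 2, w2 = w3 = 1; the function names are illustrative assumptions:

def complexity_level_8bit(complexity):
    # Case 1 thresholds for an 8-bit YUV444 block: 4 and 16.
    if complexity <= 4:
        return 1
    return 2 if complexity < 16 else 3

def block_complexity_level(c_y, c_u, c_v, w=(2, 1, 1)):
    # Step S22: weighted sum of the per-channel levels.
    levels = [complexity_level_8bit(c) for c in (c_y, c_u, c_v)]
    return sum(l * wi for l, wi in zip(levels, w))

print(block_complexity_level(3, 20, 7))  # 1*2 + 3*1 + 2*1 = 7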
For implementation 3, the video coder may also skip step S21 and determine the complexity information of the current block (block_complexity) directly from the complexity of each channel image block, i.e., implementation 4 below.
Implementation 4: the video coder directly weights the complexity of each channel image block to obtain the complexity information of the current block. The expression is as follows:
block_complexity = complexity1×w4 + complexity2×w5 + complexity3×w6
where w4, w5, or w6 may be 0; w4, w5, and w6 are the weights of the complexity information of the three channel image blocks, and complexity1, complexity2, and complexity3 are the complexity information of the three channel images. The weights represented by w4, w5, and w6 may relate to the subjective human-vision model; for example, channels to which the human eye is sensitive can be given larger weights.
In addition, based on the obtained block_complexity, step S22 above can be applied again to grade block_complexity into block_complexity_level.
It should be noted that if the N channel image blocks are divided into at least two sub-blocks, the video coder can still use the above methods to determine the complexity information of the N channel image blocks and then determine the complexity information of the current block.
Specifically, taking a channel image block divided into two sub-blocks as an example, with the complexity information of the two sub-blocks being sub_complexity1 and sub_complexity2 respectively, determining the channel image block's complexity information (complexity1) from the two sub-blocks' complexity information includes implementations 5-7 below.
Implementation 5: take the minimum among the sub-blocks' complexity information as the channel image block's complexity information.
complexity = min(sub_complexity1, sub_complexity2).
Implementation 6: weight the sub-blocks' complexity information to obtain the channel image block's complexity information.
complexity = sub_complexity1×w7 + sub_complexity2×w8, where w7 and w8 are the weights of the two sub-blocks' complexity information, with 0 ≤ w7 ≤ 1 and 0 ≤ w8 ≤ 1.
Implementation 7: determine the channel image block's complexity information based on the sub-blocks' complexity levels.
In (1) and (2) below, this implementation is described taking the complexity levels of the two sub-blocks as sub_complexity_level1 and sub_complexity_level2, the channel image block's complexity information as complexity1, and the channel image block's complexity level as complexity_level1. The complexity level of each sub-block can be determined as follows: determine each sub-block's complexity level from its complexity information. Specifically, refer to the description of step S21 above, the difference being the names of the sub-blocks versus the channel image blocks; this is not detailed here.
(1) The channel image block's complexity level can be determined from the sub-blocks' complexity levels.
Optionally, the sub-blocks' complexity levels (e.g., sub_complexity_level1 and sub_complexity_level2) can be weighted to obtain the channel image block's complexity level (e.g., complexity_level1). The expression is as follows:
complexity_level1 = sub_complexity_level1×w9 + sub_complexity_level2×w10, where w9 and w10 are the weights of the two sub-blocks' complexity levels, with 0 ≤ w9 ≤ 1 and 0 ≤ w10 ≤ 1.
Optionally, the minimum among the sub-blocks' complexity levels (e.g., sub_complexity_level1 and sub_complexity_level2) is taken as the channel image block's complexity level (e.g., complexity_level1). The expression is as follows:
complexity_level1 = min(sub_complexity_level1, sub_complexity_level2).
(2) The channel image block's complexity information is represented by the determined block complexity level of that channel image block.
Specifically, this step is similar to S23 and is not detailed here.
Another implementation of step S601 is described below.
In one case, the video encoder 102 may obtain the complexity information and transmit it to the video decoder 112. In this case, when the coder is the video decoder 112, an optional implementation of step S601 includes: parsing the bitstream and obtaining the complexity information of the current block in the image to be processed from the bitstream.
S602: The video coder determines the quantization parameter of the current block according to the complexity information of the current block.
Optionally, the video coder determining the quantization parameter of the current block from the complexity information includes the following steps S31-S32.
S31: The video coder determines the reference quantization parameter (ref_qp) of the current block according to the complexity information of the current block.
S32: Determine the quantization parameter of the current block according to the reference quantization parameter of the current block.
The reference quantization parameter is used to guide the generation of the quantization parameter.
Optionally, the video coder obtains the buffer state of the image to be processed and the correspondence between the buffer state and the complexity information of the current block, and determines the reference quantization parameter of the current block according to this correspondence.
In video encoding, different image blocks encode at different rates, causing the output bitstream to fluctuate and affecting transmission stability. For this, the video encoder also includes a buffering module whose buffer is used to output the bitstream at a constant rate. It can be understood that constant-rate output of the bitstream means constant-rate output of the bits it occupies, i.e., the buffer takes the non-uniformly inflowing encoded bitstream and makes it flow out at a constant rate, achieving stable output. In addition, the buffer must not overflow, where overflow includes upper and lower overflow: exceeding the maximum buffer state (max_buffer) is upper overflow, and falling below the minimum buffer state (0) is lower overflow (underflow).
The buffer state characterizes the number of bits occupied in the buffer by image blocks of the image to be processed that have been encoded. It can be understood that the buffer state updates with the image blocks. For example, if an image block after encoding flows into the buffer at 100 bits per second while the preset buffer drains at a constant 50 bits per second, the number of bits of that image block in the buffer at that second is 100-50 = 50 bits.
The buffer state may also be called the physical buffer state (physical_buffer).
Specifically, the video coder can directly read the state information of the corresponding buffer to obtain the buffer state of the image to be processed. The video coder obtains the correspondence between the buffer state of the image to be processed and the complexity information of the current block through steps S41-S42 below.
S41: The video coder determines the fullness from the buffer state.
Fullness is a piecewise-linear mapping of the buffer state, representing how full the buffer is. Refer to the following formula:
fullness = physical_buffer×a + b;
where a and b are the parameters of the piecewise-linear mapping of fullness from physical_buffer: a is the scaling applied to physical_buffer and b the offset applied to physical_buffer. These parameters can be adjusted according to the buffer state, the image information, and the complexity information, as described case by case below.
Case 1: determine parameters a and b from the image information.
For example, for the first several blocks of a slice, a = 0.8 and b = (-bpp)/(2×block_size), where block_size is the size of the block.
It can be understood that for the first several blocks of a slice the buffer state is small, so the above a and b can make the fullness smaller still, obtaining a smaller ref_qp and improving coding precision.
Alternatively, for blocks at slice boundaries (such as the first row and first column), a = 0 and b = -bpp × block_size.
It can be understood that boundary blocks have no reference pixels and the quality of their prediction results is poor; determining a smaller ref_qp achieves a reduced fullness.
Case 2: determine parameters a and b from the image information and the buffer state.
Adjust parameters a and b according to the relationship between bpp and the buffer state.
For example, when bpp is 8 and (physical_buffer)/(max_buffer) > 0.85, a = 1 and b = 0. As another example, when bpp is 6 and (physical_buffer)/(max_buffer) > 0.85, a = 1.1 and b = block_size.
Thus, at low bpp when the buffer state is relatively full, the above a and b can make the fullness larger, obtaining a larger ref_qp and preventing the buffer state from exceeding the buffer's maximum number of bits.
Case 3: determine parameters a and b from the image information and the complexity information.
For example, for image blocks with high complexity information, a = 1 and b = bpp×block_size; for blocks with average complexity information, a = 1 and b = 0; for blocks with low complexity information, a = 1 and b = -bpp×block_size.
Optionally, the high/low classification of complexity information may correspond to the complexity levels above.
Case 4: determine parameters a and b from the buffer state.
For example, when physical_buffer < (max_buffer)/2, a = 0.9 and b = 0. When (max_buffer)/2 ≤ physical_buffer < (3×max_buffer)/4, a = 1.0 and b = 0. When physical_buffer ≥ (3×max_buffer)/4, a = 1.2 and b = 0.
Thus, when the buffer state is relatively empty, the above a and b reduce the fullness and thereby reduce ref_qp; similarly, when the buffer state is relatively full, the above a and b increase the fullness and thereby increase ref_qp.
It should be noted that the above four cases serve only as examples of the parameters relevant to determining a and b; the parameter values may be other values, without limitation.
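As an illustration, the following sketch implements the fullness mapping using the Case 4 parameters above; the function name is an assumption and real parameter tables are encoder-specific:

def fullness(physical_buffer, max_buffer):
    # Case 4: (a, b) chosen from the buffer occupancy.
    if physical_buffer < max_buffer / 2:
        a, b = 0.9, 0
    elif physical_buffer < 3 * max_buffer / 4:
        a, b = 1.0, 0
    else:
        a, b = 1.2, 0
    return physical_buffer * a + b

print(fullness(1000, 4000))   # emptier buffer -> scaled down (900.0)
print(fullness(3500, 4000))   # fuller buffer  -> scaled up (4200.0)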
S42: The video coder computes ref_qp from the fullness.
Implementations of determining ref_qp are described below.
In one possible implementation, ref_qp can be computed with reference to the following formula:
ref_qp = fullness×c + d;
where c and d are parameters adjustable according to the buffer state, the image information, and the complexity information. Optionally, parameter c can be determined based on max_qp, where max_qp is the maximum quantization parameter.
Here c and d are the parameters of the piecewise-linear mapping of ref_qp from fullness: c is the scaling applied to fullness and d the offset applied to fullness. These parameters can be adjusted according to the image information, the complexity information, and the fullness, as described case by case below.
Case 1: determine parameters c and d from the image information.
For example, fixed parameters may be used based on the image information, c = 1 and d = 1; different bit widths may use different c and d.
Alternatively, blocks at slice boundaries (such as the first row and first column) receive special treatment so that ref_qp becomes smaller than it otherwise would be, i.e., smaller c and d are used.
Case 2: determine parameters c and d from the fullness and the image information.
For example, determine parameters c and d from the bit width in the image information and the target pixel depth (target_bpp).
For example, for target_bpp of 8 bits and fullness < 0.1: for an 8-bit image, c = 0 and d = 2; for a 10-bit image, c = 0 and d = 4; for a 12-bit image, c = 0 and d = 6. For target_bpp of 8 bits and fullness > 0.8: for an 8-bit image, c = 1 and d = 2; for a 10-bit image, c = 1 and d = 4; for a 12-bit image, c = 1 and d = 6. Here c and d may update with the image blocks.
Here target_bpp is a parameter specified by the encoding end, representing the average number of bits needed per pixel after compression. For example, for a 10-bit original image with YUV444 sampling format, the bpp of the original image is 30 bits; a target_bpp of 5 bits represents 6× compression.
In another possible implementation, ref_qp can also be determined from the bit width and pixel depth in the image information. Specifically, a range for ref_qp can be determined based on the image information, where the corresponding range of ref_qp differs as the image information changes. The range is the interval formed by the minimum reference quantization parameter (min_ref_qp) and the maximum reference quantization parameter (max_ref_qp). For example, for an image with bpp of 8 bits and YUV444 sampling format: when fullness < 0.25, min_ref_qp = 0, max_ref_qp = 8, and ref_qp is selected in (0, 8); when fullness > 0.85, min_ref_qp = 4, max_ref_qp = 56, and ref_qp is selected in (4, 56); otherwise min_ref_qp = 0, max_ref_qp = 32, and ref_qp is selected in (0, 32).
Case 3: determine parameters c and d from the buffer state.
Specifically, determine parameters c and d from the relationship between physical_buffer and max_buffer.
When physical_buffer < (max_buffer)/4, c = 0.5 and d = 0; when (max_buffer)/4 ≤ physical_buffer < (max_buffer)/2, c = 1.0 and d = 0; when physical_buffer ≥ (max_buffer)/2, c = 1.3 and d = 0.
Case 4: determine parameters c and d from the complexity information and the fullness.
Blocks with higher complexity information map differently at different fullness values. For example, at fullness = 0.5, ref_qp maps to 32 for a complex block and to 16 for a simple block.
Alternatively, for example, for simple blocks: when fullness < 0.2, c > 0 and d = 0, so that ref_qp increases as fullness increases; when 0.2 ≤ fullness ≤ 0.8, c = 0 and d > 0, with the value of d updating with the image blocks, so that ref_qp stays constant over a span of fullness as fullness increases; when fullness > 0.8, c > 0, so that ref_qp increases as fullness increases. Ordinary and complex blocks are treated similarly. A simple block can be understood as one whose complexity information is less than or equal to a first preset value, a complex block as one whose complexity information is greater than or equal to a second preset value, and an ordinary block as one whose complexity information lies between the first and second preset values.
Case 5: determine parameter c from the complexity information.
From the formula for computing ref_qp above, when parameter c is set larger, the change of ref_qp produced by changes in fullness increases, and the change of the qp determined from ref_qp with fullness also increases; parameter c can thus be regarded as the "slope" of the qp-fullness mapping. Adjusting the size of c therefore adjusts the sensitivity of qp to fullness. When blocks of lower complexity are encoded, the computational resources consumed by encoding, such as buffer bits, are lower, so the sensitivity of qp to fullness can be reduced appropriately, i.e., c reduced appropriately, so that qp is not adjusted too low by changes in fullness, guaranteeing that the encoding quality of low-complexity blocks is not too low. Conversely, when blocks of higher complexity are encoded, the buffer bits and other computational resources consumed by encoding are higher, so the sensitivity of qp to fullness can be raised appropriately, i.e., c raised appropriately, improving the rate-control capability. In summary, a higher c can be determined for blocks of higher complexity and a lower c for blocks of lower complexity.
For example, for blocks of higher complexity, c = 1.1; for blocks of lower complexity, c = 0.9.
Case 6: determine parameters e and f from the complexity information, the maximum complexity (max_complexity), and the maximum quantization parameter (max_qp), where e and f are parameters related to the fullness, the buffer state, and the image information.
ref_qp = complexity/max_complexity×max_qp×e + f.
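The following sketch combines two of the rules above: the linear map ref_qp = fullness×c + d with a complexity-dependent slope c (Case 5), clamped to the [min_ref_qp, max_ref_qp] window (Case 2). The threshold separating "complex" from "simple" and the default window are illustrative assumptions, not normative values:

def ref_qp(fullness, complexity, min_ref_qp=0, max_ref_qp=56, d=0.0):
    c = 1.1 if complexity >= 16 else 0.9   # steeper slope for complex blocks
    qp = fullness * c + d
    return max(min_ref_qp, min(max_ref_qp, qp))

print(ref_qp(20.0, complexity=32))  # 22.0: complex block, slope 1.1
print(ref_qp(20.0, complexity=4))   # 18.0: simple block, slope 0.9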
For determining the reference quantization parameter from the complexity information and the fullness or buffer state as above, refer also to FIG. 12a, a plot of the reference quantization parameter as a function of complexity and buffer state. As the buffer state grows, the reference quantization parameter grows, and the complexity information affects the growth of the reference quantization parameter with the buffer state in different ways. For example, for an image block of higher complexity, the reference quantization parameter grows as the buffer state grows; for an image block of lower complexity, the increase with the buffer state can be smaller than that of the reference quantization parameter of the higher-complexity block. In any case, the influence of complexity and buffer state on the reference quantization parameter is always positively correlated.
Specifically, as shown in FIG. 12b, a plot of the reference quantization parameter as a function of buffer state: suppose that in the intervals (0, max_buffer×0.15) and (max_buffer×0.85, maximum buffer bits], the buffer state dominates the influence on the reference quantization parameter; that is, when the buffer state is quite empty or quite full, the influence of complexity is small. In the interval [max_buffer×0.15, max_buffer×0.85], complexity matters more; further considering the influence of the complexity information on the reference quantization parameter may produce the five possibilities shown in the figure. When the influence of complexity is small, the reference quantization parameter can remain unchanged in this interval, corresponding to L3; when the influence is moderate, it can grow slowly, corresponding to L2 or L4; when the influence is large, it may jump, corresponding to L1 or L5. The starting point of the complexity-based change is determined by the complexity information itself.
For the influence of the complexity information considered in each interval of FIG. 12b, refer to FIG. 12c, a plot of the reference quantization parameter as a function of complexity.
Thus, based on the physical buffer state of the image to be processed, the input and output of the buffer's bitstream can be controlled dynamically, making the output of the bitstream more stable.
Optionally, the video coder determines the complexity level of the current block, determines the target bits (target_cost) from the complexity level of the current block, and obtains the reference quantization parameter of the current block from the target bits. The target bits are the predicted number of bits of the current block after coding; the actual number of bits of the current block after coding may be greater than, less than, or equal to the target bits. For determining the complexity level of the current block, refer to the embodiments on complexity levels above, not repeated here. The target bits refer to the number of bits the current block occupies in the bitstream.
Specifically, the video coder determining the target bits from the current block's complexity level can be realized in the following cases.
Case 1: determine target_cost from the image information and the complexity.
Specifically, each complexity information value keeps a reference bit count (ref_cost) that updates with the image blocks: ref_cost = 0.75×ref_cost_{t-1} + 0.25×real_cost, where real_cost is the predicted lossless bit consumption of the current block, related to the quantization parameter (qp) and the actual number of encoded bits; ref_cost is the reference bit count used to determine target_cost, ref_cost_{t-1} is the reference bit count of the previous image block, and ref_cost can also represent the reference bit count of the current image block. The target bits are obtained by a piecewise-linear transform of the ref_cost at the current block's complexity level. Refer to the following formula:
target_cost = ref_cost×g + h;
where g and h are the parameters of the piecewise-linear mapping of target_cost from ref_cost: g is the scaling and h the offset applied to ref_cost. Parameters g and h relate to the image information; for example, for an image with bpp of 8 bits and YUV444 sampling format, when ref_cost > 1.1×average bit count (ave_cost), g = 1.8 and h = -9; when ref_cost < 0.9×ave_cost, g = 1.2 and h = -6. Here g and h serve to correct ref_cost according to ave_cost to obtain target_cost.
Case 2: determine target_cost from the complexity information.
Specifically, with I complexity information values and J mode classes, there are I×J categories in total, each corresponding to one target_cost. For example, if the complexity information indicates that the current block is simple or complex, and the modes include the intra block copy (IBC) mode and non-IBC modes, there are 4 categories in total: simple IBC, simple non-IBC, complex IBC, and complex non-IBC, each corresponding to one target_cost. First estimate the reference bit count of each category: ref_cost = real_bit + qp/8; update the predicted bit count of each category: pred_cost_t = 0.75×pred_cost_{t-1} + 0.25×ref_cost; then compute the coefficient: scale = bpp/(avg_complexity - offset), where avg_complexity is the sliding average complexity of all blocks up to the current block's position and offset relates to the image format and bpp; finally target_cost = scale×(pred_cost - offset). Here pred_cost_{t-1} is the predicted bit count of the previous image block.
Case 3: determine target_cost from the image information, which includes the bit width, the image sampling format, or other information.
Specifically, with K bit widths and L image sampling formats, there are K×L categories in total, each corresponding to one target_cost. For example, with 2 bit widths (8-bit and 12-bit) and 2 image sampling formats (YUV and RGB), there are 4 classes of target_cost: 8-bit YUV, 8-bit RGB, 12-bit YUV, and 12-bit RGB.
Case 4: obtain the fullness, and determine target_cost from the fullness combined with the current block's complexity information, the buffer state, or the image information.
The process of obtaining the fullness is the same as step S41.
Case 4.1: determine target_cost from the fullness and the buffer state.
Specifically, target_cost = m×ref_cost + n×physical_buffer + o, where m, n, and o are parameters.
If fullness > 0.85, the setting of target_cost is dominated by fullness, i.e., n is larger relative to m; if fullness < 0.25, m is larger relative to n.
Case 4.2: determine target_cost from the fullness, the image information, and the complexity information.
Different fullness values correspond to different minimum target bits (min_target_cost) and maximum target bits (max_target_cost), which constrain target_cost.
For example, for an 8-bit image with YUV444 sampling format, min_target_cost = bpp×fullness×p1 + q1 and max_target_cost = bpp×fullness×p2 + q2, where p1 and q1 make min_target_cost smaller when fullness < 0.25 and larger when fullness > 0.75, and p2 and q2 make max_target_cost smaller when fullness < 0.25 and larger when fullness > 0.75.
Case 4.3: determine target_cost from the fullness and the complexity information.
For simple blocks of lower complexity: when fullness < 0.1, m > 0 and n = 0, i.e., ref_qp increases as fullness increases; when 0.1 ≤ fullness ≤ 0.9, m = 0 and n > 0, with the value of n updating with the image blocks, so ref_qp stays constant over a span of fullness as fullness increases; when fullness > 0.9, m > 0, i.e., ref_qp increases with fullness. Ordinary and complex blocks are treated similarly. For the constant segments of simple and ordinary blocks, when the actual encoded bits exceed the target bits, the value of n increases, and decreases otherwise, i.e., n is adjusted so that the actual bit consumption is less than or equal to target_cost; for complex blocks, n is adjusted so that the actual bit consumption is greater than or equal to target_cost (if the simple and ordinary blocks have saved no bits, the complex blocks have no spare bits available and must stay strictly less than or equal to target_cost).
Optionally, the video coder determines the reference quantization parameter of the current block from the target bits, with reference to the following formula:
ref_qp = target_cost×u + v;
where u and v are parameters: the parameters of the piecewise-linear mapping of ref_qp from target_cost, u being the scaling and v the offset applied to target_cost. For example, u = 8/3 and v = ref_cost×8.
Thus, the reference quantization parameter obtained from the target bits corresponding to the complexity level can be adjusted flexibly according to the size of the target bits, allowing more flexible adjustment of the quantization parameter.
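As an illustration of Case 1, the following sketch updates the per-level reference bit count as a moving average and maps it piecewise-linearly to target_cost. The (g, h) pairs follow the 8-bit YUV444 example above; the identity mapping in the middle band is an assumption, and the function names are illustrative:

def update_ref_cost(prev_ref_cost, real_cost):
    # ref_cost = 0.75*ref_cost_{t-1} + 0.25*real_cost
    return 0.75 * prev_ref_cost + 0.25 * real_cost

def target_cost(ref_cost, ave_cost):
    if ref_cost > 1.1 * ave_cost:
        g, h = 1.8, -9
    elif ref_cost < 0.9 * ave_cost:
        g, h = 1.2, -6
    else:                       # middle band: assumed identity mapping
        g, h = 1.0, 0
    return ref_cost * g + h

rc = update_ref_cost(prev_ref_cost=100.0, real_cost=140.0)  # 110.0
print(target_cost(rc, ave_cost=90.0))                       # 110*1.8 - 9 = 189.0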
The implementation of step S32 is described below.
Optionally, the video coder determines the quantization parameter of the current block from its reference quantization parameter. Specifically, refer to the following formula: qp = ref_qp×x + y, where x and y are the parameters of the piecewise-linear mapping of qp from ref_qp: x is the scaling applied to ref_qp and y the offset applied to ref_qp.
Specifically, the video coder determining the quantization parameter from the reference quantization parameter can be realized in the following cases.
Case 1: determine parameters x and y from the image information.
When x is 1 and y is 0, the reference quantization parameter is the quantization parameter.
Alternatively, for an image with YUV444 sampling format, x = 1/3 and y = 0 for each channel; or x = 1/2, y = 0 for the Y channel, and x = 1, y = 0 for the chrominance channels.
Alternatively, when ref_qp is within (0, 16), x = 1/4, y = 0 for the Y channel and x = 1/2, y = 0 for the chrominance channels; when ref_qp is within (17, 32), x = 1/2, y = 2 for the Y channel and x = 1/2, y = 4 for the chrominance channels; when ref_qp is within (33, 63), x = 1, y = 0 for the Y channel and x = 1, y = 0 for the chrominance channels.
Case 2: determine parameters x and y from the complexity information.
Specifically, the video coder determines a weighting coefficient from the complexity information of the current block, the weighting coefficient being used to adjust the quantization parameter of the current block according to its degree of complexity, and determines the quantization parameter of the current block from the weighting coefficient and the reference quantization parameter.
For example, the above weighting coefficient can be regarded as the x below, which can be determined by the following expression:
x = block_complexity1×w11/(block_complexity1×w11 + block_complexity2×w12 + block_complexity3×w13)
where w denotes the weights, including w11, w12, and w13, with 0 ≤ w11, w12, w13 ≤ 1; block_complexity1, block_complexity2, and block_complexity3 are the complexity information of the three channels of the current block.
Through this possible implementation, adjusting the quantization parameter of the current block with a weighting coefficient determined from its complexity information allows the quantization parameter to be adapted to the block's complexity information, improving the accuracy of determining the current block's quantization parameter and the accuracy of video coding.
Case 3: determine the weighting coefficient from the complexity information of M coded image blocks and the complexity information of the current block; determine the quantization parameter of the current block from the weighting coefficient and the reference quantization parameter of the current block.
For example, the above weighting coefficient can be regarded as the x below, which can be determined by the following expression:
where window_complexity is the complexity information of the image blocks included in the sliding window. As the image blocks in the sliding window change, window_complexity updates accordingly; specifically, it can be computed by the following formula:
window_complexity_z = window_complexity_{z-1}×0.75 + block_complexity×0.25;
x = window_complexity1_z×w14/(window_complexity1_z×w14 + window_complexity2_z×w15 + window_complexity3_z×w16)
where w14, w15, and w16 are weights with 0 ≤ w14, w15, w16 ≤ 1; window_complexity_z is the complexity information of the M coded image blocks preceding block z, and likewise window_complexity_{z-1} is the complexity information of the M coded image blocks preceding block z-1. window_complexity1_z, window_complexity2_z, and window_complexity3_z are the complexity information of the three different image blocks included in the sliding window.
y = 0.
For determining the quantization parameter of the current block from the weighting coefficient and the reference quantization parameter, refer to the formula qp = ref_qp×x + y; this is not detailed here.
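The following sketch illustrates Case 3: per-channel sliding-window complexities are updated as moving averages, the weighting coefficient x is channel 1's normalized weighted share, and qp = ref_qp×x + y with y = 0. The equal weights are an illustrative assumption:

def update_window(prev, block_complexity):
    # window_complexity_z = 0.75*window_complexity_{z-1} + 0.25*block_complexity
    return prev * 0.75 + block_complexity * 0.25

def qp_from_ref(ref_qp, wins, weights=(1.0, 1.0, 1.0), y=0.0):
    num = wins[0] * weights[0]
    den = sum(w * c for w, c in zip(weights, wins))
    return ref_qp * (num / den) + y

wins = [update_window(p, c) for p, c in zip((8.0, 8.0, 8.0), (16.0, 8.0, 8.0))]
print(qp_from_ref(30.0, wins))  # channel 1 got more complex -> larger share of qp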
S603: The video coder codes the current block based on the quantization parameter.
Optionally, the complexity information of the current block is computed based on the rate control unit of the current block, and the quantization parameter of the current block is the quantization parameter of this rate control unit. The video coder coding the current block based on its quantization parameter includes: determining the quantization parameter of the coding unit of the current block from the quantization parameter of the rate control unit, and coding the current block according to the quantization parameter of the coding unit. When the current block is being encoded, the above coding unit is an encoding unit; when the current block is being decoded, the above coding unit is a decoding unit.
It can be understood that the rate control module computes the quantization parameter per rate control unit. When the size of the rate control unit is larger than the size of the basic encoding unit (the quantization unit), multiple basic encoding units use the same quantization parameter. When the size of the rate control unit equals the size of the quantization unit, one-to-one quantization parameters can be obtained. When the size of the rate control unit is smaller than the size of the quantization unit, one quantization unit corresponds to multiple quantization parameters, and a certain strategy is needed to determine the final quantization parameter for the quantization unit from these multiple quantization parameters.
In a first possible implementation, the video coder divides the quantization unit according to the rate control units, i.e., so that the multiple quantization parameters correspond one-to-one with multiple quantization units.
In a second possible implementation, the multiple quantization parameters are weighted, or their minimum is selected, to obtain one quantization parameter corresponding to the one quantization unit.
In a third possible implementation, the multiple quantization parameters are merged based on the complexity information and the buffer state. Exemplarily, quantization parameters with similar complexity information are merged into one, where similar complexity information may be multiple complexity information values satisfying a certain difference range.
Through the above possible implementations, the quantization parameter of the encoding unit of the current block is determined from the quantization parameter of the rate control unit, so that the quantization parameter of the encoded block matches the rate control strategy, and the coding result balances image quality and improves coding efficiency while satisfying the rate control requirements.
The above merging may weight similar quantization parameters into one quantization parameter, or select the minimum among similar quantization parameters as the merged quantization parameter.
The merged quantization parameter(s) may then be used with the first possible implementation above to obtain multiple quantization units corresponding to the quantization parameters; or the second possible implementation above may be used to obtain one quantization parameter corresponding to one quantization unit. This is not limited.
It can be understood that step S603 includes the video coder encoding or decoding the current block based on the quantization parameter.
Optionally, when encoding, the video encoder writes the complexity information of the current block into the bitstream, or writes the quantization parameter of the current block into the bitstream. Correspondingly, the decoding end obtains the complexity information from the bitstream and computes the quantization parameter for decoding, or obtains the quantization parameter from the bitstream for decoding. Of course, the video encoder may also write both kinds of information into the bitstream.
It should be noted that when the video encoder writes the complexity information of the current block into the bitstream, the video decoder accordingly obtains the complexity information to compute the quantization parameter, but the video decoder may choose not to use this complexity information to update other parameters. Exemplarily, the way target_cost is determined above involves updating from the current block's complexity information; in a concrete implementation, the complexity-based update may differ from the result of updating based on historical information (such as the number of bits occupied by coded image blocks and the quantization parameters of coded image blocks). In this case, the complexity information may not be used to update the parameter, keeping the original way of updating it.
Through the above method, computing the complexity information of the current block facilitates determining more accurate coding parameters for the block, such as the quantization parameter, thereby improving the coding efficiency of the image.
Optionally, the bitstream obtained by the above coding method may further undergo the following code stream grouping method before being written into the bitstream.
As the above description of slices indicates, an image can be divided into multiple slices based on the image width (image_width) and image height (image_height). image_width specifies the width of the image's luminance component, i.e., the number of samples in the horizontal direction, as a 16-bit unsigned integer; its unit is the number of samples per image row. The top-left sample of the displayable area is aligned with the top-left sample of the decoded image. The value of ImageWidth equals the value of image_width; it must not be 0 and must be an integer multiple of 16. image_height specifies the height of the image's luminance component, i.e., the number of scan lines in the vertical direction, as a 16-bit unsigned integer; its unit is the number of rows of image samples. The value of ImageHeight equals the value of image_height; it must not be 0 and must be an integer multiple of 2.
A slice is a fixed rectangular region in the image, so it may also be called a rectangular slice; it contains the in-image parts of several coding units, and slices do not overlap. The division method is not limited; CUs can be further divided from a slice. When dividing an image into slices, the current image width or image height may be adjusted so that an integer number of slices is obtained. As shown in the image boundary diagram of FIG. 13a, real_width is the actual image width and real_height the actual image height, i.e., the boundary of the displayable area of the image. To divide slices, the width and height of the image are enlarged adaptively, yielding the image_width and image_height in the figure.
A slice has a width (slice_width) and a height (slice_height). For example, SliceNumX can denote the number of slices in the horizontal direction of an image, and SliceNumY the number of slices in the vertical direction, as in the slice diagram of FIG. 13b.
In one code stream grouping method, the code stream length of each slice is fixed, where the lengths of the first R-1 fragments (chunks) are fixed and the last is not. FIG. 14a and FIG. 14b are schematic flowcharts of this code stream grouping method at the encoding end and the decoding end, respectively.
As shown in FIG. 14a, a schematic flowchart of a code stream grouping method at the encoding end provided by an embodiment of this application, comprising steps S1401a-S1406a.
S1401a: Split the image horizontally and vertically into sliceNumX*sliceNumY rectangular slices.
The image is divided in the horizontal and vertical directions, obtaining sliceNumX rectangular slices in the horizontal direction and sliceNumY rectangular slices in the vertical direction. The image here is the aforementioned image to be processed; after dividing it horizontally and vertically, the horizontal and vertical slices are obtained, and hence the above sliceNumX and sliceNumY, i.e., the first slice count and the second slice count.
S1402a: Compute the total resources and determine chunkNum, the number of fragments per slice.
total_resource refers to the resources occupied by the slice, computed from the number of bits needed per pixel:
total_resource = ((slice_width×slice_height×target pixel depth (target_bpp) + 7) >> 3) << 3.
The number of chunks (chunkNum) is determined from total_resource:
chunkNum = total_resource/size + n,
where n = (total_resource % size == 0) ? 0 : 1, and size is an integer multiple of 8, e.g., size = target_bpp×32×block_num.
Here block_num is a preset configuration parameter and an integer multiple of 4.
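The chunk-count computation of S1402a can be sketched as follows (the values are illustrative; block_num is a preset multiple of 4 and size is a multiple of 8):

def chunk_num(slice_width, slice_height, target_bpp, block_num=4):
    # total_resource rounded to a whole number of bytes, expressed in bits
    total_resource = ((slice_width * slice_height * target_bpp + 7) >> 3) << 3
    size = target_bpp * 32 * block_num
    n = 0 if total_resource % size == 0 else 1
    return total_resource // size + n, total_resource, size

print(chunk_num(128, 64, target_bpp=5))  # (64, 40960, 640): 64 chunks per slice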
S1403a: Sequentially encode the sliceNumX slices of each slice row, generating sliceNumX bitstream buffers.
The bitstream buffers can be denoted slicebuffer[sliceNumX]; the bitstream buffers can also be zero-padded and byte-aligned.
In this step, encoding each slice can be understood as encoding each image block in the slice according to the solutions provided in the foregoing embodiments, obtaining the encoded data of each image block in the slice.
S1404a: Further divide each bitstream buffer into N bit fragments, where the length of the first N-1 fragments is a first value and the length of the last fragment is a second value.
In this step, the bit fragments may also be called code stream fragments, and N denotes the number of fragments, i.e., the aforementioned chunkNum.
Specifically, each bitstream buffer is further divided into chunkNum chunk fragments; chunksize is computed as follows: the length of the first chunkNum-1 chunks is chunksize = size1, and the length of the last chunk is size2, where size1 = size and size2 = total_resource - (chunkNum-1)×size. The first value may be size1 and the second value size2.
S1405a: Interleave and encode each bit fragment of the sliceNumX slices in turn to form the final bitstream.
For the sliceNum slices within one slice row, encode chunkNum times in turn, each time interleaving the chunk fragments of each slice together to form the final code stream (see the sketch after step S1406a).
S1406a: Determine whether slice encoding has finished. If not, return to S1403a and encode the next slice row.
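As noted in S1405a, the chunk interleaving can be sketched as follows (an illustration with byte strings; the splitting follows S1404a, with the first chunkNum-1 chunks of length size1 and the last of length size2; the function names are assumptions):

def split_chunks(buf, chunk_num, size1):
    # First chunk_num-1 chunks of length size1, last chunk holds the remainder.
    return [buf[i * size1: (i + 1) * size1] for i in range(chunk_num - 1)] \
           + [buf[(chunk_num - 1) * size1:]]

def interleave(slice_buffers, chunk_num, size1):
    chunks = [split_chunks(b, chunk_num, size1) for b in slice_buffers]
    # Emit chunk k of every slice, then chunk k+1, round-robin.
    return b"".join(chunks[s][k] for k in range(chunk_num)
                    for s in range(len(slice_buffers)))

stream = interleave([b"AAAAAA", b"BBBBBB"], chunk_num=3, size1=2)
print(stream)   # b'AABBAABBAABB'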
As shown in FIG. 14b, a schematic flowchart of a code stream grouping method at the decoding end provided by an embodiment of this application, comprising steps S1401b-S1406b.
S1401b: Split the image horizontally and vertically into sliceNumX*sliceNumY rectangular slices.
This step is similar to step S1401a above and obtains the first slice count of horizontal slices and the second slice count of vertical slices in the image. Likewise, the image in this step is the aforementioned image to be processed.
Since the decoding end cannot obtain the image to be processed itself before decoding finishes, in this step the actual image width and actual image height of the image to be processed can be obtained from the video header information or image header information in the received code stream, and the above sliceNumX (first slice count) and sliceNumY (second slice count) are then computed from the actual image width and actual image height.
Of course, sliceNumX and sliceNumY can also be obtained directly from the video header information or image header information in the code stream.
S1402b: Compute the total resources and determine chunkNum, the number of fragments per slice.
This step is the same as step S1402a above.
S1403b: Receive the code stream, sequentially parse the code stream fragments of the sliceNumX slices of each slice row, and de-interleave each fragment's code stream into the bitstream buffer of each slice, where the length of the first N-1 fragments is the first value and the length of the last fragment is the second value.
The decoding end can receive the code stream continuously, or of course directly obtain the entire code stream to be decoded.
The number of code stream fragments per slice is the above chunkNum.
During de-interleaving, the code stream fragments of each slice are de-interleaved in units of slices, and the de-interleaved results are stored in the bitstream buffer of that slice.
Optionally, receive the code stream and parse chunkNum times in turn, each time parsing sliceNumX code stream fragments and de-interleaving them into the code stream buffer of each slice; chunksize is computed as follows: the length of the first chunkNum-1 chunks is chunksize = size1, and the length of the last chunk is size2, where size1 = size and size2 = total_resource - (chunkNum-1)×size.
S1404b: Decode each slice based on its bit buffer.
S1405b: Obtain the reconstructed image of each rectangular slice.
S1406b: Determine whether parsing of the slice code stream has finished. If not, return to S1403b and parse the next slice row in turn.
As shown in the schematic diagram of fragment interleaving based on the code stream grouping method provided in FIG. 15: when sliceNumX is 2, the interleaving of chunk fragments under the above code stream grouping method is illustrated.
Fragment R in FIG. 15 represents the R-th chunk; the length of the R-th chunk is not fixed, while the lengths of the other R-1 chunks are fixed. In the figure, size 1 denotes the length of the other R-1 chunks and size 2 the length of the R-th chunk.
The code streams represented by (c) in FIG. 15 are grouped by size 1 and size 2 respectively. When size 1 is smaller than size 2, the interleaving of the grouped fragments is shown in (a) of FIG. 15; when size 1 is larger than size 2, it is shown in (b) of FIG. 15.
In another code stream grouping method, the code stream length of each slice is fixed, where the length of the r-th chunk is not fixed and the lengths of the other r-1 chunks are fixed. FIG. 16a and FIG. 16b are schematic flowcharts of this code stream grouping method at the encoding end and the decoding end, respectively.
As shown in FIG. 16a, a schematic flowchart of a code stream grouping method at the encoding end, comprising steps S1601a-S1606a.
S1601a: Split the image horizontally and vertically into sliceNumX*sliceNumY rectangular slices.
This step is the same as step S1401a above.
S1602a: Compute the total resources and determine chunkNum, the number of fragments per slice.
This step is the same as step S1402a above.
S1603a: Sequentially encode the sliceNumX slices of each slice row, generating sliceNumX bitstream buffers.
S1604a: Further divide each slice's bitstream buffer into N bit fragments, where the length of the K-th fragment is a first value and the lengths of the other N-1 fragments are a second value.
In this step, the bit fragments may also be called code stream fragments, and N denotes the number of fragments, i.e., the aforementioned chunkNum.
For example, each bitstream buffer can be divided into chunkNum chunk fragments, and the length chunksize of each chunk is not fixed. The first value may be size1 and the second value size2, where chunksize is computed as follows: the length of the k-th chunk is size1 and the length of the other chunkNum-1 chunks is chunksize = size2, where size2 = size, size1 = total_resource - (chunkNum-1)×size, and k ranges from 1 to chunkNum.
S1605a: Interleave and encode each bit fragment of the sliceNumX slices in turn to form the final bitstream.
For the sliceNum slices within one slice row, encode chunkNum times in turn, each time interleaving the chunk fragments of each slice together to form the final code stream.
S1606a: Determine whether slice encoding has finished. If not, return to S1603a and encode the next slice row.
As shown in FIG. 16b, a schematic flowchart of a code stream grouping method at the decoding end, comprising steps S1601b-S1606b.
S1601b: Split the image horizontally and vertically into sliceNumX*sliceNumY rectangular slices.
This step is the same as step S1401b above.
S1602b: Compute the total resources and determine chunkNum, the number of fragments per slice.
This step is the same as step S1402b above.
S1603b: Receive the code stream, sequentially parse the code stream fragments of the sliceNumX slices of each slice row, and de-interleave each fragment's code stream into the bitstream buffer of each slice, where the length of the K-th fragment is the first value and the lengths of the other N-1 fragments are the second value.
Receive the code stream; for the sliceNumX slices of each slice row, parse chunkNum times in turn, each time parsing sliceNumX code stream fragments and de-interleaving them into the code stream buffer of each slice. chunksize is computed as follows: the length of the k-th chunk is size1 and the length of the other chunkNum-1 chunks is chunksize = size2, where size2 = size, size1 = total_resource - (chunkNum-1)×size, and k ranges from 1 to chunkNum.
S1604b: Decode each slice based on its bit buffer.
S1605b: Obtain the reconstructed image of each rectangular slice.
S1606b: Determine whether parsing of the slice code stream has finished. If not, return to S1603b and parse the next slice row in turn.
As shown in the schematic diagram of fragment interleaving based on the code stream grouping method provided in FIG. 17: when sliceNumX is 2, the interleaving of chunk fragments under the above other code stream grouping method is illustrated.
Fragment r in FIG. 17 represents the r-th chunk and fragment R the R-th chunk; the length of the r-th chunk is not fixed, while the lengths of the other R-1 chunks are fixed. In the figure, size 1 denotes the length of the other R-1 chunks and size 2 the length of the r-th chunk.
The code streams represented by (c) in FIG. 17 are grouped by size 1 and size 2 respectively. When size 1 is smaller than size 2, the interleaving of the grouped fragments is shown in (a) of FIG. 17; when size 1 is larger than size 2, it is shown in (b) of FIG. 17.
It should be noted that any solution above not specially noted can be performed on the decoding side or the encoding side.
It should be noted that, where there is no conflict, some or all of the content of any of the above embodiments may constitute a new embodiment.
本申请实施例提供一种视频译码装置,该视频译码装置可以为视频译码器或视频编码器或视频解码器。具体的,视频译码装置用于执行以上视频译码方法中的视频译码器所执行的步骤。本申请实施例提供的视频译码装置可以包括相应步骤所对应的模块。
本申请实施例可以根据上述方法示例对视频译码装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
在采用对应各个功能划分各个功能模块的情况下,图18为本申请实施例提供的一种视频译码装置的组成示意图。如图18所示,视频译码装置180包括获取模块1801、确定模块1802和译码模块1803。
获取模块1801,用于获取待处理图像中当前块的复杂度信息,当前块的复杂度信息至少根据当前块的像素值计算当前块的至少一个角度梯度获取得到,例如上述步骤S601。
确定模块1802,用于根据当前块的复杂度信息确定当前块的量化参数;例如上述步骤S602。
译码模块1803,用于基于量化参数对当前块进行译码,例如上述步骤S603。
In one example, the obtaining module 1801 is specifically configured to compute at least one angle gradient of the current block based on the pixel values of the current block and reconstructed values of already-coded pixels of the current block, and to obtain the complexity information of the current block according to the at least one angle gradient of the current block.
In one example, the obtaining module 1801 is specifically configured to compute at least one angle gradient of the current block based on the pixel values of the current block and pixel values adjacent to the current block in the image to be processed, and to obtain the complexity information of the current block according to the at least one angle gradient of the current block.
In one example, the obtaining module 1801 is specifically configured to obtain the prediction angle used by the angular prediction mode of the current block, compute an angle gradient based on the prediction angle to obtain corresponding complexity information, and take the corresponding complexity information as the complexity information of the current block.
In one example, the current block is an N-channel image block, and the obtaining module 1801 is specifically configured to obtain complexity information of each channel image block based on the pixel values of each channel image block of the N-channel image block, N being an integer greater than zero, and to determine the complexity information of the current block based on the complexity information of each channel image block.
In one example, the obtaining module 1801 is specifically configured to divide each channel image block into at least two sub-blocks, determine complexity information of the at least two sub-blocks of each channel image block, and determine, based on the complexity information of the at least two sub-blocks of each channel image block, the complexity information of the corresponding channel image block.
In one example, the obtaining module 1801 is specifically configured to determine the minimum of the complexity information of the at least two sub-blocks of each channel image block as the complexity information of the corresponding channel image block.
In one example, the obtaining module 1801 is specifically configured to determine the minimum of the complexity information of the channel image blocks as the complexity information of the current block.
In one example, the obtaining module 1801 is specifically configured to determine a complexity level of each channel image block based on the complexity information of each channel image block, and to determine the complexity information of the current block based on the complexity level of each channel image block.
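A minimal sketch of the per-channel reduction described in the examples above (minimum over sub-blocks, then minimum over channel image blocks); the 2x2 sub-block split and the gradient measure are illustrative assumptions.

import numpy as np

def sub_complexity(sub: np.ndarray) -> float:
    # Sum of absolute horizontal and vertical gradients (illustrative).
    s = sub.astype(np.int32)
    return float(np.abs(np.diff(s, axis=1)).sum() +
                 np.abs(np.diff(s, axis=0)).sum())

def block_complexity(channels) -> float:
    # channels: one 2-D array per channel image block (e.g. Y, Cb, Cr).
    def channel_complexity(ch: np.ndarray) -> float:
        h, w = ch.shape
        subs = [ch[:h // 2, :w // 2], ch[:h // 2, w // 2:],
                ch[h // 2:, :w // 2], ch[h // 2:, w // 2:]]
        return min(sub_complexity(s) for s in subs)      # min over sub-blocks
    return min(channel_complexity(c) for c in channels)  # min over channels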
In one example, the determining module 1802 is specifically configured to determine a reference quantization parameter of the current block according to the complexity information of the current block, and to determine the quantization parameter of the current block according to the reference quantization parameter of the current block.
In one example, when the video coding method is a video encoding method, the determining module 1802 is specifically configured to obtain a buffer state of the image to be processed, where the buffer state characterizes the number of bits occupied in the buffer by the image blocks of the image to be processed whose encoding has been completed, and the buffer is used to control the bitstream of the image to be processed to be output at a constant rate; and to determine the reference quantization parameter of the current block according to the correspondence between the buffer state and the complexity information of the current block.
In one example, the determining module 1802 is specifically configured to determine a complexity level of the current block, determine corresponding target bits according to the complexity level of the current block, where the target bits refer to the number of bits the current block occupies in the bitstream, and obtain the reference quantization parameter of the current block according to the target bits.
In one example, the determining module 1802 is specifically configured to determine a weighting coefficient according to the complexity information of the current block, where the weighting coefficient is used to adjust the quantization parameter of the current block according to the degree of complexity of the current block, and to determine the quantization parameter of the current block according to the weighting coefficient and the reference quantization parameter of the current block.
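A minimal sketch of combining the reference quantization parameter with a complexity-dependent weighting coefficient, as in the example above; the per-level weight table is hypothetical and stands in for whatever correspondence an implementation chooses.

def weighted_qp(ref_qp: int, complexity_level: int) -> int:
    # Hypothetical per-level weights; flatter blocks get a finer QP.
    weights = {0: 0.8, 1: 1.0, 2: 1.2}
    return int(round(ref_qp * weights.get(complexity_level, 1.0)))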
In one example, the complexity information of the current block is computed based on a rate-control unit of the current block, where the rate-control unit is the basic processing unit for computing the complexity information of the current block, and the quantization parameter of the current block is the quantization parameter of the rate-control unit of the current block. The coding module 1803 is specifically configured to determine the quantization parameter of the coding unit of the current block according to the quantization parameter of the rate-control unit, and to code the current block according to the quantization parameter of the coding unit.
All related content of the steps involved in the above method embodiments may be incorporated by reference into the functional descriptions of the corresponding functional modules, and is not repeated here.
Of course, the video coding apparatus provided by the embodiments of the present application includes, but is not limited to, the above modules; for example, the video coding apparatus may further include a storage module 1804.
The storage module 1804 may be configured to store program code and data of the video coding apparatus.
An embodiment of the present application further provides a video decoder, including a processor and a memory;
the memory stores instructions executable by the processor; and
the processor is configured such that, when executing the instructions, the video decoder implements the video image decoding method in the above embodiments.
An embodiment of the present application further provides a video encoder, including a processor and a memory;
the memory stores instructions executable by the processor; and
the processor is configured such that, when executing the instructions, the video encoder implements the video image encoding method in the above embodiments.
An embodiment of the present application further provides a video coding system, including a video encoder and a video decoder, where the video encoder is configured to perform any one of the video coding methods provided in the above embodiments, and the video decoder is configured to perform any one of the video coding methods provided in the above embodiments.
An embodiment of the present application further provides an electronic device, including the above video coding apparatus 180, where the video coding apparatus 180 performs any one of the methods performed by the video coder provided above.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when run on a computer, causes the computer to perform any one of the methods performed by the video decoder provided above.
For explanations of the related content and descriptions of the beneficial effects of any of the computer-readable storage media provided above, reference may be made to the corresponding embodiments above; details are not repeated here.
An embodiment of the present application further provides a chip. The chip integrates a control circuit and one or more ports for implementing the functions of the above video coding apparatus 180. Optionally, for the functions supported by the chip, reference may be made to the description above, and details are not repeated here. A person of ordinary skill in the art may understand that all or part of the steps for implementing the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium. The above-mentioned storage medium may be a read-only memory, a random access memory, or the like. The above processing unit or processor may be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
An embodiment of the present application further provides a computer program product containing instructions which, when run on a computer, cause the computer to perform any one of the methods in the above embodiments. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), a semiconductor medium (e.g., SSD), or the like.
It should be noted that the above devices for storing computer instructions or computer programs provided by the embodiments of the present application, such as, but not limited to, the above memory, computer-readable storage medium, and communication chip, are all non-transitory.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by a software program, the implementation may take the form of a computer program product in whole or in part. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), a semiconductor medium (e.g., solid state disk (SSD)), or the like.
Although the present application has been described herein in conjunction with various embodiments, those skilled in the art may, in practicing the claimed application, understand and achieve other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Although the present application has been described in conjunction with specific features and embodiments thereof, it is evident that various modifications and combinations may be made without departing from the spirit and scope of the present application. Accordingly, the specification and drawings are merely exemplary illustrations of the present application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of the present application. Evidently, those skilled in the art may make various changes and variations to the present application without departing from its spirit and scope. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to encompass them. The foregoing describes merely preferred embodiments of the present application and is not intended to limit the present application; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall be included within the scope of protection of the present application.

Claims (34)

  1. A video image decoding method, applied to a decoder, comprising:
    obtaining complexity information of a current block in an image to be processed;
    determining a quantization parameter of the current block according to the complexity information of the current block; and
    decoding the current block based on the quantization parameter.
  2. The method according to claim 1, wherein the current block is an N-channel image block, and the obtaining complexity information of a current block in an image to be processed comprises:
    obtaining pixel values of the current block in the image to be processed, the pixel values of the current block comprising pixel values of each channel image block of the N-channel image block, the N-channel image block comprising at least a luma channel image block and a chroma channel image block, and N being an integer greater than zero;
    obtaining complexity information of each channel image block based on the pixel values of each channel image block of the N-channel image block; and
    determining the complexity information of the current block based on the complexity information of each channel image block.
  3. The method according to claim 1, wherein the determining a quantization parameter of the current block according to the complexity information of the current block comprises:
    determining a reference quantization parameter of the current block according to the complexity information of the current block; and
    determining the quantization parameter of the current block according to the reference quantization parameter of the current block.
  4. The method according to any one of claims 1-3, wherein the decoding the current block based on the quantization parameter comprises:
    performing inverse quantization on the current block based on the quantization parameter of the current block to obtain inverse-quantized transform coefficients of the current block;
    performing an inverse transform on the inverse-quantized transform coefficients of the current block to obtain an inverse-transformed residual block; and
    obtaining the decoded current block based on the inverse-transformed residual block.
  5. The method according to any one of claims 1-3, further comprising, after the step of decoding the current block based on the quantization parameter:
    obtaining a number n of bitstream chunks into which each slice of the image to be processed is divided, n being a positive integer; and
    deinterleaving the bitstream chunks corresponding to each slice in each slice row of the bitstream to obtain the n bitstream chunks corresponding to each slice, wherein the n bitstream chunks comprise at least two sizes, at least one of which is an integer multiple of 8.
  6. The method according to claim 5, wherein the obtaining the n bitstream chunks into which each slice of the image to be processed is divided comprises:
    obtaining a slice width and a slice height of each slice of the image to be processed; and
    determining a target number of transmission bits according to the slice width, the slice height, and a target number of bits per pixel, and determining the number of bitstream chunks according to the target number of transmission bits and a chunk size value.
  7. The method according to claim 6, wherein the chunk size value is derived from the number of coding units in a bitstream chunk, a coding unit size, and a target pixel depth, and the chunk size value is an integer multiple of 8.
  8. The method according to claim 6 or 7, wherein the deinterleaving the bitstream chunks corresponding to each slice in each slice row of the bitstream to obtain the n bitstream chunks corresponding to each slice comprises:
    deinterleaving, based on the horizontal direction of each slice row in the bitstream, the bitstream chunks corresponding to X slices of the same slice row in sequence, to obtain the n bitstream chunks corresponding to each of the X slices, X being a positive integer.
  9. The method according to claim 6 or 7, wherein, when the length of the actual number of coded bits is fixed, among the bitstream chunks, the lengths of the bitstream chunks other than a k-th bitstream chunk are all a first size, and the length of the k-th bitstream chunk is a second size; wherein the first size is an integer multiple of 8, the second size is determined based on the target number of transmission bits, the first size, and the n bitstream chunks, and k is a positive integer greater than zero and less than or equal to n.
  10. The method according to claim 9, wherein
    the k-th bitstream chunk is the last of all the bitstream chunks.
  11. A video image encoding method, applied to an encoder, comprising:
    obtaining complexity information of a current block in an image to be processed, the complexity information of the current block being obtained at least by computing at least one angle gradient of the current block from pixel values of the current block;
    determining a quantization parameter of the current block according to the complexity information of the current block; and
    encoding the current block based on the quantization parameter.
  12. The method according to claim 11, wherein the obtaining complexity information of a current block in an image to be processed comprises:
    computing at least one angle gradient of the current block based on the pixel values of the current block and reconstructed values of already-encoded pixels of the current block; and
    obtaining the complexity information of the current block according to the at least one angle gradient of the current block.
  13. The method according to claim 11, wherein the obtaining complexity information of a current block in an image to be processed comprises:
    computing at least one angle gradient of the current block based on the pixel values of the current block and pixel values adjacent to the current block in the image to be processed; and
    obtaining the complexity information of the current block according to the at least one angle gradient of the current block.
  14. The method according to claim 11, wherein the obtaining complexity information of a current block in an image to be processed comprises:
    obtaining a prediction angle used by an angular prediction mode of the current block;
    computing an angle gradient based on the prediction angle to obtain corresponding complexity information; and
    taking the corresponding complexity information as the complexity information of the current block.
  15. The method according to claim 11, wherein the prediction modes of the current block comprise a first prediction mode and a second prediction mode, the first prediction mode and the second prediction mode each being preset with corresponding complexity information, and the obtaining complexity information of a current block in an image to be processed comprises:
    obtaining prediction results based on the first prediction mode and the second prediction mode;
    determining an optimal prediction mode among the prediction results based on rate-distortion cost; and
    taking the complexity information of the optimal prediction mode as the complexity information of the current block.
  16. The method according to claim 11, wherein the obtaining complexity information of a current block in an image to be processed comprises:
    obtaining the complexity information of the current block from a weighted value of the complexity information computed from the at least one angle gradient.
  17. The method according to any one of claims 11-16, wherein the current block is an N-channel image block, and the obtaining complexity information of a current block in an image to be processed comprises:
    obtaining complexity information of each channel image block based on the pixel values of each channel image block of the N-channel image block, N being an integer greater than zero; and
    determining the complexity information of the current block based on the complexity information of each channel image block.
  18. The method according to claim 17, wherein the obtaining complexity information of each channel image block based on the pixel values of each channel image block of the N-channel image block comprises:
    dividing each channel image block into at least two sub-blocks;
    determining complexity information of the at least two sub-blocks of each channel image block; and
    determining the complexity information of the corresponding channel image block based on the complexity information of the at least two sub-blocks of each channel image block.
  19. The method according to claim 18, wherein the determining the complexity information of the corresponding channel image block based on the complexity information of the at least two sub-blocks of each channel image block comprises:
    obtaining the complexity information of the corresponding channel image block from a weighted value of the complexity information of the at least two sub-blocks.
  20. The method according to any one of claims 11-16, wherein the determining the complexity information of the current block based on the complexity information of each channel image block comprises:
    determining a complexity level of each channel image block based on the complexity information of each channel image block; and
    determining the complexity information of the current block based on the complexity level of each channel image block.
  21. The method according to any one of claims 11-16, wherein the determining a quantization parameter of the current block according to the complexity information of the current block comprises:
    determining a reference quantization parameter of the current block according to the complexity information of the current block; and
    determining the quantization parameter of the current block according to the reference quantization parameter of the current block.
  22. The method according to any one of claims 11-16, further comprising, after the step of encoding the current block based on the quantization parameter:
    obtaining a number n of bitstream chunks into which each slice of the image to be processed is divided, wherein the n bitstream chunks comprise at least two sizes, at least one of which is an integer multiple of 8, and n is a positive integer; and
    interleaving the bitstream chunks corresponding to each slice in each slice row of the image to be encoded to obtain a bitstream.
  23. The method according to claim 22, wherein the obtaining the n bitstream chunks into which each slice of the image to be processed is divided comprises:
    obtaining a slice width and a slice height of each slice of the image to be processed; and
    determining a target number of transmission bits according to the slice width, the slice height, and a target number of bits per pixel, and determining the number of bitstream chunks according to the target number of transmission bits and a chunk size value.
  24. The method according to claim 23, wherein the chunk size value is derived from the number of coding units in a bitstream chunk, a coding unit size, and a target pixel depth, and the chunk size value is an integer multiple of 8.
  25. The method according to claim 22 or 23, wherein the interleaving the bitstream chunks corresponding to each slice in each slice row of the image to be encoded to obtain a bitstream comprises:
    interleaving, based on the horizontal direction of each slice row, the bitstream chunks corresponding to X slices of the same slice row in sequence to obtain the bitstream, X being a positive integer.
  26. The method according to claim 22 or 23, wherein, when the length of the actual number of coded bits is fixed, among the bitstream chunks, the lengths of the bitstream chunks other than a k-th bitstream chunk are all a first size, and the length of the k-th bitstream chunk is a second size; wherein the first size is an integer multiple of 8, the second size is determined based on the target number of transmission bits, the first size, and the n bitstream chunks, and k is a positive integer greater than zero and less than or equal to n.
  27. The method according to claim 26, wherein
    the k-th bitstream chunk is the last of all the bitstream chunks.
  28. A video decoding apparatus, comprising:
    an obtaining module configured to obtain complexity information of a current block in an image to be processed;
    a determining module configured to determine a quantization parameter of the current block according to the complexity information of the current block; and
    a decoding module configured to decode the current block based on the quantization parameter.
  29. A video encoding apparatus, comprising:
    an obtaining module configured to obtain complexity information of a current block in an image to be processed, the complexity information of the current block being obtained at least by computing at least one angle gradient of the current block from pixel values of the current block;
    a determining module configured to determine a quantization parameter of the current block according to the complexity information of the current block; and
    an encoding module configured to encode the current block based on the quantization parameter.
  30. A video decoder, comprising a processor and a memory;
    wherein the memory stores instructions executable by the processor; and
    the processor is configured such that, when executing the instructions, the video decoder implements the video image decoding method according to any one of claims 1-10.
  31. A video encoder, comprising a processor and a memory;
    wherein the memory stores instructions executable by the processor; and
    the processor is configured such that, when executing the instructions, the video encoder implements the video image encoding method according to any one of claims 11-27.
  32. A video coding system, comprising a video encoder and a video decoder, wherein the video encoder is configured to perform the video image encoding method according to any one of claims 11-27, and the video decoder is configured to perform the video image decoding method according to any one of claims 1-10.
  33. A computer-readable storage medium storing a program which, when run on a computer, causes the computer to perform the method according to any one of claims 1-27.
  34. A computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of the method according to any one of claims 1-27.
PCT/CN2023/096070 2022-05-31 2023-05-24 Video coding method and apparatus, and storage medium WO2023231866A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210612716.1A CN116095318A (zh) 2022-05-31 2022-05-31 Video coding method and apparatus, and storage medium
CN202210612716.1 2022-05-31

Publications (1)

Publication Number Publication Date
WO2023231866A1 true WO2023231866A1 (zh) 2023-12-07

Family

ID=86199741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/096070 WO2023231866A1 (zh) 2022-05-31 2023-05-24 Video coding method and apparatus, and storage medium

Country Status (3)

Country Link
CN (3) CN116506614A (zh)
TW (1) TW202349948A (zh)
WO (1) WO2023231866A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116506614A (zh) * 2022-05-31 2023-07-28 杭州海康威视数字技术股份有限公司 一种视频译码方法、装置及存储介质
CN116708800A (zh) * 2023-08-01 2023-09-05 天津卓朗昆仑云软件技术有限公司 图像编码与解码方法、装置及系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019191983A1 (zh) * 2018-04-04 2019-10-10 SZ DJI Technology Co., Ltd. Encoding method and apparatus, image processing system, and computer-readable storage medium
WO2021119145A1 (en) 2019-12-09 2021-06-17 Qualcomm Incorporated Position-dependent intra-prediction combination for angular intra-prediction modes for video coding
CN112738515A (zh) * 2020-12-28 2021-04-30 Beijing Baidu Netcom Science and Technology Co., Ltd. Quantization parameter adjustment method and apparatus for adaptive quantization
CN113784126A (zh) * 2021-09-17 2021-12-10 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image encoding method, apparatus, device, and storage medium
CN116095318A (zh) * 2022-05-31 2023-05-09 Hangzhou Hikvision Digital Technology Co., Ltd. Video coding method and apparatus, and storage medium

Also Published As

Publication number Publication date
CN116506614A (zh) 2023-07-28
CN116506613A (zh) 2023-07-28
CN116095318A (zh) 2023-05-09
TW202349948A (zh) 2023-12-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 23815050
Country of ref document: EP
Kind code of ref document: A1