WO2022213809A1 - Video encoding method, apparatus, device, and storage medium - Google Patents

Video encoding method, apparatus, device, and storage medium

Info

Publication number
WO2022213809A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
coding
macroblock
offset
encoding
Prior art date
Application number
PCT/CN2022/082675
Inventor
张文忠
Original Assignee
百果园技术(新加坡)有限公司
张文忠
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司, 张文忠
Publication of WO2022213809A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Definitions

  • the present application relates to the technical field of video coding, for example, to a video coding method, apparatus, device, and storage medium.
  • regions of interest (ROI)
  • a third-party ROI detection device is typically used to identify the regions of interest and the regions of non-interest in a video frame. The rate-distortion cost of each coding unit in those regions is then computed by traversal for every predictive coding mode at the corresponding recursion depth, together with the rate-distortion cost of every transform unit obtained by splitting that coding unit, in every predictive coding mode. The lowest-cost coding mode is then selected for each coding unit in the regions of interest and non-interest, to ensure the coding quality of both.
  • the ROI coding algorithm is too complicated and has a large coding overhead, which greatly affects the efficiency of video coding.
  • the present application provides a video coding method, apparatus, device and storage medium, which reduces the complexity and computational overhead of video coding, and improves the coding quality of video on the basis of ensuring the high efficiency of video coding.
  • the application provides a method for video encoding, including:
  • the macroblock coding parameters of the target macroblock and the non-target macroblock are adjusted correspondingly.
  • the application provides a video encoding device, including:
  • a target offset calculation module configured to calculate the target coding offset when the target macroblock reaches the adaptive coding quality based on the proportion of the target macroblock in the current video frame;
  • a non-target offset calculation module configured to calculate the non-target coding offset of the non-target macroblock in the current video frame based on the frame-level coding stability principle and the target coding offset;
  • the coding parameter adjustment module is configured to use the target coding offset and the non-target coding offset to adjust the macroblock coding parameters of the target macroblock and the non-target macroblock accordingly.
  • the application provides a computer equipment, including:
  • one or more processors;
  • a storage device configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned video encoding method.
  • the present application provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the above-mentioned video encoding method.
  • FIG. 1A is a flowchart of a video encoding method provided in Embodiment 1 of the present application;
  • FIG. 1B is a schematic diagram of the principle of a video encoding process provided in Embodiment 1 of the present application;
  • FIG. 2A is a flowchart of a method for video coding provided in Embodiment 2 of the present application.
  • FIG. 2B is a schematic diagram of the principle of a video encoding process according to Embodiment 2 of the present application.
  • FIG. 3 is a flowchart of a method for video coding provided in Embodiment 3 of the present application.
  • FIG. 4 is a schematic structural diagram of a device for video coding according to Embodiment 4 of the present application.
  • FIG. 5 is a schematic structural diagram of a computer device according to Embodiment 5 of the present application.
  • FIG. 1A is a flowchart of a video encoding method according to Embodiment 1 of the present application.
  • This embodiment may be applicable to a scenario where there is an encoding requirement for any video.
  • the video coding method provided in this embodiment may be executed by the video coding apparatus provided in the embodiments of the present application. The apparatus may be implemented in software and/or hardware and integrated in a computer device that executes the method; the device may be, for example, a background server that participates in video data interaction.
  • the method may include:
  • S110 Calculate, based on the proportion of the target macroblock in the current video frame, a target encoding offset when the target macroblock achieves an adaptive encoding quality.
  • the high-quality encoding method is used to ensure the high-definition picture of the area of interest and improve the video quality of the area of interest.
  • the region of interest in this embodiment may be a local area of the video frame with uniform, specific picture features, so that it can be identified accurately; for example, a face region that can be detected by skin color in scenarios such as video calls or live video broadcasts.
  • the macroblock can be a 16*16 pixel block, and then the macroblock
  • this embodiment can improve the video quality of the region of interest by optimizing the encoding method of the macroblock where each region of interest is located.
  • the current video frame is the original video data without data processing.
  • it is determined whether the picture of each macroblock in the current video frame is a picture of the region of interest; the macroblocks in the current video frame are then divided into target macroblocks and non-target macroblocks, and the proportion of target macroblocks in the current video frame can be determined.
  • all the pictures in the target macroblock are the region of interest, and the video pictures in the non-target macroblock may include pictures other than the region of interest.
  • if the proportion of target macroblocks in the current video frame is relatively high, there are more target macroblocks in the frame. The more picture features are retained, the higher the coding bit rate of a target macroblock; if many target macroblocks all use the highest-quality coding method, the bit rate of the current video frame becomes too high, exceeds the current network bandwidth requirement, and causes congestion in video encoding. Therefore, to keep the coding of the current video frame stable and avoid coding congestion or excess coding, this embodiment sets different adaptive coding qualities in inverse relation to the proportion of target macroblocks in the current video frame, and requires each adaptive coding quality to be better than the current coding quality.
  • the target encoding offset when the target macroblock reaches the adaptive encoding quality, and the target encoding offset can accurately represent the optimization range between the adaptive encoding quality and the current encoding quality, that is, the target in the current video frame.
  • the frame-level coding stability principle in this embodiment means that, to prevent the coding bit rate from exceeding the current network bandwidth requirement and congesting video encoding, the overall coding parameters of the current video frame must remain roughly unchanged after the coding of the target macroblocks and non-target macroblocks has been optimized. That is, to satisfy the principle, the overall coding offset of the target macroblocks and non-target macroblocks in the current video frame is approximately 0.
  • using the target coding offset of the target macroblocks and the number of target macroblocks, the overall coding offset of the target macroblocks can be calculated; then, according to the number of non-target macroblocks in the current video frame, the non-target coding offset of the non-target macroblocks can be calculated.
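The frame-level stability principle described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the applicant's implementation: the function name `non_target_offset` is hypothetical, offsets are treated as signed values, and the overall offset is taken to be exactly 0 rather than "approximately 0".

```python
def non_target_offset(target_offset: float, n_target: int, n_non_target: int) -> float:
    """Per-macroblock offset for non-target macroblocks under a zero-sum assumption.

    Assumes n_target * target_offset + n_non_target * result == 0, which is one
    reading of "the overall coding offset is approximately 0".
    """
    if n_non_target == 0:
        # No non-target macroblocks to absorb the change; nothing to compute.
        return 0.0
    overall_target_offset = target_offset * n_target
    return -overall_target_offset / n_non_target


# Usage: 10 target macroblocks shifted by -6 are balanced by 30 non-target
# macroblocks shifted by +2.
offset = non_target_offset(-6, 10, 30)
```

Under this reading, the fewer non-target macroblocks there are, the larger the per-macroblock compensation each one has to carry.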
  • a video macroblock-level coding control algorithm can be used to calculate each target separately.
  • the non-target coding offset is also used to adjust the macroblock coding parameters of each non-target macroblock in the current video frame, so as to achieve coding optimization of the current video frame.
  • the technical solution provided by this embodiment analyzes the adaptive coding quality of the target macroblocks from their proportion in the current video frame, and then calculates the target coding offset at which the target macroblocks reach that adaptive coding quality. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. Finally, the target coding offset and the non-target coding offset are used to adjust the macroblock coding parameters of the target macroblocks and non-target macroblocks accordingly.
  • FIG. 2A is a flowchart of a video encoding method according to Embodiment 2 of the present application
  • FIG. 2B is a schematic diagram of a principle of a video encoding process according to Embodiment 2 of the present application.
  • This embodiment is described on the basis of the above-mentioned embodiment. As shown in FIG. 2B , this embodiment mainly explains the calculation process of the target coding offset of the target macroblock and the non-target coding offset of the non-target macroblock in the current video frame.
  • this embodiment may include:
  • S210 Determine a coding offset base value that has been set under the adaptive coding quality and matches the proportion of the target macroblock.
  • the proportion of target macroblocks can be divided in advance into stages suited to different adaptive coding qualities. For example, the proportion is divided into four stages: (0, 0.15], (0.15, 0.35], (0.35, 0.5] and (0.5, 1), each requiring a different adaptive coding quality. The smaller the proportion, the higher the adaptive coding quality of the stage.
  • a matching coding offset base value is set for each adaptive coding quality; the base value ensures that the target macroblocks still reach the adaptive coding quality to the greatest extent possible within the current network bandwidth.
  • the encoding parameter in this embodiment may be the quantization parameter (QP).
  • a frame-level coding control algorithm may be used in this embodiment to calculate the frame-level encoding parameter (QP_base) that meets the current network bandwidth requirement, and the maximum encoding parameter (QP_max) at which the current video frame still meets the encoding requirements is specified externally.
  • the maximum encoding parameter is used to ensure the video encoding quality, indicating that the current network bandwidth requirements are not exceeded to avoid encoding congestion or excessive encoding problems.
  • the frame-level coding adjustable value (QP_diff) of the current video frame is the difference between the frame-level coding parameter and the maximum coding parameter.
  • this embodiment further sets a corresponding macroblock reference offset value (ROI_QP_OFF) for the target macroblock in the current video frame.
  • when the proportion of target macroblocks is less than the proportion specified for the highest coding quality, that is, when it falls in the (0, 0.15] stage, the matching coding offset base value can be set to the frame-level coding adjustable value (QP_diff) of the current video frame.
  • when the proportion of target macroblocks is not less than the proportion specified for the highest coding quality, the matching coding offset base value can be the minimum of the frame-level coding adjustable value (QP_diff) and the macroblock reference offset value (ROI_QP_OFF) of the target macroblock, that is, QP_diff_limit = min(QP_diff, ROI_QP_OFF).
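The selection of the coding offset base value in S210 can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the sign of QP_diff (taken here as QP_max - QP_base) is an assumption because the text only calls it "the difference", and the 0.15 threshold is the upper bound of the first stage given in the example above.

```python
def frame_level_adjustable(qp_base: float, qp_max: float) -> float:
    """QP_diff, assumed here to be QP_max - QP_base (the sign is an assumption)."""
    return qp_max - qp_base


def coding_offset_base(ratio: float, qp_diff: float, roi_qp_off: float,
                       top_stage_upper: float = 0.15) -> float:
    """Pick the coding offset base value from the target-macroblock proportion."""
    if ratio <= 0:
        raise ValueError("ratio must be positive: no target macroblocks to adjust")
    if ratio <= top_stage_upper:
        # Stage (0, 0.15]: the full frame-level adjustable value is available.
        return qp_diff
    # Remaining stages: QP_diff_limit = min(QP_diff, ROI_QP_OFF).
    return min(qp_diff, roi_qp_off)


# Usage with illustrative numbers (not taken from the source).
qp_diff = frame_level_adjustable(qp_base=30, qp_max=40)
base_small_roi = coding_offset_base(0.10, qp_diff, roi_qp_off=6)
base_large_roi = coding_offset_base(0.30, qp_diff, roi_qp_off=6)
```

The sketch stops at the base value; how the inverse influence parameter of S220 scales it is not given in this text, so it is left out.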
  • S220 Calculate the target encoding offset of the target macroblock based on the encoding offset base value and the inverse effect parameter of the proportion of the target macroblock on the adaptive encoding quality.
  • the target coding offset of the target macroblock can be obtained by adjusting the calculated coding offset base value with the inverse influence parameter of the target macroblock proportion on the adaptive coding quality.
  • the calculation method of the target coding offset of the target macroblock may be as follows:
  • the coding bit rate can be regulated to stay stable, avoiding coding congestion or excess coding; that is, the overall offset of the frame-level coding parameter is allowed to lie within the range of the frame-level coding adjustable value.
  • the target coding offset of the target macroblocks in the current video frame and the number of target macroblocks can be used to calculate the overall offset of the target macroblocks in the current video frame. That overall offset can be used to estimate the overall offset of the non-target macroblocks, and the non-target coding offset of the non-target macroblocks is then calculated according to the number of non-target macroblocks in the current video frame.
  • to ensure higher coding quality for the target macroblocks in the current video frame and achieve high-definition display of the region of interest, the target macroblocks need to be encoded with more detailed picture features. The target coding offset can therefore be used to correspondingly raise the macroblock coding parameter that the macroblock-level coding control algorithm calculates for each target macroblock, giving the actual coding parameter of the target macroblock. Encoding the target macroblock with this actual coding parameter retains more detailed picture features and thereby improves its coding quality.
  • the non-target coding offset is likewise used to correspondingly lower the macroblock coding parameter that the macroblock-level coding control algorithm calculates for each non-target macroblock, giving the actual coding parameter of the non-target macroblock. Encoding the non-target macroblock with this actual coding parameter keeps the frame-level coding of the current video frame stable and avoids coding congestion or excess coding.
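As a rough illustration of this adjustment step, the sketch below adds a per-macroblock offset to the parameter produced by macroblock-level rate control and clamps the result. Everything here is an assumption made for illustration: the function name `apply_offsets` is hypothetical, offsets are treated as signed values added to the QP, and the 0 to 51 clamp is the H.264 QP range (other codecs use different ranges).

```python
def apply_offsets(mb_qps, is_target, target_offset, non_target_offset,
                  qp_min=0, qp_max=51):
    """Return the actual coding parameter of each macroblock.

    mb_qps    -- QP chosen by macroblock-level rate control, one per macroblock
    is_target -- True where the macroblock is a target (ROI) macroblock
    """
    actual = []
    for qp, target in zip(mb_qps, is_target):
        offset = target_offset if target else non_target_offset
        # Clamp so the adjusted value stays a legal quantization parameter.
        actual.append(max(qp_min, min(qp_max, qp + offset)))
    return actual


# Usage: one target macroblock and two non-target macroblocks.
actual_qps = apply_offsets([30, 30, 50], [True, False, False], -6, 2)
```

Clamping can break an exact zero-sum balance across the frame, which fits the text's wording that the overall offset is only "approximately" 0.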
  • the target coding offset of the target macroblock is calculated from the coding offset base value matching the proportion of target macroblocks and from the inverse influence parameter of that proportion on the adaptive coding quality.
  • the target coding offset of the target macroblocks, the number of target macroblocks, and the number of non-target macroblocks in the current video frame are used to calculate the non-target coding offset of the non-target macroblocks, which keeps video frame-level coding stable and avoids coding congestion or excess coding.
  • FIG. 3 is a flowchart of a video coding method according to Embodiment 3 of the present application.
  • This embodiment is described on the basis of the above-mentioned embodiment. Because the regions of interest in the current video frame are not evenly distributed, some non-target macroblocks contain video pictures of both the region of interest and the region of non-interest. In this embodiment, the non-target macroblocks can therefore include both local non-target macroblocks and complete non-target macroblocks: a local non-target macroblock is composed of video pictures of the region of interest and of the region of non-interest, while a complete non-target macroblock is composed only of video pictures of the region of non-interest.
  • This embodiment mainly explains the calculation of the non-target coding offsets of the local non-target macroblocks and the complete non-target macroblocks.
  • this embodiment may include:
  • S310 Use the target area detection model to detect the type of each macroblock in the current video frame, and count the number of target macroblocks and the number of non-target macroblocks in the current video frame to calculate the proportion of target macroblocks in the current video frame.
  • a target region detection model that can accurately identify the uniform, specific picture features of the region of interest is preset, and the model is used to traverse each macroblock in the current video frame and judge whether the video pictures in the macroblock all belong to the region of interest. If all of them do, the macroblock is determined to be a target macroblock. If part of the video picture belongs to the region of interest and part belongs to the region of non-interest, the macroblock is determined to be a local non-target macroblock. If all of the video pictures belong to the region of non-interest, the macroblock is determined to be a complete non-target macroblock. At the same time, the number of target macroblocks, the number of local non-target macroblocks, and the number of complete non-target macroblocks in the current video frame are counted.
  • a VPx skin color detection model can be embedded in the VPx open source encoder, and the VPx skin color detection model has higher detection efficiency for skin color regions in a video frame.
  • the width and height of the current video frame are obtained as width and height respectively. The current video frame is then traversed with a 16*16 macroblock as the coding unit, dividing it into multiple macroblocks (x, y), where x takes values in [0, width/16] and y takes values in [0, height/16]. Skin color detection is then performed on each macroblock with the VPx skin color detection model, which outputs the target macroblocks, local non-target macroblocks, and complete non-target macroblocks; their numbers are counted as n1, n2, and n3, respectively.
  • the proportion of target macroblocks can be
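The classification and counting in S310 can be sketched as follows. The source's expression for the proportion is not reproduced in this text, so the sketch assumes it is n1 / (n1 + n2 + n3). The per-pixel boolean mask stands in for the output of a skin color detector; `classify_macroblocks`, `count_types`, `target_ratio`, and `MBType` are hypothetical names, and no real VPx API is used.

```python
from enum import Enum

MB = 16  # macroblock edge length in pixels


class MBType(Enum):
    TARGET = 1               # every pixel is in the region of interest
    LOCAL_NON_TARGET = 2     # mixed ROI and non-ROI pixels
    COMPLETE_NON_TARGET = 3  # no ROI pixels


def classify_macroblocks(mask):
    """mask: 2D list of bools (height x width), True where a pixel is ROI."""
    height, width = len(mask), len(mask[0])
    types = {}
    for by in range(height // MB):
        for bx in range(width // MB):
            roi_pixels = sum(
                mask[y][x]
                for y in range(by * MB, (by + 1) * MB)
                for x in range(bx * MB, (bx + 1) * MB)
            )
            if roi_pixels == MB * MB:
                types[(bx, by)] = MBType.TARGET
            elif roi_pixels == 0:
                types[(bx, by)] = MBType.COMPLETE_NON_TARGET
            else:
                types[(bx, by)] = MBType.LOCAL_NON_TARGET
    return types


def count_types(types):
    """Return (n1, n2, n3): target, local non-target, complete non-target."""
    kinds = list(types.values())
    return (kinds.count(MBType.TARGET),
            kinds.count(MBType.LOCAL_NON_TARGET),
            kinds.count(MBType.COMPLETE_NON_TARGET))


def target_ratio(n1, n2, n3):
    """Assumed definition: target macroblocks over all macroblocks."""
    total = n1 + n2 + n3
    return n1 / total if total else 0.0
```

The sketch ignores partial macroblocks at the right and bottom edges when the frame size is not a multiple of 16; a real encoder pads or handles those separately.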
  • because the overall coding offset of the target macroblocks and non-target macroblocks in the current video frame is roughly 0, the target coding offset of the target macroblocks and the number of target macroblocks can be used to calculate the overall coding offset of the target macroblocks; then, according to the number of non-target macroblocks in the current video frame, the non-target coding offset reference amount of the non-target macroblocks can be calculated.
  • the non-target coding offset reference amount for non-target macroblocks can be […], where QP_1 is the target coding offset of the target macroblock. The accuracy of the non-target coding offsets of the local non-target macroblocks and complete non-target macroblocks is ensured by subsequently adjusting this reference amount.
  • because a local non-target macroblock is composed of video pictures of the region of interest and of the region of non-interest, while a complete non-target macroblock is composed only of video pictures of the region of non-interest, the coding of local non-target macroblocks is required to act as a coding buffer between the target macroblocks and the complete non-target macroblocks. That is, based on the frame-level coding adjustable value of the current video frame, a variation that satisfies the frame-level coding adjustable value is set, so that the current video frame still meets the current rate-control requirements.
  • the calculated non-target coding offset reference amount is lowered by this variation to give the non-target coding offset of the local non-target macroblocks; at the same time, the reference amount is raised by this variation to give the non-target coding offset of the complete non-target macroblocks.
  • in this way, with the frame-level coding of the current video frame kept stable, the non-target coding offset of the local non-target macroblocks is lower than that of the complete non-target macroblocks, so that the coding quality of the local non-target macroblocks can be better than that of the complete non-target macroblocks.
  • the corresponding coding offsets are respectively used to adjust the macroblock coding parameters of each macroblock, so as to obtain the actual coding parameters of each macroblock.
  • the non-target coding offset of the local non-target macroblock can be
  • the non-target encoding offset for a completely non-target macroblock can be
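The source's expressions for these two offsets are not reproduced in this text. The sketch below only illustrates the described relationship: a reference amount derived from an assumed zero-sum balance, lowered by a variation for local non-target macroblocks and raised by the same variation for complete non-target macroblocks. The function name and the `variation` parameter are hypothetical, and how the variation is derived from QP_diff is not shown.

```python
def split_non_target_offsets(qp_1, n1, n2, n3, variation):
    """Return (local_offset, complete_offset) for non-target macroblocks.

    qp_1       -- target coding offset of the target macroblocks
    n1, n2, n3 -- counts of target, local non-target, complete non-target blocks
    variation  -- hypothetical buffer amount; its derivation is not given here
    """
    n_non_target = n2 + n3
    if n_non_target == 0:
        return 0.0, 0.0
    # Reference amount assumed from the zero-sum reading of frame-level stability.
    reference = -(qp_1 * n1) / n_non_target
    return reference - variation, reference + variation


# Usage with illustrative numbers (not taken from the source).
local_off, complete_off = split_non_target_offsets(-6, 10, 10, 20, 1.0)
```

A symmetric variation keeps the frame exactly balanced only when n2 equals n3; otherwise the total drifts slightly, which is consistent with the overall offset being described as roughly 0.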
  • each macroblock may then be encoded in turn using its actual coding parameter, adjusted according to the corresponding coding offset.
  • the technical solution provided by this embodiment analyzes the adaptive coding quality of the target macroblocks from their proportion in the current video frame, and then calculates the target coding offset at which the target macroblocks reach that adaptive coding quality. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. The two offsets are used to adjust the macroblock coding parameters of the target macroblocks and non-target macroblocks accordingly, which optimizes the coding of both kinds of macroblock in the current video frame without calculating coding costs in unnecessary coding partition modes. This greatly reduces the complexity and coding overhead of video coding and keeps video coding efficient; at the same time, after the macroblock coding parameters have been adjusted with the target and non-target coding offsets, the coding quality of the video is further improved while that efficiency is preserved.
  • FIG. 4 is a schematic structural diagram of an apparatus for video coding according to Embodiment 4 of the present application. As shown in FIG. 4 , the apparatus may include:
  • the target offset calculation module 410 is configured to calculate the target coding offset when the target macroblock reaches the adaptive coding quality, based on the proportion of the target macroblock in the current video frame; the non-target offset calculation module 420 is configured to calculate the non-target coding offset of the non-target macroblocks in the current video frame, based on the frame-level coding stability principle and the target coding offset; the coding parameter adjustment module 430 is configured to use the target coding offset and the non-target coding offset to correspondingly adjust the macroblock coding parameters of the target macroblock and the non-target macroblock.
  • the technical solution provided by this embodiment analyzes the adaptive coding quality of the target macroblocks from their proportion in the current video frame, and then calculates the target coding offset at which the target macroblocks reach that adaptive coding quality. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. The two offsets are used to adjust the macroblock coding parameters of the target macroblocks and non-target macroblocks accordingly, which optimizes the coding of both kinds of macroblock in the current video frame without calculating coding costs in unnecessary coding partition modes. This greatly reduces the complexity and coding overhead of video coding and keeps video coding efficient; at the same time, after the macroblock coding parameters have been adjusted with the target and non-target coding offsets, the coding quality of the video is further improved while that efficiency is preserved.
  • the video encoding apparatus provided in this embodiment can be applied to the video encoding method provided in any of the foregoing embodiments, and has corresponding functions and effects.
  • FIG. 5 is a schematic structural diagram of a computer device according to Embodiment 5 of the present application.
  • the computer device includes a processor 50, a storage device 51 and a communication device 52. The number of processors 50 in the computer device may be one or more; one processor 50 is taken as an example in FIG. 5. The processor 50, the storage device 51 and the communication device 52 in the computer device can be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 5.
  • a computer device provided by this embodiment can be used to execute the video coding method provided by any of the above embodiments, and has corresponding functions and effects.
  • Embodiment 6 of the present application further provides a computer-readable storage medium on which a computer program is stored.
  • when the program is executed by a processor, the video encoding method in any of the foregoing embodiments can be implemented.
  • in the storage medium containing computer-executable instructions provided by an embodiment of the present application, the computer-executable instructions are not limited to the above-mentioned method operations, and can also execute related operations in the video coding method provided by any embodiment of the present application.
  • the present application can be implemented by software and necessary general-purpose hardware, and can also be implemented by hardware.
  • the technical solution of the present application can in essence be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk or an optical disc, and includes multiple instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute the embodiments of the present application.
  • a computer-readable storage medium may be a non-transitory storage medium.
  • the multiple units and modules included are only divided according to functional logic, but are not limited to the above-mentioned division, as long as the corresponding functions can be realized; in addition, the names of the multiple functional units are only for the convenience of distinguishing them from each other, and are not used to limit the protection scope of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

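Disclosed herein are a video encoding method, apparatus, device, and storage medium. The video encoding method includes: calculating, based on the proportion of target macroblocks in a current video frame, a target coding offset at which the target macroblocks reach an adapted coding quality; calculating, based on a frame-level coding stability principle and the target coding offset, a non-target coding offset for the non-target macroblocks in the current video frame; and adjusting the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.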

Description

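Video encoding method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202110372888.1, filed with the China Patent Office on April 7, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of video encoding, for example, to a video encoding method, apparatus, device, and storage medium.
Background
With the rapid development of Internet technology, social entertainment products offer increasingly rich functions. As a basic function of such products, video encoding plays an important role in video calls, short-video recording, live video streaming, and similar fields. Moreover, for regions of interest (ROI) within a video frame that require high-definition display, adaptive encoding must be applied specifically to those regions to improve their playback quality.
Typically, a third-party ROI detection device is used to identify the regions of interest and the regions of non-interest within a video frame. An exhaustive traversal then computes, for every coding unit in both kinds of region, the rate-distortion cost of each predictive coding mode at each recursion depth, as well as the rate-distortion cost, under each predictive coding mode, of each transform unit obtained by partitioning that coding unit. The optimal coding mode with the lowest cost is then selected for every coding unit in the regions of interest and non-interest, so as to guarantee the coding quality of both.
However, this ROI encoding algorithm is overly complex and incurs a large coding overhead, which greatly reduces the efficiency of video encoding.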
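Summary
This application provides a video encoding method, apparatus, device, and storage medium that reduce the complexity and computational overhead of video encoding and improve the coding quality of the video while keeping the encoding efficient.
This application provides a video encoding method, including:
calculating, based on the proportion of target macroblocks in a current video frame, a target coding offset at which the target macroblocks reach an adapted coding quality;
calculating, based on a frame-level coding stability principle and the target coding offset, a non-target coding offset for the non-target macroblocks in the current video frame;
adjusting the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
This application provides a video encoding apparatus, including:
a target offset calculation module, configured to calculate, based on the proportion of target macroblocks in a current video frame, a target coding offset at which the target macroblocks reach an adapted coding quality;
a non-target offset calculation module, configured to calculate, based on a frame-level coding stability principle and the target coding offset, a non-target coding offset for the non-target macroblocks in the current video frame;
a coding parameter adjustment module, configured to adjust the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
This application provides a computer device, including:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the video encoding method described above.
This application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video encoding method described above.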
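Brief Description of the Drawings
FIG. 1A is a flowchart of a video encoding method provided in Embodiment 1 of this application;
FIG. 1B is a schematic diagram of the principle of a video encoding process provided in Embodiment 1 of this application;
FIG. 2A is a flowchart of a video encoding method provided in Embodiment 2 of this application;
FIG. 2B is a schematic diagram of the principle of a video encoding process provided in Embodiment 2 of this application;
FIG. 3 is a flowchart of a video encoding method provided in Embodiment 3 of this application;
FIG. 4 is a schematic structural diagram of a video encoding apparatus provided in Embodiment 4 of this application;
FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of this application.
Detailed Description
This application is described below with reference to the drawings and embodiments. The specific embodiments described here serve only to explain this application. For ease of description, the drawings show only the parts relevant to this application.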
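Embodiment 1
FIG. 1A is a flowchart of a video encoding method provided in Embodiment 1 of this application. This embodiment is applicable to any scenario in which a video needs to be encoded. The video encoding method provided in this embodiment may be performed by the video encoding apparatus provided in the embodiments of this application. The apparatus may be implemented in software and/or hardware and integrated into the computer device that performs the method; the device may be, for example, a backend server that takes part in video data exchange.
Referring to FIG. 1A, the method may include:
S110: Based on the proportion of target macroblocks in the current video frame, calculate the target coding offset at which the target macroblocks reach the adapted coding quality.
Different regions of each frame of a video show different pictures, and users are interested in certain specific picture regions. The picture in each region of interest within a video frame should therefore be encoded with as high a quality as possible, to keep that region sharp and improve its video quality. In this embodiment, a region of interest may be a local region of the video frame that has a uniform, specific picture feature by which it can be accurately identified, for example a face region that can be found by skin-color detection in video calls or live video streaming.
When the current video frame is encoded, as shown in FIG. 1B, the frame is usually divided into multiple macroblocks, each of which may be a 16×16 block of pixels, and the frame is then encoded with the macroblock as the coding unit. This embodiment can therefore improve the video quality of a region of interest by optimizing how the macroblocks containing that region are encoded. Here, the current video frame is original video data that has not undergone any data processing.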
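The macroblock partition described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `macroblock_grid` is hypothetical, and rounding up at the right and bottom edges is an assumption, since the text does not say how frames whose dimensions are not multiples of 16 are handled.

```python
import math

MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def macroblock_grid(width, height, mb_size=MB_SIZE):
    """Return (columns, rows) of macroblocks covering a width x height frame.

    Assumption: partially covered edge blocks count as whole macroblocks.
    """
    cols = math.ceil(width / mb_size)
    rows = math.ceil(height / mb_size)
    return cols, rows


cols, rows = macroblock_grid(1280, 720)
print(cols, rows, cols * rows)  # 80 45 3600
```

A 1280×720 frame therefore yields 3600 macroblocks, each of which is later classified and given its own coding parameter.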
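In this embodiment, the shared specific picture feature of the regions of interest is used to determine whether the picture in each macroblock of the current video frame belongs to a region of interest. The macroblocks of the current video frame are thereby divided into target macroblocks and non-target macroblocks, and the proportion of target macroblocks in the current video frame can be determined at the same time. The entire picture in a target macroblock belongs to a region of interest, whereas the picture in a non-target macroblock includes content outside the regions of interest.
If the proportion of target macroblocks in the current video frame is high, many target macroblocks need their coding quality improved beyond the current coding quality. Higher coding quality requires preserving more detail in the picture, which raises the coding bit rate of the target macroblocks. If many target macroblocks all used the encoding mode of the highest coding quality, the coding bit rate of the current video frame would become too high and exceed what the current network bandwidth allows, causing encoding congestion. Therefore, to keep the encoding of the current video frame stable and avoid coding congestion or coding surplus, this embodiment sets different adapted coding qualities in inverse relation to the proportion of target macroblocks in the current video frame, and requires every adapted coding quality to be better than the current coding quality. For example, the higher the proportion of target macroblocks, the larger the total amount of target-macroblock encoding in the current video frame, so the adapted coding quality is set with a smaller improvement over the current coding quality; the lower the proportion of target macroblocks, the smaller that total amount, so the adapted coding quality has a larger improvement over the current coding quality. The target coding offset at which the target macroblocks reach the adapted coding quality is then calculated. This offset represents the improvement margin between the adapted coding quality and the current coding quality: the larger the proportion of target macroblocks in the current video frame, the smaller the target coding offset, and the smaller the proportion, the larger the target coding offset.
S120: Based on the frame-level coding stability principle and the target coding offset, calculate the non-target coding offset of the non-target macroblocks in the current video frame.
Optionally, the frame-level coding stability principle in this embodiment means that, to prevent the coding bit rate from exceeding what the current network bandwidth allows and causing encoding congestion, the overall coding parameter of the current video frame should remain roughly unchanged after the target and non-target macroblocks have been encoded with the optimization. In other words, to satisfy the principle, the overall coding offset of the target and non-target macroblocks in the current video frame is approximately 0. The overall coding offset of the target macroblocks can therefore be calculated from their target coding offset and their number, and the non-target coding offset of the non-target macroblocks can then be calculated from the number of non-target macroblocks in the current video frame.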
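One way to write the stability principle down, offered as a worked form rather than the applicant's exact formula: suppose $n_t$ target macroblocks are each shifted by the target coding offset $QP_1$, and $n_{nt}$ non-target macroblocks are each shifted by $QP_2$ in the opposite direction. The symbols $n_t$, $n_{nt}$, and $QP_2$ are introduced here for illustration, and the text only requires the net offset to be approximately zero.

```latex
n_t \cdot QP_1 \;-\; n_{nt} \cdot QP_2 \;\approx\; 0
\quad\Longrightarrow\quad
QP_2 \;\approx\; \frac{n_t}{n_{nt}} \, QP_1
```

Under this reading, a frame with few target macroblocks spreads their combined offset thinly over many non-target macroblocks, so each non-target macroblock changes only slightly.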
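S130: Adjust the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
Optionally, after the target coding offset of the target macroblocks and the non-target coding offset of the non-target macroblocks in the current video frame are obtained, a macroblock-level rate control algorithm can calculate, for each target macroblock and each non-target macroblock, the macroblock coding parameter that satisfies the current network bandwidth requirement. Then, to improve the coding quality of the target macroblocks and ensure high-definition display of the regions of interest in the current video frame, this embodiment uses the target coding offset to adjust the macroblock coding parameter of each target macroblock, so that the target macroblocks are encoded with more picture detail, improving the coding quality of the video while keeping the encoding efficient. At the same time, to keep frame-level encoding stable overall, the non-target coding offset is used to adjust the macroblock coding parameter of each non-target macroblock, thereby optimizing the encoding of the current video frame.
In the technical solution of this embodiment, the adapted coding quality of the target macroblocks is derived from their proportion in the current video frame, and the target coding offset at which the target macroblocks reach that quality is calculated. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. The two offsets are used to adjust the macroblock coding parameters of the target and non-target macroblocks respectively, optimizing the encoding of both in the current video frame. No coding cost needs to be calculated for unnecessary coding partition modes, which greatly reduces the complexity and overhead of video encoding and keeps it efficient. Moreover, adjusting the macroblock coding parameters of the target and non-target macroblocks with the target and non-target coding offsets improves the coding quality of the video while keeping the encoding efficient.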
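Embodiment 2
FIG. 2A is a flowchart of a video encoding method provided in Embodiment 2 of this application, and FIG. 2B is a schematic diagram of the principle of a video encoding process provided in Embodiment 2 of this application. This embodiment builds on the embodiments above. As shown in FIG. 2B, it mainly explains how the target coding offset of the target macroblocks and the non-target coding offset of the non-target macroblocks in the current video frame are calculated.
Optionally, as shown in FIG. 2A, this embodiment may include:
S210: Determine the coding offset base value that has been set, under the adapted coding quality, to match the proportion of target macroblocks.
Optionally, to match the proportion of target macroblocks accurately to an adapted coding quality, this embodiment may divide the proportion range in advance into stages suited to different adapted coding qualities, for example the four stages (0, 0.15], (0.15, 0.35], (0.35, 0.5], and (0.5, 1), each requiring a different adapted coding quality. The smaller the proportion of target macroblocks, the higher the adapted coding quality of the stage it falls in.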
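The stage lookup can be sketched with a sorted list of upper bounds. The boundaries 0.15, 0.35, and 0.5 come from the example in the text; the names `STAGE_UPPER_BOUNDS` and `ratio_stage` are hypothetical, and rejecting proportions of exactly 0 or 1 is an assumption, since the example stages leave both endpoints open.

```python
from bisect import bisect_left

# Upper bounds of the first three right-closed stages from the text:
# (0, 0.15], (0.15, 0.35], (0.35, 0.5]; the last stage is (0.5, 1).
STAGE_UPPER_BOUNDS = [0.15, 0.35, 0.5]


def ratio_stage(roi_ratio):
    """Return the stage index 0..3 for a target-macroblock proportion."""
    if not 0.0 < roi_ratio < 1.0:
        # Assumption: proportions outside (0, 1) are not covered by the stages.
        raise ValueError("roi_ratio must lie strictly between 0 and 1")
    # bisect_left puts a value equal to a bound into the lower stage,
    # which matches the right-closed intervals.
    return bisect_left(STAGE_UPPER_BOUNDS, roi_ratio)


print([ratio_stage(r) for r in (0.1, 0.15, 0.2, 0.5, 0.8)])  # [0, 0, 1, 2, 3]
```

Stage 0 corresponds to the smallest proportions and therefore to the highest adapted coding quality.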
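To represent the adapted coding quality of each stage accurately, this embodiment sets a matching coding offset base value under each adapted coding quality. The coding offset base value ensures that, after their coding parameters are adjusted within the current network bandwidth, the target macroblocks still come as close as possible to the adapted coding quality.
The coding parameter in this embodiment may be the quantization parameter (QP). Before the current video frame is encoded, this embodiment may use a frame-level rate control algorithm to calculate a frame-level coding parameter (QP_base) that satisfies the current network bandwidth requirement, while a maximum coding parameter (QP_max) that satisfies the encoding requirement of the current video frame is specified externally. The maximum coding parameter safeguards video coding quality: it is the upper coding limit that stays within the current network bandwidth requirement so as to avoid coding congestion or coding surplus. If the frame-level coding parameter, after adjustment, exceeds this maximum coding parameter, the coding bit rate underflows and fails to reach the preset bit rate, so the coding quality of the video frame is worse than expected. In other words, the frame-level coding adjustable value (QP_diff) of the current video frame is the difference between the frame-level coding parameter and the maximum coding parameter. In addition, to support adaptive bit-rate adjustment of the target macroblocks, this embodiment also sets a corresponding macroblock reference offset value (ROI_QP_OFF) for the target macroblocks in the current video frame.
When the proportion of target macroblocks is less than the proportion designated for the highest coding quality, that is, when it falls in the (0, 0.15] stage, the matching coding offset base value may be set to the frame-level coding adjustable value (QP_diff) of the current video frame. When the proportion of target macroblocks is not less than the proportion designated for the highest coding quality, the matching coding offset base value may be the smaller of the frame-level coding adjustable value (QP_diff) and the macroblock reference offset value (ROI_QP_OFF) of the target macroblocks, that is, QP_diff_limit = min(QP_diff, ROI_QP_OFF).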
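The base-value selection might look like the sketch below. Two points are assumptions: the text calls QP_diff "the difference" between QP_base and QP_max without giving the order, so `qp_max - qp_base` is used here to keep the value non-negative; and the 0.15 threshold is treated as inclusive to match the (0, 0.15] stage. The function name `offset_base` is hypothetical.

```python
def offset_base(roi_ratio, qp_base, qp_max, roi_qp_off, top_quality_ratio=0.15):
    """Pick the coding offset base value for a target-macroblock proportion.

    qp_base    -- frame-level QP from frame-level rate control (QP_base)
    qp_max     -- externally specified maximum QP (QP_max)
    roi_qp_off -- macroblock reference offset value (ROI_QP_OFF)
    """
    # Assumption: QP_diff is taken as QP_max - QP_base.
    qp_diff = qp_max - qp_base
    if roi_ratio <= top_quality_ratio:
        # Highest-quality stage: the full frame-level adjustable value.
        return qp_diff
    # Other stages: QP_diff_limit = min(QP_diff, ROI_QP_OFF).
    return min(qp_diff, roi_qp_off)


print(offset_base(0.10, qp_base=30, qp_max=40, roi_qp_off=6))  # 10
print(offset_base(0.30, qp_base=30, qp_max=40, roi_qp_off=6))  # 6
```

The cap by ROI_QP_OFF only takes effect outside the first stage, so frames with a small region of interest can use the whole adjustable range.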
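S220: Calculate the target coding offset of the target macroblocks based on the coding offset base value and an inverse influence parameter that captures how the proportion of target macroblocks affects the adapted coding quality.
In this embodiment, once the coding offset base value matching the proportion of target macroblocks has been determined, note that the smaller the proportion, the higher the coding quality the target macroblocks reach, so the proportion has an inverse influence on the adapted coding quality. Adjusting the coding offset base value by this inverse influence parameter yields the target coding offset of the target macroblocks.
For example, the target coding offset of the target macroblocks may be calculated as follows:
1) When the proportion of target macroblocks is in the (0, 0.15] stage, the target coding offset of the target macroblocks is $QP_1 = \text{QP\_diff} \times (1 - \text{ROI\_ratio}^2)$, where ROI_ratio is the proportion of target macroblocks.
2) When the proportion of target macroblocks is in the (0.15, 0.35] stage, the target coding offset of the target macroblocks is $QP_1 = \text{QP\_diff\_limit} \times (1 - \text{ROI\_ratio}^2)$.
3) When the proportion of target macroblocks is in the (0.35, 0.5] stage, the target coding offset of the target macroblocks is $QP_1 = \text{QP\_diff\_limit} \times (1 - \text{ROI\_ratio}^2) - \text{QP\_diff\_limit}/6$.
4) When the proportion of target macroblocks is in the (0.5, 1) stage, the target coding offset of the target macroblocks is $QP_1 = \text{QP\_diff\_limit} \times (1 - \text{ROI\_ratio})$.
As the formulas above show, the target coding offset decreases as the proportion of target macroblocks increases; that is, the coding-quality improvement for each target macroblock becomes smaller.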
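The four cases above translate directly into a piecewise function. The sketch below takes QP_diff and QP_diff_limit as inputs and reads the flattened "ROI_ratio 2" in the published text as a square; the function name `target_offset` and the rejection of proportions outside (0, 1) are assumptions.

```python
def target_offset(roi_ratio, qp_diff, qp_diff_limit):
    """Target coding offset QP_1 for a target-macroblock proportion."""
    if not 0.0 < roi_ratio < 1.0:
        # Assumption: the example stages do not cover 0 or 1.
        raise ValueError("roi_ratio must lie strictly between 0 and 1")
    if roi_ratio <= 0.15:
        return qp_diff * (1 - roi_ratio ** 2)
    if roi_ratio <= 0.35:
        return qp_diff_limit * (1 - roi_ratio ** 2)
    if roi_ratio <= 0.5:
        return qp_diff_limit * (1 - roi_ratio ** 2) - qp_diff_limit / 6
    return qp_diff_limit * (1 - roi_ratio)


# Offsets shrink as the proportion of target macroblocks grows.
for ratio in (0.1, 0.2, 0.4, 0.8):
    print(ratio, round(target_offset(ratio, qp_diff=10, qp_diff_limit=6), 2))
```

With QP_diff = 10 and QP_diff_limit = 6, the sample proportions give a steadily decreasing offset, which is the behaviour the text describes.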
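S230: Calculate the non-target coding offset of the non-target macroblocks in the current video frame based on the target coding offset, the frame-level coding adjustable value of the current video frame, and the numbers of target macroblocks and non-target macroblocks in the current video frame.
Optionally, when the coding parameters of the current video frame are adjusted within the range given by the frame-level coding adjustable value, the coding bit rate can be kept steady and coding congestion or coding surplus avoided. It is therefore sufficient for the overall offset of the frame-level coding parameter to stay within the range of the frame-level coding adjustable value. The overall offset of the target macroblocks in the current video frame can be calculated from their target coding offset and their number. Within the range given by the frame-level coding adjustable value, the overall offset of the target macroblocks is then used to estimate the overall offset of the non-target macroblocks, and the non-target coding offset of the non-target macroblocks is calculated from the number of non-target macroblocks in the current video frame.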
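A possible reading of S230, sketched under stated assumptions: the total offset of the target macroblocks is spread evenly over the non-target macroblocks, and the per-macroblock result is capped at QP_diff. The text does not say how "within the range" of the adjustable value is enforced, so the `min` clamp and the name `non_target_offset` are illustrative choices only.

```python
def non_target_offset(target_offset, n_target, n_non_target, qp_diff):
    """Per-macroblock offset for non-target macroblocks.

    Assumptions: the combined target offset is balanced evenly across the
    non-target macroblocks, and the result is capped at QP_diff.
    """
    if n_non_target == 0:
        return 0.0  # nothing to compensate with
    balanced = target_offset * n_target / n_non_target
    return min(balanced, qp_diff)


print(non_target_offset(6.0, n_target=10, n_non_target=30, qp_diff=10))  # 2.0
print(non_target_offset(6.0, n_target=30, n_non_target=10, qp_diff=10))  # 10
```

In the second call the balanced value would be 18, so the cap applies and the frame is no longer exactly balanced; whether that is acceptable depends on how the rate controller treats the residual.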
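S240: Use the target coding offset to adjust the macroblock coding parameter of each target macroblock upward, obtaining the actual coding parameter of that target macroblock.
Optionally, to give the target macroblocks in the current video frame a higher coding quality and achieve high-definition display of the regions of interest, the target macroblocks should be encoded with more picture detail. The target coding offset can therefore be used to adjust upward the macroblock coding parameter that the macroblock-level rate control algorithm calculated for each target macroblock, giving the actual coding parameter of that target macroblock. Encoding the target macroblock with this actual coding parameter captures more picture detail and thus improves its coding quality.
S250: Use the non-target coding offset to adjust the macroblock coding parameter of each non-target macroblock downward, obtaining the actual coding parameter of that non-target macroblock.
Optionally, to keep the frame-level encoding of the current video frame stable, after adjusting the macroblock coding parameters of the target macroblocks upward, this embodiment also uses the non-target coding offset to adjust downward the macroblock coding parameter that the macroblock-level rate control algorithm calculated for each non-target macroblock, giving the actual coding parameter of that non-target macroblock. Encoding the non-target macroblock with this actual coding parameter keeps the frame-level encoding of the current video frame stable and avoids coding congestion or coding surplus.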
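Steps S240 and S250 can be mirrored literally as shown below. The sketch follows only the up and down directions stated in the text; whether a larger parameter value means finer or coarser quantization depends on the encoder's convention, which the sketch does not assume. The function name `apply_offsets` and the list-based inputs are hypothetical.

```python
def apply_offsets(mb_params, is_target, target_offset, non_target_offset):
    """Return the actual coding parameter of each macroblock.

    mb_params -- per-macroblock parameters from macroblock-level rate control
    is_target -- parallel list of booleans, True for target macroblocks
    """
    actual = []
    for param, target in zip(mb_params, is_target):
        if target:
            actual.append(param + target_offset)      # S240: adjust upward
        else:
            actual.append(param - non_target_offset)  # S250: adjust downward
    return actual


params = [30, 30, 30, 30]
flags = [True, False, False, False]
adjusted = apply_offsets(params, flags, target_offset=6, non_target_offset=2)
print(adjusted)                      # [36, 28, 28, 28]
print(sum(adjusted) == sum(params))  # True: the frame total is unchanged
```

In the usage example one target macroblock moves by 6 and three non-target macroblocks move by 2 each, so the frame-level sum stays the same, which is the balance the stability principle asks for.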
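In the technical solution of this embodiment, the target coding offset of the target macroblocks is calculated from the coding offset base value matching their proportion and from the inverse influence parameter that captures how that proportion affects the adapted coding quality, which ensures that the encoding optimization of the target macroblocks is accurate. At the same time, within the range given by the frame-level coding adjustable value of the current video frame, the non-target coding offset of the non-target macroblocks is calculated from the target coding offset and the numbers of target and non-target macroblocks in the current video frame, which keeps frame-level encoding stable and avoids coding congestion or coding surplus.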
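Embodiment 3
FIG. 3 is a flowchart of a video encoding method provided in Embodiment 3 of this application. This embodiment builds on the embodiments above. Because the regions of interest are not evenly distributed within the current video frame, some non-target macroblocks contain both region-of-interest picture content and non-region-of-interest picture content. In this embodiment the non-target macroblocks may therefore be of two kinds, partial non-target macroblocks and full non-target macroblocks: a partial non-target macroblock contains both region-of-interest and non-region-of-interest picture content, whereas a full non-target macroblock contains only non-region-of-interest picture content. This embodiment mainly explains how the non-target coding offsets of the partial and full non-target macroblocks are calculated.
Optionally, as shown in FIG. 3, this embodiment may include:
S310: Use a target region detection model to detect the type of each macroblock in the current video frame, and count the target macroblocks and non-target macroblocks in the current video frame in order to calculate the proportion of target macroblocks.
Optionally, for each kind of region of interest, this embodiment presets a target region detection model that can accurately recognize the uniform, specific picture feature of that region of interest. The model traverses every macroblock in the current video frame to judge whether the picture in the macroblock consists entirely of region-of-interest content. If it does, the macroblock is determined to be a target macroblock. If part of the picture is region-of-interest content and part is non-region-of-interest content, the macroblock is determined to be a partial non-target macroblock. If the picture consists entirely of non-region-of-interest content, the macroblock is determined to be a full non-target macroblock. The number of target macroblocks in the current video frame, and the numbers of partial and full non-target macroblocks among the non-target macroblocks, are counted at the same time and used to calculate the proportion of target macroblocks in the current video frame.
For example, if the region of interest is a skin-color region, this embodiment may embed the VPx skin-color detection model in the VPx open-source encoder; the VPx skin-color detection model detects skin-color regions in video frames efficiently.
In that case, the width and height of the current video frame are obtained as width and height. The current video frame is then traversed with 16×16 macroblocks as the coding unit, dividing it into multiple macroblocks (x, y), where x takes values in [0, width/16] and y takes values in [0, height/16]. The VPx skin-color detection model then performs skin-color detection on each macroblock and outputs one of three macroblock types: target macroblock, partial non-target macroblock, or full non-target macroblock. The numbers of target, partial non-target, and full non-target macroblocks are counted as n1, n2, and n3 respectively, and the proportion of target macroblocks in the current video frame may then be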
Figure PCTCN2022082675-appb-000001
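The classification and counting in S310 can be sketched on a per-pixel boolean mask. Everything here is illustrative: the mask stands in for the output of a skin-color detector (the VPx model itself is not reproduced), the function names are hypothetical, and the proportion is taken as n1 / (n1 + n2 + n3), which is an assumption because the published formula survives only as an image reference.

```python
def classify_macroblocks(roi_mask, mb_size=16):
    """Count target, partial non-target and full non-target macroblocks.

    roi_mask -- list of rows of booleans, True where a pixel belongs to
                the region of interest (a stand-in for detector output).
    """
    height, width = len(roi_mask), len(roi_mask[0])
    n1 = n2 = n3 = 0
    for top in range(0, height, mb_size):
        for left in range(0, width, mb_size):
            pixels = [
                roi_mask[y][x]
                for y in range(top, min(top + mb_size, height))
                for x in range(left, min(left + mb_size, width))
            ]
            if all(pixels):
                n1 += 1  # target: entirely region of interest
            elif any(pixels):
                n2 += 1  # partial non-target: mixed content
            else:
                n3 += 1  # full non-target: no region of interest
    return n1, n2, n3


def roi_ratio(n1, n2, n3):
    """Assumed form of the proportion: targets over all macroblocks."""
    return n1 / (n1 + n2 + n3)


# 32x32 frame: the region of interest covers x < 24 in the top 16 rows.
mask = [[(x < 24) and (y < 16) for x in range(32)] for y in range(32)]
counts = classify_macroblocks(mask)
print(counts, roi_ratio(*counts))  # (1, 1, 2) 0.25
```

The top-left macroblock is fully covered, the top-right one is half covered, and the bottom two contain no region-of-interest pixels, giving n1 = 1, n2 = 1, n3 = 2.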
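S320: Based on the proportion of target macroblocks in the current video frame, calculate the target coding offset at which the target macroblocks reach the adapted coding quality.
S330: Based on the frame-level coding stability principle and the target coding offset, calculate the non-target coding offset reference amount of the non-target macroblocks.
To satisfy the frame-level coding stability principle, the overall coding offset of the target and non-target macroblocks in the current video frame is approximately 0. The overall coding offset of the target macroblocks can therefore be calculated from their target coding offset and their number, and the non-target coding offset reference amount of the non-target macroblocks can then be calculated from the number of non-target macroblocks in the current video frame.
The non-target coding offset reference amount of the non-target macroblocks may be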
Figure PCTCN2022082675-appb-000002
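where $QP_1$ is the target coding offset of the target macroblocks. The non-target coding offset reference amount is subsequently adjusted to ensure the accuracy of the non-target coding offsets of the partial and full non-target macroblocks.
S340: Based on the frame-level coding adjustable value of the current video frame, adjust the non-target coding offset reference amount downward to obtain the non-target coding offset of the partial non-target macroblocks; and, based on the frame-level coding adjustable value of the current video frame, adjust the non-target coding offset reference amount upward to obtain the non-target coding offset of the full non-target macroblocks.
Optionally, because a partial non-target macroblock contains both region-of-interest and non-region-of-interest picture content while a full non-target macroblock contains only non-region-of-interest picture content, the encoding of the partial non-target macroblocks should act as a buffer between the target macroblocks and the full non-target macroblocks. That is, based on the frame-level coding adjustable value of the current video frame, a change amount that satisfies that adjustable value is set so that the current video frame meets the current rate control requirement. The calculated non-target coding offset reference amount is adjusted downward by this change amount to give the non-target coding offset of the partial non-target macroblocks, and adjusted upward by the same change amount to give the non-target coding offset of the full non-target macroblocks. As a result, while the frame-level encoding of the current video frame stays stable, the non-target coding offset of the partial non-target macroblocks is lower than that of the full non-target macroblocks, so the coding quality of the partial non-target macroblocks is better than that of the full non-target macroblocks. The corresponding coding offset is then used to adjust the macroblock coding parameter of each macroblock, giving the actual coding parameter of each macroblock.
For example, if the change amount set on the basis of the frame-level coding adjustable value of the current video frame is δ, the non-target coding offset of the partial non-target macroblocks may be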
Figure PCTCN2022082675-appb-000003
and the non-target coding offset of the full non-target macroblocks may be
Figure PCTCN2022082675-appb-000004
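The split in S340 can be sketched as a symmetric shift around the reference amount. The published formulas are available only as image references, so subtracting and adding δ directly, as well as the name `split_non_target_offsets`, are assumptions drawn from the wording "adjust downward" and "adjust upward".

```python
def split_non_target_offsets(reference_offset, delta):
    """Return (partial_offset, full_offset) for non-target macroblocks.

    reference_offset -- non-target coding offset reference amount
    delta            -- change amount set from the frame-level adjustable value
    """
    if delta < 0:
        raise ValueError("delta is assumed to be non-negative")
    partial_offset = reference_offset - delta  # partial non-target: smaller offset
    full_offset = reference_offset + delta     # full non-target: larger offset
    return partial_offset, full_offset


print(split_non_target_offsets(2.0, 0.5))  # (1.5, 2.5)
```

One consequence of this symmetric form is worth noting: with n2 partial and n3 full non-target macroblocks, the combined non-target offset equals the reference total only when n2 = n3; otherwise it differs by (n3 − n2)·δ, which is one reason δ would need to stay small relative to the frame-level adjustable value.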
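S350: Use the target coding offset to adjust the macroblock coding parameter of each target macroblock, obtaining the actual coding parameter of that target macroblock; use the non-target coding offset of the partial non-target macroblocks to adjust the macroblock coding parameter of each partial non-target macroblock, obtaining the actual coding parameter of that partial non-target macroblock; and use the non-target coding offset of the full non-target macroblocks to adjust the macroblock coding parameter of each full non-target macroblock, obtaining the actual coding parameter of that full non-target macroblock.
S360: For each macroblock in the current video frame, encode the macroblock using its adjusted macroblock coding parameter.
Optionally, after the actual coding parameter of each macroblock in the current video frame has been calculated, each macroblock can be encoded in turn using the actual coding parameter obtained by adjusting it with the corresponding coding offset.
In the technical solution of this embodiment, the adapted coding quality of the target macroblocks is derived from their proportion in the current video frame, and the target coding offset at which the target macroblocks reach that quality is calculated. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. The two offsets are used to adjust the macroblock coding parameters of the target and non-target macroblocks respectively, optimizing the encoding of both in the current video frame. No coding cost needs to be calculated for unnecessary coding partition modes, which greatly reduces the complexity and overhead of video encoding and keeps it efficient. Moreover, adjusting the macroblock coding parameters of the target and non-target macroblocks with the target and non-target coding offsets further improves the coding quality of the video while keeping the encoding efficient.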
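Embodiment 4
FIG. 4 is a schematic structural diagram of a video encoding apparatus provided in Embodiment 4 of this application. As shown in FIG. 4, the apparatus may include:
a target offset calculation module 410, configured to calculate, based on the proportion of target macroblocks in the current video frame, the target coding offset at which the target macroblocks reach the adapted coding quality; a non-target offset calculation module 420, configured to calculate, based on the frame-level coding stability principle and the target coding offset, the non-target coding offset of the non-target macroblocks in the current video frame; and a coding parameter adjustment module 430, configured to adjust the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
In the technical solution of this embodiment, the adapted coding quality of the target macroblocks is derived from their proportion in the current video frame, and the target coding offset at which the target macroblocks reach that quality is calculated. Based on the frame-level coding stability principle, the target coding offset is then used to calculate the non-target coding offset of the non-target macroblocks in the current video frame. The two offsets are used to adjust the macroblock coding parameters of the target and non-target macroblocks respectively, optimizing the encoding of both in the current video frame. No coding cost needs to be calculated for unnecessary coding partition modes, which greatly reduces the complexity and overhead of video encoding and keeps it efficient. Moreover, adjusting the macroblock coding parameters of the target and non-target macroblocks with the target and non-target coding offsets further improves the coding quality of the video while keeping the encoding efficient.
The video encoding apparatus provided in this embodiment is applicable to the video encoding method provided in any of the embodiments above and has the corresponding functions and effects.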
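Embodiment 5
FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of this application. As shown in FIG. 5, the computer device includes a processor 50, a storage device 51, and a communication device 52. The computer device may have one or more processors 50; FIG. 5 takes one processor 50 as an example. The processor 50, the storage device 51, and the communication device 52 in the computer device may be connected by a bus or in other ways; FIG. 5 takes connection by a bus as an example.
The computer device provided in this embodiment can be used to perform the video encoding method provided in any of the embodiments above and has the corresponding functions and effects.
Embodiment 6
Embodiment 6 of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, can implement the video encoding method in any of the embodiments above.
In the storage medium containing computer-executable instructions provided in the embodiments of this application, the computer-executable instructions are not limited to the method operations described above and can also perform related operations in the video encoding method provided in any embodiment of this application.
From the description of the implementations above, this application can be implemented by software together with the necessary general-purpose hardware, and can also be implemented by hardware. The technical solution of this application can in essence be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk, or an optical disc, and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application. The computer-readable storage medium may be a non-transitory storage medium.
In the embodiments of the video encoding apparatus above, the units and modules included are divided only according to functional logic and are not limited to that division, as long as the corresponding functions can be realized. In addition, the names of the functional units serve only to distinguish them from one another and are not intended to limit the scope of protection of this application.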

Claims (13)

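  1. A video encoding method, comprising:
    calculating, based on the proportion of target macroblocks in a current video frame, a target coding offset of the target macroblocks in the case where the target macroblocks reach an adapted coding quality;
    calculating, based on a frame-level coding stability principle and the target coding offset, a non-target coding offset of non-target macroblocks in the current video frame;
    adjusting macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
  2. The method according to claim 1, wherein calculating, based on the proportion of target macroblocks in the current video frame, the target coding offset of the target macroblocks in the case where the target macroblocks reach the adapted coding quality comprises:
    determining a coding offset base value that has been set under the adapted coding quality to match the proportion of the target macroblocks;
    calculating the target coding offset of the target macroblocks based on the coding offset base value and an inverse influence parameter of the proportion of the target macroblocks on the adapted coding quality.
  3. The method according to claim 2, wherein, in the case where the proportion of the target macroblocks is less than the proportion designated for the highest coding quality, the coding offset base value is the frame-level coding adjustable value of the current video frame; and in the case where the proportion of the target macroblocks is not less than the proportion designated for the highest coding quality, the coding offset base value is the smaller of the frame-level coding adjustable value of the current video frame and the macroblock reference offset value of the target macroblocks.
  4. The method according to claim 1, wherein calculating, based on the frame-level coding stability principle and the target coding offset, the non-target coding offset of the non-target macroblocks in the current video frame comprises:
    calculating the non-target coding offset of the non-target macroblocks in the current video frame based on the target coding offset, the frame-level coding adjustable value of the current video frame, and the numbers of target macroblocks and non-target macroblocks in the current video frame.
  5. The method according to claim 1, wherein adjusting the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively, comprises:
    adjusting the macroblock coding parameter of each target macroblock upward using the target coding offset to obtain an actual coding parameter of each target macroblock;
    adjusting the macroblock coding parameter of each non-target macroblock downward using the non-target coding offset to obtain an actual coding parameter of each non-target macroblock.
  6. The method according to any one of claims 1 to 5, wherein the non-target macroblocks comprise partial non-target macroblocks and full non-target macroblocks.
  7. The method according to claim 6, wherein calculating, based on the frame-level coding stability principle and the target coding offset, the non-target coding offset of the non-target macroblocks in the current video frame comprises:
    calculating a non-target coding offset reference amount of the non-target macroblocks based on the frame-level coding stability principle and the target coding offset;
    adjusting the non-target coding offset reference amount downward based on the frame-level coding adjustable value of the current video frame to obtain the non-target coding offset of the partial non-target macroblocks; and adjusting the non-target coding offset reference amount upward based on the frame-level coding adjustable value of the current video frame to obtain the non-target coding offset of the full non-target macroblocks.
  8. The method according to claim 7, wherein adjusting the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively, comprises:
    adjusting the macroblock coding parameter of each target macroblock using the target coding offset to obtain an actual coding parameter of each target macroblock;
    adjusting the macroblock coding parameter of each partial non-target macroblock using the non-target coding offset of the partial non-target macroblocks to obtain an actual coding parameter of each partial non-target macroblock;
    adjusting the macroblock coding parameter of each full non-target macroblock using the non-target coding offset of the full non-target macroblocks to obtain an actual coding parameter of each full non-target macroblock.
  9. The method according to any one of claims 1 to 5, further comprising, before calculating, based on the proportion of target macroblocks in the current video frame, the target coding offset of the target macroblocks in the case where the target macroblocks reach the adapted coding quality:
    detecting the type of each macroblock in the current video frame using a target region detection model, and counting the target macroblocks and the non-target macroblocks in the current video frame to calculate the proportion of target macroblocks in the current video frame.
  10. The method according to any one of claims 1 to 5, further comprising, after adjusting the macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively:
    for each macroblock in the current video frame, encoding the macroblock using its adjusted macroblock coding parameter.
  11. A video encoding apparatus, comprising:
    a target offset calculation module, configured to calculate, based on the proportion of target macroblocks in a current video frame, a target coding offset of the target macroblocks in the case where the target macroblocks reach an adapted coding quality;
    a non-target offset calculation module, configured to calculate, based on a frame-level coding stability principle and the target coding offset, a non-target coding offset of non-target macroblocks in the current video frame;
    a coding parameter adjustment module, configured to adjust macroblock coding parameters of the target macroblocks and the non-target macroblocks using the target coding offset and the non-target coding offset, respectively.
  12. A computer device, comprising:
    at least one processor;
    a storage device, configured to store at least one program;
    wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the video encoding method according to any one of claims 1 to 10.
  13. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the video encoding method according to any one of claims 1 to 10.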
PCT/CN2022/082675 2021-04-07 2022-03-24 Video encoding method, apparatus, device, and storage medium WO2022213809A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110372888.1 2021-04-07
CN202110372888.1A CN112929668B (zh) 2021-04-07 2021-04-07 A video encoding method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022213809A1 true WO2022213809A1 (zh) 2022-10-13

Family

ID=76173619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/082675 WO2022213809A1 (zh) 2021-04-07 2022-03-24 Video encoding method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN112929668B (zh)
WO (1) WO2022213809A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
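CN112929668B (zh) * 2021-04-07 2024-04-26 百果园技术(新加坡)有限公司 A video encoding method, apparatus, device, and storage medium
CN115643405A (zh) * 2022-09-29 2023-01-24 上海哔哩哔哩科技有限公司 Video encoding method, apparatus, and computing device based on ROI region detection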

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6256423B1 (en) * 1998-09-18 2001-07-03 Sarnoff Corporation Intra-frame quantizer selection for video compression
US20100098162A1 (en) * 2008-10-17 2010-04-22 Futurewei Technologies, Inc. System and Method for Bit-Allocation in Video Coding
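JP2010193441A (ja) * 2009-01-26 2010-09-02 Panasonic Corp Moving image processing device, moving image processing method, and imaging device
CN101867799A (zh) * 2009-04-17 2010-10-20 北京大学 A video frame processing method and video encoder
CN103974071A (zh) * 2013-01-29 2014-08-06 富士通株式会社 Region-of-interest-based video encoding method and device
CN106791856A (zh) * 2016-12-28 2017-05-31 天津天地伟业生产力促进有限公司 A video encoding method based on adaptive regions of interest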
US20190007690A1 (en) * 2017-06-30 2019-01-03 Intel Corporation Encoding video frames using generated region of interest maps
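CN111918066A (zh) * 2020-09-08 2020-11-10 北京字节跳动网络技术有限公司 Video encoding method, apparatus, device, and storage medium
CN112188208A (zh) * 2020-09-18 2021-01-05 浙江大华技术股份有限公司 A macroblock-level rate control method and related apparatus
CN112929668A (zh) * 2021-04-07 2021-06-08 百果园技术(新加坡)有限公司 A video encoding method, apparatus, device, and storage medium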

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0896300B1 (en) * 1997-08-07 2002-01-30 Matsushita Electric Industrial Co., Ltd. Device and method for motion vector detection
EP2680581A1 (en) * 2012-06-28 2014-01-01 Alcatel-Lucent Method and apparatus for dynamic adaptation of video encoder parameters
WO2016054307A1 (en) * 2014-10-03 2016-04-07 Microsoft Technology Licensing, Llc Adapting quantization within regions-of-interest
JP6690254B2 (ja) * 2016-01-22 2020-04-28 沖電気工業株式会社 Image encoding device, method, and program
KR102543444B1 (ko) * 2017-08-29 2023-06-13 삼성전자주식회사 Video encoding device
WO2020117781A1 (en) * 2018-12-04 2020-06-11 Interdigital Vc Holdings, Inc. Method and apparatus for video encoding and decoding with adjusting the quantization parameter to block size
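WO2020256522A1 (ko) * 2019-06-20 2020-12-24 한국전자통신연구원 Method and apparatus for video encoding and video decoding using region partitioning
CN110267037B (zh) * 2019-06-21 2021-08-13 腾讯科技(深圳)有限公司 Video encoding method, apparatus, electronic device, and computer-readable storage medium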

Also Published As

Publication number Publication date
CN112929668A (zh) 2021-06-08
CN112929668B (zh) 2024-04-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22783876

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22783876

Country of ref document: EP

Kind code of ref document: A1