WO2022174701A1 - Video encoding and decoding method, apparatus, computer-readable medium, and electronic device - Google Patents

Publication number
WO2022174701A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
sub
coding block
maximum
coding
Prior art date
Application number
PCT/CN2022/071732
Inventor
王力强
Original Assignee
腾讯科技(深圳)有限公司
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2022174701A1
Priority to US18/074,855 (US20230104359A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present application relates to the field of computer and communication technologies, and in particular to a video encoding and decoding method, apparatus, computer-readable medium, and electronic device.
  • The Sub-Block Transform (SBT) technology has been proposed, in which the coding block is divided into multiple sub-blocks in a certain way for coding. However, dividing the coding block according to the current AVS3 standard may produce sub-blocks that are too narrow (that is, sub-blocks whose width-to-height or height-to-width ratio is too large), which reduces hardware performance and affects coding and decoding efficiency.
  • The embodiments of the present application provide a video encoding and decoding method, apparatus, computer-readable medium, and electronic device that, at least to a certain extent, avoid generating overly narrow sub-blocks when the SBT technology is used, thereby safeguarding hardware performance and improving encoding and decoding efficiency.
  • A video decoding method, including: acquiring a coding block of a video image frame; if the coding block needs to undergo sub-block transform, determining a division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio; determining, based on the division mode that can be used by the coding block, a target division mode used when performing sub-block transform processing on the coding block; and dividing the coding block based on the target division mode and decoding the multiple sub-blocks obtained by the division.
  • A video encoding method, including: acquiring a residual coefficient block of a video image frame; determining a division mode that can be used by the residual coefficient block according to the size of the residual coefficient block and the allowed sub-block maximum size ratio; selecting a target division mode for the residual coefficient block from the division modes that can be used; and encoding the information of the target division mode into the code stream.
  • A video decoding apparatus, including: an acquisition unit configured to acquire a coding block of a video image frame; a first processing unit configured to, if the coding block needs to undergo sub-block transform, determine a division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio; a second processing unit configured to determine, based on the division mode that can be used by the coding block, a target division mode used when performing sub-block transform processing on the coding block; and a third processing unit configured to divide the coding block based on the target division mode and decode the sub-blocks obtained by the division.
  • A video encoding apparatus, including: an acquisition unit configured to acquire a residual coefficient block of a video image frame; a fourth processing unit configured to determine a division mode that can be used by the residual coefficient block according to the size of the residual coefficient block and the allowed sub-block maximum size ratio; a selection unit configured to select a target division mode for the residual coefficient block from the division modes that can be used; and a coding unit configured to encode the information of the target division mode into the code stream.
  • a computer-readable medium on which a computer program is stored, and when the computer program is executed by a processor, implements the video decoding method described in the foregoing embodiments.
  • An electronic device, including: one or more processors; and a storage device for storing one or more programs, where, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the video decoding method described in the above embodiments.
  • a computer program product or computer program where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the video decoding methods provided in the various embodiments described above.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied;
  • FIG. 2 shows a schematic diagram of the placement of a video encoding device and a video decoding device in a streaming transmission system;
  • FIG. 3 shows a basic flowchart of a video encoder;
  • FIG. 4 shows a schematic diagram of SBT partition derivation;
  • FIG. 5 shows a schematic diagram of the division modes of SBT;
  • FIG. 6 shows a schematic diagram of transform combinations in the sub-block transform technique;
  • FIG. 7 shows a flowchart of a video decoding method according to an embodiment of the present application.
  • FIG. 8 shows a block diagram of a video decoding apparatus according to an embodiment of the present application.
  • FIG. 9 shows a schematic structural diagram of a computer system suitable for implementing the electronic device according to the embodiment of the present application.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this application will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • The system architecture 100 includes a plurality of terminal devices that can communicate with each other through, for example, a network 150.
  • The system architecture 100 may include a first terminal device 110 and a second terminal device 120 interconnected by the network 150.
  • The first terminal device 110 and the second terminal device 120 perform unidirectional data transmission.
  • the first terminal device 110 may encode video data (eg, a video picture stream captured by the first terminal device 110) for transmission to the second terminal device 120 through the network 150, the encoded video data in one or more
  • the second terminal device 120 may receive the encoded video data from the network 150, decode the encoded video data to restore the video data, and display video pictures according to the restored video data.
  • The system architecture 100 may include a third terminal device 130 and a fourth terminal device 140 that perform bidirectional transmission of encoded video data, such as may occur during a video conference.
  • Each of the third terminal device 130 and the fourth terminal device 140 may encode video data (e.g., a stream of video pictures captured by the terminal device) for transmission over the network 150 to the other of the third terminal device 130 and the fourth terminal device 140.
  • Each of the third terminal device 130 and the fourth terminal device 140 may also receive encoded video data transmitted by the other of the third terminal device 130 and the fourth terminal device 140, may decode the encoded video data to recover the video data, and may display a video picture on an accessible display device based on the recovered video data.
  • The first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140 may be servers, personal computers, and smart phones, but the principles disclosed in this application are not limited thereto.
  • Embodiments disclosed herein are applicable to laptop computers, tablet computers, media players, and/or dedicated videoconferencing equipment.
  • Network 150 represents any number of networks, including, for example, wired and/or wireless communication networks, that communicate encoded video data between the first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140.
  • Communication network 150 may exchange data in circuit-switched and/or packet-switched channels.
  • the network may include a telecommunications network, a local area network, a wide area network, and/or the Internet.
  • the architecture and topology of network 150 may be immaterial to the operations disclosed herein.
  • FIG. 2 shows the placement of a video encoding device and a video decoding device in a streaming environment.
  • the subject matter disclosed herein is equally applicable to other video-enabled applications including, for example, videoconferencing, digital TV (television), storing compressed video on digital media including CDs, DVDs, memory sticks, and the like.
  • the streaming system may include a capture subsystem 213 , which may include a video source 201 such as a digital camera, and the video source creates an uncompressed video picture stream 202 .
  • the video picture stream 202 includes samples captured by a digital camera.
  • the video picture stream 202 is depicted as a thick line to emphasize the high data volume of the video picture stream, which can be processed by the electronic device 220.
  • Device 220 includes video encoding device 203 coupled to video source 201 .
  • Video encoding device 203 may include hardware, software, or a combination of hardware and software to enable or implement various aspects of the disclosed subject matter as described in greater detail below.
  • The encoded video data 204 (or encoded video code stream 204) is depicted as a thin line to emphasize its lower data volume compared with the video picture stream 202, and it can be stored on the streaming server 205 for future use.
  • One or more streaming client subsystems such as client subsystem 206 and client subsystem 208 in FIG. 2 , may access streaming server 205 to retrieve copies 207 and 209 of encoded video data 204 .
  • Client subsystem 206 may include, for example, video decoding device 210 in electronic device 230 .
  • the video decoding device 210 decodes the incoming copy 207 of the encoded video data and produces an output video picture stream 211 that can be presented on a display 212 (eg, a display screen) or another presentation device.
  • encoded video data 204, video data 207, and video data 209 may be encoded according to certain video encoding/compression standards. Examples of these standards include ITU-T H.265.
  • A video coding standard under development is informally referred to as Versatile Video Coding (VVC), and this application may be used in the context of the VVC standard.
  • electronic device 220 and the electronic device 230 may include other components not shown in the figures.
  • electronic device 220 may include a video decoding device
  • electronic device 230 may also include a video encoding device.
  • After a video frame image is input, it is divided into several non-overlapping processing units according to a block size, and a similar compression operation is performed on each processing unit.
  • This processing unit is called CTU (Coding Tree Unit, coding tree unit), or LCU (Largest Coding Unit, largest coding unit).
  • the CTU can continue to be further divided into finer divisions to obtain one or more basic coding units CU, and CU is the most basic element in a coding link.
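The division of a frame into non-overlapping processing units can be sketched as follows. This is only an illustration: the function name `ctu_grid`, the 128-sample CTU size, and the cropping of border units to the frame edge are assumptions of this sketch, not details given in the text.

```python
def ctu_grid(frame_width, frame_height, ctu_size):
    """Return (x, y, w, h) rectangles of non-overlapping CTUs in raster order."""
    ctus = []
    for y in range(0, frame_height, ctu_size):
        for x in range(0, frame_width, ctu_size):
            # Units on the right and bottom borders are cropped to the frame
            # (an assumption of this sketch).
            w = min(ctu_size, frame_width - x)
            h = min(ctu_size, frame_height - y)
            ctus.append((x, y, w, h))
    return ctus


grid = ctu_grid(1920, 1080, 128)
print(len(grid), grid[0], grid[-1])
```

Each rectangle would then be the starting point for further division into CUs.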
  • Predictive Coding: includes intra-frame prediction and inter-frame prediction. After the original video signal is predicted from the selected reconstructed video signal, a residual video signal is obtained. The encoder needs to decide which predictive coding mode to select for the current CU and inform the decoder. Intra-frame prediction means that the predicted signal comes from an area of the same image that has already been coded and reconstructed; inter-frame prediction means that the predicted signal comes from another, already coded image (called a reference image) that is different from the current image.
  • Transform & Quantization: After the residual video signal undergoes a transform operation such as DFT (Discrete Fourier Transform) or DCT, the signal is converted into the transform domain, and the result is called the transform coefficients. The transform coefficients are further subjected to a lossy quantization operation, which discards a certain amount of information so that the quantized signal is easier to express in compressed form. In some video coding standards, there may be more than one transform mode to choose from, so the encoder also needs to select one of the transform modes for the current CU and inform the decoder. The fineness of quantization is usually determined by the Quantization Parameter (QP). If the value of QP is larger, coefficients covering a larger value range are quantized to the same output, which usually brings greater distortion and a lower code rate; conversely, if the value of QP is smaller, coefficients covering a smaller value range are quantized to the same output, which usually brings less distortion and corresponds to a higher code rate.
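The trade-off between quantization fineness and distortion can be illustrated with a minimal sketch. It uses plain uniform scalar quantization with a hypothetical step size `step` standing in for the step that a codec derives from QP; the actual QP-to-step mapping is defined by each standard and is not reproduced here. The coefficient values are made up for illustration.

```python
def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer level."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization: scale the integer levels back by the step size."""
    return [level * step for level in levels]


def squared_error(original, restored):
    """Sum of squared differences, used here as a simple distortion measure."""
    return sum((o - r) ** 2 for o, r in zip(original, restored))


coeffs = [52.0, 49.0, 13.0, -7.0, 3.0, -1.0]  # illustrative transform coefficients

fine = quantize(coeffs, 4)     # small step (small QP): finer levels
coarse = quantize(coeffs, 16)  # large step (large QP): more values collapse together

fine_error = squared_error(coeffs, dequantize(fine, 4))
coarse_error = squared_error(coeffs, dequantize(coarse, 16))
print(fine, fine_error)
print(coarse, coarse_error)
```

With the larger step, more coefficients collapse to the same level (several become zero), so fewer distinct values need to be coded, at the price of a larger reconstruction error.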
  • Entropy Coding or Statistical Coding The quantized transform domain signal will undergo statistical compression coding according to the frequency of occurrence of each value, and finally output a binarized (0 or 1) compressed code stream. At the same time, other information generated by encoding, such as the selected encoding mode, motion vector data, etc., also needs to be entropy encoded to reduce the bit rate.
  • Statistical coding is a lossless coding method that can effectively reduce the code rate required to express the same signal. Common statistical coding methods include Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
  • Loop Filtering: The transformed and quantized signal yields a reconstructed image through inverse quantization, inverse transformation, and prediction compensation. Because of quantization, part of the information in the reconstructed image differs from the original image, that is, the reconstructed image exhibits distortion. Filtering operations can therefore be performed on the reconstructed image, such as the Deblocking Filter (DB), Sample Adaptive Offset (SAO), or Adaptive Loop Filter (ALF), which can effectively reduce the degree of distortion caused by quantization. Since these filtered reconstructed images are used as references for subsequently encoded images to predict future image signals, the above filtering operation is also called in-loop filtering, i.e., a filtering operation within the encoding loop.
  • FIG. 3 shows a basic flowchart of a video encoder, and intra-frame prediction is used as an example for description in the flowchart.
  • The quantized coefficients are encoded by entropy coding to obtain the encoded bits.
  • The reconstructed residual signal u'_k[x, y] is obtained through inverse quantization and inverse transformation, and the predicted image signal is superimposed with the reconstructed residual signal u'_k[x, y] to generate an image signal.
  • This image signal is input to the intra-frame mode decision module and the intra-frame prediction module for intra-frame prediction processing; the reconstructed image signal can be used as a reference image for the next frame for motion estimation and motion compensation prediction.
  • Based on the above encoding process, for each CU, after obtaining the compressed code stream (i.e., the bit stream), the decoding end performs entropy decoding to obtain the various mode information and the quantized coefficients. The quantized coefficients then undergo inverse quantization and inverse transformation to obtain the residual signal.
  • the predicted signal corresponding to the CU can be obtained, and then the reconstructed signal can be obtained by adding the residual signal and the predicted signal. The reconstructed signal is then subjected to loop filtering and other operations to generate the final output signal.
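The reconstruction step at the decoding end (adding the residual signal to the predicted signal) can be sketched as below. The function name `reconstruct`, the sample values, and the clipping to an 8-bit sample range are assumptions of this sketch; loop filtering is not modeled.

```python
def reconstruct(pred, resid, bit_depth=8):
    """Add residual samples to predicted samples and clip to the valid range."""
    max_val = (1 << bit_depth) - 1  # 255 for 8-bit samples (assumed bit depth)
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


pred = [[120, 122], [250, 3]]   # illustrative predicted samples
resid = [[4, -2], [10, -5]]     # illustrative decoded residual samples
recon = reconstruct(pred, resid)
print(recon)
```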
  • the transform processing of the residual signal makes the energy of the residual signal concentrate on less low-frequency coefficients, that is, most coefficients have smaller values. Then after the subsequent quantization module, the smaller coefficient value will become zero value, which greatly reduces the cost of coding the residual signal.
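The energy compaction described above can be checked numerically with a small sketch. It applies an orthonormal one-dimensional DCT-II (written out directly, without any library) to a made-up, smoothly varying residual row, then quantizes with a hypothetical step of 4. The signal values and the step are assumptions chosen for illustration only.

```python
import math


def dct2(x):
    """Orthonormal 1-D DCT-II of a sequence x."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


residual = [10, 9, 8, 7, 6, 5, 4, 3]  # illustrative smooth residual row
coeffs = dct2(residual)

total_energy = sum(c * c for c in coeffs)
low_freq_energy = sum(c * c for c in coeffs[:2])  # two lowest-frequency coefficients

step = 4  # hypothetical quantization step
levels = [int(round(c / step)) for c in coeffs]
print(low_freq_energy / total_energy, levels)
```

For this input almost all of the energy sits in the two lowest-frequency coefficients, and every other coefficient quantizes to zero, which is the effect that makes the residual cheap to code.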
  • Transform kernels such as DST7 and DCT8 are introduced into the transform process, and the horizontal transform and the vertical transform of the residual signal can use different transform kernels.
  • The possible transform combinations for the transform processing of a residual signal are as follows: (DCT2, DCT2), (DCT8, DCT8), (DCT8, DST7), (DST7, DCT8), and (DST7, DST7).
  • SBT Sub-Block Transform
  • Figure 4 shows an analytical diagram of the SBT partition derivation. Among them, there are 8 sub-block division results in SBT, and only the gray part (non-zero residual sub-block) in the sub-block is transformed and encoded, while the white part (zero-residual sub-block) is forcibly cleared.
  • The relevant descriptions of SBT in the AVS3 standard are given in Table 1:
  • SbtVerHalf represents the vertical bisection mode identifier
  • SbtVerQuad represents the vertical quarter mode identifier
  • SbtHorHalf represents the horizontal bisection mode identifier
  • SbtHorQuad represents the horizontal quarter mode identifier
  • If the value of quad is 0, the value of dir is 0, and the value of pos is 0, the vertical bisection mode is adopted, and only the left sub-block (gray part) is transformed and encoded, while the other sub-block (white part) is forcibly cleared; if the value of quad is 0, the value of dir is 0, and the value of pos is 1, the vertical bisection mode is adopted, and only the right sub-block (gray part) is transformed and encoded, while the other sub-block (white part) is forcibly cleared; if the value of quad is 0, the value of dir is 1, and the value of pos is 0, the horizontal bisection mode is adopted, and only the upper sub-block (gray part) is transformed and encoded, while the other sub-block (white part) is forcibly cleared; if the value of quad is 0, the value of dir is 1, and the value of pos is 1, the horizontal bisection mode is adopted, and only the lower sub-block (gray part) is transformed and encoded, while the other sub-block (white part) is forcibly cleared.
  • The quad in Figure 4 can be regarded as the syntax that controls the size of the sub-block, and both dir and pos can be regarded as the syntax that controls the position of the sub-block.
  • That is, quad determines the size (area) of the sub-block,
  • and dir and pos determine the location of the sub-block.
  • quad, dir, and pos in Figure 4 correspond to sbt_quad_flag, sbt_dir_flag, and sbt_pos_flag in the AVS3 text, respectively.
  • sbt_quad_flag and sbt_dir_flag can be obtained by direct analysis from the code stream (refer to Table 1), and sbt_pos_flag is implicitly derived through coefficient statistics.
  • sbt_quad_flag, sbt_dir_flag and sbt_pos_flag jointly determine the specific SBT division mode.
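How the three flags jointly select one of the eight division modes can be sketched as a small lookup. The bisection cases follow the description of quad, dir, and pos given for Figure 4; the handling of quad equal to 1 as the quarter mode with the same left/right and upper/lower convention is an assumption made by analogy, and the function name and returned labels are hypothetical.

```python
def sbt_division_mode(quad, direction, pos):
    """Map (quad, dir, pos) flag values to (orientation, split size, kept sub-block).

    quad: 0 -> bisection, 1 -> quarter (the quarter case is assumed by analogy).
    direction: 0 -> vertical division, 1 -> horizontal division.
    pos: selects which sub-block keeps its residual (the gray part).
    """
    size = "quarter" if quad == 1 else "bisection"
    orientation = "horizontal" if direction == 1 else "vertical"
    if orientation == "vertical":
        kept = "left" if pos == 0 else "right"
    else:
        kept = "upper" if pos == 0 else "lower"
    return orientation, size, kept


all_modes = {
    sbt_division_mode(q, d, p) for q in (0, 1) for d in (0, 1) for p in (0, 1)
}
print(sbt_division_mode(0, 0, 1), len(all_modes))
```

Enumerating all flag combinations yields eight distinct results, matching the eight SBT division results mentioned for Figure 4.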
  • the 8 division modes of SBT can be classified according to the division method and division direction as shown in Fig. 5.
  • the horizontal mode and the vertical mode divide the current coding block in the horizontal direction and the vertical direction, respectively
  • The quarter mode and the bisection mode respectively indicate that a sub-block of 1/4 or 1/2 of the coding block is split off along the corresponding division direction for coding.
  • the gray sub-blocks are not further divided, and coefficient coding is directly performed.
  • In certain cases, the horizontal and vertical transforms of the non-zero residual sub-block are both DCT-2; in other cases, the horizontal and vertical transforms are selected as shown in Figure 6.
  • The gray part of the sub-block undergoes horizontal and vertical transform coding according to the identified transform mode, while the white part is forcibly cleared. In Figure 6, w1 can be 1/2 or 1/4 of w (width), and h1 can be 1/2 or 1/4 of h (height).
  • Sub-blocks that are too narrow may still be generated. For example, for a 64×8 (width×height) coding block, performing horizontal bisection generates a 64×4 transform block, which is too narrow in the horizontal direction; for a 64×16 (width×height) coding block, performing horizontal quartering also generates a 64×4 transform block, which is too narrow in the horizontal direction. Such overly narrow transform blocks reduce hardware performance and affect coding and decoding efficiency.
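The arithmetic behind these two examples can be reproduced with a short sketch. The function names are hypothetical, and `parts` (2 for bisection, 4 for quartering) is a parameter introduced only for this illustration.

```python
def sbt_subblock_size(width, height, orientation, parts):
    """Size of the transformed sub-block; parts is 2 (bisection) or 4 (quarter)."""
    if orientation == "horizontal":
        # A horizontal division keeps the full width and splits the height.
        return width, height // parts
    # A vertical division keeps the full height and splits the width.
    return width // parts, height


def side_ratio(width, height):
    """Ratio of the longer side to the shorter side."""
    return max(width, height) / min(width, height)


half = sbt_subblock_size(64, 8, "horizontal", 2)      # horizontal bisection of 64x8
quarter = sbt_subblock_size(64, 16, "horizontal", 4)  # horizontal quartering of 64x16
print(half, side_ratio(*half))
print(quarter, side_ratio(*quarter))
```

Both cases end in a 64×4 transform block whose long side is 16 times its short side.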
  • the technical solutions of the embodiments of the present application propose a solution for further limiting the coding block using the SBT technology, which avoids dividing too narrow sub-blocks, effectively ensures the performance of the hardware, and improves the coding and decoding efficiency. Details are as follows:
  • FIG. 7 shows a flowchart of a video decoding method according to an embodiment of the present application.
  • the video decoding method may be executed by a device with a computing processing function, such as a terminal device or a server.
  • the video decoding method at least includes steps S710 to S740, which are described in detail as follows:
  • In step S710, the coding block of the video image frame is acquired.
  • The video image frame sequence includes a series of images; each image can be further divided into slices, and the slices can be further divided into a series of LCUs (or CTUs).
  • Each LCU includes several CUs.
  • the video image frame is encoded in block units.
  • MB macroblock
  • Prediction Prediction block
  • Basic concepts such as the coding unit (CU), prediction unit (PU), and transform unit (TU) are used to functionally divide a variety of block units, and a new tree-based structure is adopted to describe them.
  • a CU can be divided into smaller CUs according to a quadtree, and the smaller CUs can be further divided to form a quadtree structure.
  • the coding block in this embodiment of the present application may be a CU, or a block smaller than the CU, such as a smaller block obtained by dividing the CU.
  • In step S720, if the coding block needs to undergo sub-block transform, a division mode that can be used by the coding block is determined according to the size of the coding block and the allowed sub-block maximum size ratio.
  • whether the coding block needs to adopt the sub-block transform technique may be determined according to the explicit index and/or the implicitly derived index included in the coding block.
  • The sub-block maximum size ratio may include a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio may be equal to the value of the maximum height-to-width ratio.
  • In this case, the value of the sub-block maximum size ratio can be decoded from the sequence header of the video image frame.
  • The sub-block maximum size ratio may include a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio may also be unequal to the value of the maximum height-to-width ratio.
  • In this case, the maximum width-to-height ratio value and the maximum height-to-width ratio value can be decoded separately from the sequence header of the video image frame.
  • The value of the sub-block maximum size ratio may reuse the value of the maximum size ratio of coding blocks allowed in the standard; that is, the value of the sub-block maximum size ratio may be the same as the value of the allowed coding-block maximum size ratio.
  • The process of determining the division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the width of the coding block is greater than or equal to 8, and 2 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, the coding block is allowed to use the vertical bisection mode; otherwise the coding block is prohibited from using the vertical bisection mode.
  • In this case, the sub-block maximum size ratio is the maximum height-to-width ratio.
  • The process of determining the division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the width of the coding block is greater than or equal to 16, and 4 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, the coding block is allowed to use the vertical quarter mode; otherwise the coding block is prohibited from using the vertical quarter mode.
  • In this case, the sub-block maximum size ratio is the maximum height-to-width ratio.
  • The process of determining the division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the height of the coding block is greater than or equal to 8, and 2 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, the coding block is allowed to use the horizontal bisection mode; otherwise the coding block is prohibited from using the horizontal bisection mode.
  • In this case, the sub-block maximum size ratio is the maximum width-to-height ratio.
  • The process of determining the division mode that can be used by the coding block according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the height of the coding block is greater than or equal to 16, and 4 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, the coding block is allowed to use the horizontal quarter mode; otherwise the coding block is prohibited from using the horizontal quarter mode.
  • In this case, the sub-block maximum size ratio is the maximum width-to-height ratio.
  • step S730 based on the division modes that can be used by the coding block, a target division mode used when performing sub-block transformation processing on the coding block is determined.
  • step S740 the coding block is divided based on the target division mode, and the multiple sub-blocks obtained by division are decoded.
  • in the process of decoding the multiple sub-blocks obtained by division, a sub-block may undergo either a transform skip operation or an inverse transform operation.
  • This process is the inverse of the process at the encoding end. If the encoding end performed the transform skip operation on a sub-block, the decoding end also performs the transform skip operation on that sub-block; if the encoding end performed the transform operation on a sub-block, the decoding end performs the inverse transform operation on that sub-block.
  • MaxPartRatio represents the maximum size ratio of the sub-blocks allowed.
  • the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio (both are non-zero integers). Usually the two are equal. If they are not equal, then for the horizontal quartering mode and the horizontal bisection mode the maximum size ratio can take the maximum width-to-height ratio; for the vertical quartering mode and the vertical bisection mode it can take the maximum height-to-width ratio.
  • if the maximum width-to-height ratio and the maximum height-to-width ratio are equal, the value can be encoded in the sequence header when SBT is turned on, and the maximum width-to-height ratio of coding blocks allowed in the existing AVS3 standard can also be reused; that is, the value of the sub-block maximum size ratio may be the same as the value of the allowed coding block maximum size ratio.
  • if the maximum width-to-height ratio and the maximum height-to-width ratio are not equal, they can be encoded separately in the sequence header when SBT is turned on.
  • the technical solutions of the embodiments of the present application can restrict the division modes of a coding block by the allowed sub-block maximum size ratio and the size of the coding block, which avoids producing excessively narrow sub-blocks, safeguards hardware performance, and improves encoding and decoding efficiency.
  • at the encoding end, a video encoding method needs to be performed.
  • the video encoding method can be executed by a device with computing processing function at the encoder end, such as a terminal device or a server.
  • the video encoding method may include: after obtaining the residual coefficient block of the video image frame, determining the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and the allowed sub-block maximum size ratio, then selecting a target division mode for the residual coefficient block from the division modes that can be adopted, and then encoding the information of the target division mode into the code stream.
  • the encoding end may, based on RDO (Rate Distortion Optimization), select an optimal division mode from the division modes that the residual coefficient block can adopt as the target division mode.
  • FIG. 8 shows a block diagram of a video decoding apparatus according to an embodiment of the present application.
  • the video decoding apparatus may be set in a device with a computing processing function, such as a terminal device or a server.
  • a video decoding apparatus 800 includes: an obtaining unit 802 , a first processing unit 804 , a second processing unit 806 and a third processing unit 808 .
  • the obtaining unit 802 is configured to obtain the coding block of the video image frame;
  • the first processing unit 804 is configured to, if the coding block requires sub-block transform, determine the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio;
  • the second processing unit 806 is configured to determine, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block;
  • the third processing unit 808 is configured to divide the coding block based on the target division mode and to decode the multiple sub-blocks obtained by division.
  • the first processing unit 804 is configured to: if the width of the coding block is greater than or equal to 8, and 2 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allow the coding block to use the vertical bisection mode; otherwise prohibit the coding block from using the vertical bisection mode.
  • the first processing unit 804 is configured to: if the width of the coding block is greater than or equal to 16, and 4 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allow the coding block to use the vertical quartering mode; otherwise prohibit the coding block from using the vertical quartering mode.
  • for the vertical modes, the sub-block maximum size ratio is the maximum height-to-width ratio.
  • the first processing unit 804 is configured to: if the height of the coding block is greater than or equal to 8, and 2 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allow the coding block to use the horizontal bisection mode; otherwise prohibit the coding block from using the horizontal bisection mode.
  • the first processing unit 804 is configured to: if the height of the coding block is greater than or equal to 16, and 4 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allow the coding block to use the horizontal quartering mode; otherwise prohibit the coding block from using the horizontal quartering mode.
  • for the horizontal modes, the sub-block maximum size ratio is the maximum width-to-height ratio of the sub-block.
  • the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is equal to the value of the maximum height-to-width ratio;
  • the video decoding apparatus 800 further includes: a first decoding unit configured to decode the value of the sub-block maximum size ratio from the sequence header of the video image frame.
  • the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is not equal to the value of the maximum height-to-width ratio;
  • the video decoding apparatus 800 further includes: a second decoding unit configured to decode the value of the maximum width-to-height ratio and the value of the maximum height-to-width ratio separately from the sequence header of the video image frame.
  • the value of the maximum size ratio of the sub-block is the same as the value of the maximum size ratio of the allowed coding block.
  • the third processing unit 808 is configured to: perform a transform skip operation or perform an inverse transform operation during the process of decoding the divided sub-blocks.
  • Embodiments of the present application also provide a video encoding apparatus, which can be set in a device with a computing processing function at the encoder end, such as a terminal device or a server.
  • the video encoding apparatus may include: an obtaining unit configured to obtain a residual coefficient block of a video image frame; a fourth processing unit configured to determine the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and the allowed sub-block maximum size ratio; a selection unit configured to select a target division mode for the residual coefficient block from the division modes that can be adopted; and an encoding unit configured to encode the information of the target division mode into the code stream.
  • when the coding block requires sub-block transform, the division modes that the coding block can adopt are determined according to the size of the coding block and the allowed sub-block maximum size ratio, so that the division modes of the coding block are restricted by the allowed sub-block maximum size ratio and the size of the coding block. This avoids producing excessively narrow sub-blocks, safeguards hardware performance, and improves encoding and decoding efficiency.
  • FIG. 9 shows a schematic structural diagram of a computer system suitable for implementing the electronic device according to the embodiment of the present application.
  • the computer system 900 includes a central processing unit (Central Processing Unit, CPU) 901, which can perform various appropriate actions and processes, such as the methods described in the above embodiments, according to a program stored in a read-only memory (Read-Only Memory, ROM) 902 or a program loaded from a storage part 908 into a random access memory (Random Access Memory, RAM) 903.
  • the RAM 903 also stores various programs and data required for system operation.
  • the CPU 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
  • An Input/Output (I/O) interface 905 is also connected to the bus 904 .
  • the following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, etc.; an output section 907 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc. ; a storage part 908 including a hard disk, etc.; and a communication part 909 including a network interface card such as a LAN (Local Area Network) card, a modem, and the like.
  • the communication section 909 performs communication processing via a network such as the Internet.
  • a drive 910 is also connected to the I/O interface 905 as needed.
  • a removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 910 as needed so that a computer program read therefrom is installed into the storage section 908 as needed.
  • embodiments of the present application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program comprising a computer program for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication portion 909, and/or installed from the removable medium 911.
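  • when the computer program is executed by the central processing unit (CPU) 901, the various functions defined in the system of the present application are performed.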
  • the computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), flash memory, optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying a computer-readable computer program therein.
  • Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • a computer program embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
  • the present application also provides a computer-readable medium.
  • the computer-readable medium may be included in the electronic device described in the above embodiments; it may also exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by an electronic device, enables the electronic device to implement the methods described in the above-mentioned embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

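Embodiments of the present application provide a video encoding and decoding method and apparatus, a computer-readable medium, and an electronic device. The video decoding method includes: obtaining a coding block of a video image frame; if the coding block requires sub-block transform, determining the division modes that the coding block can adopt according to the size of the coding block and an allowed sub-block maximum size ratio; determining, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block; and dividing the coding block based on the target division mode, and decoding the multiple sub-blocks obtained by division. The technical solutions of the embodiments of the present application can avoid producing excessively narrow sub-blocks when the SBT technique is used, which safeguards hardware performance and improves encoding and decoding efficiency.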

Description

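Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
This application claims priority to Chinese Patent Application No. 202110194810.5, entitled "Video encoding and decoding method and apparatus, computer-readable medium, and electronic device" and filed with the China Patent Office on February 21, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of computer and communication technologies, and in particular to a video encoding and decoding method and apparatus, a computer-readable medium, and an electronic device.
Background
The AVS3 (Audio Video coding Standard) standard describes the sub-block transform (Sub-Block Transform, SBT) technique, in which a coding block is divided into multiple sub-blocks in a certain manner for encoding. However, dividing a coding block according to the current AVS3 standard may produce excessively narrow sub-blocks (that is, sub-blocks whose width and height differ too greatly), which degrades hardware performance and affects encoding and decoding efficiency.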
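Summary
Embodiments of the present application provide a video encoding and decoding method and apparatus, a computer-readable medium, and an electronic device that can, at least to some extent, avoid producing excessively narrow sub-blocks when the SBT technique is used, thereby safeguarding hardware performance and improving encoding and decoding efficiency.
Other features and advantages of the present application will become apparent from the following detailed description, or may be learned in part through practice of the present application.
According to one aspect of the embodiments of the present application, a video decoding method is provided, including: obtaining a coding block of a video image frame; if the coding block requires sub-block transform, determining the division modes that the coding block can adopt according to the size of the coding block and an allowed sub-block maximum size ratio; determining, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block; and dividing the coding block based on the target division mode, and decoding the multiple sub-blocks obtained by division.
According to one aspect of the embodiments of the present application, a video encoding method is provided, including: obtaining a residual coefficient block of a video image frame; determining the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and an allowed sub-block maximum size ratio; selecting a target division mode for the residual coefficient block from the division modes that can be adopted; and encoding the information of the target division mode into the code stream.
According to one aspect of the embodiments of the present application, a video decoding apparatus is provided, including: an obtaining unit configured to obtain a coding block of a video image frame; a first processing unit configured to, if the coding block requires sub-block transform, determine the division modes that the coding block can adopt according to the size of the coding block and an allowed sub-block maximum size ratio; a second processing unit configured to determine, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block; and a third processing unit configured to divide the coding block based on the target division mode and to decode the multiple sub-blocks obtained by division.
According to one aspect of the embodiments of the present application, a video encoding apparatus is provided, including: an obtaining unit configured to obtain a residual coefficient block of a video image frame; a fourth processing unit configured to determine the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and an allowed sub-block maximum size ratio; a selection unit configured to select a target division mode for the residual coefficient block from the division modes that can be adopted; and an encoding unit configured to encode the information of the target division mode into the code stream.
According to one aspect of the embodiments of the present application, a computer-readable medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the video decoding method described in the above embodiments.
According to one aspect of the embodiments of the present application, an electronic device is provided, including: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the video decoding method described in the above embodiments.
According to one aspect of the embodiments of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the video decoding method provided in the various embodiments above.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory and do not limit the present application.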
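Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain its principles. The drawings described below are evidently only some embodiments of the present application; a person of ordinary skill in the art may derive other drawings from them without creative effort. In the drawings:
FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied;
FIG. 2 shows a schematic diagram of the placement of a video encoding apparatus and a video decoding apparatus in a streaming system;
FIG. 3 shows a basic flowchart of a video encoder;
FIG. 4 shows a diagram of the parsing and derivation of SBT division;
FIG. 5 shows a schematic diagram of the SBT division modes;
FIG. 6 shows a schematic diagram of transform combinations in the sub-block transform technique;
FIG. 7 shows a flowchart of a video decoding method according to an embodiment of the present application;
FIG. 8 shows a block diagram of a video decoding apparatus according to an embodiment of the present application;
FIG. 9 shows a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.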
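Detailed Description
Example implementations will now be described more fully with reference to the accompanying drawings. The example implementations can, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these implementations are provided so that the present application will be thorough and complete and will fully convey the concepts of the example implementations to those skilled in the art.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the present application. Those skilled in the art will appreciate, however, that the technical solutions of the present application may be practiced without one or more of the specific details, or with other methods, components, apparatuses, steps, and so on. In other cases, well-known methods, apparatuses, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the present application.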
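FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
As shown in FIG. 1, the system architecture 100 includes multiple terminal apparatuses that can communicate with one another through, for example, a network 150. For example, the system architecture 100 may include a first terminal apparatus 110 and a second terminal apparatus 120 interconnected through the network 150. In the embodiment of FIG. 1, the first terminal apparatus 110 and the second terminal apparatus 120 perform unidirectional data transmission.
For example, the first terminal apparatus 110 may encode video data (for example, a stream of video pictures captured by the first terminal apparatus 110) for transmission to the second terminal apparatus 120 through the network 150. The encoded video data is transmitted in the form of one or more encoded video code streams. The second terminal apparatus 120 may receive the encoded video data from the network 150, decode it to recover the video data, and display video pictures according to the recovered video data.
In an embodiment of the present application, the system architecture 100 may include a third terminal apparatus 130 and a fourth terminal apparatus 140 that perform bidirectional transmission of encoded video data, which may occur, for example, during a video conference. For bidirectional data transmission, each of the third terminal apparatus 130 and the fourth terminal apparatus 140 may encode video data (for example, a stream of video pictures captured by the terminal apparatus) for transmission through the network 150 to the other of the two. Each of the third terminal apparatus 130 and the fourth terminal apparatus 140 may also receive the encoded video data transmitted by the other, decode it to recover the video data, and display video pictures on an accessible display apparatus according to the recovered video data.
In the embodiment of FIG. 1, the first terminal apparatus 110, the second terminal apparatus 120, the third terminal apparatus 130, and the fourth terminal apparatus 140 may be servers, personal computers, and smartphones, but the principles disclosed in the present application are not limited thereto. The embodiments disclosed in the present application are applicable to laptop computers, tablet computers, media players, and/or dedicated video conferencing equipment. The network 150 represents any number of networks that convey encoded video data among the first terminal apparatus 110, the second terminal apparatus 120, the third terminal apparatus 130, and the fourth terminal apparatus 140, including, for example, wired and/or wireless communication networks. The communication network 150 may exchange data in circuit-switched and/or packet-switched channels. The network may include a telecommunications network, a local area network, a wide area network, and/or the Internet. For the purposes of the present application, unless explained below, the architecture and topology of the network 150 may be immaterial to the operations disclosed in the present application.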
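In an embodiment of the present application, FIG. 2 shows the placement of a video encoding apparatus and a video decoding apparatus in a streaming environment. The subject matter disclosed in the present application is equally applicable to other video-enabled applications, including, for example, video conferencing, digital TV (television), and storage of compressed video on digital media including CDs, DVDs, memory sticks, and the like.
The streaming system may include a capture subsystem 213, which may include a video source 201 such as a digital camera; the video source creates an uncompressed video picture stream 202. In an embodiment, the video picture stream 202 includes samples taken by the digital camera. The video picture stream 202 is depicted as a thick line to emphasize its high data volume compared with the encoded video data 204 (or encoded video code stream 204). The video picture stream 202 may be processed by an electronic apparatus 220, which includes a video encoding apparatus 203 coupled to the video source 201. The video encoding apparatus 203 may include hardware, software, or a combination of both to enable or implement aspects of the disclosed subject matter as described in more detail below. The encoded video data 204 (or encoded video code stream 204) is depicted as a thin line to emphasize its lower data volume compared with the video picture stream 202, and may be stored on a streaming server 205 for future use. One or more streaming client subsystems, such as the client subsystem 206 and the client subsystem 208 in FIG. 2, may access the streaming server 205 to retrieve a copy 207 and a copy 209 of the encoded video data 204. The client subsystem 206 may include, for example, a video decoding apparatus 210 in an electronic apparatus 230. The video decoding apparatus 210 decodes the incoming copy 207 of the encoded video data and produces an output video picture stream 211 that can be presented on a display 212 (for example, a display screen) or another presentation apparatus. In some streaming systems, the encoded video data 204, video data 207, and video data 209 (for example, video code streams) may be encoded according to certain video coding/compression standards. Examples of such standards include ITU-T H.265. In an embodiment, a video coding standard under development is informally known as next-generation video coding (Versatile Video Coding, VVC), and the present application may be used in the context of the VVC standard.
It should be noted that the electronic apparatus 220 and the electronic apparatus 230 may include other components not shown in the figure. For example, the electronic apparatus 220 may include a video decoding apparatus, and the electronic apparatus 230 may also include a video encoding apparatus.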
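In an embodiment of the present application, taking the international video coding standards HEVC (High Efficiency Video Coding) and VVC (Versatile Video Coding) and the Chinese national video coding standard AVS as examples, after a video frame image is input, it is divided according to a block size into several non-overlapping processing units, each of which undergoes a similar compression operation. This processing unit is called a CTU (Coding Tree Unit) or an LCU (Largest Coding Unit). A CTU can be divided further and more finely to obtain one or more basic coding units (CUs); the CU is the most basic element in an encoding pass. Some concepts involved in encoding a CU are introduced below:
Predictive Coding: predictive coding includes intra prediction, inter prediction, and other modes. The original video signal is predicted from a selected, already reconstructed video signal to obtain a residual video signal. The encoding end needs to decide which predictive coding mode to select for the current CU and inform the decoding end. Intra prediction means that the predicted signal comes from a region of the same image that has already been encoded and reconstructed; inter prediction means that the predicted signal comes from another, already encoded image different from the current image (called a reference image).
Transform & Quantization: the residual video signal is converted into the transform domain by a transform operation such as DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), yielding what are called transform coefficients. The transform coefficients then undergo a lossy quantization operation that discards some information so that the quantized signal is easier to compress. In some video coding standards more than one transform may be available, so the encoding end also needs to select one of them for the current CU and inform the decoding end. The fineness of quantization is usually determined by the quantization parameter (Quantization Parameter, QP). A larger QP means that coefficients over a wider range of values are quantized to the same output, which usually brings greater distortion and a lower bit rate; conversely, a smaller QP means that coefficients over a narrower range of values are quantized to the same output, which usually brings less distortion and a correspondingly higher bit rate.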
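The trade-off between quantization coarseness and distortion can be illustrated with a minimal uniform scalar quantizer. This is only a sketch: the parameter `step` is a hypothetical stand-in for the step size that a real codec derives from QP by a standard-specific rule, and the function names are illustrative.

```python
def quantize(coeffs, step):
    """Map each transform coefficient to an integer level (lossy)."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from the integer levels."""
    return [level * step for level in levels]


def total_abs_error(coeffs, step):
    """Sum of absolute reconstruction errors for a given step size."""
    recon = dequantize(quantize(coeffs, step), step)
    return sum(abs(c - r) for c, r in zip(coeffs, recon))
```

With a larger step, more coefficients collapse to the same level (often zero), so fewer distinct values need to be coded, at the cost of a larger reconstruction error.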
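Entropy Coding or statistical coding: the quantized transform-domain signal is statistically compressed according to how frequently each value occurs, and a binarized (0 or 1) compressed code stream is finally output. Other information produced by encoding, such as the selected coding mode and motion vector data, also needs to be entropy coded to reduce the bit rate. Statistical coding is a lossless coding method that can effectively reduce the bit rate needed to express the same signal. Common statistical coding methods include variable length coding (Variable Length Coding, VLC) and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
Loop Filtering: a signal that has been transformed and quantized is reconstructed into an image through inverse quantization, inverse transform, and prediction compensation. Because of quantization, part of the information in the reconstructed image differs from the original image; that is, the reconstructed image exhibits distortion. A filtering operation can therefore be applied to the reconstructed image, using filters such as the deblocking filter (DB), SAO (Sample Adaptive Offset), or ALF (Adaptive Loop Filter), which can effectively reduce the distortion caused by quantization. Because these filtered reconstructed images serve as references for subsequently encoded images in predicting future image signals, the above filtering operation is also called loop filtering, that is, filtering within the encoding loop.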
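In an embodiment of the present application, FIG. 3 shows a basic flowchart of a video encoder, described here using intra prediction as an example. The difference between the original image signal s_k[x,y] and the predicted image signal [formula image PCTCN2022071732-appb-000001] is computed to obtain the residual signal u_k[x,y]. The residual signal u_k[x,y] is transformed and quantized to obtain quantized coefficients. The quantized coefficients are, on the one hand, entropy coded to obtain the encoded bit stream and, on the other hand, inverse quantized and inverse transformed to obtain the reconstructed residual signal u'_k[x,y]. The predicted image signal [formula image PCTCN2022071732-appb-000002] is added to the reconstructed residual signal u'_k[x,y] to generate the image signal [formula image PCTCN2022071732-appb-000003]. The image signal [formula image PCTCN2022071732-appb-000004] is, on the one hand, input to the intra mode decision module and the intra prediction module for intra prediction processing and, on the other hand, passed through loop filtering to output the reconstructed image signal s'_k[x,y], which can serve as the reference image of the next frame for motion estimation and motion-compensated prediction. The predicted image signal of the next frame [formula image PCTCN2022071732-appb-000006] is then obtained based on the motion-compensated prediction result s'_r[x+m_x, y+m_y] and the intra prediction result [formula image PCTCN2022071732-appb-000005], and the above process is repeated until encoding is complete.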
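Based on the above encoding process, at the decoding end, for each CU, after the compressed code stream (that is, the bit stream) is obtained, entropy decoding is performed to obtain the various mode information and the quantized coefficients. The quantized coefficients are then inverse quantized and inverse transformed to obtain the residual signal. In addition, the prediction signal corresponding to the CU can be obtained from the known coding mode information; adding the residual signal to the prediction signal yields the reconstructed signal, which then undergoes loop filtering and other operations to produce the final output signal.
In the above encoding and decoding process, transforming the residual signal concentrates its energy in a small number of low-frequency coefficients, so most coefficient values are small. After the subsequent quantization module, the small coefficient values become zero, which greatly reduces the cost of encoding the residual signal. However, because residual distributions are diverse, a single DCT transform cannot adapt to all residual characteristics. Transform kernels such as DST7 and DCT8 have therefore been introduced into the transform process, and the horizontal and vertical transforms applied to the residual signal may use different kernels. Taking the AMT (Adaptive multiple core transform) technique as an example, the transform combinations that may be selected for a residual signal are: (DCT2, DCT2), (DCT8, DCT8), (DCT8, DST7), (DST7, DCT8), and (DST7, DST7).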
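AVS3 also describes the sub-block transform (Sub-Block Transform, SBT) technique. FIG. 4 shows a diagram of the parsing and derivation of SBT division. SBT has 8 sub-block division results; only the gray part of the sub-blocks (the non-zero residual sub-block) is transform coded, while the white part (the zero residual sub-blocks) is forcibly set to zero. Specifically, Table 1 below shows the relevant description of SBT in the AVS3 standard:
Figure PCTCN2022071732-appb-000007
Table 1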
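Referring to Table 1, when the SBT enable flag SbtEnableFlag is enabled and conditions on the width and height of the coding block are satisfied, sbt_cu_flag is parsed and assigned to SbtCuFlag (value 0 or 1). If SbtCuFlag is 1, the conditional assignments SbtVerHalf=width>=8, SbtVerQuad=width>=16, SbtHorHalf=height>=8, and SbtHorQuad=height>=16 are performed. Here SbtVerHalf is the vertical bisection mode flag, SbtVerQuad is the vertical quartering mode flag, SbtHorHalf is the horizontal bisection mode flag, and SbtHorQuad is the horizontal quartering mode flag.
That is, in the current AVS3 standard, the size of a coding block that can use SBT must satisfy the following condition: (width<=64)&&(height<=64)&&(width>4||height>4). On this basis, the coding block size is further restricted for each division mode, as follows:
For the horizontal quartering mode: SbtHorQuad=height>=16?1:0, where a SbtHorQuad value of 1 means that the horizontal quartering mode is allowed and a value of 0 means that it is prohibited.
For the horizontal bisection mode: SbtHorHalf=height>=8?1:0, where a SbtHorHalf value of 1 means that the horizontal bisection mode is allowed and a value of 0 means that it is prohibited.
For the vertical quartering mode: SbtVerQuad=width>=16?1:0, where a SbtVerQuad value of 1 means that the vertical quartering mode is allowed and a value of 0 means that it is prohibited.
For the vertical bisection mode: SbtVerHalf=width>=8?1:0, where a SbtVerHalf value of 1 means that the vertical bisection mode is allowed and a value of 0 means that it is prohibited.
Continuing with Table 1, if SbtVerHalf or SbtHorHalf is 1, and SbtVerQuad or SbtHorQuad is 1, then sbt_quad_flag (value 0 or 1) is parsed. If SbtQuadFlag, SbtVerQuad, and SbtHorQuad are all 1, or !SbtQuadFlag, SbtVerHalf, and SbtHorHalf are all 1, then sbt_dir_flag (value 0 or 1) is parsed.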
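The baseline conditions described above can be collected into a few small predicates. The following is a minimal sketch built only from the conditions stated in the text; the class and function names are illustrative and do not come from the AVS3 specification, and what happens when a flag is not parsed is not modelled because the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SbtFlags:
    ver_half: bool  # SbtVerHalf
    ver_quad: bool  # SbtVerQuad
    hor_half: bool  # SbtHorHalf
    hor_quad: bool  # SbtHorQuad


def sbt_size_ok(width, height):
    """(width<=64)&&(height<=64)&&(width>4||height>4)."""
    return width <= 64 and height <= 64 and (width > 4 or height > 4)


def baseline_flags(width, height):
    """Per-mode size limits of the current scheme, before any ratio limit."""
    return SbtFlags(width >= 8, width >= 16, height >= 8, height >= 16)


def parses_quad_flag(f):
    """sbt_quad_flag is parsed only if some half mode and some quad mode exist."""
    return (f.ver_half or f.hor_half) and (f.ver_quad or f.hor_quad)


def parses_dir_flag(f, quad_flag):
    """sbt_dir_flag is parsed only if both directions are open for the chosen split."""
    if quad_flag:
        return f.ver_quad and f.hor_quad
    return f.ver_half and f.hor_half
```

For a 64×8 block, for example, the horizontal quartering mode is unavailable, so once quartering is chosen there is only one possible direction and `sbt_dir_flag` does not need to be parsed.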
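On the basis of Table 1 and with reference to FIG. 4, the division mode and the coded sub-block are determined as follows; in every case only the indicated sub-block (gray part) is transform coded and the other sub-blocks (white part) are forcibly set to zero. If quad is 1, dir is 0, and pos is 0, the vertical quartering mode is used and only the left sub-block is transform coded. If quad is 1, dir is 0, and pos is 1, the vertical quartering mode is used and only the right sub-block is transform coded. If quad is 1, dir is 1, and pos is 0, the horizontal quartering mode is used and only the upper sub-block is transform coded. If quad is 1, dir is 1, and pos is 1, the horizontal quartering mode is used and only the lower sub-block is transform coded. If quad is 0, dir is 0, and pos is 0, the vertical bisection mode is used and only the left sub-block is transform coded. If quad is 0, dir is 0, and pos is 1, the vertical bisection mode is used and only the right sub-block is transform coded. If quad is 0, dir is 1, and pos is 0, the horizontal bisection mode is used and only the upper sub-block is transform coded. If quad is 0, dir is 1, and pos is 1, the horizontal bisection mode is used and only the lower sub-block is transform coded.
In FIG. 4, quad can be regarded as the syntax controlling the sub-block size, and dir and pos can both be regarded as syntax controlling the sub-block position. In other words, quad determines the size (area) of the sub-block, while dir and pos determine its position. The quad, dir, and pos in FIG. 4 correspond to sbt_quad_flag, sbt_dir_flag, and sbt_pos_flag in the AVS3 text, respectively. sbt_quad_flag and sbt_dir_flag can be parsed directly from the code stream (see Table 1), whereas sbt_pos_flag is derived implicitly from coefficient statistics. Together, sbt_quad_flag, sbt_dir_flag, and sbt_pos_flag determine the specific SBT division mode.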
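The eight cases above reduce to a small mapping from (quad, dir, pos) to a mode and a coded region. The sketch below assumes a top-left origin and an (x, y, w, h) rectangle convention, neither of which is stated in the text; the function name and mode labels are likewise illustrative.

```python
def sbt_partition(width, height, quad, direction, pos):
    """Return (mode_name, (x, y, w, h)) for the non-zero residual sub-block.

    quad: 1 = quartering (coded sub-block is 1/4), 0 = bisection (1/2)
    direction: 0 = vertical split (left/right), 1 = horizontal split (upper/lower)
    pos: 0 = left or upper sub-block, 1 = right or lower sub-block
    """
    parts = 4 if quad else 2
    size_name = "quad" if quad else "half"
    if direction == 0:
        sub_w = width // parts
        x = 0 if pos == 0 else width - sub_w
        return "vertical_" + size_name, (x, 0, sub_w, height)
    sub_h = height // parts
    y = 0 if pos == 0 else height - sub_h
    return "horizontal_" + size_name, (0, y, width, sub_h)
```

Everything outside the returned rectangle corresponds to the white part of FIG. 4, whose residual is forced to zero.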
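As can be seen from FIG. 4, the 8 SBT division modes can be classified by division manner and division direction as shown in FIG. 5. The horizontal and vertical modes divide the current coding block in the horizontal and vertical directions, respectively; the quartering and bisection modes indicate that a sub-block of 1/4 or 1/2 of the block, respectively, is split off along the corresponding division direction for coding.
In the 8 SBT division modes, the gray sub-block is not divided further, and coefficient coding is performed on it directly.
Regarding the selection of transform combinations for a sub-block: when the width or height of the non-zero residual sub-block is 64, both the horizontal and vertical transforms of that sub-block are DCT-2; in other cases the horizontal and vertical transforms are selected as shown in FIG. 6. In FIG. 6, the gray part of the sub-blocks is coded with the horizontal and vertical transforms indicated, while the white part is forcibly set to zero. In FIG. 6, w1 can be 1/2 or 1/4 of w (the width), and h1 can be 1/2 or 1/4 of h (the height).
In addition, the encoder can be configured to skip the transform when coding the residual of the gray part and to perform quantization and coefficient coding directly.
Although the current SBT scheme in the AVS3 standard places certain restrictions on the coding block size, dividing a coding block may still produce excessively narrow sub-blocks. For example, horizontally bisecting a 64×8 (width × height) coding block produces a 64×4 transform block, which is too narrow in the horizontal direction; horizontally quartering a 64×16 (width × height) coding block also produces a 64×4 transform block, which is likewise too narrow. Excessively narrow transform blocks degrade hardware performance and affect encoding and decoding efficiency. The technical solutions of the embodiments of the present application therefore further restrict coding blocks that use the SBT technique, which avoids producing excessively narrow sub-blocks, safeguards hardware performance, and improves encoding and decoding efficiency. Details are as follows: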
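FIG. 7 shows a flowchart of a video decoding method according to an embodiment of the present application. The video decoding method can be performed by a device with computing and processing capability, such as a terminal device or a server. Referring to FIG. 7, the video decoding method includes at least steps S710 to S740, described in detail below:
In step S710, a coding block of a video image frame is obtained.
In an embodiment of the present application, a sequence of video image frames includes a series of images. Each image can be further divided into slices, and a slice can be divided into a series of LCUs (or CTUs), each containing several CUs. Video image frames are encoded block by block. In some video coding standards, for example the H.264 standard, there are macroblocks (MBs), which can be further divided into multiple prediction blocks usable for predictive coding. The HEVC standard adopts basic concepts such as the coding unit (CU), prediction unit (PU), and transform unit (TU), divides block units by function, and describes them with a new tree-based structure. For example, a CU can be divided into smaller CUs by a quadtree, and the smaller CUs can be divided further, forming a quadtree structure. The coding block in the embodiments of the present application may be a CU or a block smaller than a CU, such as a smaller block obtained by dividing a CU.
In step S720, if the coding block requires sub-block transform, the division modes that the coding block can adopt are determined according to the size of the coding block and the allowed sub-block maximum size ratio.
In an embodiment of the present application, whether the coding block needs to use the sub-block transform technique can be determined from an explicit index and/or an implicitly derived index contained in the coding block.
In an embodiment of the present application, the sub-block maximum size ratio may include a maximum width-to-height ratio and a maximum height-to-width ratio, with the two values equal. In this case, the value of the sub-block maximum size ratio can be decoded from the sequence header of the video image frame.
In an embodiment of the present application, the sub-block maximum size ratio may include a maximum width-to-height ratio and a maximum height-to-width ratio, with the two values not equal. In this case, the value of the maximum width-to-height ratio and the value of the maximum height-to-width ratio can be decoded separately from the sequence header of the video image frame.
In an embodiment of the present application, the value of the sub-block maximum size ratio can reuse the value of the coding block maximum size ratio allowed in the standard; that is, the value of the sub-block maximum size ratio may be the same as the value of the allowed coding block maximum size ratio.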
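In an embodiment of the present application, the process of determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the width of the coding block is greater than or equal to 8, and 2 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, the coding block is allowed to use the vertical bisection mode; otherwise the coding block is prohibited from using the vertical bisection mode. In this embodiment, the sub-block maximum size ratio is the maximum height-to-width ratio.
In an embodiment of the present application, the process of determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the width of the coding block is greater than or equal to 16, and 4 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, the coding block is allowed to use the vertical quartering mode; otherwise the coding block is prohibited from using the vertical quartering mode. In this embodiment, the sub-block maximum size ratio is the maximum height-to-width ratio.
In an embodiment of the present application, the process of determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the height of the coding block is greater than or equal to 8, and 2 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, the coding block is allowed to use the horizontal bisection mode; otherwise the coding block is prohibited from using the horizontal bisection mode. In this embodiment, the sub-block maximum size ratio is the maximum width-to-height ratio.
In an embodiment of the present application, the process of determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio specifically includes: if the height of the coding block is greater than or equal to 16, and 4 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, the coding block is allowed to use the horizontal quartering mode; otherwise the coding block is prohibited from using the horizontal quartering mode. In this embodiment, the sub-block maximum size ratio is the maximum width-to-height ratio.
Continuing with FIG. 7, in step S730, the target division mode used when performing sub-block transform processing on the coding block is determined based on the division modes that the coding block can adopt.
In an embodiment of the present application, the target division mode used when performing sub-block transform processing on the coding block can be determined with reference to the technical solutions of the preceding embodiments (such as the schemes shown in Table 1 and FIG. 4), which are not repeated here.
In step S740, the coding block is divided based on the target division mode, and the multiple sub-blocks obtained by division are decoded.
In an embodiment of the present application, in the process of decoding the multiple sub-blocks obtained by division, a sub-block may undergo either a transform skip operation or an inverse transform operation. This process is the inverse of the process at the encoding end: if the encoding end performed the transform skip operation on a sub-block, the decoding end also performs the transform skip operation on that sub-block; if the encoding end performed the transform operation on a sub-block, the decoding end performs the inverse transform operation on that sub-block.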
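In summary, on top of the existing SBT restrictions on coding block size, the technical solutions of the embodiments of the present application add the following restrictions:
For the horizontal quartering mode: SbtHorQuad=(width×4<=height×MaxPartRatio)?SbtHorQuad:0, where a SbtHorQuad value of 1 means that the horizontal quartering mode is allowed and a value of 0 means that it is prohibited.
For the horizontal bisection mode: SbtHorHalf=(width×2<=height×MaxPartRatio)?SbtHorHalf:0, where a SbtHorHalf value of 1 means that the horizontal bisection mode is allowed and a value of 0 means that it is prohibited.
For the vertical quartering mode: SbtVerQuad=(height×4<=width×MaxPartRatio)?SbtVerQuad:0, where a SbtVerQuad value of 1 means that the vertical quartering mode is allowed and a value of 0 means that it is prohibited.
For the vertical bisection mode: SbtVerHalf=(height×2<=width×MaxPartRatio)?SbtVerHalf:0, where a SbtVerHalf value of 1 means that the vertical bisection mode is allowed and a value of 0 means that it is prohibited.
Here MaxPartRatio is the allowed sub-block maximum size ratio, which includes a maximum width-to-height ratio and a maximum height-to-width ratio (both non-zero integers). Usually the two are equal. If they are not equal, then for the horizontal quartering mode and the horizontal bisection mode the maximum size ratio can take the maximum width-to-height ratio; for the vertical quartering mode and the vertical bisection mode it can take the maximum height-to-width ratio.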
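The four added conditions can be combined with the existing size limits in one function. This is a minimal sketch that assumes the ratio values have already been obtained (for example, decoded from the sequence header); the function name, parameter names, and the dictionary return type are illustrative choices, not part of the described syntax.

```python
def restricted_sbt_flags(width, height, max_wh_ratio, max_hw_ratio=None):
    """Allowed SBT division modes under the added sub-block ratio limit.

    max_wh_ratio: maximum width-to-height ratio (used by horizontal modes)
    max_hw_ratio: maximum height-to-width ratio (used by vertical modes);
                  if omitted, the two ratios are taken to be equal.
    """
    if max_hw_ratio is None:
        max_hw_ratio = max_wh_ratio
    return {
        "SbtVerHalf": width >= 8 and height * 2 <= width * max_hw_ratio,
        "SbtVerQuad": width >= 16 and height * 4 <= width * max_hw_ratio,
        "SbtHorHalf": height >= 8 and width * 2 <= height * max_wh_ratio,
        "SbtHorQuad": height >= 16 and width * 4 <= height * max_wh_ratio,
    }
```

Integer multiplication is used on both sides of each comparison, mirroring the expressions in the text and avoiding any division.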
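In an embodiment of the present application, if the maximum width-to-height ratio and the maximum height-to-width ratio are equal, the value can be encoded in the sequence header when SBT is turned on, and the coding block maximum width-to-height ratio allowed in the existing AVS3 standard can also be reused; that is, the value of the sub-block maximum size ratio may be the same as the value of the allowed coding block maximum size ratio. If the maximum width-to-height ratio and the maximum height-to-width ratio are not equal, they can be encoded separately in the sequence header when SBT is turned on.
Based on the restrictions added in the embodiments of the present application, the description of SBT can be modified as shown in Table 2:
Figure PCTCN2022071732-appb-000008
Figure PCTCN2022071732-appb-000009
Table 2
Based on the restrictions shown in Table 2, in an embodiment of the present application, if MaxPartRatio=8, then for an 8×64 coding block, vertical bisection into 4×64 sub-blocks is prohibited; for a 16×64 coding block, vertical quartering into 4×64 sub-blocks is prohibited; for a 64×8 coding block, horizontal bisection into 64×4 sub-blocks is prohibited; and for a 64×16 coding block, horizontal quartering into 64×4 sub-blocks is prohibited. The technical solutions of the embodiments of the present application can thus restrict the division modes of a coding block by the allowed sub-block maximum size ratio and the size of the coding block, which avoids producing excessively narrow sub-blocks, safeguards hardware performance, and improves encoding and decoding efficiency.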
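The restriction can be read as a bound on the shape of the coded sub-block. The short derivation below is a sketch of that reading for the horizontal modes, with W, H, n, and R introduced here as shorthand for the coding block width, the coding block height, the split factor (2 or 4), and MaxPartRatio; the vertical modes follow by swapping W and H.

```latex
\[
\text{sub-block size: } W \times \frac{H}{n}
\quad\Longrightarrow\quad
\frac{W}{H/n} = \frac{nW}{H} \le R
\iff nW \le H \cdot R .
\]
\[
\text{With } R = 8:\quad
64 \times 8,\ n = 2:\ 2 \cdot 64 = 128 > 8 \cdot 8 = 64;
\qquad
64 \times 16,\ n = 4:\ 4 \cdot 64 = 256 > 16 \cdot 8 = 128 .
\]
```

In both cases the resulting 64×4 sub-block has a width-to-height ratio of 16, which exceeds 8, so the mode is prohibited.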
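In an embodiment, a video encoding method needs to be performed at the encoding end. The video encoding method can be performed by a device with computing and processing capability at the encoder end, such as a terminal device or a server. The video encoding method may include: after the residual coefficient block of the video image frame is obtained, determining the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and the allowed sub-block maximum size ratio, then selecting a target division mode for the residual coefficient block from the division modes that can be adopted, and then encoding the information of the target division mode into the code stream.
In an embodiment, the encoding end may, based on RDO (Rate Distortion Optimization), select an optimal division mode from the division modes that the residual coefficient block can adopt as the target division mode.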
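The text does not specify how the RDO decision is computed. A common formulation, assumed here purely for illustration, is to pick the candidate that minimizes the Lagrangian cost J = D + λ·R. In the sketch below, the candidate dictionary, its (distortion, rate) values, and the multiplier `lam` are hypothetical inputs that an encoder would obtain by trial-coding each allowed division mode.

```python
def select_target_mode(candidates, lam):
    """Pick the division mode with the lowest rate-distortion cost.

    candidates: dict mapping mode name -> (distortion, rate_in_bits),
                containing only modes allowed by the size and ratio limits.
    lam: Lagrange multiplier trading rate against distortion.
    Returns the chosen mode name, or None if no mode is allowed.
    """
    if not candidates:
        return None

    def cost(mode):
        distortion, rate = candidates[mode]
        return distortion + lam * rate

    return min(candidates, key=cost)
```

Because prohibited modes never enter `candidates`, the ratio restriction also reduces the number of trial encodings the encoder has to run.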
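The following describes apparatus embodiments of the present application, which can be used to perform the video decoding method in the above embodiments of the present application. For details not disclosed in the apparatus embodiments, refer to the above embodiments of the video decoding method.
FIG. 8 shows a block diagram of a video decoding apparatus according to an embodiment of the present application. The video decoding apparatus can be provided in a device with computing and processing capability, such as a terminal device or a server.
Referring to FIG. 8, a video decoding apparatus 800 according to an embodiment of the present application includes: an obtaining unit 802, a first processing unit 804, a second processing unit 806, and a third processing unit 808.
The obtaining unit 802 is configured to obtain a coding block of a video image frame. The first processing unit 804 is configured to, if the coding block requires sub-block transform, determine the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio. The second processing unit 806 is configured to determine, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block. The third processing unit 808 is configured to divide the coding block based on the target division mode and to decode the multiple sub-blocks obtained by division.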
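In some embodiments of the present application, based on the foregoing solution, the first processing unit 804 is configured to: if the width of the coding block is greater than or equal to 8, and 2 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allow the coding block to use the vertical bisection mode; otherwise prohibit the coding block from using the vertical bisection mode.
In some embodiments of the present application, based on the foregoing solution, the first processing unit 804 is configured to: if the width of the coding block is greater than or equal to 16, and 4 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allow the coding block to use the vertical quartering mode; otherwise prohibit the coding block from using the vertical quartering mode.
In some embodiments of the present application, based on the foregoing solution, the sub-block maximum size ratio is the maximum height-to-width ratio.
In some embodiments of the present application, based on the foregoing solution, the first processing unit 804 is configured to: if the height of the coding block is greater than or equal to 8, and 2 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allow the coding block to use the horizontal bisection mode; otherwise prohibit the coding block from using the horizontal bisection mode.
In some embodiments of the present application, based on the foregoing solution, the first processing unit 804 is configured to: if the height of the coding block is greater than or equal to 16, and 4 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allow the coding block to use the horizontal quartering mode; otherwise prohibit the coding block from using the horizontal quartering mode.
In some embodiments of the present application, based on the foregoing solution, the sub-block maximum size ratio is the maximum width-to-height ratio of the sub-block.
In some embodiments of the present application, based on the foregoing solution, the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is equal to the value of the maximum height-to-width ratio; the video decoding apparatus 800 further includes: a first decoding unit configured to decode the value of the sub-block maximum size ratio from the sequence header of the video image frame.
In some embodiments of the present application, based on the foregoing solution, the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is not equal to the value of the maximum height-to-width ratio; the video decoding apparatus 800 further includes: a second decoding unit configured to decode the value of the maximum width-to-height ratio and the value of the maximum height-to-width ratio separately from the sequence header of the video image frame.
In some embodiments of the present application, based on the foregoing solution, the value of the sub-block maximum size ratio is the same as the value of the allowed coding block maximum size ratio.
In some embodiments of the present application, based on the foregoing solution, the third processing unit 808 is configured to: perform a transform skip operation or an inverse transform operation in the process of decoding the multiple sub-blocks obtained by division.
Embodiments of the present application also provide a video encoding apparatus, which can be provided in a device with computing and processing capability at the encoder end, such as a terminal device or a server. The video encoding apparatus may include: an obtaining unit configured to obtain a residual coefficient block of a video image frame; a fourth processing unit configured to determine the division modes that the residual coefficient block can adopt according to the size of the residual coefficient block and the allowed sub-block maximum size ratio; a selection unit configured to select a target division mode for the residual coefficient block from the division modes that can be adopted; and an encoding unit configured to encode the information of the target division mode into the code stream.
In the technical solutions provided by some embodiments of the present application, when a coding block requires sub-block transform, the division modes that the coding block can adopt are determined according to the size of the coding block and the allowed sub-block maximum size ratio, so that the division modes of the coding block are restricted by the allowed sub-block maximum size ratio and the size of the coding block. This avoids producing excessively narrow sub-blocks, safeguards hardware performance, and improves encoding and decoding efficiency.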
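FIG. 9 shows a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.
It should be noted that the computer system 900 of the electronic device shown in FIG. 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 9, the computer system 900 includes a central processing unit (Central Processing Unit, CPU) 901, which can perform various appropriate actions and processes, such as the methods described in the above embodiments, according to a program stored in a read-only memory (Read-Only Memory, ROM) 902 or a program loaded from a storage part 908 into a random access memory (Random Access Memory, RAM) 903. The RAM 903 also stores various programs and data required for system operation. The CPU 901, the ROM 902, and the RAM 903 are connected to one another through a bus 904. An input/output (Input/Output, I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input part 906 including a keyboard, a mouse, and the like; an output part 907 including a cathode ray tube (Cathode Ray Tube, CRT), a liquid crystal display (Liquid Crystal Display, LCD), or the like, as well as a speaker and the like; a storage part 908 including a hard disk and the like; and a communication part 909 including a network interface card such as a LAN (Local Area Network) card or a modem. The communication part 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read from it can be installed into the storage part 908 as needed.
According to the embodiments of the present application, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, the embodiments of the present application include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing a computer program for performing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication part 909, and/or installed from the removable medium 911. When the computer program is executed by the central processing unit (CPU) 901, the various functions defined in the system of the present application are performed.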
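It should be noted that the computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), flash memory, optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying a computer-readable computer program. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. A computer program contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, and the like, or any suitable combination of the above.
In another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist alone without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.
Other implementations of the present application will readily occur to those skilled in the art after considering the specification and practicing the implementations disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art not disclosed in the present application.
It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.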

Claims (16)

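  1. A video decoding method, comprising:
    obtaining a coding block of a video image frame;
    if the coding block requires sub-block transform, determining the division modes that the coding block can adopt according to the size of the coding block and an allowed sub-block maximum size ratio;
    determining, based on the division modes that the coding block can adopt, a target division mode used when performing sub-block transform processing on the coding block; and
    dividing the coding block based on the target division mode, and decoding the multiple sub-blocks obtained by division.
  2. The video decoding method according to claim 1, wherein determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio comprises:
    if the width of the coding block is greater than or equal to 8, and 2 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allowing the coding block to use a vertical bisection mode; otherwise prohibiting the coding block from using the vertical bisection mode.
  3. The video decoding method according to claim 1, wherein determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio comprises:
    if the width of the coding block is greater than or equal to 16, and 4 times the height of the coding block is less than or equal to the product of the width and the sub-block maximum size ratio, allowing the coding block to use a vertical quartering mode; otherwise prohibiting the coding block from using the vertical quartering mode.
  4. The video decoding method according to claim 2 or 3, wherein the sub-block maximum size ratio is the maximum height-to-width ratio of the sub-block.
  5. The video decoding method according to claim 1, wherein determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio comprises:
    if the height of the coding block is greater than or equal to 8, and 2 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allowing the coding block to use a horizontal bisection mode; otherwise prohibiting the coding block from using the horizontal bisection mode.
  6. The video decoding method according to claim 1, wherein determining the division modes that the coding block can adopt according to the size of the coding block and the allowed sub-block maximum size ratio comprises:
    if the height of the coding block is greater than or equal to 16, and 4 times the width of the coding block is less than or equal to the product of the height and the sub-block maximum size ratio, allowing the coding block to use a horizontal quartering mode; otherwise prohibiting the coding block from using the horizontal quartering mode.
  7. The video decoding method according to claim 5 or 6, wherein the sub-block maximum size ratio is the maximum width-to-height ratio of the sub-block.
  8. The video decoding method according to claim 1, wherein the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is equal to the value of the maximum height-to-width ratio;
    the video decoding method further comprises: decoding the value of the sub-block maximum size ratio from the sequence header of the video image frame.
  9. The video decoding method according to claim 1, wherein the sub-block maximum size ratio includes a maximum width-to-height ratio and a maximum height-to-width ratio, and the value of the maximum width-to-height ratio is not equal to the value of the maximum height-to-width ratio;
    the video decoding method further comprises: decoding the value of the maximum width-to-height ratio and the value of the maximum height-to-width ratio separately from the sequence header of the video image frame.
  10. The video decoding method according to claim 1, wherein the value of the sub-block maximum size ratio is the same as the value of the allowed coding block maximum size ratio.
  11. The video decoding method according to claim 1, wherein a transform skip operation or an inverse transform operation is performed in the process of decoding the multiple sub-blocks obtained by division.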
  12. 一种视频编码方法,包括:
    获取视频图像帧的残差系数块;
    根据所述残差系数块的尺寸和所允许的子块最大尺寸比确定所述残差系数块能够采用的划分模式;
    从所述能够采用的划分模式中选择针对所述残差系数块的目标划分模式;
    将所述目标划分模式的信息编码至码流中。
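  13. A video decoding apparatus, comprising:
    an obtaining unit, configured to obtain a coding block of a video image frame;
    a first processing unit, configured to, if the coding block requires sub-block transform, determine, according to the size of the coding block and an allowed maximum sub-block size ratio, the partition modes available to the coding block;
    a second processing unit, configured to determine, based on the partition modes available to the coding block, a target partition mode to be used when performing sub-block transform processing on the coding block;
    a third processing unit, configured to partition the coding block based on the target partition mode and decode the plurality of sub-blocks obtained by the partitioning.
  14. A video encoding apparatus, comprising:
    an obtaining unit, configured to obtain a residual coefficient block of a video image frame;
    a fourth processing unit, configured to determine, according to the size of the residual coefficient block and an allowed maximum sub-block size ratio, the partition modes available to the residual coefficient block;
    a selection unit, configured to select, from the available partition modes, a target partition mode for the residual coefficient block;
    an encoding unit, configured to encode information about the target partition mode into a bitstream.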
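  15. An electronic device, comprising:
    one or more processors;
    a storage apparatus, configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 12.
  16. A computer-readable medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 12.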
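
The size conditions recited in claims 2 to 7 can be illustrated with a short sketch. This is not part of the claims and is not the applicant's implementation. The names `PartitionMode` and `allowed_partition_modes` and the parameter names are hypothetical, chosen only for illustration. The sketch assumes integer block dimensions and treats the maximum height-to-width ratio (claims 2 to 4) and the maximum width-to-height ratio (claims 5 to 7) as separate inputs that default to the same value, as in claim 8.

```python
from enum import Enum


class PartitionMode(Enum):
    """Hypothetical labels for the four sub-block partition modes in the claims."""
    VERTICAL_BINARY = "vertical_binary"
    VERTICAL_QUATERNARY = "vertical_quaternary"
    HORIZONTAL_BINARY = "horizontal_binary"
    HORIZONTAL_QUATERNARY = "horizontal_quaternary"


def allowed_partition_modes(width, height, max_height_to_width, max_width_to_height=None):
    """Return the partition modes permitted by the conditions of claims 2 to 7.

    max_height_to_width bounds sub-blocks produced by vertical partitions;
    max_width_to_height bounds sub-blocks produced by horizontal partitions.
    If only one ratio is given, both are assumed equal (the case of claim 8).
    """
    if max_width_to_height is None:
        max_width_to_height = max_height_to_width

    modes = []
    # Claim 2: width >= 8 and 2 * height <= width * ratio.
    if width >= 8 and 2 * height <= width * max_height_to_width:
        modes.append(PartitionMode.VERTICAL_BINARY)
    # Claim 3: width >= 16 and 4 * height <= width * ratio.
    if width >= 16 and 4 * height <= width * max_height_to_width:
        modes.append(PartitionMode.VERTICAL_QUATERNARY)
    # Claim 5: height >= 8 and 2 * width <= height * ratio.
    if height >= 8 and 2 * width <= height * max_width_to_height:
        modes.append(PartitionMode.HORIZONTAL_BINARY)
    # Claim 6: height >= 16 and 4 * width <= height * ratio.
    if height >= 16 and 4 * width <= height * max_width_to_height:
        modes.append(PartitionMode.HORIZONTAL_QUATERNARY)
    return modes


if __name__ == "__main__":
    # An 8x32 block with ratio 4: vertical partitions would give sub-blocks that are too tall.
    print([m.name for m in allowed_partition_modes(8, 32, 4)])
```

Each inequality is the sub-block ratio bound with the division cleared. For example, a vertical binary partition of a `width` x `height` block yields sub-blocks of size `width/2` x `height`, so requiring `height / (width/2) <= ratio` is the same as `2 * height <= width * ratio`, which avoids fractional arithmetic.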
PCT/CN2022/071732 2021-02-21 2022-01-13 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device WO2022174701A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/074,855 US20230104359A1 (en) 2021-02-21 2022-12-05 Video encoding method and apparatus, video decoding method and apparatus, computer-readable medium, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110194810.5 2021-02-21
CN202110194810.5A CN114979655A (zh) 2021-02-21 2021-02-21 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/074,855 Continuation US20230104359A1 (en) 2021-02-21 2022-12-05 Video encoding method and apparatus, video decoding method and apparatus, computer-readable medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2022174701A1 WO2022174701A1 (zh) 2022-08-25

Family

ID=82932064

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071732 WO2022174701A1 (zh) 2021-02-21 2022-01-13 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device

Country Status (3)

Country Link
US (1) US20230104359A1 (zh)
CN (1) CN114979655A (zh)
WO (1) WO2022174701A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019016287A1 (en) * 2017-07-19 2019-01-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. APPARATUS AND METHOD FOR ENCODING IMAGES
CN110234008A (zh) * 2019-03-11 2019-09-13 杭州海康威视数字技术股份有限公司 Encoding method, decoding method, and apparatus
WO2019185815A1 (en) * 2018-03-29 2019-10-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Partitioning concepts for block-based picture coding
WO2020262992A1 (ko) * 2019-06-25 2020-12-30 한국전자통신연구원 Image encoding/decoding method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156572A1 (en) * 2019-02-03 2020-08-06 Beijing Bytedance Network Technology Co., Ltd. Unsymmetrical quad-tree partitioning
US20200288130A1 (en) * 2019-03-07 2020-09-10 Qualcomm Incorporated Simplification of sub-block transforms in video coding

Also Published As

Publication number Publication date
CN114979655A (zh) 2022-08-30
US20230104359A1 (en) 2023-04-06

Similar Documents

Publication Publication Date Title
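WO2022078163A1 (zh) Video decoding method, video encoding method, and related apparatus
WO2022063033A1 (zh) Video decoding method, video encoding method, apparatus, computer-readable medium, and electronic device
WO2022078304A1 (zh) Video decoding method and apparatus, computer-readable medium, program, and electronic device
WO2022062880A1 (zh) Video decoding method and apparatus, computer-readable medium, and electronic device
WO2022174660A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN113207002B (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
US20230053118A1 (en) Video decoding method, video coding method, and related apparatus
WO2022174701A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN115209157A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
JP7483029B2 (ja) Video decoding method, video encoding method, apparatus, medium, and electronic device
WO2022116854A1 (zh) Video decoding method and apparatus, readable medium, electronic device, and program product
WO2022174659A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
WO2022037477A1 (zh) Video decoding method and apparatus, computer-readable medium, and electronic device
WO2022174637A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
WO2023202097A1 (zh) Loop filtering method, video encoding and decoding method, apparatus, medium, program product, and electronic device
US20240064298A1 (en) Loop filtering, video encoding, and video decoding methods and apparatus, storage medium, and electronic device
WO2022063040A1 (zh) Video encoding and decoding method, apparatus, and device
WO2022174638A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN115209141A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN115209138A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN115209146A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device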

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22755474

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18/01/2024)