WO2022174637A1 - Video encoding and decoding method, apparatus, computer-readable medium, and electronic device - Google Patents

Publication number
WO2022174637A1
Authority
WO
WIPO (PCT)
Prior art keywords
coefficient
value
area
segment
coefficients
Prior art date
Application number
PCT/CN2021/131610
Inventor
胡晔
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2022174637A1
Priority to US17/989,400 (US20230082386A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Adaptive coding characterised by the coding unit, the region being a block, e.g. a macroblock
    • H04N19/102 Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Transform coding in combination with predictive coding
    • H04N19/70 Methods or arrangements characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present application belongs to the field of computer technology, and in particular relates to a video encoding and decoding method, a video encoding and decoding apparatus, a computer-readable medium, and an electronic device.
  • In the video encoding process, the encoder usually needs to transform, quantize, and entropy-encode the residual data between the original video data and the predicted video data before sending it to the decoder. Because the values of the residual data are sparsely distributed, encoding and decoding a flag for every coefficient to indicate whether it is a zero coefficient introduces unnecessary coding redundancy, which lowers coding efficiency and degrades video compression performance.
  • The purpose of this application is to provide a video encoding and decoding method, a video encoding and decoding apparatus, a computer-readable medium, and an electronic device that overcome, at least to a certain extent, the technical problem of low video encoding and decoding efficiency in the related art.
  • A video decoding method, which is performed by an electronic device.
  • The method includes: in a coding block of a video image frame, segmenting the coefficients to be decoded according to the scanning order of the SRCC scanning area to obtain a coefficient segment composed of a plurality of coefficients; decoding the coefficient segment all-zero flag of the coefficient segment to obtain the value of the flag, where the coefficient segment all-zero flag indicates whether the coefficients in the coefficient segment are all zero; if the value of the coefficient segment all-zero flag is a preset first value, sequentially decoding the valid flag of each coefficient in the coefficient segment according to the scanning order, where the valid flag indicates whether the coefficient is a non-zero coefficient; and if the value of the coefficient segment all-zero flag is a preset second value, assigning zero to the valid flag of each coefficient in the coefficient segment.
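  • The decoding steps above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `decode_significant_flags` is hypothetical, segments are assumed to have a fixed length `seg_len`, the "first value" is assumed to be 0 and the "second value" 1, and the entropy decoder is replaced by an iterator of already-decoded bins. None of these choices is specified by the text.

```python
from typing import Iterator, List

# Assumed flag values (the text only says "preset first/second value").
NOT_ALL_ZERO = 0  # "first value": valid (significant) flags follow
ALL_ZERO = 1      # "second value": every coefficient in the segment is zero


def decode_significant_flags(bins: Iterator[int], num_coeffs: int, seg_len: int) -> List[int]:
    """Return one valid (significant) flag per coefficient, in scan order."""
    flags: List[int] = []
    for start in range(0, num_coeffs, seg_len):
        count = min(seg_len, num_coeffs - start)   # last segment may be shorter
        seg_flag = next(bins)                      # coefficient segment all-zero flag
        if seg_flag == NOT_ALL_ZERO:
            # Decode each coefficient's flag in scan order.
            flags.extend([next(bins) for _ in range(count)])
        else:
            # Nothing to decode: every flag in the segment is set to zero.
            flags.extend([0] * count)
    return flags


print(decode_significant_flags(iter([1, 0, 0, 1, 0, 0]), num_coeffs=8, seg_len=4))
```

  • In this sketch the first segment costs a single bin, while the second costs one bin for the segment flag plus one bin per coefficient.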
  • A video encoding method, which is performed by an electronic device, the method comprising: in a coding block of a video image frame, segmenting the coefficients to be encoded according to the scanning order of the scan region-based coefficient coding (SRCC) scanning area to obtain a coefficient segment consisting of multiple coefficients; determining the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and encoding the coefficient segment all-zero flag;
  • if the value of the coefficient segment all-zero flag is a preset first value, sequentially encoding the valid flag of each coefficient in the coefficient segment according to the scanning order, where the valid flag indicates whether the coefficient is a non-zero coefficient; and if the value of the coefficient segment all-zero flag is a preset second value, skipping the encoding of the valid flags of the coefficients in the coefficient segment.
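  • A minimal encoder-side sketch of the same idea, under the same assumptions as any decoder-side illustration would need: fixed-length segments, 0 as the "first value", 1 as the "second value", and a plain list of bins standing in for the entropy coder. The function name `encode_significant_flags` is hypothetical, and coefficient magnitudes and signs are deliberately left out.

```python
from typing import List, Sequence

# Assumed flag values (the text only says "preset first/second value").
NOT_ALL_ZERO = 0  # "first value": valid (significant) flags are coded
ALL_ZERO = 1      # "second value": per-coefficient flags are skipped


def encode_significant_flags(coeffs: Sequence[int], seg_len: int) -> List[int]:
    """Return the bins (segment flags plus valid flags) for coefficients in scan order."""
    bins: List[int] = []
    for start in range(0, len(coeffs), seg_len):
        segment = coeffs[start:start + seg_len]
        if all(c == 0 for c in segment):
            bins.append(ALL_ZERO)            # one bin covers the whole segment
        else:
            bins.append(NOT_ALL_ZERO)
            bins.extend(1 if c != 0 else 0 for c in segment)
    return bins


coeffs = [0, 0, 0, 0, 0, -3, 0, 0]
bins = encode_significant_flags(coeffs, seg_len=4)
print(bins, "bins instead of", len(coeffs), "per-coefficient flags")
```

  • The saving depends on how sparse the coefficients are: a segment that contains a non-zero coefficient costs one extra bin, so the scheme pays off only when all-zero segments are common.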
  • A video decoding apparatus, comprising: a coefficient segmentation module configured to, in a coding block of a video image frame, segment the coefficients to be decoded according to the scanning order of the SRCC scanning area to obtain a coefficient segment consisting of a plurality of coefficients; a first decoding module configured to decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the flag, where the coefficient segment all-zero flag indicates whether the coefficients in the coefficient segment are all zero; a second decoding module configured to, if the value of the coefficient segment all-zero flag is a preset first value, sequentially decode the valid flag of each coefficient in the coefficient segment according to the scanning order, where the valid flag indicates whether the coefficient is a non-zero coefficient; and a coefficient assignment module configured to, if the value of the coefficient segment all-zero flag is a preset second value, assign zero to the valid flag of each coefficient in the coefficient segment.
  • A video encoding apparatus, comprising: a coefficient segmentation module configured to, in a coding block of a video image frame, segment the coefficients to be encoded according to the scanning order of the scan region-based coefficient coding (SRCC) scanning area to obtain a coefficient segment composed of multiple coefficients; a first encoding module configured to determine the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and to encode the coefficient segment all-zero flag; a second encoding module configured to, if the value of the coefficient segment all-zero flag is a preset first value, sequentially encode the valid flag of each coefficient in the coefficient segment according to the scanning order, where the valid flag indicates whether the coefficient is a non-zero coefficient; and a coding skip module configured to, if the value of the coefficient segment all-zero flag is a preset second value, skip encoding the valid flags of the coefficients in the coefficient segment.
  • A non-volatile computer-readable medium on which a computer program is stored, where the computer program, when executed by a processor, implements the video decoding method and the video encoding method in the above technical solutions.
  • An electronic device, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute the executable instructions to perform the video decoding method and the video encoding method in the above technical solutions.
  • A computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • The processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes them, so that the electronic device performs the video decoding method and the video encoding method in the above technical solutions.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • FIG. 2 shows a schematic diagram of the arrangement of the video encoding device and the video decoding device in the streaming transmission system.
  • FIG. 3 shows a basic flow diagram of a video encoder.
  • FIG. 4 shows the scan area marked by the SRCC technique.
  • FIG. 5 shows a schematic diagram of the sequence of scanning the marked scanning area.
  • FIG. 6 shows a flowchart of steps of a video decoding method in an embodiment of the present application.
  • FIG. 7 shows a flow chart of the steps of decoding the coefficient segment all-zero flag in an embodiment of the present application by means of conventional decoding.
  • FIG. 8 shows a flowchart of steps for determining a context index increment based on the first mode in an embodiment of the present application.
  • FIG. 9 shows a flowchart of steps for determining a context index increment based on the second mode in an embodiment of the present application.
  • FIG. 10 shows a flowchart of steps of a video encoding method in an embodiment of the present application.
  • FIG. 11 shows a structural block diagram of a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 12 shows a structural block diagram of a video decoding apparatus provided by an embodiment of the present application.
  • FIG. 13 schematically shows a structural block diagram of a computer system suitable for implementing the electronic device of the embodiment of the present application.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this application will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • The system architecture 100 includes a plurality of terminal devices that can communicate with each other through, for example, a network 150.
  • The system architecture 100 may include a first terminal device 110 and a second terminal device 120 interconnected by the network 150.
  • The first terminal device 110 and the second terminal device 120 perform unidirectional data transmission.
  • The first terminal device 110 may encode video data (e.g., a video picture stream captured by the terminal device 110) for transmission to the second terminal device 120 through the network 150, and the encoded video data may be encoded in one or more
  • The second terminal device 120 may receive the encoded video data from the network 150, decode the encoded video data to restore the video data, and display video pictures according to the restored video data.
  • The system architecture 100 may include a third terminal device 130 and a fourth terminal device 140 that perform bidirectional transmission of encoded video data, as may occur during a video conference.
  • Each of the third terminal device 130 and the fourth terminal device 140 may encode video data (e.g., a stream of video pictures captured by the terminal device) for transmission over the network 150 to the other of the third terminal device 130 and the fourth terminal device 140.
  • Each of the third terminal device 130 and the fourth terminal device 140 may also receive the encoded video data transmitted by the other of the third terminal device 130 and the fourth terminal device 140, may decode the encoded video data to recover the video data, and may display a video picture on an accessible display device based on the recovered video data.
  • The first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140 may be servers, personal computers, and smart phones, but the principles disclosed in this application are not limited thereto.
  • Embodiments disclosed herein are applicable to laptop computers, tablet computers, media players, and/or dedicated videoconferencing equipment.
  • Network 150 represents any number of networks, including, for example, wired and/or wireless communication networks, that communicate encoded video data between the first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140.
  • Communication network 150 may exchange data in circuit-switched and/or packet-switched channels.
  • The network may include a telecommunications network, a local area network, a wide area network, and/or the Internet.
  • The architecture and topology of network 150 may be immaterial to the operations disclosed herein.
  • FIG. 2 illustrates the placement of a video encoding device and a video decoding device in a streaming environment.
  • The subject matter disclosed herein is equally applicable to other video-enabled applications including, for example, videoconferencing, digital TV (television), and storing compressed video on digital media including CDs, DVDs, memory sticks, and the like.
  • The streaming system may include a capture subsystem 213, which may include a video source 201 such as a digital camera; the video source creates an uncompressed video picture stream 202.
  • The video picture stream 202 includes samples captured by the digital camera.
  • The video picture stream 202 is depicted as a thick line to emphasize its high data volume. It can be processed by the electronic device 220, which includes a video encoding device 203 coupled to the video source 201.
  • Video encoding device 203 may include hardware, software, or a combination of hardware and software to implement various aspects of the disclosed subject matter as described in greater detail below.
  • The encoded video data 204 (or encoded video code stream 204) is depicted as a thin line to emphasize its lower data volume, and it can be stored on the streaming server 205 for future use.
  • One or more streaming client subsystems, such as client subsystem 206 and client subsystem 208 in FIG. 2, may access streaming server 205 to retrieve copies 207 and 209 of encoded video data 204.
  • Client subsystem 206 may include, for example, video decoding device 210 in electronic device 230.
  • The video decoding device 210 decodes the incoming copy 207 of the encoded video data and produces an output video picture stream 211 that can be presented on a display 212 (e.g., a display screen) or another presentation device.
  • Encoded video data 204, video data 207, and video data 209 may be encoded according to certain video encoding/compression standards. Examples of these standards include ITU-T H.265.
  • A video coding standard under development is informally referred to as Versatile Video Coding (VVC), and this application may be used in the context of the VVC standard.
  • Electronic device 220 and electronic device 230 may include other components not shown in the figures.
  • For example, electronic device 220 may include a video decoding device, and electronic device 230 may also include a video encoding device.
  • HEVC: High Efficiency Video Coding
  • VVC: Versatile Video Coding
  • AVS: China National Video Coding Standard
  • Predictive Coding: includes intra-frame prediction and inter-frame prediction. After the original video signal is predicted from the selected reconstructed video signal, a residual video signal is obtained. The encoder needs to decide which predictive coding mode to select for the current CU and inform the decoder. Intra-frame prediction means that the predicted signal comes from an area of the same image that has already been coded and reconstructed; inter-frame prediction means that the predicted signal comes from another, already coded image (called a reference image) that is different from the current image.
  • Transform & Quantization: after the residual video signal undergoes a transform operation such as the discrete Fourier transform (Discrete Fourier Transform, DFT) or the discrete cosine transform (Discrete Cosine Transform, DCT), the signal is converted into the transform domain, where it is called the transform coefficients.
  • The transform coefficients are further subjected to a lossy quantization operation, which discards a certain amount of information so that the quantized signal is easier to represent compactly.
  • The fineness of quantization is usually determined by the Quantization Parameter (QP for short).
  • If the QP value is larger, coefficients spanning a larger value range are quantized to the same output, which usually brings greater distortion and a lower code rate; conversely, if the QP value is smaller, coefficients spanning a smaller value range are quantized to the same output, which usually brings less distortion and corresponds to a higher code rate.
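  • The effect of QP can be illustrated with a small sketch. It assumes an HEVC-style mapping in which the quantization step doubles every 6 QP units, Qstep = 2^((QP - 4) / 6); other standards such as AVS3 use a different mapping, and the helper names `quant_step`, `quantize`, and `dequantize` are hypothetical.

```python
def quant_step(qp: int) -> float:
    """Assumed HEVC-style step size: doubles every 6 QP units."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Map a coefficient to an integer level (round half away from zero)."""
    level = int(abs(coeff) / quant_step(qp) + 0.5)
    return level if coeff >= 0 else -level


def dequantize(level: int, qp: int) -> float:
    """Reconstruct an approximate coefficient from its level."""
    return level * quant_step(qp)


coeffs = (10.0, 12.0, 14.0)
for qp in (22, 34):
    levels = [quantize(c, qp) for c in coeffs]
    recon = [dequantize(l, qp) for l in levels]
    print(f"QP={qp} step={quant_step(qp)} levels={levels} recon={recon}")
```

  • With the larger QP, all three coefficients collapse to the same level, which costs fewer bits but loses more information.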
  • Entropy Coding or Statistical Coding: the quantized transform-domain signal undergoes statistical compression coding according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed code stream is finally output. Other information generated by encoding, such as the selected encoding mode and motion vector data, also needs to be entropy-encoded to reduce the bit rate.
  • Statistical coding is a lossless coding method that can effectively reduce the code rate required to express the same signal. Common statistical coding methods include variable length coding (Variable Length Coding, VLC for short) and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC for short).
  • The context-based adaptive binary arithmetic coding (CABAC) process mainly includes three steps: binarization, context modeling, and binary arithmetic coding.
  • The binary data can be encoded in the regular coding mode or the bypass coding mode.
  • Bypass Coding Mode: no specific probability model needs to be assigned to each bin; the input bin value is encoded directly with a simple bypass encoder, which speeds up the entire encoding and decoding process.
  • different syntax elements are not completely independent, and the same syntax element itself also has a certain memory.
  • This encoded symbol information used as a condition is called a context.
  • The bins of syntax elements enter the context modeler sequentially.
  • The encoder assigns an appropriate probability model to each input bin based on the values of previously encoded syntax elements or bins, a process known as context modeling.
  • The context model corresponding to a syntax element can be located by the context index increment (context index increment, ctxIdxInc) and the context start index (context index start, ctxIdxStart).
  • the context model needs to be updated according to the bin value, that is, the adaptive process in encoding.
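  • A toy sketch of context selection and adaptation follows. It assumes that the context index is the sum ctxIdxStart + ctxIdxInc, and it replaces the real CABAC probability state machine with a simple exponential update; the `ContextModel` class, `select_context` function, and the update rate are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextModel:
    p_one: float = 0.5  # current estimate of the probability that a bin equals 1

    def update(self, bin_value: int, rate: float = 1 / 16) -> None:
        """Move the estimate toward the bin just coded (toy adaptation rule)."""
        self.p_one += rate * (bin_value - self.p_one)


def select_context(models: List[ContextModel], ctx_idx_start: int, ctx_idx_inc: int) -> ContextModel:
    """Locate the model for a syntax element (assumed: start index plus increment)."""
    return models[ctx_idx_start + ctx_idx_inc]


models = [ContextModel() for _ in range(8)]
ctx = select_context(models, ctx_idx_start=4, ctx_idx_inc=2)
for bin_value in (1, 1, 1, 0):
    ctx.update(bin_value)       # adaptation happens after each coded bin
print(round(ctx.p_one, 4))
```

  • Only the selected model adapts; the other models keep their initial estimate, which is what allows different syntax elements and neighborhoods to track different statistics.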
  • Loop Filtering: the transformed and quantized signal is turned into a reconstructed image through inverse quantization, inverse transformation, and prediction compensation. Because of quantization, the reconstructed image differs from the original image, that is, the reconstructed image is distorted (Distortion). Filtering operations can therefore be performed on the reconstructed image, such as the deblocking filter (DB), sample adaptive offset (Sample Adaptive Offset, SAO), or the adaptive loop filter (Adaptive Loop Filter, ALF), which can effectively reduce the distortion caused by quantization. Since these filtered reconstructed images will be used as references for subsequently encoded images to predict future image signals, the above filtering operation is also called in-loop filtering, i.e., a filtering operation inside the encoding loop.
  • DB: deblocking filter
  • SAO: Sample Adaptive Offset
  • ALF: Adaptive Loop Filter
  • FIG. 3 shows a basic flowchart of a video encoder, and intra-frame prediction is used as an example for description in the flowchart.
  • The difference between the original image signal s_k[x, y] and the predicted image signal is computed to obtain the residual signal u_k[x, y].
  • The residual signal u_k[x, y] is transformed and quantized to obtain quantized coefficients.
  • On the one hand, the quantized coefficients are entropy-coded to obtain the encoded bit stream.
  • On the other hand, the reconstructed residual signal u'_k[x, y] is obtained through inverse quantization and inverse transformation, and the predicted image signal is superimposed with the reconstructed residual signal u'_k[x, y] to generate an image signal.
  • This image signal is input to the intra-frame mode decision module and the intra-frame prediction module for intra-frame prediction processing; in addition, the reconstructed image signal obtained after loop filtering can be used as a reference image for the next frame for motion estimation and motion compensation prediction.
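  • The encoder loop described above can be sketched for a single row of samples. This is a simplified illustration, not the encoder of FIG. 3: the transform is omitted (treated as identity), quantization is a uniform scalar quantizer with an assumed integer step, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize(value: int, step: int) -> int:
    """Uniform scalar quantizer in integer math (round half away from zero)."""
    level = (2 * abs(value) + step) // (2 * step)
    return level if value >= 0 else -level


def encode_block(original: List[int], predicted: List[int], step: int) -> Tuple[List[int], List[int]]:
    """Return (quantized levels, reconstructed samples) for one block."""
    residual = [s - p for s, p in zip(original, predicted)]          # u_k
    levels = [quantize(u, step) for u in residual]                   # sent to entropy coding
    recon_residual = [level * step for level in levels]              # u'_k after inverse quantization
    reconstructed = [p + r for p, r in zip(predicted, recon_residual)]
    return levels, reconstructed


original = [100, 104, 97, 90]
predicted = [98, 98, 98, 98]
levels, reconstructed = encode_block(original, predicted, step=4)
print(levels, reconstructed)
```

  • The decoder can repeat the last two steps from the levels and its own prediction, so both sides hold the same reconstructed samples for use as prediction references.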
  • Based on the above encoding process, for each CU, after obtaining the compressed code stream (i.e., the bit stream), the decoding end performs entropy decoding to obtain the various mode information and the quantized coefficients. The quantized coefficients then undergo inverse quantization and inverse transformation to obtain the residual signal.
  • The predicted signal corresponding to the CU can be obtained from the known coding mode information, and the reconstructed signal is then obtained by adding the residual signal and the predicted signal. The reconstructed signal is then subjected to loop filtering and other operations to generate the final output signal.
  • Transform processing concentrates the energy of the residual signal in a small number of low-frequency coefficients, that is, most coefficients have small values. After the subsequent quantization module, the small coefficient values become zero, which greatly reduces the cost of coding the residual signal.
  • Transform kernels such as DST7 and DCT8 are introduced into the transform process, and the horizontal transform and the vertical transform of the residual signal can use different transform kernels.
  • The possible transform combinations that can be selected for a residual signal are as follows: (DCT2, DCT2), (DCT8, DCT8), (DCT8, DST7), (DST7, DCT8), and (DST7, DST7).
  • rate-distortion optimization (RDO)
  • The residual signal can also be quantized directly without going through the transform process, that is, the transform is skipped. Whether the current residual block uses the transform skip mode can be identified in two ways: explicit coding and implicit derivation.
  • The scan region-based coefficient coding (Scan Region based Coefficient Coding, SRCC) technique can be used.
  • The size of the upper-left area containing the non-zero coefficients in each quantized coefficient block can be marked as (SRx+1)×(SRy+1), where SRx is the abscissa of the rightmost non-zero coefficient in the quantized coefficient block, SRy is the ordinate of the lowermost non-zero coefficient in the quantized coefficient block, the coordinate of the upper-left starting point is (0,0), 1 ≤ SRx+1 ≤ W, 1 ≤ SRy+1 ≤ H, and the coefficients outside this region are all 0.
  • the SRCC technology uses (SRx, SRy) to determine the quantized coefficient area that needs to be scanned in a quantized coefficient block. As shown in Figure 4, only the quantized coefficients in the scanning area marked by (SRx, SRy) need to be coded.
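  • As an illustration of how (SRx, SRy) follows from the definition above, the sketch below scans a quantized coefficient block stored as `block[y][x]`. The function name `srcc_scan_region` and the choice to return `None` for an all-zero block are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def srcc_scan_region(block: List[List[int]]) -> Optional[Tuple[int, int]]:
    """Return (SRx, SRy) for block[y][x], or None if every coefficient is zero."""
    sr_x = sr_y = -1
    for y, row in enumerate(block):
        for x, coeff in enumerate(row):
            if coeff != 0:
                sr_x = max(sr_x, x)   # abscissa of the rightmost non-zero coefficient
                sr_y = max(sr_y, y)   # ordinate of the lowermost non-zero coefficient
    return None if sr_x < 0 else (sr_x, sr_y)


block = [
    [9, 0, 2, 0],
    [0, 1, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
]
print(srcc_scan_region(block))
```

  • For this 4×4 block the scan area is the upper-left 3×3 region, so only 9 of the 16 positions need to be visited; note that the region still contains several zero coefficients.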
  • The scanning order of the coding, as shown in FIG. 5, can be a reverse zigzag scan from the lower right corner to the upper left corner.
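  • One way to generate such an order is to build a forward zigzag over the scan area and reverse it. The sketch below assumes the classic zigzag convention (first step to the right, then alternating direction along anti-diagonals); the exact direction convention of FIG. 5 is not recoverable from the text, and the function names are hypothetical.

```python
from typing import List, Tuple


def zigzag_order(width: int, height: int) -> List[Tuple[int, int]]:
    """Forward zigzag over a width x height area, as (x, y) pairs from (0, 0)."""
    order: List[Tuple[int, int]] = []
    for d in range(width + height - 1):
        # Positions on the anti-diagonal x + y == d, listed with x increasing.
        cells = [(x, d - x) for x in range(width) if 0 <= d - x < height]
        if d % 2 == 1:
            cells.reverse()       # alternate the walking direction
        order.extend(cells)
    return order


def reverse_zigzag_order(sr_x: int, sr_y: int) -> List[Tuple[int, int]]:
    """Scan positions from the lower right (SRx, SRy) back to the upper left (0, 0)."""
    return list(reversed(zigzag_order(sr_x + 1, sr_y + 1)))


print(reverse_zigzag_order(sr_x=2, sr_y=1))
```

  • Coefficients read in this order form the one-dimensional sequence that a segment-based scheme would then split into coefficient segments.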
  • the coefficients to be coded in the SRCC scanning area are coded using a hierarchical method. Specifically, first, the SRCC scan area coordinates are encoded. Then, in the SRCC scanning area, based on the scanning order, a flag (significant flag) that identifies whether the coefficient at the current position is 0 is encoded one by one. At the same time, record the position of non-zero coefficients and count the number of non-zero coefficients. If the number of non-zero coefficients is greater than 0, it is necessary to encode the absolute value and sign of the non-zero coefficient at the corresponding position.
  • Because the boundary of the SRCC scanning area depends mainly on the positions of the rightmost and lowermost non-zero coefficients in the current block, many positions inside the SRCC scanning area may still hold zero coefficients. Therefore, in the SRCC scan order, multiple consecutive positions may have zero coefficients. In transform skip mode in particular, since there is no transform process, the energy of the residual coefficients is not concentrated, and the distribution of non-zero coefficients may be even sparser. In the current AVS3 standard, a significant flag must be encoded for every position in the SRCC scan area to indicate whether the coefficient at that position is zero, which leads to some unnecessary redundancy.
  • This application proposes, based on the scanning order of the coefficients in the SRCC scanning area and the distribution characteristics of those coefficients, to code the SRCC coefficients in segments when encoding and decoding the coefficients,
  • and to use a syntax element to indicate whether the coefficients of a segment are all zero, so as to reduce coding redundancy, improve the efficiency of coefficient coding, and further improve video compression performance.
  • The coefficient coding method proposed in this application is not limited to coefficient coding in the transform skip mode; it can also be used for coefficient coding in other coding modes. For example, it can be applied to the coefficient coding of all blocks; used when the picture-level intra-prediction transform skip enable flag is 1; used when the picture-level inter-prediction transform skip enable flag is 1; or used when the picture-level intra-prediction transform skip enable flag and the picture-level inter-prediction transform skip enable flag are both 1.
  • This embodiment of the present application can determine whether the coding block adopts the transform skip mode by means of explicit coding or implicit derivation.
  • the explicit coding is to decode the flag indicating whether the coding block adopts the transform skip mode before decoding the absolute value of the coefficient, so as to determine, based on the decoding result of the flag, whether the current coding block needs to skip the transform process.
  • Implicit derivation, i.e. Implicit Selection of Transform Skip, performs statistics on the decoding results of the coefficients without a corresponding flag, and then judges whether to skip the transform process according to the statistical results.
  • all coefficients can be decoded first, and the number of non-zero coefficients and the number of even coefficients among all coefficients (including zero coefficients) can be counted; then, according to the parity of the number of non-zero coefficients, or the parity of the number of even coefficients among all coefficients, it is implicitly derived whether the current coding block adopts transform skipping (for example, when the number of non-zero coefficients is odd, it is determined that the current coding block adopts transform skipping; when the number of non-zero coefficients is even, it is determined that the current coding block does not adopt transform skipping).
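  • The parity-based implicit derivation can be sketched as follows. The sketch follows the example given in the text (an odd number of non-zero coefficients selects transform skip); the function names are hypothetical, and the second helper only computes the alternative statistic, since the text does not state which parity of the even-coefficient count maps to transform skip.

```python
from typing import Iterable


def implicit_transform_skip(coeffs: Iterable[int]) -> bool:
    """Return True if the block is inferred to use transform skip.

    Rule taken from the example in the text: an odd number of non-zero
    coefficients means transform skip, an even number means no transform skip.
    """
    num_nonzero = sum(1 for c in coeffs if c != 0)
    return num_nonzero % 2 == 1


def count_even_coefficients(coeffs: Iterable[int]) -> int:
    """Number of even coefficients among all coefficients, zeros included."""
    return sum(1 for c in coeffs if c % 2 == 0)


decoded = [3, 0, -1, 2]                        # three non-zero coefficients
uses_skip = implicit_transform_skip(decoded)
num_even = count_even_coefficients(decoded)    # counts 0 and 2
```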
  • the picture-level intra-prediction transform skip enable flag and the picture-level inter-prediction transform skip enable flag are flags for controlling whether implicit transform skipping can be used in intra/inter frames.
  • syntax elements related to transform skip mode are explained as follows.
  • IstSkipEnableFlag can be obtained by decoding the syntax element ist_skip_enable_flag. If the syntax element ist_skip_enable_flag does not exist in the bitstream, IstSkipEnableFlag may be assigned the value 0.
  • Inter-transform skip enable flag inter_transform_skip_enable_flag
  • InterTransformSkipEnableFlag can be obtained by decoding the syntax element inter_transform_skip_enable_flag. If the syntax element inter_transform_skip_enable_flag does not exist in the bitstream, the InterTransformSkipEnableFlag may be assigned the value 0.
  • a value of '1' means that the luma intra prediction residual blocks and luma block copy intra prediction residual blocks of the current image can use the transform skip method; a value of '0' means that the luma intra prediction residual blocks and luma block copy intra prediction residual blocks of the current image should not use the transform skip method.
  • the value of the variable PictureIstSkipEnableFlag can be obtained by decoding the syntax element picture_ist_skip_enable_flag. If the syntax element picture_ist_skip_enable_flag does not exist in the bitstream, PictureIstSkipEnableFlag may be assigned the value 0.
  • a value of '1' indicates that the transform skip method can be used for the luma inter prediction residual block of the current image; a value of '0' indicates that the transform skip method should not be used for the luma inter prediction residual block of the current image.
  • the value of PictureInterSkipEnableFlag can be obtained by decoding the syntax element picture_inter_trasform_skip_flag. If the syntax element picture_inter_trasform_skip_flag does not exist in the bitstream, PictureInterSkipEnableFlag may be assigned the value 0.
  • FIG. 6 shows a flowchart of the steps of a video decoding method in an embodiment of the present application.
  • the video decoding method may be executed by a device with a computing processing function, such as a terminal device or a server, or by the electronic device shown in FIG. 13.
  • the video decoding method may mainly include the following steps S610 to S640.
  • Step S610 In the coding block of the video image frame, perform segmentation processing on the coefficients to be decoded according to the scanning order of the scan region-based coefficient coding (SRCC) scanning area, to obtain coefficient segments each composed of multiple coefficients.
  • the video image frame sequence includes a series of images, each image can be further divided into slices (Slices), the slices can be further divided into a series of LCUs (or CTUs), and each LCU includes several CUs.
  • the video image frame is encoded in block units.
  • MB macroblock
  • PB prediction block
  • basic concepts such as coding unit (CU), prediction unit (PU) and transform unit (TU) are used to functionally divide a variety of block units, and a new tree-based structure is adopted for description.
  • a CU can be divided into smaller CUs according to a quadtree, and the smaller CUs can be further divided to form a quadtree structure.
  • the coding block in this embodiment of the present application may be a CU, or a block smaller than the CU, such as a smaller block obtained by dividing the CU.
  • Step S620 Decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the coefficient segment all-zero flag, and the coefficient segment all-zero flag is used to indicate whether the coefficients in the coefficient segment are all zero.
  • the method for decoding the coefficient segment all-zero flag coef_part_all_zero_flag may include: determining whether the decoding mode of the coefficient segment all-zero flag is bypass decoding (bypass) or conventional decoding (CABAC); if it is determined that the decoding mode of the coefficient segment all-zero flag is bypass decoding, decoding the coefficient segment all-zero flag by the bypass decoding engine; if it is determined that the decoding mode of the coefficient segment all-zero flag is conventional decoding, decoding the coefficient segment all-zero flag by the conventional decoding engine based on a context model.
  • the coefficient segment all-zero flags corresponding to different coefficient segments may select the same decoding mode, or may select different decoding modes.
  • Step S630 If the value of the all-zero flag of the coefficient segment is a preset first value, decode the valid flags of each coefficient in the coefficient segment sequentially according to the scanning order, and the valid flag is used to indicate whether the coefficient is a non-zero coefficient.
  • the preset first value may be, for example, 0.
  • If the value of the coefficient segment all-zero flag obtained by decoding is 0, it indicates that the coefficients in the corresponding coefficient segment are not all zero coefficients, and that the segment contains at least one non-zero coefficient.
  • the significant flags of the respective coefficients in the coefficient segment can be decoded in sequence according to the scanning order, and then it is determined whether each coefficient is a non-zero coefficient. In an embodiment of the present application, if the significant flag of a coefficient obtained by decoding is 0, it means that the coefficient is a zero coefficient; if the significant flag of a coefficient obtained by decoding is 1, it means that the coefficient is a non-zero coefficient.
  • Step S640 If the value of the all-zero flag of the coefficient segment is a preset second value, skip decoding the valid flag, and assign the valid flag of each coefficient in the coefficient segment to zero.
  • the preset second value may be, for example, 1.
  • If the value of the coefficient segment all-zero flag obtained by decoding is 1, it means that the coefficients in the corresponding coefficient segment are all zero coefficients. In this case, there is no need to decode the significant flag of each coefficient, and the significant flags of all coefficients in the corresponding coefficient segment can be directly assigned the value 0.
  • each coefficient in the coding block is a zero coefficient or a non-zero coefficient.
  • the corresponding syntax elements can be continuously decoded to obtain the absolute value and sign of the non-zero coefficients.
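  • Steps S610 to S640 can be sketched as follows. The sketch assumes that the entropy decoder has already produced a sequence of binary values (`bins`), that the first value is 0 and the second value is 1 as in the examples above, and that every segment carries an all-zero flag. The function name and the `bins` iterator are hypothetical and stand in for the bypass or conventional decoding engine.

```python
from typing import Iterator, List

ALL_ZERO_FIRST_VALUE = 0   # segment contains at least one non-zero coefficient
ALL_ZERO_SECOND_VALUE = 1  # every coefficient in the segment is zero


def decode_significant_flags(bins: Iterator[int], num_coeffs: int, cut_num: int) -> List[int]:
    """Recover one significant flag per coefficient, in scan order.

    bins       -- already entropy-decoded binary values (assumption of this sketch)
    num_coeffs -- number of coefficients in the SRCC scan area
    cut_num    -- coefficient segment length (maximum coefficients per segment)
    """
    sig_flags: List[int] = []
    for start in range(0, num_coeffs, cut_num):           # S610: segmentation
        seg_len = min(cut_num, num_coeffs - start)
        all_zero_flag = next(bins)                         # S620: segment flag
        if all_zero_flag == ALL_ZERO_SECOND_VALUE:
            sig_flags.extend([0] * seg_len)                # S640: infer zeros
        else:
            sig_flags.extend([next(bins) for _ in range(seg_len)])  # S630
    return sig_flags


# Ten coefficients, segment length 4: segments of 4, 4 and 2 coefficients.
bins = iter([1,              # segment 0 is all zero
             0, 0, 1, 0, 0,  # segment 1: flag 0, then four significant flags
             0, 1, 1])       # segment 2: flag 0, then two significant flags
flags = decode_significant_flags(bins, num_coeffs=10, cut_num=4)
```

  • The absolute values and signs of the coefficients whose significant flag is 1 would then be decoded from further syntax elements, which this sketch omits.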
  • based on the scanning order used when encoding and decoding the coefficients in the SRCC area, and according to the distribution characteristics of the coefficients in the SRCC area, a run of coefficients that are consecutive in the scanning order is formed into a coefficient segment, and a syntax element indicates whether the coefficients in the coefficient segment are all zero coefficients, thereby reducing coding redundancy, helping to improve the coding efficiency of coefficient coding, and further improving video compression performance.
  • the method for segmenting the coefficients to be decoded according to the scanning order of the SRCC scanning area may include: acquiring a coefficient segment length, where the coefficient segment length represents the maximum number of coefficients included in a coefficient segment; and successively determining consecutive coefficients to be decoded according to the scanning order of the SRCC scan area, to form coefficient segments in which the number of coefficients is equal to or less than the coefficient segment length.
  • the coefficient segment length CUT_NUM is a constant determined according to the maximum transform unit size or a dynamic parameter determined according to attribute information of the coding block.
  • the attribute information of the coding block may include at least one of the following information: the size of the current transform unit, the shape of the current transform unit, the coordinates of the SRCC scan area, and the number of coefficients in the SRCC scan area.
  • This embodiment of the present application may also dynamically set the coefficient segment length CUT_NUM according to factors such as the current TU size, the current TU shape, the coordinates of the SRCC scanning area, and the number of SRCC coefficients.
  • the CUT_NUM value can be set in various ways, which are not limited to the foregoing examples.
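  • The segmentation by coefficient segment length can be sketched as follows. The function name is hypothetical, and the value of CUT_NUM used in the example is arbitrary; the text does not fix a particular value.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_segments(scan_items: Sequence[T], cut_num: int) -> List[List[T]]:
    """Group consecutive items of the scan order into coefficient segments.

    Every segment holds at most cut_num items; only the last segment may be
    shorter when the number of items is not a multiple of cut_num.
    """
    if cut_num <= 0:
        raise ValueError("cut_num must be positive")
    return [list(scan_items[i:i + cut_num]) for i in range(0, len(scan_items), cut_num)]


CUT_NUM = 4  # illustrative value only
coeffs_in_scan_order = [0, 0, 0, 0, 0, 5, 0, 0, -1, 2]
segments = split_into_segments(coeffs_in_scan_order, CUT_NUM)
```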
  • bypass decoding or conventional decoding may be selected to decode the coefficient segment all-zero flag.
  • the conventional decoding method requires assigning a suitable probability model to each input binary bit according to the value of the previously decoded syntax element or binary bit, that is, assigning a context model to the coefficient segment all-zero flag to be decoded.
  • FIG. 7 shows a flow chart of the steps of decoding the coefficient segment all-zero flag in an embodiment of the present application by means of conventional decoding.
  • the method for decoding the coefficient segmented all-zero flag by the conventional decoding engine based on the context model may mainly include the following steps S710 to S730.
  • Step S710 Obtain a model selection method corresponding to the SRCC scanning area, and determine a context index increment according to the model selection method.
  • a fixed model selection mode may be preset in the encoder and the decoder, or a mode may be dynamically selected from multiple selectable model selection modes according to the current coding block.
  • the same model selection method may be used, or different model selection methods may be used.
  • Step S720 Select the context model corresponding to the coefficient segment all-zero flag according to the context index increment.
  • the context model corresponding to the syntax element can be located through the context index increment ctxIdxInc and the context start index ctxIdxStart.
  • Different probability models can be assigned to the coefficient segment all-zero flag based on the context index increment ctxIdxInc with different values.
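  • Locating a context model from ctxIdxStart and ctxIdxInc can be sketched as a simple table lookup. The `ContextModel` class, its single probability field, and the table size are placeholders assumed for this sketch; a real CABAC context holds adaptive probability state that is not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextModel:
    """Placeholder for an adaptive probability model (assumption of this sketch)."""
    prob_one: float = 0.5


def select_context(models: List[ContextModel], ctx_idx_start: int, ctx_idx_inc: int) -> ContextModel:
    """Return the context model at index ctxIdxStart + ctxIdxInc."""
    return models[ctx_idx_start + ctx_idx_inc]


# A table of 16 models; the flag's contexts are assumed to start at index 8.
models = [ContextModel() for _ in range(16)]
chosen = select_context(models, ctx_idx_start=8, ctx_idx_inc=3)
```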
  • Step S730 Based on the selected context model, perform arithmetic decoding on the coefficient segment all-zero flag by a conventional decoding engine.
  • the selected context model and the coefficient segment all-zero flag to be decoded are loaded into the conventional decoding engine, and the decoding operation of the coefficient segment all-zero flag can be completed by the conventional decoding engine to obtain the corresponding flag value.
  • the conventional decoding engine in the embodiment of the present application may be a binary arithmetic decoder based on the CABAC technology.
  • the optional model selection methods may include three modes. If the model selection method is the first mode, the context index increment is assigned according to the shape of the SRCC scan area; if the model selection method is the second mode, the context index increment is assigned according to the area of the SRCC scan area; if the model selection method is the third mode, the context index increment is assigned according to the preset first value.
  • FIG. 8 shows a flowchart of steps for determining a context index increment based on the first mode in an embodiment of the present application.
  • the method for assigning the context index increment according to the shape of the SRCC scanning area may include the following steps S810 to S840.
  • Step S810 Obtain the area width and area height of the SRCC scanning area.
  • the SRCC scanning area may be a rectangular area as shown in FIG. 4 .
  • the SRCC may be determined.
  • Step S820 Assign an initial value of the context index increment according to the numerical value relationship between the area width and the area height.
  • the initial value of the context index increment can be assigned based on different preset values. If the area width and the area height are equal in value, the preset second value is assigned as the initial value of the context index increment; if the area width is smaller than the area height, the preset third value is assigned as the initial value of the context index increment; if the area width is greater than the area height, the preset fourth value is assigned as the initial value of the context index increment.
  • the second value may be 1
  • the third value may be 2
  • the fourth value may be 5.
  • Step S830 According to the numerical proportional relationship between the area width and the area height, assign a value to the initial value increment of the context index increment.
  • the method for assigning the initial value increment of the context index increment may include: calculating the numerical ratio between the larger value and the smaller value of the area width and the area height; and comparing the numerical ratio with a plurality of preset ratio thresholds, so as to assign the initial value increment delta of the context index increment according to the comparison result.
  • when the area height sr_height is greater than the area width sr_width, the quotient obtained by dividing the larger value sr_height by the smaller value sr_width is used as the numerical ratio; when the area height sr_height is smaller than (or equal to) the area width sr_width, the quotient obtained by dividing the larger value sr_width by the smaller value sr_height is used as the numerical ratio.
  • for example, two preset ratio thresholds of 2 and 3 may be used for the numerical ratio.
  • Step S840 Assign a value to the context index increment according to the initial value and the initial value increment.
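  • Steps S810 to S840 can be sketched as follows, using the example initial values 1, 2 and 5 and the example ratio thresholds 2 and 3 given above. How delta is derived from the comparison is not spelled out in the text; this sketch assumes that delta equals the number of thresholds that the ratio reaches (so delta is 0, 1 or 2). The function name is hypothetical.

```python
from typing import Sequence


def ctx_idx_inc_by_shape(sr_width: int, sr_height: int,
                         ratio_thresholds: Sequence[float] = (2, 3)) -> int:
    """First mode: derive ctxIdxInc from the shape of the SRCC scan area."""
    if sr_width <= 0 or sr_height <= 0:
        raise ValueError("scan area width and height must be positive")
    if sr_width == sr_height:
        return 1                       # second value: square area, no delta added
    initial = 2 if sr_width < sr_height else 5   # third / fourth value
    ratio = max(sr_width, sr_height) / min(sr_width, sr_height)
    # Assumed rule: delta counts how many thresholds the ratio reaches.
    delta = sum(1 for t in ratio_thresholds if ratio >= t)
    return initial + delta


tall = ctx_idx_inc_by_shape(2, 8)    # width < height, ratio 4
wide = ctx_idx_inc_by_shape(3, 2)    # width > height, ratio 1.5
square = ctx_idx_inc_by_shape(4, 4)
```

  • Under this assumed rule the three shapes map to disjoint index ranges (1 for square areas, 2 to 4 for tall areas, 5 to 7 for wide areas), so each shape class gets its own probability models.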
  • FIG. 9 shows a flowchart of steps for determining a context index increment based on the second mode in an embodiment of the present application.
  • the method for assigning the context index increment according to the area of the SRCC scanning area may include the following steps S910 to S930.
  • Step S910 Obtain the area width and area height of the SRCC scanning area.
  • the SRCC scanning area may be a rectangular area as shown in FIG. 4 .
  • the SRCC may be determined.
  • Step S920 Determine the area of the SRCC scanning area according to the area width and the area height.
  • Step S930 Compare the area with a plurality of preset area thresholds, so as to assign the context index increment according to the comparison result.
  • a preset first value can be directly used as the context index increment, so that a single context model can be determined. For example, if the first value is 0 (this value is only an example), the context index increment ctxIdxInc can be set to 0, that is, the context start index ctxIdxStart can be used to directly locate the corresponding context model without adding any increment.
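  • Steps S910 to S930 can be sketched as follows. The text does not give the area thresholds, so the values (4, 16, 64) and the rule that ctxIdxInc equals the number of thresholds not exceeding the area are assumptions of this sketch, as is the function name.

```python
import bisect
from typing import Sequence


def ctx_idx_inc_by_area(sr_width: int, sr_height: int,
                        area_thresholds: Sequence[int] = (4, 16, 64)) -> int:
    """Second mode: derive ctxIdxInc from the area of the SRCC scan area.

    area_thresholds must be sorted in ascending order. The result is the
    number of thresholds that are less than or equal to the area.
    """
    area = sr_width * sr_height
    return bisect.bisect_right(area_thresholds, area)


small = ctx_idx_inc_by_area(1, 1)    # area 1
medium = ctx_idx_inc_by_area(3, 5)   # area 15
large = ctx_idx_inc_by_area(8, 8)    # area 64
```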
  • the coefficient segment all-zero flags of each coefficient segment can be selectively decoded. After segmenting the SRCC scan area of the coding block, a plurality of coefficient segments can be obtained. After that, the coefficient segment all-zero flag may be decoded for all the coefficient segments, or the coefficient segment all-zero flag may be decoded for only a part of the coefficient segments.
  • before decoding the coefficient segment all-zero flag of a coefficient segment, it may be determined whether a preset decoding skip condition is satisfied; if the decoding skip condition is satisfied, the step of decoding the coefficient segment all-zero flag of the coefficient segment is skipped, and a preset second value is used as the value of the coefficient segment all-zero flag.
  • for example, the second value is 1, that is, a coefficient segment whose coefficient segment all-zero flag has not been decoded is by default an all-zero coefficient segment composed of zero coefficients.
  • the decoding skip condition includes at least one of the following conditions: the number of coefficients in the coefficient segment is less than a preset number threshold; the area ratio of the SRCC scan area in the coding block is less than a preset percentage threshold.
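  • The decoding skip condition can be sketched as a predicate. The threshold values (`min_coeffs`, `min_area_ratio`) are hypothetical, and combining both conditions with a logical OR is one possible reading of "at least one of the following conditions"; an implementation may use only one of them.

```python
def should_skip_all_zero_flag(num_coeffs_in_segment: int,
                              sr_width: int, sr_height: int,
                              block_width: int, block_height: int,
                              min_coeffs: int = 4,
                              min_area_ratio: float = 0.25) -> bool:
    """Return True if the coefficient segment all-zero flag is not decoded.

    Condition 1: the segment holds fewer coefficients than min_coeffs.
    Condition 2: the SRCC scan area covers less than min_area_ratio of the block.
    Both thresholds are illustrative values, not taken from the text.
    """
    too_few_coeffs = num_coeffs_in_segment < min_coeffs
    area_ratio = (sr_width * sr_height) / (block_width * block_height)
    small_scan_area = area_ratio < min_area_ratio
    return too_few_coeffs or small_scan_area


skip_short_segment = should_skip_all_zero_flag(2, 8, 8, 8, 8)
skip_full_segment = should_skip_all_zero_flag(16, 8, 8, 8, 8)
skip_small_area = should_skip_all_zero_flag(16, 2, 2, 8, 8)
```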
  • the step of decoding the coefficient segment all-zero flag of the coefficient segment is skipped, and the preset first value is used as the value of the coefficient segment all-zero flag.
  • the step of decoding the coefficient segment all-zero flag of the coefficient segment is skipped, and the preset ratio is used.
  • if there is a specific coefficient segment whose number of coefficients is less than the coefficient segment length, the specific coefficient segment is located at a specific position in the SRCC scanning area.
  • the specific position may be a scanning start position or a scanning end position in the SRCC scanning area.
  • a particular coefficient segment may be the first coefficient segment or the last coefficient segment in the scan order.
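  • The placement of the shorter coefficient segment can be sketched with a flag that selects whether it comes first or last in the scan order. The function name and the `short_first` parameter are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_with_short_segment(scan_items: Sequence[T], cut_num: int,
                             short_first: bool = False) -> List[List[T]]:
    """Split the scan order into segments of cut_num items.

    When the item count is not a multiple of cut_num, one segment is shorter;
    short_first selects whether it is the first or the last segment.
    """
    if cut_num <= 0:
        raise ValueError("cut_num must be positive")
    items = list(scan_items)
    remainder = len(items) % cut_num
    segments: List[List[T]] = []
    start = 0
    if short_first and remainder:
        segments.append(items[:remainder])   # short segment at the scan start
        start = remainder
    for i in range(start, len(items), cut_num):
        segments.append(items[i:i + cut_num])
    return segments


positions = list(range(10))
short_at_end = split_with_short_segment(positions, 4)
short_at_start = split_with_short_segment(positions, 4, short_first=True)
```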
  • FIG. 10 shows a flowchart of steps of a video encoding method in an embodiment of the present application.
  • the video encoding method may be performed by a device with computing processing functions, such as a terminal device or a server, or may be performed by the electronic device shown in FIG. 13 .
  • the video coding method may mainly include the following steps S1010 to S1040.
  • Step S1010 in the coding block of the video image frame, perform segmentation processing on the coefficients to be coded according to the scanning order of the scan region-based coefficient coding (SRCC) scanning area, to obtain coefficient segments each composed of multiple coefficients;
  • Step S1020 according to whether the coefficients in the coefficient segment are all zero, determine the value of the all-zero flag of the coefficient segment, and encode the all-zero flag of the coefficient segment;
  • Step S1030 if the value of the coefficient segment all-zero flag is the preset first value, then sequentially encode the valid flags of each coefficient in each coefficient segment according to the scanning order, and the valid flags are used to indicate whether the coefficients are non-zero coefficients ;
  • Step S1040 If the value of the all-zero flag of the coefficient segment is a preset second value, skip encoding the valid flags of each coefficient in the coefficient segment.
  • If the value of the coefficient segment all-zero flag is the preset second value, it means that each coefficient in the coefficient segment is a zero coefficient, and there is no need to encode the valid flag of each coefficient in the coefficient segment.
  • based on the scanning order used when encoding and decoding the coefficients in the SRCC area, and according to the distribution characteristics of the coefficients in the SRCC area, the present invention proposes a method for encoding the SRCC coefficients in segments, in which a syntax element indicates whether the coefficients of a segment are all zero, so as to reduce coding redundancy, help improve the coding efficiency of coefficient coding, and further improve video compression performance. It should be noted that the above embodiments take the decoding end as an example to illustrate the SRCC coefficient encoding and decoding methods provided in the embodiments of the present application, but the related technical solutions can also be applied to the encoding end, and the present application is not limited thereto.
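  • Steps S1010 to S1040 can be sketched on the encoder side as follows. The sketch only produces the sequence of binary values that would be passed to the entropy coder, assumes the first value is 0 and the second value is 1, and assumes every segment carries an all-zero flag; the function name is hypothetical.

```python
from typing import List, Sequence


def encode_significant_flags(coeffs_in_scan_order: Sequence[int], cut_num: int) -> List[int]:
    """Return the binary values to be entropy coded for the significance map.

    For each coefficient segment, one all-zero flag is emitted (1 = all zero).
    Significant flags are emitted only for segments that are not all zero.
    """
    bins: List[int] = []
    for start in range(0, len(coeffs_in_scan_order), cut_num):        # S1010
        segment = coeffs_in_scan_order[start:start + cut_num]
        if all(c == 0 for c in segment):                              # S1020
            bins.append(1)                                            # S1040
        else:
            bins.append(0)
            bins.extend(1 if c != 0 else 0 for c in segment)          # S1030
    return bins


coeffs = [0, 0, 0, 0, 0, 5, 0, 0, -1, 2]
bins = encode_significant_flags(coeffs, cut_num=4)

# A sparser block: one non-zero coefficient among 16 positions.
sparse = [0] * 16
sparse[13] = 7
sparse_bins = encode_significant_flags(sparse, cut_num=4)
```

  • Without segmentation, one significant flag is coded per position (10 and 16 flags for the two examples). With segmentation the sparse block needs only 8 binary values, which illustrates where the redundancy reduction comes from when zero coefficients cluster in the scan order.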
  • FIG. 11 shows a structural block diagram of a video decoding apparatus provided by an embodiment of the present application.
  • the video decoding apparatus 1100 may mainly include:
  • the coefficient segmentation module 1110 is configured to perform segmentation processing on the coefficients to be decoded according to the scanning order of the SRCC scanning area in the coding block of the video image frame, so as to obtain a coefficient segmentation composed of a plurality of coefficients;
  • the first decoding module 1120 is configured to decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the coefficient segment all-zero flag, where the coefficient segment all-zero flag is used to indicate whether the coefficients in the coefficient segment are all zero;
  • the second decoding module 1130 is configured to, if the value of the coefficient segment all-zero flag is a preset first value, sequentially decode the valid flags of the coefficients in the coefficient segment according to the scanning order, where the valid flag is used to indicate whether the coefficient is a non-zero coefficient;
  • the coefficient assignment module 1140 is configured to, if the value of the coefficient segment all-zero flag is a preset second value, skip decoding the valid flags and assign the valid flags of each coefficient in the coefficient segment to zero.
  • the first decoding module 1120 includes: a mode determination unit, configured to determine whether the decoding mode of the coefficient segment all-zero flag is bypass decoding or conventional decoding; a bypass decoding unit, configured to decode the coefficient segment all-zero flag by the bypass decoding engine if it is determined that the decoding mode of the coefficient segment all-zero flag is bypass decoding; and a conventional decoding unit, configured to decode the coefficient segment all-zero flag by a conventional decoding engine based on a context model if it is determined that the decoding mode of the coefficient segment all-zero flag is conventional decoding.
  • the conventional decoding unit includes: an increment determination subunit, configured to acquire a model selection mode corresponding to the SRCC scanning area, and to determine a context index increment according to the model selection mode; a model selection subunit, configured to select a context model corresponding to the coefficient segment all-zero flag according to the context index increment; and an arithmetic decoding subunit, configured to perform arithmetic decoding on the coefficient segment all-zero flag by a conventional decoding engine based on the selected context model.
  • the increment determination subunit is further configured to: if the model selection mode is the first mode, assign the context index increment according to the shape of the SRCC scanning area; if the model selection mode is the second mode, assign the context index increment according to the area of the SRCC scanning area; if the model selection mode is the third mode, assign the context index increment according to a preset first value.
  • the increment determination subunit is further configured to: acquire the area width and area height of the SRCC scanning area; assign an initial value of the context index increment according to the numerical relationship between the area width and the area height; assign an initial value increment of the context index increment according to the numerical proportional relationship between the area width and the area height; and assign the context index increment according to the initial value and the initial value increment.
  • the increment determination subunit is further configured to: if the area width and the area height are equal in value, assign a preset second value as the initial value of the context index increment; if the area width is smaller than the area height, assign a preset third value as the initial value of the context index increment; if the area width is greater than the area height, assign a preset fourth value as the initial value of the context index increment.
  • the increment determination subunit is further configured to: calculate a numerical ratio between the larger value and the smaller value of the area width and the area height; and compare the numerical ratio with a plurality of preset ratio thresholds, so as to assign the initial value increment of the context index increment according to the comparison result.
  • the increment determination subunit is further configured to: if the area width is equal to the area height, assign the initial value as the context index increment; if the area width is not equal to the area height, assign the context index increment based on the sum of the initial value and the initial value increment.
  • the increment determination subunit is further configured to: acquire the area width and area height of the SRCC scanning area; determine the area of the SRCC scanning area according to the area width and the area height; and compare the area with a plurality of preset area thresholds, so as to assign the context index increment according to the comparison result.
  • the video decoding apparatus 1100 further includes: a condition determination module, configured to determine whether a preset decoding skip condition is satisfied; and a decoding skip module, configured to, if the decoding skip condition is satisfied, skip the decoding step of the coefficient segment all-zero flag and use a preset first value as the value of the coefficient segment all-zero flag.
  • the decoding skip condition includes at least one of the following conditions: the number of coefficients in the coefficient segment is less than a preset number threshold; the area proportion of the SRCC scanning area in the coding block is less than a preset proportion threshold.
  • the coefficient segmentation module 1110 includes: a length acquisition unit, configured to acquire a coefficient segment length, where the coefficient segment length represents the maximum number of coefficients included in a coefficient segment; and a coefficient segmentation unit, configured to successively determine consecutive coefficients to be decoded according to the scanning order of the SRCC scanning area, to form coefficient segments in which the number of coefficients is equal to or less than the coefficient segment length.
  • the coefficient segment length is a constant determined according to the maximum transform unit size or a dynamic parameter determined according to attribute information of the coding block.
  • the attribute information of the coding block includes at least one of the following information: the size of the current transform unit, the shape of the current transform unit, the coordinates of the SRCC scanning area, and the number of coefficients in the SRCC scanning area.
  • the method is applied to a coding block that satisfies a preset coding condition, and the coding block satisfying the preset coding condition includes: a coding block in the transform skip mode; or, a coding block for which the picture-level intra-prediction transform skip enable flag value is 1; or, a coding block for which the picture-level inter-prediction transform skip enable flag value is 1; or, a coding block for which the picture-level intra-prediction transform skip enable flag value and the picture-level inter-prediction transform skip enable flag value are both 1; or, all coding blocks.
  • FIG. 12 shows a structural block diagram of a video encoding apparatus in an embodiment of the present application.
  • the video encoding apparatus 1200 may mainly include:
  • the coefficient segmentation module 1210 is configured to, in the coding block of the video image frame, perform segmentation processing on the coefficients to be coded according to the scanning order of the scan region-based coefficient coding (SRCC) scanning area, to obtain coefficient segments each composed of a plurality of coefficients;
  • the first encoding module 1220 is configured to determine the value of the all-zero flag of the coefficient segment according to whether the coefficients in the coefficient segment are all zero, and encode the all-zero flag of the coefficient segment;
  • the second encoding module 1230 is configured to sequentially encode the valid flags of the coefficients in each coefficient segment according to the scanning order if the value of the coefficient segment all-zero flag is a preset first value, where the valid flag is used to indicate whether the coefficient is a non-zero coefficient;
  • the coding skip module 1240 is configured to skip coding the valid flags of each coefficient in the coefficient segment if the value of the all-zero flag of the coefficient segment is a preset second value.
  • FIG. 13 schematically shows a structural block diagram of a computer system for implementing an electronic device according to an embodiment of the present application.
  • the computer system 1300 includes a central processing unit 1301 (Central Processing Unit, CPU), which can perform various appropriate actions and processes according to a program stored in a read-only memory 1302 (Read-Only Memory, ROM) or a program loaded from a storage section 1308 into a random access memory 1303 (Random Access Memory, RAM). In the random access memory 1303, various programs and data necessary for system operation are also stored.
  • the central processing unit 1301 , the read-only memory 1302 and the random access memory 1303 are connected to each other through a bus 1304 .
  • An input/output interface 1305 (Input/Output interface, ie, I/O interface) is also connected to the bus 1304 .
  • the following components are connected to the input/output interface 1305: an input section 1306 including a keyboard, a mouse, etc.; an output section 1307 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc. ; a storage section 1308 including a hard disk, etc.; and a communication section 1309 including a network interface card such as a local area network card, a modem, and the like.
  • the communication section 1309 performs communication processing via a network such as the Internet.
  • a drive 1310 is also connected to the input/output interface 1305 as needed.
  • a removable medium 1313 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 1310 as needed so that a computer program read therefrom is installed into the storage section 1308 as needed.
  • embodiments of the present application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication portion 1309, and/or installed from the removable medium 1313.
  • When the computer program is executed by the central processing unit 1301, the various functions defined in the system of the present application are performed.
  • non-volatile computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), Erasable Programmable Read Only Memory (EPROM), flash memory, optical fiber, portable Compact Disc Read-Only Memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein.
  • Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, etc., or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the exemplary embodiments described herein may be implemented by software, or may be implemented by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present application may be embodied in the form of software products, and the software products may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and include several instructions to cause a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

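The present application belongs to the field of computer technology, and specifically relates to a video encoding and decoding method, a video encoding and decoding apparatus, a computer-readable medium, and an electronic device. The video decoding method is performed by an electronic device and includes: in a coding block of a video image frame, segmenting the coefficients to be decoded according to the scanning order of an SRCC scan region to obtain coefficient segments each composed of a plurality of coefficients; decoding the coefficient segment all-zero flag of a coefficient segment to obtain the value of the flag, the coefficient segment all-zero flag indicating whether the coefficients in the coefficient segment are all zero; if the value of the coefficient segment all-zero flag is a preset first value, decoding the significant flags of the coefficients in the coefficient segment one by one in scanning order, a significant flag indicating whether a coefficient is a non-zero coefficient; and if the value of the coefficient segment all-zero flag is a preset second value, setting the significant flags of all coefficients in the coefficient segment to zero.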

Description

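Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
This application claims priority to Chinese Patent Application No. 202110194838.9, entitled "Video encoding and decoding method and apparatus, computer-readable medium, and electronic device", filed with the China Patent Office on February 21, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application belongs to the field of computer technology, and specifically relates to a video encoding and decoding method, a video encoding and decoding apparatus, a computer-readable medium, and an electronic device.
Background
During video encoding, the encoder usually transforms, quantizes, and entropy codes the residual data between the original video data and the predicted video data before sending it to the decoder. Because the values of the residual data are relatively sparse, a flag indicating whether each coefficient is a zero coefficient has to be encoded and decoded for every coefficient. This introduces unnecessary coding redundancy, lowers coding efficiency, and degrades video compression performance.
Summary
The purpose of the present application is to provide a video encoding and decoding method, a video encoding and decoding apparatus, a computer-readable medium, and an electronic device that overcome, at least to some extent, the technical problem of low video encoding and decoding efficiency in the related art.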
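According to one aspect of the embodiments of the present application, a video decoding method is provided, performed by an electronic device. The method includes: in a coding block of a video image frame, segmenting the coefficients to be decoded according to the scanning order of an SRCC scan region to obtain coefficient segments each composed of a plurality of coefficients; decoding the coefficient segment all-zero flag of the coefficient segment to obtain the value of the flag, the coefficient segment all-zero flag indicating whether the coefficients in the coefficient segment are all zero; if the value of the coefficient segment all-zero flag is a preset first value, decoding the significant flags of the coefficients in the coefficient segment one by one in scanning order, a significant flag indicating whether the coefficient is a non-zero coefficient; and if the value of the coefficient segment all-zero flag is a preset second value, setting the significant flags of all coefficients in the coefficient segment to zero.
According to one aspect of the embodiments of the present application, a video encoding method is provided, performed by an electronic device. The method includes: in a coding block of a video image frame, segmenting the coefficients to be encoded according to the scanning order of a scan region based coefficient coding (SRCC) scan region to obtain coefficient segments each composed of a plurality of coefficients; determining the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and encoding the coefficient segment all-zero flag; if the value of the coefficient segment all-zero flag is a preset first value, encoding the significant flags of the coefficients in each coefficient segment one by one in scanning order, a significant flag indicating whether the coefficient is a non-zero coefficient; and if the value of the coefficient segment all-zero flag is a preset second value, skipping the encoding of the significant flags of the coefficients in the coefficient segment.
According to one aspect of the embodiments of the present application, a video decoding apparatus is provided. The apparatus includes: a coefficient segmentation module, configured to segment, in a coding block of a video image frame, the coefficients to be decoded according to the scanning order of an SRCC scan region to obtain coefficient segments each composed of a plurality of coefficients; a first decoding module, configured to decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the flag, the coefficient segment all-zero flag indicating whether the coefficients in the coefficient segment are all zero; a second decoding module, configured to decode the significant flags of the coefficients in the coefficient segment one by one in scanning order if the value of the coefficient segment all-zero flag is a preset first value, a significant flag indicating whether the coefficient is a non-zero coefficient; and a coefficient assignment module, configured to set the significant flags of all coefficients in the coefficient segment to zero if the value of the coefficient segment all-zero flag is a preset second value.
According to one aspect of the embodiments of the present application, a video encoding apparatus is provided. The apparatus includes: a coefficient segmentation module, configured to segment, in a coding block of a video image frame, the coefficients to be encoded according to the scanning order of a scan region based coefficient coding (SRCC) scan region to obtain coefficient segments each composed of a plurality of coefficients; a first encoding module, configured to determine the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and to encode the coefficient segment all-zero flag; a second encoding module, configured to encode the significant flags of the coefficients in each coefficient segment one by one in scanning order if the value of the coefficient segment all-zero flag is a preset first value, a significant flag indicating whether the coefficient is a non-zero coefficient; and an encoding skip module, configured to skip the encoding of the significant flags of the coefficients in the coefficient segment if the value of the coefficient segment all-zero flag is a preset second value.
According to one aspect of the embodiments of the present application, a non-volatile computer-readable medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the video decoding method and the video encoding method in the above technical solutions.
According to one aspect of the embodiments of the present application, an electronic device is provided. The electronic device includes: a processor; and a memory for storing executable instructions of the processor, where the processor is configured to perform the video decoding method and the video encoding method in the above technical solutions by executing the executable instructions.
According to one aspect of the embodiments of the present application, a computer program product or computer program is provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, so that the electronic device performs the video decoding method and the video encoding method in the above technical solutions.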
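Brief Description of the Drawings
FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
FIG. 2 shows a schematic diagram of the placement of a video encoding apparatus and a video decoding apparatus in a streaming system.
FIG. 3 shows a basic flowchart of a video encoder.
FIG. 4 shows the scan region marked by the SRCC technique.
FIG. 5 shows a schematic diagram of the order in which the marked scan region is scanned.
FIG. 6 shows a flowchart of the steps of a video decoding method in an embodiment of the present application.
FIG. 7 shows a flowchart of the steps of decoding the coefficient segment all-zero flag by regular decoding in an embodiment of the present application.
FIG. 8 shows a flowchart of the steps of determining the context index increment based on the first mode in an embodiment of the present application.
FIG. 9 shows a flowchart of the steps of determining the context index increment based on the second mode in an embodiment of the present application.
FIG. 10 shows a flowchart of the steps of a video encoding method in an embodiment of the present application.
FIG. 11 shows a structural block diagram of the video decoding apparatus provided by an embodiment of the present application.
FIG. 12 shows a structural block diagram of the video encoding apparatus provided by an embodiment of the present application.
FIG. 13 schematically shows a structural block diagram of a computer system of an electronic device suitable for implementing the embodiments of the present application.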
实施方式
现在将参考附图更全面地描述示例实施方式。然而,示例实施方式能够以多种形式实施,且不应被理解为限于在此阐述的范例;相反,提供这些实施方式使得本申请将更加全面和完整,并将示例实施方式的构思全面地传达给本领域的技术人员。
此外,所描述的特征、结构或特性可以以任何合适的方式结合在一个或更多实施例中。在下面的描述中,提供许多具体细节从而给出对本申请的实施例的充分理解。然而,本领域技术人员将意识到,可以实践本申请的技术方案而没有特定细节中的一个或更多,或者可以采用其它的方法、组元、装置、步骤等。在其它情况下,不详细示出或描述公知方法、装置、实现或者操作以避免模糊本申请的各方面。
附图中所示的方框图仅仅是功能实体,不一定必须与物理上独立的实体相对应。即,可以采用软件形式来实现这些功能实体,或在一个或多个硬件模块或集成电路中实现这些功能实体,或在不同网络和/或处理器装置和/或微控制器装置中实现这些功能实体。
附图中所示的流程图仅是示例性说明,不是必须包括所有的内容和操作/步骤,也不是必须按所描述的顺序执行。例如,有的操作/步骤还可以分解,而有的操作/步骤可以合并或部分合并,因此实际执行的顺序有可能根据实际情况改变。
需要说明的是:在本文中提及的“多个”是指两个或两个以上。“和/或”描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
图1示出了可以应用本申请实施例的技术方案的示例性系统架构的示意图。
如图1所示,系统架构100包括多个终端装置,所述终端装置可通过例如网络150彼此通信。举例来说,系统架构100可以包括通过网络150互连的第一终端装置110和第二终端装置120。在图1的实施例中,第一终端装置110和第二终端装置120执行单向数据传输。
举例来说,第一终端装置110可对视频数据(例如由终端装置110采集的视频图片流)进行编码以通过网络150传输到第二终端装置120,已编码的视频数据以一个或多个已编码视频码流形式传输,第二终端装置120可从网络150接收已编码视频数据,对已编码视频数据进行解码以恢复视频数据,并根据恢复的视频数据显示视频图片。
在本申请的一个实施例中,系统架构100可以包括执行已编码视频数据的双向传输的第三终端装置130和第四终端装置140,所述双向传输比如可以发生在视频会议期间。对于双向数据传输,第三终端装置130和第四终端装置140中的每个终端装置可对视频数据(例如由终端装置采集的视频图片流)进行编码,以通过网络150传输到第三终端装置130和第四终端装置140中的另一终端装置。第三终端装置130和第四终端装置140中的每个终端装置还可接收由第三终端装置130和第四终端装置140中的另一终端装置传输的已编码视频数据,且可对已编码视频数据进行解码以恢复视频数据,并可根据恢复的视频数据在可访问的显示装置上显示视频 图片。
在图1的实施例中,第一终端装置110、第二终端装置120、第三终端装置130和第四终端装置140可为服务器、个人计算机和智能电话,但本申请公开的原理可不限于此。本申请公开的实施例适用于膝上型计算机、平板电脑、媒体播放器和/或专用视频会议设备。网络150表示在第一终端装置110、第二终端装置120、第三终端装置130和第四终端装置140之间传送已编码视频数据的任何数目的网络,包括例如有线和/或无线通信网络。通信网络150可在电路交换和/或分组交换信道中交换数据。该网络可包括电信网络、局域网、广域网和/或互联网。出于本申请的目的,除非在下文中有所解释,否则网络150的架构和拓扑对于本申请公开的操作来说可能是无关紧要的。
在本申请的一个实施例中,图2示出视频编码装置和视频解码装置在流式传输环境中的放置方式。本申请所公开主题可同等地适用于其它支持视频的应用,包括例如视频会议、数字TV(television,电视机)、在包括CD、DVD、存储棒等的数字介质上存储压缩视频等等。
流式传输系统可包括采集子系统213,采集子系统213可包括数码相机等视频源201,视频源创建未压缩的视频图片流202。在实施例中,视频图片流202包括由数码相机拍摄的样本。相较于已编码的视频数据204(或已编码的视频码流204),视频图片流202被描绘为粗线以强调高数据量的视频图片流,视频图片流202可由电子装置220处理,电子装置220包括耦接到视频源201的视频编码装置203。视频编码装置203可包括硬件、软件或软硬件组合以实现或实施如下文更详细地描述的所公开主题的各方面。相较于视频图片流202,已编码的视频数据204(或已编码的视频码流204)被描绘为细线以强调较低数据量的已编码的视频数据204(或已编码的视频码流204),其可存储在流式传输服务器205上以供将来使用。一个或多个流式传输客户端子系统,例如图2中的客户端子系统206和客户端子系统208,可访问流式传输服务器205以检索已编码的视频数据204的副本207和副本209。客户端子系统206可包括例如电子装置230中的视频解码装置210。视频解码装置210对已编码的视频数据的传入副本207进行解码,且产生可在显示器212(例如显示屏)或另一呈现装置上呈现的输出视频图片流211。在一些流式传输系统中,可根据某些视频编码/压缩标准对已编码的视频数据204、视频数据207和视频数据209(例如视频码流)进行编码。这些标准的实施例包括ITU-T H.265。在实施例中,正在开发的视频编码标准非正式地称为多功能视频编码(Versatile Video Coding,VVC),本申请可用于VVC标准的上下文中。
应注意,电子装置220和电子装置230可包括图中未示出的其它组件。举例来说,电子装置220可包括视频解码装置,且电子装置230还可包括视频编码装置。
在本申请的一个实施例中,以国际视频编码标准高效率视频编码(High Efficiency Video Coding,HEVC)、多功能视频编码(Versatile Video Coding,VVC),以及中国国家视频编码标准AVS为例,当输入一个视频帧图像之后,会根据一个块大小,将视频帧图像划分成若干个不重叠的处理单元,每个处理单元将进行类似的压缩操作。这个处理单元被称作编码树单元(Coding Tree Unit,CTU),或者称之为最大编码单元(Largest Coding Unit,LCU)。CTU再往下可以继续进行更加精细的划分,得到一个或多个基本的编码单元(Coding Unit,CU),CU是一 个编码环节中最基本的元素。以下介绍对CU进行编码时的一些概念:
预测编码(Predictive Coding):预测编码包括了帧内预测和帧间预测等方式,原始视频信号经过选定的已重建视频信号的预测后,得到残差视频信号。编码端需要为当前CU决定选择哪一种预测编码模式,并告知解码端。其中,帧内预测是指预测的信号来自于同一图像内已经编码重建过的区域;帧间预测是指预测的信号来自已经编码过的、不同于当前图像的其它图像(称之为参考图像)。
变换及量化(Transform&Quantization):残差视频信号经过离散傅里叶变换(Discrete Fourier Transform,DFT)、离散余弦变换(Discrete Cosine Transform,DCT)等变换操作后,将信号转换到变换域中,称之为变换系数。变换系数进一步进行有损的量化操作,丢失掉一定的信息,使得量化后的信号有利于压缩表达。在一些视频编码标准中,可能有多于一种变换方式可以选择,因此编码端也需要为当前CU选择其中的一种变换方式,并告知解码端。量化的精细程度通常由量化参数(Quantization Parameter,简称QP)来决定,QP取值较大,表示更大取值范围的系数将被量化为同一个输出,因此通常会带来更大的失真及较低的码率;相反,QP取值较小,表示较小取值范围的系数将被量化为同一个输出,因此通常会带来较小的失真,同时对应较高的码率。
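Entropy coding, or statistical coding: The quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed bitstream is finally output. Other information produced by encoding, such as the selected coding mode and motion vector data, also needs to be entropy coded to reduce the bit rate. Statistical coding is a lossless coding method that can effectively reduce the bit rate needed to express the same signal. Common statistical coding methods include variable length coding (Variable Length Coding, VLC) and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).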
基于上下文的自适应二进制算术编码(CABAC)过程主要包含3个步骤:二进制化、上下文建模和二进制算术编码。在对输入的语法元素进行二值化后,可以通过常规编码模式和旁路编码模式(bypass)对二元数据进行编码。旁路编码模式(Bypass Coding Mode),它无须为每个二元位分配特定的概率模型,输入的二元位bin值直接用一个简单的旁路编码器进行编码,以加快整个编码以及解码的速度。一般情况下,不同的语法元素之间并不是完全独立的,且相同语法元素自身也具有一定的记忆性。因此,根据条件熵理论,利用其他已编码的语法元素进行条件编码,相对于独立编码或者无记忆编码能够进一步提高编码性能。这些用来作为条件的已编码符号信息称为上下文。在常规编码模式中,语法元素的二元位顺序地进入上下文模型器。编码器根据先前编码过的语法元素或二元位的值,为每一个输入的二元位分配合适的概率模型,该过程即为上下文建模。通过上下文索引增量(context index increment,ctxIdxInc)和上下文起始索引(context index Start,ctxIdxStart)即可定位到语法元素所对应的上下文模型。将二进制(bin)值和分配的概率模型一起送入二元算术编码器进行编码后,需要根据bin值更新上下文模型,也就是编码中的自适应过程。
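Loop filtering: The transformed and quantized signal goes through inverse quantization, inverse transform, and prediction compensation to obtain a reconstructed image. Because of quantization, some information in the reconstructed image differs from the original image, that is, the reconstructed image is distorted. Filtering operations can therefore be applied to the reconstructed image, for example a deblocking filter (DB), sample adaptive offset (SAO), or an adaptive loop filter (ALF), which can effectively reduce the distortion caused by quantization. Because the filtered reconstructed images serve as references for subsequently encoded images when predicting future image signals, these filtering operations are also called loop filtering, that is, filtering within the encoding loop.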
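In an embodiment of the present application, FIG. 3 shows a basic flowchart of a video encoder, using intra prediction as an example. The predicted image signal (formula image PCTCN2021131610-appb-000001) is subtracted from the original image signal s_k[x,y] to obtain the residual signal u_k[x,y]. The residual signal u_k[x,y] is transformed and quantized to obtain quantized coefficients. On one hand, the quantized coefficients are entropy coded to obtain the encoded bitstream; on the other hand, they are inverse quantized and inverse transformed to obtain the reconstructed residual signal u'_k[x,y]. The predicted image signal (formula image PCTCN2021131610-appb-000002) and the reconstructed residual signal u'_k[x,y] are added to generate an image signal (formula image PCTCN2021131610-appb-000003). This image signal (formula image PCTCN2021131610-appb-000004) is, on one hand, input to the intra mode decision module and the intra prediction module for intra prediction; on the other hand, it is loop filtered to output the reconstructed image signal s'_k[x,y], which can serve as the reference image of the next frame for motion estimation and motion-compensated prediction. The predicted image signal of the next frame (formula image PCTCN2021131610-appb-000006) is then obtained from the motion-compensated prediction result s'_r[x+m_x,y+m_y] and the intra prediction result (formula image PCTCN2021131610-appb-000005), and the above process is repeated until encoding is complete.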
基于上述的编码过程,在解码端针对每一个CU,在获取到压缩码流(即比特流)之后,进行熵解码获得各种模式信息及量化系数。然后量化系数经过反量化及反变换处理得到残差信号。另一方面,根据已知的编码模式信息,可获得该CU对应的预测信号,然后将残差信号与预测信号相加之后即可得到重建信号,重建信号再经过环路滤波等操作,产生最终的输出信号。
在上述的编解码过程中,对残差信号的变换处理使得残差信号的能量集中在较少的低频系数,也就是多数系数值较小。然后经过后续的量化模块后,较小系数值将变为零值,极大降低了编码残差信号的代价。但是,由于残差分布的多样性,单一的DCT变换无法适应所有的残差特性,因此,DST7和DCT8这样的变换核被引入到变换处理过程中,并且对残差信号进行的水平变换和竖直变换可以采用不同的变换核。以自适应多核变换(Adaptive multiple core transform,AMT)技术为例,对于一个残差信号进行变换处理可能选择的变换组合如下所示:(DCT2,DCT2)、(DCT8,DCT8)、(DCT8,DST7)、(DST7,DCT8)和(DST7,DST7)。对于残差信号具体选择哪种变换组合,需要在编码端使用率失真优化(Rate–Distortion Optimization,RDO)进行决策。另外,在残差块内残差分布相关性较弱的情况下,可以不经过变换过程而直接对残差信号进行量化,即变换跳过。标识当前残差块是否属于变换跳过模式可以通过显式编码和隐式导出两种方式。
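Because the non-zero coefficients in a quantized coefficient block obtained by transforming and quantizing a residual signal are likely to be concentrated in the left and upper areas of the block, while the right and lower areas are often zero, scan region based coefficient coding (Scan Region based Coefficient Coding, SRCC) can be used. SRCC marks, in each quantized coefficient block (of size W×H), a top-left region of size (SRx+1)×(SRy+1) that contains the non-zero coefficients, where SRx is the horizontal coordinate of the rightmost non-zero coefficient in the block, SRy is the vertical coordinate of the bottommost non-zero coefficient, the top-left origin is (0,0), 1≤SRx+1≤W, and 1≤SRy+1≤H. All coefficients outside this region are 0. SRCC uses (SRx,SRy) to determine the region of quantized coefficients that needs to be scanned in a block. As shown in FIG. 4, only the quantized coefficients inside the scan region marked by (SRx,SRy) need to be encoded. The scanning order for encoding is shown in FIG. 5 and may be a reverse zigzag scan from the bottom-right corner to the top-left corner.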
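As a rough illustration of how (SRx, SRy) follows from a quantized coefficient block, the sketch below scans a small block and returns the coordinates of the rightmost and bottommost non-zero coefficients. The function name `srcc_scan_region` and the sample block are assumptions made for this example; they do not come from the application or from any codec specification.

```python
def srcc_scan_region(block):
    """Return (SRx, SRy) for a 2-D list of quantized coefficients.

    SRx is the x coordinate of the rightmost non-zero coefficient and
    SRy is the y coordinate of the bottommost one, with (0, 0) at the
    top-left corner. Returns None when the block has no non-zero
    coefficient (hypothetical convention for this sketch).
    """
    sr_x = sr_y = -1
    for y, row in enumerate(block):
        for x, coef in enumerate(row):
            if coef != 0:
                sr_x = max(sr_x, x)
                sr_y = max(sr_y, y)
    if sr_x < 0:
        return None
    return sr_x, sr_y


# A 4x4 block whose non-zero coefficients sit in the top-left corner.
block = [
    [9, 0, 3, 0],
    [0, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
region = srcc_scan_region(block)  # scan region is (SRx+1) x (SRy+1)
```

Only the coefficients inside the returned region would need to be coded; everything outside it is known to be zero.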
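The coefficients to be encoded within the SRCC scan region are encoded using a layered approach. Specifically, the coordinates of the SRCC scan region are encoded first. Then, within the SRCC scan region and following the scanning order, a flag (significant flag) indicating whether the coefficient at the current position is 0 is encoded for each position in turn. Meanwhile, the positions of the non-zero coefficients are recorded and the number of non-zero coefficients is counted. If the number of non-zero coefficients is greater than 0, the absolute values and signs of the non-zero coefficients at the corresponding positions need to be encoded.
Because the boundary of the SRCC scan region depends mainly on the positions of the rightmost and bottommost non-zero coefficients in the current block, the SRCC scan region may still contain many positions whose coefficient is 0. Along the SRCC scanning order, runs of consecutive zero coefficients may therefore occur. In particular, in the transform skip mode, for example, there is no transform process, the energy of the residual coefficients is not concentrated, and the distribution of non-zero coefficients may be even sparser. In the current AVS3 standard, a significant flag has to be encoded for every position in the SRCC scan region to indicate whether the coefficient at that position is zero, which causes unnecessary redundancy.
Based on the scanning order used when coefficients in the SRCC scan region are encoded and decoded, and on the distribution characteristics of the coefficients in the SRCC scan region, the present application proposes a method of coding SRCC coefficients in segments: for the coefficients in a consecutive run of the scanning order, one syntax element indicates whether these coefficients are all zero. This reduces coding redundancy, helps improve the efficiency of coefficient coding, and further improves video compression performance.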
需要说明的是,本申请提出的系数编码方法,不局限于应用在变换跳过模式下的系数编码上,也可以使用在其它编码模式下的系数编码,例如,运用在所有块的系数编码中;例如,在图像级帧内预测变换跳过允许标志值为1的时候使用;例如,在图像级帧间预测变换跳过允许标志值为1的时候使用;例如,在图像级帧内预测变换跳过允许标志值和图像级帧间预测变换跳过允许标志值均为1的时候使用等等。
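The embodiments of the present application may determine whether a coding block uses the transform skip mode through explicit coding or implicit derivation. With explicit coding, a flag indicating whether the coding block uses the transform skip mode is decoded before the absolute values of the coefficients are decoded, so that the decoded flag states explicitly whether the current coding block skips the transform process. With implicit derivation (Implicit Selection of Transform Skip), there is no such flag; statistics are collected over the decoded coefficients, and whether to skip the transform process is judged from those statistics. For example, all coefficients may be decoded first, and the number of non-zero coefficients and the number of even coefficients among all coefficients (including zero coefficients) are computed. Whether the current coding block uses transform skip is then derived implicitly from the parity of the number of non-zero coefficients, or from the parity of the number of even coefficients among all coefficients. For example, when the number of non-zero coefficients is odd, the current coding block is determined to use transform skip, and when it is even, the block is determined not to use transform skip. As another example, when the number of even coefficients among all coefficients is even, the current coding block is determined to use transform skip, and when it is odd, the block is determined not to use transform skip.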
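The two parity rules given as examples above can be written down in a few lines. This is a minimal sketch under the assumption that the decoded coefficients are available as a flat list of integers; the function name `implicit_transform_skip` and the `rule` parameter are hypothetical and only serve to select between the two example rules.

```python
def implicit_transform_skip(coeffs, rule="nonzero_count"):
    """Derive the transform-skip decision from decoded coefficients.

    rule="nonzero_count": transform skip when the number of non-zero
        coefficients is odd.
    rule="even_count": transform skip when the number of even
        coefficients (zeros included) is even.
    """
    if rule == "nonzero_count":
        nonzero = sum(1 for c in coeffs if c != 0)
        return nonzero % 2 == 1
    if rule == "even_count":
        # Python's % yields 0 or 1 for negative operands too,
        # so this test also works for negative coefficients.
        even = sum(1 for c in coeffs if c % 2 == 0)
        return even % 2 == 0
    raise ValueError("unknown rule: " + rule)


coeffs = [3, 0, 1, 0, 2]
skip_by_nonzero = implicit_transform_skip(coeffs, "nonzero_count")
skip_by_even = implicit_transform_skip(coeffs, "even_count")
```

In the sample list there are three non-zero coefficients (odd) and three even coefficients (0, 0, 2), so the two rules give different answers, which is why an encoder and decoder would have to agree on a single rule.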
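The picture-level intra prediction transform skip enable flag and the picture-level inter prediction transform skip enable flag control whether implicit selection of transform skip can be used for intra/inter prediction. The syntax elements related to the transform skip mode in the embodiments of the present application are explained below.
Implicit selection of transform skip enable flag ist_skip_enable_flag:
A binary variable. A value of '1' indicates that implicit selection of transform skip may be used; a value of '0' indicates that implicit selection of transform skip should not be used. The value of the variable IstSkipEnableFlag can be obtained by decoding the syntax element ist_skip_enable_flag. If the syntax element ist_skip_enable_flag is not present in the bitstream, IstSkipEnableFlag can be set to 0.
Inter transform skip enable flag inter_transform_skip_enable_flag:
A binary variable. A value of '1' indicates that inter transform skip may be used; a value of '0' indicates that inter transform skip should not be used. The value of the variable InterTransformSkipEnableFlag can be obtained by decoding the syntax element inter_transform_skip_enable_flag. If the syntax element inter_transform_skip_enable_flag is not present in the bitstream, InterTransformSkipEnableFlag can be set to 0.
Picture-level intra prediction transform skip enable flag picture_ist_skip_enable_flag:
A binary variable. A value of '1' indicates that the luma intra prediction residual blocks and the luma block-copy intra prediction residual blocks of the current picture may use the transform skip method; a value of '0' indicates that they should not use the transform skip method. The value of the variable PictureIstSkipEnableFlag can be obtained by decoding the syntax element picture_ist_skip_enable_flag. If the syntax element picture_ist_skip_enable_flag is not present in the bitstream, PictureIstSkipEnableFlag can be set to 0.
Picture-level inter prediction transform skip enable flag picture_inter_trasform_skip_flag:
A binary variable. A value of '1' indicates that the luma inter prediction residual blocks of the current picture may use the transform skip method; a value of '0' indicates that they should not use the transform skip method. The value of PictureInterSkipEnableFlag can be obtained by decoding the syntax element picture_inter_trasform_skip_flag. If the syntax element picture_inter_trasform_skip_flag is not present in the bitstream, PictureInterSkipEnableFlag can be set to 0.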
下面结合具体应用场景对本申请实施例的技术方案的实现细节进行详细阐述。
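FIG. 6 shows a flowchart of the steps of a video decoding method in an embodiment of the present application. The video decoding method may be performed by a device with computing and processing capability, for example a terminal device or a server, or by the electronic device shown in FIG. 13. As shown in FIG. 6, the video decoding method mainly includes the following steps S610 to S640.
Step S610: In a coding block of a video image frame, segment the coefficients to be decoded according to the scanning order of the scan region based coefficient coding (SRCC) scan region to obtain coefficient segments each composed of a plurality of coefficients.
In an embodiment of the present application, a video image frame sequence includes a series of pictures. Each picture can be further divided into slices, a slice can be divided into a series of LCUs (or CTUs), and an LCU contains several CUs. Video image frames are encoded block by block. In some video coding standards, for example the H.264 standard, there are macroblocks (MB), and a macroblock can be further divided into multiple prediction blocks (PB) that can be used for predictive coding. The HEVC standard uses basic concepts such as the coding unit (CU), prediction unit (PU), and transform unit (TU), divides block units by function, and describes them with a tree-based structure. For example, a CU can be divided into smaller CUs according to a quadtree, and the smaller CUs can be divided further, forming a quadtree structure. The coding block in the embodiments of the present application may be a CU, or a block smaller than a CU, such as a smaller block obtained by dividing a CU.
Step S620: Decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the flag. The coefficient segment all-zero flag indicates whether the coefficients in the coefficient segment are all zero.
In an embodiment of the present application, decoding the coefficient segment all-zero flag coef_part_all_zero_flag may include: determining whether the decoding mode of the coefficient segment all-zero flag is bypass decoding (bypass) or regular decoding (CABAC); if the decoding mode is bypass decoding, decoding the coefficient segment all-zero flag with the bypass decoding engine; if the decoding mode is regular decoding, decoding the coefficient segment all-zero flag with the regular decoding engine based on a context model.
In the embodiments of the present application, the coefficient segment all-zero flags of different coefficient segments may use the same decoding mode or different decoding modes.
Step S630: If the value of the coefficient segment all-zero flag is a preset first value, decode the significant flags of the coefficients in the coefficient segment one by one in scanning order. A significant flag indicates whether a coefficient is a non-zero coefficient.
The preset first value may be, for example, 0. When the decoded value of the coefficient segment all-zero flag is 0, the coefficients in the corresponding coefficient segment are not all zero, and the segment contains some non-zero coefficients. In this case the significant flag of each coefficient in the coefficient segment can be decoded in scanning order to determine whether each coefficient is non-zero. In an embodiment of the present application, a decoded significant flag of 0 indicates that the coefficient is a zero coefficient, and a decoded significant flag of 1 indicates that the coefficient is a non-zero coefficient.
Step S640: If the value of the coefficient segment all-zero flag is a preset second value, skip decoding the significant flags and set the significant flags of all coefficients in the coefficient segment to zero.
The preset second value may be, for example, 1. When the decoded value of the coefficient segment all-zero flag is 1, all coefficients in the corresponding coefficient segment are zero coefficients. In this case there is no need to decode the significant flag of each coefficient, and the significant flags of all coefficients in the segment can be set to 0 directly.
After the significant flags have been decoded or assigned, each coefficient in the coding block is known to be a zero coefficient or a non-zero coefficient. For non-zero coefficients, the corresponding syntax elements can then be decoded to obtain the absolute value and sign of each non-zero coefficient.
In the video decoding method provided by the embodiments of the present application, based on the scanning order used when coefficients in the SRCC region are encoded and decoded, and on the distribution characteristics of the coefficients in the SRCC region, some of the coefficients in a consecutive run of the scanning order are grouped into a coefficient segment, and one syntax element indicates whether the coefficients in the segment are all zero. This reduces coding redundancy, helps improve the efficiency of coefficient coding, and further improves video compression performance.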
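Steps S620 to S640 can be sketched as a loop over segments. The example below is a simplification under stated assumptions: the bins are taken from a plain Python list that stands in for the output of the bypass or regular (CABAC) decoding engine, the first value is 0 and the second value is 1 as in the examples above, and the function name `decode_significant_flags` is hypothetical.

```python
def decode_significant_flags(bins, segment_lengths,
                             first_value=0, second_value=1):
    """Recover per-coefficient significant flags, in scan order.

    bins: already-decoded binary symbols (stand-in for the arithmetic
        decoder output): one all-zero flag per segment, followed by the
        significant flags of that segment when the flag is first_value.
    segment_lengths: number of coefficients in each segment.
    """
    it = iter(bins)
    flags = []
    for length in segment_lengths:
        all_zero_flag = next(it)              # S620
        if all_zero_flag == second_value:     # S640: nothing more to read
            flags.extend([0] * length)
        elif all_zero_flag == first_value:    # S630: read each flag
            flags.extend([next(it) for _ in range(length)])
        else:
            raise ValueError("unexpected flag value")
    return flags


# Two segments of 2 and 4 coefficients: the first is all zero,
# the second carries explicit significant flags 0, 1, 0, 1.
bins = [1, 0, 0, 1, 0, 1]
sig = decode_significant_flags(bins, [2, 4])
```

For the all-zero segment a single bin replaces two significant flags; the saving grows with the segment length.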
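In an embodiment of the present application, segmenting the coefficients to be decoded according to the scanning order of the SRCC scan region may include: obtaining a coefficient segment length, which indicates the maximum number of coefficients contained in a coefficient segment; and determining consecutive coefficients to be decoded in the scanning order of the SRCC scan region, and grouping coefficients whose number is equal to or less than the coefficient segment length into one coefficient segment.
In an embodiment of the present application, the coefficient segment length CUT_NUM is a constant determined according to the maximum transform unit size, or a dynamic parameter determined according to attribute information of the coding block. The attribute information of the coding block may include at least one of the following: the size of the current transform unit, the shape of the current transform unit, the coordinates of the SRCC scan region, and the number of coefficients in the SRCC scan region.
According to the maximum TU size set by the codec, the embodiments of the present application may directly set the coefficient segment length CUT_NUM in the codec to a constant that does not exceed the maximum TU size. For example, the maximum TU size is 64*64 and CUT_NUM=16.
The embodiments of the present application may also set the coefficient segment length CUT_NUM dynamically according to factors such as the current TU size, the current TU shape, the SRCC scan region coordinates, and the number of SRCC coefficients. In short, the value of CUT_NUM can be set in many ways and is not limited to the preceding examples.
In the embodiments of the present application, the number of coefficients in each coefficient segment may be group_coef_num=CUT_NUM. When num_coef%CUT_NUM!=0, that is, when the total number of coefficients in the SRCC scan region is not an integer multiple of CUT_NUM, segmentation leaves one coefficient segment with fewer than CUT_NUM coefficients. The number of coefficients in that segment is group_coef_num=num_coef%CUT_NUM (group_coef_num<CUT_NUM). The embodiments of the present application may place these coefficients at a specific position in the SRCC scan region, for example in the first segment at the start of the scanning order.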
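The segmentation rule, including the placement of the short segment, can be sketched as follows. The function name `split_into_segments` and the `short_first` switch are assumptions for illustration; the coefficients are assumed to be given as a list that is already in SRCC scanning order.

```python
def split_into_segments(coeffs, cut_num=16, short_first=True):
    """Split coefficients (already in scan order) into segments.

    Every segment holds cut_num coefficients, except that when
    len(coeffs) % cut_num != 0 one segment holds only the remainder.
    short_first=True puts that short segment at the start of the scan
    order; False leaves it at the end.
    """
    n = len(coeffs)
    remainder = n % cut_num
    segments = []
    start = 0
    if remainder and short_first:
        segments.append(coeffs[:remainder])
        start = remainder
    for i in range(start, n, cut_num):
        segments.append(coeffs[i:i + cut_num])
    return segments


# num_coef = 10, CUT_NUM = 4, so the short segment has 10 % 4 = 2 coefficients.
front = split_into_segments(list(range(10)), cut_num=4, short_first=True)
back = split_into_segments(list(range(10)), cut_num=4, short_first=False)
```

Because the decoder knows num_coef from the scan region coordinates and uses the same CUT_NUM, it can rebuild the same segment boundaries without any extra signalling.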
在本申请的一个实施例中,可以选择旁路解码或者常规解码的方式对系数分段全零标志进行解码。其中,常规解码的方式需要根据先前解码过的语法元素或二元位的值,为每一个输入的二元位分配合适的概率模型,即为待解码的系数分段全零标志分配上下文模型。
图7示出了本申请一个实施例中通过常规解码的方式对系数分段全零标志进行解码处理的步骤流程图。在以上实施例的基础上,通过基于上下文模型的常规解码引擎对系数分段全零标志进行解码的方法主要可以包括如下的步骤S710至步骤S730。
步骤S710:获取与SRCC扫描区域相对应的模型选取方式,并根据模型选取方式确定上下文索引增量。
本申请实施例可以在编码器和解码器中预先设置固定的模型选取方式,也可以根据当前编码块从多个可供选择的模型选取方式中动态地选取一种模式。对于一个视频图像帧中的多个编码块而言,可以使用相同的模型选取方式,也可以使用不同的模型选取方式。
步骤S720:根据上下文索引增量选择与系数分段全零标志相对应的上下文模型。
通过上下文索引增量ctxIdxInc和上下文起始索引ctxIdxStart可以定位到语法元素所对应的上下文模型。基于不同取值的上下文索引增量ctxIdxInc可以为系数分段全零标志分配不同的概率模型。
步骤S730:基于所选择的上下文模型,通过常规解码引擎对系数分段全零标志进行算数解码。
将所选择的上下文模型和待解码的系数分段全零标志共同加载至常规解码引擎中,可以通过常规解码引擎完成对系数分段全零标志的解码操作,得到对应的标志取值。本申请实施例中的常规解码引擎可以是基于CABAC技术的二元算术解码器。
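In an embodiment of the present application, the available model selection methods may include three modes. If the model selection method is the first mode, the context index increment is assigned according to the shape of the SRCC scan region; if it is the second mode, the context index increment is assigned according to the area of the SRCC scan region; if it is the third mode, the context index increment is assigned a preset first value.
FIG. 8 shows a flowchart of the steps of determining the context index increment based on the first mode in an embodiment of the present application. As shown in FIG. 8, on the basis of the above embodiments, assigning the context index increment according to the shape of the SRCC scan region may include the following steps S810 to S840.
Step S810: Obtain the region width and region height of the SRCC scan region.
In an embodiment of the present application, the SRCC scan region may be a rectangular region as shown in FIG. 4. After the right-end horizontal coordinate scan_region_x and the bottom-end vertical coordinate scan_region_y of the SRCC scan region in the current coding block have been decoded, the region width is sr_width=scan_region_x+1 and the region height is sr_height=scan_region_y+1.
Step S820: Assign the initial value of the context index increment according to the relative magnitude of the region width and the region height.
In an embodiment of the present application, the region width and the region height are compared, and the initial value of the context index increment is assigned from different preset values. If the region width and the region height are equal, the initial value is set to a preset second value; if the region width is less than the region height, the initial value is set to a preset third value; if the region width is greater than the region height, the initial value is set to a preset fourth value.
For example, the second value may be 1, the third value may be 2, and the fourth value may be 5. The initial value is then assigned as ctxIdxInc=((sr_width==sr_height)?1:(sr_width<sr_height?2:5)). That is, if sr_width equals sr_height, the initial value of ctxIdxInc is 1; if sr_width is less than sr_height, the initial value is 2; if sr_width is greater than sr_height, the initial value is 5.
Step S830: Assign the initial-value increment of the context index increment according to the ratio between the region width and the region height.
In an embodiment of the present application, assigning the initial-value increment may include: computing the ratio between the larger and the smaller of the region width and the region height; and comparing the ratio with multiple preset ratio thresholds to assign the initial-value increment delta according to the comparison result. In the embodiments of the present application, the initial-value increment delta may be negatively correlated with the ratio. For example, let ratio=(sr_width<sr_height)?(sr_height/sr_width):(sr_width/sr_height), delta=(ratio>=3)?0:((ratio>=2)?1:2). When sr_width is less than sr_height, ratio is the larger value sr_height divided by the smaller value sr_width; when sr_height is less than (or equal to) sr_width, ratio is the larger value sr_width divided by the smaller value sr_height. With the two preset ratio thresholds 2 and 3: if ratio is greater than or equal to 3, delta is 0; if ratio is less than 3 and greater than or equal to 2, delta is 1; if ratio is less than 2, delta is 2. The values used in the embodiments of the present application are only examples; other values may be chosen in a specific application scenario, and the embodiments of the present application place no special limitation on this.
Step S840: Assign the context index increment according to the initial value and the initial-value increment.
In an embodiment of the present application, if the region width and the region height are equal, the context index increment takes the initial value, that is, ctxIdxInc+=0; if they are not equal, the context index increment ctxIdxInc takes the sum of the initial value and the initial-value increment delta, that is, ctxIdxInc+=delta.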
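Steps S810 to S840 with the example values from the text (1, 2, 5 for the initial value and thresholds 2 and 3 for the ratio) can be combined into one function. This is a sketch, not a normative derivation; the function name `ctx_idx_inc_by_shape` is hypothetical, and integer division is assumed for the ratio, which gives the same result as real division when comparing against the integer thresholds 2 and 3.

```python
def ctx_idx_inc_by_shape(sr_width, sr_height):
    """First mode: derive ctxIdxInc from the shape of the scan region."""
    # S820: initial value from the width/height comparison.
    if sr_width == sr_height:
        return 1                      # square region, no increment added
    ctx_idx_inc = 2 if sr_width < sr_height else 5

    # S830: increment from the ratio of the larger side to the smaller.
    ratio = max(sr_width, sr_height) // min(sr_width, sr_height)
    if ratio >= 3:
        delta = 0
    elif ratio >= 2:
        delta = 1
    else:
        delta = 2

    # S840: combine initial value and increment.
    return ctx_idx_inc + delta


tall = ctx_idx_inc_by_shape(2, 8)    # width < height, ratio 4
wide = ctx_idx_inc_by_shape(8, 4)    # width > height, ratio 2
```

With these example values the square case uses index 1, tall regions use 2 to 4, and wide regions use 5 to 7, so the seven shape classes map to seven distinct context models.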
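FIG. 9 shows a flowchart of the steps of determining the context index increment based on the second mode in an embodiment of the present application. As shown in FIG. 9, on the basis of the above embodiments, assigning the context index increment according to the area of the SRCC scan region may include the following steps S910 to S930.
Step S910: Obtain the region width and region height of the SRCC scan region.
In an embodiment of the present application, the SRCC scan region may be a rectangular region as shown in FIG. 4. After the right-end horizontal coordinate scan_region_x and the bottom-end vertical coordinate scan_region_y of the SRCC scan region in the current coding block have been decoded, the region width is sr_width=scan_region_x+1 and the region height is sr_height=scan_region_y+1.
Step S920: Determine the region area of the SRCC scan region according to the region width and the region height.
When the SRCC scan region is rectangular, its area sr_area equals the product of the region width sr_width and the region height sr_height, that is, sr_area=sr_width*sr_height.
Step S930: Compare the region area with multiple preset area thresholds to assign the context index increment according to the comparison result.
In an embodiment of the present application, the value of the context index increment may be positively correlated with the area of the SRCC scan region. For example, with the three area thresholds 16, 32, and 64, ctxIdxInc=(sr_area<16)?0:((sr_area<32)?1:(sr_area<64)?2:3). When sr_area is less than 16, ctxIdxInc is 0; when sr_area is greater than or equal to 16 and less than 32, ctxIdxInc is 1; when sr_area is greater than or equal to 32 and less than 64, ctxIdxInc is 2; when sr_area is greater than or equal to 64, ctxIdxInc is 3. The values used in the embodiments of the present application are only examples; other values may be chosen in a specific application scenario, and the embodiments of the present application place no special limitation on this.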
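The second mode reduces to a threshold lookup on the region area. The sketch below uses the example thresholds 16, 32, and 64 from the text; the function name `ctx_idx_inc_by_area` and the `thresholds` parameter are hypothetical.

```python
def ctx_idx_inc_by_area(sr_width, sr_height, thresholds=(16, 32, 64)):
    """Second mode: derive ctxIdxInc from the area of the scan region.

    The result is the number of thresholds that the area has reached,
    which matches the nested conditional expression in the text.
    """
    sr_area = sr_width * sr_height
    ctx_idx_inc = 0
    for threshold in thresholds:
        if sr_area >= threshold:
            ctx_idx_inc += 1
    return ctx_idx_inc


small = ctx_idx_inc_by_area(3, 3)   # area 9
large = ctx_idx_inc_by_area(8, 8)   # area 64
```

Counting the thresholds that have been reached is equivalent to the chained comparison as long as the thresholds are listed in ascending order.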
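In an embodiment of the present application, when the third mode is used to determine the context index increment, the context index increment can be assigned the preset first value directly, so that a single context model is determined. For example, if the first value is 0 (this value is only an example), the context index increment can be set to ctxIdxInc=0, that is, the corresponding context model is located directly by the context start index ctxIdxStart without any added increment.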
在本申请实施例提供的技术方案中,可以有选择地为各个系数分段的系数分段全零标志进行解码。在对编码块的SRCC扫描区域进行分段处理后,可以得到多个系数分段。此后可以对全部的系数分段解码系数分段全零标志,也可以仅对其中一部分系数分段解码系数分段全零标志。
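In an embodiment of the present application, before the coefficient segment all-zero flag of a coefficient segment is decoded, it can first be determined whether a preset decoding skip condition is met. If the decoding skip condition is met, the step of decoding the coefficient segment all-zero flag is skipped and the preset second value is used as the value of the flag. For example, the second value is 1, that is, a coefficient segment whose all-zero flag was not decoded is assumed by default to be an all-zero segment composed of zero coefficients.
In an embodiment of the present application, the decoding skip condition includes at least one of the following: the number of coefficients in the coefficient segment is less than a preset number threshold; the proportion of the coding block area occupied by the SRCC scan region is less than a preset proportion threshold.
In one implementation, when the number of coefficients in the coefficient segment is less than the preset number threshold, the step of decoding the coefficient segment all-zero flag of that segment is skipped and the preset first value is used as the value of the flag. For example, the coefficient segment length CUT_NUM from the above embodiments is used as the number threshold. When the number of coefficients group_coef_num in a coefficient segment is less than the number threshold, the coefficient segment all-zero flag coef_part_all_zero_flag of that segment does not need to be decoded, and coef_part_all_zero_flag=0 is set, which means that the significant flag of every coefficient in the segment has to be decoded in scanning order to determine whether each coefficient is non-zero.
In one implementation, when the proportion of the coding block area occupied by the SRCC scan region is less than the preset proportion threshold, the step of decoding the coefficient segment all-zero flag of the segment is skipped and the preset first value is used as the value of the flag. For example, let sr_area_per=(sr_width*sr_height)/(width*height), where width and height are the width and height of the current coding block and sr_area_per is the proportion of the coding block area occupied by the SRCC scan region. When the proportion sr_area_per for a coefficient segment is less than the set proportion threshold, the coef_part_all_zero_flag of that segment does not need to be decoded, and coef_part_all_zero_flag=0 is set, which means that the significant flag of every coefficient in the segment has to be decoded in scanning order to determine whether each coefficient is non-zero.
In one implementation, when the number of coefficients in the coefficient segment is less than the preset number threshold and the proportion of the coding block area occupied by the SRCC scan region is less than the preset proportion threshold, the step of decoding the coefficient segment all-zero flag of the segment is skipped and the preset first value is used as the value of the flag.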
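The three implementations just described differ only in which condition triggers the skip. The sketch below makes that choice a parameter. It is illustrative only: the function name `infer_all_zero_flag`, the `condition` parameter, and the proportion threshold 0.25 are assumptions (the text gives no numeric proportion threshold), and the inferred value defaults to 0 as in these three implementations.

```python
def infer_all_zero_flag(group_coef_num, sr_width, sr_height, width, height,
                        condition="count", num_threshold=16,
                        area_ratio_threshold=0.25, inferred_value=0):
    """Return the inferred coef_part_all_zero_flag, or None if the flag
    must be decoded from the bitstream.

    condition: "count" (segment has too few coefficients),
               "area"  (scan region covers too little of the block),
               "both"  (both conditions must hold).
    """
    few_coefficients = group_coef_num < num_threshold
    sr_area_per = (sr_width * sr_height) / (width * height)
    small_region = sr_area_per < area_ratio_threshold

    if condition == "count":
        skip = few_coefficients
    elif condition == "area":
        skip = small_region
    elif condition == "both":
        skip = few_coefficients and small_region
    else:
        raise ValueError("unknown condition: " + condition)
    return inferred_value if skip else None


# A 5-coefficient segment with CUT_NUM = 16 as the number threshold.
short_segment = infer_all_zero_flag(5, 8, 8, 8, 8, condition="count")
# A 2x2 scan region inside an 8x8 block covers 4/64 of the block.
tiny_region = infer_all_zero_flag(16, 2, 2, 8, 8, condition="area")
```

When the function returns 0 the decoder goes straight to decoding the significant flags of the segment; when it returns None the all-zero flag is read from the bitstream as usual.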
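In an embodiment of the present application, if there is a specific coefficient segment whose number of coefficients is less than the coefficient segment length, the specific coefficient segment is located at a specific position in the SRCC scan region. For example, the specific position may be the scan start position or the scan end position in the SRCC scan region. Accordingly, the specific coefficient segment may be the first or the last coefficient segment in the scanning order.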
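The above embodiments describe, from the decoder side, the scheme for decoding the coefficients in the SRCC scan region in segments; the encoder side can use encoding schemes corresponding to each of the above embodiments. FIG. 10 shows a flowchart of the steps of a video encoding method in an embodiment of the present application. The video encoding method may be performed by a device with computing and processing capability, for example a terminal device or a server, or by the electronic device shown in FIG. 13. As shown in FIG. 10, the video encoding method mainly includes the following steps S1010 to S1040.
Step S1010: In a coding block of a video image frame, segment the coefficients to be encoded according to the scanning order of the scan region based coefficient coding (SRCC) scan region to obtain coefficient segments each composed of a plurality of coefficients;
Step S1020: Determine the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and encode the coefficient segment all-zero flag;
Step S1030: If the value of the coefficient segment all-zero flag is the preset first value, encode the significant flags of the coefficients in each coefficient segment one by one in scanning order, a significant flag indicating whether a coefficient is a non-zero coefficient;
Step S1040: If the value of the coefficient segment all-zero flag is the preset second value, skip encoding the significant flags of the coefficients in the coefficient segment.
In the embodiments of the present application, if the value of the coefficient segment all-zero flag is the preset second value, all coefficients in the coefficient segment are zero coefficients, and the significant flags of the coefficients in the segment do not need to be encoded.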
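Steps S1020 to S1040 can be sketched as the mirror image of the decoder loop. The example below only produces the sequence of bins that would be handed to the bypass or regular (CABAC) encoding engine; the arithmetic coding itself is not modelled. The function name `encode_significant_flags` is hypothetical, and the first and second values are taken as 0 and 1 as in the decoder-side examples.

```python
def encode_significant_flags(segments, first_value=0, second_value=1):
    """Produce the bins for the all-zero flags and significant flags.

    segments: list of coefficient lists, already split in scan order.
    Returns the bins in the order they would be passed to the entropy
    encoder.
    """
    bins = []
    for segment in segments:
        if all(coef == 0 for coef in segment):
            # S1020 + S1040: one flag, no significant flags.
            bins.append(second_value)
        else:
            # S1020 + S1030: flag, then one significant flag per coefficient.
            bins.append(first_value)
            bins.extend(1 if coef != 0 else 0 for coef in segment)
    return bins


segments = [[0, 0], [0, 5, 0, -1]]
bins = encode_significant_flags(segments)
```

Here six significant flags are replaced by six bins in total, so nothing is saved on this tiny input; the benefit appears when all-zero segments are long or frequent, since each one costs a single bin instead of one bin per coefficient.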
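The details of the video encoding method in the embodiments of the present application correspond to the video decoding method in the above embodiments and are not repeated here.
Based on the above embodiments, and on the scanning order used when coefficients in the SRCC region are encoded and decoded and the distribution characteristics of the coefficients in the SRCC region, the present application proposes a method of coding SRCC coefficients in segments: for the coefficients in a consecutive run of the scanning order, one syntax element indicates whether these coefficients are all zero. This reduces coding redundancy, helps improve the efficiency of coefficient coding, and further improves video compression performance. It should be noted that the above embodiments use the decoder side as an example to describe the SRCC coefficient encoding and decoding method provided in the embodiments of the present application, but the related technical solutions can also be applied at the encoder side, and the present application is not limited in this respect.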
应当注意,尽管在附图中以特定顺序描述了本申请中方法的各个步骤,但是,这并非要求或者暗示必须按照该特定顺序来执行这些步骤,或是必须执行全部所示的步骤才能实现期望的结果。附加的或备选的,可以省略某些步骤,将多个步骤合并为一个步骤执行,以及/或者将一个步骤分解为多个步骤执行等。
以下介绍本申请的装置实施例,可以用于执行本申请上述实施例中的视频解码方法。图11示出了本申请实施例提供的视频解码装置的结构框图。如图11所示,视频解码装置1100主要可以包括:
系数分段模块1110,被配置为在视频图像帧的编码块中,按照SRCC扫描区域的扫描顺序对待解码的系数进行分段处理,得到由多个系数组成的系数分段;第一解码模块1120,被配置为解码所述系数分段的系数分段全零标志,得到所述系数分段全零标志的取值,所述系数分段全零标志用于表示所述系数分段中的系数是否全部为零;第二解码模块1130,被配置为若所述系数分段全零标志的取值为预设的第一数值,则按照扫描顺序依次解码所述系数分段内的各个系数的有效标志,所述有效标志用于表示所述系数是否为非零系数;系数赋值模块1140,被配置为若所述系数分段全零标志的取值为预设的第二数值,则将所述系数分段内的各个系数的有效标志均赋值为零。
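In some embodiments of the present application, based on the above embodiments, the first decoding module 1120 includes: a mode determination unit, configured to determine whether the decoding mode of the coefficient segment all-zero flag is bypass decoding or regular decoding; a bypass decoding unit, configured to decode the coefficient segment all-zero flag with the bypass decoding engine if the decoding mode of the flag is determined to be bypass decoding; and a regular decoding unit, configured to decode the coefficient segment all-zero flag with the regular decoding engine based on a context model if the decoding mode of the flag is determined to be regular decoding.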
在本申请的一些实施例中,基于以上各实施例,所述常规解码单元包括:增量确定子单元,被配置为获取与所述SRCC扫描区域相对应的模型选取方式,并根据所述模型选取方式确定上下文索引增量;模型选择子单元,被配置为根据所述上下文索引增量选择与所述系数分段全零标志相对应的上下文模型;算数解码子单元,被配置为基于所选择的上下文模型,通过常规解码引擎对所述系数分段全零标志进行算数解码。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:若所述模型选取方式为第一模式,根据所述SRCC扫描区域的形状为所述上下文索引增量赋值;若所述模型选取方式为第二模式,根据所述SRCC扫描区域的面积为所述上下文索引增量赋值;若所述模型选取方式为第三模式,根据预设的第一数值为所述上下文索引增量赋值。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:获取所述SRCC扫描区域的区域宽度和区域高度;根据所述区域宽度和所述区域高度的数值大小关系,为所述上下文索引增量的初始值赋值;根据所述区域宽度和所述区域高度的数值比例关系,为所述上下文索引增量的初始值增量赋值;根据所述初始值和所述初始值增量为所述上下文索引增量赋值。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:若所述区域宽度和所述区域高度数值相等,以预设的第二数值为所述上下文索引增量的初始值赋值;若所述区域宽度小于所述区域高度,以预设的第三数值为所述上下文索引增量的初始值赋值;若所述区域宽度大于所述区域高度,以预设的第四数值为所述上下文索引增量的初始值赋值。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:计算所述区域宽度和所述区域高度中的较大值与较小值之间的数值比例;将所述数值比例与预设的多个比例阈值进行比较,以根据比较结果为所述上下文索引增量的初始值增量赋值。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:若所述区域宽度与所述区域高度数值相等,则以所述初始值为所述上下文索引增量赋值;若所述区域宽度与所述区域高度数值不相等,则以所述初始值和所述初始值增量的和为所述上下文索引增量赋值。
在本申请的一些实施例中,基于以上各实施例,所述增量确定子单元还被配置为:获取所述SRCC扫描区域的区域宽度和区域高度;根据所述区域宽度和所述区域高度确定所述SRCC扫描区域的区域面积;将所述区域面积与多个预设的面积阈值进行比较,以根据比较结果为所述上下文索引增量赋值。
在本申请的一些实施例中,基于以上各实施例,所述视频解码装置1100还包括:条件确定模块,被配置为确定是否满足预设的解码跳过条件;解码跳过模块,被配置为若满足所述解码跳过条件,则跳过对所述系数分段全零标志的解码步骤,并以预设的第一数值作为所述系数分段全零标志的取值。
在本申请的一些实施例中,基于以上各实施例,所述解码跳过条件包括以下条件中的至少一种:所述系数分段中的系数数量小于预设的数量阈值;所述SRCC扫描区域在所述编码块中的面积占比小于预设的占比阈值。
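In some embodiments of the present application, based on the above embodiments, the coefficient segmentation module 1110 includes: a length obtaining unit, configured to obtain a coefficient segment length, which indicates the maximum number of coefficients contained in a coefficient segment; and a coefficient segmentation unit, configured to determine consecutive coefficients to be decoded in the scanning order of the SRCC scan region, and to group coefficients whose number is equal to or less than the coefficient segment length into one coefficient segment.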
在本申请的一些实施例中,基于以上各实施例,所述系数分段长度是根据最大转换单元尺寸确定的常数或者根据所述编码块的属性信息确定的动态参数。
在本申请的一些实施例中,基于以上各实施例,所述编码块的属性信息包括以下信息中的至少一种:当前转换单元的尺寸、当前转换单元的形状、SRCC扫描区域的坐标、SRCC扫描区域中的系数数量。
在本申请的一些实施例中,基于以上各实施例,所述方法应用于满足预设编码条件的编码块,所述预设编码条件的编码块包括:变换跳过模式的编码块;或者,图像级帧内预测变换跳过允许标志值为1的编码块;或者,图像级帧间预测变换跳过允许标志值为1的编码块;或者,图像级帧内预测变换跳过允许标志值和图像级帧间预测变换跳过允许标志值均为1的编码块;或者,所有编码块。
图12示出了本申请一个实施例中的视频编码装置的结构框图。如图12所示,视频编码装置1200主要可以包括:
系数分段模块1210,被配置为在视频图像帧的编码块中,按照基于扫描区域的系数编码SRCC扫描区域的扫描顺序对待编码的系数进行分段处理,得到由多个系数组成的系数分段;
第一编码模块1220,被配置为根据所述系数分段中的系数是否全部为零,确定所述系数分段全零标志的取值,并对所述系数分段全零标志进行编码;
第二编码模块1230,被配置为若所述系数分段全零标志的取值为预设的第一数值,则按照扫描顺序依次编码每个系数分段内各个系数的有效标志,所述有效标志用于表示所述系数是否为非零系数;
编码跳过模块1240,被配置为若所述系数分段全零标志的取值为预设的第二数值,则跳过编码所述系数分段内的各个系数的有效标志。
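The specific details of the video decoding apparatus provided in the embodiments of the present application have been described in detail in the corresponding method embodiments and are not repeated here.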
图13示意性地示出了用于实现本申请实施例的电子设备的计算机系统结构框图。
需要说明的是,图13示出的电子设备的计算机系统1300仅是一个示例,不应对本申请实施例的功能和使用范围带来任何限制。
如图13所示,计算机系统1300包括中央处理器1301(Central Processing Unit,CPU),其可以根据存储在只读存储器1302(Read-Only Memory,ROM)中的程序或者从存储部分1308加载到随机访问存储器1303(Random Access Memory,RAM)中的程序而执行各种适当的动作和处理。在随机访问存储器1303中,还存储有系统操作所需的各种程序和数据。中央处理器1301、只读存储器1302以及随机访问存储器1303通过总线1304彼此相连。输入/输出接口1305(Input/Output接口,即I/O接口)也连接至总线1304。
以下部件连接至输入/输出接口1305:包括键盘、鼠标等的输入部分1306;包括诸如阴极射线管(Cathode Ray Tube,CRT)、液晶显示器(Liquid Crystal Display,LCD)等以及扬声器等的输出部分1307;包括硬盘等的存储部分1308;以及包括诸如局域网卡、调制解调器等的网络接口卡的通信部分1309。通信部分1309经由诸如因特网的网络执行通信处理。驱动器1310也根据需要连接至输入/输出接口1305。可拆卸介质1313,诸如磁盘、光盘、磁光盘、半导体存储器等等,根据需要安装在驱动器1310上,以便于从其上读出的计算机程序根据需要被安装入存储部分1308。
特别地,根据本申请的实施例,各个方法流程图中所描述的过程可以被实现为计算机软件程序。例如,本申请的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信部分1309从网络上被下载和安装,和/或从可拆卸介质1313被安装。在该计算机程序被中央处理器1301执行时,执行本申请的系统中限定的各种功能。
在本申请实施例提供的技术方案中,基于SRCC扫描区域内系数进行编解码时的扫描顺序,根据SRCC扫描区域内系数的分布特性,提出了一种分段进行SRCC系数编码的方法,对连续一段扫描顺序上的系数,用一个语法元素表明这些系数是否全是零,从而可以减少编码冗余,有助于提升系数编码的编码效率,进一步提升视频压缩性能。
需要说明的是,本申请实施例所示的非易失性计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)、闪存、光纤、便携式紧凑磁盘只读存储器(Compact Disc Read-Only Memory,CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本申请中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本申请中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还 可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:无线、有线等等,或者上述的任意合适的组合。
附图中的流程图和框图,图示了按照本申请各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,上述模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图或流程图中的每个方框、以及框图或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
应当注意,尽管在上文详细描述中提及了用于动作执行的设备的若干模块或者单元,但是这种划分并非强制性的。实际上,根据本申请的实施方式,上文描述的两个或更多模块或者单元的特征和功能可以在一个模块或者单元中具体化。反之,上文描述的一个模块或者单元的特征和功能可以进一步划分为由多个模块或者单元来具体化。
通过以上的实施方式的描述,本领域的技术人员易于理解,这里描述的示例实施方式可以通过软件实现,也可以通过软件结合必要的硬件的方式来实现。因此,根据本申请实施方式的技术方案可以以软件产品的形式体现出来,该软件产品可以存储在一个非易失性存储介质(可以是CD-ROM,U盘,移动硬盘等)中或网络上,包括若干指令以使得一台计算设备(可以是个人计算机、服务器、触控终端、或者网络设备等)执行根据本申请实施方式的方法。
本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本申请的其它实施方案。本申请旨在涵盖本申请的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本申请的一般性原理并包括本申请未公开的本技术领域中的公知常识或惯用技术手段。

Claims (22)

  1. 一种视频解码方法,由电子设备执行,其特征在于,包括:
    在视频图像帧的编码块中,按照基于扫描区域的系数编码SRCC扫描区域的扫描顺序对待解码的系数进行分段处理,得到一个或多个系数分段;
    解码各个所述系数分段的系数分段全零标志,得到所述系数分段全零标志的取值,所述系数分段全零标志用于表示所述系数分段中的系数是否全部为零;
    若所述系数分段全零标志的取值为预设的第一数值,则按照扫描顺序依次解码所述系数分段内的各个系数的有效标志,所述有效标志用于表示所述系数是否为非零系数;
    若所述系数分段全零标志的取值为预设的第二数值,则跳过解码所述有效标志,并将所述系数分段内的各个系数的有效标志均赋值为零。
  2. 根据权利要求1所述的视频解码方法,其特征在于,按照SRCC扫描区域的扫描顺序对待解码的系数进行分段处理,包括:
    获取系数分段长度,所述系数分段长度用于表示系数分段中所包含系数的最大数量;
    按照SRCC扫描区域的扫描顺序依次确定连续的待解码的系数,并将数量等于或小于所述系数分段长度的系数组成一个系数分段。
  3. 根据权利要求2所述的视频解码方法,其特征在于,所述系数分段长度是根据最大转换单元尺寸确定的常数或者根据所述编码块的属性信息确定的动态参数。
  4. 根据权利要求3所述的视频解码方法,其特征在于,所述编码块的属性信息包括以下信息中的至少一种:当前转换单元的尺寸、当前转换单元的形状、SRCC扫描区域的坐标、SRCC扫描区域中的系数数量。
  5. 根据权利要求1所述的视频解码方法,其特征在于,解码所述系数分段的系数分段全零标志,包括:
    确定所述系数分段全零标志的解码方式为旁路解码或者常规解码;
    若确定所述系数分段全零标志的解码方式为旁路解码,则通过旁路解码引擎对所述系数绝对值阈值标志进行解码;
    若确定所述系数分段全零标志的解码方式为常规解码,则通过基于上下文模型的常规解码引擎对所述系数分段全零标志进行解码。
  6. 根据权利要求5所述的视频解码方法,其特征在于,通过基于上下文模型的常规解码引擎对所述系数分段全零标志进行解码,包括:
    获取与所述SRCC扫描区域相对应的模型选取方式,并根据所述模型选取方式确定上下文索引增量;
    根据所述上下文索引增量选择与所述系数分段全零标志相对应的上下文模型;
    基于所选择的上下文模型,通过常规解码引擎对所述系数分段全零标志进行算数解码。
  7. 根据权利要求6所述的视频解码方法,其特征在于,根据所述模型选取方式确定上下文索引增量,包括:
    若所述模型选取方式为第一模式,根据所述SRCC扫描区域的形状为所述上下文索引增量赋值;
    若所述模型选取方式为第二模式,根据所述SRCC扫描区域的面积为所述上下文索引增量赋值;
    若所述模型选取方式为第三模式,根据预设的第一数值为所述上下文索引增量 赋值。
  8. 根据权利要求7所述的视频解码方法,其特征在于,根据所述SRCC扫描区域的形状为所述上下文索引增量赋值,包括:
    获取所述SRCC扫描区域的区域宽度和区域高度;
    根据所述区域宽度和所述区域高度的数值大小关系,为所述上下文索引增量的初始值赋值;
    根据所述区域宽度和所述区域高度的数值比例关系,为所述上下文索引增量的初始值增量赋值;
    根据所述初始值和所述初始值增量为所述上下文索引增量赋值。
  9. 根据权利要求8所述的视频解码方法,其特征在于,根据所述区域宽度和所述区域高度的数值大小关系,为所述上下文索引增量的初始值赋值,包括:
    若所述区域宽度和所述区域高度数值相等,以预设的第二数值为所述上下文索引增量的初始值赋值;
    若所述区域宽度小于所述区域高度,以预设的第三数值为所述上下文索引增量的初始值赋值;
    若所述区域宽度大于所述区域高度,以预设的第四数值为所述上下文索引增量的初始值赋值。
  10. 根据权利要求8所述的视频解码方法,其特征在于,根据所述区域宽度和所述区域高度的数值比例关系,为所述上下文索引增量的初始值增量赋值,包括:
    计算所述区域宽度和所述区域高度中的较大值与较小值之间的数值比例;
    将所述数值比例与预设的多个比例阈值进行比较,以根据比较结果为所述上下文索引增量的初始值增量赋值。
  11. 根据权利要求8所述的视频解码方法,其特征在于,根据所述初始值和所述初始值增量为所述上下文索引增量赋值,包括:
    若所述区域宽度与所述区域高度数值相等,则以所述初始值为所述上下文索引增量赋值;
    若所述区域宽度与所述区域高度数值不相等,则以所述初始值和所述初始值增量的和为所述上下文索引增量赋值。
  12. 根据权利要求7所述的视频解码方法,其特征在于,根据所述SRCC扫描区域的面积为所述上下文索引增量赋值,包括:
    获取所述SRCC扫描区域的区域宽度和区域高度;
    根据所述区域宽度和所述区域高度确定所述SRCC扫描区域的区域面积;
    将所述区域面积与多个预设的面积阈值进行比较,以根据比较结果为所述上下文索引增量赋值。
  13. 根据权利要求1至12中任意一项所述的视频解码方法,其特征在于,在解码所述系数分段的系数分段全零标志之前,所述方法还包括:
    确定是否满足预设的解码跳过条件;
    若满足所述解码跳过条件,则跳过对所述系数分段全零标志的解码步骤,并以预设的第二数值作为所述系数分段全零标志的取值。
  14. 根据权利要求13所述的视频解码方法,其特征在于,所述解码跳过条件包括以下条件中的至少一种:
    所述系数分段中的系数数量小于预设的数量阈值;
    所述SRCC扫描区域在所述编码块中的面积占比小于预设的占比阈值。
  15. 根据权利要求1至12中任意一项所述的视频解码方法,其特征在于,若存在系数数量小于系数分段长度的特定系数分段,则所述特定系数分段位于所述SRCC扫描区域的特定位置。
  16. 根据权利要求15所述的视频解码方法,其特征在于,所述特定系数分段是扫描顺序中的第一个系数分段或者最后一个系数分段。
  17. 根据权利要求1至12中任意一项所述的视频解码方法,其特征在于,所述方法应用于满足预设编码条件的编码块,所述预设编码条件的编码块包括:
    变换跳过模式的编码块;
    或者,图像级帧内预测变换跳过允许标志值为1的编码块;
    或者,图像级帧间预测变换跳过允许标志值为1的编码块;
    或者,图像级帧内预测变换跳过允许标志值和图像级帧间预测变换跳过允许标志值均为1的编码块;
    或者,所有编码块。
  18. 一种视频编码方法,由电子设备执行,其特征在于,包括:
    在视频图像帧的编码块中,按照基于扫描区域的系数编码SRCC扫描区域的扫描顺序对待编码的系数进行分段处理,得到一个或多个系数分段;
    根据所述系数分段中的系数是否全部为零,确定所述系数分段全零标志的取值,并对所述系数分段全零标志进行编码;
    若所述系数分段全零标志的取值为预设的第一数值,则按照扫描顺序依次编码每个系数分段内各个系数的有效标志,所述有效标志用于表示所述系数是否为非零系数;
    若所述系数分段全零标志的取值为预设的第二数值,则跳过编码所述系数分段内的各个系数的有效标志。
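  19. A video decoding apparatus, comprising:
    a coefficient segmentation module, configured to segment, in a coding block of a video image frame, the coefficients to be decoded according to the scan order of an SRCC scan region, to obtain coefficient segments each composed of a plurality of coefficients;
    a first decoding module, configured to decode the coefficient segment all-zero flag of the coefficient segment to obtain the value of the coefficient segment all-zero flag, the coefficient segment all-zero flag indicating whether the coefficients in the coefficient segment are all zero;
    a second decoding module, configured to decode, in scan order, the significance flag of each coefficient in the coefficient segment if the value of the coefficient segment all-zero flag is a preset first value, the significance flag indicating whether the coefficient is a non-zero coefficient;
    a coefficient assignment module, configured to skip decoding the significance flags and assign zero to the significance flag of each coefficient in the coefficient segment if the value of the coefficient segment all-zero flag is a preset second value.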
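
The decoding flow that the modules of claim 19 carry out can be sketched in Python as below. The input `bins` stands in for already entropy-decoded binary symbols, and `FIRST_VALUE = 1`, `SECOND_VALUE = 0`, and the segment length are assumptions made for illustration rather than values stated in the application.

```python
FIRST_VALUE = 1   # assumed: segment contains at least one non-zero coefficient
SECOND_VALUE = 0  # assumed: all coefficients in the segment are zero


def decode_sig_flags(bins, num_coeffs, seg_len=16):
    """Sketch: rebuild per-coefficient significance flags from decoded bins."""
    it = iter(bins)
    sig_flags = []
    for start in range(0, num_coeffs, seg_len):
        count = min(seg_len, num_coeffs - start)  # last segment may be shorter
        seg_flag = next(it)                       # coefficient segment all-zero flag
        if seg_flag == FIRST_VALUE:
            # Read one significance flag per coefficient, in scan order.
            sig_flags.extend([next(it) for _ in range(count)])
        else:
            # Nothing is read; every significance flag in the segment is zero.
            sig_flags.extend([0] * count)
    return sig_flags
```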
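  20. A video encoding apparatus, comprising:
    a coefficient segmentation module, configured to segment, in a coding block of a video image frame, the coefficients to be encoded according to the scan order of a scan region-based coefficient coding (SRCC) scan region, to obtain coefficient segments each composed of a plurality of coefficients;
    a first encoding module, configured to determine the value of the coefficient segment all-zero flag according to whether the coefficients in the coefficient segment are all zero, and to encode the coefficient segment all-zero flag;
    a second encoding module, configured to encode, in scan order, the significance flag of each coefficient in each coefficient segment if the value of the coefficient segment all-zero flag is a preset first value, the significance flag indicating whether the coefficient is a non-zero coefficient;
    an encoding skip module, configured to skip encoding the significance flags of the coefficients in the coefficient segment if the value of the coefficient segment all-zero flag is a preset second value.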
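  21. A non-volatile computer-readable medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 18.
  22. An electronic device, comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to perform the method according to any one of claims 1 to 18 by executing the executable instructions.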
PCT/CN2021/131610 2021-02-21 2021-11-19 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device WO2022174637A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/989,400 US20230082386A1 (en) 2021-02-21 2022-11-17 Video encoding method and apparatus, video decoding method and apparatus, computer-readable medium, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110194838.9A CN114979641A (zh) 2021-02-21 2021-02-21 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN202110194838.9 2021-02-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/989,400 Continuation US20230082386A1 (en) 2021-02-21 2022-11-17 Video encoding method and apparatus, video decoding method and apparatus, computer-readable medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2022174637A1 true WO2022174637A1 (zh) 2022-08-25

Family

ID=82932221

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131610 WO2022174637A1 (zh) 2021-02-21 2021-11-19 Video encoding and decoding method and apparatus, computer-readable medium, and electronic device

Country Status (3)

Country Link
US (1) US20230082386A1 (zh)
CN (1) CN114979641A (zh)
WO (1) WO2022174637A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130107969A1 (en) * 2011-11-01 2013-05-02 Research In Motion Limited Multi-level significance maps for encoding and decoding
US20130215969A1 (en) * 2011-12-20 2013-08-22 General Instrument Corporation Method and apparatus for last coefficient indexing for high efficiency video coding
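CN110708552A (zh) * 2019-08-27 2020-01-17 Hangzhou Hikvision Digital Technology Co., Ltd. Decoding method, encoding method, and apparatus
CN112106365A (zh) * 2018-04-27 2020-12-18 InterDigital VC Holdings, Inc. Method and apparatus for adaptive context modeling in video encoding and decoding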

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2595380B1 (en) * 2011-11-19 2015-10-21 BlackBerry Limited Multi-level significance map scanning
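JP2020072277A (ja) * 2017-03-03 2020-05-07 Sharp Corporation Video encoding device and video decoding device
CN109788285B (zh) * 2019-02-27 2020-07-28 Peking University Shenzhen Graduate School Context model selection method and apparatus for an end flag bit of quantized coefficients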

Also Published As

Publication number Publication date
CN114979641A (zh) 2022-08-30
US20230082386A1 (en) 2023-03-16

Similar Documents

Publication Publication Date Title
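CN112533000B (zh) Video decoding method and apparatus, computer-readable medium, and electronic device
WO2022078304A1 (zh) Video decoding method and apparatus, computer-readable medium, program, and electronic device
WO2022062880A1 (zh) Video decoding method and apparatus, computer-readable medium, and electronic device
WO2022063033A1 (zh) Video decoding method, video encoding method, apparatus, computer-readable medium, and electronic device
US20230097724A1 (en) Video encoding method and apparatus, video decoding method and apparatus, computer-readable medium, and electronic device
CN113207002B (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
US20230053118A1 (en) Video decoding method, video coding method, and related apparatus
WO2022174637A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
WO2022174638A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
WO2022037478A1 (zh) Video decoding method, video encoding method, apparatus, medium, and electronic device
CN114079772B (zh) Video decoding method and apparatus, computer-readable medium, and electronic device
WO2022174701A1 (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
WO2022116854A1 (zh) Video decoding method and apparatus, readable medium, electronic device, and program product
WO2023130899A1 (zh) Loop filtering method, video encoding and decoding method, apparatus, medium, and electronic device
WO2023051222A1 (zh) Filtering and encoding/decoding method and apparatus, computer-readable medium, and electronic device
CN115209146A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device
CN114979656A (zh) Video encoding and decoding method and apparatus, computer-readable medium, and electronic device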

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21926351

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22/01/2024)