WO2022037464A1 - Video decoding method, video encoding method, apparatus, device, and storage medium - Google Patents

Video decoding method, video encoding method, apparatus, device, and storage medium

Info

Publication number
WO2022037464A1
Authority
WO
WIPO (PCT)
Prior art keywords
string
string length
current
value
length
Prior art date
Application number
PCT/CN2021/112133
Other languages
English (en)
French (fr)
Inventor
王英彬
许晓中
刘杉
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP21857559.5A (published as EP4113999A4)
Publication of WO2022037464A1 (zh)
Priority to US17/939,767 (published as US20230020127A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/124 Quantisation
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • the embodiments of the present application relate to the technical field of video encoding and decoding, and in particular, to a video decoding method, a video encoding method, an apparatus, a device, and a storage medium.
  • a coding block is divided into a series of pixel strings or unmatched pixels according to a certain scanning order, which essentially allows the coding block to be divided into pixel strings of any integer pixel length; an unmatched pixel, for example, can be regarded as a pixel string of length 1. The length of a pixel string can therefore be any positive integer, such as 1, 2, 3, or 4.
  • Embodiments of the present application provide a video decoding method, a video encoding method, an apparatus, a device, and a storage medium, which can improve the encoding and decoding efficiency of pixel strings.
  • the technical solution is as follows:
  • a video decoding method is provided, the method is performed by a decoding end device, and the method includes:
  • decoding a binary symbol string of the string length information of the current string from the code stream, where the string length information includes information related to the string length of the current string;
  • performing inverse binarization on the binary symbol string according to the string length resolution of the current string to obtain the string length information;
  • determining the string length of the current string according to the string length information.
  • a video encoding method is provided, the method is performed by an encoding end device, and the method includes:
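  • determining the string length of the current string;
  • determining the string length information of the current string based on the string length of the current string, where the string length information includes information related to the string length of the current string;
  • performing binarization on the string length information according to the string length resolution of the current string to obtain a binary symbol string of the string length information.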
  • a video decoding apparatus includes:
  • the binary symbol acquisition module is used to decode the binary symbol string of the string length information of the current string from the code stream, and the string length information includes information related to the string length of the current string;
  • an inverse binarization processing module configured to perform inverse binarization processing on the binary symbol string according to the string length resolution of the current string to obtain the string length information
  • a string length determination module configured to determine the string length of the current string according to the string length information.
  • a video encoding apparatus includes:
  • the string length determination module is used to determine the string length of the current string
  • a length information determination module configured to determine the string length information of the current string based on the string length of the current string, where the string length information includes information related to the string length of the current string;
  • the binarization processing module is configured to perform binarization processing on the string length information according to the string length resolution of the current string to obtain a binary symbol string of the string length information.
  • a computer device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above video decoding method.
  • a computer device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the above video encoding method.
  • a computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above video decoding method.
  • a computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the above video encoding method.
  • a computer program product or computer program includes computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above video decoding method.
  • a computer program product or computer program includes computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above video encoding method.
  • the length of the pixel strings in a coding block can be limited to a multiple of the string length resolution, which improves the uniformity of the pixel strings and enables the encoding end and the decoding end to operate under memory-aligned conditions, improving the encoding and decoding efficiency of pixel strings.
  • the present application proposes a method for binarizing and inverse-binarizing string length information according to the string length resolution, which improves the encoding and decoding of string length information under different string length resolutions. Specifically, when the encoding end binarizes the value of the string length information, it first compresses the value using the string length resolution and then binarizes the compressed value (that is, the quotient obtained by dividing the value by the string length resolution) instead of binarizing the value directly.
  • when the decoding end performs inverse binarization, it first recovers the compressed value and then obtains the value of the string length information from the compressed value and the string length resolution (that is, by multiplying the compressed value by the string length resolution).
  • in this way, the number of binary symbols required for the binary representation is reduced, which lowers the complexity of encoding and decoding and benefits encoding and decoding performance.
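  • The compression and recovery steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the binarization actually specified by the application: the function names are hypothetical, and a fixed-length code is assumed only to show how the number of binary symbols shrinks.

```python
def compress_string_length(value: int, resolution: int) -> int:
    """Encoder side: divide the value by the string length resolution.

    Assumes the value is a multiple of the resolution, as the text requires.
    """
    if resolution <= 0:
        raise ValueError("resolution must be a positive integer")
    if value % resolution != 0:
        raise ValueError("value must be a multiple of the resolution")
    return value // resolution


def restore_string_length(compressed: int, resolution: int) -> int:
    """Decoder side: multiply the recovered quotient by the resolution."""
    return compressed * resolution


def fixed_length_bits(max_value: int) -> int:
    """Bits needed by a fixed-length code for values 0..max_value.

    Fixed-length coding is an assumption made for illustration only.
    """
    return max(1, max_value.bit_length())


if __name__ == "__main__":
    resolution = 4
    length = 28
    quotient = compress_string_length(length, resolution)
    print(quotient, restore_string_length(quotient, resolution))
    # Largest value 63 without compression versus 63 // 4 with compression.
    print(fixed_length_bits(63), fixed_length_bits(63 // resolution))
```

  • With a resolution of 4, a value of 28 is binarized as the quotient 7 and restored to 28 at the decoding end; under the fixed-length assumption, values up to 63 need 6 bits directly but only 4 bits after compression.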
  • FIG. 1 is a basic flow chart of a video encoding process exemplarily shown in the present application.
  • FIG. 2 is a schematic diagram of an inter-frame prediction mode provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a candidate motion vector provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an intra block copy mode provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an intra string copy mode provided by an embodiment of the present application.
  • FIG. 6 is a simplified block diagram of a communication system provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the placement manner of a video encoder and a video decoder in a streaming transmission environment exemplarily shown in the present application.
  • FIG. 8 is a flowchart of a video decoding method provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 10 is a block diagram of a video decoding apparatus provided by an embodiment of the present application.
  • FIG. 11 is a block diagram of a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 12 is a structural block diagram of a computer device provided by an embodiment of the present application.
  • FIG. 1 exemplarily shows a basic flow chart of a video encoding process.
  • a video signal refers to a sequence of images comprising multiple frames.
  • a frame is a representation of the spatial information of a video signal. Taking YUV mode as an example, a frame includes one luma sample matrix (Y) and two chroma sample matrices (Cb and Cr). From the point of view of the acquisition method of video signal, it can be divided into two methods: camera captured and computer generated. Due to different statistical characteristics, the corresponding compression coding methods may also be different.
  • a hybrid coding framework is used to perform the following series of operations and processing on the input original video signal:
  • 1) Block Partition Structure: the input image is divided into several non-overlapping processing units, each of which undergoes similar compression operations.
  • This processing unit is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). A CTU can be further divided more finely to obtain one or more basic coding units, which are called CUs (Coding Units).
  • Each CU is the most basic element in a coding stage. Described below are the various coding methods that may be used for each CU.
  • 2) Predictive Coding: this includes intra-frame prediction and inter-frame prediction. After the original video signal is predicted from the selected reconstructed video signal, a residual video signal is obtained. The encoder needs to choose the most suitable of the many possible predictive coding modes for the current CU and inform the decoder. Intra-frame prediction means that the predicted signal comes from an area of the same image that has already been coded and reconstructed. Inter-frame prediction means that the predicted signal comes from other already-coded images (called reference images) that differ from the current image.
  • 3) Transform & Quantization: the residual video signal undergoes a transform operation such as DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), which converts the signal into the transform domain, where it is represented as transform coefficients.
  • the signal in the transform domain is further subjected to a lossy quantization operation, which discards a certain amount of information so that the quantized signal is easier to represent compactly.
  • the encoder also needs to select one of the transforms for the current CU and inform the decoder.
  • the fineness of quantization is usually determined by the quantization parameter (QP).
  • 4) Entropy Coding or Statistical Coding: the quantized transform-domain signal undergoes statistical compression coding according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed code stream is finally output. The encoding process also produces other information, such as the selected mode and motion vectors, which must be entropy coded as well to reduce the bit rate.
  • Statistical coding is a lossless coding method that effectively reduces the bit rate required to express the same signal. Common statistical coding methods include Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
  • 5) Loop Filtering: for an already coded image, the decoded image can be reconstructed by performing inverse quantization, inverse transformation, and prediction compensation (the inverse operations of steps 2 to 4 above). Because of quantization, the reconstructed image differs in some information from the original image, which results in distortion. Filtering the reconstructed image, for example with deblocking, SAO (Sample Adaptive Offset), or ALF (Adaptive Loop Filter), can effectively reduce the distortion produced by quantization. Since these filtered reconstructed images are used as references for subsequently coded images to predict future signals, the above filtering operations are also called in-loop filtering, that is, filtering operations within the encoding loop.
  • the decoder first performs entropy decoding to obtain various mode information and quantized transform coefficients. Each coefficient is inversely quantized and inversely transformed to obtain a residual signal.
  • based on the decoded mode information, the prediction signal corresponding to the CU can be obtained; adding the prediction signal and the residual signal yields the reconstructed signal.
  • the reconstructed value of the decoded image needs to undergo a loop filtering operation to generate the final output signal.
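  • The lossy quantization step and the decoder-side reconstruction (prediction plus residual) can be illustrated with a small sketch. It makes several simplifying assumptions that are not taken from the application: the transform and the loop filter are omitted, samples are plain integer lists, and `step` is a hypothetical stand-in for the step size that a real codec derives from the QP.

```python
from typing import List


def quantize(residual: List[int], step: int) -> List[int]:
    """Map each residual sample to a quantization level (nearest multiple of step)."""
    levels = []
    for r in residual:
        sign = 1 if r >= 0 else -1
        levels.append(sign * int(abs(r) / step + 0.5))
    return levels


def dequantize(levels: List[int], step: int) -> List[int]:
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


def reconstruct(prediction: List[int], residual: List[int]) -> List[int]:
    """Decoder-side reconstruction: add the residual to the prediction."""
    return [p + r for p, r in zip(prediction, residual)]


if __name__ == "__main__":
    original = [52, 55, 61, 66]
    prediction = [50, 50, 60, 60]
    residual = [o - p for o, p in zip(original, prediction)]

    levels = quantize(residual, step=4)
    recon = reconstruct(prediction, dequantize(levels, step=4))
    print(levels, recon)
```

  • Here the residual [2, 5, 1, 6] is quantized to the levels [1, 1, 0, 2], and the reconstruction [54, 54, 60, 68] differs slightly from the original samples, which is the quantization distortion that loop filtering aims to reduce.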
  • mainstream video coding standards adopt a block-based hybrid coding framework: the original video data is divided into a series of coding blocks, and video coding methods such as prediction, transform, and entropy coding are combined to compress the video data.
  • motion compensation is a type of prediction method commonly used in video coding. Based on the redundancy characteristics of video content in the temporal or spatial domain, motion compensation derives the prediction value of the current coding block from the coded region.
  • prediction methods include: inter prediction, intra block copy prediction, intra string copy prediction, etc. In specific coding implementations, these prediction methods may be used alone or in combination.
  • the displacement vector may have different names depending on the prediction mode. This document uses the following terms: 1) the displacement vector in the inter prediction mode is called a motion vector (Motion Vector, MV for short); 2) the displacement vector in the IBC (Intra Block Copy) prediction mode is called a block vector (Block Vector, BV for short); 3) the displacement vector in the ISC (Intra String Copy) prediction mode is called a string vector (String Vector, SV for short). Intra string copy is also referred to as "string prediction" or "string matching", among others.
  • MV refers to the displacement vector used in the inter prediction mode, pointing from the current image to the reference image, and its value is the coordinate offset between the current block and the reference block, wherein the current block and the reference block are in two different images.
  • motion vector prediction can be introduced. By predicting the motion vector of the current block, the predicted motion vector corresponding to the current block is obtained, and the difference between the predicted motion vector and the actual motion vector is encoded and transmitted. Compared with directly encoding and transmitting the actual motion vector corresponding to the current block, this saves bit overhead.
  • the predicted motion vector refers to the predicted value of the motion vector of the current block obtained through the motion vector prediction technology.
  • BV refers to the displacement vector used for the IBC prediction mode, and its value is the coordinate offset between the current block and the reference block, wherein both the current block and the reference block are in the current image.
  • block vector prediction can be introduced. By predicting the block vector of the current block, the predicted block vector corresponding to the current block is obtained, and the difference between the predicted block vector and the actual block vector is encoded and transmitted. Compared with directly encoding and transmitting the actual block vector corresponding to the current block, this saves bit overhead.
  • the predicted block vector refers to the predicted value of the block vector of the current block obtained through the block vector prediction technology.
  • SV refers to the displacement vector used for the ISC prediction mode, and its value is the coordinate offset between the current string and the reference string, wherein both the current string and the reference string are in the current image.
  • string vector prediction can be introduced. By predicting the string vector of the current string, the predicted string vector corresponding to the current string is obtained, and the difference between the predicted string vector and the actual string vector is encoded and transmitted. Compared with directly encoding and transmitting the actual string vector corresponding to the current string, this saves bit overhead.
  • the predicted string vector refers to the predicted value of the string vector of the current string obtained through the string vector prediction technology.
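  • The same predict-then-code-the-difference idea applies to MV, BV, and SV. A minimal sketch follows; the names are hypothetical, and the sign convention (actual minus predicted) is the conventional one, assumed here rather than stated in the text.

```python
from typing import Tuple

Vector = Tuple[int, int]


def vector_difference(actual: Vector, predicted: Vector) -> Vector:
    """Encoder side: compute the difference that is coded instead of the vector."""
    return (actual[0] - predicted[0], actual[1] - predicted[1])


def reconstruct_vector(predicted: Vector, difference: Vector) -> Vector:
    """Decoder side: add the decoded difference to the same predicted vector."""
    return (predicted[0] + difference[0], predicted[1] + difference[1])


if __name__ == "__main__":
    actual = (-17, 4)
    predicted = (-16, 4)
    diff = vector_difference(actual, predicted)
    print(diff, reconstruct_vector(predicted, diff))
```

  • When the prediction is close to the actual vector, the difference has small components, which typically take fewer bits to code than the vector itself.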
  • inter-frame prediction exploits the temporal correlation of video: it uses pixels of adjacent coded images to predict the pixels of the current image, which effectively removes the temporal redundancy of the video and saves the bits needed to code the residual data.
  • P is the current frame, Pr is the reference frame, B is the current block to be encoded, and Br is the reference block of B.
  • B' and B have the same coordinates in the image; the coordinates of Br are (xr, yr), and the coordinates of B' are (x, y).
  • the displacement between the current block to be coded and its reference block is called a motion vector (MV), that is:
  • MV = (xr - x, yr - y).
  • inter-frame prediction includes two MV prediction technologies, Merge and AMVP (Advanced Motion Vector Prediction).
  • Merge mode will build a MV candidate list for the current PU (Prediction Unit, prediction unit), in which there are 5 candidate MVs (and their corresponding reference images). Traverse these five candidate MVs, and select the one with the smallest rate-distortion cost as the optimal MV. In the case where the codec builds the candidate list in the same way, the encoder only needs to transmit the index of the optimal MV in the candidate list.
  • the MV prediction technology of HEVC also has a skip mode, which is a special case of the Merge mode. After the optimal MV is found in the Merge mode, if the current block is basically the same as the reference block, then there is no need to transmit residual data, only the index of the MV and a skip flag need to be transmitted.
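  • The Merge selection described above (traverse the candidates, keep the one with the smallest rate-distortion cost, transmit only its index) can be sketched as follows. The cost function is a toy stand-in chosen for illustration; a real encoder evaluates distortion and rate, and all names here are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = Tuple[int, int]


def select_merge_index(candidates: List[Vector],
                       rd_cost: Callable[[Vector], float]) -> int:
    """Encoder side: return the index of the candidate with the smallest cost."""
    if not candidates:
        raise ValueError("candidate list must not be empty")
    return min(range(len(candidates)), key=lambda i: rd_cost(candidates[i]))


def merge_mv_from_index(candidates: List[Vector], index: int) -> Vector:
    """Decoder side: rebuild the same list and look up the transmitted index."""
    return candidates[index]


if __name__ == "__main__":
    candidates = [(0, 0), (2, -1), (4, 0), (2, 0), (-1, -1)]
    true_motion = (2, -1)

    def toy_cost(mv: Vector) -> float:
        # Toy cost: distance to the true motion (not a real RD cost).
        return abs(mv[0] - true_motion[0]) + abs(mv[1] - true_motion[1])

    index = select_merge_index(candidates, toy_cost)
    print(index, merge_mv_from_index(candidates, index))
```

  • Because both ends build the candidate list in the same way, the index alone is enough for the decoder to recover the selected MV.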
  • the MV candidate list built by the Merge mode covers both the spatial domain and the temporal domain; for a B slice (B-frame image), it also includes a combined list.
  • the spatial domain provides up to 4 candidate MVs, established as shown in part (a) of FIG. 3.
  • the spatial list is built in the order A1 → B1 → B0 → A0 → B2, where B2 is a fallback: the motion information of B2 is used only when one or more of A1, B1, B0, and A0 are unavailable. The temporal domain provides at most one candidate MV, established as shown in part (b) of FIG. 3, which is obtained by scaling the MV of the co-located PU as follows:
  • curMV = td * colMV / tb
  • curMV represents the MV of the current PU
  • colMV represents the MV of the co-located PU
  • td represents the distance between the current image and the reference image
  • tb represents the distance between the co-located image and the reference image.
  • if the PU at position D0 of the co-located block is unavailable, the co-located PU at position D1 is used instead.
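  • The temporal scaling formula curMV = td * colMV / tb can be sketched as below. This is a simplified illustration: real codecs use fixed-point arithmetic with clipping, whereas this sketch uses floating-point division and rounding, and the function name is hypothetical.

```python
from typing import Tuple

Vector = Tuple[int, int]


def scale_temporal_mv(col_mv: Vector, td: int, tb: int) -> Vector:
    """Scale the co-located PU's MV by the ratio of picture distances td / tb.

    td: distance between the current image and its reference image.
    tb: distance between the co-located image and its reference image.
    """
    if tb == 0:
        raise ValueError("tb must be non-zero")
    return (round(td * col_mv[0] / tb), round(td * col_mv[1] / tb))


if __name__ == "__main__":
    # The co-located MV spans two picture intervals; the current PU spans one.
    print(scale_temporal_mv((8, -4), td=1, tb=2))
```

  • With colMV = (8, -4), td = 1, and tb = 2, the scaled candidate is (4, -2), that is, half the co-located motion for half the temporal distance.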
  • HEVC generates a combined list for B slices by combining the first 4 candidate MVs in the MV candidate list in pairs.
  • the AMVP mode uses the MV correlation of spatially and temporally adjacent blocks to build an MV candidate list for the current PU. The MV is then coded differentially as an MVD (Motion Vector Difference, that is, the motion vector residual) relative to the predicted MV.
  • like the Merge mode, the MV candidate list of the AMVP mode covers both the spatial domain and the temporal domain; the difference is that the AMVP candidate list has a length of only 2.
  • the resolution of the MVD is controlled by use_integer_mv_flag in slice_header: when the value of this flag is 0, the MVD is coded at 1/4 (luma) pixel resolution; when the value of this flag is 1, the MVD is coded at integer (luma) pixel resolution.
  • a method of Adaptive Motion Vector Resolution (AMVR for short) is used in VVC. This method allows each CU to adaptively select the resolution of the coded MV. In normal AMVP mode, the selectable resolutions include 1/4, 1/2, 1 and 4 pixel resolutions.
  • a flag is first coded to indicate whether quarter-luma-sample MVD precision is used for the CU. When this flag is 0, the MVD of the current CU is coded at 1/4 pixel resolution. Otherwise, a second flag is coded to indicate whether the CU uses 1/2 pixel resolution or another MVD resolution. If another resolution is used, a third flag is coded to indicate whether the CU uses 1-pixel or 4-pixel resolution.
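  • The cascaded AMVR flags can be read as a small decision tree. The sketch below is illustrative only: the text fixes the meaning of the first flag (0 means 1/4 pixel), while the 0/1 meanings assumed for the second and third flags, and the function name, are assumptions.

```python
from fractions import Fraction
from typing import Sequence


def amvr_resolution(flags: Sequence[int]) -> Fraction:
    """Map up to three decoded flags to an MVD resolution in luma pixels.

    flags[0] == 0           -> 1/4 pixel (as stated in the text)
    flags[1] == 0           -> 1/2 pixel (assumed flag polarity)
    flags[2] == 0 or 1      -> 1 pixel or 4 pixels (assumed flag polarity)
    """
    if flags[0] == 0:
        return Fraction(1, 4)
    if flags[1] == 0:
        return Fraction(1, 2)
    return Fraction(1) if flags[2] == 0 else Fraction(4)


if __name__ == "__main__":
    for flags in ([0], [1, 0], [1, 1, 0], [1, 1, 1]):
        print(flags, amvr_resolution(flags))
```

  • Only as many flags are needed as the decision requires: one flag for 1/4 pixel, two for 1/2 pixel, and three for 1-pixel or 4-pixel resolution.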
  • IBC is an intra-frame coding tool adopted in the HEVC Screen Content Coding (SCC) extension, which significantly improves the coding efficiency of screen content.
  • IBC technology is also adopted to improve the performance of screen content encoding.
  • IBC exploits the spatial correlation of screen content video: it uses already-coded pixels of the current image to predict the pixels of the current block to be coded, which effectively saves the bits required to code pixels.
  • the displacement between the current block and its reference block in IBC is called BV (block vector).
  • H.266/VVC adopts a BV prediction technique similar to inter-frame prediction to further save the bits required for encoding BV, and allows BVD (Block Vector Difference, block vector residual) to be encoded using 1 or 4 pixel resolution.
  • BVD: Block Vector Difference, i.e., the block vector residual
  • ISC technology divides a coding block into a series of pixel strings or unmatched pixels according to a certain scan order (such as raster scan, round-trip scan, or zig-zag scan). Similar to IBC, each string searches for a reference string of the same shape in the already-encoded area of the current image, from which the predicted value of the current string is derived. Encoding the residual between the pixel values of the current string and the predicted values, instead of encoding the pixel values directly, effectively saves bits.
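To make the idea of cutting a block into strings along a scan order concrete, here is a minimal sketch for a horizontal round-trip scan. The function names and the convention that even rows run left to right are assumptions for illustration; reference-string search and residual coding are not modeled.

```python
def round_trip_scan(width, height):
    """List the (x, y) positions of a block in horizontal round-trip order:
    even rows run left to right, odd rows run right to left (assumed convention)."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order


def split_into_strings(width, height, lengths):
    """Cut the scan order into consecutive pixel strings of the given lengths."""
    order = round_trip_scan(width, height)
    if sum(lengths) != len(order):
        raise ValueError("string lengths must cover the block exactly")
    strings, start = [], 0
    for n in lengths:
        strings.append(order[start:start + n])
        start += n
    return strings
```

For a 4x2 block split into lengths `[5, 3]`, the first string covers the whole first row plus the rightmost pixel of the second row, and the second string continues leftward along the second row.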
  • Figure 5 shows a schematic diagram of intra-frame string replication.
  • the dark gray area is the encoded area
  • the 28 white pixels form string 1
  • the 35 light gray pixels form string 2
  • the single black pixel represents an unmatched pixel.
  • the displacement between string 1 and its reference string is the string vector 1 in FIG. 5; the displacement between string 2 and its reference string is the string vector 2 in FIG. 5.
  • the intra-frame string replication technology needs to encode, for each string in the current coding block, the corresponding SV, the string length, and a flag indicating whether a matching string exists.
  • SV represents the displacement of the to-be-coded string to its reference string.
  • String length indicates the number of pixels contained in the string. In different implementations, there are many ways to encode the length of the string.
  • IscMatchTypeFlag[i] is equal to the value of isc_match_type_flag[i]. If isc_match_type_flag[i] does not exist in the bitstream, the value of IscMatchTypeFlag[i] is 0.
  • a value of '1' indicates that the i-th part of the current coding unit is the last part of the current coding unit, and the length StrLen[i] of this part is equal to NumTotalPixel-NumCodedPixel; a value of '0' indicates that the i-th part of the current coding unit is not the last part of the current coding unit.
  • IscLastFlag[i] is equal to the value of isc_last_flag[i].
  • next_remaining_pixel_in_cu[i] represents the number of pixels that have not yet completed decoding remaining in the current coding unit after the decoding of the i-th part of the current coding unit is completed.
  • the value of NextRemainingPixelInCu[i] is equal to the value of next_remaining_pixel_in_cu[i].
  • IscUnmatchedPixelY[i], IscUnmatchedPixelU[i] and IscUnmatchedPixelV[i] are equal to the values of isc_unmatched_pixel_y[i], isc_unmatched_pixel_u[i] and isc_unmatched_pixel_v[i] respectively.
  • the encoder needs to encode the string length information of the current string (including information related to the string length, such as the syntax element next_remaining_pixel_in_cu described above).
  • the encoding end performs binarization processing on the string length information of the current string to obtain the corresponding binary symbol string, and then adds the binary symbol string to the code stream and sends it to the decoding end.
  • the decoding end obtains the above binary symbol string from the code stream, and then performs inverse binarization processing on the binary symbol string to obtain the string length information of the current string. This further determines the string length of the current string.
  • synElVal is obtained from the binary symbol string according to Table 1 (that is, the value recovered by inverse binarization).
  • synElVal is obtained from the binary symbol string according to Table 2 (that is, the value recovered by inverse binarization).
  • next_remaining_pixel_in_cu does not exist in the bit stream, and the value of NextRemainingPixelInCu is 0; otherwise (that is, NumPixelInCu–NumCodedPixel is greater than 1), the inverse binarization method of next_remaining_pixel_in_cu is as follows:
  • the binary symbol string of next_remaining_pixel_in_cu consists of three parts.
  • the first step is to debinarize the first part.
  • the maxValPrefix refers to the index of the numerical interval to which the maximum value of the parameter that needs to be restored by the inverse binarization process (in this example, next_remaining_pixel_in_cu) belongs.
  • next_remaining_pixel_in_cu is 0 (then the string length of the current string is NumTotalPixel–NumCodedPixel); otherwise (that is, a is greater than 0), continue to the next step.
  • the second step performs the debinarization of the second part.
  • $\text{next\_remaining\_pixel\_in\_cu} = d + (b \ll k) + c - (2^{n} - \text{maxValInfix} - 1) \cdot k$.
  • next_remaining_pixel_in_cu is mainly as follows:
  • the resolution of the string length is 1 pixel, allowing the coding unit to be divided into substrings of any integer pixel length (i.e., the allowed length of a coded string may be 1, 2, 3, ...).
  • the coding unit may be divided into fine-grained pixel strings whose positions are not aligned with memory, resulting in frequent memory accesses during pixel string reconstruction, which reduces coding efficiency. For example, assuming that a memory unit can process the data of 4 pixels in parallel, if the string length of the current string is 7, the data of its pixels may be spread across two or three memory units. In this case, the decoding end needs to access memory 2 or 3 times to complete the decoding of the current string.
  • the present application proposes a video decoding method and a video encoding method.
  • the length of the pixel strings in a coding/decoding block can be limited to a multiple of the string length resolution, which improves the uniformity of the pixel strings and enables the encoding end and the decoding end to perform encoding and decoding under memory-aligned conditions, thereby improving the encoding and decoding efficiency of the pixel strings.
  • for example, assuming that a memory unit can process the data of 4 pixels in parallel and the string length resolution is accordingly set to 4, the length of a pixel string can only be an integer multiple of 4, so strings are never misaligned with the memory units. Assuming that the string length of the current string is 8, the data of the pixels in the current string occupies exactly two memory units in full; it is never spread across three memory units, which would require the decoding end to access memory one additional time.
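The memory-alignment argument can be checked with a few lines of arithmetic. The sketch below assumes, as in the example above, that each memory unit holds 4 consecutive pixels in scan order; the function name and the `start` offset parameter are hypothetical.

```python
def memory_units_touched(start, length, unit=4):
    """Count the memory units covered by pixels start .. start + length - 1,
    assuming each unit holds `unit` consecutive pixels."""
    if length <= 0:
        return 0
    first = start // unit
    last = (start + length - 1) // unit
    return last - first + 1
```

With a string length of 7, start offsets 0 and 1 touch two units while offsets 2 and 3 touch three. With a string length resolution of 4, every string starts on a unit boundary, so a string of length 8 always touches exactly two units.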
  • the present application proposes a method for binarizing and de-binarizing the string length information according to the string length resolution.
  • This method improves the encoding and decoding of string length information under different string length resolutions. Specifically, when the encoding end binarizes the value of the string length information, it can first compress the value using the string length resolution and then binarize the compressed value (that is, the quotient obtained by dividing the value by the string length resolution) instead of binarizing the value directly.
  • correspondingly, when the decoding end performs inverse binarization, it first recovers the compressed value and then obtains the value of the string length information from the compressed value and the string length resolution (that is, by multiplying the compressed value by the string length resolution).
  • the number of characters required for binary representation can be reduced, thereby reducing the complexity of encoding and decoding, which is beneficial to the improvement of encoding and decoding performance.
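A minimal sketch of the compress-then-binarize idea follows. It assumes the value is an exact multiple of the string length resolution, and it uses plain binary digits (`to_bits`) as a stand-in for the actual binarization tables, which are not reproduced here; all function names are hypothetical.

```python
def compress_length_info(value, resolution):
    """Encoder side: divide the value by the string length resolution.
    Assumes the value is an exact multiple of the resolution."""
    if value % resolution != 0:
        raise ValueError("value must be a multiple of the string length resolution")
    return value // resolution


def restore_length_info(compressed, resolution):
    """Decoder side: multiply the recovered quotient by the resolution."""
    return compressed * resolution


def to_bits(n):
    """Plain binary digits, standing in for the real binarization scheme."""
    return format(n, "b")
```

For a value of 24 and a resolution of 4, the quotient 6 is written as `110` (3 symbols) instead of `11000` (5 symbols), and the decoder recovers 24 by multiplying 6 by 4.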
  • Communication system 600 includes a plurality of devices that can communicate with each other via, for example, network 650 .
  • the communication system 600 includes a first device 610 and a second device 620 interconnected by a network 650 .
  • the first device 610 and the second device 620 perform one-way data transfer.
  • the first device 610 may encode video data, such as a stream of video pictures captured by the first device 610, for transmission to the second device 620 over the network 650.
  • the encoded video data is transmitted in one or more encoded video streams.
  • the second device 620 may receive encoded video data from the network 650, decode the encoded video data to restore the video data, and display a video picture according to the restored video data.
  • One-way data transfer is common in applications such as media services.
  • the communication system 600 includes a third device 630 and a fourth device 640 that perform bi-directional transmission of encoded video data, which may occur, for example, during a video conference.
  • each of the third device 630 and the fourth device 640 may encode video data (e.g., a stream of video pictures captured by the device) for transmission over the network 650 to the other of the third device 630 and the fourth device 640.
  • Each of third device 630 and fourth device 640 may also receive encoded video data transmitted by the other of third device 630 and fourth device 640, and may decode the encoded video data to restore the video data, and a video picture can be displayed on an accessible display device according to the restored video data.
  • the first device 610 , the second device 620 , the third device 630 and the fourth device 640 may be computer devices such as servers, personal computers and smart phones, but the principles disclosed in this application may not be limited thereto.
  • the embodiments of the present application are applicable to a PC (Personal Computer), a mobile phone, a tablet computer, a media player, and/or a dedicated video conference device.
  • Network 650 represents any number of networks that communicate encoded video data between first device 610, second device 620, third device 630, and fourth device 640, including, for example, wired and/or wireless communication networks.
  • Communication network 650 may exchange data in circuit-switched and/or packet-switched channels.
  • the network may include a telecommunications network, a local area network, a wide area network, and/or the Internet.
  • the architecture and topology of network 650 may be immaterial to the operations disclosed herein.
  • Figure 7 shows the placement of video encoders and video decoders in a streaming environment.
  • the subject matter disclosed in this application is equally applicable to other video-enabled applications, including, for example, videoconferencing, digital TV (television), and the storage of compressed video on digital media such as CD (Compact Disc), DVD (Digital Versatile Disc), and memory sticks.
  • the streaming system may include a capture subsystem 713, which may include a video source 701, such as a digital camera, that creates a stream 702 of uncompressed video pictures.
  • the video picture stream 702 includes samples captured by a digital camera.
  • the video picture stream 702 is depicted as a thick line to emphasize its high data volume; it can be processed by the electronic device 720, which includes a video encoder 703 coupled to the video source 701.
  • Video encoder 703 may include hardware, software, or a combination of hardware and software to enable or implement various aspects of the disclosed subject matter as described in greater detail below.
  • the encoded video data 704 (or encoded video code stream 704) is depicted as a thin line to emphasize its lower data volume, and can be stored on the streaming server 705 for future use.
  • One or more streaming client subsystems such as client subsystem 706 and client subsystem 708 in FIG. 7 , may access streaming server 705 to retrieve copies 707 and 709 of encoded video data 704 .
  • Client subsystem 706 may include, for example, video decoder 710 in electronic device 730 .
  • Video decoder 710 decodes the incoming copy 707 of the encoded video data and produces an output video picture stream 711 that can be presented on a display 712 (e.g., a display screen) or another presentation device (not depicted).
  • encoded video data 704, video data 707, and video data 709 may be encoded according to certain video encoding/compression standards.
  • electronic device 720 and electronic device 730 may include other components (not shown).
  • electronic device 720 may include a video decoder (not shown), and electronic device 730 may also include a video encoder (not shown).
  • the video decoder is used for decoding the received encoded video data; the video encoder is used for encoding the video data.
  • the technical solutions provided by the embodiments of the present application can be applied to the H.266/VVC standard, the H.265/HEVC standard, AVS (such as AVS3), or a next-generation video codec standard, which is not limited here.
  • the execution subject of each step may be a decoding end device.
  • the execution subject of each step may be an encoding end device.
  • the decoding scheme provided by the embodiment of the present application may be used to obtain the string length of the current string by decoding.
  • the encoding scheme provided by the embodiments of the present application may be used to encode the string length of the current string.
  • Both the decoding device and the encoding device can be computer devices.
  • the computer devices refer to electronic devices with data computing, processing and storage capabilities, such as PCs, mobile phones, tablet computers, media players, dedicated video conferencing equipment, and servers.
  • the encoder and decoder based on the method provided in this application may be implemented by one or more processors or one or more integrated circuits.
  • the technical solutions of the present application will be introduced and described through several embodiments.
  • FIG. 8 shows a flowchart of a video decoding method provided by an embodiment of the present application.
  • the method can be applied to the decoding end device, that is, the method can be executed by the decoding end device.
  • the method may include the following steps (801-803):
  • Step 801: Decode a binary symbol string of string length information of the current string from the code stream.
  • the code stream refers to the data stream generated after the video is encoded, which can be represented by a series of binary data 0 and 1.
  • the code stream, also called a bitstream, is a binary data stream formed by coded images.
  • the current string refers to the currently decoded pixel string.
  • a pixel string refers to a pixel sequence composed of a certain number of pixels.
  • the pixel string is an ordered sequence of data of a finite number of binary bits.
  • one CU can be divided into several pixel strings.
  • the string length of each pixel string needs to be determined first.
  • the string length information refers to the information related to the string length of the pixel string in the code stream, and is used to determine the string length of the pixel string.
  • the string length information of the current string includes information related to the string length of the current string, and is used to determine the string length of the current string.
  • the binary symbol string of the string length information refers to a binary string obtained by binarizing the string length information, and the characters that may appear in the binary symbol string are only 0 and 1.
  • Step 802: Perform inverse binarization processing on the binary symbol string according to the string length resolution of the current string to obtain string length information.
  • String Length Resolution is the minimum string length at which the CU is divided into pixel strings, that is, the minimum allowable string length.
  • a string length resolution of 4 means that the minimum string length of a pixel string is 4.
  • the string length resolution can be represented by N, where N is a positive integer.
  • N is an integer greater than 1.
  • the string length of the pixel string is an integer multiple of N; for example, the string length of the pixel string may be N, 2N, 3N, 4N, 5N, and so on.
  • for example, when the string length resolution is 4, the string length of the pixel string may be 4, 8, 12, 16, 20, and so on.
  • step 802 may include the following two sub-steps:
  • when the encoding end binarizes the value of the string length information, it can first compress the value using the string length resolution and then binarize the compressed value (that is, the quotient obtained by dividing the value by the string length resolution) instead of binarizing the value directly.
  • correspondingly, when the decoding end performs inverse binarization, it first recovers the compressed value and then obtains the value of the string length information from the compressed value and the string length resolution (that is, by multiplying the compressed value by the string length resolution).
  • Step 803: Determine the string length of the current string according to the string length information.
  • the string length of the current string refers to the number of pixels contained in the current string.
  • the string length information of the current string includes the string length of the current string.
  • the string length information of the current string includes the number of remaining pixels in the decoding block to which the current string belongs after the current string is decoded. The decoding end can then obtain the total number of pixels in the decoding block to which the current string belongs and the number of already-decoded pixels in that block, and determine the string length of the current string based on the total number of pixels, the number of decoded pixels, and the number of pixels remaining after the current string is decoded.
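One plausible reading of this derivation is a simple subtraction, sketched below. The formula is an assumption: the text states only the special case where zero pixels remain (the length is then NumTotalPixel-NumCodedPixel), and the function and parameter names are hypothetical.

```python
def string_length(num_total_pixel, num_coded_pixel, next_remaining_pixel_in_cu):
    """Derive the string length of the current string (assumed formula):
    total pixels in the block, minus pixels already decoded,
    minus pixels still remaining after the current string."""
    length = num_total_pixel - num_coded_pixel - next_remaining_pixel_in_cu
    if length <= 0:
        raise ValueError("inconsistent pixel counts")
    return length
```

For an 8x8 block (64 pixels) with 16 pixels already decoded and 40 pixels remaining after the current string, this gives a string length of 8. When nothing remains, the result reduces to the total minus the decoded count, matching the case stated for the last string.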
  • by using the string length resolution as the basis for dividing pixel strings and for encoding and decoding, the technical solutions provided by the embodiments of the present application can limit the length of the pixel strings in a coding/decoding block to a multiple of the string length resolution, which improves the uniformity of the pixel strings and enables the encoding end and the decoding end to perform encoding and decoding under memory-aligned conditions, thereby improving the encoding and decoding efficiency of the pixel strings.
  • the present application proposes a method for binarizing and de-binarizing string length information according to the string length resolution, which improves the encoding and decoding of string length information under different string length resolutions. Specifically, when the encoding end binarizes the value of the string length information, it can first compress the value using the string length resolution and then binarize the compressed value (that is, the quotient obtained by dividing the value by the string length resolution) instead of binarizing the value directly.
  • correspondingly, when the decoding end performs inverse binarization, it first recovers the compressed value and then obtains the value of the string length information from the compressed value and the string length resolution (that is, by multiplying the compressed value by the string length resolution).
  • the number of characters required for binary representation can be reduced, thereby reducing the complexity of encoding and decoding, which is beneficial to the improvement of encoding and decoding performance.
  • the following describes the manner in which the decoding end determines the string length resolution of the current string during the decoding process.
  • the following manners are exemplarily provided for determining the string length resolution of the current string.
  • Manner 1: Determine the first preset value as the string length resolution of the current string.
  • the above-mentioned first preset value refers to a preset value of the string length resolution.
  • the first preset value may be predefined in a protocol.
  • the decoding end determines the first preset value as the string length resolution of the current string, and does not need to obtain the string length resolution of the current string from the code stream.
  • Manner 2: Decode the string length resolution of the current string from the sequence header of the image sequence to which the current string belongs.
  • the above-mentioned image sequence is also known as a video sequence (sequence)
  • the image sequence starts with the first sequence header, and the sequence end code or video editing code indicates the end of an image sequence.
  • the sequence header between the first sequence header of the image sequence and the first occurrence of the sequence end code or video editing code is the repeated sequence header.
  • each sequence header is followed by one or more encoded pictures, and each picture should be preceded by a picture header.
  • the encoded images are arranged in bitstream order in the bitstream, and the bitstream order should be the same as the decoding order.
  • the decoding order may not be the same as the display order.
  • the sequence header of the above image sequence contains some information related to decoding the image sequence.
  • the sequence header of the image sequence may be a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the image sequence in the code stream.
  • the string length resolution is also included in the sequence header of the image sequence.
  • each string included in the image sequence to which the current string belongs has the same string length resolution, which is the string length resolution obtained by decoding the sequence header of the image sequence.
  • the decoding end decodes a piece of indication information (such as an index, a syntax element or other indication information) in the sequence header of the image sequence, where the indication information indicates the string length resolution of all strings in the image sequence.
  • Manner 3: Decode the string length resolution of the current string from the image header of the image to which the current string belongs.
  • the above image refers to a single image frame in the video. In some standards, an image can be a frame or a field.
  • the above-mentioned image is an encoded image
  • the above-mentioned encoded image is an encoded representation of an image.
  • the image header of the above image contains some information related to decoding the image. For example, the image header of the image is a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the image in the code stream.
  • the string length resolution is also included in the image header of the image.
  • each string included in the image to which the current string belongs has the same string length resolution, which is the string length resolution obtained by decoding the image header of the image.
  • the decoding end decodes a piece of indication information (such as an index, a syntax element or other indication information) in the image header of the image, and the indication information indicates the string length resolution of all strings in the image.
  • Manner 4: Decode the string length resolution of the current string from the slice header of the slice to which the current string belongs.
  • the above-mentioned slice refers to several adjacent largest coding units arranged in raster scan order.
  • the above-mentioned raster scan refers to mapping a two-dimensional rectangular raster to a one-dimensional raster. The entries of the one-dimensional raster start from the first row of the two-dimensional raster, followed by the second row, the third row, and so on. Each row of the raster is scanned from left to right.
  • the slice header of the above-mentioned slice contains some information related to decoding the slice.
  • the slice header of a slice is a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the slice in the code stream.
  • the slice header also includes the string length resolution.
  • each string included in the slice to which the current string belongs has the same string length resolution, which is the string length resolution obtained by decoding the slice header of the slice.
  • the decoding end decodes a piece of indication information (such as an index, syntax element or other indication information) in the slice header of the slice, where the indication information indicates the string length resolution of all strings in the slice.
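The raster scan used above to define a slice maps each two-dimensional position to a one-dimensional index, row by row and left to right within a row. A minimal sketch, with hypothetical function names:

```python
def raster_index(x, y, width):
    """Map column x, row y of a two-dimensional raster to its one-dimensional index."""
    return y * width + x


def raster_position(index, width):
    """Inverse mapping: recover (x, y) from a one-dimensional raster index."""
    return index % width, index // width


def raster_order(width, height):
    """List all (x, y) positions in raster scan order."""
    return [raster_position(i, width) for i in range(width * height)]
```

For a raster 4 units wide, the position in column 2 of row 1 has index 6, and `raster_order(3, 2)` visits the whole first row before the second.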
  • Manner 5: Decode the string length resolution of the current string from the encoding information of the largest coding unit (LCU) to which the current string belongs.
  • the LCU includes an L*L block of luma samples and a corresponding block of chroma samples, obtained by image partitioning.
  • the encoding information of the LCU includes some information related to decoding the LCU.
  • the encoding information of the above-mentioned LCU is a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the above-mentioned LCU in the code stream.
  • One LCU may include multiple CUs.
  • the encoding information of the above-mentioned LCU also includes the string length resolution.
  • each string included in the LCU to which the current string belongs has the same string length resolution, which is the string length resolution obtained by decoding the encoding information of the LCU.
  • the decoding end decodes a piece of indication information (such as an index, a syntax element or other indication information) in the encoding information of the LCU, where the indication information indicates the string length resolution of all strings in the LCU.
  • the encoding information of the above CU includes some information related to decoding the CU.
  • the encoding information of the above-mentioned CU is a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the above-mentioned CU in the code stream.
  • the encoding information of the above-mentioned CU further includes the string length resolution.
  • each string included in the CU to which the current string belongs has the same string length resolution, which is the string length resolution obtained by decoding from the encoding information of the CU.
  • the decoding end decodes a piece of indication information (such as an index, a syntax element or other indication information) in the encoding information of the CU, where the indication information indicates the string length resolution of all strings in the CU.
  • Manner 7: Decode the string length resolution of the current string from the encoding information of the current string.
  • the above encoding information of the current string includes some information related to decoding the current string.
  • the encoding information of the current string is a specially reserved field of defined bit length that is prepended to the data sequence corresponding to the current string in the code stream.
  • the encoding information of the current string also includes the string length resolution of the current string.
  • the decoding end decodes a piece of indication information (such as an index, a syntax element or other indication information) in the encoding information of the current string, where the indication information indicates the string length resolution of the current string. In this way, the string length resolutions of different strings can be indicated separately in their respective encoding information, which is more flexible.
  • Manner 8: Determine the string length resolution of the current string according to the size of the decoding block to which the current string belongs.
  • the above-mentioned decoding block (block) is an M*N (M columns and N rows) sample matrix or transform coefficient matrix.
  • the decoding block to which the current string belongs may be the CU to which the current string belongs.
  • the size of the decoding block to which the current string belongs is obtained, where the size of the decoding block to which the current string belongs includes the height or width of the decoding block to which the current string belongs.
  • Manner 9: Determine the string length resolution of the current string according to the color component and chroma format corresponding to the current string.
  • the chroma format above refers to the color coding format used by the pixels.
  • the chroma format (chroma_format) is a 2-bit unsigned integer that specifies the format of the chroma components.
  • the above-mentioned color components refer to the chrominance components of the pixels in the chrominance format.
  • the second preset value is determined as the string length resolution of the current string.
  • the above-mentioned first threshold is a preset value, which is the basis for determining the string length resolution of the current string in this manner.
  • the above-mentioned first threshold may be determined according to the specifications of the CU, and the first thresholds corresponding to CUs of different specifications may be the same or different.
  • the above-mentioned second preset value refers to a preset value of the string length resolution that applies when the number of decoded strings in the CU to which the current string belongs is greater than or equal to the first threshold; the second preset value can be predefined in a protocol. In one example, assuming that the number of decoded strings in the current CU is N1, when N1 is greater than or equal to the first threshold, the string length resolution of the current string may be determined to be the second preset value, 4.
  • when the number of decoded strings in the CU to which the current string belongs is less than the first threshold, other manners described in the embodiments of the present application may be used to determine the string length resolution of the current string, or another preset value different from the second preset value may be determined as the string length resolution of the current string, which is not limited in this embodiment of the present application.
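The threshold rule for the number of decoded strings can be sketched as follows. The second preset value of 4 comes from the example above; the fallback value of 1 and the function name are hypothetical, since the text leaves the below-threshold behavior open.

```python
def resolution_from_decoded_strings(num_decoded_strings, first_threshold,
                                    second_preset_value=4, fallback_value=1):
    """Pick the string length resolution from the count of decoded strings in the CU.

    At or above the first threshold, the second preset value applies.
    Below it, the text allows any other manner or preset value; returning
    fallback_value here is only an assumption for illustration.
    """
    if num_decoded_strings >= first_threshold:
        return second_preset_value
    return fallback_value
```

Because the decoder counts decoded strings itself, a rule of this kind needs no extra signaling in the code stream: both ends can apply the same threshold and reach the same resolution.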
  • the third preset value is determined as the string length resolution of the current string.
  • an unmatched pixel is a pixel for which no match was found, that is, a pixel that does not match the pixel at the corresponding position in the reference string of the current string.
  • the above-mentioned second threshold is a preset value, and is the basis for judging the string length resolution of the current string in this manner.
  • the above-mentioned second threshold may be determined according to the number of decoded unmatched pixels in the CU to which the current string belongs, and the second thresholds corresponding to different CUs may be the same or different.
  • the above-mentioned third preset value refers to a preset value of the string length resolution that applies when the number of decoded unmatched pixels in the CU to which the current string belongs is greater than or equal to the second threshold; the third preset value can be predefined in a protocol.
  • in one example, assuming that the number of decoded unmatched pixels in the current CU is N2, when N2 is greater than or equal to the second threshold, the string length resolution of the current string may be determined to be the third preset value.
  • if the number of decoded unmatched pixels in the CU to which the current string belongs is less than the second threshold, other methods described in the embodiments of the present application may be used to determine the string length resolution of the current string, or another preset value different from the third preset value may be determined as the string length resolution of the current string, which is not limited in this embodiment of the present application.
  • the fourth preset value is determined as the string length resolution of the current string.
  • the above-mentioned third threshold is a preset value, and is the basis for judging the string length resolution of the current string in this manner.
  • the above-mentioned third threshold may be determined according to the number of decoded unmatched pixels in the CU to which the current string belongs, and the third thresholds corresponding to different CUs may be the same or different.
  • the above-mentioned fourth preset value is a preset value of the string length resolution that applies when the number of undecoded pixels in the CU to which the current string belongs is less than or equal to the third threshold; the fourth preset value may be predetermined in the protocol.
  • other methods described in the embodiments of the present application may also be used to determine the string length resolution of the current string, or another preset value different from the fourth preset value may be determined as the string length resolution of the current string, which is not limited in this embodiment of the present application.
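The three threshold rules above can be sketched in Python as follows. This is only an illustration: the function name, the parameter names, the order in which the rules are checked, and the `None` fallback are assumptions, since the text describes each rule as a separate option and leaves the fallback open.

```python
from typing import Optional


def resolution_from_cu_state(
    num_decoded_strings: int,
    num_unmatched_pixels: int,
    num_undecoded_pixels: int,
    first_threshold: int,
    second_threshold: int,
    third_threshold: int,
    second_preset: int,
    third_preset: int,
    fourth_preset: int,
) -> Optional[int]:
    """Return a string length resolution, or None when no rule applies.

    Combining the three rules in one function, and this checking order,
    are assumptions made for illustration only.
    """
    # Rule on the number of decoded strings in the CU.
    if num_decoded_strings >= first_threshold:
        return second_preset
    # Rule on the number of decoded unmatched pixels in the CU.
    if num_unmatched_pixels >= second_threshold:
        return third_preset
    # Rule on the number of undecoded pixels in the CU.
    if num_undecoded_pixels <= third_threshold:
        return fourth_preset
    # Otherwise another method or another preset value would be used.
    return None
```

Returning `None` simply signals that the caller has to fall back to one of the other ways of determining the resolution.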
  • the following describes how to perform inverse binarization processing on the binary symbol string of the string length information of the current string to obtain the string length information compressed based on the string length resolution.
  • the following inverse binarization processing methods are provided by way of example.
  • the compressed string length information includes three parts, and the value of the compressed string length information is determined based on the values of the three parts.
  • let the maximum value of the compressed string length information be max_val; the method may include the following steps:
  • the above-mentioned plurality of numerical intervals are a series of intervals in which the numerical values are integers.
  • the above-mentioned multiple numerical ranges may be denoted as R0, R1, R2, . . . , Rn.
  • the index of the xth numerical interval Rx is x, and the xth numerical interval Rx is represented as [Rx_start, Rx_end), and x is a positive integer.
  • the maximum value max_val satisfies Rn_start ≤ max_val < Rn_end.
  • the index of the compressed string length information obtained by decoding the code stream is denoted as x, and the index indicates the numerical value range where the value val to be restored is located.
  • the index of the compressed string length information is decoded from the code stream using a truncated unary code.
  • the first part of the string length information is denoted as val_part1.
  • the index is entropy decoded in a CABAC manner, and each binary bit of the index has a corresponding context model.
  • a context model of the index of the compressed string length information (that is, val) is determined, and the context model is used to perform CABAC entropy decoding on the index. For example, a set of context models is selected according to the numerical range in which max_val is located.
  • the number of bits n of the maximum margin refers to the number of bits required to encode the maximum margin by means of a fixed-length code.
  • the value of the second part of the compressed string length information is determined according to the number of bits of the maximum margin.
  • the value of the second part is determined through steps 5-7 as follows.
  • the first bit length len = n - 1.
  • the second part of the string length information is denoted as val_part2.
  • the len bits are decoded from the code stream, and the len bits are de-binarized in a fixed-length code manner to obtain the value of val_part2.
  • the value of the third part of the compressed string length information is determined according to the value of the second part and the maximum value margin.
  • the value of the third part is determined through the following steps 8-10.
  • the value of the target value k is set to 0; otherwise, the value of k is set to 1.
  • the third part of the string length information is denoted as val_part3.
  • the value of the compressed string length information is calculated according to the value of the first part, the value of the second part, the value of the third part, the target value, the number of bits of the maximum margin and the maximum margin.
  • the value of the compressed string length information is calculated from val_part1, val_part2, val_part3, k, n, and max_val_infix.
  • the value of the compressed string length information is val = val_part1 + (val_part2 << k) + val_part3 - (2^n - max_val_infix - 1) × k, where << is the left-shift operator.
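The three-part method can be sketched in Python as below. Several details did not survive in the text and are filled in here as labeled assumptions: val_part1 is taken to be the start of the selected interval, max_val_infix is taken to be the largest offset inside that interval, the truncated unary index uses 1-bits terminated by a 0-bit, and k is 0 when val_part2 is smaller than 2^n - max_val_infix - 1. The helper names are hypothetical.

```python
from typing import Iterator, List, Tuple


def read_bits(bits: Iterator[int], count: int) -> int:
    """Read `count` bits, most significant first, as a fixed-length code."""
    value = 0
    for _ in range(count):
        value = (value << 1) | next(bits)
    return value


def decode_three_part(
    bits: Iterator[int], ranges: List[Tuple[int, int]], max_val: int
) -> int:
    """Decode val from an interval index plus a two-step suffix."""
    # Part 1: truncated unary index x (bit polarity is an assumption).
    last = len(ranges) - 1
    x = 0
    while x < last and next(bits) == 1:
        x += 1
    r_start, r_end = ranges[x]
    val_part1 = r_start  # assumption: part 1 is the interval start
    # Assumption: the maximum margin is the largest offset in the interval.
    max_val_infix = min(r_end - 1, max_val) - r_start
    if max_val_infix == 0:
        return val_part1  # nothing left to read
    n = max_val_infix.bit_length()  # bits needed for a fixed-length code
    length = n - 1  # first bit length
    val_part2 = read_bits(bits, length) if length >= 1 else 0
    unused = (1 << n) - max_val_infix - 1
    k = 0 if val_part2 < unused else 1  # assumed condition for k
    val_part3 = read_bits(bits, 1) if k == 1 else 0
    return val_part1 + (val_part2 << k) + val_part3 - unused * k
```

With the hypothetical intervals [0, 1), [1, 3), [3, 8) and max_val = 6, the bits 1, 1, 1, 0 select the third interval and decode to 5.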
  • the compressed string length information includes two parts, and the value of the compressed string length information is determined based on the values of the two parts.
  • let the maximum value of the compressed string length information be max_val; the method may include the following steps:
  • the value of the first part of the compressed string length information is determined.
  • the value of the first part is determined through the following steps 2-4.
  • the third bit length len = n - 1.
  • the first part of the string length information is denoted as val_part1.
  • if the third bit length is greater than or equal to 1, the data of the third bit length is decoded from the code stream, and inverse binarization processing is performed on the data of the third bit length according to the fixed-length code to obtain the value of the first part of the compressed string length information.
  • the len bits are decoded from the code stream, and the len bits are de-binarized in a fixed-length code manner to obtain the value of val_part1.
  • the value of the second part of the compressed string length information is determined according to the value of the first part and the maximum value margin.
  • the value of the second part is determined through steps 6-8 as follows.
  • the second part of the string length information is denoted as val_part2.
  • the value of the compressed string length information is calculated according to the value of the first part, the value of the second part, the target value, the number of digits of the maximum value, and the maximum value.
  • the value of the compressed string length information is calculated according to val_part1, val_part2, k, n, and max_val.
  • the value of the compressed string length information is val = (val_part1 << k) + val_part2 - (2^n - max_val - 1) × k, where << is the left-shift operator.
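The two-part method has the shape of a truncated binary code, and a minimal Python sketch is given below. The condition used to set k (k is 0 when val_part1 is smaller than 2^n - max_val - 1) is an assumption, because the text only says that k is set to 0 or 1; the function names are hypothetical.

```python
from typing import Iterator


def read_bits(bits: Iterator[int], count: int) -> int:
    """Read `count` bits, most significant first, as a fixed-length code."""
    value = 0
    for _ in range(count):
        value = (value << 1) | next(bits)
    return value


def decode_two_part(bits: Iterator[int], max_val: int) -> int:
    """Decode val in [0, max_val] from an (n-1)-bit part plus an optional bit."""
    if max_val == 0:
        return 0  # only one possible value, nothing is read
    n = max_val.bit_length()  # number of bits of the maximum value
    length = n - 1  # third bit length
    val_part1 = read_bits(bits, length) if length >= 1 else 0
    unused = (1 << n) - max_val - 1
    k = 0 if val_part1 < unused else 1  # assumed condition for k
    val_part2 = read_bits(bits, 1) if k == 1 else 0
    return (val_part1 << k) + val_part2 - unused * k
```

For max_val = 4, the values 0 to 2 take two bits and the values 3 and 4 take three bits, so short codes go to the lower values.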
  • the binary symbol string is subjected to inverse binarization processing in the manner of the k-order exponential Golomb code to obtain the compressed string length information.
  • the main encoding format is the structure of [prefix 0] [1] [bit information].
  • the length of the prefix 0 (that is, how many 0s there are in the prefix) and the bit information are calculated respectively, which completes the entire encoding.
  • the coding steps are as follows:
  • T2: the lowest K bits removed in step (1) are added back to T1, and the result is temporarily called T2;
  • leadingZeroBits = -1;
  • CodeNum = 2^(leadingZeroBits+k) - 2^k + read_bits(leadingZeroBits+k)
  • Table 4 shows the structure of exponential Golomb codes of order 0, 1, 2 and 3.
  • the bit string of exponential Golomb code is divided into two parts: "prefix" and "suffix".
  • the prefix consists of leadingZeroBits consecutive '0's and a '1'.
  • the suffix consists of leadingZeroBits+k bits, that is, the xi string in the table, and the value of xi is '0' or '1'.
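The parsing of a k-th order exponential Golomb code described above (count the leading zeros, skip the terminating 1, then read leadingZeroBits + k suffix bits) can be sketched in Python as follows. The function names and the use of an iterator of bits as the code stream are assumptions for illustration.

```python
from typing import Iterator


def read_bits(bits: Iterator[int], count: int) -> int:
    """Read `count` bits, most significant first."""
    value = 0
    for _ in range(count):
        value = (value << 1) | next(bits)
    return value


def decode_exp_golomb(bits: Iterator[int], k: int) -> int:
    """Decode one k-th order exponential Golomb codeword."""
    leading_zero_bits = -1
    b = 0
    # Prefix: leadingZeroBits zeros followed by a single 1.
    while b == 0:
        b = next(bits)
        leading_zero_bits += 1
    # Suffix: leadingZeroBits + k bits.
    suffix = read_bits(bits, leading_zero_bits + k)
    return (1 << (leading_zero_bits + k)) - (1 << k) + suffix
```

For example, with k = 1 the bit string 0100 has one leading zero, so CodeNum = 2^2 - 2^1 + 0 = 2.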
  • inverse binarization processing is performed on the binary symbol string according to the unary code or the truncated unary code, to obtain the compressed string length information.
  • the truncated unary code is a variant of the unary code, and is used when the maximum value Max of the syntax element to be encoded is known, and the maximum value Max here is the maximum value max_val of the compressed string length information.
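A truncated unary decoder can be sketched in Python as below. The bit polarity (a run of 1-bits closed by a 0-bit, with the closing bit omitted when the value equals max_val) is an assumption; the text does not state which polarity is used.

```python
from typing import Iterator


def decode_truncated_unary(bits: Iterator[int], max_val: int) -> int:
    """Count 1-bits until a 0-bit is read or max_val is reached."""
    value = 0
    while value < max_val and next(bits) == 1:
        value += 1
    return value
```

Because the maximum is known, the largest value needs no terminating bit, which is the only difference from a plain unary code.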
  • the fifth method is to perform inverse binarization processing on the binary symbol string in the manner of an n-bit fixed-length code to obtain the compressed string length information, where n is the maximum number of bits of the compressed string length information and n is a positive integer.
  • n = Ceil(Log(max_val+1))
  • max_val represents the maximum value of the compressed string length information.
  • len is equal to n.
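The fixed-length method can be sketched in Python as follows, assuming that Log denotes the base-2 logarithm, so that n is the number of bits needed to represent max_val. The function names are hypothetical, and `int.bit_length()` is used in place of a floating-point logarithm because it gives the same result for non-negative integers without rounding problems.

```python
from typing import Iterator


def fixed_length_bit_count(max_val: int) -> int:
    """n = Ceil(Log2(max_val + 1)), computed with integer arithmetic."""
    return max_val.bit_length()


def decode_fixed_length(bits: Iterator[int], max_val: int) -> int:
    """Read len = n bits, most significant first."""
    value = 0
    for _ in range(fixed_length_bit_count(max_val)):
        value = (value << 1) | next(bits)
    return value
```

For max_val = 5, n is 3, so every value from 0 to 5 is read as exactly three bits.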
  • the inverse binarization method may include the following steps:
  • the above-mentioned plurality of numerical intervals are a series of intervals in which the numerical values are integers.
  • the above-mentioned multiple numerical ranges may be denoted as R0, R1, R2, . . . , Rn.
  • the index of the xth numerical interval Rx is x, and the xth numerical interval Rx is represented as [Rx_start, Rx_end), and x is a positive integer.
  • the value of max_val satisfies Rn_start ≤ max_val < Rn_end.
  • the index of the compressed string length information obtained by decoding the code stream is denoted as x, and the index indicates the numerical value range where the value val to be restored is located.
  • the decoding end may obtain the index of the compressed string length information by decoding from the code stream according to any one of the de-binarization methods in the above-mentioned manners 1 to 5.
  • the decoding end may determine the values of val_part_2, val_part_3, ..., val_part_n.
  • val = val_part_1 + val_part_2 + val_part_3 + ... + val_part_n.
  • the decoding end may select an appropriate method from a variety of anti-binarization processing methods according to the actual situation.
  • the various de-binarization processing methods may include various methods described above.
  • the decoding end selects a method for performing inverse binarization processing on the binary symbol string from a variety of inverse binarization processing methods.
  • the decoding end selects a method for performing inverse binarization processing on the binary symbol string from a variety of inverse binarization processing methods according to the quotient of the maximum number of remaining pixels in the decoding block to which the current string belongs and the string length resolution of the current string.
  • a method for performing inverse binarization processing on the binary symbol string is selected from a variety of inverse binarization processing methods.
  • the embodiments of the present application provide a variety of binarization/inverse binarization processing methods. Depending on the size of the value to be binarized, the encoding and decoding complexity and the length of the binary string vary accordingly. With the above method, an appropriate binarization/inverse binarization processing method is selected directly or indirectly based on the string length resolution, so that the method with the best encoding and decoding performance can be flexibly selected.
  • the compressed string length information includes the string length code of the current string, denoted as L0.
  • the string length information includes the string length of the current string.
  • the decoding end multiplies the string length encoding of the current string by the string length resolution to obtain the string length of the current string.
  • the decoding end decodes the code stream to obtain a binary symbol string of the string length code L0 of the current string, and the decoding end performs inverse binarization processing on the binary symbol string to recover the string length code of the current string.
  • the string length coding may also be referred to as the string length after resolution compression, that is, the quotient obtained by dividing the true value of the string length by the resolution N of the string length.
  • the number of characters required for the binary representation of the string length code is less than the number of characters required for the binary representation of the real value of the string length, so the complexity of encoding and decoding can be reduced and the encoding and decoding performance improved.
  • the compressed string length information includes the string length code L0 of the current string.
  • the decoding end can use the inverse binarization processing described in the following embodiments to decode a value (denoted as val), and the value of L0 is equal to val.
  • let the maximum value of val be max_val, the number of remaining undecoded pixels in the decoding block to which the current string belongs be max_val_tmp, and the string length resolution of the current string be N.
  • max_val_tmp = NumTotalPixel - NumCodedPixel.
  • NumTotalPixel represents the total number of pixels in the decoding block to which the current string belongs
  • NumCodedPixel represents the number of decoded pixels in the decoding block to which the current string belongs. Since the current string may be the last string in the current decoding block (or may not be), the value range of the string length L of the current string is [N, max_val_tmp].
  • the number of remaining undecoded pixels is compressed based on the string length resolution and then binarized and encoded into the code stream. Compared with directly binarizing the real value of the number of remaining undecoded pixels and encoding it into the code stream, this reduces the number of characters, thereby reducing the complexity of encoding and decoding and improving the encoding and decoding performance.
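As a small illustration of the case where the decoded value val equals the string length code L0, the following Python sketch restores the string length and checks it against the range [N, max_val_tmp] stated above. The function name and the error handling are assumptions.

```python
def string_length_from_code(
    val: int, resolution: int, num_total_pixel: int, num_coded_pixel: int
) -> int:
    """Restore the string length L = L0 * N, where L0 = val."""
    max_val_tmp = num_total_pixel - num_coded_pixel
    string_length = val * resolution
    # The string length must lie in [N, max_val_tmp].
    if not resolution <= string_length <= max_val_tmp:
        raise ValueError("decoded string length is out of range")
    return string_length
```

With N = 4, a string of length 12 is carried as the code 3, which needs two binary digits instead of four.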
  • the compressed string length information includes the string length code of the current string minus 1, denoted as L0 - 1. Accordingly, the string length information includes the string length of the current string.
  • the decoding end adds 1 to the value of the string length code of the current string minus 1 to obtain the string length code L0 of the current string, and then multiplies L0 by the string length resolution N to obtain the string length of the current string.
  • the compressed string length information includes the string length code of the current string minus 1 (that is, L0 - 1).
  • the decoding end can use the inverse binarization processing described in the following embodiments to decode a numerical value (denoted as val), and the value of L0 is equal to val + 1.
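The minus-1 variant can be sketched the same way. The helper names below are hypothetical; the encoder-side helper is included only to show that the two directions are consistent.

```python
def code_minus_one_from_length(string_length: int, resolution: int) -> int:
    """Encoder side: val = L / N - 1 (L is assumed to be a multiple of N)."""
    return string_length // resolution - 1


def length_from_code_minus_one(val: int, resolution: int) -> int:
    """Decoder side: L0 = val + 1, then L = L0 * N."""
    string_length_code = val + 1
    return string_length_code * resolution
```

Since a string is at least N pixels long, L0 is at least 1, so sending L0 - 1 lets the smallest code be 0.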
  • the compressed string length information includes the encoding of the number of remaining pixels after decoding the current string in the decoding block to which the current string belongs.
  • the string length information includes the number of remaining pixels after decoding the current string in the decoding block to which the current string belongs.
  • the number of remaining pixels in the decoding block to which the current string belongs after decoding the current string refers to the number of remaining undecoded pixels in the decoding block to which the current string belongs after decoding the current string.
  • the encoding of the number of remaining pixels may also be referred to as the number of remaining pixels after resolution compression, that is, the quotient obtained by dividing the real value of the number of remaining pixels by the string length resolution N.
  • the decoding end multiplies the encoding of the number of remaining pixels by the string length resolution to obtain the number of remaining pixels after decoding the current string in the decoding block to which the current string belongs.
  • the number of characters required for the binary representation of the encoding of the number of remaining pixels is less than the number of characters required for the binary representation of the real value of the number of remaining pixels, so the complexity of encoding and decoding can be reduced and the encoding and decoding performance improved.
  • the encoding of the number of remaining pixels is stored in the sequence header of a data sequence in the code stream. The data sequence may be the data sequence corresponding to the image to which the current string belongs, the data sequence corresponding to the current string, or the data sequence corresponding to the CU to which the current string belongs, which is not limited in this embodiment of the present application.
  • the string length resolution of each string in the decoding block to which the current string belongs is 4. Assuming that, after the current string is decoded, the number of remaining undecoded pixels in the decoding block to which the current string belongs is 4 (binary 100), the corresponding coded representation (that is, the encoding of the number of remaining pixels) is 1.
  • the encoding of the number of remaining pixels is denoted as M0.
  • step 803 may include the following sub-steps (8031-8033):
  • Step 8031 Obtain the total number of pixels in the decoding block to which the current string belongs.
  • the total number of pixels of the decoding block is obtained by multiplying the height and width of the decoding block.
  • the total number of pixels in the decoding block to which the current string belongs is denoted as M.
  • Step 8032 Obtain the number of decoded pixels of the decoding block to which the current string belongs.
  • the above-mentioned number of decoded pixels may be obtained by accumulating the lengths of the decoded pixel strings at the decoding end.
  • the number of decoded pixels in the decoding block to which the current string belongs is denoted as M2.
  • Step 8033 Determine the string length of the current string based on the total number of pixels, the number of decoded pixels, and the number of remaining pixels after decoding the current string.
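Steps 8031 to 8033 amount to a short piece of arithmetic, sketched below in Python. The subtraction in the last step is inferred from the definitions of the quantities (total pixels, decoded pixels, remaining pixels) rather than quoted from the text, and the function and parameter names are assumptions.

```python
def string_length_from_remaining(
    block_width: int,
    block_height: int,
    num_decoded_pixels: int,
    remaining_code: int,
    resolution: int,
) -> int:
    """Derive the string length from the encoding of the remaining pixels."""
    total_pixels = block_width * block_height  # step 8031: M
    decoded_pixels = num_decoded_pixels  # step 8032: M2
    remaining_pixels = remaining_code * resolution  # M1 = M0 * N
    # Step 8033: what is neither decoded before nor left after is the string.
    return total_pixels - decoded_pixels - remaining_pixels
```

For an 8×8 block with 48 pixels already decoded, a resolution of 4 and a remaining-pixel code of 1, the remaining count is 4 and the current string has 12 pixels.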
  • the compressed string length information includes the encoding M0 of the number of remaining pixels after decoding the current string in the decoding block to which the current string belongs.
  • the decoding end can use the inverse binarization processing described in the following embodiments to decode a value (denoted as val), and the value of M0 is equal to val.
  • let the maximum value of val be max_val, the number of remaining undecoded pixels in the decoding block to which the current string belongs be max_val_tmp, and the string length resolution of the current string be N.
  • max_val_tmp = NumTotalPixel - NumCodedPixel.
  • NumTotalPixel represents the total number of pixels in the decoding block to which the current string belongs
  • NumCodedPixel represents the number of decoded pixels in the decoding block to which the current string belongs.
  • the value range of the string length L of the current string is [N, max_val_tmp], so the value range of the number of remaining pixels M1 after decoding the current string is [0, max_val_tmp - N].
  • the value range of the code M0 of the number of remaining pixels after decoding the current string is [0, max_val_tmp/N - 1].
  • the number of remaining pixels after decoding the current string is compressed based on the string length resolution and then binarized and encoded into the code stream. Compared with directly binarizing the real value of the number of remaining pixels and encoding it into the code stream, this reduces the number of characters, thereby reducing the complexity of encoding and decoding and improving the encoding and decoding performance.
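The value ranges above can be checked with a short Python sketch. The function name and the returned tuple layout are assumptions, and max_val_tmp is assumed to be a multiple of N, as implied by string lengths being multiples of the resolution.

```python
from typing import Tuple

Range = Tuple[int, int]


def remaining_pixel_ranges(
    num_total_pixel: int, num_coded_pixel: int, resolution: int
) -> Tuple[Range, Range, Range]:
    """Return the inclusive ranges of L, M1 and M0 for the current string."""
    max_val_tmp = num_total_pixel - num_coded_pixel
    length_range = (resolution, max_val_tmp)  # L in [N, max_val_tmp]
    remaining_range = (0, max_val_tmp - resolution)  # M1
    code_range = (0, max_val_tmp // resolution - 1)  # M0
    return length_range, remaining_range, code_range
```

For a 64-pixel block with 16 pixels decoded and N = 4, M0 ranges over 0 to 11, whereas the uncompressed M1 ranges over 0 to 44.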
  • the compressed string length information includes a first flag, where the first flag is used to indicate whether the current string is the last string in the decoding block to which the current string belongs.
  • the first flag is a binary variable, represented by a one-bit binary number.
  • when the first flag is 0, the current string is the last string in the decoding block to which the current string belongs; when the first flag is 1, the current string is not the last string in the decoding block to which the current string belongs.
  • the string length information also includes, for the decoding block to which the current string belongs, the encoding of the number of remaining pixels after decoding the current string, minus 1.
  • step 803 may include the following sub-steps (803a-803d):
  • Step 803a Obtain the total number of pixels in the decoding block to which the current string belongs.
  • Step 803b Obtain the number of decoded pixels of the decoding block to which the current string belongs.
  • Step 803c when the current string is the last string, subtract the number of decoded pixels from the total number of pixels to obtain the string length of the current string.
  • Step 803d if the current string is not the last string, determine the string length of the current string based on the total number of pixels, the number of decoded pixels, and the number of remaining pixels after decoding the current string.
  • the code obtained by subtracting 1 from the encoding of the number of remaining pixels is denoted as M0.
  • 1 is added to M0 and the result is multiplied by the string length resolution N to obtain the number M1 of remaining pixels after decoding the current string in the decoding block to which the current string belongs, that is, M1 = (M0 + 1) × N.
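Steps 803a to 803d can be sketched in Python as follows. The first flag is passed as a boolean so that the sketch does not depend on which bit value means "last string"; the names are assumptions, and the final subtraction for the non-last case is inferred from the definitions of the quantities.

```python
from typing import Optional


def string_length_with_last_flag(
    is_last_string: bool,
    total_pixels: int,
    decoded_pixels: int,
    resolution: int,
    remaining_code_minus_one: Optional[int] = None,
) -> int:
    """Derive the string length when a last-string flag is signalled."""
    if is_last_string:
        # Step 803c: the last string covers everything that is left.
        return total_pixels - decoded_pixels
    if remaining_code_minus_one is None:
        raise ValueError("a non-last string needs the remaining-pixel code")
    # M1 = (M0 + 1) * N
    remaining_pixels = (remaining_code_minus_one + 1) * resolution
    # Step 803d
    return total_pixels - decoded_pixels - remaining_pixels
```

In a 64-pixel block with 48 pixels decoded and N = 4, a last string has 16 pixels, while a non-last string with M0 = 1 leaves 8 pixels and therefore has 8 pixels itself.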
  • the decoding end can use the inverse binarization processing described in the following embodiments to decode a value (denoted as val), and the value of M0 is equal to val + 1.
  • whether the current string is the last string in the current decoding block is indicated by the first flag. If the current string is not the last string, the code stream contains information about the number of remaining pixels after decoding the current string. Therefore, when the current string is not the last string, the value range of the string length L of the current string is [N, max_val_tmp - N], and the value range of the number of remaining pixels M1 after decoding the current string is [N, max_val_tmp - N].
  • the value range of the code M0 of the number of remaining pixels after decoding the current string is [1, max_val_tmp/N - 1].
  • the number of remaining pixels after decoding the current string is compressed based on the string length resolution and then binarized and encoded into the code stream. Compared with directly binarizing the real value of the number of remaining pixels and encoding it into the code stream, this reduces the number of characters, thereby reducing the complexity of encoding and decoding and improving the encoding and decoding performance.
  • FIG. 9 shows a flowchart of a video encoding method provided by an embodiment of the present application.
  • the method can be applied to the encoding end device, that is, the method can be executed by the encoding end device.
  • the method may include the following steps (901-903):
  • Step 901 Determine the string length of the current string.
  • the current string refers to the currently encoded pixel string.
  • Step 902 Determine string length information of the current string based on the string length of the current string, where the string length information includes information related to the string length of the current string.
  • the string length information includes the string length of the current string.
  • the string length information includes the number of remaining pixels in the encoding block to which the current string belongs after encoding the current string.
  • the encoding end obtains the total number of pixels in the encoding block to which the current string belongs and the number of encoded pixels in that encoding block, and determines the number of remaining pixels after encoding the current string based on the total number of pixels, the number of encoded pixels, and the string length of the current string.
  • Step 903 Binarize the string length information according to the string length resolution of the current string to obtain a binary symbol string of the string length information.
  • the compressed string length information is determined according to the string length information and the string length resolution; the compressed string length information is binarized to obtain a binary symbol string of the string length information.
  • the binarization process may adopt a manner corresponding to the inverse binarization process described above, which will not be repeated in this embodiment.
  • the encoding end may divide the string length of the current string by the string length resolution to obtain the string length code of the current string.
  • the compressed string length information may include the string length encoding of the current string, or may include the string length encoding of the current string minus 1.
  • the encoder can divide the number of remaining pixels by the string length resolution to obtain the encoding of the remaining number of pixels.
  • the compressed string length information may include the encoding of the number of remaining pixels after encoding the current string in the encoding block to which the current string belongs, or may include that encoding minus 1.
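On the encoding side, the compression by the string length resolution can be sketched in Python as below. The function names, the `minus_one` switch, and the divisibility checks are assumptions; the text only states that the string length or the number of remaining pixels is divided by the resolution.

```python
def compress_string_length(
    string_length: int, resolution: int, minus_one: bool = False
) -> int:
    """Return the string length code L / N, optionally minus 1."""
    if string_length % resolution != 0:
        raise ValueError("string length must be a multiple of the resolution")
    code = string_length // resolution
    return code - 1 if minus_one else code


def compress_remaining_pixels(
    total_pixels: int, encoded_pixels: int, string_length: int, resolution: int
) -> int:
    """Return the encoding of the pixels left after encoding the string."""
    remaining = total_pixels - encoded_pixels - string_length
    if remaining < 0 or remaining % resolution != 0:
        raise ValueError("remaining pixel count is inconsistent")
    return remaining // resolution
```

The resulting code is then passed to one of the binarization methods, mirroring the inverse binarization on the decoding side.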
  • the encoding end can select an appropriate binarization processing method:
  • the encoding end selects a method for performing binarization processing on the string length information from a variety of binarization processing methods.
  • the encoding end selects a method for performing binarization processing on the string length information from a variety of binarization processing methods according to the quotient of the maximum number of remaining pixels in the encoding block to which the current string belongs and the string length resolution of the current string.
  • the following ways of determining the string length resolution of the current string are provided.
  • Mode 1: The string length resolution of the current string is the first preset value;
  • Mode 2: The string length resolution of each string included in the image sequence to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the sequence header of the image sequence to which the current string belongs;
  • Mode 3: The string length resolution of each string included in the image to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the image header of the image to which the current string belongs;
  • Mode 4: The string length resolution of each string contained in the slice to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the slice header of the slice to which the current string belongs;
  • Mode 5: The string length resolution of each string included in the LCU to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the encoding information of the LCU to which the current string belongs;
  • Mode 6: The string length resolution of each string included in the coding unit CU to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the encoding information of the CU to which the current string belongs;
  • Mode 7: The string length resolution of the current string is encoded and added to the encoding information of the current string;
  • Mode 8: The string length resolution of the current string is determined according to the size of the decoding block to which the current string belongs;
  • Mode 9: The string length resolution of the current string is determined according to the color component and chroma format corresponding to the current string;
  • Mode 10: When the number of decoded strings in the CU to which the current string belongs is greater than or equal to the first threshold, the string length resolution of the current string is a second preset value;
  • Mode 11: When the number of decoded unmatched pixels in the CU to which the current string belongs is greater than or equal to the second threshold, the string length resolution of the current string is a third preset value;
  • the technical solutions provided by the embodiments of the present application use the string length resolution as the basis for dividing pixel strings and for encoding and decoding, so that the length of each pixel string in a codec block is limited to a multiple of the string length resolution. This improves the uniformity of the pixel strings and enables the encoder and decoder to work under memory-aligned conditions, which improves the encoding and decoding efficiency of the pixel strings.
  • the present application also proposes methods for binarization and inverse binarization processing of string length information, which improve the encoding and decoding of string length information under different string length resolutions and are beneficial to the improvement of encoding and decoding performance.
  • FIG. 10 shows a block diagram of a video decoding apparatus provided by an embodiment of the present application.
  • the apparatus has the function of implementing the above example of the video decoding method, and the function may be implemented by hardware, or by hardware executing corresponding software.
  • the apparatus may be the computer equipment described above, or may be provided on the computer equipment.
  • the apparatus 1000 may include: a binary symbol acquisition module 1010 , an inverse binarization processing module 1020 and a string length determination module 1030 .
  • the binary symbol obtaining module 1010 is configured to decode from the code stream to obtain a binary symbol string of string length information of the current string, where the string length information includes information related to the string length of the current string.
  • the inverse binarization processing module 1020 is configured to perform inverse binarization processing on the binary symbol string according to the string length resolution of the current string to obtain the string length information.
  • a string length determination module 1030 configured to determine the string length of the current string according to the string length information.
  • the inverse binarization processing module 1020 includes:
  • an inverse binarization processing unit configured to perform inverse binarization processing on the binary symbol string to obtain string length information compressed based on the string length resolution
  • a length information determining unit configured to determine the string length information according to the compressed string length information and the string length resolution.
  • the inverse binarization processing unit is specifically used for:
  • the inverse binarization processing unit is specifically configured to:
  • in the case that the first bit length is greater than or equal to 1, decode data of the first bit length from the code stream and perform inverse binarization processing on that data according to a fixed-length code to obtain the value of the second part.
  • the inverse binarization processing unit is specifically configured to:
  • the target value is set to 0; otherwise, the target value is set to 1;
  • in the case that the target value is equal to 1, determine that the second bit length is 1, decode data of the second bit length from the code stream, and perform inverse binarization processing on that data according to the fixed-length code.
  • the inverse binarization processing unit is further configured to determine a context model for the index of the compressed string length information according to the maximum value of the compressed string length information, the context model being used to entropy decode the index using CABAC.
  • the inverse binarization processing unit is specifically used for:
  • the value of the compressed string length information is calculated based on the value of the first part, the value of the second part, the number of bits of the maximum value, and the maximum value.
  • the inverse binarization processing unit is specifically configured to:
  • in the case that the third bit length is greater than or equal to 1, decode data of the third bit length from the code stream and perform inverse binarization processing on that data according to a fixed-length code to obtain the value of the first part.
  • the inverse binarization processing unit is specifically configured to:
  • the target value is set to 0; otherwise, the target value is set to 1;
  • in the case that the target value is equal to 1, determine that the fourth bit length is 1, decode data of the fourth bit length from the code stream, and perform inverse binarization processing on that data according to the fixed-length code to obtain the value of the second part.
  • the inverse binarization processing unit is specifically configured to perform inverse binarization processing on the binary symbol string by k-th order exponential Golomb coding to obtain the compressed string length information.
  • the inverse binarization processing unit is specifically configured to perform inverse binarization processing on the binary symbol string by unary code or truncated unary code to obtain the compressed string length information.
  • the inverse binarization processing unit is specifically configured to perform inverse binarization processing on the binary symbol string by an n-bit fixed-length code to obtain the compressed string length information, where n is the maximum number of bits of the compressed string length information and n is a positive integer.
  • the value of the compressed string length information is determined based on multiple parts; the inverse binarization processing unit is specifically configured to:
  • the value of the compressed string length information is determined according to the value of the first part and the value of the remaining part.
  • the compressed string length information includes a string length encoding of the current string
  • the length information determining unit is specifically configured to multiply the string length code of the current string by the string length resolution to obtain the string length of the current string.
  • the compressed string length information includes the string length encoding of the current string minus 1;
  • the length information determination unit is specifically configured to add 1 to the value of the string length code of the current string minus 1 to obtain the string length code of the current string, and to multiply the string length code of the current string by the string length resolution to obtain the string length of the current string.
  • the compressed string length information includes the encoding of the number of remaining pixels after decoding the current string in the decoding block to which the current string belongs;
  • the length information determining unit is configured to multiply the code of the number of remaining pixels by the string length resolution to obtain the number of remaining pixels in the decoding block to which the current string belongs after the current string is decoded.
  • the string length determination module 1030 is configured to obtain the total number of pixels of the decoding block to which the current string belongs; obtain the number of decoded pixels of the decoding block to which the current string belongs; and determine the string length of the current string based on the total number of pixels, the number of decoded pixels and the number of remaining pixels after the current string is decoded.
  • the compressed string length information includes a first flag, and the first flag is used to indicate whether the current string is the last string in the decoding block to which the current string belongs;
  • the length information determination unit is configured to, in the case that the current string is determined according to the first flag not to be the last string, obtain the code of the number of remaining pixels in the decoding block to which the current string belongs after the current string is decoded, wherein the string length information further includes that code, or further includes that code minus 1; and multiply the code of the number of remaining pixels by the string length resolution to obtain the number of remaining pixels in the decoding block to which the current string belongs after the current string is decoded.
  • the string length determination module 1030 is configured to obtain the total number of pixels of the decoding block to which the current string belongs; obtain the number of decoded pixels of the decoding block to which the current string belongs; in the case that the current string is the last string, subtract the number of decoded pixels from the total number of pixels to obtain the string length of the current string; and in the case that the current string is not the last string, determine the string length of the current string based on the total number of pixels, the number of decoded pixels and the number of remaining pixels after the current string is decoded.
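The two cases handled by the string length determination module can be sketched as below. This is only a minimal illustration under stated assumptions: the function name `string_length` and its parameter names are hypothetical and do not come from the application, and the sketch covers only the variant in which the code of the number of remaining pixels itself (not that code minus 1) is signalled.

```python
def string_length(total_pixels, decoded_pixels, is_last,
                  remaining_code=None, resolution=1):
    """Return the length of the current string (hypothetical helper).

    total_pixels   -- total number of pixels in the decoding block
    decoded_pixels -- pixels of the block decoded before the current string
    is_last        -- value of the first flag (True: last string in the block)
    remaining_code -- decoded code of the number of remaining pixels
    resolution     -- string length resolution
    """
    if is_last:
        # Last string: it covers everything that is still undecoded.
        return total_pixels - decoded_pixels
    # Not the last string: undo the compression by the resolution first.
    remaining_pixels = remaining_code * resolution
    return total_pixels - decoded_pixels - remaining_pixels


# 64-pixel block, 16 pixels already decoded, resolution 4.
print(string_length(64, 16, is_last=False, remaining_code=5, resolution=4))
print(string_length(64, 16, is_last=True))
```

With a resolution of 4 and a remaining-pixel code of 5, 20 pixels remain after the current string, so the current string covers the other 28 undecoded pixels.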
  • the apparatus 1000 further includes a resolution determination module for:
  • decoding to obtain the string length resolution of the current string from the sequence header of the image sequence to which the current string belongs, wherein the string length resolution of each string included in the image sequence to which the current string belongs is the same;
  • decoding to obtain the string length resolution of the current string from the slice header of the slice to which the current string belongs, wherein the string length resolution of each string included in the slice to which the current string belongs is the same;
  • decoding to obtain the string length resolution of the current string from the encoding information of the largest coding unit LCU to which the current string belongs, wherein the string length resolution of each string included in the LCU to which the current string belongs is the same;
  • decoding to obtain the string length resolution of the current string from the encoding information of the coding unit CU to which the current string belongs, wherein the string length resolution of each string included in the coding unit CU to which the current string belongs is the same;
  • the string length resolution of the current string is obtained by decoding.
  • the apparatus 1000 further includes a mode selection module for:
  • a method for performing the inverse binarization processing on the binary symbol string is selected from a variety of inverse binarization processing methods.
  • FIG. 11 shows a block diagram of a video encoding apparatus provided by an embodiment of the present application.
  • the apparatus has the function of implementing the above example of the video encoding method, and the function may be implemented by hardware or by hardware executing corresponding software.
  • the apparatus may be the computer equipment described above, or may be provided on the computer equipment.
  • the apparatus 1100 may include: a string length determination module 1110 , a length information determination module 1120 and a binarization processing module 1130 .
  • the string length determination module 1110 is configured to determine the string length of the current string.
  • a length information determining module 1120 configured to determine the string length information of the current string based on the string length of the current string, where the string length information includes information related to the string length of the current string.
  • the binarization processing module 1130 is configured to perform binarization processing on the string length information according to the string length resolution of the current string to obtain a binary symbol string of the string length information.
  • the binarization processing module 1130 includes:
  • a length information determining unit configured to determine the compressed string length information according to the string length information and the string length resolution
  • the binarization processing unit is used for binarizing the compressed string length information to obtain a binary symbol string of the string length information.
  • the binarization process may adopt a manner corresponding to the inverse binarization process introduced above, which is not described repeatedly in this embodiment.
  • the length information determining unit may be configured to divide the string length of the current string by the string length resolution to obtain the string length code of the current string.
  • the compressed string length information may include the string length encoding of the current string, or may include the string length encoding of the current string minus 1.
  • the length information determining unit may be configured to divide the number of remaining pixels by the string length resolution to obtain the code of the number of remaining pixels.
  • the compressed string length information may include the code of the number of remaining pixels in the encoding block to which the current string belongs after the current string is encoded, or may include that code minus 1.
  • the string length resolution of the current string is determined in any of the following ways:
  • Mode 1: the string length resolution of the current string is a first preset value;
  • Mode 2: the string length resolution of each string included in the image sequence to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the sequence header of the image sequence to which the current string belongs;
  • Mode 3: the string length resolution of each string included in the image to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the image header of the image to which the current string belongs;
  • Mode 4: the string length resolution of each string included in the slice to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the slice header of the slice to which the current string belongs;
  • Mode 5: the string length resolution of each string included in the LCU to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the encoding information of the LCU to which the current string belongs;
  • Mode 6: the string length resolution of each string included in the coding unit CU to which the current string belongs is the same, and the string length resolution of the current string is encoded and added to the encoding information of the CU to which the current string belongs;
  • Mode 7: the string length resolution of the current string is encoded and added to the encoding information of the current string;
  • Mode 8: the string length resolution of the current string is determined according to the size of the decoding block to which the current string belongs;
  • Mode 9: the string length resolution of the current string is determined according to the color component and chroma format corresponding to the current string;
  • Mode 10: when the number of decoded strings in the CU to which the current string belongs is greater than or equal to the first threshold, the string length resolution of the current string is a second preset value;
  • Mode 11: when the number of decoded unmatched pixels in the CU to which the current string belongs is greater than or equal to the second threshold, the string length resolution of the current string is a third preset value.
  • the apparatus 1100 further includes a mode selection module for:
  • a method for performing binarization processing on the string length information is selected from multiple binarization processing methods.
  • FIG. 12 shows a structural block diagram of a computer device provided by an embodiment of the present application.
  • the computer device may be the encoding end device described above, or the decoding end device described above.
  • the computer device 120 may include: a processor 121 , a memory 122 , a communication interface 123 , an encoder/decoder 124 and a bus 125 .
  • the processor 121 includes one or more processing cores, and the processor 121 executes various functional applications and information processing by running software programs and modules.
  • the memory 122 can be used for storing a computer program, and the processor 121 is used for executing the computer program, so as to realize the above-mentioned video decoding method, or realize the above-mentioned video encoding method.
  • the communication interface 123 can be used to communicate with other devices, such as to receive audio and video data.
  • the encoder/decoder 124 may be used to implement encoding and decoding functions, such as encoding and decoding audio and video data.
  • the memory 122 is connected to the processor 121 through a bus 125 .
  • the memory 122 can be implemented by any type of volatile or non-volatile storage device or a combination thereof.
  • the volatile or non-volatile storage device includes but is not limited to: a magnetic disk or optical disk, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM (Erasable Programmable Read-Only Memory), SRAM (Static Random-Access Memory), ROM (Read-Only Memory), magnetic memory, flash memory, and PROM (Programmable Read-Only Memory).
  • the structure shown in FIG. 12 does not limit the computer device 120, which may include more or fewer components than shown, combine some components, or adopt a different arrangement of components.
  • a computer-readable storage medium stores at least one instruction, at least one program, a code set or an instruction set which, when executed by a processor, implements the above-described video decoding method.
  • a computer-readable storage medium stores at least one instruction, at least one program, a code set or an instruction set which is loaded and executed by a processor to implement the above-described video encoding method.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, causing the computer device to perform the above-described video decoding method.
  • a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, causing the computer device to perform the above-described video encoding method.
  • "a plurality" herein means two or more.
  • "and/or" describes an association between associated objects and covers three cases; for example, A and/or B can mean that A exists alone, that A and B exist at the same time, or that B exists alone.
  • the character "/" generally indicates an "or" relationship between the associated objects.

Abstract

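The present application provides a video decoding method, a video encoding method, an apparatus, a device and a storage medium, and relates to the technical field of video encoding and decoding. The method includes: decoding a code stream to obtain a binary symbol string of string length information of a current string; performing inverse binarization processing on the binary symbol string according to a string length resolution of the current string to obtain the string length information; and determining the string length of the current string according to the string length information. The present application can limit the length of pixel strings in a codec block to a multiple of the string length resolution, which improves the uniformity of the pixel strings, enables the encoder and decoder to operate under memory alignment, and improves the encoding and decoding efficiency of pixel strings. In addition, taking into account the influence of the string length resolution on the coding of string length information, the present application proposes methods for binarization and inverse binarization of string length information, which improve the way string length information is encoded and decoded under different string length resolutions and benefit codec performance.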

Description

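Video decoding method, video encoding method, apparatus, device and storage medium
This application claims priority to Chinese Patent Application No. 202010841108.9, filed on August 20, 2020 and entitled "Video decoding method, video encoding method, apparatus, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the technical field of video encoding and decoding, and in particular to a video decoding method, a video encoding method, an apparatus, a device and a storage medium.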
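Background
Current video coding standards, such as VVC (Versatile Video Coding) and AVS3 (Audio Video coding Standard 3), have introduced the ISC (Intra String Copy) technique.
In the related intra string copy technique, a coding block is divided into a series of pixel strings or unmatched pixels according to a certain scanning order. In effect, the coding block may be divided into pixel strings of any integer pixel length; for example, an unmatched pixel can be regarded as a pixel string of length 1. The length of a pixel string can therefore be any positive integer, such as 1, 2, 3 or 4.
However, encoding and decoding based on pixel strings divided in this way suffers from low efficiency.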
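Summary
The embodiments of the present application provide a video decoding method, a video encoding method, an apparatus, a device and a storage medium, which can improve the encoding and decoding efficiency of pixel strings. The technical solutions are as follows:
According to one aspect of the embodiments of the present application, a video decoding method is provided. The method is performed by a decoding end device and includes:
decoding a code stream to obtain a binary symbol string of string length information of a current string, the string length information including information related to the string length of the current string;
performing inverse binarization processing on the binary symbol string according to a string length resolution of the current string to obtain the string length information; and
determining the string length of the current string according to the string length information.
According to one aspect of the embodiments of the present application, a video encoding method is provided. The method is performed by an encoding end device and includes:
determining the string length of a current string;
determining string length information of the current string based on the string length of the current string, the string length information including information related to the string length of the current string; and
performing binarization processing on the string length information according to a string length resolution of the current string to obtain a binary symbol string of the string length information.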
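According to one aspect of the embodiments of the present application, a video decoding apparatus is provided. The apparatus includes:
a binary symbol acquisition module, configured to decode a code stream to obtain a binary symbol string of string length information of a current string, the string length information including information related to the string length of the current string;
an inverse binarization processing module, configured to perform inverse binarization processing on the binary symbol string according to a string length resolution of the current string to obtain the string length information; and
a string length determination module, configured to determine the string length of the current string according to the string length information.
According to one aspect of the embodiments of the present application, a video encoding apparatus is provided. The apparatus includes:
a string length determination module, configured to determine the string length of a current string;
a length information determination module, configured to determine string length information of the current string based on the string length of the current string, the string length information including information related to the string length of the current string; and
a binarization processing module, configured to perform binarization processing on the string length information according to a string length resolution of the current string to obtain a binary symbol string of the string length information.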
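According to one aspect of the embodiments of the present application, a computer device is provided. The computer device includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set which is loaded and executed by the processor to implement the above video decoding method.
According to one aspect of the embodiments of the present application, a computer device is provided. The computer device includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set which is loaded and executed by the processor to implement the above video encoding method.
According to one aspect of the embodiments of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores at least one instruction, at least one program, a code set or an instruction set which is loaded and executed by a processor to implement the above video decoding method.
According to one aspect of the embodiments of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores at least one instruction, at least one program, a code set or an instruction set which is loaded and executed by a processor to implement the above video encoding method.
According to one aspect of the embodiments of the present application, a computer program product or computer program is provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above video decoding method.
According to one aspect of the embodiments of the present application, a computer program product or computer program is provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above video encoding method.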
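The technical solutions provided by the embodiments of the present application may bring the following beneficial effects:
By using the string length resolution as the basis for dividing, encoding and decoding pixel strings, the length of pixel strings in a codec block can be limited to a multiple of the string length resolution. This improves the uniformity of the pixel strings, enables the encoder and decoder to operate under memory alignment, and improves the encoding and decoding efficiency of pixel strings.
In addition, taking into account the influence of the string length resolution on the coding of string length information, the present application proposes a method for binarizing and inverse binarizing string length information according to the string length resolution, which improves the way string length information is encoded and decoded under different string length resolutions. Specifically, when binarizing the value of the string length information, the encoder may first compress the value using the string length resolution and then binarize the compressed value (the quotient of the value divided by the string length resolution), rather than binarizing the value directly. Correspondingly, inverse binarization at the decoder recovers the compressed value, and the value of the string length information is then obtained from the compressed value and the string length resolution (by multiplying the two). This reduces the number of symbols needed for the binarized representation, which lowers encoding and decoding complexity and benefits codec performance.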
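Brief Description of the Drawings
FIG. 1 is a basic flowchart of a video encoding process shown by way of example in the present application;
FIG. 2 is a schematic diagram of an inter prediction mode provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of candidate motion vectors provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of an intra block copy mode provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an intra string copy mode provided by an embodiment of the present application;
FIG. 6 is a simplified block diagram of a communication system provided by an embodiment of the present application;
FIG. 7 is a schematic diagram, shown by way of example in the present application, of the placement of a video encoder and a video decoder in a streaming environment;
FIG. 8 is a flowchart of a video decoding method provided by an embodiment of the present application;
FIG. 9 is a flowchart of a video encoding method provided by an embodiment of the present application;
FIG. 10 is a block diagram of a video decoding apparatus provided by an embodiment of the present application;
FIG. 11 is a block diagram of a video encoding apparatus provided by an embodiment of the present application;
FIG. 12 is a structural block diagram of a computer device provided by an embodiment of the present application.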
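Detailed Description
To make the objectives, technical solutions and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present application are introduced, video encoding technology is briefly described with reference to FIG. 1, which shows by way of example a basic flowchart of a video encoding process.
A video signal is an image sequence comprising multiple frames. A frame is a representation of the spatial information of a video signal. Taking the YUV format as an example, a frame includes one luma sample matrix (Y) and two chroma sample matrices (Cb and Cr). In terms of how it is acquired, a video signal may be captured by a camera or generated by a computer. Because the statistical characteristics differ, the corresponding compression coding methods may also differ.
Some mainstream video coding technologies, such as H.265/HEVC (High Efficiency Video Coding), H.266/VVC (Versatile Video Coding) and AVS (Audio Video coding Standard, for example AVS3), adopt a hybrid coding framework and perform the following series of operations on the input original video signal: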
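1. Block partition structure: the input image is divided into several non-overlapping processing units, each of which undergoes similar compression operations. Such a processing unit is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). A CTU can be further partitioned more finely into one or more basic coding units, called CUs (Coding Units). Each CU is the most basic element in a coding pass. The coding modes that may be applied to each CU are described below.
2. Predictive coding: this includes intra prediction, inter prediction and other modes. The residual video signal is obtained after the original video signal is predicted from a selected reconstructed video signal. The encoder must choose the most suitable of the many possible predictive coding modes for the current CU and inform the decoder. Intra prediction means that the predicted signal comes from a region of the same image that has already been encoded and reconstructed. Inter prediction means that the predicted signal comes from an already encoded image other than the current image (called a reference image).
3. Transform and quantization: the residual video signal undergoes a transform such as DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), which converts it into the transform domain, where it is referred to as transform coefficients. The transform-domain signal then undergoes lossy quantization, which discards some information so that the quantized signal is easier to compress. Some video coding standards offer more than one transform, so the encoder must also select one of them for the current CU and inform the decoder. The fineness of quantization is usually determined by the QP (Quantization Parameter). A larger QP means that coefficients over a wider range of values are quantized to the same output, which usually brings greater distortion and a lower bit rate; conversely, a smaller QP means that coefficients over a narrower range of values are quantized to the same output, which usually brings less distortion and a higher bit rate.
4. Entropy coding or statistical coding: the quantized transform-domain signal is statistically compressed according to the frequency of each value, and a binarized (0 or 1) compressed code stream is output. Other information produced by encoding, such as the selected mode and motion vectors, also needs entropy coding to reduce the bit rate. Statistical coding is lossless and effectively reduces the bit rate needed to express the same signal. Common statistical coding methods include variable length coding (VLC) and context-based adaptive binary arithmetic coding (CABAC).
5. Loop filtering: an encoded image is inverse quantized, inverse transformed and prediction-compensated (the reverse of steps 2 to 4 above) to obtain a reconstructed decoded image. Because of quantization, some information in the reconstructed image differs from the original image, producing distortion. Filtering the reconstructed image, for example with deblocking, SAO (Sample Adaptive Offset) or ALF (Adaptive Loop Filter), effectively reduces the distortion caused by quantization. Since the filtered reconstructed images serve as references for subsequently encoded images and are used to predict future signals, these filtering operations are also called loop filtering, that is, filtering within the coding loop.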
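As the above encoding process shows, at the decoding end, for each CU the decoder obtains the compressed code stream and first performs entropy decoding to obtain the mode information and the quantized transform coefficients. The coefficients are inverse quantized and inverse transformed to obtain the residual signal. Meanwhile, the prediction signal for the CU is obtained from the known coding mode information; adding the two yields the reconstructed signal. Finally, the reconstructed values of the decoded image undergo loop filtering to produce the final output signal.
Mainstream video coding standards such as HEVC, VVC and AVS3 all adopt a block-based hybrid coding framework. They divide the original video data into a series of coding blocks and compress the video data by combining prediction, transform, entropy coding and other video coding methods. Motion compensation is a class of prediction methods commonly used in video coding; based on the redundancy of video content in the temporal or spatial domain, it derives the prediction of the current coding block from an already encoded region. Such prediction methods include inter prediction, intra block copy prediction and intra string copy prediction, which a particular implementation may use alone or in combination. For coding blocks that use these prediction methods, one or more two-dimensional displacement vectors usually need to be encoded in the code stream, explicitly or implicitly, to indicate the displacement of the current block (or a co-located block of the current block) relative to one or more of its reference blocks.
Note that the displacement vector may have different names under different prediction modes and implementations. This document uses the following terms: 1) the displacement vector in inter prediction mode is called the motion vector (MV); 2) the displacement vector in IBC (Intra Block Copy) prediction mode is called the block vector (BV); 3) the displacement vector in ISC (Intra String Copy) prediction mode is called the string vector (SV). Intra string copy is also called "string prediction" or "string matching".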
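An MV is the displacement vector used in inter prediction mode. It points from the current image to the reference image, and its value is the coordinate offset between the current block and the reference block, which lie in two different images. Inter prediction mode can use motion vector prediction: the motion vector of the current block is predicted to obtain a predicted motion vector, and the difference between the predicted motion vector and the actual motion vector is encoded and transmitted. Compared with encoding and transmitting the actual motion vector directly, this saves bits. In the embodiments of the present application, the predicted motion vector is the prediction of the motion vector of the current block obtained through motion vector prediction.
A BV is the displacement vector used in IBC prediction mode. Its value is the coordinate offset between the current block and the reference block, both of which lie in the current image. IBC mode can use block vector prediction: the block vector of the current block is predicted to obtain a predicted block vector, and the difference between the predicted block vector and the actual block vector is encoded and transmitted. Compared with encoding and transmitting the actual block vector directly, this saves bits. In the embodiments of the present application, the predicted block vector is the prediction of the block vector of the current block obtained through block vector prediction.
An SV is the displacement vector used in ISC prediction mode. Its value is the coordinate offset between the current string and the reference string, both of which lie in the current image. ISC mode can use string vector prediction: the string vector of the current string is predicted to obtain a predicted string vector, and the difference between the predicted string vector and the actual string vector is encoded and transmitted. Compared with encoding and transmitting the actual string vector directly, this saves bits. In the embodiments of the present application, the predicted string vector is the prediction of the string vector of the current string obtained through string vector prediction.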
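Several different prediction modes are introduced below:
I. Inter prediction mode
As shown in FIG. 2, inter prediction exploits the temporal correlation of video and uses pixels of neighboring encoded images to predict pixels of the current image, which effectively removes temporal redundancy and saves bits for encoding residual data. Here P is the current frame, Pr is the reference frame, B is the current block to be encoded, and Br is the reference block of B. B' has the same coordinate position in its image as B; the coordinates of Br are (xr, yr) and the coordinates of B' are (x, y). The displacement between the current block to be encoded and its reference block is called the motion vector (MV), namely:
MV=(xr-x,yr-y).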
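Because temporally or spatially neighboring blocks are strongly correlated, MV prediction can further reduce the bits needed to encode an MV. In H.265/HEVC, inter prediction includes two MV prediction techniques: Merge and AMVP (Advanced Motion Vector Prediction).
Merge mode builds an MV candidate list for the current PU (Prediction Unit) containing 5 candidate MVs (and their corresponding reference images). The 5 candidates are traversed and the one with the lowest rate-distortion cost is selected as the optimal MV. Provided the encoder and decoder build the candidate list in the same way, the encoder only needs to transmit the index of the optimal MV in the candidate list. Note that HEVC MV prediction also has a skip mode, which is a special case of Merge mode. After Merge mode finds the optimal MV, if the current block and the reference block are essentially the same, no residual data needs to be transmitted; only the MV index and a skip flag are sent.
The MV candidate list built in Merge mode covers the spatial and temporal cases and, for a B slice (B-frame image), a combined list as well. The spatial domain provides at most 4 candidate MVs, built as shown in part (a) of FIG. 3. The spatial list is built in the order A1→B1→B0→A0→B2, where B2 is a substitute: the motion information of B2 is used only when one or more of A1, B1, B0 and A0 are unavailable. The temporal domain provides at most 1 candidate MV, built as shown in part (b) of FIG. 3 by scaling the MV of the co-located PU as follows:
curMV=td*colMV/tb;
where curMV is the MV of the current PU, colMV is the MV of the co-located PU, td is the distance between the current image and the reference image, and tb is the distance between the co-located image and the reference image. If the PU at position D0 of the co-located block is unavailable, the co-located PU at position D1 is used instead. A PU in a B slice has two MVs, so its MV candidate list also needs to provide two MVPs (Motion Vector Predictors). HEVC generates the combined list for B slices by combining the first 4 candidate MVs in the MV candidate list in pairs.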
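The temporal scaling formula above can be illustrated with a small Python sketch. It follows the definitions of td and tb given in this text; the function name `scale_temporal_mv` is hypothetical, and exact rational arithmetic is an assumption made for clarity (a real codec would typically use fixed-point integer arithmetic with rounding and clipping).

```python
from fractions import Fraction


def scale_temporal_mv(col_mv, td, tb):
    """Scale the co-located PU's MV: curMV = td * colMV / tb.

    col_mv -- (x, y) MV of the co-located PU
    td     -- distance between the current image and its reference image
    tb     -- distance between the co-located image and its reference image
    """
    factor = Fraction(td, tb)  # exact ratio; avoids floating-point rounding
    return tuple(factor * component for component in col_mv)


# The co-located MV spans two picture intervals, the current MV only one,
# so each component is halved.
cur_mv = scale_temporal_mv((8, -4), td=1, tb=2)
print([int(c) for c in cur_mv])
```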
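Similarly, AMVP mode uses the MV correlation of spatially and temporally neighboring blocks to build an MV candidate list for the current PU. Unlike Merge mode, AMVP mode selects the optimal predicted MV from the MV candidate list and differentially encodes it against the optimal MV obtained by motion search for the current block to be encoded, that is, it encodes MVD=MV-MVP, where MVD is the motion vector difference. By building the same list, the decoder needs only the MVD and the index of the MVP in the list to compute the MV of the current decoding block. The MV candidate list in AMVP mode also covers the spatial and temporal cases, but its length is only 2.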
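The differential coding MVD=MV-MVP and its inversion at the decoder can be sketched as follows. This is an illustrative Python example only; the helper names `encode_mvd` and `decode_mv` and the sample candidate values are hypothetical, and candidate list construction is not modelled.

```python
def encode_mvd(mv, mvp):
    """Encoder side: MVD = MV - MVP, component by component."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvd, candidates, mvp_index):
    """Decoder side: rebuild the MV from the MVD and the MVP's list index."""
    mvp = candidates[mvp_index]
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])


# Both sides are assumed to have built the same two-entry candidate list.
candidates = [(4, 0), (6, -2)]
mv = (7, -1)                           # result of the encoder's motion search
mvd = encode_mvd(mv, candidates[1])    # only the MVD and index 1 are signalled
assert decode_mv(mvd, candidates, 1) == mv
```

Only the small difference vector and a list index travel in the code stream, which is where the bit saving comes from.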
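As noted above, the AMVP mode of HEVC needs to encode the MVD (Motion Vector Difference). In HEVC, the resolution of the MVD is controlled by use_integer_mv_flag in slice_header: when the flag is 0, the MVD is encoded at 1/4 (luma) pixel resolution; when the flag is 1, the MVD is encoded at integer (luma) pixel resolution. VVC uses Adaptive Motion Vector Resolution (AMVR), which lets each CU adaptively choose the resolution at which its MV is encoded. In normal AMVP mode the available resolutions are 1/4, 1/2, 1 and 4 pixels. For a CU with at least one non-zero MVD component, a first flag is encoded to indicate whether quarter-luma-sample MVD precision is used for the CU. If this flag is 0, the MVD of the current CU is encoded at 1/4 pixel resolution. Otherwise, a second flag is encoded to indicate whether the CU uses 1/2 pixel resolution or another MVD resolution. If another resolution is used, a third flag is encoded to indicate whether the CU uses 1 pixel resolution or 4 pixel resolution.
II. IBC prediction mode
IBC is an intra coding tool adopted in the HEVC Screen Content Coding (SCC) extension, and it significantly improves the coding efficiency of screen content. AVS3 and VVC have also adopted IBC to improve screen content coding performance. IBC exploits the spatial correlation of screen content video and uses encoded pixels of the current image to predict the pixels of the current block to be encoded, which effectively saves the bits needed to encode pixels. As shown in FIG. 4, the displacement between the current block and its reference block in IBC is called the BV (block vector). H.266/VVC uses BV prediction similar to inter prediction to further save the bits needed to encode the BV, and allows the BVD (Block Vector Difference) to be encoded at 1 or 4 pixel resolution.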
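III. ISC prediction mode
ISC divides a coding block into a series of pixel strings or unmatched pixels according to a certain scanning order (such as raster scan, back-and-forth scan or zig-zag scan). Similar to IBC, each string searches the encoded region of the current image for a reference string of the same shape, from which the prediction of the current string is derived; encoding the residual between the pixel values of the current string and the prediction, rather than the pixel values directly, effectively saves bits. FIG. 5 is a schematic diagram of intra string copy: the dark gray region is the encoded region, the 28 white pixels form string 1, the 35 light gray pixels form string 2, and the 1 black pixel is an unmatched pixel. The displacement between string 1 and its reference string is string vector 1 in FIG. 5; the displacement between string 2 and its reference string is string vector 2 in FIG. 5.
The intra string copy technique needs to encode, for each string in the current coding block, the SV, the string length and a flag indicating whether there is a matching string. The SV is the displacement from the string to be encoded to its reference string. The string length is the number of pixels the string contains. Implementations encode the string length in various ways; several examples follow (some may be combined): 1) encode the string length directly in the code stream; 2) encode in the code stream the number of pixels still to be processed after the string, so that the decoder computes the length of the current string as L=N-N1-N2 from the size N of the current block, the number N1 of pixels already processed and the decoded number N2 of pixels still to be processed; 3) encode in the code stream a flag indicating whether the string is the last string and, if it is, compute the length of the current string as L=N-N1 from the size N of the current block and the number N1 of pixels already processed. If a pixel finds no corresponding reference in the referable region, the pixel value of that unmatched pixel is encoded directly.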
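Method 2) above, L=N-N1-N2, can be traced over a whole block with a short Python sketch. The function name `lengths_from_remaining` is hypothetical, and the example assumes the 64-pixel block of FIG. 5 is processed in the order string 1, string 2, unmatched pixel, which the text does not state.

```python
def lengths_from_remaining(block_size, remaining_after_each):
    """Recover each part's length from the signalled remaining-pixel counts.

    block_size           -- N, the number of pixels in the current block
    remaining_after_each -- N2 for every part, in decoding order
    """
    lengths = []
    processed = 0                                    # N1
    for remaining in remaining_after_each:
        length = block_size - processed - remaining  # L = N - N1 - N2
        lengths.append(length)
        processed += length
    return lengths


# 28 + 35 + 1 = 64 pixels, as in the FIG. 5 example.
print(lengths_from_remaining(64, [36, 1, 0]))
```

The last entry shows why method 3) is possible: once a part is known to be the last one, N2 is 0 and the length reduces to L=N-N1, so nothing more needs to be signalled.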
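At the 73rd meeting of the AVS working group in June 2020, string prediction was adopted into the standard. The decoding process for string prediction in the current scheme is given below (underlined bold fields denote syntax elements to be decoded; fields with a capitalized first letter and no underline denote variables, whose values can be derived from decoded syntax elements; details unrelated to the present application are omitted):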
Figure PCTCN2021112133-appb-000001
Figure PCTCN2021112133-appb-000002
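Semantics of the relevant syntax elements:
1. Match type flag of string copy intra prediction, isc_match_type_flag[i]
Binary variable. A value of '1' indicates that the i-th part of the current coding unit is a string; a value of '0' indicates that the i-th part of the current coding unit is an unmatched pixel. IscMatchTypeFlag[i] is equal to the value of isc_match_type_flag[i]. If isc_match_type_flag[i] is not present in the bit stream, the value of IscMatchTypeFlag[i] is 0.
2. Last flag of string copy intra prediction, isc_last_flag[i]
Binary variable. A value of '1' indicates that the i-th part of the current coding unit is the last part of the current coding unit, and the length StrLen[i] of that part is equal to NumTotalPixel-NumCodedPixel; a value of '0' indicates that the i-th part of the current coding unit is not the last part of the current coding unit, and the length StrLen[i] of that part is equal to NumTotalPixel-NumCodedPixel-NumRemainingPixelMinus1[i]-1. IscLastFlag[i] is equal to the value of isc_last_flag[i].
3. Next remaining pixel count, next_remaining_pixel_in_cu[i]
The value of next_remaining_pixel_in_cu[i] is the number of pixels in the current coding unit that remain undecoded after the i-th part of the current coding unit has been decoded. The value of NextRemainingPixelInCu[i] is equal to the value of next_remaining_pixel_in_cu[i].
4. Y component value of an unmatched pixel of string copy intra prediction, isc_unmatched_pixel_y[i]
U component value of an unmatched pixel of string copy intra prediction, isc_unmatched_pixel_u[i]
V component value of an unmatched pixel of string copy intra prediction, isc_unmatched_pixel_v[i]
10-bit unsigned integer. The value of the Y, Cb or Cr component of the unmatched pixel of the i-th part of the current coding unit. IscUnmatchedPixelY[i], IscUnmatchedPixelU[i] and IscUnmatchedPixelV[i] are equal to the values of isc_unmatched_pixel_y[i], isc_unmatched_pixel_u[i] and isc_unmatched_pixel_v[i] respectively.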
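So that the decoder can learn the string length of the current string, the encoder needs to encode the string length information of the current string (information related to the string length, such as the syntax element next_remaining_pixel_in_cu introduced above). During encoding, the encoder binarizes the string length information to obtain the corresponding binary symbol string, and adds the binary symbol string to the code stream sent to the decoder. Correspondingly, when decoding the string length of the current string, the decoder obtains the binary symbol string from the code stream and performs inverse binarization on it to obtain the string length information of the current string, from which the string length of the current string is determined.
The inverse binarization methods for string length information in the AVS3 standard are described below.
1. Inverse binarization using a truncated unary code
The value of synElVal (the value recovered by inverse binarization) is obtained from the binary symbol string according to Table 1.
Table 1: relationship between synElVal and the binary symbol string (truncated unary code)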
Figure PCTCN2021112133-appb-000003
Figure PCTCN2021112133-appb-000004
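2. Inverse binarization using a unary code
The value of synElVal (the value recovered by inverse binarization) is obtained from the binary symbol string according to Table 2.
Table 2: relationship between the value of synElVal and the binary symbol string (unary code)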
Figure PCTCN2021112133-appb-000005
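Because Tables 1 and 2 are available here only as figure placeholders, the following Python sketch shows one common form of unary and truncated unary decoding rather than the exact tables. The bit polarity (a run of '0' bits closed by a '1') is an assumption and is exposed as the `terminator` parameter; the function name `decode_unary` is hypothetical.

```python
def decode_unary(bits, max_val=None, terminator=1):
    """Decode one unary or truncated unary codeword from an iterator of bits.

    bits       -- iterator yielding 0/1 values; consumed as it is read
    max_val    -- None for plain unary; otherwise the truncated unary maximum
    terminator -- bit that ends a codeword (assumed polarity, see lead-in)
    """
    if max_val == 0:
        return 0              # only one possible value: nothing is read
    value = 0
    for bit in bits:
        if bit == terminator:
            return value
        value += 1
        if value == max_val:
            return value      # truncated: the maximum needs no terminator
    raise ValueError("bit stream ended inside a unary codeword")


stream = iter([0, 0, 1, 1])
first = decode_unary(stream)    # reads 0, 0, 1
second = decode_unary(stream)   # reads the final 1
print(first, second)
```

The only difference between the two codes is the early exit at `max_val`, which saves the terminating bit for the largest value.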
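3. Inverse binarization of next_remaining_pixel_in_cu
If NumTotalPixel–NumCodedPixel is equal to 1, next_remaining_pixel_in_cu is not present in the bit stream and the value of NextRemainingPixelInCu is 0; otherwise (that is, NumPixelInCu–NumCodedPixel is greater than 1), next_remaining_pixel_in_cu is inverse binarized as follows:
The binary symbol string of next_remaining_pixel_in_cu consists of three parts.
Step 1 inverse binarizes the first part. First, maxValPrefix is computed as follows: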
if((NumTotalPixel–NumCodedPixel)<=5)maxValPrefix=1
else if((NumTotalPixel–NumCodedPixel)<=21)maxValPrefix=2
else if((NumTotalPixel–NumCodedPixel)<=277)maxValPrefix=3
else maxValPrefix=4
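Here, maxValPrefix is the index of the value interval containing the maximum value of the parameter to be recovered by inverse binarization (next_remaining_pixel_in_cu in this example).
The value of synElVal is obtained from Table 1 using the first part and maxVal=maxValPrefix, and a is set equal to the value of synElVal.
If a is equal to 0, the second and third parts are not present in the code stream and the value of next_remaining_pixel_in_cu is 0 (so the string length of the current string is NumTotalPixel–NumCodedPixel); otherwise (a is greater than 0), proceed to the next step.
Step 2 inverse binarizes the second part. First, maxValInfix is computed as follows: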
Figure PCTCN2021112133-appb-000006
Figure PCTCN2021112133-appb-000007
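Next, compute n=Ceil(Log(maxValInfix+1)). Then obtain the value b of synElVal from Table 3 using the second part and len=n-1. When len<1, the binary symbol string is empty (that is, it is not actually present in the bit stream, so the bit stream need not be parsed).
Table 3: relationship between synElVal and the binary symbol string (fixed-length code of length len)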
Figure PCTCN2021112133-appb-000008
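Step 3 inverse binarizes the third part. If b is less than 2^n-maxValInfix-1 or maxValInfix is equal to 0, k is set to 0; otherwise, k is set to 1. Then the value c of synElVal is obtained from Table 3 using the third part and len=k.
Finally, the value of next_remaining_pixel_in_cu is computed from d, b, c, k, n and maxValInfix as follows:
next_remaining_pixel_in_cu=d+(b<<k)+c-(2^n-maxValInfix-1)×k.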
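Steps 2 and 3 together behave like a truncated binary code with an offset d. The Python sketch below follows the formulas in this text under several assumptions: Log is taken as base 2, fixed-length codes are read most significant bit first (Table 3 is not reproduced here), and d and maxValInfix are passed in as arguments because their derivation appears only in figures that are not reproduced. All function names are hypothetical, and `encode_infix` is an assumed counterpart added solely to round-trip the decoder, since the text omits the encoder's steps.

```python
def max_val_prefix(num_total_pixel, num_coded_pixel):
    """Interval index used as maxVal for the first (prefix) part."""
    remaining = num_total_pixel - num_coded_pixel
    if remaining <= 5:
        return 1
    if remaining <= 21:
        return 2
    if remaining <= 277:
        return 3
    return 4


def read_fixed(bits, length):
    """Read `length` bits MSB first; a length below 1 reads nothing."""
    value = 0
    for _ in range(max(length, 0)):
        value = (value << 1) | next(bits)
    return value


def decode_infix(bits, max_val_infix, d):
    """Parts 2 and 3: d + (b << k) + c - (2^n - maxValInfix - 1) * k."""
    n = max_val_infix.bit_length()   # equals Ceil(Log2(maxValInfix + 1))
    b = read_fixed(bits, n - 1)      # second part, len = n - 1
    threshold = (1 << n) - max_val_infix - 1
    k = 0 if (b < threshold or max_val_infix == 0) else 1
    c = read_fixed(bits, k)          # third part, len = k
    return d + (b << k) + c - threshold * k


def encode_infix(val_infix, max_val_infix):
    """Assumed encoder counterpart: returns the bits of parts 2 and 3."""
    n = max_val_infix.bit_length()
    threshold = (1 << n) - max_val_infix - 1
    if val_infix < threshold or max_val_infix == 0:
        code, length = val_infix, n - 1          # short codeword
    else:
        code, length = val_infix + threshold, n  # long codeword
    return [(code >> i) & 1 for i in range(length - 1, -1, -1)]


# Round trip for maxValInfix = 4: values 0..2 use 2 bits, values 3..4 use 3.
for v in range(5):
    assert decode_infix(iter(encode_infix(v, 4)), 4, d=0) == v
```

The threshold 2^n-maxValInfix-1 is the number of values that fit in the shorter n-1 bit codeword; the remaining values spend one extra bit, which is why the decoder reads the third part only when k is 1.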
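For the encoder, the binarization of next_remaining_pixel_in_cu mainly proceeds as follows:
1. Determine a series of value intervals (for example intervals 1, 2, ..., N) according to the maximum number of remaining pixels max_remaining_pixel_in_cu, and record the index of the interval containing the value of next_remaining_pixel_in_cu. When next_remaining_pixel_in_cu is equal to 0, the index is set to 0. The index value is recorded as part 1 of the string length;
2. Denote max_remaining_pixel_in_cu minus the start value of its group as maxValInfix, and compute n=Ceil(Log(maxValInfix+1)). Denote next_remaining_pixel_in_cu minus the start value of its group as valInfix. Then, according to n and the value of 2^n-maxValInfix-1, valInfix is split into part 2 and part 3 for encoding (the specific steps are omitted here).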
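In the current intra string copy scheme, the string length resolution is 1 pixel, so a coding unit may be divided into substrings of any integer pixel length (the string length to be encoded can be 1, 2, 3, ...). Under this scheme a coding unit may be divided into fine-grained pixel strings whose positions may not be aligned with memory, so reconstructing the pixel strings requires frequent memory accesses, which affects coding efficiency. For example, assume a memory unit can process the data of 4 pixels in parallel. If the string length of the current string is 7, the data of its pixels may be spread over two or three memory units, in which case the decoder needs to access memory units 2 or 3 times to finish decoding the current string.
To improve the uniformity of pixel strings and their decoding efficiency, the present application proposes a video decoding method and a video encoding method. By using the string length resolution as the basis for dividing, encoding and decoding pixel strings, the length of pixel strings in a codec block can be limited to a multiple of the string length resolution. This improves the uniformity of the pixel strings, enables the encoder and decoder to operate under memory alignment, and improves the encoding and decoding efficiency of pixel strings. For example, assume a memory unit can process the data of 4 pixels in parallel and the string length resolution is accordingly set to 4. The length of a pixel string can then only be an integer multiple of 4, and misalignment with memory units cannot occur. If the string length of the current string is 8, the data of its pixels occupies exactly two memory units, filling both completely, and cannot be spread over three memory units in a way that would cost the decoder an extra memory access.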
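The memory access counts in the two examples above can be checked with a small Python sketch. It assumes pixels map linearly onto memory units of 4 pixels and that the block starts on a unit boundary; the function name `memory_units_touched` is hypothetical.

```python
def memory_units_touched(start, length, unit=4):
    """Number of memory units that a string of `length` pixels overlaps.

    start  -- index of the string's first pixel within the block
    length -- string length in pixels
    unit   -- pixels handled by one memory unit (assumed to be 4)
    """
    first_unit = start // unit
    last_unit = (start + length - 1) // unit
    return last_unit - first_unit + 1


# Resolution 1: a length-7 string costs 2 or 3 accesses depending on its start.
print(memory_units_touched(0, 7), memory_units_touched(3, 7))
# Resolution 4: starts and lengths are multiples of 4, so length 8 costs 2.
print(memory_units_touched(0, 8), memory_units_touched(4, 8))
```

When every string length is a multiple of the unit size, every string also starts on a unit boundary, so a string of length L always touches exactly L/4 units.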
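On this basis, taking into account the influence of the string length resolution on the coding of string length information, the present application proposes a method for binarizing and inverse binarizing string length information according to the string length resolution, which improves the way string length information is encoded and decoded under different string length resolutions. Specifically, when binarizing the value of the string length information, the encoder may first compress the value using the string length resolution and then binarize the compressed value (the quotient of the value divided by the string length resolution), rather than binarizing the value directly. Correspondingly, inverse binarization at the decoder recovers the compressed value, and the value of the string length information is then obtained from the compressed value and the string length resolution (by multiplying the two). This reduces the number of symbols needed for the binarized representation, which lowers encoding and decoding complexity and benefits codec performance.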
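The compression by the string length resolution and its effect on codeword size can be sketched in Python as follows. The function names are hypothetical, and the bit count uses a plain fixed-length code purely as an illustration; the binarizations actually described in this application are more elaborate.

```python
def compress(value, resolution):
    """Encoder side: divide the string length information by the resolution."""
    if value % resolution != 0:
        raise ValueError("value must be a multiple of the resolution")
    return value // resolution


def decompress(code, resolution):
    """Decoder side: multiply the recovered code by the resolution."""
    return code * resolution


def fixed_length_bits(max_value):
    """Bits needed by a fixed-length code for values 0..max_value."""
    return max_value.bit_length()


resolution = 4
code = compress(28, resolution)
assert decompress(code, resolution) == 28

# For a 64-pixel block: 7 bits for raw values up to 64, 5 bits for codes up to 16.
print(fixed_length_bits(64), fixed_length_bits(compress(64, resolution)))
```

Under these assumptions, a resolution of 4 removes two bits from every fixed-length codeword, which illustrates the reduction in binary symbols the paragraph above describes.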
如图6所示,其示出了本申请一个实施例提供的通信系统的简化框图。通信系统600包括多个设备,所述设备可通过例如网络650彼此通信。举例来说,通信系统600包括通过网络650互连的第一设备610和第二设备620。在图6的实施例中,第一设备610和第二设备620执行单向数据传输。举例来说,第一设备610可对视频数据例如由第一设备610采集的视频图片流进行编码以通过网络650传输到第二设备620。已编码的视频数据以一个或多个已编码视频码流形式传输。第二设备620可从网络650接收已编码视频数据,对已编码视频数据进行解码以恢复视频数据,并根据恢复的视频数据显示视频图片。单向数据传输在媒体服务等应用中是较常见的。
在另一实施例中,通信系统600包括执行已编码视频数据的双向传输的第三设备630和第四设备640,所述双向传输可例如在视频会议期间发生。对于双向数据传输,第三设备630和第四设备640中的每个设备可对视频数据(例如由设备采集的视频图片流)进行编码,以通过网络650传输到第三设备630和第四设备640中的另一设备。第三设备630和第四设备640中的每个设备还可接收由第三设备630和第四设备640中的另一设备传输的已编码视频数据,且可对所述已编码视频数据进行解码以恢复视频数据,且可根据恢复的视频数据在可访问的显示装置上显示视频图片。
在图6的实施例中,第一设备610、第二设备620、第三设备630和第四设备640可为服务器、个人计算机和智能电话等计算机设备,但本申请公开的原理可不限于此。本申请实施例适用于PC(Personal Computer,个人计算机)、手机、平板电脑、媒体播放器和/或专用视频会议设备。网络650表示在第一设备610、第二设备620、第三设备630和第四设备640之间传送已编码视频数据的任何数目的网络,包括例如有线连线的和/或无线通信网络。通信网络650可在电路交换和/或分组交换信道中交换数据。该网络可包括电信网络、局域网、广域网和/或互联网。出于本申请的目的,除非在下文中有所解释,否则网络650的架构和拓扑对于本申请公开的操作来说可能是无关紧要的。
作为实施例,图7示出视频编码器和视频解码器在流式传输环境中的放置方式。本申请所公开主题可同等地适用于其它支持视频的应用,包括例如视频会议、数字TV(电视)、在包括CD(Compact Disc,光盘)、DVD(Digital Versatile Disc,数字通用光盘)、存储棒等的数字介质上存储压缩视频等等。
流式传输系统可包括采集子系统713,所述采集子系统可包括数码相机等视频源701,所述视频源创建未压缩的视频图片流702。在实施例中,视频图片流702包括由数码相机拍摄的样本。相较于已编码的视频数据704(或已编码的视频码流),视频图片流702被描绘为粗线以强调高数据量的视频图片流,视频图片流702可由电子装置720处理,所述电子装置720 包括耦接到视频源701的视频编码器703。视频编码器703可包括硬件、软件或软硬件组合以实现或实施如下文更详细地描述的所公开主题的各方面。相较于视频图片流702,已编码的视频数据704(或已编码的视频码流704)被描绘为细线以强调较低数据量的已编码的视频数据704(或已编码的视频码流704),其可存储在流式传输服务器705上以供将来使用。一个或多个流式传输客户端子系统,例如图7中的客户端子系统706和客户端子系统708,可访问流式传输服务器705以检索已编码的视频数据704的副本707和副本709。客户端子系统706可包括例如电子装置730中的视频解码器710。视频解码器710对已编码的视频数据的传入副本707进行解码,且产生可在显示器712(例如显示屏)或另一呈现装置(未描绘)上呈现的输出视频图片流711。在一些流式传输系统中,可根据某些视频编码/压缩标准对已编码的视频数据704、视频数据707和视频数据709(例如视频码流)进行编码。
应注意,电子装置720和电子装置730可包括其它组件(未示出)。举例来说,电子装置720可包括视频解码器(未示出),且电子装置730还可包括视频编码器(未示出)。其中,视频解码器用于对接收到的已编码视频数据进行解码;视频编码器用于对视频数据进行编码。
需要说明的一点是,本申请实施例提供的技术方案可以应用于H.266/VVC标准、H.265/HEVC标准、AVS(如AVS3)或者下一代视频编解码标准中,本申请实施例对此不作限定。
还需要说明的一点是,本申请实施例提供的视频解码方法,各步骤的执行主体可以是解码端设备。本申请实施例提供的视频编码方法,各步骤的执行主体可以是编码端设备。在ISC模式下的视频解码过程中,可以采用本申请实施例提供的解码方案,解码得到当前串的串长度。在ISC模式下的视频编码过程中,可以采用本申请实施例提供的编码方案,对当前串的串长度进行编码。解码端设备和编码端设备均可以是计算机设备,该计算机设备是指具备数据计算、处理和存储能力的电子设备,如PC、手机、平板电脑、媒体播放器、专用视频会议设备、服务器等等。
另外,本申请所提供的方法可以单独使用或以任意顺序与其他方法合并使用。基于本申请所提供方法的编码器和解码器,可以由1个或多个处理器或是1个或多个集成电路来实现。下面,通过几个实施例对本申请技术方案进行介绍说明。
请参考图8,其示出了本申请一个实施例提供的视频解码方法的流程图。该方法可应用于解码端设备中,即该方法可以由解码端设备执行。该方法可以包括如下几个步骤(801~803):
步骤801,从码流中解码得到当前串的串长度信息的二元符号串。
码流是指视频经过编码后生成的数据流,其可以采用一系列的二进制数据0和1进行表示。在一些标准中,码流也称为位流(bitstream),是编码图像所形成的二进制数据流。
对于解码过程来说,当前串是指当前解码的像素串。像素串是指一定数量的像素组成的像素序列。可选地,像素串是有限个二进制位的数据构成的有序序列。在ISC模式中,一个CU可以被划分成若干个像素串。对于视频解码来讲,为了恢复像素串中各个像素的像素值,需要先确定各个像素串的串长度。
串长度信息是指码流中与像素串的串长度相关的信息,用于确定像素串的串长度。可选地,当前串的串长度信息包括与当前串的串长度相关的信息,用于确定当前串的串长度。
串长度信息的二元符号串是指对串长度信息进行二值化处理后得到的一个二进制串,该二元符号串中可能出现的字符只有0和1。
步骤802,根据当前串的串长度分辨率对二元符号串进行反二值化处理,得到串长度信息。
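In the embodiments of this application, de-binarization takes into account the effect of the string length resolution on the coding of the string length information. The string length resolution (SLR) is the minimum string length used when a CU is partitioned into pixel strings, that is, the smallest value the string length may take. For example, a string length resolution of 4 means the minimum string length of a pixel string is 4. Optionally, the string length resolution is denoted N, where N is a positive integer. Optionally, N is an integer greater than 1. When the string length resolution is N, the string length of a pixel string is an integer multiple of N, for example N, 2N, 3N, 4N, 5N, and so on. For instance, when the string length resolution is 4, the string length of a pixel string may be 4, 8, 12, 16, 20, and so on.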
可选地,步骤802可以包括如下两个子步骤:
1、对二元符号串进行反二值化处理,得到基于串长度分辨率压缩后的串长度信息;
2、根据压缩后的串长度信息和串长度分辨率,确定串长度信息。
例如,编码端在对串长度信息的值进行二值化处理时,可以先采用串长度分辨率对该值进行压缩,然后对压缩后的值(即对该值除以串长度分辨率得到的商)进行二值化处理,而不是直接对该值进行二值化处理。相应地,解码端在进行反二值化处理时,通过反二值化恢复得到的是压缩后的值,然后基于该压缩后的值和串长度分辨率(即将压缩后的值与串长度分辨率相乘)得到该串长度信息的值。
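The compress-then-binarize idea can be sketched as follows. The helper names `compress` and `restore` are hypothetical, and `int.bit_length()` is used only as a rough proxy for the number of binary symbols a fixed-length binarization would need; the application does not prescribe this measure.

```python
def compress(value: int, slr: int) -> int:
    """Encoder side: divide the value by the string length resolution (SLR)."""
    if slr < 1 or value % slr != 0:
        raise ValueError("value must be a multiple of the string length resolution")
    return value // slr


def restore(compressed: int, slr: int) -> int:
    """Decoder side: multiply the de-binarized value by the SLR."""
    return compressed * slr


slr = 4
string_length = 12
code = compress(string_length, slr)        # quotient that gets binarized
bits_before = string_length.bit_length()   # bits needed for 12 (0b1100)
bits_after = code.bit_length()             # bits needed for 3  (0b11)
recovered = restore(code, slr)
```

In this example the compressed value needs two bits instead of four, and the decoder recovers the original length by a single multiplication.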
另外,在本申请实施例中,提供了多种对二元符号串进行反二值化处理,得到基于串长度分辨率压缩后的串长度信息的方式,具体请参见下文实施例中的介绍说明。
步骤803,根据串长度信息确定当前串的串长度。
当前串的串长度是指当前串中所包含的像素的数量。
在一个示例中,当前串的串长度信息即包括当前串的串长度。
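In another example, the string length information of the current string includes the number of pixels remaining in the decoding block to which the current string belongs after the current string is decoded. The decoder can then obtain the total number of pixels of that decoding block and the number of already decoded pixels in it, and determine the string length of the current string from the total pixel count, the decoded pixel count, and the number of pixels remaining after the current string is decoded. Denoting the total pixel count as M, the decoded pixel count as M_2, and the number of pixels remaining after decoding the current string as M_1, the string length of the current string is L = M - M_1 - M_2.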
综上所述,本申请实施例提供的技术方案,通过将串长度分辨率作为像素串的划分与编解码依据,可限制编解码块内像素串的长度为串长度分辨率的倍数,提升了像素串的整齐度,使得编解码端能够在内存对齐的条件下进行编解码,提升了像素串的编解码效率。
另外,本申请考虑到串长度分辨率对串长度信息编解码的影响,提出了根据串长度分辨率对串长度信息进行二值化和反二值化处理的方法,该方法改进了不同串长度分辨率下串长度信息的编解码方式。具体来讲,编码端在对串长度信息的值进行二值化处理时,可以先采用串长度分辨率对该值进行压缩,然后对压缩后的值(即对该值除以串长度分辨率得到的商)进行二值化处理,而不是直接对该值进行二值化处理。相应地,解码端在进行反二值化处理时,通过反二值化恢复得到的是压缩后的值,然后基于该压缩后的值和串长度分辨率(即将压缩后的值与串长度分辨率相乘)得到该串长度信息的值。这样,能够减少二值化表示所需的字符数量,从而能够降低编解码复杂度,有利于编解码性能的提升。
下面,对解码端在解码过程中,确定当前串的串长度分辨率的方式进行介绍说明。在本申请实施例中,示例性提供了以下几种确定当前串的串长度分辨率的方式。
方式一:将第一预设值确定为当前串的串长度分辨率。上述第一预设值是指预先设置好的串长度分辨率的数值,例如,该第一预设值可以在协议中进行预定义。解码端在对当前串进行解码时,将第一预设值确定为当前串的串长度分辨率,无需从码流中获取当前串的串长度分辨率。
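Method 2: decode the string length resolution of the current string from the sequence header of the image sequence to which the current string belongs. In some standards, the image sequence is also called a video sequence (sequence); it is the highest-level syntax structure of the coded bitstream and comprises one or more consecutive coded pictures. Optionally, an image sequence begins with the first sequence header, and a sequence end code or video edit code marks the end of the image sequence. Any sequence header between the first sequence header of the image sequence and the first sequence end code or video edit code is a repeated sequence header. Optionally, each sequence header is followed by one or more coded pictures, and each picture is preceded by a picture header. Optionally, coded pictures are arranged in the bitstream in bitstream order, which is the same as the decoding order. The decoding order may differ from the display order. The sequence header of the image sequence contains information used to decode that image sequence. For example, the sequence header may be a special reserved field of defined bit length placed in front of the data sequence that corresponds to the image sequence in the bitstream. In this example, the sequence header also includes the string length resolution. Optionally, all strings in the image sequence to which the current string belongs have the same string length resolution, namely the one decoded from the sequence header of that image sequence. In one example, the decoder decodes indication information (such as an index, a syntax element, or other indication information) in the sequence header of the image sequence, and this indication information indicates the string length resolution of all strings in the image sequence.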
方式三:从当前串所属图像的图像头中,解码得到当前串的串长度分辨率。上述图像是指视频中的单个图像帧。在一些标准中,一幅图像可以是一帧或者一场。可选地,上述图像是编码图像,上述编码图像是一幅图像的编码表示。上述图像的图像头中包含一些用于解码该图像相关的信息。例如,图像的图像头是附加在上述图像在码流中对应的数据序列前面的、定义位长度的特殊保留字段。在本示例中,图像的图像头中还包括串长度分辨率。可选地,当前串所属图像中包含的各个串的串长度分辨率相同,均为从该图像的图像头中解码得到的串长度分辨率。在一个示例中,解码端在图像的图像头中解码一个指示信息(如索引、语法元素或者其他指示信息),该指示信息指示了该图像中所有串的串长度分辨率。
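Method 4: decode the string length resolution of the current string from the patch header of the patch to which the current string belongs. A patch is a number of adjacent largest coding units arranged in raster scan order. Raster scan maps a two-dimensional rectangular raster onto a one-dimensional raster: the one-dimensional raster starts with the first row of the two-dimensional raster, followed by the second row, the third row, and so on, with each row scanned from left to right. The patch header contains information used to decode the patch. For example, the patch header is a special reserved field of defined bit length placed in front of the data sequence that corresponds to the patch in the bitstream. In this example, the patch header also includes the string length resolution. Optionally, all strings in the patch to which the current string belongs have the same string length resolution, namely the one decoded from the patch header. In one example, the decoder decodes indication information (such as an index, a syntax element, or other indication information) in the patch header, and this indication information indicates the string length resolution of all strings in the patch.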
方式五:从当前串所属最大编码单元LCU的编码信息中,解码得到当前串的串长度分辨率。在一些标准中,LCU包括一个L*L的亮度样值块和对应的色度样值块,由图像划分得到。上述LCU的编码信息中包含一些用于解码该LCU相关的信息。例如,上述LCU的编码信息是附加在上述LCU在码流中对应的数据序列前面的、定义位长度的特殊保留字段。一个LCU可以包括多个CU。在本示例中,上述LCU的编码信息还包括串长度分辨率。可选地,当前串所属LCU中包含的各个串的串长度分辨率相同,均为从LCU的编码信息中解码得到的串长度分辨率。在一个示例中,解码端在LCU的编码信息中解码一个指示信息(如索引、语法元素或者其他指示信息),该指示信息指示了该LCU中所有串的串长度分辨率。
方式六:从当前串所属编码单元CU的编码信息中,解码得到当前串的串长度分辨率。上述CU的编码信息中包含一些用于解码该CU相关的信息。例如,上述CU的编码信息是附加在上述CU在码流中对应的数据序列前面的、定义位长度的特殊保留字段。在本示例中,上述CU的编码信息还包括串长度分辨率。可选地,当前串所属CU中包含的各个串的串长度分辨率相同,均为从CU的编码信息中解码得到的串长度分辨率。在一个示例中,解码端在CU的编码信息中解码一个指示信息(如索引、语法元素或者其他指示信息),该指示信息指示了该CU中所有串的串长度分辨率。
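Method 7: decode the string length resolution of the current string from the coding information of the current string. The coding information of the current string contains information used to decode the current string. For example, it is a special reserved field of defined bit length placed in front of the data sequence that corresponds to the current string in the bitstream. In this example, the coding information of the current string also includes the string length resolution of the current string. In one example, the decoder decodes indication information (such as an index, a syntax element, or other indication information) in the coding information of the current string, and this indication information indicates the string length resolution of the current string. In this way, the string length resolution of different strings can be indicated separately in their own coding information, which is more flexible.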
方式八:根据当前串所属解码块的尺寸,确定当前串的串长度分辨率。在一些标准中,上述解码块(block)是一个M*N(M列N行)的样值矩阵或者变换系数矩阵。可选地,当前串所属解码块可以是当前串所属CU。可选地,获取当前串所属解码块的尺寸,其中当前串所属解码块的尺寸包括当前串所属解码块的高度或者宽度。在一个示例中,对于大小为4x4的块,串长度分辨率N=1;对于大小为16x16的块,串长度分辨率N=2;对于面积(宽x高)大于128的块,串长度分辨率N=2。
方式九:根据当前串对应的颜色分量和色度格式,确定当前串的串长度分辨率。上述色度格式是指像素采用的颜色编码格式。在一些标准中,色度格式(chroma_format)是2位无符号整数,规定色度分量的格式。上述颜色分量是指在色度格式下像素的色度分量。可选地,当前视频中像素采用RGB格式或者YUV格式。在一个示例中,在YUV 4:2:0格式的视频中,在已确定亮度分量的串长度分辨率N=4的情况下,色度分量的串长度分辨率N=2。
方式十:在当前串所属CU中已解码串的数量大于或等于第一阈值的情况下,将第二预设值确定为当前串的串长度分辨率。上述第一阈值是预设值,是本方式中用于确定当前串的串长度分辨率的依据。可选地,上述第一阈值可根据CU的规格来确定,不同规格的CU对应的第一阈值可以相同,也可以不同。上述第二预设值是指预先设置好的串长度分辨率的数值,适用于当前串所属CU中已解码串的数量大于或等于第一阈值的情况,该第二预设值可以在协议中预先规定。在一个示例中,假设当前CU中已解码串的数量为N1,在N1大于或等于第一阈值的情况下,可确定当前串的串长度分辨率为第二预设值4。另外,在当前串所属CU中已解码串的数量小于第一阈值的情况下,可以采用本申请实施例介绍的其他方式确定当前串的串长度分辨率,或者也可以将不同于第二预设值的另一预设值确定为当前串的串长度分辨率,本申请实施例对此不作限定。
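Method 11: when the number of decoded unmatched pixels in the CU to which the current string belongs is greater than or equal to a second threshold, the third preset value is determined as the string length resolution of the current string. An unmatched pixel is a pixel for which matching failed, that is, a pixel that does not match the pixel at the corresponding position in the reference string of the current string. The second threshold is a preset value and is the basis used in this method for deciding the string length resolution of the current string. Optionally, the second threshold may be determined according to the number of decoded unmatched pixels in the CU to which the current string belongs; CUs with different numbers of such pixels may have the same or different second thresholds. The third preset value is a preconfigured string length resolution value that applies when the number of decoded unmatched pixels in the CU to which the current string belongs is greater than or equal to the second threshold; it may be specified in advance in a protocol. In one example, let the number of decoded unmatched pixels in the current CU be N2; when N2 is greater than or equal to the second threshold, the string length resolution of the current string is determined to be the third preset value. When the number of decoded unmatched pixels in the CU is less than the second threshold, the string length resolution of the current string may be determined by another method described in the embodiments of this application, or another preset value different from the third preset value may be used; the embodiments of this application do not limit this.
Method 12: when the number of undecoded pixels in the CU to which the current string belongs is less than or equal to a third threshold, the fourth preset value is determined as the string length resolution of the current string. The third threshold is a preset value and is the basis used in this method for deciding the string length resolution of the current string. Optionally, the third threshold may be determined according to the number of decoded unmatched pixels in the CU to which the current string belongs; CUs with different numbers of such pixels may have the same or different third thresholds. The fourth preset value is a preconfigured string length resolution value that applies when the number of undecoded pixels in the CU to which the current string belongs is less than or equal to the third threshold; it may be specified in advance in a protocol. When the number of undecoded pixels in the CU is greater than the third threshold, the string length resolution of the current string may be determined by another method described in the embodiments of this application, or another preset value different from the fourth preset value may be used; the embodiments of this application do not limit this.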
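The following describes how the binary symbol string of the string length information of the current string is de-binarized to obtain the string length information compressed based on the string length resolution. The embodiments of this application provide, by way of example, the following de-binarization methods.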
方式一,压缩后的串长度信息包括三部分,基于该三部分的值确定出压缩后的串长度信息的值。令压缩后的串长度信息的最大值为max_val,可以包括如下几个步骤:
1、确定压缩后的串长度信息的最大值,根据该最大值确定多个数值区间;
上述多个数值区间是一系列数值为整数的区间。可选地,上述多个数值区间可以记为R0,R1,R2,…,Rn。其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数。最大值max_val的值满足Rn_start≤max_val<Rn_end。
2、从码流中解码得到压缩后的串长度信息的索引x;
在本申请实施例中,假设从码流中解码得到压缩后的串长度信息的索引记为x,该索引指示的是待恢复的数值val所在的数值区间。
可选地,采用截断一元码的方式从码流中解码得到压缩后的串长度信息的索引。
3、根据索引x所对应的数值区间Rx,确定压缩后的串长度信息的第一部分的值为Rx_start;
在本申请实施例中,串长度信息的第一部分记为val_part1。
可选地,在数值区间Rx中只有一个整数值,即Rx_end-Rx_start的值等于1的情况下,val=val_part1,结束反二值化过程。
可选地,索引采用CABAC的方式进行熵解码,索引的每个二进制位具有对应的上下文模型。
可选地,根据压缩后的串长度信息的最大值(即max_val的值),确定压缩后的串长度信息(即val)的索引的上下文模型,该上下文模型用于对索引采用CABAC的方式进行熵解码。例如,根据max_val所在的数值区间选择一套上下文模型。
4、根据最大值以及第一部分的值,计算得到最大值余量和最大值余量的位数;
其中,最大值余量max_val_infix=max_val-Rx_start和最大值余量的位数n=Ceil(Log(max_val_infix+1));其中,Ceil()表示向上取整,Log()表示取对数。最大值余量的位数n是指用定长码的方式来编码该最大值余量所需要的比特数目。
之后,根据最大值余量的位数,确定压缩后的串长度信息的第二部分的值。在一些实施例中,通过如下步骤5~7确定第二部分的值。
5、基于最大值余量的位数确定第一比特长度;
第一比特长度len=n-1。
6、在第一比特长度小于1的情况下,确定压缩后的串长度信息的第二部分的值为0;
在本申请实施例中,串长度信息的第二部分记为val_part2。
在第一比特长度len<1的情况下,确定val_part2=0。
7、在第一比特长度大于等于1的情况下,从码流中解码第一比特长度的数据,按照定长码的方式对第一比特长度的数据进行反二值化处理,得到压缩后的串长度信息的第二部分的值;
在第一比特长度len≥1的情况下,从码流中解码len位比特,按照定长码的方式对该len位比特进行反二值化处理,得到val_part2的值。
之后,根据第二部分的值和最大值余量,确定压缩后的串长度信息的第三部分的值。在一些实施例中,通过如下步骤8~10确定第三部分的值。
8、在第二部分的值满足第一条件或者最大值余量等于0的情况下,将目标值设定为0;否则,将所述目标值设定为1;
Optionally, when val_part2 < 2^n - max_val_infix - 1 or max_val_infix = 0, the target value k is set to 0; otherwise, k is set to 1.
9、在目标值k=0的情况下,确定压缩后的串长度信息的第三部分的值为0;
在本申请实施例中,串长度信息的第三部分记为val_part3。
10、在目标值k=1的情况下,确定第二比特长度为1,从码流中解码第二比特长度的数据,按照定长码的方式对该第二比特长度的数据进行反二值化处理,得到压缩后的串长度信息的第三部分的值;
设第二比特长度len=1,从码流中解码len位比特,按照定长码的方式对该len位比特进行反二值化处理,得到val_part3的值。
11、根据第一部分的值、第二部分的值、第三部分的值、最大值余量的位数和最大值余量,计算压缩后的串长度信息的值。
可选地,根据第一部分的值、第二部分的值、第三部分的值、目标值、最大值余量的位数和最大值余量,计算压缩后的串长度信息的值。
即,根据val_part1、val_part2、val_part3、k、n和max_val_infix,计算压缩后的串长度信息的值。
Optionally, the value of the compressed string length information is val = val_part1 + (val_part2 << k) + val_part3 - (2^n - max_val_infix - 1) × k, where << denotes a left shift.
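Steps 1 to 11 of Method 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative procedure: `BitReader`, `decode_parts_2_and_3`, and `decode_method_one` are hypothetical names; the interval boundaries are supplied by the caller because this section does not fix them; Log() is assumed to be base 2; and the interval index is assumed to be a truncated unary code of ones terminated by a zero.

```python
class BitReader:
    """Reads bits from a string of '0'/'1' characters (illustrative only)."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read_bits(self, count: int) -> int:
        if count <= 0:
            return 0
        chunk = self.bits[self.pos:self.pos + count]
        if len(chunk) < count:
            raise EOFError("bitstream exhausted")
        self.pos += count
        return int(chunk, 2)


def decode_parts_2_and_3(reader: BitReader, max_val_infix: int) -> int:
    """Steps 4-10: (val_part2 << k) + val_part3 - (2^n - max_val_infix - 1) * k."""
    n = max_val_infix.bit_length()      # equals Ceil(log2(max_val_infix + 1))
    length = n - 1                      # first bit length
    val_part2 = reader.read_bits(length) if length >= 1 else 0
    threshold = (1 << n) - max_val_infix - 1
    k = 0 if (val_part2 < threshold or max_val_infix == 0) else 1
    val_part3 = reader.read_bits(1) if k == 1 else 0
    return (val_part2 << k) + val_part3 - threshold * k


def decode_method_one(reader: BitReader, max_val: int, intervals) -> int:
    """`intervals` lists (start, end) pairs R0..Rn, with max_val inside Rn."""
    last_index = len(intervals) - 1
    x = 0
    while x < last_index and reader.read_bits(1) == 1:  # truncated unary index
        x += 1
    start, end = intervals[x]
    if end - start == 1:                # single-value interval: val = val_part1
        return start
    return start + decode_parts_2_and_3(reader, max_val - start)


# Hypothetical intervals R0=[0,1), R1=[1,5), R2=[5,21) with max_val = 12.
example_intervals = [(0, 1), (1, 5), (5, 21)]
decoded = decode_method_one(BitReader("10010"), 12, example_intervals)
```

With these example intervals, the bits `10` select R1 (val_part1 = 1) and the following `010` give val_part2 = 2 with k = 0, so the decoded value is 3. Parts 2 and 3 together behave like a truncated binary code over the range 0 to max_val_infix.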
方式二,压缩后的串长度信息包括两部分,基于该两部分的值确定出压缩后的串长度信息的值。令压缩后的串长度信息的最大值为max_val,可以包括如下几个步骤:
1、确定压缩后的串长度信息的最大值和最大值的位数;
压缩后的串长度信息的最大值max_val,该最大值max_val的位数n=Ceil(Log(max_val+1))。
之后,根据最大值的位数,确定压缩后的串长度信息的第一部分的值。在一些实施例中,通过如下步骤2~4确定第一部分的值。
2、基于最大值的位数确定第三比特长度;
第三比特长度len=n-1。
3、在第三比特长度小于1的情况下,确定压缩后的串长度信息的第一部分的值为0;
在本申请实施例中,串长度信息的第一部分记为val_part1。
在len<1的情况下,确定val_part1=0。
4、在第三比特长度大于等于1的情况下,从码流中解码第三比特长度的数据,按照定长码的方式对该第三比特长度的数据进行反二值化处理,得到压缩后的串长度信息的第一部分的值;
在len≥1的情况下,从码流中解码len位比特,按照定长码的方式对len位比特进行反二值化处理,得到val_part1的值。
5、根据最大值以及第一部分的值,计算得到最大值余量;
其中,最大值余量max_val_infix=max_val-val_part1。
之后,根据第一部分的值和最大值余量,确定压缩后的串长度信息的第二部分的值。在一些实施例中,通过如下步骤6~8确定第二部分的值。
6、在第一部分的值满足第二条件或者最大值余量等于0的情况下,将目标值设定为0;否则,将目标值设定为1;
When val_part1 < 2^n - max_val_infix - 1 or max_val_infix = 0, the target value k is set to 0; otherwise, k is set to 1.
7、在目标值k=0的情况下,确定压缩后的串长度信息的第二部分的值为0;
在本申请实施例中,串长度信息的第二部分记为val_part2。
在k=0的情况下,确定val_part2=0。
8、在目标值k=1的情况下,确定第四比特长度为1,从码流中解码第四比特长度的数据,按照定长码的方式对该第四比特长度的数据进行反二值化处理,得到压缩后的串长度信息的第二部分的值;
设第四比特长度len=1,从码流中解码len位比特,按照定长码的方式对len位比特进行反二值化处理,得到val_part2的值;
9、根据第一部分的值、第二部分的值、最大值的位数和最大值,计算压缩后的串长度信息的值。
可选地,根据第一部分的值、第二部分的值、目标值、最大值的位数和最大值,计算压缩后的串长度信息的值。
即,根据val_part1、val_part2、k、n和max_val,计算所述压缩后的串长度信息的值。
Optionally, the value of the compressed string length information is val = (val_part1 << k) + val_part2 - (2^n - max_val - 1) × k, where << denotes a left shift.
方式三,按照k阶指数哥伦布码的方式对二元符号串进行反二值化处理,得到压缩后的串长度信息。
K阶指数哥伦布码,主要的编码格式为【前缀0】【1】【bit信息】的结构。分别计算出了前缀0的长度(即前缀有多少个0)、1的个数以及bit信息,就完成了整个编码。编码步骤如下:
(1)将待编码的数据以二进制的形式表示,去掉最低位的k个比特,然后加1,得到新的值T1,查看T1含多少个bit,将该值减1,得到的便是前缀0的个数;
(2)将第(1)步中去掉的最低K个比特位加到T1后,暂称其为T2;
(3)在T2前增加前缀0,至此编码完成。
另外,K阶指数哥伦布码的反二值化方法如下:
解析k阶指数哥伦布码时,首先从位流的当前位置开始寻找第一个非零位,并将找到的零位个数记为leadingZeroBits,然后根据leadingZeroBits计算CodeNum。用伪代码描述如下:
leadingZeroBits = -1;
for (b = 0; !b; leadingZeroBits++)
    b = read_bits(1);
CodeNum = 2^(leadingZeroBits+k) - 2^k + read_bits(leadingZeroBits+k);
表4给出了0阶、1阶、2阶和3阶指数哥伦布码的结构。指数哥伦布码的位串分为“前缀”和“后缀”两部分。前缀由leadingZeroBits个连续的‘0’和一个‘1’构成。后缀由leadingZeroBits+k个位构成,即表中的xi串,xi的值为‘0’或‘1’。
表4 k阶指数哥伦布码表
Figure PCTCN2021112133-appb-000009
Figure PCTCN2021112133-appb-000010
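The encoding steps and the parsing pseudocode for the k-th order exponential Golomb code can be written as a small round-trip sketch in Python. The function names are hypothetical, and bit strings are represented as Python strings of '0' and '1' for readability rather than as a real bitstream.

```python
def exp_golomb_encode(value: int, k: int) -> str:
    """Encode a non-negative integer as a k-th order Exp-Golomb bit string."""
    t1 = (value >> k) + 1                       # drop the k low bits, add 1
    leading_zero_bits = t1.bit_length() - 1     # number of prefix zeros
    low = format(value & ((1 << k) - 1), f"0{k}b") if k > 0 else ""
    return "0" * leading_zero_bits + format(t1, "b") + low


def exp_golomb_decode(bits: str, k: int) -> int:
    """Decode one k-th order Exp-Golomb codeword from the start of `bits`."""
    pos = 0
    leading_zero_bits = 0
    while bits[pos] == "0":                     # count zeros before the first '1'
        leading_zero_bits += 1
        pos += 1
    pos += 1                                    # skip the '1' that ends the prefix
    width = leading_zero_bits + k               # suffix length
    suffix = int(bits[pos:pos + width], 2) if width > 0 else 0
    return (1 << (leading_zero_bits + k)) - (1 << k) + suffix


codeword = exp_golomb_encode(4, 1)
value = exp_golomb_decode(codeword, 1)
```

For value 4 with k = 1, T1 is 3 (binary 11), so one prefix zero is emitted, followed by 11 and the dropped low bit 0. Decoding then computes 2^2 - 2^1 + 2 = 4.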
方式四,按照一元码或截断一元码的方式对二元符号串进行反二值化处理,得到压缩后的串长度信息。
一元码的编码规则是,对于待编码的符号“x”>=0,编码为x个“1”再加一个“0”编码组成。如果编码端按照一元码对压缩后的串长度信息进行二值化处理,则可以在确定出压缩后的串长度信息的值val(val=synElVal)之后,查询上述表2得到相应的二元符号串;相应地,解码端解码得到上述二元符号串之后,查询上述表2恢复出压缩后的串长度信息的值val。
截断一元码属于一元码的变体,用在已知待编码的语法元素的最大值Max的情况下,此处的最大值Max即为压缩后的串长度信息的最大值max_val。假设待编码符号为x:如果0≤x<Max,x二值化采用一元码的方式;如果x=Max,x二值化的二进制串全部由1组成,长度为Max。如果编码端按照截断一元码对压缩后的串长度信息进行二值化处理,则可以在确定出压缩后的串长度信息的值val(val=synElVal)之后,查询上述表1得到相应的二元符号串;相应地,解码端解码得到上述二元符号串之后,查询上述表1恢复出压缩后的串长度信息的值val。
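The unary and truncated unary rules described above can be sketched as follows. The function names are hypothetical, and the bit polarity (ones terminated by a zero) follows the wording of this section; Tables 1 and 2 are not reproduced here, so the sketch should be read as an assumption about their content.

```python
from typing import Optional


def truncated_unary_encode(x: int, max_value: Optional[int] = None) -> str:
    """Unary code when max_value is None, truncated unary code otherwise."""
    if max_value is not None and x == max_value:
        return "1" * x                  # the largest symbol needs no terminating 0
    return "1" * x + "0"


def truncated_unary_decode(bits: str, max_value: Optional[int] = None) -> int:
    """Count leading ones, stopping at a zero or after max_value ones."""
    count = 0
    while (max_value is None or count < max_value) and bits[count] == "1":
        count += 1
    return count


plain = truncated_unary_encode(3)           # unary
truncated = truncated_unary_encode(3, 3)    # truncated unary with Max = 3
```

The only difference between the two codes is the codeword of the largest symbol, which saves one bit when the maximum is known to both encoder and decoder.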
方式五,按照n位定长码的方式对二元符号串进行反二值化处理,得到压缩后的串长度信息;其中,n为压缩后的串长度信息的最大值的位数,n为正整数。示例性地,n=Ceil(Log(max_val+1)),max_val表示压缩后的串长度信息的最大值。
如果编码端按照n位定长码的方式对压缩后的串长度信息进行二值化处理,则可以在确定出压缩后的串长度信息的值val(val=synElVal)之后,查询上述表3得到相应的二元符号串;相应地,解码端解码得到上述二元符号串之后,查询上述表3恢复出压缩后的串长度信息的值val。其中,len等于n。
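A fixed-length code of n bits, with n derived from max_val, can be sketched as below. The function names are hypothetical, Log() is assumed to be base 2, and the most-significant-bit-first order is an assumption because Table 3 is not reproduced in this section.

```python
def fixed_length_encode(val: int, max_val: int) -> str:
    """Binarize val with n = Ceil(log2(max_val + 1)) bits, MSB first."""
    n = max_val.bit_length()            # equals Ceil(log2(max_val + 1))
    if not 0 <= val <= max_val:
        raise ValueError("val must lie in [0, max_val]")
    return format(val, f"0{n}b") if n > 0 else ""


def fixed_length_decode(bits: str, max_val: int) -> int:
    """Read n bits from the start of `bits` and return their value."""
    n = max_val.bit_length()
    return int(bits[:n], 2) if n > 0 else 0


bits = fixed_length_encode(5, 12)       # max_val = 12 needs n = 4 bits
val = fixed_length_decode(bits, 12)
```

Because both sides derive n from the same max_val, no length field has to be transmitted.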
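Method 6: the value of the compressed string length information is determined from multiple parts. Let the value of the compressed string length information be val, composed of n parts, that is, val = val_part_1 + val_part_2 + val_part_3 + … + val_part_n, and let max_val_n = max_val - (val_part_1 + val_part_2 + val_part_3 + … + val_part_(n-1)), where max_val is the maximum value of val. This de-binarization method may include the following steps: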
1、确定压缩后的串长度信息的最大值,根据该最大值确定多个数值区间;
上述多个数值区间是一系列数值为整数的区间。可选地,上述多个数值区间可以记为R0,R1,R2,…,Rn。其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数。max_val的值满足Rn_start≤max_val<Rn_end。
2、从码流中解码得到压缩后的串长度信息的索引x;
在本申请实施例中,假设从码流中解码得到压缩后的串长度信息的索引记为x,该索引指示的是待恢复的数值val所在的数值区间。
例如,解码端可以按照上述方式一至五中的任一种反二值化方式,从码流中解码得到压缩后的串长度信息的索引。
3、根据索引x所对应的数值区间Rx,确定压缩后的串长度信息的第一部分的值为Rx_start;
4、依次确定压缩后的串长度信息的其余部分的值;
例如,解码端可以按照上述方式一至五中的任一种反二值化方式,确定val_part_2、val_part_3、…、val_part_n的值。
5、根据第一部分的值和其余部分的值,确定压缩后的串长度信息的值。
其中,压缩后的串长度信息的值val=val_part_1+val_part_2+val_part_3+…+val_part_n。
在示例性实施例中,解码端可以结合实际情况,从多种反二值化处理方式中选择合适的方式。该多种反二值化处理方式可以包括上文介绍的多种方式。
例如,解码端根据当前串的串长度分辨率,从多种反二值化处理方式中选择对二元符号串进行反二值化处理的方式。
又例如,解码端根据串长度信息的最大值,从多种反二值化处理方式中选择对二元符号串进行反二值化处理的方式。
又例如,解码端根据当前串所属解码块的剩余像素数量的最大值与当前串的串长度分辨率的商,从多种反二值化处理方式中选择对二元符号串进行反二值化处理的方式。
又例如,根据当前串所属解码块的尺寸,从多种反二值化处理方式中选择对二元符号串进行反二值化处理的方式。
本申请实施例提供了多种二值化/反二值化处理方式,依据待二值化处理的值的大小不同,采用不同方式的编解码复杂度以及二元字符串的长度会相应有所不同。通过上述方式,直接或者间接地基于串长度分辨率来选择合适的二值化/反二值化处理方式,能够灵活选择编解码性能最优的二值化/反二值化处理方式。
下面,通过几个示例对几种不同串长度信息的确定方式,以及相应确定串长度的方式进行介绍说明。
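In one example, the compressed string length information includes the string length code of the current string, denoted L_0. Correspondingly, the string length information includes the string length of the current string. The decoder multiplies the string length code of the current string by the string length resolution to obtain the string length of the current string. In one example, the decoder decodes the binary symbol string of the string length code L_0 from the bitstream, de-binarizes it to recover L_0, and then multiplies L_0 by the string length resolution N to obtain the string length L of the current string, that is, L = L_0 × N. The string length code may also be called the resolution-compressed string length, that is, the quotient of the true string length divided by the string length resolution N. During encoding, binarizing the string length code requires fewer symbols than binarizing the true string length, which reduces coding complexity and improves coding performance.
In this example, the compressed string length information includes the string length code L_0 of the current string. The decoder may use one of the de-binarization methods described above to decode a value (denoted val), and L_0 equals val.
Also in this example, let the maximum of val be max_val, the number of remaining undecoded pixels in the decoding block to which the current string belongs be max_val_tmp, and the string length resolution of the current string be N, where max_val_tmp = NumTotalPixel - NumCodedPixel. NumTotalPixel is the total number of pixels of the decoding block to which the current string belongs, and NumCodedPixel is the number of already decoded pixels of that block. Because the current string may or may not be the last string in the current decoding block, the string length L of the current string lies in [N, max_val_tmp]. After L is compressed by the string length resolution N (that is, L/N), the string length code L_0 lies in [1, max_val_tmp/N]. Therefore, if the compressed string length information includes the string length code L_0, max_val should be set to an integer greater than or equal to max_val_tmp/N. In one example, max_val = max_val_tmp/N, which helps improve coding efficiency.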
在本申请实施例中,将剩余未解码像素数量基于串长度分辨率压缩后进行二值化处理编码至码流中,相比于直接对剩余未解码像素数量的真实值进行二值化处理编码至码流中,能够减少字符数,从而降低编解码复杂度,提升编解码性能。
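In another example, the compressed string length information includes the string length code of the current string minus 1, denoted L_0 - 1. Correspondingly, the string length information includes the string length of the current string. The decoder adds 1 to this value to obtain the string length code L_0, and then multiplies L_0 by the string length resolution N to obtain the string length L of the current string, that is, L = L_0 × N.
In this example, the compressed string length information includes the string length code of the current string minus 1 (that is, L_0 - 1). The decoder may use one of the de-binarization methods described above to decode a value (denoted val), and L_0 equals val + 1.
As explained in the previous example, the minimum of L_0 is 1, so the compressed string length information may instead include L_0 - 1, which lies in [0, max_val_tmp/N - 1]. Therefore, if the compressed string length information includes L_0 - 1, max_val should be set to an integer greater than or equal to max_val_tmp/N - 1. In one example, max_val = max_val_tmp/N - 1, which helps improve coding efficiency.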
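The two string-length-code variants (L_0 and L_0 - 1) and their max_val bounds can be sketched as follows. The function names and the `minus_one` switch are hypothetical; integer division is used on the assumption that the remaining pixel count is a multiple of N.

```python
def max_val_for_length_code(total_pixels: int, decoded_pixels: int,
                            slr: int, minus_one: bool = False) -> int:
    """Tightest max_val: max_val_tmp / N, or max_val_tmp / N - 1 for L_0 - 1."""
    max_val_tmp = total_pixels - decoded_pixels   # NumTotalPixel - NumCodedPixel
    max_val = max_val_tmp // slr
    return max_val - 1 if minus_one else max_val


def string_length_from_val(val: int, slr: int, minus_one: bool = False) -> int:
    """Recover L = L_0 * N, where L_0 = val or val + 1 depending on the variant."""
    l0 = val + 1 if minus_one else val
    return l0 * slr


# 64-pixel block, 16 pixels already decoded, string length resolution 4.
bound_plain = max_val_for_length_code(64, 16, 4)
bound_minus_one = max_val_for_length_code(64, 16, 4, minus_one=True)
length_a = string_length_from_val(3, 4)
length_b = string_length_from_val(2, 4, minus_one=True)
```

Both variants describe the same string of 12 pixels here; the second simply shifts the coded range down by one so that it starts at 0.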
在另一个示例中,压缩后的串长度信息包括当前串所属解码块中,在解码当前串后的剩余像素数量编码。相应地,串长度信息包括当前串所属解码块中,在解码当前串后的剩余像素数量。当前串所属解码块中在解码当前串后的剩余像素数量,是指解码当前串后,当前串所属解码块中剩余的未解码像素的数量。该剩余像素数量编码也可以称为经分辨率压缩后的剩余像素数量,也即剩余像素数量的真实值除以串长度分辨率N得到的商。解码端将剩余像素数量编码与串长度分辨率相乘,得到当前串所属解码块中,在解码当前串后的剩余像素数量。在编码过程中,对剩余像素数量编码进行二值化表示所需的字符数量,相比于对剩余像素数量的真实值进行二值化表示所需的字符数量要少,因此能够降低编解码复杂度,提升编解码性能。
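Optionally, the remaining-pixel-count code is stored in the header of a data sequence in the bitstream. The data sequence may be the one corresponding to the picture to which the current string belongs, the one corresponding to the current string, the one corresponding to the CU to which the current string belongs, and so on; the embodiments of this application do not limit this. In one example, the string length resolution of every string in the decoding block to which the current string belongs is 4. If, after the current string is decoded, 4 undecoded pixels remain in that decoding block (binary 100), the corresponding coded representation (that is, the remaining-pixel-count code) is 1. Optionally, the remaining-pixel-count code is denoted M_0.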
可选地,上述步骤803可以包括如下几个子步骤(8031~8033):
步骤8031,获取当前串所属解码块的总像素数量。
可选地,解码块的总像素数量通过解码块的高度和宽度相乘得到。可选地,当前串所属解码块的总像素数量记为M。
步骤8032,获取当前串所属解码块的已解码像素数量。
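Optionally, the decoded pixel count is obtained by the decoder by accumulating the lengths of the already decoded pixel strings. Optionally, the decoded pixel count of the decoding block to which the current string belongs is denoted M_2.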
步骤8033,基于总像素数量、已解码像素数量以及解码当前串后的剩余像素数量,确定当前串的串长度。
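Optionally, the decoder multiplies the remaining-pixel-count code M_0 by the string length resolution N to obtain the number of pixels M_1 remaining in the decoding block to which the current string belongs after the current string is decoded, that is, M_1 = M_0 × N.
Optionally, the decoded pixel count and the remaining pixel count are subtracted from the total pixel count to obtain the string length L of the current string, that is, L = M - M_1 - M_2.
In this example, the compressed string length information includes the remaining-pixel-count code M_0 of the decoding block to which the current string belongs after the current string is decoded. The decoder may use one of the de-binarization methods described above to decode a value (denoted val), and M_0 equals val.
Also in this example, let the maximum of val be max_val, the number of remaining undecoded pixels in the decoding block to which the current string belongs be max_val_tmp, and the string length resolution of the current string be N, where max_val_tmp = NumTotalPixel - NumCodedPixel. NumTotalPixel is the total number of pixels of the decoding block to which the current string belongs, and NumCodedPixel is the number of already decoded pixels of that block. Because the current string may or may not be the last string in the current decoding block, the string length L of the current string lies in [N, max_val_tmp], so the number of pixels M_1 remaining after the current string is decoded lies in [0, max_val_tmp - N]. After M_1 is compressed by the string length resolution N (that is, M_1/N), the remaining-pixel-count code M_0 lies in [0, max_val_tmp/N - 1]. Therefore, if the compressed string length information includes the remaining-pixel-count code M_0, max_val should be set to an integer greater than or equal to max_val_tmp/N - 1. In one example, max_val = max_val_tmp/N - 1, which helps improve coding efficiency.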
在本申请实施例中,将解码当前串后的剩余像素数量基于串长度分辨率压缩后进行二值化处理编码至码流中,相比于直接对该剩余像素数量的真实值进行二值化处理编码至码流中,能够减少字符数,从而降低编解码复杂度,提升编解码性能。
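Sub-steps 8031 to 8033 for the remaining-pixel variant can be sketched as a single function. The name `string_length_from_remaining_code` and the sanity check on the result are illustrative assumptions; the arithmetic follows M_1 = M_0 × N and L = M - M_1 - M_2 from the text.

```python
def string_length_from_remaining_code(total_pixels: int, decoded_pixels: int,
                                      m0: int, slr: int) -> int:
    """Return L = M - M_1 - M_2, with M_1 = M_0 * N."""
    m1 = m0 * slr                                   # remaining pixels after this string
    length = total_pixels - m1 - decoded_pixels
    if length < slr or length % slr != 0:
        raise ValueError("decoded values are inconsistent with the resolution")
    return length


# 64-pixel block, 16 pixels decoded so far, string length resolution 4.
# A code of 1 means 4 pixels remain after the current string.
length = string_length_from_remaining_code(64, 16, 1, 4)
# A code of 0 means the current string is the last one in the block.
last_length = string_length_from_remaining_code(64, 16, 0, 4)
```

The largest code in this setting is 48 / 4 - 1 = 11, which corresponds to the shortest possible current string of 4 pixels.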
在另一个示例中,压缩后的串长度信息包括第一标志,该第一标志用于指示当前串是否为当前串所属解码块中的最后一个串。
可选地,第一标志是二值变量,用一位二进制数表示。可选地,在第一标志为0的情况下,当前串为当前串所属解码块中的最后一个串;在第一标志为1的情况下,当前串不是当前串所属解码块中的最后一个串。
可选地,在根据第一标志确定当前串不是最后一个串的情况下,获取当前串所属解码块中,在解码当前串后的剩余像素数量编码;其中,串长度信息还包括当前串所属解码块中,在解码当前串后的剩余像素数量编码;或者,串长度信息还包括当前串所属解码块中,在解码当前串后的剩余像素数量编码减1。之后,将剩余像素数量编码与串长度分辨率相乘,得到当前串所属解码块中,在解码当前串后的剩余像素数量。
可选地,上述步骤803可以包括如下几个子步骤(803a~803d):
步骤803a,获取当前串所属解码块的总像素数量。
步骤803b,获取当前串所属解码块的已解码像素数量。
步骤803c,在当前串是最后一个串的情况下,将总像素数量与已解码像素数量相减,得到当前串的串长度。
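Optionally, the decoded pixel count M_2 is subtracted from the total pixel count M to obtain the string length L of the current string, that is, L = M - M_2.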
步骤803d,在当前串不是最后一个串的情况下,基于总像素数量、已解码像素数量以及解码当前串后的剩余像素数量,确定当前串的串长度。
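Optionally, the remaining-pixel-count code is denoted M_0, and the value carried in the bitstream is M_0 - 1. Optionally, the decoder adds 1 to the decoded value to recover M_0 and multiplies M_0 by the string length resolution to obtain the number of pixels M_1 remaining in the decoding block to which the current string belongs after the current string is decoded, that is, M_1 = M_0 × N. Optionally, the decoded pixel count and the remaining pixel count are subtracted from the total pixel count to obtain the string length L of the current string, that is, L = M - M_1 - M_2.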
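In this example, take the case where the compressed string length information includes the remaining-pixel-count code minus 1 (M_0 - 1) of the decoding block to which the current string belongs after the current string is decoded. The decoder may use one of the de-binarization methods described above to decode a value (denoted val), and M_0 equals val + 1.
Also in this example, the first flag indicates whether the current string is the last string in the current decoding block. If the current string is not the last string, the bitstream contains information about the number of pixels remaining after the current string is decoded. Therefore, when the current string is not the last string, the string length L of the current string lies in [N, max_val_tmp - N], and the number of pixels M_1 remaining after the current string is decoded lies in [N, max_val_tmp - N]. After M_1 is compressed by the string length resolution N (that is, M_1/N), the remaining-pixel-count code M_0 lies in [1, max_val_tmp/N - 1]. Because the minimum of M_0 is 1, the compressed string length information may instead include M_0 - 1, which lies in [0, max_val_tmp/N - 2]. Therefore, if the compressed string length information includes M_0 - 1, max_val should be set to an integer greater than or equal to max_val_tmp/N - 2. In one example, max_val = max_val_tmp/N - 2, which helps improve coding efficiency.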
在本申请实施例中,将解码当前串后的剩余像素数量基于串长度分辨率压缩后进行二值化处理编码至码流中,相比于直接对该剩余像素数量的真实值进行二值化处理编码至码流中,能够减少字符数,从而降低编解码复杂度,提升编解码性能。
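Sub-steps 803a to 803d for the first-flag variant can be sketched as below. The function name is hypothetical. The polarity (flag 0 means last string) follows the optional convention stated in the text, and the bitstream is assumed to carry the remaining-pixel-count code minus 1 as `val`.

```python
from typing import Optional


def string_length_with_last_flag(total_pixels: int, decoded_pixels: int, slr: int,
                                 first_flag: int, val: Optional[int] = None) -> int:
    """Return the string length for the first-flag variant."""
    if first_flag == 0:                     # last string: L = M - M_2
        return total_pixels - decoded_pixels
    if val is None:
        raise ValueError("val is required when the string is not the last one")
    m1 = (val + 1) * slr                    # remaining pixels after this string
    return total_pixels - m1 - decoded_pixels


# 64-pixel block, 16 pixels decoded so far, string length resolution 4.
last = string_length_with_last_flag(64, 16, 4, first_flag=0)
not_last = string_length_with_last_flag(64, 16, 4, first_flag=1, val=0)
shortest = string_length_with_last_flag(64, 16, 4, first_flag=1, val=10)
```

Here max_val_tmp / N - 2 = 10 is the largest value that can be coded, and it yields the minimum string length of 4.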
请参考图9,其示出了本申请一个实施例提供的视频编码方法的流程图。该方法可应用于编码端设备中,即该方法可以由编码端设备执行。该方法可以包括如下几个步骤(901~903):
步骤901,确定当前串的串长度。
对于编码过程来说,当前串是指当前编码的像素串。
步骤902,基于当前串的串长度确定当前串的串长度信息,该串长度信息包括与当前串的串长度相关的信息。
在一个示例中,串长度信息包括当前串的串长度。
在另一个示例中,串长度信息包括当前串所属编码块中,在编码当前串后的剩余像素数量。
编码端获取当前串所属编码块的总像素数量,以及获取当前串所属编码块的已编码像素数量;基于总像素数量、已编码像素数量以及当前串的串长度,确定当前串所属编码块中,在编码当前串后的剩余像素数量。
步骤903,根据当前串的串长度分辨率对串长度信息进行二值化处理,得到串长度信息的二元符号串。
可选地,根据串长度信息和串长度分辨率,确定压缩后的串长度信息;对压缩后的串长度信息进行二值化处理,得到串长度信息的二元符号串。
二值化处理可以采用与上文介绍的反二值化处理相对应的方式,本实施例对此不再赘述。
可选地,如果串长度信息包括当前串的串长度,那么编码端可以将当前串的串长度除以串长度分辨率,得到当前串的串长度编码。相应地,压缩后的串长度信息可以包括当前串的串长度编码,也可以包括当前串的串长度编码减1。
可选地,如果串长度信息包括当前串所属编码块中,在编码当前串后的剩余像素数量,那么编码端可以将该剩余像素数量除以串长度分辨率,得到剩余像素数量编码。相应地,压缩后的串长度信息可以包括当前串所属编码块中,在编码当前串后的剩余像素数量编码,也可以包括当前串所属编码块中,在编码当前串后的剩余像素数量编码减1。
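On the encoder side, the four compressed forms mentioned above can be derived from the string length in one place. This is a sketch only: the function name and the `variant` labels are hypothetical and simply enumerate the options listed in the text.

```python
def compressed_length_info(total_pixels: int, coded_pixels: int,
                           length: int, slr: int, variant: str) -> int:
    """Return the value that is binarized for the chosen variant."""
    if length < slr or length % slr != 0:
        raise ValueError("string length must be a positive multiple of the resolution")
    if variant == "length":                  # string length code L_0
        return length // slr
    if variant == "length_minus_1":          # L_0 - 1
        return length // slr - 1
    remaining = total_pixels - coded_pixels - length
    if variant == "remaining":               # remaining-pixel-count code M_0
        return remaining // slr
    if variant == "remaining_minus_1":       # M_0 - 1
        return remaining // slr - 1
    raise ValueError(f"unknown variant: {variant}")


# 64-pixel block, 16 pixels coded so far, current string of 12 pixels, N = 4.
values = {v: compressed_length_info(64, 16, 12, 4, v)
          for v in ("length", "length_minus_1", "remaining", "remaining_minus_1")}
```

Whichever variant is chosen, the decoder must apply the matching inverse, so the choice has to be agreed between encoder and decoder.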
In an exemplary embodiment, the encoder may select a suitable binarization method:
例如,编码端根据当前串的串长度分辨率,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式。
又例如,编码端根据串长度信息的最大值,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式。
又例如,编码端根据当前串所属编码块的剩余像素数量的最大值与当前串的串长度分辨率的商,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式。
又例如,编码端根据当前串所属编码块的尺寸,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式。
在示例性实施例中,提供了以下几种确定当前串的串长度分辨率的方式。
方式一:当前串的串长度分辨率为第一预设值;
方式二:当前串所属图像序列中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属图像序列的序列头中;
方式三:当前串所属图像中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属图像的图像头中;
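Method 4: all strings in the patch to which the current string belongs have the same string length resolution, and the string length resolution of the current string is encoded and added to the patch header of the patch to which the current string belongs;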
方式五:当前串所属LCU中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属LCU的编码信息中;
方式六:当前串所属编码单元CU中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属CU的编码信息中;
方式七:当前串的串长度分辨率编码后添加在当前串的编码信息中;
方式八:当前串的串长度分辨率根据当前串所属解码块的尺寸确定;
方式九:当前串的串长度分辨率根据当前串对应的颜色分量和色度格式确定;
方式十:在当前串所属CU中已解码串的数量大于或等于第一阈值的情况下,当前串的串长度分辨率为第二预设值;
方式十一:在当前串所属CU中已解码未匹配像素的数量大于或等于第二阈值的情况下,当前串的串长度分辨率为第三预设值;
方式十二:在当前串所属CU中未解码像素的数量小于或等于第三阈值的情况下,当前串的串长度分辨率为第四预设值。
综上所述,本申请实施例提供的技术方案,通过将串长度分辨率作为像素串的划分与编解码依据,可限制编解码块内像素串的长度为串长度分辨率的倍数,提升了像素串的整齐度,使得编解码端能够在内存对齐的条件下进行编解码,提升了像素串的编解码效率。
另外,本申请考虑到串长度分辨率对串长度信息编解码的影响,提出了针对串长度信息的二值化和反二值化处理的方法,该方法改进了不同串长度分辨率下串长度信息的编解码方式,有利于编解码性能的提升。
下述为本申请装置实施例,可以用于执行本申请方法实施例。对于本申请装置实施例中未披露的细节,请参照本申请方法实施例。
请参考图10,其示出了本申请一个实施例提供的视频解码装置的框图。该装置具有实现上述视频解码方法示例的功能,所述功能可以由硬件实现,也可以由硬件执行相应的软件实现。该装置可以是上文介绍的计算机设备,也可以设置在计算机设备上。该装置1000可以包括:二元符号获取模块1010、反二值化处理模块1020和串长度确定模块1030。
二元符号获取模块1010,用于从码流中解码得到当前串的串长度信息的二元符号串,所述串长度信息包括与所述当前串的串长度相关的信息。
反二值化处理模块1020,用于根据所述当前串的串长度分辨率对所述二元符号串进行反二值化处理,得到所述串长度信息。
串长度确定模块1030,用于根据所述串长度信息确定所述当前串的串长度。
在示例性实施例中,所述反二值化处理模块1020,包括:
反二值化处理单元,用于对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息;
长度信息确定单元,用于根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息。
在示例性实施例中,所述反二值化处理单元,具体用于:
确定所述压缩后的串长度信息的最大值,根据所述最大值确定多个数值区间,其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数;
从所述码流中解码得到所述压缩后的串长度信息的索引x;
根据所述索引x所对应的数值区间Rx,确定所述压缩后的串长度信息的第一部分的值为Rx_start;
根据所述最大值以及所述第一部分的值,计算得到最大值余量和所述最大值余量的位数;
根据所述最大值余量的位数,确定所述压缩后的串长度信息的第二部分的值;
根据所述第二部分的值和所述最大值余量,确定所述压缩后的串长度信息的第三部分的值;
根据所述第一部分的值、所述第二部分的值、所述第三部分的值、所述最大值余量的位数和所述最大值余量,计算所述压缩后的串长度信息的值。
在一些实施例中,所述反二值化处理单元,具体用于:
基于所述最大值余量的位数确定第一比特长度;
在所述第一比特长度小于1的情况下,确定所述第二部分的值为0;
在所述第一比特长度大于等于1的情况下,从所述码流中解码所述第一比特长度的数据,按照定长码的方式对所述第一比特长度的数据进行反二值化处理,得到所述第二部分的值。
在一些实施例中,所述反二值化处理单元,具体用于:
在所述第二部分的值满足第一条件或者所述最大值余量等于0的情况下,将目标值设定为0;否则,将所述目标值设定为1;
在所述目标值等于0的情况下,确定所述第三部分的值为0;
在所述目标值等于1的情况下,确定第二比特长度为1,从所述码流中解码所述第二比特长度的数据,按照定长码的方式对所述第二比特长度的数据进行反二值化处理,得到所述第三部分的值。
在示例性实施例中,所述反二值化处理单元,还用于根据所述压缩后的串长度信息的最大值,确定所述压缩后的串长度信息的索引的上下文模型,所述上下文模型用于对所述索引采用CABAC的方式进行熵解码。
在示例性实施例中,所述反二值化处理单元,具体用于:
确定所述压缩后的串长度信息的最大值和所述最大值的位数;
根据所述最大值的位数,确定所述压缩后的串长度信息的第一部分的值;
根据所述最大值以及所述第一部分的值,计算得到最大值余量;
根据所述第一部分的值和所述最大值余量,确定所述压缩后的串长度信息的第二部分的值;
根据所述第一部分的值、所述第二部分的值、所述最大值的位数和所述最大值,计算所述压缩后的串长度信息的值。
在一些实施例中,所述反二值化处理单元,具体用于:
基于所述最大值的位数确定第三比特长度;
在所述第三比特长度小于1的情况下,确定所述第一部分的值为0;
在所述第三比特长度大于等于1的情况下,从所述码流中解码所述第三比特长度的数据,按照定长码的方式对所述第三比特长度的数据进行反二值化处理,得到所述第一部分的值。
在一些实施例中,所述反二值化处理单元,具体用于:
在所述第一部分的值满足第二条件或者所述最大值余量等于0的情况下,将目标值设定为0;否则,将所述目标值设定为1;
在所述目标值等于0的情况下,确定所述第二部分的值为0;
在所述目标值等于1的情况下,确定第四比特长度为1,从所述码流中解码所述第四比特长度的数据,按照定长码的方式对所述第四比特长度的数据进行反二值化处理,得到所述第二部分的值。
在示例性实施例中,所述反二值化处理单元,具体用于按照k阶指数哥伦布编码的方式对所述二元符号串进行反二值化处理,得到所述压缩后的串长度信息。
在示例性实施例中,所述反二值化处理单元,具体用于按照一元码或截断一元码的方式对所述二元符号串进行反二值化处理,得到所述压缩后的串长度信息。
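In an exemplary embodiment, the de-binarization processing unit is specifically configured to de-binarize the binary symbol string as an n-bit fixed-length code to obtain the compressed string length information, where n is the number of bits of the maximum value of the compressed string length information and n is a positive integer.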
在示例性实施例中,所述压缩后的串长度信息的值基于多个部分确定;所述反二值化处理单元,具体用于:
确定所述压缩后的串长度信息的最大值,根据所述最大值确定多个数值区间,其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数;
从所述码流中解码得到所述压缩后的串长度信息的索引x;
根据所述索引x所对应的数值区间Rx,确定所述压缩后的串长度信息的第一部分的值为Rx_start;
依次确定所述压缩后的串长度信息的其余部分的值;
根据所述第一部分的值和所述其余部分的值,确定所述压缩后的串长度信息的值。
在示例性实施例中,所述压缩后的串长度信息包括所述当前串的串长度编码;
所述长度信息确定单元,具体用于将所述当前串的串长度编码与所述串长度分辨率相乘,得到所述当前串的串长度。
在示例性实施例中,所述压缩后的串长度信息包括所述当前串的串长度编码减1;
所述长度信息确定单元,具体用于将所述当前串的串长度编码减1加上1,得到所述当前串的串长度编码;将所述当前串的串长度编码与所述串长度分辨率相乘,得到所述当前串的串长度。
在示例性实施例中,所述压缩后的串长度信息包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;
所述长度信息确定单元,用于将所述剩余像素数量编码与所述串长度分辨率相乘,得到所述当前串所属解码块中,在解码所述当前串后的剩余像素数量。
相应地,所述串长度确定模块1030,用于获取所述当前串所属解码块的总像素数量;获取所述当前串所属解码块的已解码像素数量;基于所述总像素数量、所述已解码像素数量以及解码所述当前串后的所述剩余像素数量,确定所述当前串的串长度。
在示例性实施例中,所述压缩后的串长度信息包括第一标志,所述第一标志用于指示所述当前串是否为所述当前串所属解码块中的最后一个串;
所述长度信息确定单元,用于在根据所述第一标志确定所述当前串不是所述最后一个串的情况下,获取所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;其中,所述串长度信息还包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;或者,所述串长度信息还包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码减1;将所述剩余像素数量编码与所述串长度分辨率相乘,得到所述当前串所属解码块中,在解码所述当前串后的剩余像素数量。
相应地,所述串长度确定模块1030,用于获取所述当前串所属解码块的总像素数量;获取所述当前串所属解码块的已解码像素数量;在所述当前串是所述最后一个串的情况下,将所述总像素数量与所述已解码像素数量相减,得到所述当前串的串长度;在所述当前串不是所述最后一个串的情况下,基于所述总像素数量、所述已解码像素数量以及解码所述当前串后的所述剩余像素数量,确定所述当前串的串长度。
在示例性实施例中,所述装置1000还包括分辨率确定模块,用于:
将第一预设值确定为所述当前串的串长度分辨率;
或者,从所述当前串所属图像序列的序列头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属图像序列中包含的各个串的串长度分辨率相同;
或者,从所述当前串所属图像的图像头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属图像中包含的各个串的串长度分辨率相同;
或者,从所述当前串所属片的片头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属片中包含的各个串的串长度分辨率相同;
或者,从所述当前串所属最大编码单元LCU的编码信息中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属LCU中包含的各个串的串长度分辨率相同;
或者,从所述当前串所属编码单元CU的编码信息中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属编码单元CU中包含的各个串的串长度分辨率相同;
或者,从所述当前串的编码信息中,解码得到所述当前串的串长度分辨率。
在示例性实施例中,所述装置1000还包括方式选择模块,用于:
根据所述当前串的串长度分辨率,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
或者,根据所述串长度信息的最大值,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
或者,根据所述当前串所属解码块的剩余像素数量的最大值与所述当前串的串长度分辨率的商,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
或者,根据所述当前串所属解码块的尺寸,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式。
请参考图11,其示出了本申请一个实施例提供的视频编码装置的框图。该装置具有实现上述视频编码方法示例的功能,所述功能可以由硬件实现,也可以由硬件执行相应的软件实现。该装置可以是上文介绍的计算机设备,也可以设置在计算机设备上。该装置1100可以包括:串长度确定模块1110、长度信息确定模块1120和二值化处理模块1130。
串长度确定模块1110,用于确定当前串的串长度。
长度信息确定模块1120,用于基于所述当前串的串长度确定所述当前串的串长度信息,所述串长度信息包括与所述当前串的串长度相关的信息。
二值化处理模块1130,用于根据所述当前串的串长度分辨率对所述串长度信息进行二值化处理,得到所述串长度信息的二元符号串。
在示例性实施例中,所述二值化处理模块1130,包括:
长度信息确定单元,用于根据串长度信息和串长度分辨率,确定压缩后的串长度信息;
二值化处理单元,用于对压缩后的串长度信息进行二值化处理,得到串长度信息的二元符号串。
可选地,二值化处理可以采用与上文介绍的反二值化处理相对应的方式,本实施例对此不再赘述。
可选地,如果串长度信息包括当前串的串长度,那么长度信息确定单元可用于将当前串的串长度除以串长度分辨率,得到当前串的串长度编码。相应地,压缩后的串长度信息可以包括当前串的串长度编码,也可以包括当前串的串长度编码减1。
可选地,如果串长度信息包括当前串所属编码块中,在编码当前串后的剩余像素数量,那么长度信息确定单元可用于将该剩余像素数量除以串长度分辨率,得到剩余像素数量编码。相应地,压缩后的串长度信息可以包括当前串所属编码块中,在编码当前串后的剩余像素数量编码,也可以包括当前串所属编码块中,在编码当前串后的剩余像素数量编码减1。
在示例性实施例中,采用以下任意一种方式确定当前串的串长度分辨率:
方式一:当前串的串长度分辨率为第一预设值;
方式二:当前串所属图像序列中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属图像序列的序列头中;
方式三:当前串所属图像中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属图像的图像头中;
方式四:当前串所属片中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属片的片头中;
方式五:当前串所属LCU中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属LCU的编码信息中;
方式六:当前串所属编码单元CU中包含的各个串的串长度分辨率相同,当前串的串长度分辨率编码后添加在当前串所属CU的编码信息中;
方式七:当前串的串长度分辨率编码后添加在当前串的编码信息中;
方式八:当前串的串长度分辨率根据当前串所属解码块的尺寸确定;
方式九:当前串的串长度分辨率根据当前串对应的颜色分量和色度格式确定;
方式十:在当前串所属CU中已解码串的数量大于或等于第一阈值的情况下,当前串的串长度分辨率为第二预设值;
方式十一:在当前串所属CU中已解码未匹配像素的数量大于或等于第二阈值的情况下,当前串的串长度分辨率为第三预设值;
方式十二:在当前串所属CU中未解码像素的数量小于或等于第三阈值的情况下,当前串的串长度分辨率为第四预设值。
In an exemplary embodiment, the apparatus 1100 further includes a method selection module configured to:
根据当前串的串长度分辨率,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式;
或者,根据串长度信息的最大值,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式;
或者,根据当前串所属编码块的剩余像素数量的最大值与当前串的串长度分辨率的商,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式;
或者,根据当前串所属编码块的尺寸,从多种二值化处理方式中选择对串长度信息进行二值化处理的方式。
需要说明的是,上述实施例提供的装置,在实现其功能时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的装置与方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
请参考图12,其示出了本申请一个实施例提供的计算机设备的结构框图。该计算机设备可以是上文介绍的编码端设备,也可以是上文介绍的解码端设备。该计算机设备120可以包括:处理器121、存储器122、通信接口123、编码器/解码器124和总线125。
处理器121包括一个或者一个以上处理核心,处理器121通过运行软件程序以及模块,从而执行各种功能应用以及信息处理。
存储器122可用于存储计算机程序,处理器121用于执行该计算机程序,以实现上述视频解码方法,或者实现上述视频编码方法。
通信接口123可用于与其它设备进行通信,如收发音视频数据。
编码器/解码器124可用于实现编码和解码功能,如对音视频数据进行编码和解码。
存储器122通过总线125与处理器121相连。
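In addition, the memory 122 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, including but not limited to: a magnetic disk or optical disc, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM (Erasable Programmable Read-Only Memory), SRAM (Static Random-Access Memory), ROM (Read-Only Memory), magnetic memory, flash memory, and PROM (Programmable Read-Only Memory).
Those skilled in the art will understand that the structure shown in FIG. 12 does not limit the computer device 120, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.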
在示例性实施例中,还提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或所述指令集在被处理器执行时实现上述视频解码方法。
在示例性实施例中,还提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由处理器加载并执行以实现上述视频编码方法。
在示例性实施例中,还提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述视频解码方法。
在示例性实施例中,还提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述视频编码方法。
应当理解的是,在本文中提及的“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
以上所述仅为本申请的示例性实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (24)

  1. 一种视频解码方法,所述方法包括:
    从码流中解码得到当前串的串长度信息的二元符号串,所述串长度信息包括与所述当前串的串长度相关的信息;
    根据所述当前串的串长度分辨率对所述二元符号串进行反二值化处理,得到所述串长度信息;
    根据所述串长度信息确定所述当前串的串长度。
  2. 根据权利要求1所述的方法,其中,所述根据所述当前串的串长度分辨率对所述二元符号串进行反二值化处理,得到所述串长度信息,包括:
    对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息;
    根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息。
  3. 根据权利要求2所述的方法,其中,所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    确定所述压缩后的串长度信息的最大值,根据所述最大值确定多个数值区间,其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数;
    从所述码流中解码得到所述压缩后的串长度信息的索引x;
    根据所述索引x所对应的数值区间Rx,确定所述压缩后的串长度信息的第一部分的值为Rx_start;
    根据所述最大值以及所述第一部分的值,计算得到最大值余量和所述最大值余量的位数;
    根据所述最大值余量的位数,确定所述压缩后的串长度信息的第二部分的值;
    根据所述第二部分的值和所述最大值余量,确定所述压缩后的串长度信息的第三部分的值;
    根据所述第一部分的值、所述第二部分的值、所述第三部分的值、所述最大值余量的位数和所述最大值余量,计算所述压缩后的串长度信息的值。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述最大值余量的位数,确定所述压缩后的串长度信息的第二部分的值,包括:
    基于所述最大值余量的位数确定第一比特长度;
    在所述第一比特长度小于1的情况下,确定所述第二部分的值为0;
    在所述第一比特长度大于等于1的情况下,从所述码流中解码所述第一比特长度的数据,按照定长码的方式对所述第一比特长度的数据进行反二值化处理,得到所述第二部分的值。
  5. 根据权利要求3所述的方法,其特征在于,所述根据所述第二部分的值和所述最大值余量,确定所述压缩后的串长度信息的第三部分的值,包括:
    在所述第二部分的值满足第一条件或者所述最大值余量等于0的情况下,将目标值设定为0;否则,将所述目标值设定为1;
    在所述目标值等于0的情况下,确定所述第三部分的值为0;
    在所述目标值等于1的情况下,确定第二比特长度为1,从所述码流中解码所述第二比特长度的数据,按照定长码的方式对所述第二比特长度的数据进行反二值化处理,得到所述第三部分的值。
  6. 根据权利要求3所述的方法,其中,所述方法还包括:
    根据所述压缩后的串长度信息的最大值,确定所述压缩后的串长度信息的索引的上下文模型,所述上下文模型用于对所述索引采用基于上下文的二值化算术编码CABAC的方式进行熵解码。
  7. 根据权利要求2所述的方法,其中,所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    确定所述压缩后的串长度信息的最大值和所述最大值的位数;
    根据所述最大值的位数,确定所述压缩后的串长度信息的第一部分的值;
    根据所述最大值以及所述第一部分的值,计算得到最大值余量;
    根据所述第一部分的值和所述最大值余量,确定所述压缩后的串长度信息的第二部分的值;
    根据所述第一部分的值、所述第二部分的值、所述最大值的位数和所述最大值,计算所述压缩后的串长度信息的值。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述最大值的位数,确定所述压缩后的串长度信息的第一部分的值,包括:
    基于所述最大值的位数确定第三比特长度;
    在所述第三比特长度小于1的情况下,确定所述第一部分的值为0;
    在所述第三比特长度大于等于1的情况下,从所述码流中解码所述第三比特长度的数据,按照定长码的方式对所述第三比特长度的数据进行反二值化处理,得到所述第一部分的值。
  9. 根据权利要求7所述的方法,其特征在于,所述根据所述第一部分的值和所述最大值余量,确定所述压缩后的串长度信息的第二部分的值,包括:
    在所述第一部分的值满足第二条件或者所述最大值余量等于0的情况下,将目标值设定为0;否则,将所述目标值设定为1;
    在所述目标值等于0的情况下,确定所述第二部分的值为0;
    在所述目标值等于1的情况下,确定第四比特长度为1,从所述码流中解码所述第四比特长度的数据,按照定长码的方式对所述第四比特长度的数据进行反二值化处理,得到所述第二部分的值。
  10. 根据权利要求2所述的方法,其中,所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    按照k阶指数哥伦布码的方式对所述二元符号串进行反二值化处理,得到所述压缩后的串长度信息。
  11. 根据权利要求2所述的方法,其中,所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    按照一元码或截断一元码的方式对所述二元符号串进行反二值化处理,得到所述压缩后的串长度信息。
  12. 根据权利要求2所述的方法,其中,所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    按照n位定长码的方式对二元符号串进行反二值化处理,得到所述压缩后的串长度信息;
    其中,n为所述压缩后的串长度信息的最大值的位数,n为正整数。
  13. 根据权利要求2所述的方法,其中,所述压缩后的串长度信息的值基于多个部分确定;
    所述对所述二元符号串进行反二值化处理,得到基于所述串长度分辨率压缩后的串长度信息,包括:
    确定所述压缩后的串长度信息的最大值,根据所述最大值确定多个数值区间,其中,第x个数值区间Rx的索引为x,且第x个数值区间Rx表示为[Rx_start,Rx_end),x为正整数;
    从所述码流中解码得到所述压缩后的串长度信息的索引x;
    根据所述索引x所对应的数值区间Rx,确定所述压缩后的串长度信息的第一部分的值为Rx_start;
    依次确定所述压缩后的串长度信息的其余部分的值;
    根据所述第一部分的值和所述其余部分的值,确定所述压缩后的串长度信息的值。
  14. 根据权利要求2所述的方法,其中,所述压缩后的串长度信息包括所述当前串的串长度编码;
    所述根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息,包括:
    将所述当前串的串长度编码与所述串长度分辨率相乘,得到所述当前串的串长度。
  15. 根据权利要求2所述的方法,其中,所述压缩后的串长度信息包括所述当前串的串长度编码减1;
    所述根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息,包括:
    将所述当前串的串长度编码减1加上1,得到所述当前串的串长度编码;
    将所述当前串的串长度编码与所述串长度分辨率相乘,得到所述当前串的串长度。
  16. 根据权利要求2所述的方法,其中,所述压缩后的串长度信息包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;
    所述根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息,包括:
    将所述剩余像素数量编码与所述串长度分辨率相乘,得到所述当前串所属解码块中,在解码所述当前串后的剩余像素数量;
    所述根据所述串长度信息确定所述当前串的串长度,包括:
    获取所述当前串所属解码块的总像素数量;
    获取所述当前串所属解码块的已解码像素数量;
    基于所述总像素数量、所述已解码像素数量以及解码所述当前串后的所述剩余像素数量,确定所述当前串的串长度。
  17. 根据权利要求2所述的方法,其中,所述压缩后的串长度信息包括第一标志,所述第一标志用于指示所述当前串是否为所述当前串所属解码块中的最后一个串;
    所述根据所述压缩后的串长度信息和所述串长度分辨率,确定所述串长度信息,包括:
    在根据所述第一标志确定所述当前串不是所述最后一个串的情况下,获取所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;其中,所述串长度信息还包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码;或者,所述串长度信息还包括所述当前串所属解码块中,在解码所述当前串后的剩余像素数量编码减1;
    将所述剩余像素数量编码与所述串长度分辨率相乘,得到所述当前串所属解码块中,在解码所述当前串后的剩余像素数量;
    所述根据所述串长度信息确定所述当前串的串长度,包括:
    获取所述当前串所属解码块的总像素数量;
    获取所述当前串所属解码块的已解码像素数量;
    在所述当前串是所述最后一个串的情况下,将所述总像素数量与所述已解码像素数量相减,得到所述当前串的串长度;
    在所述当前串不是所述最后一个串的情况下,基于所述总像素数量、所述已解码像素数量以及解码所述当前串后的所述剩余像素数量,确定所述当前串的串长度。
  18. 根据权利要求1至17任一项所述的方法,其中,所述方法还包括:
    将第一预设值确定为所述当前串的串长度分辨率;
    或者,从所述当前串所属图像序列的序列头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属图像序列中包含的各个串的串长度分辨率相同;
    或者,从所述当前串所属图像的图像头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属图像中包含的各个串的串长度分辨率相同;
    或者,从所述当前串所属片的片头中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属片中包含的各个串的串长度分辨率相同;
    或者,从所述当前串所属最大编码单元LCU的编码信息中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属LCU中包含的各个串的串长度分辨率相同;
    或者,从所述当前串所属编码单元CU的编码信息中,解码得到所述当前串的串长度分辨率,其中,所述当前串所属编码单元CU中包含的各个串的串长度分辨率相同;
    或者,从所述当前串的编码信息中,解码得到所述当前串的串长度分辨率。
  19. 根据权利要求1至17任一项所述的方法,其中,所述方法还包括:
    根据所述当前串的串长度分辨率,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
    或者,
    根据所述串长度信息的最大值,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
    或者,
    根据所述当前串所属解码块的剩余像素数量的最大值与所述当前串的串长度分辨率的商,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式;
    或者,
    根据所述当前串所属解码块的尺寸,从多种反二值化处理方式中选择对所述二元符号串进行反二值化处理的方式。
  20. 一种视频编码方法,所述方法包括:
    确定当前串的串长度;
    基于所述当前串的串长度确定所述当前串的串长度信息,所述串长度信息包括与所述当前串的串长度相关的信息;
    根据所述当前串的串长度分辨率对所述串长度信息进行二值化处理,得到所述串长度信息的二元符号串。
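  21. A video decoding apparatus, the apparatus comprising:
    a binary symbol obtaining module, configured to decode, from a bitstream, a binary symbol string of string length information of a current string, the string length information comprising information related to the string length of the current string;
    a de-binarization module, configured to de-binarize the binary symbol string according to the string length resolution of the current string to obtain the string length information;
    a string length determining module, configured to determine the string length of the current string according to the string length information.
  22. A video encoding apparatus, the apparatus comprising:
    a string length determining module, configured to determine the string length of a current string;
    a length information determining module, configured to determine string length information of the current string based on the string length of the current string, the string length information comprising information related to the string length of the current string;
    a binarization module, configured to binarize the string length information according to the string length resolution of the current string to obtain a binary symbol string of the string length information.
  23. A computer device, comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, wherein the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the method according to any one of claims 1 to 19, or to implement the method according to claim 20.
  24. A computer-readable storage medium, storing at least one instruction, at least one program, a code set, or an instruction set, wherein the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the method according to any one of claims 1 to 19, or to implement the method according to claim 20.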
PCT/CN2021/112133 2020-08-20 2021-08-11 Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium WO2022037464A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21857559.5A EP4113999A4 (en) 2020-08-20 2021-08-11 VIDEO DECODING METHOD AND APPARATUS, VIDEO ENCODING METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
US17/939,767 US20230020127A1 (en) 2020-08-20 2022-09-07 Video decoding method and apparatus, video coding method and apparatus, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010841108.9 2020-08-20
CN202010841108.9A CN114079780A (zh) 2020-08-20 2020-08-20 Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/939,767 Continuation US20230020127A1 (en) 2020-08-20 2022-09-07 Video decoding method and apparatus, video coding method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022037464A1 true WO2022037464A1 (zh) 2022-02-24

Family

ID=80281936

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/112133 WO2022037464A1 (zh) 2020-08-20 2021-08-11 Video decoding method and apparatus, video encoding method and apparatus, device, and storage medium

Country Status (4)

Country Link
US (1) US20230020127A1 (zh)
EP (1) EP4113999A4 (zh)
CN (1) CN114079780A (zh)
WO (1) WO2022037464A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102404571A (zh) * 2011-11-22 2012-04-04 浙江大学 Binarization method and apparatus in video image encoding and decoding
WO2016119726A1 (en) * 2015-01-30 2016-08-04 Mediatek Inc. Method and apparatus for entropy coding of source samples with large alphabet
CN105973287A (zh) * 2016-05-04 2016-09-28 广东工业大学 Image encoding and decoding method for a multi-track absolute grating scale
US20190325083A1 (en) * 2018-04-20 2019-10-24 International Business Machines Corporation Rapid partial substring matching
CN111131818A (zh) * 2014-10-01 2020-05-08 株式会社Kt Method for decoding a video signal and method for encoding a video signal

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6111833A (en) * 1996-02-08 2000-08-29 Sony Corporation Data decoder
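JP2002304859A (ja) * 2001-02-02 2002-10-18 Victor Co Of Japan Ltd Synchronization signal generation method, recording apparatus, transmission apparatus, recording medium, and transmission medium
JP2004080652A (ja) * 2002-08-22 2004-03-11 Nec Corp Lossless compression method for binarized image data and lossless compression system applying the same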
WO2009022048A1 (en) * 2007-08-16 2009-02-19 Nokia Corporation A method and apparatuses for encoding and decoding an image
US20110293004A1 (en) * 2010-05-26 2011-12-01 Jicheng An Method for processing motion partitions in tree-based motion compensation and related binarization processing circuit thereof
US20120014433A1 (en) * 2010-07-15 2012-01-19 Qualcomm Incorporated Entropy coding of bins across bin groups using variable length codewords
WO2013114826A1 (ja) * 2012-01-31 2013-08-08 パナソニック株式会社 Image decoding device
US9584802B2 (en) * 2012-04-13 2017-02-28 Texas Instruments Incorporated Reducing context coded and bypass coded bins to improve context adaptive binary arithmetic coding (CABAC) throughput
KR20140004825A (ko) * 2012-06-29 2014-01-14 주식회사 팬택 Method and apparatus for binarizing syntax elements for entropy encoding and entropy decoding
US9503760B2 (en) * 2013-08-15 2016-11-22 Mediatek Inc. Method and system for symbol binarization and de-binarization
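KR101943805B1 (ko) * 2014-06-20 2019-01-29 에이치에프아이 이노베이션 인크. Method and apparatus for binarization and context-adaptive coding of syntax in video coding
CN107743239B (zh) * 2014-09-23 2020-06-16 清华大学 Method and apparatus for encoding and decoding video data
CN109479135B (zh) * 2016-08-10 2021-10-15 松下电器(美国)知识产权公司 Encoding device, decoding device, encoding method, and decoding method
WO2018174591A1 (ko) * 2017-03-22 2018-09-27 김기백 Image encoding/decoding method using the range of pixel values constituting an image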
US10735736B2 (en) * 2017-08-29 2020-08-04 Google Llc Selective mixing for entropy coding in video compression
US9992496B1 (en) * 2017-11-15 2018-06-05 Google Llc Bin string coding based on a most probable symbol
US10798376B2 (en) * 2018-07-17 2020-10-06 Tencent America LLC Method and apparatus for video coding
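CN109788284B (zh) * 2019-02-27 2020-07-28 北京大学深圳研究生院 Method and apparatus for decoding a quantized block, and electronic device
CN110475038B (zh) * 2019-08-02 2021-07-27 陕西师范大学 Generative hiding and recovery method for character paintings combined with minimum closure coding
TW202118300A (zh) * 2019-09-24 2021-05-01 法商內數位Vc控股法國公司 Homogeneous syntax
CN113727108B (zh) * 2020-05-26 2024-03-01 腾讯科技(深圳)有限公司 Video decoding method, video encoding method, and related devices
CN111866512B (zh) * 2020-07-29 2022-02-22 腾讯科技(深圳)有限公司 Video decoding method, video encoding method, apparatus, device, and storage medium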

Also Published As

Publication number Publication date
CN114079780A (zh) 2022-02-22
EP4113999A4 (en) 2023-09-27
EP4113999A1 (en) 2023-01-04
US20230020127A1 (en) 2023-01-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21857559

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021857559

Country of ref document: EP

Effective date: 20220929

NENP Non-entry into the national phase

Ref country code: DE