WO2022116824A1 - Video decoding method, video encoding method, related device, and storage medium - Google Patents

Video decoding method, video encoding method, related device, and storage medium

Info

Publication number
WO2022116824A1
WO2022116824A1 (PCT/CN2021/131183; CN2021131183W)
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
string
current
unit
coding unit
Prior art date
Application number
PCT/CN2021/131183
Other languages
English (en)
French (fr)
Inventor
王英彬
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2022116824A1
Priority to US17/977,589 (published as US20230047433A1)


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to video coding and decoding technologies.
  • the coding end usually divides each frame image contained in the video into multiple coding units, encodes each coding unit to obtain the code stream data of the frame image, and then transmits the code stream data to the decoding end.
  • after receiving the code stream data, the decoding end usually also performs the decoding operation in units of coding units to obtain a decoded image.
  • the concept of the unit vector string has been proposed; the unit vector string has received extensive attention due to its low implementation complexity.
  • however, the related art uses the unit vector string only in the equivalent string and unit vector string sub-mode included in the intra-frame string copy coding mode; moreover, when encoding or decoding a coding unit based on a unit vector string, only pixels inside that coding unit can be referenced.
  • this encoding and decoding method prevents certain strings in the coding unit (such as strings in the first row) from becoming unit vector strings. It can be seen that the related art limits the applicable scope of unit vector strings to a certain extent, thereby affecting encoding and decoding performance.
  • Embodiments of the present application provide a video decoding method, a video encoding method, a related device, and a storage medium, which can effectively expand the scope of application of a unit vector string and help improve encoding and decoding performance.
  • in a first aspect, an embodiment of the present application provides a video decoding method, the method including:
  • if the current string is a unit vector string and the current string includes a first pixel, determining a reference pixel of the first pixel from a historical decoding unit in the current image;
  • the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
  • obtaining the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image.
  • in a second aspect, an embodiment of the present application provides a video encoding method, the method including:
  • if the current string is a unit vector string and the current string includes a first pixel, determining a reference pixel of the first pixel from a historical coding unit in the current image;
  • the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
  • acquiring the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel, so as to obtain the encoding information of the current coding unit.
  • an embodiment of the present application provides a video decoding device, the device comprising:
  • a decoding unit configured to determine a reference pixel of the first pixel from a historical decoding unit in the current image if the current string is a unit vector string and the current string includes a first pixel; the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
  • the decoding unit is further configured to obtain the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image.
  • an embodiment of the present application provides a video encoding apparatus, the apparatus comprising:
  • an encoding unit configured to determine a reference pixel of the first pixel from a historical coding unit in the current image if the current string is a unit vector string and the current string includes a first pixel; the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
  • the coding unit is further configured to obtain the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel, so as to obtain the coding information of the current coding unit.
  • an embodiment of the present application provides a computer device, the computer device includes an input interface and an output interface, and the computer device further includes:
  • a processor, adapted to execute one or more instructions;
  • a computer storage medium, storing one or more first instructions, the one or more first instructions being suitable for being loaded by the processor to execute the video decoding method according to the first aspect above or the video encoding method according to the second aspect above.
  • an embodiment of the present application provides a computer storage medium, where the computer storage medium stores one or more first instructions, and the one or more first instructions are suitable for being loaded by a processor to execute the video decoding method according to the first aspect above or the video encoding method according to the second aspect above.
  • an embodiment of the present application provides a computer program product, including instructions, which, when run on a computer, cause the computer to execute the video decoding method described in the first aspect above or the video encoding method described in the second aspect above.
  • in the embodiments of the present application, if the current string of the current coding unit is a unit vector string and the current string includes a first pixel (for example, a pixel in the first row or the first column of the current coding unit), the reference pixel of the first pixel can be determined from a historical decoding unit in the current image, and the predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel, so as to implement the encoding and decoding processing.
  • in this way, any string in the current coding unit can become a unit vector string, which effectively widens the applicable range of the unit vector string and helps improve encoding and decoding performance.
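The core idea above can be sketched in a few lines of Python (a hypothetical illustration; the function name, array layout, and coordinates are not from the application): each pixel of a unit vector string takes the immediately adjacent, already-reconstructed pixel above it as its reference pixel, and for a pixel in the first row of the current coding unit that reference lies in the neighboring decoded coding unit.

```python
# Hypothetical sketch of unit-vector-string prediction across a CU boundary.
# `recon` holds reconstructed pixel values in absolute image coordinates;
# rows above `row` are assumed to be already reconstructed.

def predict_unit_vector_string(recon, row, x0, length):
    """Predict `length` consecutive pixels of a unit-vector string starting
    at column x0 of row `row`: each pixel's reference is the reconstructed
    pixel directly above it (displacement = the unit vector (0, -1))."""
    return [recon[row - 1][x0 + i] for i in range(length)]

# Demo: the top two rows belong to an adjacent, already-decoded coding unit
# (the "historical decoding unit"); the current CU starts at row 2, so its
# first row is predicted from pixels outside the current CU.
recon = [
    [10, 11, 12, 13],  # decoded neighbor CU
    [20, 21, 22, 23],  # decoded neighbor CU (row adjacent to the current CU)
    [0, 0, 0, 0],      # current CU, first row (to be predicted)
    [0, 0, 0, 0],
]
pred = predict_unit_vector_string(recon, row=2, x0=0, length=4)
```

Under this scheme the first row of the current CU is no longer barred from forming a unit vector string, which is the widening of applicability described above.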
  • in addition, the predictive coding mode adopted for the current coding unit in the embodiments of the present application is not limited to the equivalent string and unit vector string sub-mode; it may also be another predictive coding mode, such as the string prediction sub-mode of the intra-frame string copy mode. That is, the embodiments of the present application allow a coding unit in the string prediction sub-mode to use unit vector strings, which further widens the applicable range of the unit vector string and improves the coding performance of string prediction.
  • FIG. 1a is a schematic diagram of the architecture of an image processing system provided by an embodiment of the present application.
  • FIG. 1b is a schematic diagram of dividing an image into multiple coding units according to an embodiment of the present application.
  • FIG. 1c is a basic working flowchart of a video encoder provided by an embodiment of the present application.
  • FIG. 1d is a schematic diagram of a plurality of intra-frame prediction modes provided by an embodiment of the present application.
  • FIG. 1e is a schematic diagram of an angular prediction mode in intra prediction provided by an embodiment of the present application.
  • FIG. 1f is a schematic diagram of inter-frame prediction provided by an embodiment of the present application.
  • FIG. 1g is a schematic diagram of intra-frame block copy provided by an embodiment of the present application.
  • FIG. 1h is a schematic diagram of the range of reference pixels for intra-frame block copy provided by an embodiment of the present application.
  • FIG. 1i is a schematic diagram of intra-frame string copy provided by an embodiment of the present application.
  • FIG. 1j is a schematic diagram of intra-frame string copy with string length resolution control provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 3a is a schematic diagram of a scanning mode provided by an embodiment of the present application.
  • FIG. 3b is an explanatory diagram of the orientation of a current coding unit provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video decoding method provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a video decoding apparatus provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the embodiment of the present application relates to an image processing system; referring to FIG. 1a, the image processing system includes at least a video encoding device 11 and a video decoding device 12.
  • the video encoding device 11 internally includes at least a video encoder, which is used to encode the images in a video signal to obtain a code stream;
  • the video decoding device 12 internally includes at least a video decoder, which is used to decode the code stream to obtain reconstructed images corresponding to the images in the video signal.
  • the video signals mentioned here can be divided, according to how they are acquired, into signals captured by cameras and signals generated by computers; because their statistical characteristics differ, the corresponding compression coding methods may also differ.
  • the current mainstream video coding standards, such as the international standards HEVC (High Efficiency Video Coding) and VVC (Versatile Video Coding) and China's national standard AVS3 (Audio Video coding Standard 3), all adopt a hybrid coding framework, which divides the images of the original video signal into a series of CUs (Coding Units) and combines video coding processing methods such as prediction, transformation, and entropy coding to achieve video data compression.
  • these mainstream video coding standards usually need to perform the following series of operations and processing on the images in the input original video signal:
  • Block partition structure: the input image is divided into several non-overlapping processing units, and similar compression operations are performed on each processing unit.
  • the processing unit here is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). Going further down from the CTU or LCU, finer divisions can be made to obtain one or more basic coding units, which are called CUs (Coding Units).
  • FIG. 1b shows an LCU evenly divided into multiple CUs, but in practice an LCU can also be divided into multiple CUs unevenly.
  • Each CU is the most basic element in an encoding process, and each CU can independently use a predictive encoding mode for encoding and decoding.
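The block partition step described above can be illustrated with a toy quadtree split (an assumption-laden sketch: the function name and the `split_decision` callback are illustrative stand-ins, not the AVS3 partition syntax):

```python
# Illustrative sketch of recursively splitting a CTU/LCU into CUs by
# quadtree until a minimum size, mirroring the "further, finer divisions"
# of a processing unit described above.

def quadtree_split(x, y, size, min_size, split_decision):
    """Return a list of (x, y, size) leaf CUs. `split_decision(x, y, size)`
    stands in for the encoder's decision of whether to split further."""
    if size <= min_size or not split_decision(x, y, size):
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += quadtree_split(x + dx, y + dy, half, min_size, split_decision)
    return cus

# Example: split a 128x128 LCU once, keeping the resulting 64x64 CUs.
leaves = quadtree_split(0, 0, 128, 64, lambda x, y, s: s > 64)
```

An uneven partition, as the text notes FIG. 1b does not show, would simply come from a `split_decision` that splits some sub-blocks further than others.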
  • the current video coding technology includes a variety of predictive coding modes such as intra-frame prediction mode and inter-frame prediction mode.
  • the encoder needs to select the most suitable predictive coding mode for the current CU and inform the decoder; the encoder then uses the selected predictive coding mode to perform predictive encoding on the current CU, and the residual video signal of the current CU is obtained by subtracting the prediction signal from the original signal.
  • Intra-frame prediction mode: the prediction signal comes from an area of the same image that has already been coded and reconstructed.
  • Inter-frame prediction mode: the prediction signal comes from already-coded images (called reference images) that are different from the current image.
  • Transform and quantization: the residual video signal undergoes a transform operation such as the DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), which converts the residual signal into the transform domain, where it is represented by transform coefficients.
  • a lossy quantization operation is then further performed on the transform-domain signal, losing certain information so that the quantized signal is favorable for compact expression. Since some video coding standards offer more than one transform mode to choose from, the encoder also needs to select one transform mode for the current CU and inform the decoder.
  • the fineness of quantization is usually determined by the QP (Quantization Parameter). When the QP value is larger, coefficients representing a larger range of values are quantized into the same output, which usually brings greater distortion and a lower code rate; conversely, when the QP value is smaller, coefficients representing a smaller range of values are quantized into the same output, which usually brings less distortion and corresponds to a higher code rate.
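The QP trade-off can be made concrete with a toy scalar quantizer (an illustrative sketch: the HEVC-style relation Qstep ≈ 2^((QP−4)/6) is assumed here only to show the trend; the actual QP-to-step mapping is standard-specific):

```python
def qstep(qp):
    """Illustrative quantization step size; larger QP -> larger step."""
    return 2 ** ((qp - 4) / 6)

def quantize(coeff, qp):
    """Map a transform coefficient to a quantization level (lossy)."""
    return round(coeff / qstep(qp))

def dequantize(level, qp):
    """Reconstruct an approximate coefficient from its level."""
    return level * qstep(qp)

# A coarse QP maps a wide range of coefficients to the same level,
# so reconstruction error grows with QP.
err_fine = abs(dequantize(quantize(107, 22), 22) - 107)    # QP 22, step 8
err_coarse = abs(dequantize(quantize(107, 40), 40) - 107)  # QP 40, step 64
```

With QP 22 the value 107 reconstructs to 104 (error 3); with QP 40 it reconstructs to 128 (error 21): larger QP, fewer distinct levels, more distortion, fewer bits.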
  • Entropy coding or statistical coding: the quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and finally a binarized (0 or 1) compressed code stream is output. At the same time, other information generated by encoding, such as the selected mode and motion vectors, also needs to be entropy coded to reduce the bit rate.
  • statistical coding is a lossless coding method that can effectively reduce the code rate required to express the same signal; common statistical coding methods include Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
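As a tiny member of the variable length coding family mentioned above (illustrative only: this is the order-0 exponential-Golomb code used by several video standards for syntax elements, not necessarily the code used here), shorter codewords are assigned to smaller, more frequent values:

```python
def ue_golomb(v):
    """Order-0 exp-Golomb codeword for an unsigned integer v:
    M leading zeros followed by the (M+1)-bit binary form of v + 1,
    where M = floor(log2(v + 1))."""
    code = bin(v + 1)[2:]                 # binary digits of v + 1
    return "0" * (len(code) - 1) + code   # prefix of M zeros
```

For example, `ue_golomb(0)` is `"1"` (1 bit) while `ue_golomb(3)` is `"00100"` (5 bits), so frequent small values cost few bits.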
  • Loop filtering: after an image has been coded, a reconstructed decoded image can be obtained through inverse quantization, inverse transform, and prediction compensation (the inverse operations of steps 2) to 4) above). Compared with the original image, some information of the reconstructed decoded image differs from the original due to the influence of quantization, resulting in distortion. Therefore, filtering operations such as deblocking, SAO (Sample Adaptive Offset), or ALF (Adaptive Loop Filter) can be performed on the reconstructed decoded image to effectively reduce the degree of distortion caused by quantization. Since these filtered reconstructed decoded images are used as references for subsequently coded images to predict future signals, the above filtering operation is also called in-loop filtering, i.e., a filtering operation inside the encoding loop.
  • FIG. 1c exemplarily shows the basic working flow of a video encoder, taking the k-th CU (denoted s_k[x, y]) of the current image as an example; k is a positive integer greater than or equal to 1 and less than or equal to the number of CUs in the current image, s_k[x, y] represents the pixel with coordinates [x, y] in the k-th CU, x represents the abscissa of the pixel, and y represents its ordinate.
  • s_k[x, y] first undergoes prediction (motion-compensated prediction or intra-picture prediction) to obtain the prediction signal ŝ_k[x, y]; the residual signal u_k[x, y] is obtained by subtracting ŝ_k[x, y] from s_k[x, y], and u_k[x, y] is then transformed and quantized.
  • the quantized output data has two different destinations: one copy is sent to the entropy encoder for entropy coding, and the coded stream is output to a buffer for storage while awaiting transmission; the other copy is inversely quantized and inversely transformed to obtain the reconstructed signal s*_k[x, y].
  • s*_k[x, y] is subjected to intra-picture prediction to obtain f(s*_k[x, y]).
  • s*_k[x, y] is loop-filtered to obtain s'_k[x, y], and s'_k[x, y] is sent to the decoded picture buffer for storage, to be used in generating reconstructed images.
  • s'_k[x, y] undergoes motion-compensated prediction to obtain s'_r[x + m_x, y + m_y], which represents the reference block; m_x and m_y represent the horizontal and vertical components of the motion vector (MV, Motion Vector), respectively.
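The encoder loop above can be condensed into a minimal numeric sketch (hypothetical names; the transform is omitted and the illustrative Qstep relation from before is reused), showing u_k = s_k − ŝ_k, quantization, and reconstruction s*_k = ŝ_k + dequantized residual:

```python
def encode_block(s, pred, qp):
    """Residual u[i] = s[i] - pred[i]; quantize u; reconstruct
    s*[i] = pred[i] + dequantized level (transform step omitted)."""
    step = 2 ** ((qp - 4) / 6)                            # illustrative Qstep
    u = [a - b for a, b in zip(s, pred)]                  # residual signal
    levels = [round(x / step) for x in u]                 # sent to entropy coder
    recon = [b + l * step for b, l in zip(pred, levels)]  # s*_k, kept as reference
    return levels, recon

# With QP = 4 the step is 1, so reconstruction is exact in this toy case.
levels, recon = encode_block([100, 102, 98, 101], [96, 96, 96, 96], 4)
```

The `recon` values are exactly what the decoder can also compute, which is why both ends stay synchronized when `recon` is later used as a prediction reference.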
  • at the decoding end, for each CU, after obtaining the compressed code stream, the video decoder first performs entropy decoding to obtain various mode information (such as predictive coding mode information) and quantized coefficients; each coefficient is then inversely quantized and inversely transformed to obtain the residual signal.
  • on the other hand, according to the known predictive coding mode information, the prediction signal corresponding to the CU (that is, the predicted value of each pixel in the CU) can be obtained; adding the residual signal and the prediction signal then yields the reconstructed signal (that is, the reconstructed image).
  • the reconstructed signal is then subjected to a loop filtering operation to generate the final output signal (that is, the finally decoded image).
  • the predictive coding modes used in AVS3 may include: the intra-frame prediction mode, the inter-frame prediction mode, the intra-frame block copy mode, the intra-frame string copy mode, and so on; when coding with AVS3, these predictive coding modes can be used alone or in combination. Each predictive coding mode is introduced separately below:
  • Intra prediction: the intra prediction mode is a commonly used predictive coding technique; based on the spatial correlation of the pixels of a video image, it mainly derives the prediction value of the current CU from adjacent coded regions.
  • as shown in FIG. 1d, AVS3 currently has 3 non-angular intra prediction modes (Plane, DC, and Bilinear) and 66 angular prediction modes.
  • when an angular prediction mode is adopted, each pixel in the current CU uses, as its prediction value, the value of the reference pixel at the corresponding position on the reference pixel row or column, according to the direction corresponding to the angle of the prediction mode.
  • for example, for pixel P, the position of its reference pixel can be determined on the already-coded pixel row above according to the prediction angle in the figure, and the value of that reference pixel is then used as the predicted value of pixel P.
  • however, not all reference pixel positions pointed to are of integer-pixel precision (as shown in FIG. 1e, the reference position of pixel P is a sub-pixel position between pixel B and pixel C); in this case, the predicted value of pixel P needs to be obtained by interpolating the surrounding pixels.
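The sub-pixel case can be sketched as follows (an assumption: a 2-tap linear interpolation stands in for the standard-defined interpolation filters):

```python
def interp_reference(ref_row, pos):
    """Return the prediction for a pixel whose projected reference
    position `pos` on the reference row may be fractional: copy the
    reference pixel at integer positions, interpolate between the two
    neighboring reference pixels otherwise."""
    i = int(pos)
    frac = pos - i
    if frac == 0:
        return ref_row[i]                            # integer-pel: direct copy
    return (1 - frac) * ref_row[i] + frac * ref_row[i + 1]

# Projected position 1.5 falls midway between B = 110 and C = 120.
val = interp_reference([100, 110, 120, 130], 1.5)
```

Here the prediction for P comes out as the average of B and C; real codecs use longer, standard-defined filter taps for better quality.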
  • the values of the reference pixels used for intra prediction can generally be stored in, and read from, on-chip memory.
  • inter-frame prediction mainly exploits the temporal correlation of video, using pixels of adjacent coded images to predict the pixels of the current image, so as to effectively remove the temporal redundancy of the video and save the bits required to code the residual data.
  • in FIG. 1f, P is the current frame, P_r is the reference frame, B is the current CU, B_r is the reference block of B (i.e., the reference CU), and B' is the block at the same coordinate position as B in the reference image.
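A minimal motion-compensation fetch matching FIG. 1f's notation (illustrative; integer-pixel motion only):

```python
def motion_compensate(ref_frame, x, y, w, h, mv):
    """Fetch the w-by-h reference block B_r for the current CU B at (x, y),
    displaced by the motion vector mv = (m_x, m_y) in reference frame P_r."""
    mx, my = mv
    return [[ref_frame[y + my + j][x + mx + i] for i in range(w)]
            for j in range(h)]

# Demo on a 4x4 reference frame whose value at (row r, col c) is 10*r + c.
ref = [[10 * r + c for c in range(4)] for r in range(4)]
block = motion_compensate(ref, 0, 0, 2, 2, (1, 1))
```

The fetched block serves as the prediction of B; only the motion vector and the residual against this prediction need to be coded.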
  • Intra Block Copy (IBC), which can also be called block copy intra prediction, is an intra-frame coding tool adopted in the HEVC Screen Content Coding (SCC) extension, and it significantly improves the encoding efficiency of screen content.
  • in AVS3, IBC technology is likewise used to improve the performance of screen content encoding. IBC mainly exploits the spatial correlation of screen content video, using pixels in the coded area of the current image to derive the predicted values of the pixels in the current CU, which can effectively save the bits required to code those pixels.
  • the displacement between the current CU and its reference block is called a block vector (BV); H.266/VVC adopts a BV prediction technology similar to that used in inter-frame prediction, which further saves the bits required to code the BV.
  • all reference pixels used for IBC prediction can be stored in on-chip memory, so as to reduce the extra bandwidth of reading from and writing to memory outside the chip.
  • IBC is limited to using only one memory of S×S pixel size (such as 128×128 pixels) as storage space; then, in order to improve the usage efficiency of this memory, the 128×128 memory can be divided into four areas of 64×64 size to realize multiplexing of the reference pixel memory.
  • the range of reference pixels can be as shown in FIG. 1h: the block with vertical bars in FIG. 1h is the current CU in the current processing unit (i.e., the current CTU or current LCU); the gray area is the coded part, and the white area is the part of the current processing unit that has not yet been coded; the areas marked with an "X" in FIG. 1h are not available to the current CU, and the pixels in those areas cannot be used as reference pixels for the current CU.
  • it should be noted that the embodiments of the present application take a memory with a size of 128×128 pixels as an example only for description; in other embodiments, a memory of 64×64 pixels can also be used as the storage space to further reduce hardware cost, or a memory of 256×256 pixels can be used as the storage space to meet the storage requirements of more data, and this application places no limitation on the size of the storage space.
  • similarly, the embodiments of the present application take dividing the storage space into four 64×64-sized areas as an example only for description; in other embodiments, other area division methods may be used, such as dividing the 128×128 memory into 16 regions of size 32×32 for multiplexing of the reference pixel memory, and so on. That is to say, a memory of S×S pixel size can be divided into (S/G)² areas of G×G size, where G is a divisor of S, i.e., the quotient obtained by dividing S by G is an integer.
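The S×S / G×G relation can be checked with two one-liners (hypothetical helper names):

```python
def num_regions(s, g):
    """Number of G-by-G areas an S-by-S reference memory splits into."""
    assert s % g == 0, "G must be a divisor of S"
    return (s // g) ** 2

def region_index(x, y, g):
    """(row, column) of the G-by-G region containing pixel (x, y)."""
    return (y // g, x // g)
```

For instance, `num_regions(128, 64)` gives the four 64×64 areas above, and `num_regions(128, 32)` gives the sixteen 32×32 regions mentioned as an alternative.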
  • Intra String Copy (ISC), also known as string copy intra prediction, mainly divides a CU, according to some scanning order, into a series of pixel strings or unmatched pixels (i.e., pixels not matched to a reference pixel). Similar to the IBC mode, each pixel string in the string prediction sub-mode looks for a reference string of the same shape in the coded area of the current image, and thereby derives the predicted value of each pixel in the current string; by coding the residual between the pixel value and the predicted value of each pixel in the current string, instead of directly coding the pixel values, bits can be effectively saved.
  • in one implementation, each pixel string obtained by dividing the CU includes no unmatched pixels; for this implementation, FIG. 1i exemplarily provides a schematic diagram of intra-frame string copy.
  • in FIG. 1i, the dark gray area is the coded area; 28 white pixels constitute string 1, 35 light gray pixels constitute string 2, and 1 black pixel is an unmatched pixel.
• the length of all strings is restricted to an integer multiple of 4, and strings are divided into two types: matching strings and incomplete matching strings.
  • the matching string refers to a string whose length is an integer multiple of 4 and does not include unmatched pixels; the incomplete matching string refers to a string whose length is 4 and includes unmatched pixels.
• the length of the matching string may not be limited to an integer multiple of 4; it may also be an integer multiple of 5, or an integer multiple of 3, etc.; similarly, the length of the incomplete matching string may also not be limited to 4; it may also be 5, 6, and so on.
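The string-type rule described above can be sketched as follows. This is an illustrative classification under the AVS3-style length restriction (multiples of 4); the function name and return strings are assumptions, not spec terminology.

```python
def classify_string(length, has_unmatched_pixels):
    """Classify a pixel string under the rule above: a matching string has
    a length that is a multiple of 4 and no unmatched pixels; an incomplete
    matching string has length exactly 4 and contains unmatched pixels."""
    if length % 4 == 0 and not has_unmatched_pixels:
        return "matching string"
    if length == 4 and has_unmatched_pixels:
        return "incomplete matching string"
    return "invalid under this length restriction"

print(classify_string(28, False))  # matching string (e.g. string 1 in Fig. 1i)
print(classify_string(4, True))    # incomplete matching string
```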
• for the matching string, the predicted value can be derived according to the corresponding string displacement vector; and for the unmatched pixels in the incomplete matching string, the pixel value can be decoded from the code stream. That is, the pixel string obtained by dividing the CU may include unmatched pixels; in this embodiment, FIG. 1j exemplarily provides a schematic diagram of intra-frame string copy with string length resolution control. Among them, the dark gray area is the encoded area; 24 white pixels constitute string 1, one black pixel and the three pixels to its right constitute string 2, and 36 light gray pixels constitute string 3.
  • the intra-frame string copy technology also needs to encode the string displacement vector (String Vector, SV for short) corresponding to each string in the current CU, the string length, and the flag of whether there is a matching string, etc.
• the string displacement vector here can also be called a string vector, which represents the displacement from the current string to the reference string.
• the string length indicates the number of pixels contained in the current string.
• 1) Directly encode the length of the string.
• 2) Encode the number of pixels still to be processed after the current string, so that the decoding end can, according to the size P of the current CU and the number of processed pixels P1, decode the number of pixels to be processed P2 excluding the current string, and thereby calculate the length L of the current string as L = P − P1 − P2; wherein L and P are both integers greater than 0, and P1 and P2 are both integers greater than or equal to 0.
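The second length-signalling option above can be sketched as follows; the function name is an illustrative assumption.

```python
def string_length(P, P1, P2):
    """Decoder-side recovery of the current string length under option 2:
    L = P - P1 - P2, where P is the CU size in pixels, P1 the number of
    already-processed pixels, and P2 the number of pixels still to be
    processed after the current string."""
    L = P - P1 - P2
    assert L > 0 and P1 >= 0 and P2 >= 0
    return L

# A 64-pixel CU in which 24 pixels are already decoded and 28 pixels
# remain after the current string: the current string holds 12 pixels.
print(string_length(64, 24, 28))  # 12
```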
• the intra-frame string copy technology in AVS3 uses a reference range limit similar to that of IBC, using a 128 × 128 pixel memory to store the values of the reference pixels and the values of the pixels currently to be reconstructed, and uses a memory reuse strategy similar to that of IBC to improve the efficiency of this memory usage.
  • the Equivalent String and Unitary Vector String mode is a sub-mode of the Intra String Copy mode, which was adopted into the AVS3 standard in October 2020. Similar to the intra string copy mode, this mode divides a CU (ie encoding/decoding block) into a series of pixel strings or unmatched pixels according to a certain scanning order.
• the type of a pixel string can be an equivalent string or a unit vector string.
  • the characteristic of the equivalence string in this mode is that all pixels in the pixel string have the same predicted value.
• the feature of the unit vector string in this mode (also known as the unit base vector string, the unit offset string, the copy-above string, etc.) is that each pixel in the pixel string uses the reconstructed value of the pixel above it as the predicted value of the current pixel.
  • the length and prediction value of each string in the current CU need to be encoded; and the reference range of the equivalent string and unit vector string submodes is consistent with the reference range of the string prediction submode.
  • the embodiment of the present application proposes a coding and decoding scheme.
• on the one hand, the codec scheme allows CUs in the string prediction sub-mode of intra string copy to use unit vector strings, thereby improving the coding performance of string prediction; on the other hand, the codec scheme also allows pixels adjacent to the current coding unit to be derived as the reference of the current string, so as to obtain the predicted values of the pixels in the current string.
  • the encoding and decoding scheme may include a video encoding method and a video decoding method; the following describes the video encoding method and the video decoding method proposed by the embodiments of the present application with reference to FIG. 2 and FIG. 4 .
  • FIG. 2 is a flowchart of a video encoding method proposed by an embodiment of the present application; the video encoding method may be executed by the above-mentioned video encoding device or a video encoder in the video encoding device.
  • the video encoding method may include the following steps S201-S203:
• the video encoding device may receive the original video signal, and encode the images in the original video signal in sequence; correspondingly, the current image refers to the image currently being encoded, which may be any frame of image in the original video signal.
  • the video encoding device may first divide the current image into multiple processing units (such as CTUs or LCUs), and further divide each processing unit into one or more coding units ( That is, CU), thereby encoding each coding unit in turn.
  • the coding unit that is currently to be coded (that is, the coding is about to start at the moment) or the coding unit that is currently being coded (that is, some pixels have been coded) may be referred to as the current coding unit.
• for example, suppose the current image is divided into 5 CUs in total: CU1, CU2, CU3, CU4, and CU5. If each pixel in CU1–CU2 has been encoded, and each pixel in CU3–CU5 has not been encoded, then it can be determined according to the encoding order that CU3 is the coding unit currently to be coded, so the current coding unit is CU3.
• alternatively, if each pixel in CU1 has been encoded, some pixels in CU2 have been encoded (that is, some pixels of CU2 remain unencoded), and each pixel in CU3–CU5 has not been coded, then it can be determined that CU2 is the coding unit currently being coded, so the current coding unit is CU2.
  • the video encoding apparatus may use the ISC to encode each pixel in the current coding unit.
  • the current coding unit may be coded in the string prediction sub-mode in the intra string copy mode (ISC); and the coding unit in the string prediction sub-mode is allowed to use a unit vector string.
  • the current coding unit may also be coded in the sub-mode of the equivalent string and the unit vector string in the intra string copy mode (ISC), which is not limited in this application.
  • the current coding unit may include P rows and Q columns of pixels; wherein, the values of P and Q are both positive integers, and the values of P and Q may be equal or unequal, which is not limited in this application.
  • the values of P and Q may both be 64, then the current coding unit may include 64 ⁇ 64 pixels.
  • the video encoding device may determine the scan mode of the current coding unit, where the scan mode includes but is not limited to: a horizontal scan mode or a vertical scan mode.
• the horizontal scanning mode (also referred to as the horizontal round-trip scanning mode) refers to a scanning mode in which each row of pixels in the current coding unit is scanned sequentially from top to bottom; the vertical scanning mode (also referred to as the vertical round-trip scanning mode) refers to a scanning mode in which each column of pixels in the current coding unit is scanned sequentially from left to right.
• in other embodiments, the horizontal scan mode may also refer to a scanning mode in which each row of pixels in the current coding unit is scanned sequentially from bottom to top; and the vertical scan mode may also refer to a scanning mode in which each column of pixels in the current coding unit is scanned sequentially from right to left, which is not limited in this application.
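The two scan modes described above can be sketched as follows, assuming the "round-trip" (serpentine) variant in which the traversal direction reverses on every row or column; the exact traversal order defined by the standard may differ, and the function names are illustrative assumptions.

```python
def horizontal_scan(P, Q):
    """Yield (row, col) positions of a P x Q coding unit row by row,
    top to bottom, alternating direction on each row (round-trip)."""
    for r in range(P):
        cols = range(Q) if r % 2 == 0 else range(Q - 1, -1, -1)
        for c in cols:
            yield (r, c)

def vertical_scan(P, Q):
    """Yield (row, col) positions column by column, left to right,
    alternating direction on each column (round-trip)."""
    for c in range(Q):
        rows = range(P) if c % 2 == 0 else range(P - 1, -1, -1)
        for r in rows:
            yield (r, c)

print(list(horizontal_scan(2, 3)))
# [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
```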
  • the video encoding apparatus may divide the current coding unit into at least one string and/or unmatched pixels according to the scanning mode of the current coding unit, and sequentially perform encoding processing on the divided strings and/or unmatched pixels.
• the string currently to be encoded can be called the current string; since the AVS3 standard defines that the length of a string must be an integer multiple of 4, the current string can include 4L pixels, where L is a positive integer; but it should be understood that in other video coding standards, if the length of the string is not restricted, the current string may include one or more pixels.
  • the current string may or may not be a unit vector string; and when the current string is a unit vector string, it may be a matching string or an incomplete matching string, which is not limited in this application. If the current string is a unit vector string, step S202 can be executed.
  • the features of the unit vector strings mentioned in the embodiments of the present application are not exactly the same as the features of the unit vector strings in the above-mentioned equivalent string and unit vector string sub-modes.
• the features of the unit vector string mentioned in the embodiments of the present application are as follows: the predicted value of each pixel in the unit vector string is determined by the reconstructed value of the corresponding reference pixel, and the reference pixel corresponding to any pixel refers to the pixel located on the target side of that pixel and adjacent to that pixel.
• the target side mentioned here can be determined according to the scanning mode of the current coding unit. If the scanning mode of the current coding unit is the horizontal scanning mode, and the horizontal scanning mode indicates that the current coding unit is scanned in the order from top to bottom, then the target side of any pixel refers to the upper side of that pixel; that is, the reference pixel of any pixel refers to a certain pixel in the row above that pixel (such as the pixel directly above it). If the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in the order from left to right, then the target side of any pixel refers to the left side of that pixel; that is, the reference pixel of any pixel refers to a certain pixel in the column to the left of that pixel (such as the pixel directly to its left).
• in the embodiments of the present application, "above" refers to the orientation on the upper side of the horizontal line where the upper left corner of the current coding unit is located, and "directly above" refers to the orientation vertically upward from the position of the pixel; "left" refers to the orientation on the left side of the vertical line where the upper left corner of the current coding unit is located, and "directly left" refers to the orientation horizontally to the left of the position of the pixel, as shown in Figure 3b. It should be noted that the reconstructed values in the embodiments of the present application refer to pixel values that have not undergone loop filtering.
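The reference-pixel rule for unit vector strings described above can be sketched as follows; the coordinate convention (row, col) and the function name are illustrative assumptions.

```python
def reference_position(row, col, scan_mode):
    """Return the position of the reference pixel for pixel (row, col)
    in a unit vector string: directly above under horizontal scanning,
    directly to the left under vertical scanning."""
    if scan_mode == "horizontal":
        return (row - 1, col)   # pixel directly above
    if scan_mode == "vertical":
        return (row, col - 1)   # pixel directly to the left
    raise ValueError("unknown scan mode")

print(reference_position(5, 3, "horizontal"))  # (4, 3)
print(reference_position(5, 3, "vertical"))    # (5, 2)
```

Note that for a pixel in the first row (horizontal scanning) or first column (vertical scanning) of the current coding unit, the position returned lies outside the coding unit, which is exactly the "first pixel" case discussed next.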
  • the historical coding unit refers to the coded coding unit adjacent to the current coding unit in the current image; the reference pixel corresponding to the first pixel and the first pixel are adjacent in the current image, and the first pixel satisfies the following conditions: The reference pixel corresponding to one pixel is not located in the current coding unit. It should be noted that the fact that the historical coding unit and the current coding unit are adjacent means that no other coding unit exists between the historical coding unit and the current coding unit.
• the fact that the reference pixel corresponding to the first pixel and the first pixel are adjacent in the current image means: in the current image, the row where the reference pixel of the first pixel is located and the row where the first pixel is located are adjacent to each other (that is, next to each other). For example, in the current image, assume the row where the reference pixel of the first pixel is located is the 5th row; if the row where the first pixel is located is the 6th row, the two rows can be considered adjacent to each other; if the row where the first pixel is located is the 7th row, the two rows can be considered not adjacent to each other.
• alternatively, the fact that the reference pixel corresponding to the first pixel and the first pixel are adjacent in the current image means: in the current image, the column where the reference pixel of the first pixel is located and the column where the first pixel is located are adjacent to each other (that is, next to each other). For example, in the current image, assume the column where the reference pixel of the first pixel is located is the 5th column; if the column where the first pixel is located is the 6th column, the two columns can be considered adjacent to each other; if the column where the first pixel is located is the 7th column, the two columns can be considered not adjacent to each other.
  • the first pixel should be a pixel on a critical row or a critical column in the current coding unit;
  • the critical row here can also be called an edge row, which may be the first row or the last row of the current coding unit; similarly, the critical column may also be called the edge column, which may be the first column or the last column of the current coding unit.
• the specific meaning of the first pixel may be determined according to the scan mode of the current coding unit. For example, if the scanning mode of the current coding unit is the horizontal scanning mode, and the horizontal scanning mode indicates that the current coding unit is scanned in the order from top to bottom, then the historical coding unit is located above the current coding unit (that is, the historical coding unit is the coded CU located above the current coding unit and adjacent to it), and the first pixel is a pixel located in the first row of the current coding unit; that is, in this case, the current string includes at least one pixel in the first row of the current coding unit.
• if the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in the order from left to right, then the historical coding unit is located to the left of the current coding unit (that is, the historical coding unit is the coded CU located to the left of the current coding unit and adjacent to it), and the first pixel is a pixel located in the first column of the current coding unit; that is, in this case, the current string includes at least one pixel in the first column of the current coding unit.
• S203: Acquire the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel to obtain encoding information of the current coding unit.
• specifically, the reconstructed value of the reference pixel of the first pixel can be obtained first and used as the predicted value of the first pixel; then, the residual between the pixel value of the first pixel and the predicted value can be encoded to obtain the coding information of the current coding unit; this coding method can reduce the number of bits and improve the coding efficiency.
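The residual-coding idea above can be sketched minimally as follows; the function names are illustrative assumptions.

```python
def encode_residual(pixel_value, reference_reconstruction):
    """The reconstructed value of the reference pixel serves as the
    prediction; only the (typically small) residual is encoded."""
    prediction = reference_reconstruction
    return pixel_value - prediction

def decode_pixel(residual, reference_reconstruction):
    """Decoder side: prediction plus decoded residual."""
    return reference_reconstruction + residual

res = encode_residual(130, 128)
print(res)                     # 2
print(decode_pixel(res, 128))  # 130
```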
  • indication information (such as a string vector, string length, etc.) for indicating the predicted value may be encoded to obtain encoding information of the current coding unit.
• in different cases, the acquisition method of the reconstructed value of the reference pixel of the first pixel also differs; refer to the following description for details:
• the acquisition mode of the reconstructed value of the reference pixel of the first pixel includes any of the following:
• The first type: obtain the reconstructed value of the reference pixel of the first pixel from the first storage space corresponding to the intra-frame string copy mode (or referred to as the reference pixel memory of the intra-frame string copy mode).
• the first storage space in the intra-string copy mode may be the 128 × 128 memory mentioned in the aforementioned IBC content, and the first storage space may be divided into four 64 × 64 areas for multiplexing of the reference pixel memory; but it should be noted that, as mentioned in the aforementioned IBC-related content, the first storage space is not limited to a 128 × 128 memory, nor is it limited to being divided into four 64 × 64 areas.
• if the prediction encoding mode of the current coding unit is the string prediction sub-mode in the intra-frame string copy mode, or the equivalent string and unit vector string sub-mode, the value of the first pixel will be stored in the first storage space; then obtaining the reconstructed value of the reference pixel of the first pixel from the first storage space as the predicted value of the first pixel realizes the assignment of the predicted value within one storage space, which can effectively improve the coding efficiency.
• The second type: obtain the reconstructed value of the reference pixel of the first pixel from the second storage space corresponding to the intra-frame prediction mode (or referred to as the reference pixel memory of the intra-frame prediction mode). If the reference pixel of the first pixel is encoded using the intra prediction mode, its reconstructed value is located in the second storage space; and if the reference pixel of the first pixel is encoded using the intra-frame string copy mode, its reconstructed value may be overwritten due to memory multiplexing. Therefore, in order to improve the success rate of obtaining the reconstructed value of the reference pixel of the first pixel, the reconstructed value may be obtained directly from the second storage space.
• The third type: divide the current image into multiple N × N areas; when the first pixel is located in the first row of any N × N area, obtain the reconstructed value of the reference pixel of the first pixel from the second storage space; otherwise, obtain it from the first storage space.
• the value of N may be determined according to an empirical value, the size of the processing unit, or the size of the current CU. For example, if the size of the processing unit is 128 × 128, the current image can be divided into multiple 128 × 128 areas. When the first pixel is located in the first row of any 128 × 128 area, then referring to the reference-range limitation shown in Fig. 1h, the first storage space only stores values of the current CU (such as the reconstructed values of reconstructed pixels) and of the CU to the left of the current CU; it does not store the reconstructed value of the reference pixel of the first pixel (that is, the pixel located above the current CU and adjacent to the first pixel), so the reconstructed value of the reference pixel of the first pixel is obtained from the second storage space, improving the success rate of obtaining the reconstructed value. When the first pixel is not located in the first row of any 128 × 128 area, the reference pixel of the first pixel is located inside the current CU, so the reconstructed value of the reference pixel exists in the first storage space, and correspondingly the reconstructed value of the reference pixel of the first pixel is acquired from the first storage space, improving the acquisition efficiency of the reconstructed value.
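The third acquisition rule above can be sketched as follows for the horizontal scan case. This is an illustrative assumption: it assumes the N × N areas are aligned to the image origin, and the storage-space names mirror the description above.

```python
def storage_space_for(first_pixel_y, N):
    """Decide which memory the reconstructed reference value is read from,
    given the first pixel's y coordinate in the image and the area size N:
    first row of an N x N area -> intra-prediction memory (second storage
    space); otherwise -> intra-string-copy memory (first storage space)."""
    if first_pixel_y % N == 0:
        return "second storage space"
    return "first storage space"

print(storage_space_for(128, 128))  # second storage space
print(storage_space_for(130, 128))  # first storage space
```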
• the pixels in the current string may all be second pixels (that is, the current string does not include the first pixel), or the current string may include both the first pixel and second pixels (that is, in addition to the first pixel, the current string also includes second pixels); wherein, a second pixel is a pixel located outside the first row of the current coding unit, and the second pixels in the current string are distributed in at least one row of the current coding unit. The computer device can then derive the predicted value of each second pixel in the current string in any of the following ways:
• Mode A: The predicted values of the second pixels in each row of the current string are acquired row by row, in units of rows. For example, assuming that the second pixels in the current string are distributed in the second row to the third row of the current coding unit, the predicted values of the second pixels in the second row can be derived at one time in the order from top to bottom, and then the predicted values of the second pixels in the third row can be derived at one time; wherein, the second pixels in each row use the reconstructed values of the reference pixels in the row above as predicted values.
• Mode A obtains the predicted values of the second pixels row by row, which can effectively improve the efficiency of obtaining the predicted values, thereby improving the coding efficiency.
• Mode B: Using a single pixel as a unit, the predicted value of each second pixel in the current string is acquired pixel by pixel. For example, assuming that the second pixels in the current string are distributed in the second row to the third row of the current coding unit, the predicted value of each second pixel in the second row can be derived pixel by pixel, and then the predicted value of each second pixel in the third row can be derived pixel by pixel; wherein, each second pixel uses the reconstructed value of the reference pixel above it as a predicted value.
  • the predicted value of any second pixel is obtained according to the reconstructed value of the reference pixel located above the second pixel and adjacent to the second pixel.
  • the reconstructed value of the reference pixel above any second pixel and adjacent to the second pixel can be directly used as the predicted value of the second pixel.
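The two derivation orders above can be sketched as follows for a horizontally scanned unit vector string. The 2-D list `recon` standing in for the reconstruction buffer, and the function names, are illustrative assumptions; both orders yield the same predictions, copying the reconstructed values of the row above.

```python
def predict_row_at_once(recon, row, col_start, col_end):
    """Row-by-row derivation: predictions for one row of second pixels,
    taken as a whole slice of the row above."""
    return recon[row - 1][col_start:col_end]

def predict_pixel_by_pixel(recon, row, col_start, col_end):
    """Pixel-by-pixel derivation: the same predictions, one at a time."""
    return [recon[row - 1][c] for c in range(col_start, col_end)]

recon = [[10, 11, 12, 13],
         [None, None, None, None]]  # row 1 not yet reconstructed
print(predict_row_at_once(recon, 1, 0, 4))     # [10, 11, 12, 13]
print(predict_pixel_by_pixel(recon, 1, 0, 4))  # [10, 11, 12, 13]
```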
• in another implementation, the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in the order from left to right; in this case, the acquisition mode of the reconstructed value of the reference pixel of the first pixel includes any of the following:
• The first type: obtain the reconstructed value of the reference pixel of the first pixel from the first storage space corresponding to the intra-frame string copy mode (or referred to as the reference pixel memory of the intra-frame string copy mode).
• The second type: obtain the reconstructed value of the reference pixel of the first pixel from the second storage space corresponding to the intra-frame prediction mode (or referred to as the reference pixel memory of the intra-frame prediction mode).
• The third type: divide the current image into multiple N × N areas; when the first pixel is located in the first column of any N × N area, obtain the reconstructed value of the reference pixel of the first pixel from the second storage space; otherwise, obtain it from the first storage space.
• the pixels in the current string may all be third pixels (that is, the current string does not include the first pixel), or the current string may include both the first pixel and third pixels (that is, in addition to the first pixel, the current string also includes third pixels); wherein, a third pixel is a pixel located outside the first column of the current coding unit, and the third pixels in the current string are distributed in at least one column of the current coding unit. The computer device can then derive the predicted value of each third pixel in the current string in any of the following ways:
• Mode a: Obtain the predicted values of the third pixels in each column of the current string column by column, in units of columns. For example, assuming that the third pixels in the current string are distributed in the second column to the third column of the current coding unit, the predicted values of the third pixels in the second column can be derived at one time in the order from left to right, and then the predicted values of the third pixels in the third column can be derived at one time; wherein, the third pixels in each column use the reconstructed values of the reference pixels in the column to the left as predicted values.
• Mode a obtains the predicted values of the third pixels column by column, which can effectively improve the efficiency of obtaining the predicted values, thereby improving the coding efficiency.
• Mode b: Using a single pixel as a unit, obtain the predicted value of each third pixel in the current string pixel by pixel. For example, assuming that the third pixels in the current string are distributed in the second column to the third column of the current coding unit, then in the order from left to right, first derive the predicted value of each third pixel in the second column pixel by pixel, and then derive the predicted value of each third pixel in the third column pixel by pixel; wherein, each third pixel uses the reconstructed value of the reference pixel located to its left as the predicted value.
  • the predicted value of any third pixel is obtained according to the reconstructed value of the reference pixel located to the left of the third pixel and adjacent to the third pixel.
  • the reconstructed value of the reference pixel to the left of any third pixel and adjacent to the third pixel may be directly used as the predicted value of the third pixel.
  • the video encoding device may also encode a string indicator, so that the encoding information includes the string indicator; wherein, if the current string is a unit vector string, the string indicator is the target value.
  • the target value can be set according to experience value or business requirements, for example, the target value can be set to 0 or 1 or the like.
• the string indication flag may be a binary flag to indicate whether the current string is a unit vector string. For example, when the string indicator is 1 (that is, the target value is 1), it can indicate that the current string is a unit vector string; when the string indicator is 0, it can indicate that the current string is not a unit vector string.
• in another implementation, the video encoding device can also encode a value K, so that the encoding information includes the value K; the value of K is greater than or equal to zero, and K is used to indicate that, among the target group of pixels in the current coding unit, at least one string located after the Kth group of pixels is a unit vector string.
  • the current coding unit may be divided into multiple groups of pixels; a group of pixels includes: a string or at least one unmatched pixel.
• when the video encoding device performs string division on the current coding unit, unmatched pixels may be obtained; in this case, these divided unmatched pixels may constitute one or more groups of pixels.
• if, among the divided unmatched pixels, the distribution positions of at least two unmatched pixels in the current coding unit are continuous, all of the at least two unmatched pixels may be regarded as one group of pixels; or some continuous unmatched pixels may be selected from the at least two unmatched pixels as one group of pixels, with each unselected unmatched pixel serving as a group of pixels by itself. If there is an independent unmatched pixel among the divided unmatched pixels (i.e., an unmatched pixel between two strings), that independent unmatched pixel can be regarded as one group of pixels.
  • the embodiments of the present application merely exemplify several division manners for dividing the unmatched pixels into one or more groups of pixels, and are not exhaustive.
• the division mode can be set by the encoding end according to requirements. If the video encoding device uses the intra-frame string copy technique with string length resolution control as shown in FIG. 1j to perform string division on the current coding unit, the current coding unit will be divided into multiple strings (such as matching strings and/or incomplete matching strings); in this case, each matching string can be a group of pixels, and each incomplete matching string can also be a group of pixels.
  • the number of K included in the coding information in the current coding unit may be one, and the target group of pixels may be all groups of pixels in the current coding unit; in this case, the value K may be used to indicate the current coding unit All strings located after the Kth group of pixels in all groups of pixels in are unit vector strings.
  • the value K may be used to indicate that the K+1th string in all groups of pixels in the current coding unit is a unit vector string.
• in another embodiment, each group of pixels includes one string; the number of Ks included in the encoding information of the current coding unit may be multiple, with one K corresponding to one target group of pixels; in this embodiment, the i-th K can be used to indicate that the (K+1)-th string in the target group of pixels corresponding to the i-th K in the current coding unit is a unit vector string.
• the target group of pixels corresponding to the i-th K is the remaining groups of pixels in the current coding unit whose string type (unit vector string or non-unit vector string) has not been determined according to the previous i−1 Ks.
• for example, the target group of pixels corresponding to the first K is all groups of pixels in the current coding unit; if the first K equals 2, it indicates that the strings in the 1st and 2nd groups of pixels in the current coding unit are not unit vector strings, while the string in the 3rd group of pixels is a unit vector string.
• the target group of pixels corresponding to the second K is the remaining groups of pixels whose string type is not determined according to the first K, that is, the 4th group of pixels and the groups of pixels located after it; if the second K equals 1, it indicates that the string in the 4th group of pixels is not a unit vector string, and the string in the 5th group of pixels is a unit vector string.
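The multiple-K signalling described above can be sketched as follows. The decoding loop is an assumption about how such a list of K values could be interpreted (each K: the next K strings among the still-undetermined groups are not unit vector strings, and the one after them is), not the normative parsing procedure.

```python
def decode_unit_vector_flags(num_groups, k_values):
    """Turn a sequence of K values into per-group unit-vector-string flags
    (True = unit vector string), each group holding one string."""
    flags = []
    for k in k_values:
        flags.extend([False] * k)      # next k strings: not unit vector
        if len(flags) < num_groups:
            flags.append(True)         # the (k+1)-th string: unit vector
    flags.extend([False] * (num_groups - len(flags)))
    return flags[:num_groups]

# First K = 2 (groups 1-2 not UVS, group 3 is), second K = 1
# (group 4 not UVS, group 5 is):
print(decode_unit_vector_flags(5, [2, 1]))
# [False, False, True, False, True]
```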
• in another implementation, the video encoding device can also encode the string displacement vector of the current string, so that the encoding information includes the string displacement vector; wherein, if the current string is a unit vector string, the string displacement vector is a unit vector.
  • the unit vector here may include a first element and a second element. If the scanning mode of the current coding unit is the horizontal scanning mode, the element value of the first element is the first value, and the element value of the second element is the second value; if the scanning mode of the current coding unit is the vertical scanning mode, the The element value of one element is the second value, and the element value of the second element is the first value.
  • the specific values of the first numerical value and the second numerical value can be set according to empirical values or business requirements; for example, setting the first numerical value to 0 and setting the second numerical value to -1: then when the scanning mode of the current coding unit is the horizontal scanning mode , the unit vector is (0, -1); when the scan mode of the current coding unit is the vertical scan mode, the unit vector is (-1, 0).
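Using the example values above (first value 0, second value −1), the unit string vector can be sketched as follows; the function name and parameter defaults are illustrative assumptions.

```python
def unit_string_vector(scan_mode, first=0, second=-1):
    """Return the unit string vector for a unit vector string: under
    horizontal scanning the first element takes the first value and the
    second element the second value, giving (0, -1); under vertical
    scanning the elements are swapped, giving (-1, 0)."""
    if scan_mode == "horizontal":
        return (first, second)   # reference is one row above
    if scan_mode == "vertical":
        return (second, first)   # reference is one column to the left
    raise ValueError("unknown scan mode")

print(unit_string_vector("horizontal"))  # (0, -1)
print(unit_string_vector("vertical"))    # (-1, 0)
```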
• the current image is divided into multiple processing units, each processing unit includes one or more encoding units; and the multiple processing units include: the current processing unit and the encoded processing unit, and the current processing unit includes the current encoding unit.
  • the string displacement vector is used to represent the displacement from the current string to the reference string.
  • the reference string of the current string can be allowed to not satisfy one or more of the following conditions:
• any reference pixel in the reference string is within the range of the specified area; for example, the range of the specified area can be the range of the current processing unit, or the range of the H decoded processing units to the left of the current processing unit, where H is a positive integer.
• the value of H may be determined according to the size of the processing unit (i.e., the LCU or CTU); specifically, H may be determined according to the following formula, where "<<" denotes a left shift (shifting all the binary bits of a number to the left by K bits, K being a positive integer greater than or equal to 1, discarding the high bits and filling the low bits with 0), and "x ? a : b" selects a when condition x holds and b otherwise:
• H = (1 << ((7 - (log2_lcu_size_minus2 + 2)) << 1)) - (((log2_lcu_size_minus2 + 2) < 7) ? 1 : 0)
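Under this reconstruction of the formula (the parenthesization of the garbled original is inferred; the function name below is an assumption for illustration), H can be computed as follows:

```python
def num_left_units(log2_lcu_size_minus2: int) -> int:
    """Compute H, the number of decoded processing units (LCUs/CTUs) to the
    left of the current processing unit, from the coded LCU-size syntax
    element: H = (1 << ((7 - lcu_log2) << 1)) - (1 if lcu_log2 < 7 else 0)."""
    lcu_log2 = log2_lcu_size_minus2 + 2      # log2 of the LCU size
    return (1 << ((7 - lcu_log2) << 1)) - (1 if lcu_log2 < 7 else 0)
```

For example, a 128x128 LCU (lcu_log2 = 7) gives H = 1, a 64x64 LCU gives H = 3, and a 32x32 LCU gives H = 15, so the total referenced area stays roughly constant as the LCU size shrinks.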
  • H can also be determined according to the following formula:
• both the predetermined pixels and the target area can be determined according to the size of the processing unit (i.e., the LCU or CTU); specifically, if the size of the processing unit is M×M, the predetermined pixels are M pixels, and the target area is the area corresponding to any reference pixel of the reference string after it is moved to the right by M pixels.
• for example, when the size of the processing unit is 128×128, the predetermined pixels can be 128 pixels, and the target area is the 64×64 area corresponding to any pixel in the reference string after it is moved to the right by 128 pixels.
• "the pixels in the target area where any reference pixel of the reference string lands after being moved to the right by the predetermined pixels have not yet been reconstructed" means: either none of the pixels in that target area have been reconstructed, or the pixel at the upper-left corner of that target area has not been reconstructed.
• the reference area containing any reference pixel of the reference string in the processing unit adjacent to the left has a corresponding area at the same location in the current processing unit, and the coordinates of the upper-left corner of that location area should not be the same as the coordinates of the upper-left corner of the current CU.
  • (xRefA, yRefA) and (xRefB, yRefB) are the coordinates of any two reference pixels (luminance components) A and B in the reference string. It should be noted that "/" here represents division by rounding down.
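As a hedged illustration of the alignment condition these coordinates serve (each reference pixel lying in the same alignment area, with "/" rounding down as stated above), two reference pixels A and B fall in the same aligned area exactly when their coordinates agree after integer division by the area size; the function name and the 64-pixel default area size are assumptions for illustration only:

```python
def same_aligned_area(xRefA, yRefA, xRefB, yRefB, size=64):
    """True when reference pixels A and B lie in the same size-aligned area;
    '//' is division rounded down, matching the '/' of the text."""
    return (xRefA // size, yRefA // size) == (xRefB // size, yRefB // size)
```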
  • Any reference pixel in the reference string does not belong to the current string, that is, the reference string and the current string do not overlap.
• the video encoding apparatus may also encode a string scan mode flag (or referred to as a string prediction scan mode flag), so that the encoded information includes the string scan mode flag.
• if the scanning mode of the current coding unit is the horizontal scanning mode, the string scan mode flag is the third value; if the scanning mode of the current coding unit is the vertical scanning mode, the string scan mode flag is the fourth value.
• the third value and the fourth value can both be set according to empirical values or business requirements; for example, the third value is set to 0, and the fourth value is set to 1.
• the video encoding device may also search for the reference string of the current string from the current processing unit and some encoded coding units in the encoded processing unit, and obtain the predicted value of each pixel in the current string according to the reference string, so as to obtain the encoding information of the current coding unit.
  • the reference string must meet one or more of the following conditions:
  • Any reference pixel in the reference string is within the range of the current processing unit and the H decoded processing units to the left of the current processing unit, where H is a positive integer.
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit.
  • Each reference pixel in the reference string and each pixel in the current string are located in the same independent coding area, where the independent coding area may include the current image, or slices and strips in the current image.
• the video encoding device can directly encode the unmatched pixels, so that the pixel values of the unmatched pixels are included in the encoded information.
• the current string of the current coding unit is a unit vector string, and the current string includes a first pixel (for example, a pixel in the first row or the first column of the current coding unit).
  • the reference pixel of the first pixel can be determined from the historical decoding unit in the current image, and the predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel to implement the encoding process.
• any string in the current coding unit can become a unit vector string, which can effectively widen the scope of application of the unit vector string and helps to improve encoding performance.
  • the predictive coding mode adopted for the current coding unit in the embodiments of the present application is not limited to the sub-modes of the equivalent string and the unit vector string, but may also be other predictive coding modes such as the string prediction sub-mode in the intra-frame string copy mode; That is to say, the embodiment of the present application can allow the coding unit in the string prediction sub-mode to use the unit vector string, which can further widen the applicable range of the unit vector string and improve the coding performance of the string prediction.
  • FIG. 4 is a flowchart of a video decoding method proposed by an embodiment of the present application; the video decoding method may be executed by the above-mentioned video decoding device or a video decoder in the video decoding device.
  • the video decoding method may include the following steps S401-S403:
• the video decoding device can obtain the encoding information of the current coding unit in the current image, and the current coding unit includes pixels of P rows and Q columns, wherein the values of P and Q are both positive integers. Then, the current string to be decoded may be determined from the current coding unit according to the coding information of the current coding unit. Specifically, the video decoding device can decode the predictive coding mode of the current coding unit from the coding information of the current coding unit; if the predictive coding mode is the string prediction sub-mode in the ISC or the sub-mode of the equivalent string and unit vector string, then the scan mode and string length are further decoded from the coding information of the current coding unit. Then, the current string to be decoded may be determined from the current coding unit according to the scan mode and the string length.
  • the current coding units on the decoding side all refer to: the CU that is currently to be decoded (that is, the decoding is about to start currently) or the CU that is currently being decoded (that is, some pixels have been decoded).
  • the current coding unit may be coded in a string prediction sub-mode in an intra string copy mode (ISC); and the coding unit in the string prediction sub-mode is allowed to use a unit vector string.
  • the current coding unit may also be coded in the sub-mode of the equivalent string and the unit vector string in the intra string copy mode (ISC), which is not limited in this application.
  • ISC intra string copy mode
  • the video decoding device can use any of the following methods to determine whether the current string is a unit vector string:
  • the video decoding device can decode the string indicator from the encoding information of the current coding unit; if the string indicator is the target value, the current string is determined to be a unit vector string.
• the video decoding device can decode the value K from the encoding information of the current coding unit, where the value of K is greater than or equal to zero; the value K is used to indicate that, in the target group of pixels of the current coding unit, at least one string located after the Kth group of pixels is a unit vector string, and a group of pixels includes: one string or at least one unmatched pixel. If the current string is located after the Kth group of pixels in the target group of pixels, the current string is determined to be a unit vector string.
• the value K can be specifically used to indicate that, in the target group of pixels of the current coding unit, the (K+1)-th string is a unit vector string; in this embodiment, it can be further judged whether the current string is the (K+1)-th string in the target group of pixels; if so, the current string is determined to be a unit vector string.
  • the video decoding device can decode the string displacement vector of the current string from the encoding information of the current coding unit; if the string displacement vector is a unit vector, the current string is determined to be a unit vector string.
• the unit vector may include a first element and a second element. If the scanning mode of the current coding unit is the horizontal scanning mode, the element value of the first element is the first value, and the element value of the second element is the second value; if the scanning mode of the current coding unit is the vertical scanning mode, the element value of the first element is the second value, and the element value of the second element is the first value.
• the specific values of the first value and the second value can be set according to empirical values or business requirements; for example, setting the first value to 0 and the second value to -1: when the scanning mode of the current coding unit is the horizontal scanning mode, the unit vector is (0, -1); when the scanning mode of the current coding unit is the vertical scanning mode, the unit vector is (-1, 0).
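The three alternative decoder-side checks described above can be sketched as follows; the field names (`flag`, `k`, `sv`) and the target value 1 are illustrative assumptions, not actual bitstream syntax:

```python
def is_unit_vector_string(method, info, string_index=0, scan_mode="horizontal"):
    """Decide whether the current string is a unit vector string, using one
    of the three methods described in the text (example values: first value
    0, second value -1, target value 1)."""
    unit = (0, -1) if scan_mode == "horizontal" else (-1, 0)
    if method == "flag":           # method 1: per-string indicator flag
        return info["flag"] == 1   # 1 is an assumed target value
    if method == "value_k":        # method 2: strings after the K-th group
        return string_index > info["k"]
    if method == "displacement":   # method 3: decoded string displacement vector
        return tuple(info["sv"]) == unit
    raise ValueError(method)
```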
• the determination method of the scanning mode of the current coding unit may be as follows: decode the string scan mode flag from the coding information of the current coding unit; if the string scan mode flag is the third value, determine that the scanning mode of the current coding unit is the horizontal scanning mode; if the string scan mode flag is the fourth value, determine that the scanning mode of the current coding unit is the vertical scanning mode.
  • the current image is divided into multiple processing units, each processing unit includes one or more coding units; and the multiple processing units include: the current processing unit and the decoded processing unit, and the current processing unit includes the current coding unit .
  • the string displacement vector is used to represent the displacement from the current string to the reference string.
  • the reference string of the current string can be allowed to not satisfy one or more of the following conditions:
  • Any reference pixel in the reference string is within the range of the specified area; for example, the range of the specified area may be the range of the current processing unit, or the range of the H decoded processing units to the left of the current processing unit.
• the reference area containing any reference pixel of the reference string in the processing unit adjacent to the left has a corresponding area at the same location in the current processing unit, and the coordinates of the upper-left corner of that location area should not be the same as the coordinates of the upper-left corner of the current CU.
  • Each reference pixel in the reference string is located in the same alignment area.
• the historical decoding unit refers to the decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel corresponding to the first pixel and the first pixel are adjacent in the current image; the first pixel satisfies the following condition: the reference pixel corresponding to the first pixel is not located in the current coding unit.
  • the fact that the historical decoding unit and the current coding unit are adjacent means that no other coding unit exists between the historical decoding unit and the current coding unit.
• the fact that the reference pixel corresponding to the first pixel and the first pixel are adjacent in the current image means: in the current image, the row where the reference pixel of the first pixel is located and the row where the first pixel is located are adjacent to each other (that is, next to each other).
• alternatively, it means: in the current image, the column where the reference pixel of the first pixel is located and the column where the first pixel is located are adjacent to each other.
  • the first pixel should be a pixel on a critical row or a critical column in the current coding unit;
• the critical row here may also be called an edge row, which can be the first row or the last row of the current coding unit;
  • the critical column can also be called the edge column, which can be the first column or the last column of the current coding unit.
  • the specific meaning of the first pixel may be determined according to the scan mode of the current coding unit.
• if the scanning mode of the current coding unit is the horizontal scanning mode, and the horizontal scanning mode indicates that the current coding unit is scanned in order from top to bottom, then the historical decoding unit is located above the current coding unit (that is, the historical decoding unit is the decoded CU located above and adjacent to the current coding unit), and the first pixel is a pixel located in the first row of the current coding unit; that is, in this case, the current string includes at least one pixel in the first row of the current coding unit.
• if the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in order from left to right, then the historical decoding unit is located to the left of the current coding unit (that is, the historical decoding unit is the decoded CU located to the left of and adjacent to the current coding unit), and the first pixel is a pixel located in the first column of the current coding unit; that is, in this case, the current string includes at least one pixel in the first column of the current coding unit.
  • S403 Obtain a predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel to obtain a decoded image.
• the reconstructed value of the reference pixel of the first pixel can be obtained first, and the reconstructed value of the reference pixel of the first pixel can be used as the predicted value of the first pixel to obtain a decoded image; this decoding method can simplify the acquisition of the predicted value and improve decoding efficiency.
• the method of acquiring the reconstructed value of the reference pixel of the first pixel differs depending on the scanning mode of the current coding unit; see the following description for details:
• if the scanning mode of the current coding unit is the horizontal scanning mode, the acquisition mode of the reconstructed value of the reference pixel of the first pixel includes any of the following:
• the first way: obtain the reconstructed value of the reference pixel of the first pixel from the first storage space corresponding to the intra-frame string copy mode (or referred to as the reference pixel memory of the intra-frame string copy mode).
• the second way: obtain the reconstructed value of the reference pixel of the first pixel from the second storage space corresponding to the intra-frame prediction mode (or referred to as the reference pixel memory of the intra-frame prediction mode).
• the third way: divide the current image into multiple N×N areas; when the first pixel is located in the first row of any N×N area, obtain the reconstructed value of the reference pixel of the first pixel from the second storage space; otherwise, obtain the reconstructed value of the reference pixel of the first pixel from the first storage space.
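The third acquisition method above can be sketched for the horizontal-scan case as follows; the function name and the use of an image-coordinate row index are illustrative assumptions:

```python
def reference_memory_for_first_pixel(y: int, n: int) -> str:
    """Pick the storage space holding the reference pixel's reconstructed
    value: the image is divided into N x N areas, and if the first pixel lies
    in the first row of an N x N area (its image row index is a multiple of
    N), the second storage space (intra-prediction memory) is used;
    otherwise the first storage space (intra string copy memory) is used."""
    return "second" if y % n == 0 else "first"
```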
• the pixels in the current string may all be second pixels (that is, the current string does not include the first pixel), or the current string may include both the first pixel and the second pixel (that is, the current string includes a second pixel in addition to the first pixel); wherein, the second pixel is a pixel located outside the first row of the current coding unit, and each second pixel in the current string is distributed in at least one row of the current coding unit. The computer device can also derive the predicted value of each second pixel in the current string in any of the following ways:
• Mode A: in units of rows, acquire the predicted value of the second pixels in each row of the current string row by row.
• Mode B: in units of single pixels, acquire the predicted value of each second pixel in the current string pixel by pixel.
• in the above modes A and B, the predicted value of any second pixel is obtained according to the reconstructed value of the reference pixel located above and adjacent to that second pixel.
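Mode A above can be sketched for a horizontal-scan unit vector string as follows; this is a minimal sketch assuming `recon` is a 2-D list of already reconstructed sample rows, with all names illustrative:

```python
def predict_second_pixels_by_row(recon, rows, cols):
    """Row-by-row prediction: every second pixel (a pixel outside the first
    row) copies the reconstructed value of the pixel directly above it."""
    return [[recon[r - 1][c] for c in cols] for r in rows]
```

For example, with `recon = [[1, 2, 3], [4, 5, 6]]`, the pixels of row 1 are predicted as `[1, 2, 3]`, i.e. the reconstructed row above.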
• if the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in order from left to right, then the acquisition mode of the reconstructed value of the reference pixel of the first pixel includes any of the following:
• the first way: obtain the reconstructed value of the reference pixel of the first pixel from the first storage space corresponding to the intra-frame string copy mode.
• the second way: obtain the reconstructed value of the reference pixel of the first pixel from the second storage space corresponding to the intra prediction mode.
• the third way: divide the current image into multiple N×N areas; when the first pixel is located in the first column of any N×N area, obtain the reconstructed value of the reference pixel of the first pixel from the second storage space; otherwise, obtain the reconstructed value of the reference pixel of the first pixel from the first storage space.
• the pixels in the current string may all be third pixels (that is, the current string does not include the first pixel), or the current string may include both the first pixel and the third pixel (that is, the current string includes a third pixel in addition to the first pixel); wherein, the third pixel is a pixel located outside the first column of the current coding unit, and each third pixel in the current string is distributed in at least one column of the current coding unit. The computer device can also derive the predicted value of each third pixel in the current string in any of the following ways:
• Mode a: in units of columns, acquire the predicted value of the third pixels in each column of the current string column by column.
• Mode b: in units of single pixels, acquire the predicted value of each third pixel in the current string pixel by pixel. In the above modes a and b, the predicted value of any third pixel is obtained according to the reconstructed value of the reference pixel located to the left of and adjacent to that third pixel.
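Mode a above can be sketched for the vertical-scan case symmetrically; a minimal sketch assuming `recon` is a 2-D list indexed `[row][column]`, with all names illustrative:

```python
def predict_third_pixels_by_column(recon, cols, rows):
    """Column-by-column prediction: every third pixel (a pixel outside the
    first column) copies the reconstructed value of the pixel to its left."""
    return [[recon[r][c - 1] for r in rows] for c in cols]
```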
  • the video decoding device can also search for the reference string of the current string from the current processing unit and some decoded coding units in the decoded processing unit; and according to the reference string Get the predicted value of each pixel in the current string to get the decoded image.
  • the reference string must meet one or more of the following conditions:
  • Any reference pixel in the reference string is within the range of the current processing unit and the H decoded processing units to the left of the current processing unit, where H is a positive integer.
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit.
  • Each reference pixel in the reference string and each pixel in the current string are located in the same independent coding area, where the independent coding area may include the current image, or slices and strips in the current image.
  • the pixel values of the unmatched pixels can be directly decoded from the coding information of the current coding unit.
• the current string of the current coding unit is a unit vector string, and the current string includes a first pixel (for example, a pixel in the first row or the first column of the current coding unit); the reference pixel of the first pixel can be determined from the historical decoding unit in the current image, and the predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel to implement the decoding process.
• any string in the current coding unit can become a unit vector string, which can effectively widen the applicable range of the unit vector string and helps to improve decoding performance.
• the predictive coding mode adopted for the current coding unit in the embodiments of the present application is not limited to the sub-mode of the equivalent string and the unit vector string, but may also be another predictive coding mode such as the string prediction sub-mode in the intra-frame string copy mode; that is to say, the embodiment of the present application can allow the coding unit in the string prediction sub-mode to use the unit vector string, which can further widen the applicable range of the unit vector string and improve the coding performance of string prediction.
• the embodiment of the present application proposes a string prediction method based on a unit vector string; this method is suitable for codecs using the intra-frame string copy mode, and is described below from the decoding side:
• if the scanning mode is the horizontal scanning mode, the unit vector is (0, -1); if the scanning mode is the vertical scanning mode, the unit vector is (-1, 0).
  • a) Decode a binary flag (that is, the aforementioned string indication flag) from the code stream (such as the encoding information of the current CU), if the binary flag is the target value, it indicates that the current string is a unit vector string;
  • b) Decode a value K from the code stream (eg encoding information of the current CU), indicating that at least one string behind the Kth group of pixels in the target group of pixels in the current coding unit is a unit vector string. If the current string is located after the Kth group of pixels in the target group of pixels, then determine that the current string is a unit vector string;
• i. completes the reconstruction of the unit vector string row by row, and the pixels in each row of the string use the value of the pixel in the row above (such as the reconstructed value) as the predicted value;
• ii. completes the reconstruction of the unit vector string pixel by pixel, and each pixel in the string uses the value of the pixel above it (such as the reconstructed value) as the predicted value;
• iii. in i. and ii., if the pixel to be predicted is located in the first row of the current decoding block (that is, the current CU), then the reference pixel of that pixel is located in the row above, outside the current decoding block, and the reconstructed value of the reference pixel may be acquired in any of the following optional ways:
• i. completes the reconstruction of the unit vector string column by column, and the pixels in each column of the string use the value of the pixel in the column to the left (such as the reconstructed value) as the predicted value;
• ii. completes the reconstruction of the unit vector string pixel by pixel, and each pixel in the string uses the value of the pixel to its left (such as the reconstructed value) as the predicted value;
• the reconstructed value of the reference pixel may be acquired in any of the following optional ways:
• when the unit vector string is allowed to be used in the string prediction sub-mode of intra-frame string copy, then unless the current string is a unit vector string, a standard-compliant bitstream shall meet all or part of the following reference range restrictions:
  • any reference pixel in the reference string pointed to by the string displacement vector is limited to a certain specified area.
• the specified area range may be: the range of the current processing unit (such as the current largest coding unit) or the range of the H processing units (such as largest coding units) to the left; where the value of H can be determined by the size of the processing unit (such as the LCU or CTU).
• H = (1 << ((7 - (log2_lcu_size_minus2 + 2)) << 1)) - (((log2_lcu_size_minus2 + 2) < 7) ? 1 : 0).
• if any reference pixel in the reference string pointed to by the string displacement vector falls in the left adjacent largest coding unit, and the luminance sample size of the largest coding unit is 128×128, then the upper-left corner of the 64×64 region in which the luminance sample of any reference pixel of the reference string lands after being shifted to the right by 128 pixels has not yet been reconstructed.
• any reference pixel in the reference string that falls in the 64×64 area of the adjacent CTU on the left corresponds to an area at the same position in the current CTU, and the coordinates of the upper-left corner of that 64×64 area should not be the same as the coordinates of the upper-left corner of the current coding block.
• assuming any position of the luminance component of the reference string is (xRef, yRef), ((xRef+128)/64×64, yRef/64×64) is not available; and ((xRef+128)/64×64, yRef/64×64) should not be equal to the position (xCb, yCb) of the upper-left corner of the current block.
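A hedged sketch of this 128×128-LCU restriction, with "/" as division rounded down as stated in the text (the function name is an assumption for illustration):

```python
def reference_corner_ok(xRef: int, yRef: int, xCb: int, yCb: int) -> bool:
    """Check that the upper-left corner of the 64x64 area reached by shifting
    the reference luminance position right by 128 pixels does not coincide
    with the upper-left corner (xCb, yCb) of the current block."""
    corner = ((xRef + 128) // 64 * 64, yRef // 64 * 64)
    return corner != (xCb, yCb)
```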
  • (xRefA, yRefA) and (xRefB, yRefB) are the coordinates of any two luminance components A and B on the reference string.
• the method proposed in the embodiment of the present application expands the use range of the unit vector string, which helps to improve the coding efficiency of the string vector; and, by combining the unit vector string with the string prediction sub-mode, the relatively low implementation complexity of the unit vector string can also be leveraged to improve the coding performance of string prediction.
• the embodiment of the present application further discloses a video encoding apparatus, and the video encoding apparatus may be a computer program (including program code) running in the video encoding device mentioned above.
  • the video encoding apparatus may perform the method shown in FIG. 2 . Referring to Fig. 5, the video encoding apparatus may run the following units:
  • Obtaining unit 501 for determining the current string to be encoded in the current encoding unit of the current image
  • An encoding unit 502 configured to determine a reference pixel of the first pixel from historical encoding units in the current image if the current string is a unit vector string and the current string includes a first pixel;
  • the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel and the first pixel are adjacent in the current image;
  • the encoding unit 502 is further configured to obtain the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel, so as to obtain the encoding information of the current encoding unit.
  • the current coding unit is coded in a string prediction sub-mode in an intra string copy mode; and the coding units in the string prediction sub-mode are allowed to use unit vector strings.
  • the encoding unit 502 is further configured to: encode the string indicator, so that the encoded information includes the string indicator. Wherein, if the current string is a unit vector string, the string indicator is a target value.
  • the encoding unit 502 may be further configured to encode a numerical value K, so that the numerical value K is included in the encoding information, and the value of the numerical value K is greater than or equal to zero.
  • the value K is used to indicate that at least one string behind the Kth group of pixels in the target group of pixels of the current coding unit is a unit vector string, and a group of pixels includes: one string or at least one unmatched pixel.
  • the value K is used to indicate that the K+1th string in the target group of pixels of the current coding unit is the unit vector string.
  • the encoding unit 502 is further configured to: encode the string displacement vector of the current string, so that the encoding information includes the string displacement vector. Wherein, if the current string is the unit vector string, the string displacement vector is a unit vector.
  • the unit vector includes a first element and a second element; if the scanning mode of the current coding unit is a horizontal scanning mode, the element value of the first element is the first numerical value, and the The element value of the second element is the second numerical value; if the scanning mode of the current coding unit is the vertical scanning mode, the element value of the first element is the second numerical value, and the element value of the second element is the first value.
  • the current image is divided into multiple processing units, each of the processing units includes one or more encoding units; the multiple processing units include: the current processing unit and the encoded processing unit , the current processing unit includes the current coding unit;
  • the string displacement vector is used to represent the displacement of the current string to the reference string, and the reference string is allowed to not satisfy one or more of the following conditions:
  • Any reference pixel in the reference string is within the specified area
• if any reference pixel in the reference string is in the decoded processing unit adjacent to the left of the current processing unit, the pixels in the target area in which the position of any reference pixel of the reference string lands after being moved to the right by the predetermined pixels have not been reconstructed;
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit;
  • Any reference pixel in the reference string does not belong to the current string.
• the current coding unit includes P rows and Q columns of pixels, wherein the values of P and Q are positive integers; if the scanning mode of the current coding unit is the horizontal scanning mode, and the horizontal scanning mode indicates that the current coding unit is scanned in order from top to bottom, then the historical coding unit is located above the current coding unit, and the first pixel is a pixel located in the first row of the current coding unit.
• the pixels in the current string are all second pixels, or the current string includes the first pixel and second pixels; the second pixels are located in the current coding unit, and the second pixels in the current string are distributed in at least one row of the current coding unit; correspondingly, the encoding unit 502 may also be configured to:
  • the predicted value of any second pixel is obtained according to the reconstructed value of the reference pixel located above and adjacent to the any second pixel.
• the current coding unit includes P rows and Q columns of pixels, where P and Q are both positive integers; if the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in order from left to right, then the historical coding unit is located to the left of the current coding unit, and the first pixel is a pixel located in the first column of the current coding unit.
• the pixels in the current string are all third pixels, or the current string includes the first pixel and third pixels; the third pixels are located in the current coding unit, and the third pixels in the current string are distributed in at least one column of the current coding unit; correspondingly, the encoding unit 502 may also be configured to:
  • the predicted value of any third pixel is obtained according to the reconstructed value of the reference pixel located to the left of the any third pixel and adjacent to the any third pixel.
  • the acquisition method of the reconstructed value of the reference pixel of the first pixel includes any of the following:
  • the encoding unit 502 is further configured to: encode a string scan mode flag, so that the encoded information includes the string scan mode flag.
• if the scanning mode of the current coding unit is the horizontal scanning mode, the string scan mode flag is a third value; if the scanning mode of the current coding unit is the vertical scanning mode, the string scan mode flag is a fourth value.
• the current image is divided into multiple processing units, and each processing unit includes one or more coding units; the multiple processing units include: a current processing unit and an encoded processing unit, the current processing unit including the current coding unit; correspondingly, the encoding unit 502 may also be configured to:
• if the current string is not the unit vector string, search for a reference string of the current string from the current processing unit and the partially encoded coding units in the encoded processing unit;
  • the predicted value of each pixel in the current string is acquired according to the reference string, so as to obtain encoding information of the current coding unit.
  • the reference string satisfies one or more of the following conditions:
  • Any reference pixel in the reference string is within the range of the current processing unit and H encoded processing units to the left of the current processing unit, where H is a positive integer;
• if any reference pixel in the reference string is in the encoded processing unit adjacent to the left of the current processing unit, the pixels in the target area containing the position obtained by shifting that reference pixel rightward by a predetermined number of pixels have not been encoded;
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit;
  • Each reference pixel in the reference string and each pixel in the current string are located in the same independent coding area;
  • Any reference pixel in the reference string does not belong to the current string.
  • the encoding unit 502 may also be used for:
  • the pixel values of the unmatched pixels are encoded, so that the encoded information includes the pixel values of the unmatched pixels.
  • each step involved in the method shown in FIG. 2 may be performed by each unit in the video encoding apparatus shown in FIG. 5 .
  • step S201 shown in FIG. 2 may be performed by the obtaining unit 501 shown in FIG. 5
  • steps S202 to S203 may be performed by the encoding unit 502 shown in FIG. 5 .
• each unit in the video encoding apparatus shown in FIG. 5 may be separately or wholly combined into one or several other units, or one (or more) of the units may be further split into multiple functionally smaller units, which can realize the same operations without affecting the achievement of the technical effects of the embodiments of this application.
  • the above-mentioned units are divided based on logical functions.
  • the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit.
• the video encoding apparatus may also include other units. In practical applications, these functions may also be realized with the assistance of other units, and may be realized by multiple units in cooperation.
• the video encoding apparatus shown in FIG. 5 may be constructed, and the video encoding method of the embodiments of this application may be implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM).
• the computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above-mentioned computing device via the computer-readable recording medium, and run therein.
• the current string of the current coding unit is a unit vector string, and the current string includes a first pixel (for example, a pixel in the first row of the current coding unit or a pixel in the first column of the current coding unit).
  • the reference pixel of the first pixel can be determined from the historical decoding unit in the current image, and the predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel to implement the encoding process.
• any string in the current coding unit can become a unit vector string, which can effectively widen the applicable range of the unit vector string and helps improve encoding performance.
• the predictive coding mode adopted for the current coding unit in the embodiments of this application is not limited to the equivalent string and unit vector string sub-mode; it may also be another predictive coding mode, such as the string prediction sub-mode in the intra string copy mode. That is, the embodiments of this application allow a coding unit in the string prediction sub-mode to use unit vector strings, which can further widen the applicable range of the unit vector string and improve the coding performance of string prediction.
• the embodiments of the present application further disclose a video decoding apparatus, and the video decoding apparatus may be a computer program (including program code) running in the above-mentioned video decoding device.
• the video decoding apparatus may perform the method shown in FIG. 4. Referring to FIG. 6, the video decoding apparatus may run the following units:
  • a decoding unit 602 configured to determine a reference pixel of the first pixel from historical decoding units in the current image if the current string is a unit vector string and the current string includes a first pixel;
  • the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel and the first pixel are adjacent in the current image;
  • the decoding unit 602 is further configured to obtain the predicted value of the first pixel according to the reconstructed value of the reference pixel of the first pixel to obtain a decoded image.
  • the current coding unit is coded in a string prediction sub-mode in an intra string copy mode; and the coding units in the string prediction sub-mode are allowed to use unit vector strings.
  • the decoding unit 602 may also be used to:
• if the string indication flag is a target value, it is determined that the current string is a unit vector string.
  • the decoding unit 602 may also be used to:
• a value K is decoded from the coding information of the current coding unit, where the value of K is greater than or equal to zero; the value K is used to indicate that at least one string located after the K-th group of pixels in the target group of pixels of the current coding unit is a unit vector string, and a group of pixels includes: a string or at least one unmatched pixel;
• if the current string is located after the K-th group of pixels in the target group of pixels, the current string is determined to be the unit vector string.
• the value K is used to indicate that the (K+1)-th string in the target group of pixels of the current coding unit is the unit vector string;
• when the decoding unit 602 is configured to determine that the current string is the unit vector string if the current string is located after the K-th group of pixels in the target group of pixels, it may be specifically configured to:
• if the current string is the (K+1)-th string in the target group of pixels, determine that the current string is the unit vector string.
  • the decoding unit 602 may also be used to:
• if the string displacement vector is a unit vector, it is determined that the current string is the unit vector string.
• the unit vector includes a first element and a second element; if the scanning mode of the current coding unit is a horizontal scanning mode, the element value of the first element is a first value and the element value of the second element is a second value; if the scanning mode of the current coding unit is a vertical scanning mode, the element value of the first element is the second value and the element value of the second element is the first value.
  • the current image is divided into a plurality of processing units, each of the processing units includes one or more coding units; the plurality of processing units include: a current processing unit and a decoded processing unit , the current processing unit includes the current coding unit;
  • the string displacement vector is used to represent the displacement of the current string to the reference string, and the reference string is allowed to not satisfy one or more of the following conditions:
  • Any reference pixel in the reference string is within the specified area
• if any reference pixel in the reference string is in the decoded processing unit adjacent to the left of the current processing unit, the pixels in the target area containing the position obtained by shifting that reference pixel rightward by a predetermined number of pixels have not been reconstructed;
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit;
  • Any reference pixel in the reference string does not belong to the current string.
• the current coding unit includes P rows and Q columns of pixels, where P and Q are both positive integers; if the scanning mode of the current coding unit is a horizontal scanning mode, and the horizontal scanning mode indicates that the current coding unit is scanned in order from top to bottom, then the historical decoding unit is located above the current coding unit, and the first pixel is a pixel located in the first row of the current coding unit.
  • the decoding unit 602 can also be used for:
  • the predicted value of the second pixel in each row of the current string is obtained row by row; or,
  • the predicted value of any second pixel is obtained according to the reconstructed value of the reference pixel located above and adjacent to the any second pixel.
• the current coding unit includes P rows and Q columns of pixels, where P and Q are both positive integers; if the scanning mode of the current coding unit is the vertical scanning mode, and the vertical scanning mode indicates that the current coding unit is scanned in order from left to right, then the historical decoding unit is located to the left of the current coding unit, and the first pixel is a pixel located in the first column of the current coding unit.
  • the decoding unit 602 can also be used for:
  • the predicted value of any third pixel is obtained according to the reconstructed value of the reference pixel located to the left of the any third pixel and adjacent to the any third pixel.
  • the acquisition method of the reconstructed value of the reference pixel of the first pixel includes any of the following:
  • the decoding unit 602 may also be used to:
• if the string scan mode flag is a fourth value, it is determined that the scan mode of the current coding unit is a vertical scan mode.
  • the current image is divided into multiple processing units, each processing unit includes one or more coding units; the multiple processing units include: the current processing unit and the decoded processing unit, the The current processing unit includes the current coding unit; correspondingly, the decoding unit 602 can also be used for:
• if the current string is not the unit vector string, search for a reference string of the current string from the current processing unit and the partially decoded coding units in the decoded processing unit;
  • the predicted value of each pixel in the current string is obtained according to the reference string to obtain a decoded image.
  • the reference string satisfies one or more of the following conditions:
  • Any reference pixel in the reference string is within the range of the current processing unit and the H decoded processing units to the left of the current processing unit, where H is a positive integer;
• if any reference pixel in the reference string is in the decoded processing unit adjacent to the left of the current processing unit, the pixels in the target area containing the position obtained by shifting that reference pixel rightward by a predetermined number of pixels have not been reconstructed;
  • Each reference pixel in the reference string is located in the same alignment area, and the size of the alignment area is determined according to the size of the processing unit;
  • Each reference pixel in the reference string and each pixel in the current string are located in the same independent decoding area;
  • Any reference pixel in the reference string does not belong to the current string.
• the pixel values of the unmatched pixels are obtained by decoding the coding information of the current coding unit.
  • each step involved in the method shown in FIG. 4 may be performed by each unit in the video decoding apparatus shown in FIG. 6 .
• step S401 shown in FIG. 4 may be performed by the obtaining unit 601 shown in FIG. 6, and
  • steps S402 to S403 may be performed by the decoding unit 602 shown in FIG. 6 .
• each unit in the video decoding apparatus shown in FIG. 6 may be separately or wholly combined into one or several other units, or one (or more) of the units may be further split into multiple functionally smaller units, which can realize the same operations without affecting the achievement of the technical effects of the embodiments of this application.
  • the above-mentioned units are divided based on logical functions.
  • the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit.
• the video decoding apparatus may also include other units. In practical applications, these functions may also be realized with the assistance of other units, and may be realized by multiple units in cooperation.
• the video decoding apparatus shown in FIG. 6 may be constructed, and the video decoding method of the embodiments of this application may be implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 4 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM).
• the computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above-mentioned computing device via the computer-readable recording medium, and run therein.
• the current string of the current coding unit is a unit vector string, and the current string includes a first pixel (for example, a pixel in the first row of the current coding unit or a pixel in the first column of the current coding unit).
  • the reference pixel of the first pixel can be determined from the historical decoding unit in the current image, and the predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel to implement the decoding process.
• any string in the current coding unit can become a unit vector string, which can effectively widen the applicable range of the unit vector string and helps improve decoding performance.
• the predictive coding mode adopted for the current coding unit in the embodiments of this application is not limited to the equivalent string and unit vector string sub-mode; it may also be another predictive coding mode, such as the string prediction sub-mode in the intra string copy mode. That is, the embodiments of this application allow a coding unit in the string prediction sub-mode to use unit vector strings, which can further widen the applicable range of the unit vector string and improve the coding performance of string prediction.
  • an embodiment of the present application further provides a computer device; the computer device may be the above-mentioned video encoding device or the above-mentioned video decoding device.
  • the computer device may include at least a processor 701 , an input interface 702 , an output interface 703 and a computer storage medium 704 .
  • the processor 701 , the input interface 702 , the output interface 703 and the computer storage medium 704 in the computer device may be connected by a bus or other means.
• if the computer device is the above-mentioned video encoding device, the computer device may also include a video encoder; if the computer device is the above-mentioned video decoding device, the computer device may also include a video decoder.
• the computer storage medium 704 may be stored in a memory of the computer device; the computer storage medium 704 is configured to store a computer program, the computer program includes program instructions, and the processor 701 is configured to execute the program instructions stored in the computer storage medium 704.
• the processor 701 (also called a CPU (Central Processing Unit)) is the computing core and control core of the computer device; it is adapted to implement one or more instructions, and is specifically adapted to load and execute one or more instructions to implement the corresponding method flow or corresponding functions.
  • the processor 701 described in this embodiment of the present application may be configured to perform related method steps of the video encoding method shown in FIG. 2 above.
  • the processor 701 described in this embodiment of the present application may be used to perform relevant method steps of the video decoding method shown in FIG. 4 above.
  • Embodiments of the present application further provide a computer storage medium (Memory), where the computer storage medium is a memory device in a computer device, used to store programs and data.
• the computer storage medium here may include a built-in storage medium in the computer device, and certainly may also include an extended storage medium supported by the computer device.
  • Computer storage media provide storage space that stores the operating system of the computer device.
• one or more instructions adapted to be loaded and executed by the processor 701 are also stored in the storage space; these instructions may be one or more computer programs (including program code).
• the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may also be at least one computer storage medium located remotely from the aforementioned processor.
• the processor 701 may load and execute one or more first instructions stored in the computer storage medium, so as to realize the corresponding steps of the method in the video decoding method embodiment shown in FIG. 4 above; in a specific implementation, the one or more first instructions in the computer storage medium are loaded by the processor 701 to perform the video encoding method shown in FIG. 2 or the video decoding method shown in FIG. 4.
• an embodiment of this application further provides a computer program product or a computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the video encoding method shown in FIG. 2 or the video decoding method shown in FIG. 4.

Abstract

Embodiments of this application disclose a video decoding method, a video encoding method, related devices, and a storage medium. The video decoding method includes: determining a current string to be decoded in a current coding unit of a current image; if the current string is a unit vector string and the current string includes a first pixel, determining a reference pixel of the first pixel from a historical decoding unit in the current image, where the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image; and obtaining a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image. The embodiments of this application can effectively widen the applicable range of the unit vector string, helping to improve encoding and decoding performance.

Description

Video decoding method, video encoding method, related devices, and storage medium
This application claims priority to Chinese Patent Application No. 2020114162420, entitled "Video decoding method, video encoding method, related devices, and storage medium", filed with the China National Intellectual Property Administration on December 5, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of image processing technologies, and in particular, to video encoding and decoding technologies.
Background
In video encoding and decoding technologies, the encoder side usually first divides a frame image contained in a video into multiple coding units, then encodes each coding unit to obtain bitstream data of the frame image, and transmits the bitstream data to the decoder side. Correspondingly, after receiving the bitstream data, the decoder side usually also performs decoding on a per-coding-unit basis to obtain a decoded image.
With the development of encoding and decoding technologies, the concept of the unit vector string has been proposed; thanks to its low implementation complexity, the unit vector string has received wide attention. At present, related technologies use the unit vector string only in the equivalent string and unit vector string sub-mode of the intra string copy coding mode; moreover, when a coding unit is encoded or decoded based on a unit vector string, only pixels inside that coding unit can be referenced. This restricts certain strings in a coding unit (such as a string in the first row) from becoming unit vector strings. It can be seen that the related technologies limit the applicable range of the unit vector string to some extent, thereby affecting encoding and decoding performance.
Summary
Embodiments of this application provide a video decoding method, a video encoding method, related devices, and a storage medium, which can effectively extend the applicable range of the unit vector string and help improve encoding and decoding performance.
According to a first aspect, an embodiment of this application provides a video decoding method, the method including:
determining a current string to be decoded in a current coding unit of a current image;
if the current string is a unit vector string and the current string includes a first pixel, determining a reference pixel of the first pixel from a historical decoding unit in the current image, where the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image; and
obtaining a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image.
According to a second aspect, an embodiment of this application provides a video encoding method, the method including:
determining a current string to be encoded in a current coding unit of a current image;
if the current string is a unit vector string and the current string includes a first pixel, determining a reference pixel of the first pixel from a historical coding unit in the current image, where the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image; and
obtaining a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain coding information of the current coding unit.
According to a third aspect, an embodiment of this application provides a video decoding apparatus, the apparatus including:
an obtaining unit, configured to determine a current string to be decoded in a current coding unit of a current image; and
a decoding unit, configured to: if the current string is a unit vector string and the current string includes a first pixel, determine a reference pixel of the first pixel from a historical decoding unit in the current image, where the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
the decoding unit being further configured to obtain a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image.
According to a fourth aspect, an embodiment of this application provides a video encoding apparatus, the apparatus including:
an obtaining unit, configured to determine a current string to be encoded in a current coding unit of a current image; and
an encoding unit, configured to: if the current string is a unit vector string and the current string includes a first pixel, determine a reference pixel of the first pixel from a historical coding unit in the current image, where the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
the encoding unit being further configured to obtain a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain coding information of the current coding unit.
According to a fifth aspect, an embodiment of this application provides a computer device, the computer device including an input interface and an output interface, and further including:
a processor, adapted to implement one or more instructions; and
a computer storage medium storing one or more first instructions, the one or more first instructions being adapted to be loaded by the processor to perform the video decoding method according to the first aspect, or the video encoding method according to the second aspect.
According to a sixth aspect, an embodiment of this application provides a computer storage medium storing one or more first instructions, the one or more first instructions being adapted to be loaded by a processor to perform the video decoding method according to the first aspect; or the computer storage medium stores one or more second instructions, the one or more second instructions being adapted to be loaded by a processor to perform the video encoding method according to the second aspect.
According to a seventh aspect, an embodiment of this application provides a computer program product including instructions that, when run on a computer, cause the computer to perform the video decoding method according to the first aspect or the video encoding method according to the second aspect.
In the encoding/decoding process of the embodiments of this application, when the current string of the current coding unit is a unit vector string and the current string includes a first pixel (such as a pixel in the first row of the current coding unit or a pixel in the first column of the current coding unit), a reference pixel of the first pixel can be determined from a historical decoding unit in the current image, and a predicted value of the first pixel can be obtained according to the reconstructed value of the reference pixel to implement the encoding/decoding processing. It can be seen that, by allowing pixels in a historical decoding unit adjacent to the current coding unit to be used as reference pixels, the embodiments of this application enable any string in the current coding unit to become a unit vector string, which can effectively widen the applicable range of the unit vector string and help improve encoding and decoding performance. Moreover, the predictive coding mode adopted for the current coding unit in the embodiments of this application is not limited to the equivalent string and unit vector string sub-mode; it may also be another predictive coding mode, such as the string prediction sub-mode in the intra string copy mode. That is, the embodiments of this application allow a coding unit in the string prediction sub-mode to use unit vector strings, which can further widen the applicable range of the unit vector string and improve the coding performance of string prediction.
Brief Description of the Drawings
FIG. 1a is a schematic architectural diagram of an image processing system according to an embodiment of this application;
FIG. 1b is a schematic diagram of dividing an image into multiple coding units according to an embodiment of this application;
FIG. 1c is a basic workflow diagram of a video encoder according to an embodiment of this application;
FIG. 1d is a schematic diagram of multiple intra prediction modes according to an embodiment of this application;
FIG. 1e is a schematic diagram of an angular prediction mode among intra prediction modes according to an embodiment of this application;
FIG. 1f is a schematic diagram of inter prediction according to an embodiment of this application;
FIG. 1g is a schematic diagram of intra block copy according to an embodiment of this application;
FIG. 1h is a schematic diagram of the range of reference pixels for intra block copy according to an embodiment of this application;
FIG. 1i is a schematic diagram of intra string copy according to an embodiment of this application;
FIG. 1j is a schematic diagram of intra string copy with string length resolution control according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a video encoding method according to an embodiment of this application;
FIG. 3a is a schematic diagram of scanning modes according to an embodiment of this application;
FIG. 3b is an orientation diagram relating to a current coding unit according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a video decoding method according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of a video encoding apparatus according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of a video decoding apparatus according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings in the embodiments of this application.
The embodiments of this application relate to an image processing system; referring to FIG. 1a, the image processing system includes at least a video encoding device 11 and a video decoding device 12. The video encoding device 11 internally includes at least a video encoder, which is configured to encode images in a video signal to obtain a bitstream; the video decoding device 12 internally includes at least a video decoder, which is configured to decode the bitstream to obtain reconstructed images corresponding to the images in the video signal. In terms of how the signal is acquired, the video signals mentioned here can be divided into those captured by cameras and those generated by computers. Because of their different statistical characteristics, the corresponding compression coding methods may also differ.
Currently mainstream video coding standards, such as the international video coding standards HEVC (High Efficiency Video Coding) and VVC (Versatile Video Coding), and the Chinese national video coding standard AVS3 (Audio Video coding Standard 3), all adopt a hybrid coding framework: an image in the original video signal is divided into a series of CUs (Coding Units), and video coding techniques such as prediction, transform, and entropy coding are combined to compress the video data. Specifically, these mainstream video coding standards usually perform the following series of operations and processing on an image in the input original video signal:
1) Block partition structure: the input image is divided into several non-overlapping processing units, and a similar compression operation is performed on each processing unit. Such a processing unit is called a CTU (Coding Tree Unit) or an LCU (Largest Coding Unit). Below the CTU or LCU, a finer division can be carried out to obtain one or more basic coding units, called CUs. Taking the LCU as the processing unit as an example, FIG. 1b shows a schematic diagram of further dividing an LCU into several CUs according to the characteristics of each LCU. It should be understood that FIG. 1b only shows an exemplary way of dividing the LCU and does not limit it; FIG. 1b shows an LCU divided evenly into multiple CUs, but in practice the LCU may also be divided unevenly into multiple CUs. Each CU is the most basic element in an encoding step, and each CU can independently adopt one predictive coding mode for encoding and decoding.
2) Predictive Coding: current video coding technology includes multiple predictive coding modes, such as intra prediction modes and inter prediction modes. The encoder side needs to select, from the many possible predictive coding modes, the one most suitable for the current CU, and inform the decoder side. The encoder side then performs predictive coding on the current CU using the selected predictive coding mode; after the current CU is predicted from the selected reconstructed video signal, a residual video signal is obtained.
① Intra prediction mode: the predicted signal comes from a region in the same image that has already been encoded and reconstructed;
② Inter prediction mode: the predicted signal comes from another image (called a reference image) that has already been encoded and is different from the current image.
3) Transform & Quantization: the residual video signal undergoes a transform operation such as a DFT (Discrete Fourier Transform) or DCT (Discrete Cosine Transform), converting the residual video signal into the transform domain, where it is represented as transform coefficients. The residual video signal in the transform domain further undergoes a lossy quantization operation, which discards a certain amount of information so that the quantized signal is more amenable to compressed expression. Because some video coding standards may offer more than one transform, the encoder side needs to select one of the transforms for the current CU and inform the decoder side. The fineness of the quantization is usually determined by the QP (Quantization Parameter): when the QP value is large, coefficients within a wider range of values are quantized to the same output, which usually introduces larger distortion and a lower bit rate; conversely, when the QP value is small, coefficients within a narrower range of values are quantized to the same output, which usually introduces smaller distortion and corresponds to a higher bit rate.
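The rate-distortion trade-off just described, where a larger QP maps a wider range of coefficients to the same output, can be illustrated with a minimal uniform scalar quantizer (a sketch only: the function names are hypothetical, and a raw step size is used in place of a standard-defined QP-to-step mapping):

```python
def quantize(coeff: float, step: float) -> int:
    # Uniform scalar quantization: every coefficient inside one
    # step-wide interval maps to the same integer level (lossy).
    return round(coeff / step)

def dequantize(level: int, step: float) -> float:
    # Inverse quantization recovers only one representative value per
    # interval, so the information discarded by quantize() is lost.
    return level * step

# A larger step (larger QP) introduces larger distortion:
fine   = dequantize(quantize(10.3, 2.0), 2.0)   # small step, small error
coarse = dequantize(quantize(10.3, 8.0), 8.0)   # large step, large error
```

This mirrors the trade-off stated in step 3): the coarse step quantizes a wider range of coefficient values to the same level, lowering the bit rate at the cost of distortion.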
4) Entropy Coding or statistical coding: the quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and finally a binarized (0 or 1) compressed bitstream is output. Meanwhile, other information, such as the selected mode and motion vectors, also needs to be entropy-coded to reduce the bit rate. Statistical coding is a lossless coding method that can effectively reduce the bit rate needed to express the same signal; common statistical coding methods include Variable Length Coding (VLC) and Content Adaptive Binary Arithmetic Coding (CABAC), among others.
5) Loop Filtering: an already-encoded image, after inverse quantization, inverse transform, and prediction compensation (the inverse operations of steps 2) to 4) above), yields a reconstructed decoded image. Compared with the original image, part of the information in the reconstructed decoded image differs from the original image due to the effect of quantization, producing distortion. Therefore, a filter can be applied to the reconstructed decoded image, for example deblocking, SAO (Sample Adaptive Offset), or ALF (Adaptive Loop Filter), which can effectively reduce the degree of distortion caused by quantization. Because these filtered reconstructed decoded images serve as references for subsequently encoded images and are used to predict future signals, the above filtering operation is also called loop filtering, that is, a filtering operation within the encoding loop.
Based on the descriptions of steps 1) to 5) above, FIG. 1c exemplarily shows the basic workflow of a video encoder. FIG. 1c takes the k-th CU (denoted s_k[x, y]) as an example, where k is a positive integer greater than or equal to 1 and less than or equal to the number of CUs in the current image. s_k[x, y] denotes the pixel with coordinates [x, y] in the k-th CU, x being the horizontal coordinate and y the vertical coordinate of the pixel. After s_k[x, y] is processed by the better of processing methods such as motion compensation and intra prediction, the predicted signal ŝ_k[x, y] is obtained. Subtracting ŝ_k[x, y] from s_k[x, y] gives the residual signal u_k[x, y], which is then transformed and quantized. The quantized output data goes to two different destinations: one copy is sent to the entropy coder for entropy coding, and the coded bitstream is output to a buffer and held for transmission; the other copy undergoes inverse quantization and inverse transform to obtain the signal u'_k[x, y]. Adding u'_k[x, y] to ŝ_k[x, y] yields the new predicted signal s*_k[x, y], and s*_k[x, y] is sent to the buffer of the current image for storage. s*_k[x, y] undergoes intra-image prediction to obtain f(s*_k[x, y]). s*_k[x, y] undergoes loop filtering to obtain s'_k[x, y], and s'_k[x, y] is sent to the decoded picture buffer for storage, to be used for generating reconstructed images. s'_k[x, y] undergoes motion-compensated prediction to obtain s'_r[x+m_x, y+m_y], which denotes the reference block, where m_x and m_y denote the horizontal and vertical components of the motion vector (MV), respectively.
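The two central signal-flow steps in that workflow, forming the residual u_k[x, y] = s_k[x, y] - ŝ_k[x, y] and later adding the decoded residual back to the prediction, can be sketched as follows (a minimal illustration only: blocks are treated as flat lists, and the transform/quantization stage that sits between the two steps in the real encoder is omitted):

```python
def residual(s, s_pred):
    # u_k[x, y] = s_k[x, y] - s_hat_k[x, y]
    return [orig - pred for orig, pred in zip(s, s_pred)]

def reconstruct(u_rec, s_pred):
    # s*_k[x, y] = u'_k[x, y] + s_hat_k[x, y]
    return [res + pred for res, pred in zip(u_rec, s_pred)]

block      = [10, 20, 30]   # original pixels s_k (made-up values)
prediction = [8, 22, 29]    # predicted signal s_hat_k
u = residual(block, prediction)        # [2, -2, 1]
recon = reconstruct(u, prediction)     # recovers [10, 20, 30] when the
                                       # residual survives coding losslessly
```

In the real pipeline u is transformed and quantized before entropy coding, so u'_k generally differs slightly from u_k and the reconstruction is only approximate.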
From the description of the encoding process above, it can be seen that on the decoder side, for each CU, after the video decoder obtains the compressed bitstream, it first performs entropy decoding to obtain various mode information (such as predictive coding mode information) and the quantized transform coefficients, so that the coefficients, after inverse quantization and inverse transform, yield the residual signal. On the other hand, according to the known predictive coding mode information, the predicted signal corresponding to the CU (i.e., the predicted values of the pixels in the CU) can be obtained; adding the two (i.e., the residual signal and the predicted signal) yields the reconstructed signal (i.e., the reconstructed image). Finally, the reconstructed signal undergoes the loop filtering operation to produce the final output signal (i.e., the finally decoded image).
It should be noted that, depending on the video coding standard actually adopted, the predictive coding modes involved in the above encoding process also differ to some extent. The predictive coding modes used in AVS3 may include: intra prediction mode, inter prediction mode, intra block copy prediction mode, intra string copy mode, and so on; when encoding with AVS3, these predictive coding modes may be used individually or in combination. Each predictive coding mode is introduced below:
(1) Intra prediction mode:
Intra prediction is a commonly used predictive coding technique. It derives the predicted values of the current CU from adjacent encoded regions, based mainly on the spatial correlation among pixels of a video image. As shown in FIG. 1d, AVS3 currently includes 3 non-angular intra prediction modes (Plane, DC, and Bilinear) and 66 angular prediction modes. When an angular prediction mode is used, each pixel in the current CU takes, as its predicted value, the value of the reference pixel at the corresponding position in the reference pixel row or column, according to the direction corresponding to the angle of the prediction mode. As shown in FIG. 1e, for pixel P in the current CU, the position of the reference pixel is first determined from the already-encoded pixel row above according to the prediction angle in the figure, and the value of the reference pixel is then used as the predicted value of pixel P. Note, however, that not all positions pointed to are at integer-pixel precision (for example, in FIG. 1e the reference pixel of pixel P lies at some fractional-pixel position between pixel B and pixel C); in this case, the predicted value of pixel P needs to be obtained by interpolating surrounding pixels. To improve the efficiency of intra prediction, the values of the intra prediction reference pixels are usually stored in on-chip memory.
(2) Inter prediction mode:
Referring to FIG. 1f, inter prediction mainly exploits the temporal correlation of video, using pixels in neighboring already-encoded images to predict pixels in the current image, in order to effectively remove temporal redundancy in video and thereby effectively save bits for encoding residual data. In the figure, P is the current frame, P_r is the reference frame, B is the current CU, and B_r is the reference block of B (i.e., the reference CU). B' has the same coordinate position in the image as B. Suppose the coordinates of B_r are (x_r, y_r) and those of B' are (x, y); the displacement between the current CU and its reference block is called the motion vector (MV), that is: MV = (x_r - x, y_r - y).
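The MV definition just given can be expressed as a one-line computation (a sketch with made-up coordinate values; the helper name is illustrative):

```python
def motion_vector(ref_pos, cur_pos):
    # MV = (x_r - x, y_r - y): displacement from the current CU B'
    # at (x, y) to its reference block B_r at (x_r, y_r).
    (x_r, y_r), (x, y) = ref_pos, cur_pos
    return (x_r - x, y_r - y)

mv = motion_vector((100, 60), (96, 64))  # (4, -4)
```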
(3) Intra block copy mode:
Intra Block Copy (IBC), also called block-copy intra prediction, is an intra coding tool adopted in the HEVC Screen Content Coding (SCC) extension; it significantly improves the coding efficiency of screen content. IBC technology is also adopted in AVS3 and VVC to improve screen content coding performance. IBC mainly exploits the spatial correlation of screen content video, using pixels in the already-encoded region of the current image to predict the values of pixels in the current CU, which can effectively save the bits needed to encode those pixels. As shown in FIG. 1g, in IBC, the displacement between the current CU and its reference block may be called a Block Vector (BV); H.266/VVC adopts a BV prediction technique similar to inter prediction, further saving the bits needed to encode the BV.
In hardware implementations, all reference pixels used for IBC prediction can be stored in on-chip memory to reduce the extra bandwidth of reading and writing memory on and off the chip. To reduce hardware cost, IBC is restricted to use only one memory of S×S pixels (e.g., 128×128 pixels) as the storage space. Then, to improve the utilization of this memory, the 128×128 memory can be divided into four 64×64 regions to enable reuse of the reference pixel memory. The range of reference pixels is shown in FIG. 1h: the block with vertical stripes in FIG. 1h is the current CU in the current processing unit (i.e., the current CTU or current LCU); the gray region is the already-encoded portion, and the white region is the not-yet-encoded portion of the current processing unit; the parts marked with an "X" in FIG. 1h are unavailable to the current CU, and the pixels in the regions marked with an "X" in FIG. 1h cannot be used as reference pixels for the current CU.
It should be noted that the embodiments of this application take a memory of 128×128 pixels merely as an example for description. In other embodiments, a memory of 64×64 pixels may also be used as the storage space to further reduce hardware cost, or a memory of 256×256 pixels may be used as the storage space to meet larger storage requirements; this application does not impose any limitation on the size of the storage space here. Likewise, the embodiments of this application only take dividing this storage space into four 64×64 regions as an example; in other embodiments, other region division schemes may also be used, for example dividing a 128×128 memory into sixteen 32×32 regions for reuse of the reference pixel memory, and so on. In other words, a memory of S×S pixels may be divided into G^2 regions of (S/G)×(S/G) pixels for reuse of the reference pixel memory, where G is a divisor of S, that is, the quotient of S divided by G is an integer.
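The region division just described, an S×S reference memory split into G^2 regions of (S/G)×(S/G) pixels, can be sketched as an index computation (a hypothetical helper for illustration; real implementations track per-region reuse state rather than just indices):

```python
def region_index(x: int, y: int, S: int = 128, G: int = 2) -> int:
    # Map a position inside the S x S reference memory to the index of
    # the (S/G) x (S/G) region containing it; G must be a divisor of S.
    assert S % G == 0, "G must divide S"
    r = S // G                      # side length of one region
    return (y // r) * G + (x // r)  # row-major region index

# With S=128 and G=2 this yields the four 64x64 regions described above;
# with G=4 it yields sixteen 32x32 regions.
```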
(4) Intra string copy mode (string prediction sub-mode):
Intra String Copy (ISC), also called string-copy intra prediction, mainly divides a CU into a series of pixel strings or unmatched pixels (i.e., pixels not matched to reference pixels) according to some scanning mode (such as raster scanning, back-and-forth scanning, or Zig-Zag scanning). Similar to the IBC mode, each pixel string in the string prediction sub-mode searches the already-encoded region of the current image for a reference string of the same shape, from which the predicted values of the pixels in the current string are derived; encoding the residual between the pixel value and the predicted value of each pixel in the current string, instead of encoding the pixel values directly, can effectively save bits.
In one implementation, none of the pixel strings obtained by dividing a CU includes unmatched pixels. Under this implementation, FIG. 1i exemplarily shows a schematic diagram of intra string copy: the dark gray region is the already-encoded region; the 28 white pixels form string 1, the 35 light gray pixels form string 2, and the 1 black pixel is an unmatched pixel.
In one implementation, because the AVS meeting of August 2020 adopted string length resolution control, which restricts the length of every string to an integer multiple of 4, string types are divided into matching strings and incompletely matching strings. A matching string is a string whose length is an integer multiple of 4 and which contains no unmatched pixels; an incompletely matching string is a string of length 4 that contains unmatched pixels. It should be noted that, in other embodiments, the length of a matching string is not necessarily limited to an integer multiple of 4; it may also be an integer multiple of 5, or an integer multiple of 3, and so on; likewise, the length of an incompletely matching string is not necessarily limited to 4 and may also be 5, 6, and so on. For matching pixels in an incompletely matching string, predicted values can be derived from the corresponding string displacement vector; for unmatched pixels in an incompletely matching string, the pixel values can be decoded from the bitstream. In other words, a pixel string obtained by dividing a CU may include unmatched pixels. Under this implementation, FIG. 1j exemplarily shows a schematic diagram of intra string copy with string length resolution control: the dark gray region is the already-encoded region; the 24 white pixels form string 1, the black pixel together with the three pixels to its right forms string 2, and the 36 light gray pixels form string 3.
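Under the string length resolution control just described, the two string types can be distinguished with a small check (a sketch with a hypothetical helper name; the multiple-of-4 and length-4 constants follow the AVS rule quoted above):

```python
def classify_string(length: int, has_unmatched: bool) -> str:
    # Matching string: length is a positive multiple of 4 and the
    # string contains no unmatched pixels.
    if length > 0 and length % 4 == 0 and not has_unmatched:
        return "matching"
    # Incompletely matching string: length 4 with unmatched pixels.
    if length == 4 and has_unmatched:
        return "incompletely matching"
    return "invalid under string length resolution control"
```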
In addition, the intra string copy technique also needs to encode, for each string in the current CU, the corresponding string displacement vector (String Vector, SV for short), the string length, and a flag indicating whether there is a matching string, among other information. The string displacement vector here, also called a string vector, represents the displacement from the current string to its reference string. The string length represents the number of pixels contained in the current string.
In different implementations, the string length can be encoded in several ways; several examples are given below (some of which may be used in combination): ① Encode the string length directly. ② Encode the number of pixels still to be processed after the current string, so that the decoder side, from the size P of the current CU, the number P1 of already-processed pixels, and the decoded number P2 of pixels to be processed excluding the current string, computes the string length L of the current string as L = P - P1 - P2, where L and P are integers greater than 0, and P1 and P2 are integers greater than or equal to 0. ③ Encode a flag indicating whether the current string is the last string; if it is the last string, the string length L of the current string is computed from the size P of the current CU and the number P1 of already-processed pixels as L = P - P1.
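The three string-length signalling options above can be sketched as decoder-side computations (function names are illustrative; option ② corresponds to L = P - P1 - P2 and option ③ to L = P - P1):

```python
def length_direct(coded_length: int) -> int:
    # Option 1: the string length L is coded directly.
    return coded_length

def length_from_remaining(P: int, P1: int, P2: int) -> int:
    # Option 2: P is the CU size in pixels, P1 the pixels already
    # processed, P2 the pixels still to be processed after this string.
    return P - P1 - P2

def length_if_last(P: int, P1: int, is_last_flag: bool):
    # Option 3: a flag marks the last string; only then is L derivable
    # as P - P1 without any explicit length code.
    return P - P1 if is_last_flag else None
```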
In practical applications, to keep the hardware implementation at low complexity, the intra string copy technique in AVS3 uses a reference range restriction similar to that of IBC, using one memory of 128×128 pixels to store the values of reference pixels and of the pixels currently to be reconstructed, and uses a memory reuse strategy similar to that of IBC to improve the utilization of this memory.
(5) Intra string copy mode (equivalent string and unit vector string sub-mode):
The equivalent string and unit vector string mode is a sub-mode of the intra string copy mode and was adopted into the AVS3 standard in October 2020. Similar to the intra string copy mode, this mode divides a CU (i.e., an encoding/decoding block) into a series of pixel strings or unmatched pixels according to some scanning order; the type of a pixel string can be an equivalent string or a unit vector string. An equivalent string in this mode is characterized in that all pixels in the pixel string have the same predicted value. A unit vector string in this mode (also called a unit basis vector string, unit offset string, copy-above string, etc.) is characterized in that each pixel in the pixel string uses the reconstructed value of the pixel above it as the predicted value of the current pixel. In the equivalent string and unit vector string sub-mode, the length and predicted values of each string in the current CU need to be encoded; and the reference range of the equivalent string and unit vector string sub-mode is the same as that of the string prediction sub-mode.
Based on the above introduction to the predictive coding modes, the embodiments of this application propose an encoding/decoding scheme. On the one hand, the scheme allows a CU in the string prediction sub-mode of intra string copy to use unit vector strings, thereby improving the coding performance of string prediction; on the other hand, the scheme also allows pixels adjacent to the current CU to be derived and used as the reference of the current string, so as to obtain the predicted values of the pixels in the current string. Specifically, the scheme may include a video encoding method and a video decoding method; the video encoding method and the video decoding method proposed in the embodiments of this application are described below with reference to FIG. 2 and FIG. 4, respectively.
请参见图2,是本申请实施例提出一种视频编码方法的流程图;该视频编码方法可以由上述提及的视频编码设备,或者视频编码设备中的视频编码器执行。为便于阐述,后续均以视频编码设备执行该视频编码方法为例进行说明;如图2所示,该视频编码方法可包括以下步骤S201-S203:
S201，在当前图像的当前编码单元中确定待编码的当前串。
在具体实施过程中,视频编码设备可接收原始视频信号,并按顺序对原始视频信号中的图像进行编码;相应的,当前图像是指当前正在编码的图像,其可以是原始视频信号中的任一帧图像。在对该当前图像进行编码处理的过程中,视频编码设备可先将该当前图像划分成多个处理单元(如CTU或者LCU),并将每个处理单元进一步划分成一个或多个编码单元(即CU),从而依次对每个编码单元进行编码。
其中，当前待编码（即当前即将开始编码）或者当前正在编码（即已编码了部分像素）的编码单元可称为当前编码单元。例如，设总共有5个CU：CU1、CU2、CU3、CU4以及CU5；且CU1-CU2中的各个像素均已被编码，CU3-CU5中的各个像素均未被编码；那么按照编码顺序可确定CU3为当前待编码的编码单元，因此当前编码单元便为CU3。又如，设总共有5个CU：CU1、CU2、CU3、CU4以及CU5；且CU1中的各个像素均已被编码，CU2中的部分像素已被编码（即还剩余一部分像素未被编码），CU3-CU5中的各个像素均未被编码；那么可确定CU2为当前正在编码的编码单元，因此当前编码单元为CU2。
针对当前编码单元,视频编码设备可采用ISC对当前编码单元中的各个像素进行编码。需要说明的是,当前编码单元可以是在帧内串复制模式(ISC) 中的串预测子模式下被编码的;并且,该串预测子模式下的编码单元被允许使用单位矢量串。或者,当前编码单元也可以是在帧内串复制模式(ISC)中的等值串与单位矢量串子模式下被编码的,本申请对此不作限制。
为便于阐述,后文均以采用串预测子模式对当前编码单元进行编码为例进行说明。并且,当前编码单元可包括P行Q列个像素;其中,P和Q的取值均为正整数,且P和Q的取值可相等或不等,本申请对此不作限制。例如,P和Q的取值可以均为64,那么当前编码单元便可包括64×64个像素。
具体的,视频编码设备可确定当前编码单元的扫描模式,此处的扫描模式包括但不限于:水平扫描模式或者竖直扫描模式。参见图3a所示,水平扫描模式(又可称为水平往返扫描模式)是指:按照从上往下的顺序依次扫描当前编码单元中的各行像素的扫描模式;竖直扫描模式(或称为垂直扫描模式)是指:按照从左往右的顺序依次扫描当前编码单元中的各列像素的扫描模式。需要说明的是,在其他实施例中,水平扫描模式也可以是指:按照从下往上的顺序依次扫描当前编码单元中的各行像素的扫描模式;竖直扫描模式也可以是指:按照从右往左的顺序依次扫描当前编码单元中的各列像素的扫描模式,本申请对此不作限制。
为便于阐述,后文均以采用图3a所示的水平扫描模式和竖直扫描模式进行扫描为例进行阐述。然后,视频编码设备可按照当前编码单元的扫描模式,将当前编码单元划分成至少一个串和/或未匹配像素,并依次对划分得到的串和/或未匹配像素进行编码处理。其中,当前待编码的串可称为当前串;由于AVS3标准限定了串的长度必须为4的整数倍,因此当前串中可包括4L个像素,L为正整数;但应理解的是,在其他视频编码标准中,若未规定串的长度,则当前串可包括一个或多个像素。并且,当前串可以是单位矢量串,也可以不是单位矢量串;且在当前串为单位矢量串时,其可以是匹配串或者不完全匹配串,本申请对此不作限定。若当前串为单位矢量串,则可执行步骤S202。
需要说明的是,本申请实施例中提及的单位矢量串的特点,与上述等值串与单位矢量串子模式中的单位矢量串的特点并不完全相同。本申请实施例提及的单位矢量串的特点如下:单位矢量串中每个像素的预测值由对应的参考像素的重建值确定,且任一像素对应的参考像素是指:位于该像素的目标侧且与该像素相邻的像素。其中,此处提及的目标侧可根据当前编码单元的扫描模式确定;若当前编码单元的扫描模式为水平扫描模式,且水平扫描模式指示按照从上往下的顺序对当前编码单元进行扫描,则任一像素的目标侧是指任一像素的上方,即任一像素的参考像素是指位于该任一像素的上方一行的某个像素(如正上方的像素);若当前编码单元的扫描模式为竖直扫描模式,且竖直扫描模式指示按照从左往右的顺序对当前编码单元进行扫描,则任一像素的目标侧是指任一像素的左方,即任一像素的参考像素是指位于该 任一像素的左方一列的某个像素(如正左方的像素)。
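上述"目标侧由扫描模式确定"的参考像素位置关系可用如下Python示意代码表示（坐标约定为(x, y)，x向右、y向下增大；函数名与扫描模式取值为本文假设）：

```python
def reference_pixel(x, y, scan_mode):
    # 单位矢量串中像素(x, y)的参考像素位置:
    # 水平扫描(从上往下)时取正上方的像素, 竖直扫描(从左往右)时取正左方的像素
    if scan_mode == "horizontal":
        return (x, y - 1)
    if scan_mode == "vertical":
        return (x - 1, y)
    raise ValueError("unknown scan mode: " + scan_mode)

assert reference_pixel(3, 5, "horizontal") == (3, 4)  # 正上方
assert reference_pixel(3, 5, "vertical") == (2, 5)    # 正左方
```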
需要说明的是,本申请实施例提及的上方是指位于当前编码单元的左上角所在的水平线上侧的方位,正上方则是指像素所在位置垂直向上的方位;左方是指位于当前编码单元的左上角所在的竖直线左侧的方位,正左方则是指像素所在位置水平向左的方位,如图3b所示。需要说明的是,本申请实施例中的重建值指的是未经过环路滤波的像素值。
S202,若当前串为单位矢量串,且当前串中包括第一像素,则在当前图像中的历史编码单元中确定第一像素的参考像素。
其中，历史编码单元是指当前图像中与当前编码单元相邻的已编码的编码单元；第一像素对应的参考像素和第一像素在当前图像中相邻，且第一像素满足如下条件：第一像素对应的参考像素不位于当前编码单元中。需要说明的是，历史编码单元和当前编码单元相邻是指：历史编码单元和当前编码单元之间不存在其他的编码单元。第一像素对应的参考像素和第一像素在当前图像中相邻是指：在当前图像中，第一像素的参考像素所处的行和第一像素所处的行是相互毗邻（即紧挨）的；例如，在当前图像中，假设第一像素的参考像素所处的行是第5行；若第一像素所处的行是第6行，则可认为第一像素的参考像素所处的行和第一像素所处的行是相互毗邻的；若第一像素所处的行是第7行，则可认为第一像素的参考像素所处的行和第一像素所处的行不是相互毗邻的。或者，第一像素对应的参考像素和第一像素在当前图像中相邻是指：在当前图像中，第一像素的参考像素所处的列和第一像素所处的列之间是相互毗邻（即紧挨）的；例如，在当前图像中，假设第一像素的参考像素所处的列是第5列；若第一像素所处的列是第6列，则可认为第一像素的参考像素所处的列和第一像素所处的列是相互毗邻的；若第一像素所处的列是第7列，则可认为第一像素的参考像素所处的列和第一像素所处的列不是相互毗邻的。
可见,当第一像素和第一像素的参考像素在当前图像中相邻时,第一像素应当是当前编码单元中的临界行或临界列上的像素;此处的临界行又可称为边缘行,其可以是当前编码单元的第一行或者最后一行;同理,临界列又可称为边缘列,其可以是当前编码单元的第一列或者最后一列。
在具体实现中,第一像素的具体含义可以根据当前编码单元的扫描模式确定。例如,若当前编码单元的扫描模式为水平扫描模式,且水平扫描模式指示按照从上往下的顺序对当前编码单元进行扫描,则历史编码单元位于当前编码单元的上方(即历史编码单元为位于当前编码单元的上方且与当前编码单元相邻的已编码CU),第一像素为位于当前编码单元的第一行的像素;即此情况下,当前串包括当前编码单元中第一行上的至少一个像素。若当前编码单元的扫描模式为竖直扫描模式,且竖直扫描模式指示按照从左往右的顺序对当前编码单元进行扫描,则历史编码单元位于当前编码单元的左方(即 历史编码单元为位于当前编码单元的左方且与当前编码单元相邻的已编码CU),第一像素为位于当前编码单元的第一列的像素;即此情况下,当前串包括当前编码单元中第一列上的至少一个像素。
S203,根据第一像素的参考像素的重建值获取第一像素的预测值,以得到当前编码单元的编码信息。
在具体实施过程中，可先获取第一像素的参考像素的重建值，并将第一像素的参考像素的重建值作为第一像素的预测值；然后，可编码第一像素的像素值和预测值之间的残差，得到当前编码单元的编码信息；通过此编码方式，可减少比特数，提升编码效率。或者，在得到第一像素的预测值后，可编码用于指示该预测值的指示信息（如串矢量、串长度等），以得到当前编码单元的编码信息。其中，当前编码单元的扫描模式不同，第一像素的参考像素的重建值的获取方式也随之不同；具体参见下述描述：
1)若当前编码单元的扫描模式为水平扫描模式，且该水平扫描模式指示按照从上往下的顺序对当前编码单元进行扫描；则第一像素的参考像素的重建值的获取方式包括以下任一种：
第一种:从帧内串复制模式对应的第一存储空间(或称为帧内串复制模式的参考像素存储器)中获取第一像素的参考像素的重建值。其中,帧内串复制模式的第一存储空间可以是前述IBC内容中提及的128×128大小的内存,且该第一存储空间可被分为4个64×64大小的区域以进行参考像素内存的复用;但需说明的是,如前述IBC的相关内容所言,第一存储空间可以不局限于128×128大小的内存,且第一存储空间也不局限于被划分成4个64×64的区域。由于当前编码单元的预测编码模式为帧内串复制模式中的串预测子模式,或者等值串与单位矢量串子模式,因此第一像素的值会存储在第一存储空间中,那么在第一存储空间中获取第一像素的参考像素的重建值作为第一像素的预测值,可实现在一个存储空间中完成预测值的赋值,可有效提升编码效率。
第二种:从帧内预测模式对应的第二存储空间(或称为帧内预测模式的参考像素存储器)中获取第一像素的参考像素的重建值。由于若第一像素的参考像素采用帧内预测模式进行编码,则此情况下的第一像素的参考像素的重建值位于第二存储空间中;又若第一像素的参考像素采用帧内串复制模式进行编码,则此情况下可能因为内存复用的原因导致第一像素的参考像素的重建值被覆盖。因此,为提升获取第一像素的参考像素的重建值的成功率,可直接从第二存储空间中获取第一像素的参考像素的重建值。
第三种：将当前图像划分成多个N×N的区域，当第一像素位于任一N×N的区域的第一行时，从第二存储空间中获取第一像素的参考像素的重建值；否则，从第一存储空间中获取第一像素的参考像素的重建值。具体的，N的取值可根据经验值，或者处理单元的大小，或者当前CU的大小确定。例如，处理单元的大小为128×128，则可将当前图像划分成多个128×128的区域；当第一像素位于任一128×128的区域的第一行时，参见图1h所示的参考范围限制（图1h对应的扫描模式为水平扫描模式）可知，由于第一存储空间除了存储当前CU的值（如已重建的像素的重建值）以外，其只会存储当前CU左方的CU的重建值，并未存储第一像素的参考像素（即位于当前CU上方且与第一像素相邻的像素）的重建值，因此可从第二存储空间中获取第一像素的参考像素的重建值，以提升重建值的获取成功率。当第一像素未位于任一128×128的区域的第一行时，第一像素的参考像素便位于当前CU的内部，那么该参考像素的重建值存在于第一存储空间中，相应地从第一存储空间中获取第一像素的参考像素的重建值，以提升重建值的获取效率。
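上述第三种获取方式的存储空间选择逻辑可用如下Python示意代码表示（水平扫描情形；函数名与返回值字符串为本文假设，仅为示意）：

```python
def select_storage(y, N=128):
    # 第一像素位于任一N×N区域的第一行(即y为N的整数倍)时,
    # 从帧内预测模式对应的第二存储空间取参考像素的重建值;
    # 否则从帧内串复制模式对应的第一存储空间取值
    return "intra_pred_memory" if y % N == 0 else "isc_memory"

assert select_storage(0) == "intra_pred_memory"    # 第1个128×128区域的第一行
assert select_storage(128) == "intra_pred_memory"  # 第2个128×128区域的第一行
assert select_storage(64) == "isc_memory"          # 区域内部, 参考像素在第一存储空间中
```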
需要说明的是,当前串中的像素可能均为第二像素(即当前串中不包括第一像素),或者当前串中也可能包括第一像素和第二像素(即当前串中除了第一像素,还包括第二像素);其中,第二像素为位于当前编码单元的第一行以外的像素,且当前串中的各个第二像素分布在当前编码单元中的至少一行中。那么计算机设备还可采用如下任一种方式导出当前串中的各个第二像素的预测值:
方式A:以行为单位,逐行地获取当前串的各行中的第二像素的预测值。例如,设当前串中的第二像素分布在当前编码单元中的第2行-第3行,则可按照从上往下的顺序,先一次性导出第2行中的第二像素的预测值,再一次性导出第3行中的第二像素的预测值;其中,每一行中的第二像素均使用上方一行中的参考像素的重建值作为预测值。本方式A通过逐行地获取第二像素的预测值,可有效提升预测值的获取效率,从而提升编码效率。
方式B:以单个像素为单位,逐像素地获取当前串中的各个第二像素的预测值。例如,设当前串中的第二像素分布在当前编码单元中的第2行-第3行,则可先逐个像素地导出第2行中的各个第二像素的预测值,再逐个像素地导出第3行中的第二像素的预测值;其中,每个第二像素均使用位于其上方的参考像素的重建值作为预测值。
由此可见,在上述方式A-B中,任一第二像素的预测值是根据位于该第二像素的上方、且与该第二像素相邻的参考像素的重建值获取到的。具体的,可直接使用任一第二像素的上方、且与该第二像素相邻的参考像素的重建值作为该第二像素的预测值。
2)若当前编码单元的扫描模式为竖直扫描模式，且该竖直扫描模式指示按照从左往右的顺序对当前编码单元进行扫描；则第一像素的参考像素的重建值的获取方式包括以下任一种：
第一种:从帧内串复制模式对应的第一存储空间(或称为帧内串复制模式的参考像素存储器)中获取第一像素的参考像素的重建值。
第二种:从帧内预测模式对应的第二存储空间(或称为帧内预测模式的 参考像素存储器)中获取第一像素的参考像素的重建值。
第三种:将当前图像划分成多个N×N的区域,当第一像素位于任一N×N的区域的第一列时,从第二存储空间中获取第一像素的参考像素的重建值;否则,从第一存储空间中获取第一像素的参考像素的重建值。
需要说明的是,当前串中的像素可能均为第三像素(即当前串中不包括第一像素),或者当前串中也可能包括第一像素和第三像素(即当前串中除了第一像素,还包括第三像素);其中,第三像素为位于当前编码单元的第一列以外的像素,且当前串中的各个第三像素分布在当前编码单元中的至少一列中。那么计算机设备还可采用如下任一种方式导出当前串中的各个第三像素的预测值:
方式a:以列为单位,逐列地获取当前串的各列中的第三像素的预测值。例如,设当前串中的第三像素分布在当前编码单元中的第2列-第3列,则可按照从左往右的顺序,先一次性导出第2列中的第三像素的预测值,再一次性导出第3列中的第三像素的预测值;其中,每一列中的第三像素均使用左方一列中的参考像素的重建值作为预测值。本方式a通过逐列地获取第三像素的预测值,可有效提升预测值的获取效率,从而提升编码效率。
方式b:以单个像素为单位,逐像素地获取当前串中的各个第三像素的预测值。例如,设当前串中的第三像素分布在当前编码单元中的第2列-第3列,则可按照从左往右的顺序,先逐个像素地导出第2列中的各个第三像素的预测值,再逐个像素地导出第3列中的第三像素的预测值;其中,每个第三像素均使用位于其左方的参考像素的重建值作为预测值。
由此可见,在上述方式a-b中,任一第三像素的预测值是根据位于该第三像素的左方、且与该第三像素相邻的参考像素的重建值获取到的。具体的,可直接使用任一第三像素的左方、且与该第三像素相邻的参考像素的重建值作为该第三像素的预测值。
在一种可选的具体实现中，视频编码设备还可编码串指示标志，使得编码信息中包括该串指示标志；其中，若当前串为单位矢量串，则串指示标志为目标数值。其中，目标数值可根据经验值或者业务需求设置，例如目标数值可设置为0或1等。可选的，该串指示标志可以是一个二值标志，以指示当前串是否为单位矢量串。例如，当串指示标志为1（即目标数值为1）时，可指示当前串为单位矢量串；当串指示标志为0时，可指示当前串不为单位矢量串。
在一种可选的具体实现中,视频编码设备还可编码数值K,使得编码信息中包括数值K;数值K的取值大于或等于零,该数值K用于指示当前编码单元中的目标组像素中位于第K组像素后的至少一个串为单位矢量串。其中,当前编码单元可分成多组像素;一组像素包括:一个串或者至少一个未匹配像素。
具体的,如果视频编码设备采用图1i所示的帧内串复制技术对当前编码单元进行串划分,那么当前编码单元可能会被划分出未匹配像素;此情况下,这些被划分出的未匹配像素可构成一组或多组像素。具体的,若划分出的未匹配像素中存在至少两个未匹配像素在当前编码单元中的分布位置是连续的,则可以将该至少两个未匹配像素中的所有未匹配像素作为一组像素,或者从该至少两个未匹配像素中选取部分连续的未匹配像素作为一组像素,未被选取的各个未匹配像素分别作为一组像素。若划分出的未匹配像素存在独立的未匹配像素(即处于两个串之间的未匹配像素),则一个独立的未匹配像素可作为一组像素。
需要说明的是,本申请实施例只是示例性地列举了几种将未匹配像素划分成一组或多组像素的划分方式,并非穷举。实际应用中,可由编码端根据需求设置划分方式。如果视频编码设备采用图1j所示的具有串长度分辨率控制的帧内串复制技术对当前编码单元进行串划分,那么当前编码单元会被划分成多个串(如匹配串和/或不完全匹配串);在此情况下,每个匹配串可作为一组像素,且每个不完全匹配串也可作为一组像素。
在一种实施方式中,当前编码单元中的编码信息包括的K的数量可以为一个,目标组像素可以是当前编码单元中的所有组像素;在此情况下,数值K可用于指示当前编码单元中的所有组像素中位于第K组像素后的所有串均为单位矢量串。可选的,若每组像素均包括一个串,则数值K可用于指示当前编码单元中的所有组像素中第K+1个串为单位矢量串。
在一种实施方式中，每组像素均包括一个串；当前编码单元中的编码信息包括的K的数量可以为多个，一个K对应一个目标组像素；在此实施方式下，第i个K可用于指示当前编码单元中的第i个K对应的目标组像素中第K+1个串为单位矢量串。其中，第i个K对应的目标组像素为：当前编码单元中未根据前i-1个K确定出串类型（单位矢量串或者非单位矢量串）的剩余组像素。例如，设第1个K的取值等于2，则第1个K对应的目标组像素便为当前编码单元中的所有组像素，且第1个K指示当前编码单元中的第1组像素中的串和第2组像素中的串不为单位矢量串，而第3组像素中的串为单位矢量串。设第2个K的取值等于1，则第2个K对应的目标组像素便为：未根据第1个K确定出串类型的剩余组像素，即第4组像素及位于第4组像素以后的组像素。并且，第2个K所对应的目标组像素中的第1组像素（即第K组像素）对应当前编码单元中的第4组像素，而第2个K所对应的目标组像素中的第2组像素（即第K+1组像素）对应当前编码单元中的第5组像素。因此，第2个K可指示第4组像素中的串不为单位矢量串，第5组像素中的串为单位矢量串。
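上述多个K依次作用于剩余组像素的语义，可用如下Python示意代码表示（假设每组像素均包括一个串，且假设最后一个K之后未被指示的串均不为单位矢量串；函数名与类型标记为本文假设）：

```python
def mark_unit_vector_strings(num_groups, k_list):
    # 第i个K作用于尚未确定串类型的剩余组像素:
    # 剩余组中的前K个串不为单位矢量串, 第K+1个串为单位矢量串
    types = []
    for k in k_list:
        types += ["non_unit"] * k + ["unit"]
    # 假设: 未被任何K指示的其余串均不为单位矢量串
    types += ["non_unit"] * (num_groups - len(types))
    return types

# 对应文中示例: 第1个K=2 → 第3组像素中的串为单位矢量串;
# 第2个K=1 → 第5组像素中的串为单位矢量串
assert mark_unit_vector_strings(6, [2, 1]) == [
    "non_unit", "non_unit", "unit", "non_unit", "unit", "non_unit"]
```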
在一种可选的具体实现中,视频编码设备还可编码当前串的串位移矢量,使得编码信息中包括串位移矢量;其中,若当前串为单位矢量串,则串位移 矢量为单位矢量,此处的单位矢量可包括第一元素和第二元素。若当前编码单元的扫描模式为水平扫描模式,则第一元素的元素值为第一数值,第二元素的元素值为第二数值;若当前编码单元的扫描模式为竖直扫描模式,则第一元素的元素值为第二数值,第二元素的元素值为第一数值。其中,第一数值和第二数值的具体取值可根据经验值或者业务需求设置;例如设置第一数值为0,设置第二数值为-1:那么当前编码单元的扫描模式为水平扫描模式时,单位矢量则为(0,-1);当前编码单元的扫描模式为竖直扫描模式时,单位矢量则为(-1,0)。
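按照文中"第一数值为0、第二数值为-1"的示例取值，单位矢量随扫描模式的构成可用如下Python示意代码表示（函数与参数命名为本文假设）：

```python
def unit_vector(scan_mode, first_value=0, second_value=-1):
    # 水平扫描模式: (第一数值, 第二数值) = (0, -1), 指向正上方
    if scan_mode == "horizontal":
        return (first_value, second_value)
    # 竖直扫描模式: (第二数值, 第一数值) = (-1, 0), 指向正左方
    return (second_value, first_value)

assert unit_vector("horizontal") == (0, -1)
assert unit_vector("vertical") == (-1, 0)
```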
由前述可知,当前图像被划分成多个处理单元,每个处理单元包括一个或多个编码单元;且多个处理单元包括:当前处理单元和已编码的处理单元,当前处理单元包括当前编码单元。串位移矢量用于表示当前串到参考串的位移,在当前串为单位矢量串时,可允许当前串的参考串不满足以下一项或多项条件:
①参考串中的任一参考像素在指定区域范围内;例如,该指定区域范围可以为当前处理单元的范围、或当前处理单元左方的H个已解码的处理单元的范围,H为正整数。在示例性的实施例中,H的取值可根据处理单元(即LCU或CTU)的大小确定;具体的,可根据下述公式确定H:
H=(1<<((7-(log2_lcu_size_minus2+2))<<1))-(((log2_lcu_size_minus2+2)<7)?1:0)
其中，lcu_size表示处理单元的宽或高，其取值为2的整数次幂；log2_lcu_size_minus2=log2(lcu_size)-2。"<<"运算符表示左移，用来将一个数的各二进制位全部左移指定位数，高位舍弃，低位补0。(((log2_lcu_size_minus2+2)<7)?1:0)是一个三目运算符，先判断((log2_lcu_size_minus2+2)<7)是否成立，若成立，则(((log2_lcu_size_minus2+2)<7)?1:0)=1；若不成立，则(((log2_lcu_size_minus2+2)<7)?1:0)=0。例如，若处理单元的大小为128×128，则lcu_size=128，log2(128)=7，log2_lcu_size_minus2=5，H=(1<<(0<<1))-0=1。再例如，若处理单元的大小等于64×64，则lcu_size=64，log2(64)=6，log2_lcu_size_minus2=4，H=(1<<(1<<1))-1=3。
或者,也可以根据以下公式确定H:
H=(1<<((7-log2_lcu_size)<<1))-(((log2_lcu_size)<7)?1:0)
其中,log2_lcu_size=log2(lcu_size);先判断((log2_lcu_size)<7)是否成立,若成立,则(((log2_lcu_size)<7)?1:0)=1;若不成立,则(((log2_lcu_size)<7)?1:0)=0。
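上述两个公式等价，均可由文中的两个算例（128×128对应H=1，64×64对应H=3）验证；以下Python示意代码按第二个公式实现（函数名为本文假设）：

```python
import math

def derive_H(lcu_size):
    # H = (1 << ((7 - log2_lcu_size) << 1)) - ((log2_lcu_size < 7) ? 1 : 0)
    log2_lcu_size = int(math.log2(lcu_size))
    return (1 << ((7 - log2_lcu_size) << 1)) - (1 if log2_lcu_size < 7 else 0)

assert derive_H(128) == 1   # 文中示例: 128×128的处理单元, 左方参考1个已解码处理单元
assert derive_H(64) == 3    # 文中示例: 64×64的处理单元, 左方参考3个
assert derive_H(32) == 15   # 按公式推得, 文中未给出该算例
```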
②当参考串中的任一参考像素处于与当前处理单元左方相邻的已解码的处理单元内时，参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建。其中，预定像素和目标区域均可根据处理单元（即LCU或CTU）的大小确定；具体的，若处理单元的大小为M×M，则预定像素为M个像素，目标区域为参考串中的任一像素往右移动M个像素后对应的(M/2)×(M/2)区域。
例如,假设处理单元的大小为128×128,则预定像素可为128个像素,目标区域则为参考串中的任一像素往右移动128个像素后对应的64×64的区域。其中,参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建是指:参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的任一像素均尚未重建,或者,参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的左上角处的像素尚未重建。
可选的,左方相邻的处理单元中包括参考串中任一参考像素的参考区域可对应地在当前处理单元中找到相同的位置区域,且该位置区域的左上角坐标不应与当前CU的左上角坐标相同。以处理单元的大小为128×128,CU的大小为64×64为例;设参考串中的任一参考像素的位置为(xRef,yRef),则((xRef+128)/64×64,yRef/64×64)不可得(即第一存储空间中无法找到该像素的重建值),((xRef+128)/64×64,yRef/64×64)不应等于当前CU左上角坐标(xCb,yCb)。需要说明的是,这里“/”表示向下取整的除法。
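以处理单元大小为128×128、CU大小为64×64为例，上述对参考像素(xRef,yRef)的限制可用如下Python示意代码表示（"/"按文中约定为向下取整的除法；函数名与参数为本文假设，已重建区域以其64×64区域左上角坐标的集合近似表示）：

```python
def ref_region_topleft(xRef, yRef):
    # 参考像素右移128像素后的位置所在64×64区域的左上角坐标
    return ((xRef + 128) // 64 * 64, yRef // 64 * 64)

def ref_pixel_allowed(xRef, yRef, xCb, yCb, reconstructed_topleft):
    tl = ref_region_topleft(xRef, yRef)
    # 该区域左上角尚未重建, 且不等于当前CU的左上角坐标(xCb, yCb)
    return tl not in reconstructed_topleft and tl != (xCb, yCb)

assert ref_region_topleft(10, 70) == (128, 64)
# 该64×64区域左上角与当前CU左上角重合, 不允许
assert ref_pixel_allowed(10, 70, 128, 64, set()) is False
# 区域未重建且不与当前CU左上角重合, 允许
assert ref_pixel_allowed(10, 70, 192, 64, set()) is True
```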
③参考串中的各个参考像素位于同一个对齐区域,该对齐区域的尺寸是根据处理单元的尺寸确定的。具体的,可采用vSize表示对齐区域的高或宽,且可定义:vSize=LcuSize>=64?64:LcuSize。其中,vSize=LcuSize>=64?64:LcuSize表示:判断LcuSize是否大于或等于64,LcuSize表示处理单元的高或宽;若是,则vSize等于64;否则,vSize等于LcuSize。
也就是说,在此情况下:串位移矢量指向的参考串中所有的参考像素只能来自同一个vSize x vSize对齐区域。即需满足如下条件:
xRefA/vSize=xRefB/vSize
yRefA/vSize=yRefB/vSize
其中,(xRefA,yRefA)和(xRefB,yRefB)是参考串中任意两个参考像素(亮度分量)A和B的坐标。需要说明的是,这里“/”表示向下取整的除法。
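上述vSize的定义以及"参考串中所有参考像素来自同一个vSize×vSize对齐区域"的判断，可用如下Python示意代码表示（函数名为本文假设，"/"为向下取整的除法）：

```python
def vsize(lcu_size):
    # vSize = LcuSize >= 64 ? 64 : LcuSize
    return 64 if lcu_size >= 64 else lcu_size

def same_aligned_region(ref_pixels, lcu_size):
    # 所有参考像素(x, y)按vSize向下取整后应落在同一个对齐区域
    v = vsize(lcu_size)
    regions = {(x // v, y // v) for (x, y) in ref_pixels}
    return len(regions) == 1

assert vsize(128) == 64 and vsize(32) == 32
assert same_aligned_region([(0, 0), (63, 63)], 128) is True   # 同一个64×64区域
assert same_aligned_region([(0, 0), (64, 0)], 128) is False   # 跨越对齐区域边界
```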
④参考串中的任一参考像素不属于当前串,即参考串和当前串不重叠。
在一种可选的具体实现中,视频编码设备还可编码串扫描模式标志(或称为串预测扫描模式标志),使得编码信息中包括串扫描模式标志。具体的,若当前编码单元的扫描模式为水平扫描模式,则串扫描模式标志为第三数值;若当前编码单元的扫描模式为竖直扫描模式,则串扫描模式标志为第四数值。其中,第三数值和第四数值均可根据经验值或者业务需求设置;例如将第三数值设置为0,将第四数值设置为1。
需要说明的是,若当前串不为单位矢量串,则视频编码设备还可从当前处理单元和已编码的处理单元中的部分已编码的编码单元中,搜索当前串的参考串;并根据参考串获取当前串中的各个像素的预测值,以得到当前编码单元的编码信息。此情况下,参考串需满足以下一项或多项条件:
①参考串中的任一参考像素在当前处理单元和当前处理单元左方的H个已解码的处理单元范围内,H为正整数。
②当参考串中的任一参考像素处于与当前处理单元左方相邻的已编码的处理单元内时,参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建。
③参考串中的各个参考像素位于同一个对齐区域,该对齐区域的尺寸是根据处理单元的尺寸确定的。
④参考串中的各个参考像素和当前串中的各个像素位于同一个独立编码区域内,此处的独立编码区域可以包括当前图像,或者当前图像中的片、条带。
⑤参考串中的任一参考像素不属于当前串。
还需说明的是,若当前编码单元中包括未匹配像素(即在当前图像的已编码区域中的参考区域内,未找到对应的参考像素的像素),则视频编码设备可直接编码未匹配像素的像素值,使得编码信息中包括未匹配像素的像素值。
本申请实施例在编码的过程中,在当前编码单元的当前串为单位矢量串,且当前串中包括第一像素(如当前编码单元中的第一行的像素或者当前编码单元中的第一列的像素等等)时,可从当前图像中的历史解码单元中确定第一像素的参考像素,并根据参考像素的重建值获取第一像素的预测值以实现编码处理。可见,本申请实施例通过允许使用与当前编码单元相邻的历史解码单元中的像素作为参考像素,使得当前编码单元中任一串均可成为单位矢量串,这样可有效扩宽单位矢量串的适用范围,有助于提升编码性能。并且,本申请实施例针对当前编码单元所采用的预测编码模式并不局限于等值串与单位矢量串子模式,还可以是帧内串复制模式中的串预测子模式等其他预测编码模式;也就是说,本申请实施例可允许串预测子模式下的编码单元使用单位矢量串,这样可进一步扩宽单位矢量串的适用范围,提升串预测的编码性能。
请参见图4,是本申请实施例提出一种视频解码方法的流程图;该视频解码方法可以由上述所提及的视频解码设备,或者视频解码设备中的视频解码器执行。为便于阐述,后续均以视频解码设备执行该视频解码方法为例进行说明;如图4所示,该视频解码方法可包括以下步骤S401-S403:
S401,在当前图像的当前编码单元中确定待解码的当前串。
在具体实施过程中，视频解码设备可获取当前图像中的当前编码单元的编码信息，当前编码单元包括P行Q列个像素；其中，P和Q的取值均为正整数。然后，可根据当前编码单元的编码信息，从当前编码单元中确定出待解码的当前串。具体的，视频解码设备可从当前编码单元的编码信息中解码出当前编码单元的预测编码模式；若预测编码模式为ISC中的串预测子模式或者等值串与单位矢量串子模式，则进一步从当前编码单元的编码信息中解码出扫描模式和串长度。然后，可根据扫描模式和串长度从当前编码单元中确定待解码的当前串。
需要说明的是,解码侧的当前编码单元均是指:当前待解码(即当前即将开始解码)或者当前正在解码(即已解码了部分像素)的CU。当前编码单元可以是在帧内串复制模式(ISC)中的串预测子模式下被编码的;并且,该串预测子模式下的编码单元被允许使用单位矢量串。或者,当前编码单元也可以是在帧内串复制模式(ISC)中的等值串与单位矢量串子模式下被编码的,本申请对此不作限制。为便于阐述,后文均以采用串预测子模式对当前编码单元进行编码为例进行说明。
在确定出当前串后,视频解码设备可采用下述任一方式判断当前串是否为单位矢量串:
方式一,视频解码设备可从当前编码单元的编码信息中解码出串指示标志;若串指示标志为目标数值,则确定当前串为单位矢量串。
方式二,视频解码设备可从当前编码单元的编码信息中解码出数值K,数值K的取值大于或等于零;且该数值K用于指示当前编码单元中的目标组像素中位于第K组像素后的至少一个串为单位矢量串,一组像素包括:一个串或者至少一个未匹配像素。若当前串在所述目标组像素中位于第K组像素后,则确定当前串为单位矢量串。
可选的,若每组像素均包括一个串,则数值K可具体用于指示当前编码单元中的目标组像素中,第K+1个串为单位矢量串;那么在此实施方式下,可进一步判断当前串是否为在所述目标组像素中的第K+1个串;若当前串在所述目标组像素中为第K+1个串,则确定当前串为单位矢量串。
方式三,视频解码设备可从当前编码单元的编码信息中解码出当前串的串位移矢量;若串位移矢量为单位矢量,则确定当前串为单位矢量串。
其中,单位矢量可包括第一元素和第二元素。若当前编码单元的扫描模式为水平扫描模式,则第一元素的元素值为第一数值,第二元素的元素值为第二数值;若当前编码单元的扫描模式为竖直扫描模式,则第一元素的元素值为第二数值,第二元素的元素值为第一数值。其中,第一数值和第二数值的具体取值可根据经验值或者业务需求设置;例如设置第一数值为0,设置第二数值为-1:那么当前编码单元的扫描模式为水平扫描模式时,单位矢量则为(0,-1);当前编码单元的扫描模式为竖直扫描模式时,单位矢量则为(-1,0)。
其中,当前编码单元的扫描方式的确定方式可以如下:从当前编码单元的编码信息中解码出串扫描模式标志;若串扫描模式标志为第三数值,则确定当前编码单元的扫描模式为水平扫描模式;若串扫描模式标志为第四数值,则确定当前编码单元的扫描模式为竖直扫描模式。
由前述可知,当前图像被划分成多个处理单元,每个处理单元包括一个或多个编码单元;且多个处理单元包括:当前处理单元和已解码的处理单元,当前处理单元包括当前编码单元。串位移矢量用于表示当前串到参考串的位移,在当前串为单位矢量串时,可允许当前串的参考串不满足以下一项或多项条件:
①参考串中的任一参考像素在指定区域范围内;例如,指定区域范围可以为当前处理单元的范围,或者和当前处理单元左方的H个已解码的处理单元的范围。
②当参考串中的任一参考像素处于与当前处理单元左方相邻的已解码的处理单元内时,参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建。可选的,左方相邻的处理单元中包括参考串中任一参考像素的参考区域可对应地在当前处理单元中找到相同的位置区域,且该位置区域的左上角坐标不应与当前CU的左上角坐标相同。
③参考串中的各个参考像素位于同一个对齐区域。
④参考串中的任一参考像素不属于当前串。
S402,若当前串为单位矢量串,且当前串中包括第一像素,则从当前图像中的历史解码单元中确定第一像素的参考像素。
其中,历史解码单元是指当前图像中与当前编码单元相邻的已解码的编码单元,第一像素对应的参考像素和第一像素在当前图像中相邻;且第一像素满足如下条件:第一像素对应的参考像素未位于当前编码单元中。
需要说明的是,历史解码单元和当前编码单元相邻是指:历史解码单元和当前编码单元之间不存在其他的编码单元。第一像素对应的参考像素和第一像素在当前图像中相邻是指:在当前图像中,第一像素的参考像素所处的行和第一像素所处的行是相互毗邻(即紧挨)的。或者,第一像素对应的参考像素和第一像素在当前图像中相邻是指:在当前图像中,第一像素的参考像素所处的列和第一像素所处的列之间是相互毗邻的。
可见,当第一像素和第一像素的参考像素在当前图像中相邻时,第一像素应当是当前编码单元中临界行或临界列上的像素;此处的临界行又可称为边缘行,其可以是当前编码单元的第一行或者最后一行;同理,临界列又可称为边缘列,其可以是当前编码单元的第一列或者最后一列。在具体实现中,第一像素的具体含义可以根据当前编码单元的扫描模式确定。例如,若当前编码单元的扫描模式为水平扫描模式,且水平扫描模式指示按照从上往下的顺序对当前编码单元进行扫描,则历史解码单元位于当前编码单元的上方(即 历史解码单元为位于当前编码单元的上方且与当前编码单元相邻的已解码CU),第一像素为位于当前编码单元的第一行的像素;即此情况下,当前串包括当前编码单元中第一行上的至少一个像素。若当前编码单元的扫描模式为竖直扫描模式,且竖直扫描模式指示按照从左往右的顺序对当前编码单元进行扫描,则历史解码单元位于当前编码单元的左方(即历史解码单元为位于当前编码单元的左方且与当前编码单元相邻的已解码CU),第一像素为位于当前编码单元的第一列的像素;即此情况下,当前串包括当前编码单元中第一列上的至少一个像素。
S403,根据第一像素的参考像素的重建值获取第一像素的预测值,以得到解码图像。
在具体实施过程中,可先获取第一像素的参考像素的重建值,并将第一像素的参考像素的重建值作为第一像素的预测值,以得到解码图像;通过此解码方式,可简化预测值的获取过程,提升解码效率。其中,当前编码单元的扫描模式不同,第一像素的参考像素的重建值的获取方式也不同;具体参见下述描述:
1)若当前编码单元的扫描模式为水平扫描模式，且该水平扫描模式指示按照从上往下的顺序对当前编码单元进行扫描；则第一像素的参考像素的重建值的获取方式包括以下任一种：
第一种:从帧内串复制模式对应的第一存储空间(或称为帧内串复制模式的参考像素存储器)中获取第一像素的参考像素的重建值。
第二种:从帧内预测模式对应的第二存储空间(或称为帧内预测模式的参考像素存储器)中获取第一像素的参考像素的重建值。
第三种:将当前图像划分成多个N×N的区域,当第一像素位于任一N×N的区域的第一行时,从第二存储空间中获取第一像素的参考像素的重建值;否则,从第一存储空间中获取第一像素的参考像素的重建值。
需要说明的是,当前串中的像素可能均为第二像素(即当前串中不包括第一像素),或者当前串中也可能包括第一像素和第二像素(即当前串中除了第一像素,还包括第二像素);其中,第二像素为位于当前编码单元的第一行以外的像素,且当前串中的各个第二像素分布在当前编码单元中的至少一行中。那么计算机设备还可采用如下任一种方式导出当前串中的各个第二像素的预测值:
方式A:以行为单位,逐行地获取当前串的各行中的第二像素的预测值。
方式B:以单个像素为单位,逐像素地获取当前串中的各个第二像素的预测值。
在上述方式A-B中,任一第二像素的预测值是根据位于该第二像素的上方、且与该第二像素相邻的参考像素的重建值获取到的。
2)若当前编码单元的扫描模式为竖直扫描模式，且该竖直扫描模式指示按照从左往右的顺序对当前编码单元进行扫描；则第一像素的参考像素的重建值的获取方式包括以下任一种：
第一种:从帧内串复制模式对应的第一存储空间中获取第一像素的参考像素的重建值。
第二种:从帧内预测模式对应的第二存储空间中获取第一像素的参考像素的重建值。
第三种:将当前图像划分成多个N×N的区域,当第一像素位于任一N×N的区域的第一列时,从第二存储空间中获取第一像素的参考像素的重建值;否则,从第一存储空间中获取第一像素的参考像素的重建值。
需要说明的是,当前串中的像素可能均为第三像素(即当前串中不包括第一像素),或者当前串中也可能包括第一像素和第三像素(即当前串中除了第一像素,还包括第三像素);其中,第三像素为位于当前编码单元的第一列以外的像素,且当前串中的各个第三像素分布在当前编码单元中的至少一列中。那么计算机设备还可采用如下任一种方式导出当前串中的各个第三像素的预测值:
方式a:以列为单位,逐列地获取当前串的各列中的第三像素的预测值。
方式b:以单个像素为单位,逐像素地获取当前串中的各个第三像素的预测值。在上述方式a-b中,任一第三像素的预测值是根据位于任一第三像素的左方、且与该任一第三像素相邻的参考像素的重建值获取到的。
需要说明的是，若当前串不为单位矢量串，则视频解码设备还可从当前处理单元和已解码的处理单元中的部分已解码的编码单元中，搜索当前串的参考串；并根据参考串获取当前串中的各个像素的预测值，以得到解码图像。此情况下，参考串需满足以下一项或多项条件：
①参考串中的任一参考像素在当前处理单元和当前处理单元左方的H个已解码的处理单元范围内,H为正整数。
②当参考串中的任一参考像素处于与当前处理单元左方相邻的已编码的处理单元内时,参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建。
③参考串中的各个参考像素位于同一个对齐区域,该对齐区域的尺寸是根据处理单元的尺寸确定的。
④参考串中的各个参考像素和当前串中的各个像素位于同一个独立编码区域内,此处的独立编码区域可以包括当前图像,或者当前图像中的片、条带。
⑤参考串中的任一参考像素不属于当前串。
还需说明的是,若当前编码单元包括未匹配像素,则未匹配像素的像素值可直接从当前编码单元的编码信息中解码得到。
本申请实施例在解码的过程中,在当前编码单元的当前串为单位矢量串, 且当前串中包括第一像素(如当前编码单元中的第一行的像素或者当前编码单元中的第一列的像素等等)时,可从当前图像中的历史解码单元中确定第一像素的参考像素,并根据参考像素的重建值获取第一像素的预测值以实现解码处理。可见,本申请实施例通过允许使用当前编码单元相邻的历史解码单元中的像素作为参考像素,使得当前编码单元中任一串均可成为单位矢量串,这样可有效扩宽单位矢量串的适用范围,有助于提升解码性能。并且,本申请实施例针对当前编码单元采用的预测编码模式并不局限于等值串与单位矢量串子模式,还可以是帧内串复制模式中的串预测子模式等其他预测编码模式;也就是说,本申请实施例可允许串预测子模式下的编码单元使用单位矢量串,这样可进一步扩宽单位矢量串的适用范围,提升串预测的编码性能。
基于上述图2所示的视频编码方法的实施例描述和图4所示的视频解码方法的实施例描述,本申请实施例提出了一种基于单位矢量串的串预测方法;该方法适用于使用帧内串复制模式的编解码器,下面从解码端进行描述:
1)单位矢量为:
a)如果ISC块(当前CU)的扫描模式为水平扫描模式,那么单位矢量为(0,-1);
b)如果ISC块(当前CU)的扫描模式为竖直扫描模式,那么单位矢量为(-1,0)。
2)有以下可选方式确认当前串为单位矢量串:
a)从码流(如当前CU的编码信息)中解码一个二值标志(即前述所提及的串指示标志),若二值标志为目标数值,则指示当前串为单位矢量串;
b)从码流(如当前CU的编码信息)中解码一个数值K,指示当前编码单元中的目标组像素中位于第K组像素后的至少一个串为单位矢量串。若当前串在目标组像素中位于第K组像素后,则确定当前串为单位矢量串;
c)从码流(如当前CU的编码信息)解码当前串的串矢量(即串位移矢量),如果串矢量为单位矢量,则当前串为单位矢量串。
3)单位矢量串的预测值的导出:
a)如果当前ISC块(当前CU)采用水平扫描顺序
i、从上到下逐行导出当前串的预测值,该串每行中的像素使用上方一行像素的值(如重建值)作为预测值;
ii、i.的另一种实现方式,逐像素完成单位矢量串的重建,该串中每个像素使用位于其上方像素的值(如重建值)作为预测值;
iii、在i.和ii.中,如果待预测的像素位于当前解码块(即当前CU)的第一行,那么该像素的参考像素位于当前解码块之外上方一行,该参考像素的重建值有以下可选的获取方式:
①从帧内串复制的参考像素存储器(即前述所提及的第一存储空间)中获取该参考像素的重建值;
②从帧内预测的参考像素存储器(即前述所提及的第二存储空间)中获取该参考像素的重建值;
③将当前图像分为一系列大小为N×N(例如64×64)的区域,仅当该像素位于区域的第一行,从帧内预测的参考像素存储器中获取该参考像素的重建值;否则,从帧内串复制的参考像素存储器中获取该参考像素的重建值。
b)如果当前ISC块(当前CU)采用竖直扫描顺序
i、逐列导出当前串的预测值，该串每列中的像素使用左方一列像素的值（如重建值）作为预测值；
ii、i.的另一种实现方式,逐像素完成单位矢量串的重建,该串中每个像素使用位于其左方像素的值(如重建值)作为预测值;
iii、在i.和ii.中,如果待预测的像素位于当前解码块的第一列,该像素的参考像素位于当前解码块之外左方一列,该参考像素的重建值有以下可选的获取方式:
①从帧内串复制的参考像素存储器中获取该参考像素的重建值;
②从帧内预测的参考像素存储器中获取该参考像素的重建值;
③将当前图像分为一系列大小为N×N（例如64×64）的区域，当该像素位于区域的第一列，从帧内预测的参考像素存储器中获取该参考像素的重建值；否则，从帧内串复制的参考像素存储器中获取该参考像素的重建值。
4)可选的，在帧内串复制串预测子模式中允许使用单位矢量串；除非当前串为单位矢量偏移串，符合标准的位流应满足如下全部或部分参考范围限制：
a)串位移矢量指向的参考串中任意参考像素限制在一定的指定区域范围内。例如，指定区域范围可以是：当前处理单元（如当前最大编码单元）的范围或左边H个处理单元（如最大编码单元）的范围内；其中，H的大小可由处理单元（如LCU或者CTU）的大小决定：H=(1<<((7-(log2_lcu_size_minus2+2))<<1))-(((log2_lcu_size_minus2+2)<7)?1:0)。
b)当串位移矢量指向的参考串中任意参考像素落在左边相邻最大编码单元,且最大编码单元的亮度样本尺寸为128×128时,参考串中任意参考像素亮度样本右移128像素后的位置所在的64×64区域的左上角尚未重建。且,参考串中任意参考像素落在左边相邻CTU中所在的64×64的区域可对应找到当前CTU中的相同的区域位置,该64×64的区域的左上角坐标不应与当前编码块的左上角坐标位置相同。即参考串亮度分量任意位置为(xRef,yRef),((xRef+128)/64×64,yRef/64×64)不可得;((xRef+128)/64×64,yRef/64×64)不应等于当前块左上角位置(xCb,yCb)。
c)参考串位置限定。定义样本限制区域大小vSize=LcuSize>=64?64:LcuSize。串位移矢量指向的参考串中所有的像素只能来自同一个vSize x  vSize对齐区域。即:
xRefA/vSize=xRefB/vSize
yRefA/vSize=yRefB/vSize
其中,(xRefA,yRefA)和(xRefB,yRefB)是参考串上任意两个亮度分量A和B的坐标。
d)串位移矢量指向的参考串中任意参考像素不应位于当前串内。
本申请实施例所提出的方法扩展了单位矢量串的使用范围,有助于提升串矢量的编码效率;并且,通过将单位矢量串与串预测子模式结合,还可凭借单位矢量串具有的较低实现复杂度的特点,提升串预测的编码性能。
基于上述视频编码方法的实施例描述,本申请实施例还公开了一种视频编码装置,所述视频编码装置可以是运行于上述所提及的视频编码设备中的一个计算机程序(包括程序代码)。该视频编码装置可以执行图2所示的方法。请参见图5,所述视频编码装置可以运行如下单元:
获取单元501,用于在当前图像的当前编码单元中确定待编码的当前串;
编码单元502,用于若所述当前串为单位矢量串,且所述当前串中包括第一像素,则从所述当前图像中的历史编码单元中确定所述第一像素的参考像素;所述历史编码单元是所述当前图像中与所述当前编码单元相邻的已编码的编码单元,所述第一像素的参考像素和所述第一像素在所述当前图像中相邻;
所述编码单元502,还用于根据所述第一像素的参考像素的重建值获取所述第一像素的预测值,以得到所述当前编码单元的编码信息。
在一种实施方式中,所述当前编码单元是在帧内串复制模式中的串预测子模式下被编码的;并且,所述串预测子模式下的编码单元被允许使用单位矢量串。
在一种实施方式中，编码单元502还可用于：编码串指示标志，使得所述编码信息中包括所述串指示标志。其中，若所述当前串为单位矢量串，则所述串指示标志为目标数值。
再一种实施方式中,编码单元502还可用于:编码数值K,使得所述编码信息中包括所述数值K,数值K的取值大于或等于零。其中,所述数值K用于指示所述当前编码单元的目标组像素中位于第K组像素后的至少一个串为单位矢量串,一组像素包括:一个串或者至少一个未匹配像素。
再一种实施方式中,若每组像素均包括一个串,则所述数值K用于指示所述当前编码单元的目标组像素中第K+1个串为所述单位矢量串。
再一种实施方式中,编码单元502还可用于:编码所述当前串的串位移矢量,使得所述编码信息中包括所述串位移矢量。其中,若所述当前串为所述单位矢量串,则所述串位移矢量为单位矢量。
再一种实施方式中,所述单位矢量包括第一元素和第二元素;若所述当前编码单元的扫描模式为水平扫描模式,则所述第一元素的元素值为第一数值,所述第二元素的元素值为第二数值;若所述当前编码单元的扫描模式为竖直扫描模式,则所述第一元素的元素值为所述第二数值,所述第二元素的元素值为所述第一数值。
再一种实施方式中,所述当前图像被划分成多个处理单元,每个所述处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已编码的处理单元,所述当前处理单元包括所述当前编码单元;
所述串位移矢量用于表示所述当前串到参考串的位移,且允许所述参考串不满足以下一项或多项条件:
所述参考串中的任一参考像素在指定区域范围内;
当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
所述参考串中的任一参考像素不属于所述当前串。
再一种实施方式中,所述当前编码单元包括P行Q列个像素;其中,P和Q的取值均为正整数;若所述当前编码单元的扫描模式为水平扫描模式,且所述水平扫描模式指示按照从上往下的顺序对所述当前编码单元进行扫描,则所述历史编码单元位于所述当前编码单元的上方,所述第一像素为位于所述当前编码单元的第一行的像素。
在此情况下,若所述当前串中的像素均为第二像素,或者所述当前串中包括所述第一像素和所述第二像素;所述第二像素为位于所述当前编码单元的第一行以外的像素,且所述当前串中的各个第二像素分布在所述当前编码单元中的至少一行中;则相应的,编码单元502还可用于:
以行为单位,逐行地获取所述当前串的各行中的第二像素的预测值;或者,以单个像素为单位,逐像素地获取所述当前串中的各个第二像素的预测值;
其中,任一第二像素的预测值是根据位于所述任一第二像素的上方、且与该任一第二像素相邻的参考像素的重建值获取到的。
再一种实施方式中,所述当前编码单元包括P行Q列个像素;其中,P和Q的取值均为正整数;若所述当前编码单元的扫描模式为竖直扫描模式,且所述竖直扫描模式指示按照从左往右的顺序对所述当前编码单元进行扫描,则所述历史编码单元位于所述当前编码单元的左方,所述第一像素为位于所述当前编码单元的第一列的像素。
在此情况下,若所述当前串中的像素均为第三像素,或者所述当前串中 包括所述第一像素和所述第三像素;所述第三像素为位于所述当前编码单元的第一列以外的像素,且所述当前串中的各个第三像素分布在所述当前编码单元中的至少一列中;则相应的,编码单元502还可用于:
以列为单位,逐列地获取所述当前串的各列中的第三像素的预测值;或者,以单个像素为单位,逐像素地获取所述当前串中的各个第三像素的预测值;
其中,任一第三像素的预测值是根据位于所述任一第三像素的左方、且与该任一第三像素相邻的参考像素的重建值获取到的。
再一种实施方式中,所述第一像素的参考像素的重建值的获取方式包括以下任一种:
从帧内串复制模式对应的第一存储空间中获取所述第一像素的参考像素的重建值;
从帧内预测模式对应的第二存储空间中获取所述第一像素的参考像素的重建值;
将所述当前图像划分成多个N×N的区域,当所述第一像素位于任一所述N×N的区域的第一行时,从所述第二存储空间中获取所述第一像素的参考像素的重建值;否则,从所述第一存储空间中获取所述第一像素的参考像素的重建值。
再一种实施方式中,编码单元502还可用于:编码串扫描模式标志,使得所述编码信息中包括所述串扫描模式标志。其中,若所述当前编码单元的扫描模式为水平扫描模式,则所述串扫描模式标志为第三数值;若所述当前编码单元的扫描模式为竖直扫描模式,则所述串扫描模式标志为第四数值。
再一种实施方式中,所述当前图像被划分成多个处理单元,每个处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已编码的处理单元,所述当前处理单元包括所述当前编码单元;相应的,编码单元502还可用于:
若所述当前串不为所述单位矢量串,则从所述当前处理单元和所述已编码的处理单元中的部分已编码的编码单元中,搜索所述当前串的参考串;
根据所述参考串获取所述当前串中的各个像素的预测值,以得到所述当前编码单元的编码信息。
再一种实施方式中,所述参考串满足以下一项或多项条件:
所述参考串中的任一参考像素在所述当前处理单元和所述当前处理单元左方的H个已编码的处理单元范围内,所述H为正整数;
当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已编码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未编码;
所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺 寸是根据所述处理单元的尺寸确定的;
所述参考串中的各个参考像素和所述当前串中的各个像素位于同一个独立编码区域内;
所述参考串中的任一参考像素不属于所述当前串。
再一种实施方式中,编码单元502还可用于:
若当前编码单元中包括未匹配像素,则编码未匹配像素的像素值,使得编码信息中包括未匹配像素的像素值。
根据本申请的一个实施例,图2所示的方法所涉及的各个步骤均可以是由图5所示的视频编码装置中的各个单元来执行的。例如,图2中所示的步骤S201可由图5中所示的获取单元501来执行,步骤S202-S203均可由图5中所示的编码单元502来执行。
根据本申请的另一个实施例,图5所示的视频编码装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,基于视频编码装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。
根据本申请的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图2中所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图5中所示的视频编码装置设备,以及来实现本申请实施例的视频编码方法。所述计算机程序可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
本申请实施例在编码的过程中,在当前编码单元的当前串为单位矢量串,且当前串中包括第一像素(如当前编码单元中的第一行的像素或者当前编码单元中的第一列的像素等等)时,可从当前图像中的历史解码单元中确定第一像素的参考像素,并根据参考像素的重建值获取第一像素的预测值以实现编码处理。可见,本申请实施例通过允许使用当前编码单元相邻的历史解码单元中的像素作为参考像素,使得当前编码单元中任一串均可成为单位矢量串,这样可有效扩宽单位矢量串的适用范围,有助于提升编码性能。并且,本申请实施例针对当前编码单元所采用的预测编码模式并不局限于等值串与单位矢量串子模式,还可以是帧内串复制模式中的串预测子模式等其他预测编码模式;也就是说,本申请实施例可允许串预测子模式下的编码单元使用单位矢量串,这样可进一步扩宽单位矢量串的适用范围,提升串预测的编码 性能。
基于上述视频解码方法的实施例描述,本申请实施例还公开了一种视频解码装置,所述视频解码装置可以是运行于上述所提及的视频解码设备中的一个计算机程序(包括程序代码)。该视频解码装置可以执行图4所示的方法。请参见图6,所述视频解码装置可以运行如下单元:
获取单元601,用于在当前图像的当前编码单元中确定待解码的当前串;
解码单元602,用于若所述当前串为单位矢量串,且所述当前串中包括第一像素,则从所述当前图像中的历史解码单元中确定所述第一像素的参考像素;所述历史解码单元是所述当前图像中与所述当前编码单元相邻的已解码的编码单元,所述第一像素的参考像素和所述第一像素在所述当前图像中相邻;
所述解码单元602,还用于根据所述第一像素的参考像素的重建值获取所述第一像素的预测值,以得到解码图像。
在一种实施方式中,所述当前编码单元是在帧内串复制模式中的串预测子模式下被编码的;并且,所述串预测子模式下的编码单元被允许使用单位矢量串。
再一种实施方式中,解码单元602还可用于:
从所述当前编码单元的编码信息中解码出串指示标志;
若所述串指示标志为目标数值,则确定所述当前串为单位矢量串。
再一种实施方式中,解码单元602还可用于:
从所述当前编码单元的编码信息中解码出数值K,数值K的取值大于或等于零;所述数值K用于指示所述当前编码单元的目标组像素中位于第K组像素后的至少一个串为单位矢量串,一组像素包括:一个串或者至少一个未匹配像素;
若所述当前串在所述目标组像素中位于所述第K组像素后,则确定所述当前串为所述单位矢量串。
再一种实施方式中,若每组像素均包括一个串,则所述数值K用于指示所述当前编码单元的目标组像素中第K+1个串为所述单位矢量串;
相应的,解码单元602在用于若所述当前串在所述目标组像素中位于所述第K组像素后,则确定所述当前串为所述单位矢量串时,可具体用于:
若所述当前串为所述目标组像素中所述第K+1个串,则确定所述当前串为所述单位矢量串。
再一种实施方式中,解码单元602还可用于:
从所述当前编码单元的编码信息中解码出所述当前串的串位移矢量;
若所述串位移矢量为单位矢量,则确定所述当前串为所述单位矢量串。
再一种实施方式中,所述单位矢量包括第一元素和第二元素;若所述当 前编码单元的扫描模式为水平扫描模式,则所述第一元素的元素值为第一数值,所述第二元素的元素值为第二数值;若所述当前编码单元的扫描模式为竖直扫描模式,则所述第一元素的元素值为所述第二数值,所述第二元素的元素值为所述第一数值。
再一种实施方式中,所述当前图像被划分成多个处理单元,每个所述处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已解码的处理单元,所述当前处理单元包括所述当前编码单元;
所述串位移矢量用于表示所述当前串到参考串的位移,且允许所述参考串不满足以下一项或多项条件:
所述参考串中的任一参考像素在指定区域范围内;
当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
所述参考串中的任一参考像素不属于所述当前串。
再一种实施方式中,所述当前编码单元包括P行Q列个像素;其中,P和Q的取值均为正整数;若所述当前编码单元的扫描模式为水平扫描模式,且所述水平扫描模式指示按照从上往下的顺序对所述当前编码单元进行扫描,则所述历史解码单元位于所述当前编码单元的上方,所述第一像素为位于所述当前编码单元的第一行的像素。
在此情况下,若所述当前串中的像素均为第二像素,或者所述当前串中包括所述第一像素和所述第二像素;所述第二像素为位于所述当前编码单元的第一行以外的像素,且所述当前串中的各个第二像素分布在所述当前编码单元中的至少一行中;则相应的,解码单元602还可用于:
以行为单位,逐行地获取所述当前串的各行中的第二像素的预测值;或者,
以单个像素为单位,逐像素地获取所述当前串中的各个第二像素的预测值;
其中,任一第二像素的预测值是根据位于所述任一第二像素的上方、且与该任一第二像素相邻的参考像素的重建值获取到的。
再一种实施方式中,所述当前编码单元包括P行Q列个像素;其中,P和Q的取值均为正整数;若所述当前编码单元的扫描模式为竖直扫描模式,且所述竖直扫描模式指示按照从左往右的顺序对所述当前编码单元进行扫描,则所述历史解码单元位于所述当前编码单元的左方,所述第一像素为位于所述当前编码单元的第一列的像素。
在此情况下,若所述当前串中的像素均为第三像素,或者所述当前串中 包括所述第一像素和所述第三像素;所述第三像素为位于所述当前编码单元的第一列以外的像素,且所述当前串中的各个第三像素分布在所述当前编码单元中的至少一列中;则相应的,解码单元602还可用于:
以列为单位,逐列地获取所述当前串的各列中的第三像素的预测值;或者,
以单个像素为单位,逐像素地获取所述当前串中的各个第三像素的预测值;
其中,任一第三像素的预测值是根据位于所述任一第三像素的左方、且与该任一第三像素相邻的参考像素的重建值获取到的。
再一种实施方式中，所述第一像素的参考像素的重建值的获取方式包括以下任一种：
从帧内串复制模式对应的第一存储空间中获取所述第一像素的参考像素的重建值;
从帧内预测模式对应的第二存储空间中获取所述第一像素的参考像素的重建值;
将所述当前图像划分成多个N×N的区域,当所述第一像素位于任一所述N×N的区域的第一列时,从所述第二存储空间中获取所述第一像素的参考像素的重建值;否则,从所述第一存储空间中获取所述第一像素的参考像素的重建值。
再一种实施方式中,解码单元602还可用于:
从所述当前编码单元的编码信息中解码出串扫描模式标志;
若所述串扫描模式标志为第三数值,则确定所述当前编码单元的扫描模式为水平扫描模式;
若所述串扫描模式标志为第四数值,则确定所述当前编码单元的扫描模式为竖直扫描模式。
再一种实施方式中,所述当前图像被划分成多个处理单元,每个处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已解码的处理单元,所述当前处理单元包括所述当前编码单元;相应的,解码单元602还可用于:
若所述当前串不为所述单位矢量串,则从所述当前处理单元和所述已解码的处理单元中的部分已解码的编码单元中,搜索所述当前串的参考串;
根据所述参考串获取所述当前串中的各个像素的预测值,以得到解码图像。
再一种实施方式中,所述参考串满足以下一项或多项条件:
所述参考串中的任一参考像素在所述当前处理单元和所述当前处理单元左方的H个已解码的处理单元范围内,所述H为正整数;
当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已 解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
所述参考串中的各个参考像素和所述当前串中的各个像素位于同一个独立解码区域内;
所述参考串中的任一参考像素不属于所述当前串。
再一种实施方式中,若所述当前编码单元包括未匹配像素,则从所述当前编码单元的编码信息中解码得到所述未匹配像素的像素值。
根据本申请的一个实施例，图4所示的方法所涉及的各个步骤均可以是由图6所示的视频解码装置中的各个单元来执行的。例如，图4中所示的步骤S401可由图6中所示的获取单元601来执行，步骤S402-S403均可由图6中所示的解码单元602来执行。
根据本申请的另一个实施例,图6所示的视频解码装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,基于视频解码装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。
根据本申请的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图4中所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图6中所示的视频解码装置设备,以及来实现本申请实施例的视频解码方法。所述计算机程序可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
本申请实施例在解码的过程中,在当前编码单元的当前串为单位矢量串,且当前串中包括第一像素(如当前编码单元中的第一行的像素或者当前编码单元中的第一列的像素等等)时,可从当前图像中的历史解码单元中确定第一像素的参考像素,并根据参考像素的重建值获取第一像素的预测值以实现解码处理。可见,本申请实施例通过允许使用当前编码单元相邻的历史解码单元中的像素作为参考像素,使得当前编码单元中任一串均可成为单位矢量串,这样可有效扩宽单位矢量串的适用范围,有助于提升解码性能。并且,本申请实施例针对当前编码单元所采用的预测编码模式并不局限于等值串与单位矢量串子模式,还可以是帧内串复制模式中的串预测子模式等其他预测 编码模式;也就是说,本申请实施例可允许串预测子模式下的编码单元使用单位矢量串,这样可进一步扩宽单位矢量串的适用范围,提升串预测的编码性能。
基于上述方法实施例以及装置实施例的描述,本申请实施例还提供一种计算机设备;该计算机设备可以是上述所提及的视频编码设备,或者上述所提及的视频解码设备。请参见图7,该计算机设备可至少包括处理器701、输入接口702、输出接口703以及计算机存储介质704。其中,计算机设备内的处理器701、输入接口702、输出接口703以及计算机存储介质704可通过总线或其他方式连接。可选的,若计算机设备为上述所提及的视频编码设备,则计算机设备中还可包括视频编码器;若计算机设备为上述所提及的视频解码设备,则计算机设备中还可包括视频解码器。
其中,计算机存储介质704可以存储在计算机设备的存储器中,所述计算机存储介质704用于存储计算机程序,所述计算机程序包括程序指令,所述处理器701用于执行所述计算机存储介质704存储的程序指令。处理器701(或称CPU(Central Processing Unit,中央处理器))是计算机设备的计算核心以及控制核心,其适于实现一条或多条指令,具体适于加载并执行一条或多条指令从而实现相应方法流程或相应功能。在一个实施例中,当计算机设备为视频编码设备时,本申请实施例所述的处理器701可以用于进行上述图2所示的视频编码方法的相关方法步骤。再一个实施中,当计算机设备为视频解码设备时,本申请实施例所述的处理器701可用于进行上述图4所示的视频解码方法的相关方法步骤。
本申请实施例还提供了一种计算机存储介质（Memory），所述计算机存储介质是计算机设备中的记忆设备，用于存放程序和数据。可以理解的是，此处的计算机存储介质既可以包括计算机设备中的内置存储介质，当然也可以包括计算机设备所支持的扩展存储介质。计算机存储介质提供存储空间，所述存储空间存储了计算机设备的操作系统。并且，在该存储空间中还存放了适于被处理器701加载并执行的一条或多条的指令，这些指令可以是一个或一个以上的计算机程序（包括程序代码）。需要说明的是，此处的计算机存储介质可以是高速RAM存储器，也可以是非不稳定的存储器（non-volatile memory），例如至少一个磁盘存储器；可选的还可以是至少一个位于远离前述处理器的计算机存储介质。
在一个实施例中，当计算机设备为上述所提及的视频解码设备时，可由处理器701加载并执行计算机存储介质中存放的一条或多条第一指令，以实现上述图4所示的视频解码方法实施例中的方法的相应步骤；具体实现中，计算机存储介质中的一条或多条第一指令由处理器701加载并执行图2所示的视频编码方法或者图4所示的视频解码方法。
需要说明的是,根据本申请的一个方面,还提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述图2所示的视频编码方法或图4所示的视频解码方法实施例方面的各种可选方式中提供的方法。
并且,应理解的是,以上所揭露的仅为本申请较佳实施例而已,当然不能以此来限定本申请之权利范围,因此依本申请权利要求所作的等同变化,仍属本申请所涵盖的范围。

Claims (41)

  1. 一种视频解码方法,由计算机设备执行,包括:
    在当前图像的当前编码单元中确定待解码的当前串;
    若所述当前串为单位矢量串,且所述当前串中包括第一像素,则从所述当前图像中的历史解码单元中确定所述第一像素的参考像素;所述历史解码单元是所述当前图像中与所述当前编码单元相邻的已解码的编码单元,所述第一像素的参考像素和所述第一像素在所述当前图像中相邻;
    根据所述第一像素的参考像素的重建值获取所述第一像素的预测值,以得到解码图像。
  2. 如权利要求1所述的方法,所述当前编码单元是在帧内串复制模式中的串预测子模式下被编码的;
    并且,所述串预测子模式下的编码单元被允许使用单位矢量串。
  3. 如权利要求1或2所述的方法,所述方法还包括:
    从所述当前编码单元的编码信息中解码出串指示标志;
    若所述串指示标志为目标数值,则确定所述当前串为单位矢量串。
  4. 如权利要求1或2所述的方法,所述方法还包括:
    从所述当前编码单元的编码信息中解码出数值K,所述数值K的取值大于或等于零;所述数值K用于指示所述当前编码单元的目标组像素中位于第K组像素后的至少一个串为单位矢量串,一组像素包括:一个串或者至少一个未匹配像素;
    若所述当前串在所述目标组像素中位于所述第K组像素后,则确定所述当前串为所述单位矢量串。
  5. 如权利要求4所述的方法,若每组像素均包括一个串,则所述数值K用于指示所述当前编码单元的目标组像素中第K+1个串为所述单位矢量串;
    所述若所述当前串在所述目标组像素中位于所述第K组像素后,则确定所述当前串为所述单位矢量串,包括:
    若所述当前串为所述目标组像素中所述第K+1个串,则确定所述当前串为所述单位矢量串。
  6. 如权利要求1或2所述的方法,所述方法还包括:
    从所述当前编码单元的编码信息中解码出所述当前串的串位移矢量;
    若所述串位移矢量为单位矢量,则确定所述当前串为所述单位矢量串。
  7. 如权利要求6所述的方法,所述单位矢量包括第一元素和第二元素;
    若所述当前编码单元的扫描模式为水平扫描模式,则所述第一元素的元素值为第一数值,所述第二元素的元素值为第二数值;
    若所述当前编码单元的扫描模式为竖直扫描模式,则所述第一元素的元素值为所述第二数值,所述第二元素的元素值为所述第一数值。
  8. 如权利要求6所述的方法,所述当前图像被划分成多个处理单元,每 个所述处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已解码的处理单元,所述当前处理单元包括所述当前编码单元;
    所述串位移矢量用于表示所述当前串到参考串的位移,且允许所述参考串不满足以下一项或多项条件:
    所述参考串中的任一参考像素在指定区域范围内;
    当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
    所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
    所述参考串中的任一参考像素不属于所述当前串。
  9. 如权利要求1所述的方法,所述当前编码单元包括P行Q列个像素;其中,所述P和所述Q的取值均为正整数;
    若所述当前编码单元的扫描模式为水平扫描模式,且所述水平扫描模式指示按照从上往下的顺序对所述当前编码单元进行扫描,则所述历史解码单元位于所述当前编码单元的上方,所述第一像素为位于所述当前编码单元的第一行的像素。
  10. 如权利要求9所述的方法,所述第一像素的参考像素的重建值的获取方式包括以下任一种:
    从帧内串复制模式对应的第一存储空间中,获取所述第一像素的参考像素的重建值;
    从帧内预测模式对应的第二存储空间中,获取所述第一像素的参考像素的重建值;
    将所述当前图像划分成多个N×N的区域,当所述第一像素位于任一所述N×N的区域的第一行时,从所述第二存储空间中获取所述第一像素的参考像素的重建值;否则,从所述第一存储空间中获取所述第一像素的参考像素的重建值。
  11. 如权利要求9所述的方法,所述当前串中的像素均为第二像素,或者所述当前串中包括所述第一像素和所述第二像素;所述第二像素为位于所述当前编码单元的第一行以外的像素,且所述当前串中的各个所述第二像素分布在所述当前编码单元中的至少一行中;
    所述方法还包括:
    以行为单位,逐行地获取所述当前串的各行中的所述第二像素的预测值;
    或者,以单个像素为单位,逐像素地获取所述当前串中的各个所述第二像素的预测值;
    其中,任一第二像素的预测值是根据位于所述任一第二像素的上方、且与所述任一第二像素相邻的参考像素的重建值获取到的。
  12. 如权利要求1所述的方法,所述当前编码单元包括P行Q列个像素;其中,所述P和所述Q的取值均为正整数;
    若所述当前编码单元的扫描模式为竖直扫描模式,且所述竖直扫描模式指示按照从左往右的顺序对所述当前编码单元进行扫描,则所述历史解码单元位于所述当前编码单元的左方,所述第一像素为位于所述当前编码单元的第一列的像素。
  13. 如权利要求12所述的方法,所述第一像素的参考像素的重建值的获取方式包括以下任一种:
    从帧内串复制模式对应的第一存储空间中获取所述第一像素的参考像素的重建值;
    从帧内预测模式对应的第二存储空间中获取所述第一像素的参考像素的重建值;
    将所述当前图像划分成多个N×N的区域,当所述第一像素位于任一所述N×N的区域的第一列时,从所述第二存储空间中获取所述第一像素的参考像素的重建值;否则,从所述第一存储空间中获取所述第一像素的参考像素的重建值。
  15. 如权利要求12所述的方法，所述当前串中的像素均为第三像素，或者所述当前串中包括所述第一像素和所述第三像素；所述第三像素为位于所述当前编码单元的第一列以外的像素，且所述当前串中的各个所述第三像素分布在所述当前编码单元中的至少一列中；
    所述方法还包括:
    以列为单位,逐列地获取所述当前串的各列中的所述第三像素的预测值;
    或者,以单个像素为单位,逐像素地获取所述当前串中的各个所述第三像素的预测值;
    其中,任一第三像素的预测值是根据位于所述任一第三像素的左方、且与所述任一第三像素相邻的参考像素的重建值获取到的。
  15. 如权利要求9-14任一项所述的方法,所述方法还包括:
    从所述当前编码单元的编码信息中解码出串扫描模式标志;
    若所述串扫描模式标志为第三数值,则确定所述当前编码单元的扫描模式为水平扫描模式;
    若所述串扫描模式标志为第四数值,则确定所述当前编码单元的扫描模式为竖直扫描模式。
  16. 如权利要求1所述的方法,所述当前图像被划分成多个处理单元,每个所述处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已解码的处理单元,所述当前处理单元包括所述当前编码单元;
    所述方法还包括:
    若所述当前串不为所述单位矢量串,则从所述当前处理单元和所述已解 码的处理单元中的部分已解码的编码单元中,搜索所述当前串的参考串;
    根据所述参考串获取所述当前串中的各个像素的预测值,以得到解码图像。
  17. 如权利要求16所述的方法,所述参考串满足以下一项或多项条件:
    所述参考串中的任一参考像素在所述当前处理单元和所述当前处理单元左方的H个已解码的处理单元范围内,所述H为正整数;
    当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
    所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
    所述参考串中的各个参考像素和所述当前串中的各个像素位于同一个独立解码区域内;
    所述参考串中的任一参考像素不属于所述当前串。
  18. 如权利要求1所述的方法,若所述当前编码单元包括未匹配像素,则从所述当前编码单元的编码信息中解码得到所述未匹配像素的像素值。
  19. 一种视频编码方法,由计算机设备执行,包括:
    在当前图像中的当前编码单元中确定待编码的当前串;
    若所述当前串为单位矢量串,且所述当前串中包括第一像素,则从所述当前图像中的历史编码单元中确定所述第一像素的参考像素;所述历史编码单元是所述当前图像中与所述当前编码单元相邻的已编码的编码单元,所述第一像素的参考像素和所述第一像素在所述当前图像中相邻;
    根据所述第一像素的参考像素的重建值获取所述第一像素的预测值,以得到所述当前编码单元的编码信息。
  20. 如权利要求19所述的方法,所述当前编码单元是在帧内串复制模式中的串预测子模式下被编码的;
    并且,所述串预测子模式下的编码单元被允许使用单位矢量串。
  21. 如权利要求19或20所述的方法,所述方法还包括:
    编码串指示标志，使得所述编码信息中包括所述串指示标志；
    其中,若所述当前串为单位矢量串,则所述串指示标志为目标数值。
  22. 如权利要求19或20所述的方法,所述方法还包括:
    编码数值K,使得所述编码信息中包括所述数值K,所述数值K的取值大于或等于零;
    其中,所述数值K用于指示所述当前编码单元的目标组像素中位于第K组像素后的至少一个串为单位矢量串,一组像素包括:一个串或者至少一个未匹配像素。
  23. 如权利要求22所述的方法,若每组像素均包括一个串,则所述数值 K用于指示所述当前编码单元的目标组像素中第K+1个串为所述单位矢量串。
  24. 如权利要求19或20所述的方法，所述方法还包括：
    编码所述当前串的串位移矢量,使得所述编码信息中包括所述串位移矢量;
    其中,若所述当前串为所述单位矢量串,则所述串位移矢量为单位矢量。
  25. 如权利要求24所述的方法,所述单位矢量包括第一元素和第二元素;
    若所述当前编码单元的扫描模式为水平扫描模式,则所述第一元素的元素值为第一数值,所述第二元素的元素值为第二数值;
    若所述当前编码单元的扫描模式为竖直扫描模式,则所述第一元素的元素值为所述第二数值,所述第二元素的元素值为所述第一数值。
  26. 如权利要求24所述的方法,所述当前图像被划分成多个处理单元,每个所述处理单元包括一个或多个编码单元;所述多个处理单元包括:当前处理单元和已编码的处理单元,所述当前处理单元包括所述当前编码单元;
    所述串位移矢量用于表示所述当前串到参考串的位移,且允许所述参考串不满足以下一项或多项条件:
    所述参考串中的任一参考像素在指定区域范围内;
    当所述参考串中的任一参考像素处于与所述当前处理单元左方相邻的已解码的处理单元内时,所述参考串的任一参考像素往右移动预定像素后的位置所在的目标区域的像素尚未重建;
    所述参考串中的各个参考像素位于同一个对齐区域,所述对齐区域的尺寸是根据所述处理单元的尺寸确定的;
    所述参考串中的任一参考像素不属于所述当前串。
  27. The method according to claim 19, wherein the current coding unit comprises P rows and Q columns of pixels, P and Q both being positive integers;
    if the scan mode of the current coding unit is a horizontal scan mode, and the horizontal scan mode indicates scanning the current coding unit in a top-to-bottom order, the historical coding unit is located above the current coding unit, and the first pixel is a pixel located in the first row of the current coding unit.
  28. The method according to claim 27, wherein the reconstructed value of the reference pixel of the first pixel is obtained in any one of the following manners:
    obtaining the reconstructed value of the reference pixel of the first pixel from a first storage space corresponding to an intra string copy mode;
    obtaining the reconstructed value of the reference pixel of the first pixel from a second storage space corresponding to an intra prediction mode;
    partitioning the current image into a plurality of N×N regions, and when the first pixel is located in the first row of any of the N×N regions, obtaining the reconstructed value of the reference pixel of the first pixel from the second storage space, or otherwise obtaining it from the first storage space.
  29. The method according to claim 27, wherein the pixels in the current string are all second pixels, or the current string comprises the first pixel and second pixels, a second pixel being a pixel located outside the first row of the current coding unit, and the second pixels in the current string being distributed in at least one row of the current coding unit;
    the method further comprises:
    obtaining, row by row, the predicted values of the second pixels in each row of the current string;
    or obtaining, pixel by pixel, the predicted value of each second pixel in the current string;
    wherein the predicted value of any second pixel is obtained according to the reconstructed value of the reference pixel located above and adjacent to that second pixel.
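The pixel-by-pixel variant of claim 29 can be sketched as follows, assuming a dict-based reconstruction buffer, (x, y) pixel coordinates with y increasing downward, and zero residual so that a predicted value doubles as the reconstructed value for the next row; all names are illustrative:

```python
def predict_from_above(recon, string_pixels):
    """Sketch of claim 29, pixel-by-pixel variant: each second pixel is
    predicted from the reconstructed pixel directly above it.

    recon         -- dict mapping (x, y) -> reconstructed sample value; must
                     already hold the row above the first row of the string
    string_pixels -- (x, y) positions of the current string in scan order
                     (top-to-bottom, so each row above is filled before use)
    """
    pred = {}
    for (x, y) in string_pixels:
        ref = recon[(x, y - 1)]   # adjacent reference pixel directly above
        pred[(x, y)] = ref        # predicted value = reconstructed reference
        recon[(x, y)] = ref       # zero-residual assumption: reuse as reconstruction
    return pred
```

The row-by-row variant of the claim would simply copy an entire row of `recon` at once instead of iterating per pixel; the vertical-scan analogue (claim 32) reads `recon[(x - 1, y)]` instead.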
  30. The method according to claim 19, wherein the current coding unit comprises P rows and Q columns of pixels, P and Q both being positive integers;
    if the scan mode of the current coding unit is a vertical scan mode, and the vertical scan mode indicates scanning the current coding unit in a left-to-right order, the historical coding unit is located to the left of the current coding unit, and the first pixel is a pixel located in the first column of the current coding unit.
  31. The method according to claim 30, wherein the reconstructed value of the reference pixel of the first pixel is obtained in any one of the following manners:
    obtaining the reconstructed value of the reference pixel of the first pixel from a first storage space corresponding to an intra string copy mode;
    obtaining the reconstructed value of the reference pixel of the first pixel from a second storage space corresponding to an intra prediction mode;
    partitioning the current image into a plurality of N×N regions, and when the first pixel is located in the first column of any of the N×N regions, obtaining the reconstructed value of the reference pixel of the first pixel from the second storage space, or otherwise obtaining it from the first storage space.
  32. The method according to claim 30, wherein the pixels in the current string are all third pixels, or the current string comprises the first pixel and third pixels, a third pixel being a pixel located outside the first column of the current coding unit, and the third pixels in the current string being distributed in at least one column of the current coding unit;
    the method further comprises:
    obtaining, column by column, the predicted values of the third pixels in each column of the current string;
    or obtaining, pixel by pixel, the predicted value of each third pixel in the current string;
    wherein the predicted value of any third pixel is obtained according to the reconstructed value of the reference pixel located to the left of and adjacent to that third pixel.
  33. The method according to any one of claims 27-32, further comprising:
    encoding a string scan mode flag such that the coding information comprises the string scan mode flag;
    wherein if the scan mode of the current coding unit is the horizontal scan mode, the string scan mode flag is a third value, and if the scan mode of the current coding unit is the vertical scan mode, the string scan mode flag is a fourth value.
  34. The method according to claim 19, wherein the current image is partitioned into a plurality of processing units, each of the processing units comprising one or more coding units, and the plurality of processing units comprise a current processing unit and encoded processing units, the current processing unit comprising the current coding unit;
    the method further comprises:
    if the current string is not the unit-vector string, searching for a reference string of the current string in the current processing unit and in some encoded coding units of the encoded processing units; and
    obtaining a predicted value of each pixel in the current string according to the reference string, so as to obtain the coding information of the current coding unit.
  35. The method according to claim 34, wherein the reference string satisfies one or more of the following conditions:
    any reference pixel in the reference string is within the range of the current processing unit and the H encoded processing units to the left of the current processing unit, H being a positive integer;
    when any reference pixel in the reference string is located in the encoded processing unit adjacent to the left of the current processing unit, the pixels of the target region containing the position reached by shifting that reference pixel rightward by a predetermined number of pixels have not yet been encoded;
    all reference pixels in the reference string are located in a same aligned region, the size of the aligned region being determined according to the size of the processing unit;
    all reference pixels in the reference string and all pixels in the current string are located in a same independently encodable region;
    none of the reference pixels in the reference string belongs to the current string.
  36. The method according to claim 19, further comprising:
    if the current coding unit comprises an unmatched pixel, encoding a pixel value of the unmatched pixel such that the coding information comprises the pixel value of the unmatched pixel.
  37. A video decoding apparatus, comprising:
    an obtaining unit, configured to determine a current string to be decoded in a current coding unit of a current image;
    a decoding unit, configured to: if the current string is a unit-vector string and the current string comprises a first pixel, determine a reference pixel of the first pixel from a historical decoding unit in the current image, wherein the historical decoding unit is a decoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
    the decoding unit being further configured to obtain a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain a decoded image.
  38. A video encoding apparatus, comprising:
    an obtaining unit, configured to determine a current string to be encoded in a current coding unit of a current image;
    an encoding unit, configured to: if the current string is a unit-vector string and the current string comprises a first pixel, determine a reference pixel of the first pixel from a historical coding unit in the current image, wherein the historical coding unit is an encoded coding unit adjacent to the current coding unit in the current image, and the reference pixel of the first pixel is adjacent to the first pixel in the current image;
    the encoding unit being further configured to obtain a predicted value of the first pixel according to a reconstructed value of the reference pixel of the first pixel, so as to obtain coding information of the current coding unit.
  39. A computer device, comprising an input interface and an output interface, and further comprising:
    a processor, adapted to implement one or more instructions; and
    a computer storage medium, the computer storage medium storing one or more first instructions, the one or more first instructions being adapted to be loaded by the processor to perform the video decoding method according to any one of claims 1-18; or the computer storage medium storing one or more second instructions, the one or more second instructions being adapted to be loaded by the processor to perform the video encoding method according to any one of claims 19-36.
  40. A computer storage medium, the computer storage medium storing one or more first instructions, the one or more first instructions being adapted to be loaded by a processor to perform the video decoding method according to any one of claims 1-18; or the computer storage medium storing one or more second instructions, the one or more second instructions being adapted to be loaded by the processor to perform the video encoding method according to any one of claims 19-36.
  41. A computer program product, comprising instructions which, when run on a computer, cause the computer to perform the video decoding method according to any one of claims 1-18 or the video encoding method according to any one of claims 19-36.
PCT/CN2021/131183 2020-12-05 2021-11-17 Video decoding method, video encoding method, related devices, and storage medium WO2022116824A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/977,589 US20230047433A1 (en) 2020-12-05 2022-10-31 Video decoding method, video encoding method, related devices, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011416242.0 2020-12-05
CN202011416242.0A CN114615498A (zh) 2020-12-05 2020-12-05 Video decoding method, video encoding method, related devices, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/977,589 Continuation US20230047433A1 (en) 2020-12-05 2022-10-31 Video decoding method, video encoding method, related devices, and storage medium

Publications (1)

Publication Number Publication Date
WO2022116824A1 true WO2022116824A1 (zh) 2022-06-09

Family

ID=81852920

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131183 WO2022116824A1 (zh) 2020-12-05 2021-11-17 视频解码方法、视频编码方法、相关设备及存储介质

Country Status (3)

Country Link
US (1) US20230047433A1 (zh)
CN (1) CN114615498A (zh)
WO (1) WO2022116824A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116847088B (zh) * 2023-08-24 2024-04-05 深圳传音控股股份有限公司 图像处理方法、处理设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104244007A (zh) * 2013-06-13 2014-12-24 上海天荷电子信息有限公司 基于任意形状匹配的图像压缩方法和装置
CN106576163A (zh) * 2014-06-20 2017-04-19 高通股份有限公司 用于调色板模式译码的从先前行复制
CN107071450A (zh) * 2016-02-10 2017-08-18 同济大学 数据压缩的编码、解码方法及装置
CN107770527A (zh) * 2016-08-21 2018-03-06 上海天荷电子信息有限公司 使用邻近编码参数和最近编码参数的数据压缩方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Y. Chen, J. Xu (CE coordinators), "Description of Core Experiment 10 (CE10): Intra String Copy", 18th JCT-VC Meeting, 30 June - 9 July 2014, Sapporo, JP (Joint Collaborative Team on Video Coding of ISO/IEC JTC1/SC29/WG11 and ITU-T SG.16), 8 July 2014, pages 1-6, XP030116721 *

Also Published As

Publication number Publication date
US20230047433A1 (en) 2023-02-16
CN114615498A (zh) 2022-06-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899858

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899858

Country of ref document: EP

Kind code of ref document: A1