WO2023122969A1 - Intra prediction method, device, system, and storage medium - Google Patents

Intra prediction method, device, system, and storage medium

Info

Publication number
WO2023122969A1
Authority
WO
WIPO (PCT)
Prior art keywords
intra
prediction mode
frame prediction
current block
frame
Prior art date
Application number
PCT/CN2021/142114
Other languages
English (en)
French (fr)
Inventor
谢志煌
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to PCT/CN2021/142114 priority Critical patent/WO2023122969A1/zh
Publication of WO2023122969A1 publication Critical patent/WO2023122969A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Definitions

  • the present application relates to the technical field of video coding and decoding, and in particular to an intra prediction method, device, system, and storage medium.
  • Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, or video players, among others.
  • video devices implement video compression technology to enable more effective transmission or storage of video data.
  • Video is compressed through encoding, and the encoding process includes prediction, transformation, and quantization. For example, through intra-frame prediction and/or inter-frame prediction, the prediction block of the current block is determined; the prediction block is subtracted from the current block to obtain a residual block; the residual block is transformed to obtain transform coefficients; the transform coefficients are quantized to obtain quantized coefficients; and the quantized coefficients are encoded to form a code stream.
  • two or more intra-frame prediction modes can be used to perform weighted fusion prediction on the current block to obtain the prediction value of the current block.
  • the use of weighted fusion prediction can improve the prediction effect, but in some cases, the use of weighted fusion prediction will reduce the prediction quality. Therefore, before weighted fusion, it is necessary to judge whether to perform weighted fusion based on the weighted fusion conditions. It can be seen that the setting of weighted fusion conditions directly affects the accuracy of intra prediction.
  • the embodiment of the present application provides an intra prediction method, device, system, and storage medium.
  • the weighted fusion condition is determined by the amplitude values of the intra prediction modes, so that the effect of intra prediction can be improved.
  • an intra prediction method, including:
  • determining the amplitude values of N intra prediction modes corresponding to a reconstructed area around the current block, and determining a first intra prediction mode and a second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1;
  • determining the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether to perform weighted prediction on the current block through the first intra prediction mode, the second intra prediction mode and the third intra prediction mode;
  • the embodiment of the present application provides an intra prediction method, including:
  • determining the amplitude values of N intra prediction modes corresponding to a reconstructed area around the current block, and determining a first intra prediction mode and a second intra prediction mode of the current block according to the amplitude values, where N is a positive integer greater than 1;
  • determining the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether to perform weighted prediction on the current block through the first intra prediction mode, the second intra prediction mode and the third intra prediction mode;
  • the present application provides an intra prediction device, configured to execute the method in the above first aspect or its various implementation manners.
  • the encoder includes a functional unit configured to execute the method in the above first aspect or its implementations.
  • the present application provides an intra-frame prediction device, configured to perform the method in the above-mentioned second aspect or various implementations thereof.
  • the decoder includes a functional unit configured to execute the method in the above second aspect or its various implementations.
  • a fifth aspect provides a video encoder, including a processor and a memory.
  • the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to execute the method in the above first aspect or its various implementations.
  • a sixth aspect provides a video decoder, including a processor and a memory.
  • the memory is used to store a computer program
  • the processor is used to invoke and run the computer program stored in the memory, so as to execute the method in the above second aspect or its various implementations.
  • a video codec system including a video encoder and a video decoder.
  • the video encoder is configured to execute the method in the above first aspect or its various implementations
  • the video decoder is configured to execute the method in the above second aspect or its various implementations.
  • the chip includes: a processor, configured to call and run a computer program from the memory, so that a device installed with the chip executes the method in any one of the above first to second aspects or their implementations.
  • a computer-readable storage medium for storing a computer program, and the computer program causes a computer to execute any one of the above-mentioned first to second aspects or the method in each implementation manner thereof.
  • a computer program product including computer program instructions, the computer program instructions cause a computer to execute any one of the above first to second aspects or the method in each implementation manner.
  • a computer program which, when running on a computer, causes the computer to execute any one of the above-mentioned first to second aspects or the method in each implementation manner thereof.
  • a code stream is provided, where the code stream is generated by the method in the above first aspect or its implementations.
  • determining the amplitude values of N intra prediction modes corresponding to the reconstructed area around the current block, where N is a positive integer greater than 1; determining the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether to perform weighted prediction on the current block through the first intra prediction mode, the second intra prediction mode and the third intra prediction mode; and determining the target prediction value of the current block according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode.
  • the present application determines the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, and judges whether to perform weighted fusion prediction on the current block based on the determined weighted fusion condition.
  • this avoids reducing the prediction quality and introducing unnecessary noise when weighted fusion prediction is performed on image content that does not require it, thereby improving the accuracy of intra prediction.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system involved in an embodiment of the present application
  • Fig. 2 is a schematic block diagram of a video encoder involved in an embodiment of the present application
  • Fig. 3 is a schematic block diagram of a video decoder involved in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an intra prediction mode
  • FIG. 5 is a schematic diagram of an intra prediction mode
  • FIG. 6 is a schematic diagram of an intra prediction mode
  • FIG. 7A is a schematic diagram of a template of a DIMD
  • Fig. 7B is a histogram of amplitude values and angle patterns
  • Figure 7C is a schematic diagram of prediction of DIMD
  • FIG. 8 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application.
  • Fig. 9 is a schematic flow chart of an intra prediction method provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application.
  • Fig. 12 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application.
  • Fig. 13 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application.
  • Fig. 14 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • Fig. 15 is a schematic block diagram of a video codec system provided by an embodiment of the present application.
  • the application can be applied to the field of image codec, video codec, hardware video codec, dedicated circuit video codec, real-time video codec, etc.
  • the solution of the present application can be combined with video coding standards, such as the audio video coding standard (AVS for short), the H.264/advanced video coding (AVC for short) standard, the H.265/high efficiency video coding (HEVC for short) standard and the H.266/versatile video coding (VVC for short) standard.
  • the solutions of the present application may operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions.
  • FIG. 1 is a schematic block diagram of a video encoding and decoding system involved in an embodiment of the present application. It should be noted that FIG. 1 is only an example, and the video codec system in the embodiment of the present application includes but is not limited to what is shown in FIG. 1 .
  • the video codec system 100 includes an encoding device 110 and a decoding device 120 .
  • the encoding device is used to encode (can be understood as compression) the video data to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
  • the encoding device 110 in the embodiment of the present application can be understood as a device having a video encoding function
  • the decoding device 120 can be understood as a device having a video decoding function; that is, the encoding device 110 and the decoding device 120 in the embodiment of the present application cover a wide range of devices, including smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, and the like.
  • the encoding device 110 may transmit the encoded video data (such as code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
  • channel 130 includes one or more communication media that enable encoding device 110 to transmit encoded video data directly to decoding device 120 in real-time.
  • encoding device 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to decoding device 120 .
  • the communication medium includes a wireless communication medium, such as a radio frequency spectrum.
  • the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
  • the channel 130 includes a storage medium that can store video data encoded by the encoding device 110 .
  • the storage medium includes a variety of local access data storage media, such as optical discs, DVDs, flash memory, and the like.
  • the decoding device 120 may acquire encoded video data from the storage medium.
  • channel 130 may include a storage server that may store video data encoded by encoding device 110 .
  • the decoding device 120 may download the stored encoded video data from the storage server.
  • the storage server may store the encoded video data and may transmit the encoded video data to the decoding device 120; it may be, for example, a web server (e.g., for a website), a file transfer protocol (FTP) server, and the like.
  • the encoding device 110 includes a video encoder 112 and an output interface 113 .
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • the encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113.
  • the video source 111 may include at least one of a video capture device (for example, a video camera), a video archive, a video input interface, or a computer graphics system, where the video input interface is used to receive video data from a video content provider, and the computer graphics system is used to generate video data.
  • the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
  • Video data may include one or more pictures or a sequence of pictures.
  • the code stream contains the encoding information of an image or image sequence in the form of a bit stream.
  • Encoding information may include encoded image data and associated data.
  • the associated data may include a sequence parameter set (SPS for short), a picture parameter set (PPS for short) and other syntax structures.
  • An SPS may contain parameters that apply to one or more sequences.
  • a PPS may contain parameters applied to one or more images.
  • the syntax structure refers to a set of zero or more syntax elements arranged in a specified order in the code stream.
  • the video encoder 112 directly transmits encoded video data to the decoding device 120 via the output interface 113 .
  • the encoded video data can also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
  • the decoding device 120 includes an input interface 121 and a video decoder 122 .
  • the decoding device 120 may include a display device 123 in addition to the input interface 121 and the video decoder 122 .
  • the input interface 121 includes a receiver and/or a modem.
  • the input interface 121 can receive encoded video data through the channel 130 .
  • the video decoder 122 is used to decode the encoded video data to obtain decoded video data, and transmit the decoded video data to the display device 123 .
  • the display device 123 displays the decoded video data.
  • the display device 123 may be integrated with the decoding device 120 or external to the decoding device 120 .
  • the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
  • FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1 .
  • the technology of the present application may also be applied to video encoding only or video decoding only.
  • Fig. 2 is a schematic block diagram of a video encoder involved in an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression on images, and can also be used to perform lossless compression on images.
  • the lossless compression may be visually lossless compression or mathematically lossless compression.
  • the video encoder 200 can be applied to image data in luminance-chrominance (YCbCr, YUV) format.
  • the YUV ratio can be 4:2:0, 4:2:2 or 4:4:4, where Y represents luminance (Luma), Cb (U) represents blue chroma, Cr (V) represents red chroma, and U and V together represent chroma (Chroma), which describes color and saturation.
  • 4:2:0 means that every 4 pixels have 4 luminance components and 2 chrominance components (YYYYCbCr);
  • 4:2:2 means that every 4 pixels have 4 luminance components and 4 chrominance components (YYYYCbCrCbCr);
  • 4:4:4 means full pixel display (YYYYCbCrCbCrCbCrCbCr).
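  • To make the sampling ratios above concrete, the following is a minimal sketch (not from the patent text) of the luma and chroma plane sizes implied by each format for a width × height frame:

```cpp
#include <cstddef>

// Minimal sketch (not from the patent text): luma and chroma plane sizes
// implied by the three sampling formats above, for a width x height frame.
struct PlaneSizes { std::size_t luma; std::size_t chroma; };

PlaneSizes planeSizes(int width, int height, int format /*420, 422 or 444*/) {
    std::size_t luma = static_cast<std::size_t>(width) * height;
    switch (format) {
        case 420: return {luma, luma / 4};  // Cb and Cr each halved horizontally and vertically
        case 422: return {luma, luma / 2};  // Cb and Cr each halved horizontally only
        default:  return {luma, luma};      // 4:4:4, chroma at full resolution
    }
}
```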
  • the video encoder 200 reads video data, and divides a frame of image into several coding tree units (coding tree units, CTUs) for each frame of image in the video data.
  • a CTU may also be called a "tree block", a "largest coding unit" (LCU for short) or a "coding tree block" (CTB for short).
  • Each CTU may be associated with a pixel block of equal size within the image. Each pixel may correspond to one luminance (luma) sample and two chrominance (chrominance or chroma) samples.
  • each CTU may be associated with one block of luma samples and two blocks of chroma samples.
  • a CTU size is, for example, 128 ⁇ 128, 64 ⁇ 64, 32 ⁇ 32 and so on.
  • a CTU can be further divided into several coding units (Coding Unit, CU) for coding, and the CU can be a rectangular block or a square block.
  • the CU can be further divided into a prediction unit (PU for short) and a transform unit (TU for short), so that encoding, prediction, and transformation are separated, and processing is more flexible.
  • a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
  • the video encoder and video decoder can support various PU sizes. Assuming that the size of a specific CU is 2N ⁇ 2N, video encoders and video decoders may support 2N ⁇ 2N or N ⁇ N PU sizes for intra prediction, and support 2N ⁇ 2N, 2N ⁇ N, N ⁇ 2N, NxN or similarly sized symmetric PUs for inter prediction. The video encoder and video decoder may also support asymmetric PUs of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter prediction.
  • the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filter unit 260, a decoded picture buffer 270 and an entropy coding unit 280. It should be noted that the video encoder 200 may include more, fewer or different functional components.
  • the current block may be called a current coding unit (CU) or a current prediction unit (PU).
  • a predicted block may also be called a predicted image block or an image predicted block, and a reconstructed image block may also be called a reconstructed block or an image reconstructed image block.
  • the prediction unit 210 includes an inter prediction unit 211 and an intra estimation unit 212 . Because there is a strong correlation between adjacent pixels in a video frame, the intra-frame prediction method is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Due to the strong similarity between adjacent frames in video, the inter-frame prediction method is used in video coding and decoding technology to eliminate time redundancy between adjacent frames, thereby improving coding efficiency.
  • the inter-frame prediction unit 211 can be used for inter-frame prediction.
  • the inter-frame prediction can refer to image information of different frames.
  • the inter-frame prediction uses motion information to find a reference block from the reference frame, and generates a prediction block according to the reference block to eliminate temporal redundancy;
  • Frames used for inter-frame prediction may be P frames and/or B frames, P frames refer to forward predictive frames, and B frames refer to bidirectional predictive frames.
  • the motion information includes the reference frame list where the reference frame is located, the reference frame index, and the motion vector.
  • the motion vector can be of integer-pixel or sub-pixel precision. If the motion vector has sub-pixel precision, interpolation filtering in the reference frame is needed to generate the required sub-pixel block.
  • the block of whole pixels or sub-pixels found in the reference frame according to the motion vector is called the reference block.
  • Some technologies will directly use the reference block as the prediction block, while others will further process the reference block to generate the prediction block. Further processing a reference block to generate a prediction block can also be understood as taking the reference block as a prediction block and then processing it to generate a new prediction block.
  • the intra estimation unit 212 refers only to information within the same frame image to predict pixel information in the current coded image block, in order to eliminate spatial redundancy.
  • a frame used for intra prediction may be an I frame.
  • the pixels in the column to the left of and the row above the current block are the reference pixels of the current block, and intra prediction uses these reference pixels to predict the current block.
  • These reference pixels may all be available, that is, all have been encoded and decoded; some may also be unavailable, for example, when the current block is at the leftmost edge of the whole frame, the reference pixels to the left of the current block are unavailable.
  • when the lower-left part of the current block has not been encoded and decoded, the reference pixels at the lower left are also unavailable.
  • for unavailable positions, the available reference pixels, some default value, or some other method can be used for filling, or no filling is performed.
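  • As an illustration of the filling described above, the following is a hedged sketch assuming a typical HEVC/VVC-style strategy of propagating the nearest available reference sample; the function name and the mid-level fill value are illustrative, not the patent's normative process:

```cpp
#include <vector>

// Hedged sketch of the filling described above, assuming a typical
// HEVC/VVC-style strategy: propagate the nearest available reference sample.
// The function name and the mid-level fill value (for 10-bit video) are
// illustrative, not the patent's normative process.
void fillReferenceSamples(std::vector<int>& refSample, const std::vector<bool>& available) {
    const int n = static_cast<int>(refSample.size());
    int firstAvail = -1;
    for (int i = 0; i < n; ++i)
        if (available[i]) { firstAvail = i; break; }
    if (firstAvail < 0) {             // nothing available: use the mid-level value
        for (int i = 0; i < n; ++i) refSample[i] = 512;
        return;
    }
    for (int i = firstAvail - 1; i >= 0; --i)   // propagate backwards to the start
        refSample[i] = refSample[i + 1];
    for (int i = firstAvail + 1; i < n; ++i)    // then forwards over any holes
        if (!available[i]) refSample[i] = refSample[i - 1];
}
```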
  • FIG. 4 is a schematic diagram of intra-frame prediction modes.
  • the intra-frame prediction modes used by HEVC include Planar, DC, and 33 angle modes, a total of 35 prediction modes.
  • Fig. 5 is a schematic diagram of an intra-frame prediction mode.
  • the intra-frame modes used by VVC include Planar, DC and 65 angle modes, a total of 67 prediction modes.
  • Figure 6 is a schematic diagram of the intra prediction mode. As shown in Figure 6, AVS3 uses DC, Planar, Bilinear and 63 angle modes, a total of 66 prediction modes.
  • as the number of angle modes increases, intra prediction becomes more accurate and better meets the demand for the development of high-definition and ultra-high-definition digital video.
  • the residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction blocks of the PUs of the CU. For example, the residual unit 220 may generate a residual block for the CU such that each sample in the residual block has a value equal to the difference between a sample in the pixel block of the CU and the corresponding sample in the prediction block of a PU of the CU.
  • Transform/quantization unit 230 may quantize the transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with TUs of a CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with a CU by adjusting the QP value associated with the CU.
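  • As a rough illustration of how the QP value controls the degree of quantization, the following sketch assumes the HEVC/VVC-style convention that the quantization step roughly doubles every 6 QP values; real codecs use equivalent integer arithmetic:

```cpp
#include <cmath>

// Rough sketch of QP-controlled scalar quantization, assuming the
// HEVC/VVC-style convention Qstep ~ 2^((QP-4)/6), i.e. the step size doubles
// every 6 QP values. Real codecs use equivalent integer arithmetic.
int quantize(double transformCoeff, int qp) {
    double qstep = std::pow(2.0, (qp - 4) / 6.0);
    return static_cast<int>(std::lround(transformCoeff / qstep));
}

double dequantize(int level, int qp) {
    double qstep = std::pow(2.0, (qp - 4) / 6.0);
    return level * qstep;  // reconstruction carries the quantization error
}
```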
  • Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform to the quantized transform coefficients, respectively, to reconstruct a residual block from the quantized transform coefficients.
  • the reconstruction unit 250 may add samples of the reconstructed residual block to corresponding samples of one or more prediction blocks generated by the prediction unit 210 to generate a reconstructed image block associated with the TU. By reconstructing the sample blocks of each TU of the CU in this way, the video encoder 200 can reconstruct the pixel blocks of the CU.
  • Loop filtering unit 260 may perform deblocking filtering operations to reduce blocking artifacts of pixel blocks associated with a CU.
  • the loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset/adaptive loop filtering (SAO/ALF) unit, where the deblocking filtering unit is used for deblocking and the SAO/ALF unit is used to remove ringing effects.
  • the decoded image buffer 270 may store reconstructed pixel blocks.
  • Inter prediction unit 211 may use reference pictures containing reconstructed pixel blocks to perform inter prediction on PUs of other pictures.
  • intra estimation unit 212 may use the reconstructed pixel blocks in decoded picture cache 270 to perform intra prediction on other PUs in the same picture as the CU.
  • Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
  • Fig. 3 is a schematic block diagram of a video decoder involved in an embodiment of the present application.
  • the video decoder 300 includes: an entropy decoding unit 310 , a prediction unit 320 , an inverse quantization/transformation unit 330 , a reconstruction unit 340 , a loop filter unit 350 and a decoded image buffer 360 . It should be noted that the video decoder 300 may include more, less or different functional components.
  • the video decoder 300 can receive code streams.
  • the entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the codestream, the entropy decoding unit 310 may parse the entropy-encoded syntax elements in the codestream.
  • the prediction unit 320 , the inverse quantization/transformation unit 330 , the reconstruction unit 340 and the loop filter unit 350 can decode video data according to the syntax elements extracted from the code stream, that is, generate decoded video data.
  • the prediction unit 320 includes an inter prediction unit 321 and an intra estimation unit 322 .
  • Intra estimation unit 322 may perform intra prediction to generate a predictive block for a PU.
  • Intra-estimation unit 322 may use an intra-prediction mode to generate a predictive block for a PU based on pixel blocks of spatially neighboring PUs.
  • Intra estimation unit 322 may also determine the intra prediction mode for the PU from one or more syntax elements parsed from the codestream.
  • the inter prediction unit 321 can construct the first reference picture list (list 0) and the second reference picture list (list 1) according to the syntax elements parsed from the codestream. Furthermore, if the PU is encoded using inter prediction, entropy decoding unit 310 may parse the motion information for the PU. Inter prediction unit 321 may determine one or more reference blocks for the PU according to the motion information of the PU. Inter prediction unit 321 may generate a prediction block for a PU based on one or more reference blocks of the PU.
  • Inverse quantization/transform unit 330 may inversely quantize (ie, dequantize) the transform coefficients associated with a TU. Inverse quantization/transform unit 330 may use QP values associated with CUs of the TU to determine the degree of quantization.
  • inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients in order to generate a residual block associated with the TU.
  • Reconstruction unit 340 uses the residual blocks associated with the TUs of the CU and the prediction blocks of the PUs of the CU to reconstruct the pixel blocks of the CU. For example, the reconstruction unit 340 may add the samples of the residual block to the corresponding samples of the prediction block to reconstruct the pixel block of the CU to obtain the reconstructed image block.
  • Loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts of pixel blocks associated with a CU.
  • Video decoder 300 may store the reconstructed picture of the CU in decoded picture cache 360 .
  • the video decoder 300 may use the reconstructed picture in the decoded picture buffer 360 as a reference picture for subsequent prediction, or transmit the reconstructed picture to a display device for presentation.
  • the basic flow of video encoding and decoding is as follows: at the encoding end, a frame of image is divided into blocks, and for the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate the prediction block of the current block.
  • the residual unit 220 may calculate a residual block based on the predicted block and the original block of the current block, for example, subtract the predicted block from the original block of the current block to obtain a residual block, which may also be referred to as residual information.
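  • The subtraction here and the later addition in the reconstruction unit can be summarized by the following minimal sketch (illustrative only; real codecs operate on transformed and quantized residuals):

```cpp
#include <cstddef>
#include <vector>

// Minimal sketch of the relationship described above: the encoder forms
// residual = original - prediction, and after (inverse) transform and
// quantization the decoder reconstructs prediction + residual.
std::vector<int> makeResidual(const std::vector<int>& original,
                              const std::vector<int>& prediction) {
    std::vector<int> residual(original.size());
    for (std::size_t i = 0; i < original.size(); ++i)
        residual[i] = original[i] - prediction[i];
    return residual;
}

std::vector<int> reconstruct(const std::vector<int>& prediction,
                             const std::vector<int>& residual) {
    std::vector<int> recon(prediction.size());
    for (std::size_t i = 0; i < prediction.size(); ++i)
        recon[i] = prediction[i] + residual[i];
    return recon;
}
```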
  • the residual block can be transformed and quantized by the transformation/quantization unit 230 to remove information that is not sensitive to human eyes, so as to eliminate visual redundancy.
  • the residual block before transform and quantization by the transform/quantization unit 230 may be called a time-domain residual block, and the time-domain residual block after transform and quantization may be called a frequency residual block or a frequency-domain residual block.
  • the entropy encoding unit 280 receives the quantized transform coefficients output by the transform and quantization unit 230 , may perform entropy encoding on the quantized transform coefficients, and output a code stream.
  • the entropy coding unit 280 can eliminate character redundancy according to the target context model and the probability information of the binary code stream.
  • the entropy decoding unit 310 can analyze the code stream to obtain the prediction information of the current block, the quantization coefficient matrix, etc., and the prediction unit 320 uses intra prediction or inter prediction for the current block based on the prediction information to generate a prediction block of the current block.
  • the inverse quantization/transformation unit 330 uses the quantization coefficient matrix obtained from the code stream to perform inverse quantization and inverse transformation on the quantization coefficient matrix to obtain a residual block.
  • the reconstruction unit 340 adds the predicted block and the residual block to obtain a reconstructed block.
  • the reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or based on the block to obtain a decoded image.
  • the encoding end also needs similar operations to the decoding end to obtain the decoded image.
  • the decoded image may also be referred to as a reconstructed image, and the reconstructed image may serve as a reference frame for inter-frame prediction of subsequent frames.
  • the block division information determined by the encoder as well as mode information or parameter information such as prediction, transformation, quantization, entropy coding, and loop filtering, etc., are carried in the code stream when necessary.
  • the decoding end parses the code stream and analyzes the existing information to determine the same block division information and the same prediction, transformation, quantization, entropy coding, loop filtering and other mode or parameter information as the encoding end, so as to ensure that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
  • the current block may be the current coding unit (CU) or the current prediction unit (PU).
  • the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized. The present application is applicable to the basic process of the video codec under the block-based hybrid coding framework, but is not limited to this framework and process.
  • the general hybrid coding framework first performs prediction, which uses spatial or temporal correlation to obtain an image that is the same as or similar to the current block.
  • the hybrid coding framework will subtract the predicted image from the original image of the current block to obtain a residual image, or subtract the predicted block from the current block to obtain a residual block.
  • Residual blocks are usually much simpler than the original image, so prediction can significantly improve compression efficiency.
  • the residual block is not directly encoded, but usually transformed first.
  • the transformation is to transform the residual image from the spatial domain to the frequency domain, and remove the correlation of the residual image.
  • After the residual image is transformed into the frequency domain, most of its energy is concentrated in the low-frequency region, so most of the non-zero transform coefficients are concentrated in the upper-left corner.
  • next, quantization is used for further compression; and because the human eye is not sensitive to high frequencies, a larger quantization step size can be used in high-frequency areas.
  • JVET, the international video coding standard-setting organization, has established a test platform software beyond VVC, namely the Enhanced Compression Model (ECM for short).
  • ECM is reference software for further improving the performance of VVC tools and tool combinations. It is based on VTM-10.0 and integrates tools and technologies adopted in the exploration experiments (EE).
  • similar to the VTM (VVC's reference software test platform), the ECM includes traditional intra prediction, residual transform and other processes.
  • the difference from VVC is that, in the intra prediction process, two techniques for deriving the intra prediction mode are adopted, namely Decoder-side Intra Mode Derivation (DIMD) and Template-based Intra Mode Derivation (TIMD).
  • the DIMD and TIMD technologies can derive the intra prediction mode at the decoding end, so that coding the index of the intra prediction mode is omitted, saving codewords.
  • the implementation process of DIMD technology mainly includes the following two steps:
  • the first step is to derive the intra prediction mode, using the same prediction-mode amplitude calculation method at the encoding and decoding ends.
  • DIMD uses the reconstructed pixels around the current block as a template, and scans each 3X3 area on the template (Template) with the Sobel operator to calculate the gradients in the horizontal and vertical directions, obtaining the horizontal gradient Dx and the vertical gradient Dy.
  • the amplitude values of the same angle mode are accumulated to obtain the histogram of the amplitude value and angle mode as shown in FIG. 7B .
  • the prediction mode with the largest amplitude value in the histogram shown in FIG. 7B is determined as the first intra-frame prediction mode, and the prediction mode with the second largest amplitude value is determined as the second intra-frame prediction mode.
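  • The first step can be sketched as follows; the Sobel kernels are the standard ones, while the mapping from gradient angle to an angular-mode index is a simplification of the integer look-up arithmetic actually used in ECM (an assumption for illustration):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Sketch of DIMD step 1. The Sobel kernels are the standard ones; the mapping
// from gradient angle to an angular-mode index is a simplification of the
// integer look-up arithmetic actually used in ECM (an assumption here).
// histogram must have numAngularModes entries, initialized to 0.
void accumulateHistogram(const std::vector<std::vector<int>>& tmpl,
                         std::vector<long long>& histogram,
                         int numAngularModes) {
    static const int sobelX[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int sobelY[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};
    const double kPi = std::acos(-1.0);
    for (std::size_t y = 1; y + 1 < tmpl.size(); ++y) {
        for (std::size_t x = 1; x + 1 < tmpl[y].size(); ++x) {
            int dx = 0, dy = 0;                      // horizontal / vertical gradient
            for (int i = -1; i <= 1; ++i)
                for (int j = -1; j <= 1; ++j) {
                    dx += sobelX[i + 1][j + 1] * tmpl[y + i][x + j];
                    dy += sobelY[i + 1][j + 1] * tmpl[y + i][x + j];
                }
            if (dx == 0 && dy == 0) continue;        // flat area, no direction
            double angle = std::atan2(double(dy), double(dx));   // [-pi, pi]
            int mode = int((angle + kPi) / (2.0 * kPi) * numAngularModes) % numAngularModes;
            histogram[mode] += std::abs(dx) + std::abs(dy);      // amplitude value
        }
    }
}
```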
  • the second step is to derive the prediction block, and use the same prediction block derivation method on the codec side to obtain the current prediction block.
  • taking ECM2.0 as an example, the encoder judges the following two conditions: the first intra prediction mode and the second intra prediction mode are not the Planar or DC mode, and
  • the amplitude value (gradient) of the second prediction mode is not 0;
  • if both conditions are satisfied, the current prediction block is derived using a weighted averaging method.
  • the specific method is that the prediction values of the first intra prediction mode, the second intra prediction mode and the Planar mode are weighted to obtain the final prediction result of DIMD, and the specific process is shown in FIG. 7C .
  • the weights of the first and second intra prediction modes are allocated in proportion to their amplitude values, and the weighted calculation process is shown in formulas (1) and (2):
  • w1 = (1 − w0) × amp1/(amp1 + amp2), w2 = (1 − w0) × amp2/(amp1 + amp2) (1)
  • Pred = Pred planar × w0 + Pred mode1 × w1 + Pred mode2 × w2 (2)
  • w0, w1, and w2 are the weights assigned to the Planar mode, the first intra-frame prediction mode, and the second intra-frame prediction mode respectively
  • Pred planar is the predicted value corresponding to the Planar mode
  • Pred mode1 is the predicted value corresponding to the first intra-frame prediction mode
  • Pred mode2 is the predicted value corresponding to the second intra prediction mode
  • Pred is the weighted prediction value corresponding to DIMD
  • amp1 is the amplitude value corresponding to the first intra prediction mode
  • amp2 is the amplitude value corresponding to the second intra prediction mode.
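  • Putting formula (2) together with the amplitude-proportional weights, a hedged sketch of the fusion follows; the fixed Planar share w0 = 21/64 is the value used by ECM's DIMD and should be treated as an assumption rather than something stated in this text:

```cpp
#include <cstddef>
#include <vector>

// Hedged sketch of the fusion in formula (2). The fixed Planar share
// w0 = 21/64 is the value ECM's DIMD uses and is an assumption here; the
// remaining weight is split in proportion to amp1 and amp2 (assumes
// amp1 + amp2 > 0, which holds when the fusion conditions are met).
std::vector<int> dimdFusion(const std::vector<int>& predPlanar,
                            const std::vector<int>& predMode1,
                            const std::vector<int>& predMode2,
                            long long amp1, long long amp2) {
    const double w0 = 21.0 / 64.0;
    const double w1 = (1.0 - w0) * double(amp1) / double(amp1 + amp2);
    const double w2 = (1.0 - w0) * double(amp2) / double(amp1 + amp2);
    std::vector<int> pred(predPlanar.size());
    for (std::size_t i = 0; i < pred.size(); ++i)
        pred[i] = static_cast<int>(predPlanar[i] * w0 + predMode1[i] * w1 +
                                   predMode2[i] * w2 + 0.5);
    return pred;
}
```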
  • DIMD needs to transmit a flag bit to the decoder to indicate whether the current coding unit uses DIMD technology.
  • the prediction mode derivation of the DIMD technology can reduce the transmission burden of certain syntax elements, so that the prediction mode overhead that originally needs at least 5 bits or more is reduced to 1 bit.
  • the DIMD fuses the prediction blocks corresponding to the optimal prediction mode, the suboptimal prediction mode, and the Planar mode through a fusion operation to generate a new prediction block.
  • the new prediction block is neither predictable by any of the foregoing prediction modes, nor does the same prediction block exist in subsequent prediction tools. Through experimental comparison, it can be found that the fusion technology does improve the prediction efficiency.
  • the prediction value obtained by weighted fusion is suitable for video content in natural scenes, but not for video content in specific scenes.
  • the objects in the former video content usually have blurred edges and some noise generated by shooting, and the fusion technology of DIMD can obtain prediction values that better match these objects.
  • Objects in the latter video content generally have sharper and brighter colors.
  • such video content is usually recorded by a computer and may be called screen content video.
  • the prediction value generated by the DIMD fusion technique is redundant for such content; it reduces the prediction quality and can be said to introduce noise. That is to say, in some cases the use of weighted fusion prediction can improve the prediction effect, but in other cases it will reduce the prediction quality.
  • therefore, before weighted fusion, it is necessary to judge whether to perform weighted fusion based on the weighted fusion conditions. It can be seen that the setting of the weighted fusion conditions directly affects the accuracy of intra prediction. However, the current weighted fusion conditions are too broad, resulting in weighted fusion being performed on image content that does not require it, and thus in poor prediction quality.
  • the present application determines the weighted fusion condition of the current block according to the magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, and judges whether to perform weighted fusion prediction on the current block based on the determined weighted fusion condition , which can avoid the problems of reducing prediction quality and introducing unnecessary noise when weighted fusion prediction is performed on image content that does not need weighted fusion prediction, and thus improves the accuracy of intra prediction.
  • the intra prediction method provided by the embodiment of the present application can also be applied to any scene where two or more intra prediction modes are allowed to perform weighted fusion prediction.
  • the video decoding method provided in the embodiment of the present application is introduced by taking the decoding end as an example.
  • FIG. 8 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application.
  • the embodiment of the present application is applied to the video decoder shown in FIG. 1 and FIG. 2 .
  • the method of the embodiment of the present application includes:
  • S401: determine the amplitude values of N intra prediction modes corresponding to the reconstructed area around the current block, and determine the first intra prediction mode and the second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1.
  • the current block may also be referred to as a current decoding block, a current decoding unit, a decoding block, a block to be decoded, a current block to be decoded, and the like.
  • the current block when the current block includes a chroma component but does not include a luma component, the current block may be called a chroma block.
  • the current block when the current block includes a luma component but does not include a chroma component, the current block may be called a luma block.
  • the video decoder determines the first intra-frame prediction mode and the second intra-frame prediction mode.
  • if the first intra prediction mode and the second intra prediction mode meet the weighted fusion condition, the first intra prediction mode is used to predict the current block to obtain the first prediction value of the current block, the second intra prediction mode is used to predict the current block to obtain the second prediction value of the current block, and the first prediction value and the second prediction value are weighted and fused to obtain the target prediction value of the current block.
  • a third prediction value can also be determined, for example, by using the third intra prediction mode to predict the current block to obtain the third prediction value of the current block; the first prediction value, the second prediction value and the third prediction value are then weighted and fused to obtain the target prediction value of the current block.
  • the foregoing third intra-frame prediction mode may be a preset intra-frame prediction mode, or determined in other ways, which is not limited in the present application.
  • otherwise, one of the first intra prediction mode and the second intra prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • the method for the video decoder to determine that the current block allows multiple intra prediction modes to perform fusion weighted prediction may be as follows: the video encoder carries a second flag in the code stream, and the second flag is used to indicate whether the current block determines the target prediction value through at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode. If the video encoder uses at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode to determine the target prediction value, the second flag is set to true, for example, its value is set to 1, and the second flag set to true is written into the code stream, for example, into the code stream header.
  • after the video decoder obtains the code stream, it decodes the code stream to obtain the second flag. If the second flag is true, for example, its value is 1, the video decoder determines that the current block determines the target prediction value through at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode, and at this time the video decoder determines the first intra prediction mode and the second intra prediction mode of the current block. Optionally, the video decoder determines the first intra prediction mode and the second intra prediction mode in the same manner as the video encoder does.
  • if the video encoder does not determine the target prediction value of the current block through at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode, the second flag is set to false, for example, its value is set to 0, and the second flag set to false is written into the code stream, for example, into the code stream header.
  • the video decoder decodes the code stream to obtain the second flag. If the second flag is false, for example, its value is 0, the video decoder does not determine the first intra prediction mode and the second intra prediction mode of the current block; instead, it traverses other preset intra prediction modes, determines the intra prediction mode with the least cost to predict the current block, and obtains the target prediction value of the current block.
  • the embodiment of the present application mainly involves the case where the target prediction value of the current block is determined through at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode; that is to say, this application mainly discusses the case where the above-mentioned second flag is true.
  • the above-mentioned second flag may be a DIMD enable flag, for example, sps_dimd_enable_flag. That is to say, in the embodiment of the present application, the video decoder decodes the code stream and first obtains the DIMD allowed flag, which is a sequence-level flag.
  • the DIMD allowed flag is used to indicate whether the current sequence is allowed to use the DIMD technology. If the DIMD allowed flag is true, for example 1, it is determined that the current sequence is allowed to use the DIMD technology.
  • the video decoder continues to decode the code stream to obtain a DIMD enable flag, which may be a sequence-level flag bit.
  • the DIMD enabling flag is used to indicate whether the current block uses DIMD technology. If the DIMD enabling flag is true, for example, 1, it is determined that the current block uses DIMD technology.
  • the video decoder executes the above S401 to determine the first intra prediction mode and the second intra prediction mode of the current block.
  • the above-mentioned DIMD enable flag may also be an image-level flag, which is used to indicate whether the current frame image uses the DIMD technology.
  • whether the DIMD enable flag is true or false is determined by the video encoder and written into the code stream. For example, when the video encoder uses the DIMD technology to determine the target prediction value of the current block, the DIMD enable flag is set to true, for example to 1, and written into the code stream, for example into the code stream header. If the video encoder does not use the DIMD technology to determine the target prediction value of the current block, the DIMD enable flag is set to false, for example to 0, and written into the code stream, for example into the code stream header.
  • the video decoder can parse the DIMD enable flag from the code stream and determine, according to the DIMD enable flag, whether to use the DIMD technology to determine the target prediction value of the current block, so as to ensure the consistency between the decoding end and the encoding end and ensure the reliability of the prediction.
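  • The decoder-side control flow for these flags can be sketched as follows; sps_dimd_enable_flag is named in the text, while the block-level flag name and the toy Bitstream type are assumptions for illustration only:

```cpp
#include <cstddef>
#include <vector>

// Sketch of the decoder-side control flow for these flags.
// sps_dimd_enable_flag is named in the text; the block-level flag name and
// this toy Bitstream type are assumptions for illustration only.
struct Bitstream {
    std::vector<bool> bits;
    std::size_t pos = 0;
    bool readFlag() { return pos < bits.size() && bits[pos++]; }
};

bool useDimdForCurrentBlock(Bitstream& bs) {
    bool spsDimdEnableFlag = bs.readFlag();  // sequence level: is DIMD allowed?
    if (!spsDimdEnableFlag) return false;    // DIMD not allowed for this sequence
    bool dimdEnableFlag = bs.readFlag();     // DIMD enable flag for the block/picture
    return dimdEnableFlag;                   // if true, execute S401 to derive the modes
}
```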
  • when the video decoder determines that at least one of the first intra prediction mode, the second intra prediction mode and the third intra prediction mode needs to be used to determine the target prediction value of the current block, the above S401 is executed to determine the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block, and the first intra prediction mode and the second intra prediction mode of the current block are determined according to the amplitude values of the N intra prediction modes.
  • the reconstructed area around the current block may be any preset area among the reconstructed areas around the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above the current block.
  • the reconstructed area around the current block includes k columns of reconstructed pixels on the left side of the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above and to the left of the current block, which are determined as the template area of the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above and to the left of the current block, and k columns of reconstructed pixels on the left side of the current block, such as the L area in FIG. 7A .
  • m and k may be the same or different, which is not limited in this application.
  • the above m rows of pixels may or may not be adjacent to the current block.
  • the aforementioned k columns of pixels may or may not be adjacent to the current block.
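  • As an illustration of the template shapes listed above, the following sketch (coordinate convention assumed) collects the positions covered by m rows above and k columns to the left of the current block:

```cpp
#include <utility>
#include <vector>

// Sketch of the template shapes listed above (coordinate convention assumed):
// collect the reconstructed positions in m rows above and k columns to the
// left of a current block at (blockX, blockY) of size blockW x blockH.
std::vector<std::pair<int, int>> templatePositions(int blockX, int blockY,
                                                   int blockW, int blockH,
                                                   int m, int k) {
    std::vector<std::pair<int, int>> pos;              // (x, y) picture coordinates
    for (int y = blockY - m; y < blockY; ++y)          // m rows above (and above-left)
        for (int x = blockX - k; x < blockX + blockW; ++x)
            if (x >= 0 && y >= 0) pos.emplace_back(x, y);
    for (int y = blockY; y < blockY + blockH; ++y)     // k columns to the left
        for (int x = blockX - k; x < blockX; ++x)
            if (x >= 0 && y >= 0) pos.emplace_back(x, y);
    return pos;
}
```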
  • the process of determining the amplitude values of the N intra prediction modes corresponding to the surrounding reconstructed area may be: first, scan each nXn (for example, 3X3) area on the reconstructed area and calculate the gradients in the horizontal and vertical directions, obtaining the horizontal gradient Dx and the vertical gradient Dy.
  • for example, a 3x3 horizontal Sobel filter and a 3x3 vertical Sobel filter are used to calculate the horizontal gradient Dx and the vertical gradient Dy of each 3x3 area on the template area, respectively.
  • the horizontal gradient Dx is calculated according to the following formula (4), and the vertical gradient Dy is calculated according to formula (5):
  • Dx = sum(L horizontal ∘ A) (4)
  • Dy = sum(L vertical ∘ A) (5)
  • where L horizontal is the horizontal Sobel filter, L vertical is the vertical Sobel filter, A is a 3X3 area on the template area, and ∘ denotes element-wise multiplication.
  • all intra-frame prediction modes in the histogram may be determined as N intra-frame prediction modes.
  • intra-frame prediction modes whose amplitude values in the histogram are greater than a certain preset value may be determined as N intra-frame prediction modes.
  • the above-mentioned reconstructed area around the current block is the template area of the current block, as shown in FIG. 7A
  • the template area of the current block is the region, within the reconstructed area around the current block, that is adjacent to the current block.
  • the template area of the current block is also referred to as the neighbouring reconstructed sample area of the current block.
  • in this case, the process in the above S401 of determining the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block is: if the DIMD enable flag indicates that the current block uses the DIMD technology, determine the amplitude values of the N intra prediction modes corresponding to the template area of the current block.
  • the process of determining the amplitude values of the N intra prediction modes corresponding to the template area of the current block is basically the same as the above process of determining the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block; the reconstructed area around the current block is simply replaced by the template area of the current block.
  • the first intra prediction mode and the second intra prediction mode of the current block are determined based on the magnitude values of the N intra prediction modes .
  • the first intra-frame prediction mode and the second intra-frame prediction mode of the current block are determined, including the following methods:
  • Mode 1: determine any one of the N intra prediction modes as the first intra prediction mode.
  • any intra prediction mode other than the first intra prediction mode among the N intra prediction modes is determined as the second intra prediction mode.
  • Mode 2: determine the intra prediction mode with the largest amplitude value among the N intra prediction modes as the first intra prediction mode, and determine the intra prediction mode with the second largest amplitude value among the N intra prediction modes as the second intra prediction mode.
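  • Mode 2 can be sketched as a single scan for the largest and second-largest histogram amplitudes:

```cpp
#include <vector>

// Sketch of Mode 2: scan the histogram for the modes with the largest and
// second-largest amplitude values (assumes at least two entries).
void selectTopTwoModes(const std::vector<long long>& histogram,
                       int& firstMode, int& secondMode) {
    firstMode = secondMode = -1;
    for (int m = 0; m < static_cast<int>(histogram.size()); ++m) {
        if (firstMode < 0 || histogram[m] > histogram[firstMode]) {
            secondMode = firstMode;   // previous best becomes second best
            firstMode = m;
        } else if (secondMode < 0 || histogram[m] > histogram[secondMode]) {
            secondMode = m;
        }
    }
}
```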
  • S402. Determine a weighted fusion condition of the current block according to magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • the foregoing weighted fusion condition is used to determine whether the current block is weighted and predicted through the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • after the video decoder obtains the first intra prediction mode and the second intra prediction mode of the current block according to the above step S401, it does not directly use the first intra prediction mode and the second intra prediction mode to perform weighted prediction on the current block; rather, it is necessary to judge whether the first intra prediction mode and the second intra prediction mode meet the weighted fusion condition of the current block.
  • if the first intra prediction mode and the second intra prediction mode meet the weighted fusion condition of the current block, the first intra prediction mode, the second intra prediction mode and the third intra prediction mode are used to perform weighted prediction on the current block. For example, the first intra prediction mode is used to predict the current block to obtain the first prediction value, the second intra prediction mode is used to predict the current block to obtain the second prediction value, and the third intra prediction mode is used to predict the current block to obtain the third prediction value; the first prediction value, the second prediction value and the third prediction value are then weighted to obtain the target prediction value of the current block.
  • the weights for weighting the first predictive value, the second predictive value, and the third predictive value may be determined according to amplitude values corresponding to the first intra-frame prediction mode and the second intra-frame prediction mode.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the weighted fusion condition of the current block, one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • For example, the prediction mode with the largest amplitude value among the first intra-frame prediction mode and the second intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • In some embodiments, the first intra-frame prediction mode is the intra-frame prediction mode with the largest amplitude value among the N prediction modes, and the second intra-frame prediction mode is the one with the second largest amplitude value among the N prediction modes; that is, the first amplitude value is greater than the second amplitude value. Therefore, when the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the weighted fusion condition of the current block, the first intra-frame prediction mode is used to predict the current block.
  • At present, the weighted fusion condition is loose. For example, as long as neither the first intra-frame prediction mode nor the second intra-frame prediction mode is the Planar or DC mode and the amplitude value of the second intra-frame prediction mode is greater than 0, weighted fusion prediction is performed. However, some image content, such as screen-recorded image content, generally has sharp edges and bright colors; when weighted fusion prediction is used for such content, the prediction quality is reduced instead.
  • In view of this, the present application determines the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, which reduces the probability that weighted fusion prediction degrades the quality of such image content.
  • The present application does not limit the method for determining the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode in S402 above; examples include but are not limited to the following:
  • Way 1: determine, as the weighted fusion condition of the current block, that the difference between the amplitude value of the first intra-frame prediction mode and the amplitude value of the second intra-frame prediction mode is smaller than a preset value 1.
  • If the difference between the amplitude value of the first intra-frame prediction mode and that of the second intra-frame prediction mode is greater than or equal to the preset value 1, the amplitude value of the first intra-frame prediction mode is far greater than that of the second intra-frame prediction mode, which indicates that the probability that the first intra-frame prediction mode is applicable to the current block is much greater than that of the second intra-frame prediction mode. In this case, if the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to perform weighted prediction on the current block, noise is introduced instead and the prediction effect is reduced.
  • If the difference between the amplitude value of the first intra-frame prediction mode and that of the second intra-frame prediction mode is less than the preset value 1, the two amplitude values do not differ much, which means that the probabilities that the first intra-frame prediction mode and the second intra-frame prediction mode are applicable to the current block are basically the same. In this case, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode can be used to perform weighted prediction on the current block to improve the prediction effect of the current block.
  • the present application does not limit the specific value of the preset value 1, which is specifically determined according to actual needs.
  • Determining, as the weighted fusion condition of the current block, that the difference between the two amplitude values is less than the preset value 1 ensures that weighted fusion prediction is performed on current blocks that need it, and reduces the probability of performing weighted fusion prediction on image content that does not need it, thereby improving the accuracy of intra-frame prediction.
  • Way 2: determine, as the weighted fusion condition of the current block, that the ratio of the first amplitude value of the first intra-frame prediction mode to the second amplitude value of the second intra-frame prediction mode is less than or equal to a first preset threshold.
  • If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, the amplitude value of the first intra-frame prediction mode is far greater than that of the second intra-frame prediction mode, which indicates that the probability that the first intra-frame prediction mode is applicable to the current block is much greater than that of the second intra-frame prediction mode. In this case, if the three modes are used to perform weighted prediction on the current block, noise is introduced instead and the prediction effect is reduced.
  • If the ratio is less than or equal to the first preset threshold, the two amplitude values do not differ much, which means that the two modes are applicable to the current block with basically the same probability. In this case, weighted prediction may be performed on the current block using the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode to improve the prediction effect of the current block.
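  • For concreteness, the two example conditions can be written as simple predicates, sketched below. The preset value and threshold are hypothetical parameters, not values specified by the text.

```python
# Way 1 (assumed form): fuse only when the two amplitudes are close in
# absolute terms, i.e. their difference is below a preset value.
def fusion_ok_by_difference(first_amp, second_amp, preset_value_1):
    return (first_amp - second_amp) < preset_value_1

# Way 2 (assumed form): fuse only when the first amplitude does not
# dominate the second, i.e. their ratio is at most a preset threshold.
def fusion_ok_by_ratio(first_amp, second_amp, first_preset_threshold):
    if second_amp == 0:  # degenerate case: the second mode never occurred
        return False
    return first_amp / second_amp <= first_preset_threshold
```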
  • The embodiments of the present application do not limit the specific content of the first preset condition, which is determined according to actual needs.
  • In some embodiments, the first preset condition is that neither the first intra-frame prediction mode nor the second intra-frame prediction mode is the Planar or DC mode, and the second amplitude value corresponding to the second intra-frame prediction mode is not zero.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode meet the first preset condition, step S402 is executed, that is, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the first preset condition, step S402 is not performed; instead, one of the first intra-frame prediction mode and the second intra-frame prediction mode is used to predict the current block. For example, the current block is predicted using the first intra-frame prediction mode to obtain the target prediction value of the current block.
  • the current weighted fusion condition is fixed, that is to say, no matter what the image content is, the weighted fusion condition of the current block is fixed.
  • For some image content, such as screen-recorded image content, weighted fusion prediction can be understood as a fuzzy prediction method: it reduces the sharpness and color vividness of the image, which reduces prediction quality and introduces noise.
  • In view of this, the present application determines the weighted fusion condition of the current block according to the image content. That is to say, this application provides differentiated weighted fusion conditions: the weighted fusion conditions corresponding to different image contents can be different. This ensures that weighted fusion prediction is performed on image content that needs it, improving prediction accuracy, while for image content that does not need weighted fusion prediction, no weighted fusion prediction is performed, avoiding unnecessary noise and ensuring prediction quality.
  • In general, a sequence includes a series of images that are usually generated in the same environment; therefore, the image content of the images in a sequence is basically of the same type.
  • the image content of the current block is of the same type as the image content of the current sequence, such as screen content, or other content collected by the camera, etc. Therefore, the image content of the current block can be determined through the image content of the current sequence.
  • In some embodiments, the video decoder can obtain the image content of the current sequence by means of image recognition. For example, when decoding the current sequence, it first decodes the reconstructed images of the first few frames (for example, 2 frames) using an existing method, performs image recognition on these reconstructed images to obtain their image content type, and uses that type as the image content type of the current sequence.
  • The video decoder may perform image recognition on the reconstructed images of the previous frames using a neural network model to obtain their image content type.
  • Specifically, the neural network model is pre-trained to identify the type of image content; the video decoder inputs the reconstructed images of the previous frames into the neural network model and obtains the image content type output by the model.
  • Optionally, the video decoder may also use other methods to determine the image content type of the reconstructed images of the previous frames, which is not limited in this application.
  • In some embodiments, the video decoder can obtain the image content of the current sequence through indication information in the code stream. For example, the video encoder writes the type of the image content of the current sequence into the code stream by means of a flag bit; the video decoder decodes the code stream to obtain the flag bit and determines the type of the image content of the current sequence through it. For example, when the value of the flag bit is 1, it indicates that the image content of the current sequence is the first image content; when the value of the flag bit is 0, it indicates that the image content of the current sequence is the second image content, where the first image content is different from the second image content.
  • the weighted fusion condition of the current block is determined according to the image content corresponding to the current block.
  • For example, if the image content corresponding to the current block is the first image content, the above-mentioned step S402 is performed, that is, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • the determination of the weighted fusion condition of the current block according to the magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode can be implemented according to the above-mentioned method 1 or method 2, which will not be repeated here.
  • determining the weighted fusion condition of the current block according to the magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode in S402 includes the following steps:
  • S402-A Decode the code stream to obtain a first flag, where the first flag is used to indicate whether to use the first technology, and the first technology is used under the first image content;
  • The first image content in the present application may be image content with sharp edges and vivid colors, such as screen-recorded content and the like.
  • In some embodiments, the weighted fusion condition changes only when the image content corresponding to the current block is the first image content; if the image content corresponding to the current block is not the first image content, the weighted fusion condition does not change. That is to say, if the image content corresponding to the current block is the first image content, the weighted fusion condition adopted is the first fusion condition; if the image content corresponding to the current block is not the first image content, the weighted fusion condition adopted is the second fusion condition.
  • The first fusion condition is different from the second fusion condition, and the first fusion condition is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode. For example, the first fusion condition is that the ratio of the first amplitude value of the first intra-frame prediction mode to the second amplitude value of the second intra-frame prediction mode is less than or equal to the first preset threshold.
  • If the video encoder determines that the image content corresponding to the current block is the first image content, it determines that the current block can use the first technology. The first technology can be understood as the technique provided by the embodiments of the present application, namely determining the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode. If the video encoder determines that the current block can use the first technology, it sets the first flag to true, for example to the value 1, and encodes it into the code stream.
  • If the video encoder determines that the image content corresponding to the current block is not the first image content, it determines that the current block cannot use the first technology, sets the first flag to false, for example to the value 0, and encodes it into the code stream. In this way, the video decoder decodes the code stream to obtain the first flag and then determines the weighted fusion condition of the current block according to it: when the value of the first flag is 1, the weighted fusion condition is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode; when the value of the first flag is 0, the weighted fusion condition of the current block can be determined in other ways.
  • the above-mentioned first flag may be a sequence-level flag, which is used to indicate whether the current sequence can use the first technology.
  • the above-mentioned first flag may be an image-level flag, used to indicate whether the current image can use the first technology.
  • a new field may be added in the code stream to represent the first flag.
  • the first flag is represented by the field sps_DIMD_blendoff_flag, which is a completely new field.
  • Optionally, the above-mentioned first flag reuses a third flag in the current sequence; that is, an existing field in the current sequence can be reused without adding a new field, thereby saving codewords.
  • For example, the third flag above is an intra-block copy (Intra Block Copy, IBC for short) enable flag, a template matching prediction (Template Matching Prediction, TMP for short) enable flag, or the like.
  • In some embodiments, the weighted fusion condition of the current block may be determined from multiple weighted fusion conditions according to the first flag. For example, Table 1 shows the weighted fusion conditions corresponding to the first flag:

Table 1
  Value of the first flag    Weighted fusion condition
  1                          First fusion condition
  0                          Second fusion condition

  • Table 1 above shows the weighted fusion conditions corresponding to different values of the first flag.
  • When the value of the first flag is 1, the image content corresponding to the current block is the first image content, and the corresponding weighted fusion condition is the first fusion condition.
  • The first fusion condition is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • For example, the first fusion condition is that the ratio of the first amplitude value of the first intra-frame prediction mode to the second amplitude value of the second intra-frame prediction mode is less than or equal to the first preset threshold.
  • the value of the first flag is 0, indicating that the image content corresponding to the current block is not the first image content, and the corresponding weighted fusion condition is the second fusion condition.
  • In this way, the video decoder decodes the code stream to obtain the first flag and, according to its value, looks up the weighted fusion condition of the current block in Table 1 above: when the value of the first flag is 1, the weighted fusion condition of the current block is the first fusion condition, and when the value of the first flag is 0, it is the second fusion condition.
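  • A possible decoder-side reading of this lookup is sketched below, assuming mode indices 0/1 for Planar/DC and an illustrative threshold; neither value is specified by the text.

```python
PLANAR, DC = 0, 1  # typical mode indices (assumption)

def fusion_condition_met(first_flag, mode1, mode2, amp1, amp2,
                         first_preset_threshold=2.0):  # threshold assumed
    if first_flag == 1:
        # First fusion condition: the stricter amplitude-ratio test.
        return amp2 > 0 and amp1 / amp2 <= first_preset_threshold
    # Second fusion condition: neither derived mode is Planar/DC and the
    # second amplitude is non-zero (the looser pre-existing test).
    return mode1 not in (PLANAR, DC) and mode2 not in (PLANAR, DC) and amp2 > 0
```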
  • S403. Determine a target predictive value of the current block according to the weighted fusion condition and at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • Specifically, if the first intra-frame prediction mode and the second intra-frame prediction mode satisfy the weighted fusion condition of the current block, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to perform weighted fusion prediction on the current block. If the first intra-frame prediction mode and the second intra-frame prediction mode do not satisfy the weighted fusion condition of the current block, the first intra-frame prediction mode and/or the second intra-frame prediction mode is used to predict the current block.
  • the present application does not limit the specific type of the above-mentioned third intra-frame prediction mode.
  • the third intra-frame prediction mode is the intra-frame prediction mode with the third largest amplitude value in the above histogram.
  • the above-mentioned third intra-frame prediction mode is a Planar or DC mode.
  • The method of determining the target prediction value of the current block includes but is not limited to the following:
  • If the first intra-frame prediction mode and the second intra-frame prediction mode do not satisfy the weighted fusion condition of the current block determined above, the first amplitude value corresponding to the first intra-frame prediction mode is much larger than the second amplitude value corresponding to the second intra-frame prediction mode; in this case, using the first intra-frame prediction mode alone to predict the current block already achieves a good prediction effect, without weighted prediction.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode satisfy the weighted fusion condition of the current block, S403 also includes the following S403-A2: use the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode to predict the current block to obtain respective prediction values, and then weight the prediction values corresponding to the intra-frame prediction modes to obtain the target prediction value of the current block.
  • In some embodiments, determining the target prediction value of the current block in the above S403-A2 includes the following steps:
  • S403-A21 Determine weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • the methods for determining the respective weights corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode include but are not limited to the following examples:
  • weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode are preset weights.
  • weights corresponding to the above three intra prediction modes are the same, for example, 1/3 respectively.
  • the weight corresponding to the first intra-frame prediction mode may be greater than the weights of the other two prediction modes.
  • In another example, a preset weight is determined as the weight of the third intra-frame prediction mode, and the weights corresponding to the first intra-frame prediction mode and the second intra-frame prediction mode are determined according to the first amplitude value and the second amplitude value.
  • For example, the weight of the third intra-frame prediction mode is determined as a; the present application does not limit the specific value of a, which is, for example, 1/3.
  • The weights corresponding to the first and second intra-frame prediction modes are then determined from the amplitude values: for example, the ratio of the first amplitude value to the sum of the first amplitude value and the second amplitude value is multiplied by (1-a) to obtain the weight corresponding to the first intra-frame prediction mode, and the ratio of the second amplitude value to that sum is multiplied by (1-a) to obtain the weight corresponding to the second intra-frame prediction mode.
  • S403-A22 Determine prediction values when predicting the current block using the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode respectively.
  • the prediction value corresponding to each intra prediction mode is multiplied by the weight, and then added to obtain the target prediction value of the current block.
  • the prediction values corresponding to the first intra prediction mode, the second intra prediction mode and the third intra prediction mode may be weighted according to the above formula (2), to obtain the target prediction value of the current block.
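  • The weight derivation and weighting just described can be sketched as follows. The list-based pixel representation and function name are illustrative assumptions; formula (2) itself is not reproduced here.

```python
# Sketch: the third mode gets a preset weight a, and (1 - a) is split between
# the first and second modes in proportion to their amplitudes.
# pred1/pred2/pred3 are per-sample prediction values of equal length.
def fuse_three_predictions(pred1, pred2, pred3, amp1, amp2, a=1.0 / 3.0):
    total = amp1 + amp2          # amp2 > 0 holds when fusion is allowed
    w1 = (1.0 - a) * amp1 / total
    w2 = (1.0 - a) * amp2 / total
    w3 = a                       # note w1 + w2 + w3 == 1
    return [w1 * p1 + w2 * p2 + w3 * p3
            for p1, p2, p3 in zip(pred1, pred2, pred3)]
```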
  • In a specific example, the video decoder decodes the code stream to obtain the first flag. If the first flag indicates that the first technology is used, the video decoder determines the weighted fusion condition of the current block using the first amplitude value of the first intra-frame prediction mode and the second amplitude value of the second intra-frame prediction mode; for example, it determines, as the weighted fusion condition of the current block, that the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold.
  • If the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold, that is, when the first intra-frame prediction mode and the second intra-frame prediction mode satisfy the weighted fusion condition of the current block, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to determine the target prediction value of the current block. If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, that is, when the first intra-frame prediction mode and the second intra-frame prediction mode do not satisfy the weighted fusion condition of the current block, the first intra-frame prediction mode is used to determine the target prediction value of the current block.
  • In this example, determining, as the weighted fusion condition of the current block, that the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold makes the weighted fusion condition stricter, thereby reducing the probability that the first intra-frame prediction mode and the second intra-frame prediction mode satisfy it. This reduces the probability of performing weighted prediction on a current block of the first image content, thereby ensuring the prediction quality of such blocks.
  • In some embodiments, the first intra-frame prediction mode and the second intra-frame prediction mode are used to determine the target prediction value of the current block.
  • Specifically, the current block is predicted using the first intra-frame prediction mode and the second intra-frame prediction mode respectively, and the prediction value of the first intra-frame prediction mode is weighted by a first weight and that of the second intra-frame prediction mode by a second weight to obtain the target prediction value of the current block.
  • the above-mentioned first weight and second weight are preset values, for example, the first weight is the same as the second weight, both being 1/2, or the first weight is greater than the second weight.
  • the first weight and the second weight are determined according to the first amplitude value and the second amplitude value.
  • In some embodiments, using the first intra-frame prediction mode and the second intra-frame prediction mode in the above S403-B1 to determine the target prediction value of the current block includes the following steps:
  • Determine the sum of the first amplitude value and the second amplitude value; determine the ratio of the first amplitude value to the sum as the first weight; and determine the ratio of the second amplitude value to the sum as the second weight.
  • S403-B14 Determine the target predictive value of the current block according to the first predictive value and the second predictive value, as well as the first weight and the second weight.
  • the sum of the product of the first predictive value and the first weight, and the product of the second predictive value and the second weight is determined as the target predictive value of the current block.
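  • These steps can be sketched in the same style as before; again the representation is an illustrative assumption.

```python
# Sketch of the two-mode case: amplitude-proportional fusion.
def fuse_two_predictions(pred1, pred2, amp1, amp2):
    total = amp1 + amp2
    w1 = amp1 / total            # first weight: ratio of amp1 to the sum
    w2 = amp2 / total            # second weight: ratio of amp2 to the sum
    return [w1 * p1 + w2 * p2 for p1, p2 in zip(pred1, pred2)]
```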
  • In another specific example, the video decoder decodes the code stream to obtain the first flag. If the first flag indicates that the first technology is used, the video decoder determines the weighted fusion condition of the current block using the first amplitude value of the first intra-frame prediction mode and the second amplitude value of the second intra-frame prediction mode; for example, it determines, as the weighted fusion condition of the current block, that the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold.
  • If the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold, that is, when the weighted fusion condition of the current block is satisfied, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to determine the target prediction value of the current block. If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, that is, when the first intra-frame prediction mode and the second intra-frame prediction mode do not satisfy the weighted fusion condition of the current block, the first intra-frame prediction mode and the second intra-frame prediction mode are used to determine the target prediction value of the current block.
  • Here too, determining, as the weighted fusion condition of the current block, that the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold makes the weighted fusion condition stricter, thereby reducing the probability that the first intra-frame prediction mode and the second intra-frame prediction mode satisfy it. This reduces the probability of performing weighted prediction on a current block of the first image content, thereby ensuring the prediction quality of such blocks.
  • In some embodiments, the type of the current frame where the current block is located restricts whether the method of the embodiments of the present application is used; that is, according to the type of the current frame, it is determined whether to perform S402, the step of determining the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • For example, if the type of the current frame is a target frame type, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode; for example, I frames are allowed to use the technical solution of this application while B frames are not.
  • This application does not limit the target frame type, which is determined according to actual needs.
  • the target frame type includes at least one of an I frame, a P frame, and a B frame.
  • In some embodiments, whether to use the method of the embodiments of the present application is restricted by both frame type and image block size.
  • When executing the method of the embodiments of the present application, the video decoder first determines the type of the current frame where the current block is located and the size of the current block; then, according to the type of the current frame and the size of the current block, it determines whether to determine the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • the size of the current block may include the height and width of the current block. Therefore, the video decoder determines whether to execute the above step S402 according to the height and width of the current block.
  • If the type of the current frame is the first frame type and the size of the current block is greater than a first threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • If the type of the current frame is the second frame type and the size of the current block is greater than a second threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • The first frame type is different from the second frame type.
  • the first threshold and the second threshold are also different.
  • the present application does not limit the specific types of the first frame type and the second frame type, nor does it limit the specific values of the first threshold and the second threshold.
  • For example, if the first frame type is an I frame and the second frame type is a B frame (or P frame), the second threshold may be different from the first threshold; that is, the applicable block sizes specified for I frames and for B frames (or P frames) may be different.
  • In some embodiments, quantization parameters may also be used to restrict whether the method of the embodiments of the present application is used.
  • When executing the method of the embodiments of the present application, the video decoder first decodes the code stream to obtain the quantization parameter corresponding to the current block; for example, the video decoder obtains the quantization parameter of the current block according to a frame-level permission flag or a sequence-level QP. Then, according to the quantization parameter, it determines whether to determine the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • For example, if the quantization parameter corresponding to the current block is smaller than a third threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • the present application does not limit the specific value of the third threshold, which is specifically determined according to actual needs.
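  • The frame-type, block-size, and quantization-parameter restrictions above can be combined into a single gate, sketched below. The frame types, thresholds, and comparison directions are assumptions for illustration; the text leaves them to be determined by actual needs.

```python
# Hypothetical gate: decide whether the amplitude-based weighted fusion
# condition (step S402/S602) may be derived for the current block.
def may_use_amplitude_based_condition(frame_type, width, height, qp,
                                      first_threshold=64,    # I-frame size bound (assumed)
                                      second_threshold=256,  # B/P-frame size bound (assumed)
                                      third_threshold=32):   # QP bound (assumed)
    if qp >= third_threshold:        # QP restriction
        return False
    size = width * height            # "size" read as the pixel count here
    if frame_type == "I":            # first frame type (assumed to be I)
        return size > first_threshold
    if frame_type in ("B", "P"):     # second frame type (assumed to be B/P)
        return size > second_threshold
    return False
```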
  • After the video decoder obtains the prediction value of the current block according to the above method, it decodes the code stream to obtain the residual value of the current block, and adds the prediction block to the residual block to obtain the reconstructed block of the current block.
  • It can be seen that the video decoder decodes the code stream, determines the amplitude values of the N intra-frame prediction modes corresponding to the reconstructed area around the current block, and determines the first intra-frame prediction mode and the second intra-frame prediction mode of the current block, where N is a positive integer greater than 1. It then determines the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, where the weighted fusion condition is used to judge whether weighted prediction is performed on the current block through the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode. Finally, it determines the target prediction value of the current block according to the weighted fusion condition and at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • the present application determines the weighted fusion condition of the current block according to the magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, and judges whether to perform weighted fusion prediction on the current block based on the determined weighted fusion condition.
  • In this way, weighted fusion prediction is performed on image content that requires it, while image content that does not require weighted fusion prediction avoids the reduced prediction quality and unnecessary noise that fusion would introduce, thereby improving the accuracy of intra-frame prediction.
  • FIG. 9 is a schematic flowchart of an intra prediction method provided in an embodiment of the present application, as shown in FIG. 9 , including:
  • Decode the code stream to obtain a DIMD allowed flag, where the DIMD allowed flag is used to indicate whether the current decoder allows the use of the DIMD technology, or is used to indicate whether the current sequence allows the use of the DIMD technology.
  • If the DIMD allowed flag indicates that the current decoder is allowed to use the DIMD technology, decode the code stream to obtain a DIMD enable flag.
  • the DIMD enable flag is used to indicate whether the current block uses the DIMD technology.
  • the first flag is used to indicate whether to use the first technology, and the first technology is used under the first image content.
  • S504. Determine magnitude values of N intra prediction modes corresponding to the template area of the current block, and determine a first intra prediction mode and a second intra prediction mode of the current block according to the magnitude values of the N intra prediction modes.
  • If the first flag indicates that the first technology is used, determine, as the first fusion condition, that the ratio of the first amplitude value of the first intra-frame prediction mode to the second amplitude value of the second intra-frame prediction mode is less than or equal to a first preset threshold.
  • If neither the first intra-frame prediction mode nor the second intra-frame prediction mode is the Planar or DC mode, and the amplitude value of the second intra-frame prediction mode is greater than 0, determine the target prediction value of the current block according to the weighted fusion condition and at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • Otherwise, that is, if this condition is not met, the first intra-frame prediction mode is used to determine the target prediction value of the current block.
  • If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, the first intra-frame prediction mode and the second intra-frame prediction mode are used to determine the target prediction value of the current block.
  • If the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to determine the target prediction value of the current block.
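  • Putting the branches of FIG. 9 together, a consolidated decoder-side sketch might look as follows, reusing the helpers from the earlier sketches (select_top_two_modes, fuse_two_predictions, fuse_three_predictions); predict() is a stub for running one intra-frame prediction mode, and the mode indices and threshold remain assumptions.

```python
def dimd_predict(amplitudes, predict, first_flag, first_preset_threshold=2.0):
    PLANAR, DC = 0, 1                       # typical mode indices (assumption)
    (m1, a1), (m2, a2) = select_top_two_modes(amplitudes)
    if m1 in (PLANAR, DC) or m2 in (PLANAR, DC) or a2 == 0:
        return predict(m1)                  # first preset condition not met
    if first_flag == 1 and a1 / a2 > first_preset_threshold:
        # Weighted fusion condition not met: blend only the two derived modes.
        return fuse_two_predictions(predict(m1), predict(m2), a1, a2)
    # Weighted fusion condition met: blend with the third mode, e.g. Planar.
    return fuse_three_predictions(predict(m1), predict(m2), predict(PLANAR),
                                  a1, a2)
```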
  • After the video decoder obtains the target prediction value of the current block according to the above method, it decodes the code stream to obtain the residual value of the current block, and adds the prediction block to the residual block to obtain the reconstructed block of the current block.
  • In this embodiment, the probability that the first intra-frame prediction mode and the second intra-frame prediction mode satisfy the weighted fusion condition proposed in this application is reduced, which reduces the probability of performing weighted prediction on a current block of the first image content, thereby ensuring the prediction quality of such blocks.
  • FIG. 10 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application. As shown in Figure 10, the method of the embodiment of the present application includes:
  • Here, N is a positive integer greater than 1.
  • the video encoder receives a video stream, which is composed of a series of image frames, performs video encoding for each frame of image in the video stream, and divides the image frames into blocks to obtain the current block.
  • the current block is also referred to as a current coding block, a current image block, a coding block, a current coding unit, a current block to be coded, a current image block to be coded, and the like.
  • the block divided by the traditional method includes not only the chrominance component of the current block position, but also the luminance component of the current block position.
  • The separation tree technology can divide separate component blocks, such as a separate luma block and a separate chroma block, where the luma block contains only the luma component of the current block position and the chroma block contains only the chroma component of the current block position. In this way, the luma component and the chroma component at the same position can belong to different blocks, giving the division greater flexibility. If the separation tree is used in CU partitioning, some CUs contain both luma and chroma components, some CUs contain only the luma component, and some CUs contain only the chroma component.
  • the current block in the embodiment of the present application only includes chroma components, which may be understood as a chroma block.
  • the current block in this embodiment of the present application only includes a luma component, which may be understood as a luma block.
  • the current block includes both luma and chroma components.
  • In this step, the video encoder determines the first intra-frame prediction mode and the second intra-frame prediction mode of the current block.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode meet the weighted fusion condition, the first intra-frame prediction mode is used to predict the current block to obtain the first prediction value of the current block, the second intra-frame prediction mode is used to predict the current block to obtain the second prediction value of the current block, and the first prediction value and the second prediction value are weighted and fused to obtain the target prediction value of the current block.
  • In some embodiments, a third prediction value can also be determined; for example, the third intra-frame prediction mode is used to predict the current block to obtain the third prediction value of the current block, and the first prediction value, the second prediction value, and the third prediction value are weighted and fused to obtain the target prediction value of the current block.
  • the foregoing third intra-frame prediction mode may be a preset intra-frame prediction mode, or determined in other ways, which is not limited in the present application.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the weighted fusion condition, one of the first intra-frame prediction mode and the second intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • In some embodiments, the video encoder carries a second flag in the code stream, where the second flag is used to indicate whether the target prediction value of the current block is determined through at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode. If the video encoder uses at least one of these modes to determine the target prediction value, it sets the second flag to true, for example sets its value to 1, and writes the second flag into the code stream, for example into the code stream header. In this way, after the video decoder obtains the code stream, it decodes the code stream to obtain the second flag.
  • If the second flag is true, the video decoder determines that the target prediction value of the current block is determined through at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode; at this time, the video decoder determines the first intra-frame prediction mode and the second intra-frame prediction mode of the current block.
  • If the video encoder does not determine the target prediction value of the current block through at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode, it sets the second flag to false, for example sets its value to 0, and writes the second flag into the code stream, for example into the code stream header.
  • The video decoder decodes the code stream to obtain the second flag. If the second flag is false, for example its value is 0, the video decoder does not determine the first intra-frame prediction mode and the second intra-frame prediction mode of the current block; instead, it traverses other preset intra-frame prediction modes, determines the intra-frame prediction mode with the least cost to predict the current block, and obtains the target prediction value of the current block.
  • The embodiments of the present application mainly involve the case where the target prediction value of the current block is determined through at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode; that is to say, this application mainly discusses the case where the above-mentioned second flag is true.
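  • A minimal signalling sketch: the encoder compares the DIMD-derived prediction against its best alternative by rate-distortion cost and writes the second flag accordingly. The bitstream abstraction and cost comparison are assumptions for illustration, not the application's defined procedure.

```python
# Hypothetical encoder-side decision for the second flag.
def write_second_flag(bits, dimd_rd_cost, best_other_rd_cost):
    use_dimd = dimd_rd_cost <= best_other_rd_cost
    bits.append(1 if use_dimd else 0)  # second flag, e.g. in the stream header
    return use_dimd

bits = []
write_second_flag(bits, dimd_rd_cost=10.0, best_other_rd_cost=12.5)  # bits == [1]
```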
  • In some embodiments, the above-mentioned second flag may be a DIMD enable flag, for example, sps_DIMD_enable_flag. That is to say, the video encoder obtains the DIMD allowed flag bit, which is a sequence-level flag bit.
  • The DIMD allowed flag bit is used to indicate whether the current sequence is allowed to use the DIMD technology. If the video encoder determines that the current sequence allows the use of the DIMD technology, that is, the DIMD allowed flag bit is true, for example 1, the video encoder determines the first intra-frame prediction mode and the second intra-frame prediction mode of the current block and executes the method of the embodiments of the present application.
  • If the video encoder uses the DIMD technology to determine the target prediction value of the current block, it sets the DIMD enable flag to true, for example to 1, and writes it into the code stream, for example into the code stream header. If the video encoder does not use the DIMD technology to determine the target prediction value of the current block, it sets the DIMD enable flag to false, for example to 0, and writes it into the code stream, for example into the code stream header.
  • In this way, the video decoder can parse the DIMD enable flag from the code stream and determine, according to the DIMD enable flag, whether to use the DIMD technology to determine the target prediction value of the current block, so as to ensure consistency between the decoding end and the encoding end and ensure the reliability of the prediction.
  • the reconstructed area around the current block may be any preset area among the reconstructed areas around the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above the current block.
  • the reconstructed area around the current block includes k columns of reconstructed pixels on the left side of the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above and to the left of the current block, which are determined as the template area of the current block.
  • the reconstructed area around the current block includes m rows of reconstructed pixels above and to the left of the current block, and k columns of reconstructed pixels on the left side of the current block, such as the L area in FIG. 7A .
  • m and k may be the same or different, which is not limited in this application.
  • the above m rows of pixels may or may not be adjacent to the current block.
  • the aforementioned k columns of pixels may or may not be adjacent to the current block.
  • In some embodiments, the process of determining the amplitude values of the N intra-frame prediction modes corresponding to the surrounding reconstructed area may be as follows: first, each n×n (for example, 3×3) area of the reconstructed area around the current block is scanned with the Sobel operator to calculate the gradients in the horizontal and vertical directions, obtaining the horizontal gradient Dx and the vertical gradient Dy.
  • Specifically, a 3×3 horizontal Sobel filter and a 3×3 vertical Sobel filter are used to calculate the horizontal gradient Dx and the vertical gradient Dy of each 3×3 area on the template area, respectively.
  • The horizontal gradient Dx is calculated according to the above formula (4), and the vertical gradient Dy is calculated according to formula (5).
  • all intra-frame prediction modes in the histogram may be determined as N intra-frame prediction modes.
  • the intra-frame prediction modes whose histogram magnitudes are greater than a certain preset value may be determined as N intra-frame prediction modes.
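  • The gradient analysis above can be sketched end to end as follows. The Sobel kernels are the standard 3×3 ones; the angle-to-mode mapping is a crude placeholder for the codec's real lookup table (formulas (4) and (5) are not reproduced here), so treat the whole block as an assumption-laden illustration.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal Sobel kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical Sobel kernel

def angle_to_mode(dx, dy, num_angular_modes=65):
    # Placeholder: quantize the gradient direction into an angular mode index.
    angle = math.atan2(dy, dx) % math.pi
    return 2 + int(angle / math.pi * (num_angular_modes - 1))

def build_amplitude_histogram(template):
    """template: 2-D list of reconstructed samples from the template area."""
    hist = {}
    rows, cols = len(template), len(template[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dx = sum(SOBEL_X[i][j] * template[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            dy = sum(SOBEL_Y[i][j] * template[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            if dx == 0 and dy == 0:
                continue                      # flat 3x3 area: no direction
            mode = angle_to_mode(dx, dy)
            hist[mode] = hist.get(mode, 0) + abs(dx) + abs(dy)  # amplitude
    return hist
```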
  • In some embodiments, the above-mentioned reconstructed area around the current block is the template area of the current block; as shown in FIG. 7A, the template area of the current block is the region of the reconstructed area around the current block that is adjacent to the current block.
  • The template area of the current block is also referred to as the neighboring reconstructed sample area of the current block.
  • In this case, the process of determining the amplitude values of the N intra-frame prediction modes corresponding to the reconstructed area around the current block in the above S601 becomes determining the amplitude values of the N intra-frame prediction modes corresponding to the template area of the current block. Determining the amplitude values of the N intra-frame prediction modes corresponding to the template area of the current block is basically the same as the above-mentioned process for the reconstructed area around the current block; it is only necessary to replace the reconstructed area around the current block with the template area of the current block.
  • Next, the first intra-frame prediction mode and the second intra-frame prediction mode of the current block are determined based on the amplitude values of the N intra-frame prediction modes.
  • The first intra-frame prediction mode and the second intra-frame prediction mode of the current block can be determined in the following ways:
  • Mode 1: determine any one of the N intra-frame prediction modes as the first intra-frame prediction mode, and determine any intra-frame prediction mode other than the first intra-frame prediction mode among the N intra-frame prediction modes as the second intra-frame prediction mode.
  • Mode 2: determine the intra-frame prediction mode with the largest amplitude value among the N intra-frame prediction modes as the first intra-frame prediction mode, and determine the intra-frame prediction mode with the second largest amplitude value as the second intra-frame prediction mode.
  • S602. Determine a weighted fusion condition of the current block according to magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode.
  • The foregoing weighted fusion condition is used to determine whether weighted prediction is performed on the current block through the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • After the video encoder obtains the first intra-frame prediction mode and the second intra-frame prediction mode of the current block according to step S601, it does not directly use these two modes to perform weighted prediction on the current block; it first judges whether the first intra-frame prediction mode and the second intra-frame prediction mode meet the weighted fusion condition of the current block.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode meet the weighted fusion condition of the current block, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to perform weighted prediction on the current block. For example, the first intra-frame prediction mode is used to predict the current block to obtain the first prediction value, the second intra-frame prediction mode is used to predict the current block to obtain the second prediction value, and the third intra-frame prediction mode is used to predict the current block to obtain the third prediction value; the first prediction value, the second prediction value, and the third prediction value are then weighted to obtain the target prediction value of the current block.
  • the weights for weighting the first predictive value, the second predictive value, and the third predictive value may be determined according to amplitude values corresponding to the first intra-frame prediction mode and the second intra-frame prediction mode.
  • If the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the weighted fusion condition of the current block, one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block. For example, the prediction mode with the largest amplitude value among the first intra-frame prediction mode and the second intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • In some embodiments, the first intra-frame prediction mode is the intra-frame prediction mode with the largest amplitude value among the N prediction modes, and the second intra-frame prediction mode is the one with the second largest amplitude value among the N prediction modes; that is, the first amplitude value is greater than the second amplitude value. Therefore, when the first intra-frame prediction mode and the second intra-frame prediction mode do not meet the weighted fusion condition of the current block, the first intra-frame prediction mode is used to predict the current block to obtain the target prediction value of the current block.
  • At present, the weighted fusion condition is loose. For example, as long as mode1 and mode2 are not the Planar or DC mode and the amplitude value of mode2 is greater than 0, weighted fusion prediction can be performed. However, some image content, such as screen-recorded image content, generally has sharp edges and bright colors; when weighted fusion prediction is used for such content, the prediction quality is reduced instead.
  • In view of this, the present application determines the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, which reduces the probability that weighted fusion prediction degrades the quality of such image content.
  • The present application does not limit the method of determining the weighted fusion condition of the current block according to the amplitude values of the first intra-frame prediction mode and the second intra-frame prediction mode in S602 above; examples include but are not limited to the following:
  • Way 1: determine, as the weighted fusion condition of the current block, that the difference between the amplitude value of the first intra-frame prediction mode and the amplitude value of the second intra-frame prediction mode is smaller than a preset value 1.
  • If the difference between the amplitude value of the first intra-frame prediction mode and that of the second intra-frame prediction mode is greater than or equal to the preset value 1, the amplitude value of the first intra-frame prediction mode is far greater than that of the second intra-frame prediction mode, which indicates that the probability that the first intra-frame prediction mode is applicable to the current block is much greater than that of the second intra-frame prediction mode. In this case, if the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode are used to perform weighted prediction on the current block, noise is introduced instead and the prediction effect is reduced.
  • If the difference between the amplitude value of the first intra-frame prediction mode and that of the second intra-frame prediction mode is less than the preset value 1, the two amplitude values do not differ much, which means that the probabilities that the first intra-frame prediction mode and the second intra-frame prediction mode are applicable to the current block are basically the same. In this case, the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode can be used to perform weighted prediction on the current block to improve the prediction effect of the current block.
  • the present application does not limit the specific value of the preset value 1, which is specifically determined according to actual needs.
  • Determining, as the weighted fusion condition of the current block, that the difference between the two amplitude values is less than the preset value 1 ensures that weighted fusion prediction is performed on current blocks that need it, and reduces the probability of performing weighted fusion prediction on image content that does not need it, thereby improving the accuracy of intra-frame prediction.
  • Way 2: determine, as the weighted fusion condition of the current block, that the ratio of the first amplitude value of the first intra-frame prediction mode to the second amplitude value of the second intra-frame prediction mode is less than or equal to a first preset threshold.
  • If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, the amplitude value of the first intra-frame prediction mode is far greater than that of the second intra-frame prediction mode, which indicates that the probability that the first intra-frame prediction mode is applicable to the current block is much greater than that of the second intra-frame prediction mode. In this case, if the three modes are used to perform weighted prediction on the current block, noise is introduced instead and the prediction effect is reduced.
  • If the ratio is less than or equal to the first preset threshold, the two amplitude values do not differ much, which means that the two modes are applicable to the current block with basically the same probability. In this case, weighted prediction may be performed on the current block using the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode to improve the prediction effect of the current block.
  • The embodiment of the present application does not limit the specific content of the first preset condition; it is determined according to actual needs.
  • For example, the first preset condition is that neither the first intra prediction mode nor the second intra prediction mode is the Planar or DC mode, and the second amplitude value corresponding to the second intra prediction mode is not zero.
  • If the first intra prediction mode and the second intra prediction mode meet the first preset condition, the above step S602 is executed, and the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes.
  • Otherwise, the above step S602 is not performed; instead, one of the first and second intra prediction modes is used to predict the current block, for example the first intra prediction mode, to obtain the target prediction value of the current block.
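  • A sketch of this gate, under the same hedges as above (the mode indices are placeholders chosen for illustration):

```python
PLANAR, DC = 0, 1  # placeholder mode indices

def meets_first_preset_condition(mode1: int, mode2: int, amp2: int) -> bool:
    """Example first preset condition from the text: neither derived mode is
    Planar or DC, and the second amplitude value is non-zero."""
    return mode1 not in (PLANAR, DC) and mode2 not in (PLANAR, DC) and amp2 != 0
```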
  • At present, the weighted fusion condition is fixed; that is, no matter what the image content is, the same weighted fusion condition is applied to the current block.
  • For some image content, such as screen-recorded image content, weighted fusion prediction can be understood as a blurring prediction method: it reduces the sharpness and color vividness of the image, lowering prediction quality and introducing noise.
  • Therefore, the present application determines the weighted fusion condition of the current block according to the image content. That is, this application provides differentiated weighted fusion conditions: the conditions corresponding to different image content can differ. This ensures that weighted fusion prediction is performed on image content that needs it, improving prediction accuracy, while image content that does not need it is not weight-fused, avoiding unnecessary noise and preserving prediction quality.
  • A sequence includes a series of images generated in the same environment; therefore, the image content of the images in a sequence is basically the same.
  • The image content of the current block is of the same type as the image content of the current sequence; for example, both are screen content, or both are content captured by a camera.
  • Therefore, the weighted fusion condition of the current block is determined according to the image content of the current sequence. For example, when the image content corresponding to the current block is the first image content, the video encoder determines the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes; if the image content corresponding to the current block is the second image content, the weighted fusion condition of the current block is determined in another way.
  • The video encoder writes the type of the image content of the current sequence into the code stream in the form of a flag bit, and the video decoder decodes the code stream to obtain the flag bit and determines the image content type of the current sequence through it. For example, when the value of the flag bit is 1, it indicates that the image content of the current sequence is the first image content; when the value of the flag bit is 0, it indicates that the image content of the current sequence is the second image content, where the first image content is different from the second image content.
  • the video encoder writes a first flag into the code stream, the first flag is used to indicate whether to use the first technology, and the first technology is used under the first image content.
  • In this case, the weighted fusion condition may change only when the image content corresponding to the current block is the first image content; if it is not the first image content, the weighted fusion condition does not change. That is, if the image content corresponding to the current block is the first image content, the weighted fusion condition adopted is the first fusion condition; if it is not the first image content, the weighted fusion condition adopted is the second fusion condition.
  • The first fusion condition is different from the second fusion condition, and the first fusion condition is determined according to the amplitude values of the first and second intra prediction modes; for example, "the ratio of the first amplitude value of the first intra prediction mode to the second amplitude value of the second intra prediction mode is less than or equal to the first preset threshold" is determined as the first fusion condition.
  • If the video encoder determines that the image content corresponding to the current block is the first image content, it determines that the current block can use the first technology. The first technology can be understood as the technique provided by the embodiment of the present application, namely determining the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes. If the video encoder determines that the current block can use the first technology, it sets the first flag to true and encodes it into the code stream, for example giving the first flag the value 1.
  • If the video encoder determines that the image content corresponding to the current block is not the first image content, it determines that the current block cannot use the first technology, sets the first flag to false, and encodes it into the code stream, for example giving the first flag the value 0. In this way, the video decoder decodes the code stream to obtain the first flag and then determines the weighted fusion condition of the current block according to it. For example, when the value of the first flag is 1, the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes; if the value of the first flag is 0, the weighted fusion condition of the current block can be determined in another way.
  • the above-mentioned first flag may be a sequence-level flag, which is used to indicate whether the current sequence can use the first technology.
  • the above-mentioned first flag may be an image-level flag, used to indicate whether the current image can use the first technology.
  • a new field may be added in the code stream to represent the first flag.
  • the first flag is represented by the field sps_DIMD_blendoff_flag, which is a completely new field.
  • the above-mentioned first flag multiplexes the third flag in the current sequence, that is, the existing fields in the current sequence can be reused without adding new fields, thereby saving codewords.
  • For example, the above third flag is an intra-block copy (IBC) enable flag or a template matching prediction (TMP) enable flag, or the like.
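  • To make the signaling concrete, here is a hedged decoder-side sketch; the function, the fallback rule, and the threshold are assumptions rather than behavior specified above.

```python
def fusion_condition_for_block(first_flag: bool, amp1: int, amp2: int,
                               threshold: float = 2.0) -> bool:
    """If the first flag (e.g. sps_DIMD_blendoff_flag, or a reused IBC/TMP
    enable flag) signals the first technology, apply the amplitude-based
    condition; otherwise fall back to a placeholder legacy rule."""
    if first_flag:
        return amp2 > 0 and amp1 / amp2 <= threshold
    return amp2 > 0  # stand-in for "determined in another way"
```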
  • S603. Determine a first predictive value of the current block according to the weighted fusion condition and at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • If the first intra prediction mode and the second intra prediction mode satisfy the weighted fusion condition of the current block, the first, second, and third intra prediction modes are used to perform weighted fusion prediction on the current block. If the first and second intra prediction modes do not satisfy the weighted fusion condition, the first intra prediction mode and/or the second intra prediction mode is used to predict the current block.
  • the present application does not limit the specific type of the above-mentioned third intra-frame prediction mode.
  • the third intra-frame prediction mode is the intra-frame prediction mode with the third largest amplitude value in the above histogram.
  • the above-mentioned third intra-frame prediction mode is a Planar or DC mode.
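  • As an illustrative sketch of deriving the three modes (assuming, as the text implies but does not spell out, that the histogram maps mode indices to amplitude values and holds at least three entries):

```python
def derive_modes(hist: dict[int, int]) -> tuple[int, int, int]:
    """Return the modes with the largest, second-largest and third-largest
    amplitudes; the alternative above instead fixes the third mode to
    Planar or DC."""
    ordered = sorted(hist, key=hist.get, reverse=True)
    return ordered[0], ordered[1], ordered[2]
```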
  • In some embodiments, the manner of determining the first prediction value of the current block includes but is not limited to the following:
  • If the first intra prediction mode and the second intra prediction mode do not satisfy the weighted fusion condition of the current block determined above, the first amplitude value corresponding to the first intra prediction mode is much larger than the second amplitude value corresponding to the second intra prediction mode. In this case, when the first intra prediction mode is used to predict the current block, a good prediction effect can be achieved without weighted prediction.
  • S603 also includes the following S603-A2:
  • That is, the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode are each used to predict the current block to obtain their respective prediction values, and the prediction values corresponding to the intra prediction modes are then weighted to obtain the first prediction value of the current block.
  • determining the first prediction value of the current block includes the following steps:
  • S603-A21 Determine weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • the methods for determining the respective weights corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode include but are not limited to the following examples:
  • weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode are preset weights.
  • the weights corresponding to the three intra prediction modes are the same, for example, 1/3 respectively.
  • the weight corresponding to the first intra-frame prediction mode may be greater than the weights of the other two prediction modes.
  • the preset weight is determined as the weight of the third intra-frame prediction mode; according to the first amplitude value and the second amplitude value, the weights corresponding to the first intra-frame prediction mode and the second intra-frame prediction mode are respectively determined.
  • the weight of the third intra-frame prediction mode is determined as a, and the present application does not limit the specific value of a, for example, 1/3.
  • Then, the weights corresponding to the first and second intra prediction modes are determined according to the first and second amplitude values: the ratio of the first amplitude value to the sum of the first and second amplitude values is multiplied by (1 - a) to obtain the weight corresponding to the first intra prediction mode, and the ratio of the second amplitude value to that sum is multiplied by (1 - a) to obtain the weight corresponding to the second intra prediction mode.
  • S603-A22 Determine prediction values when predicting the current block using the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode respectively.
  • The prediction value corresponding to each intra prediction mode is multiplied by its weight, and the results are added to obtain the first prediction value of the current block.
  • the prediction values corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode may be weighted according to the above formula (2), to obtain the first prediction value of the current block.
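  • A compact sketch of this weighting procedure follows; the list-based samples, the floating-point math, and the example preset weight a = 1/3 are simplifying assumptions (a real codec would typically use fixed-point arithmetic).

```python
def blend_predictions(pred1, pred2, pred3, amp1: int, amp2: int, a: float = 1/3):
    """The third mode gets the preset weight `a`; the remaining 1 - a is
    split between the first two modes in proportion to their amplitudes."""
    w1 = (1 - a) * amp1 / (amp1 + amp2)
    w2 = (1 - a) * amp2 / (amp1 + amp2)
    w3 = a
    return [w1 * p1 + w2 * p2 + w3 * p3
            for p1, p2, p3 in zip(pred1, pred2, pred3)]
```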
  • the first prediction value of the current block may be determined according to the first intra-frame prediction mode and/or the second intra-frame prediction mode, and then, the following step S604 is performed.
  • the above-mentioned first predictive value of the current block may be directly determined as the target predictive value of the current block.
  • the above S604 includes the following steps:
  • the foregoing first encoding cost may be the RDO cost, and optionally, may also be an approximate cost such as SAD or SATD, which is not limited in this application.
  • S604-A2 Determine a second coding cost when each intra prediction mode in the candidate prediction set predicts the current block.
  • Specifically, the above candidate prediction set includes at least one intra prediction mode; each intra prediction mode in the candidate prediction set is traversed and used to encode and predict the current block, obtaining the second encoding costs.
  • If the first encoding cost is the smallest among the first encoding cost and the second encoding costs, the above first prediction value is determined as the target prediction value of the current block, and the second flag is set to true and written into the code stream; for example, the DIMD enable flag is set to true, e.g., set to 1, and encoded into the code stream.
  • Otherwise, the prediction value corresponding to the smallest second encoding cost is determined as the target prediction value of the current block, and the second flag is set to false and written into the code stream; for example, the DIMD enable flag is set to false, e.g., set to 0, and encoded into the code stream.
  • the second flag is used to indicate whether the target prediction value of the current block is determined through at least one of the first intra-frame prediction mode and the second intra-frame prediction mode.
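  • An encoder-side sketch of this cost comparison (the cost metric, the data types, and tie-breaking toward the first prediction value are assumptions):

```python
def choose_target_prediction(first_pred, first_cost: float, candidates):
    """`candidates` is a list of (prediction, cost) pairs, one per mode in
    the candidate prediction set. Returns (target prediction, second flag)."""
    best_pred, best_cost = min(candidates, key=lambda c: c[1])
    if first_cost <= best_cost:
        return first_pred, 1   # DIMD enable flag written as true
    return best_pred, 0        # DIMD enable flag written as false
```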
  • In some embodiments, the type of the current frame where the current block is located restricts whether the method of the embodiment of the present application is used; that is, according to the type of the current frame, it is determined whether to perform S602, the step of determining the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • For example, if the type of the current frame is the target frame type, the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes; for instance, I frames are allowed to use the technical solution of this application while B frames are not.
  • This application does not limit the target frame type, which is determined according to actual needs.
  • the target frame type includes at least one of an I frame, a P frame, and a B frame.
  • whether to use the method of the embodiment of the present application is restricted by frame type and image block size.
  • When executing the method of the embodiment of the present application, the video encoder first determines the type of the current frame where the current block is located and the size of the current block, and then determines, according to the frame type and block size, whether to determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • the size of the current block may include the height and width of the current block. Therefore, the video decoder determines whether to execute the above step S602 according to the height and width of the current block.
  • If the type of the current frame is the first frame type and the size of the current block is greater than the first threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes.
  • If the type of the current frame is the second frame type and the size of the current block is greater than the second threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes.
  • The first frame type is different from the second frame type.
  • the first threshold and the second threshold are also different.
  • the present application does not limit the specific types of the first frame type and the second frame type, nor does it limit the specific values of the first threshold and the second threshold.
  • The second threshold is different from the first threshold; that is, the applicable block sizes specified for I frames and for B frames (or P frames) may be different.
  • quantization parameters may also be used to limit whether to use the method of the embodiment of the present application.
  • When executing the method of the embodiment of the present application, the video encoder first obtains the quantization parameter corresponding to the current block, for example according to the frame-level permission flag bit or the sequence-level QP permission flag bit, and then determines, according to the quantization parameter, whether to determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • For example, if the quantization parameter is smaller than a third threshold, the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes.
  • the present application does not limit the specific value of the third threshold, which is specifically determined according to actual needs.
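  • Pulling the optional restrictions together, a hedged sketch (the frame-type identifiers and every threshold value here are placeholders, since the text leaves them to actual needs):

```python
I_FRAME, P_FRAME, B_FRAME = 0, 1, 2  # placeholder frame-type identifiers

def amplitude_condition_enabled(frame_type: int, width: int, height: int,
                                qp: int, first_thr: int = 64,
                                second_thr: int = 256, qp_thr: int = 37) -> bool:
    """Apply the amplitude-based fusion condition only for permitted frame
    types, block sizes and quantization parameters."""
    size = width * height
    if frame_type == I_FRAME:
        return size > first_thr and qp < qp_thr
    if frame_type in (P_FRAME, B_FRAME):
        return size > second_thr and qp < qp_thr
    return False
```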
  • After the video encoder obtains the target prediction value of the current block according to the above method, it obtains the residual value of the current block according to the target prediction value and the original value, transforms and quantizes the residual value, and then encodes it to obtain the code stream.
  • In summary, the video encoder determines the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block and, according to these amplitude values, determines the first and second intra prediction modes of the current block; determines the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes; determines the first prediction value of the current block according to the weighted fusion condition and the first, second, and third intra prediction modes; and determines the target prediction value of the current block according to the first prediction value.
  • the present application determines the weighted fusion condition of the current block according to the magnitude values of the first intra-frame prediction mode and the second intra-frame prediction mode, and judges whether to perform weighted fusion prediction on the current block based on the determined weighted fusion condition.
  • This avoids the situation in which weighted fusion prediction is performed on image content that does not require it, which would reduce prediction quality and introduce unnecessary noise, thereby improving the accuracy of intra prediction.
  • FIG. 11 is a schematic flow chart of an intra prediction method provided in an embodiment of the present application, as shown in FIG. 11 , including:
  • the DIMD permission flag is used to indicate whether the current decoder is allowed to use the DIMD technology.
  • If the DIMD permission flag indicates that the current decoder is allowed to use the DIMD technology, the amplitude values of the N intra prediction modes corresponding to the template area of the current block are determined, and the first and second intra prediction modes of the current block are determined according to the amplitude values of the N intra prediction modes.
  • "The ratio of the first amplitude value of the first intra prediction mode to the second amplitude value of the second intra prediction mode is less than or equal to a first preset threshold" is determined as the first fusion condition.
  • If neither the first intra prediction mode nor the second intra prediction mode is the Planar or DC mode, and the amplitude value of the second intra prediction mode is greater than 0, the first prediction value of the current block is determined according to the weighted fusion condition and at least one of the first, second, and third intra prediction modes.
  • Otherwise, the first intra prediction mode is used to determine the first prediction value of the current block.
  • If the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, the first intra prediction mode and the second intra prediction mode are used to determine the first prediction value of the current block.
  • If the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold, the first, second, and third intra prediction modes are used to determine the first prediction value of the current block.
  • the foregoing first encoding cost is a rate-distortion cost.
  • If the first encoding cost is the smallest among the first encoding cost and the second encoding costs, the first prediction value is determined as the target prediction value of the current block, and the DIMD enable flag is set to 1 and encoded into the code stream.
  • The DIMD enable flag is used to indicate whether the current block uses the DIMD technology.
  • If the first encoding cost is not the smallest among the first encoding cost and the second encoding costs, the prediction value corresponding to the smallest second encoding cost is determined as the target prediction value of the current block, and the DIMD enable flag is set to 0 and encoded into the code stream.
  • the first flag is used to indicate whether to use the first technology, and the first technology is used under the first image content.
  • In this way, the probability that the first intra prediction mode and the second intra prediction mode satisfy the weighted fusion condition proposed in this application is reduced, which reduces the probability of performing weighted prediction on current blocks of the first image content and thereby ensures the prediction quality of current blocks of the first image content.
  • FIGS. 8 to 11 are only examples of the present application, and should not be construed as limiting the present application.
  • The sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • The term "and/or" describes only an association relationship between associated objects, indicating that three relationships may exist. Specifically, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • the character "/" in this article generally indicates that the contextual objects are an "or" relationship.
  • Fig. 12 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application.
  • the intra prediction device 10 includes:
  • a decoding unit 11, configured to decode the code stream, determine the amplitude values of N intra prediction modes corresponding to the reconstructed area around the current block, and determine the first intra prediction mode and the second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1;
  • a determination unit 12, configured to determine the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether the current block is subjected to weighted prediction through the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; and
  • a prediction unit 13, configured to determine the target prediction value of the current block according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode.
  • In some embodiments, the determination unit 12 is specifically configured to, when the image content corresponding to the current block is the first image content, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the determination unit 12 is specifically configured to decode the code stream to obtain a first flag, where the first flag is used to indicate whether to use the first technology and the first technology is used under the first image content; and if the first flag indicates that the first technology is used, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the determination unit 12 is specifically configured to determine "the ratio of the first amplitude value of the first intra prediction mode to the second amplitude value of the second intra prediction mode is less than or equal to a first preset threshold" as the weighted fusion condition of the current block.
  • In some embodiments, the prediction unit 13 is specifically configured to, if the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, determine the target prediction value of the current block using the first intra prediction mode.
  • In some embodiments, the prediction unit 13 is specifically configured to determine the target prediction value of the current block using the first intra prediction mode and the second intra prediction mode.
  • the prediction unit 13 is specifically configured to use the first intra prediction mode to predict the current block to obtain a first prediction value
  • In some embodiments, the prediction unit 13 is specifically configured to: determine the sum of the first amplitude value and the second amplitude value; determine the ratio of the first amplitude value to the sum as the first weight; and determine the ratio of the second amplitude value to the sum as the second weight.
  • the prediction unit 13 is specifically configured to use the first intra-frame prediction if the ratio of the first magnitude value to the second magnitude value is less than or equal to the first preset threshold mode, the second intra-frame prediction mode, and the third intra-frame prediction mode, and determine the target prediction value of the current block.
  • In some embodiments, the prediction unit 13 is specifically configured to: determine the weights respectively corresponding to the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; determine the prediction values when the first, second, and third intra prediction modes respectively predict the current block; and perform weighting according to the prediction values and weights respectively corresponding to the first, second, and third intra prediction modes to obtain the target prediction value of the current block.
  • weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode are preset weights.
  • the prediction unit 13 is specifically configured to determine a preset weight as the weight of the third intra-frame prediction mode
  • In some embodiments, the determination unit 12 is specifically configured to, if the first intra prediction mode and the second intra prediction mode meet a first preset condition, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the prediction unit 13 is further configured to, if the first preset condition is not satisfied, determine the target prediction value of the current block using one of the first intra prediction mode and the second intra prediction mode.
  • In some embodiments, the first preset condition is that neither the first intra prediction mode nor the second intra prediction mode is the Planar or DC mode, and the second amplitude value corresponding to the second intra prediction mode is not zero.
  • In some embodiments, the decoding unit 11 is specifically configured to decode the code stream to obtain the decoder-side intra mode derivation (DIMD) enable flag, where the DIMD enable flag is used to indicate whether the current block uses the DIMD technology; and if the DIMD enable flag indicates that the current block uses the DIMD technology, determine the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block.
  • the decoding unit 11 is specifically configured to determine the amplitude values of the N intra prediction modes corresponding to the template area of the current block if the DIMD enable flag indicates that the current block uses the DIMD technology .
  • In some embodiments, the decoding unit 11 is specifically configured to determine the intra prediction mode with the largest amplitude value among the N intra prediction modes as the first intra prediction mode, and determine the intra prediction mode with the second largest amplitude value among the N intra prediction modes as the second intra prediction mode.
  • In some embodiments, the decoding unit 11 is further configured to decode the code stream to obtain a second flag, where the second flag is used to indicate whether the target prediction value of the current block is determined through at least one of the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; and if the second flag is true, determine the amplitude values of the N intra prediction modes corresponding to the reconstructed area around the current block.
  • In some embodiments, the second flag is a decoder-side intra mode derivation (DIMD) enable flag.
  • In some embodiments, the decoding unit 11 is further configured to determine the type of the current frame where the current block is located; and if the type of the current frame is the target frame type, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes, where the target frame type includes at least one of I frame, P frame, and B frame.
  • In some embodiments, the decoding unit 11 is further configured to determine the type of the current frame where the current block is located and the size of the current block; if the type of the current frame is the first frame type and the size of the current block is greater than the first threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes; or, if the type of the current frame is the second frame type and the size of the current block is greater than the second threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the decoding unit 11 is further configured to determine the quantization parameter corresponding to the current block; and if the quantization parameter is smaller than the third threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • the first flag multiplexes a third flag of the current sequence
  • the third flag is a sequence-level intra block copy IBC enable flag or a template matching prediction TMP enable flag.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • The device 10 shown in FIG. 12 can execute the intra prediction method of the embodiment of the present application, and the foregoing and other operations and/or functions of each unit in the device 10 implement the corresponding processes of the above intra prediction method and other methods; for brevity, they are not repeated here.
  • Fig. 13 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application.
  • the intra prediction device 20 may include:
  • a first determination unit 21, configured to determine the amplitude values of N intra prediction modes corresponding to the reconstructed area around the current block, and determine the first intra prediction mode and the second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1;
  • a second determination unit 22, configured to determine the weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether the current block is subjected to weighted prediction through the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; and
  • a prediction unit 23, configured to determine the first prediction value of the current block according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode, and determine the target prediction value of the current block according to the first prediction value.
  • In some embodiments, the first determination unit 21 is configured to, when the image content corresponding to the current block is the first image content, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • the prediction unit 23 is further configured to write a first flag into the code stream, the first flag is used to indicate whether to use the first technology, and the first technology is used under the first image content .
  • In some embodiments, the second determination unit 22 is specifically configured to determine "the ratio of the first amplitude value of the first intra prediction mode to the second amplitude value of the second intra prediction mode is less than or equal to the first preset threshold" as the weighted fusion condition of the current block.
  • In some embodiments, the prediction unit 23 is specifically configured to, if the ratio of the first amplitude value to the second amplitude value is greater than the first preset threshold, determine the first prediction value of the current block using the first intra prediction mode.
  • In some embodiments, the prediction unit 23 is specifically configured to determine the first prediction value of the current block using the first intra prediction mode and the second intra prediction mode.
  • the prediction unit 23 is specifically configured to determine a first prediction value when using the first intra prediction mode to predict the current block
  • In some embodiments, the prediction unit 23 is specifically configured to: determine the sum of the first amplitude value and the second amplitude value; determine the ratio of the first amplitude value to the sum as the first weight; and determine the ratio of the second amplitude value to the sum as the second weight.
  • the prediction unit 23 is specifically configured to use the first intra-frame prediction if the ratio of the first amplitude value to the second amplitude value is less than or equal to the first preset threshold mode, the second intra-frame prediction mode, and the third intra-frame prediction mode, and determine a first prediction value of the current block.
  • In some embodiments, the prediction unit 23 is specifically configured to: determine the weights respectively corresponding to the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; determine the prediction values when the first, second, and third intra prediction modes respectively predict the current block; and perform weighting according to the prediction values and weights respectively corresponding to the first, second, and third intra prediction modes to obtain the first prediction value of the current block.
  • weights respectively corresponding to the first intra-frame prediction mode, the second intra-frame prediction mode and the third intra-frame prediction mode are preset weights.
  • the prediction unit 23 is specifically configured to determine a preset weight as the weight of the third intra-frame prediction mode
  • In some embodiments, the second determination unit 22 is specifically configured to, if the first intra prediction mode and the second intra prediction mode meet a first preset condition, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the prediction unit 23 is further configured to, if the first preset condition is not satisfied, determine the first prediction value of the current block using one of the first intra prediction mode and the second intra prediction mode.
  • In some embodiments, the first preset condition is that neither the first intra prediction mode nor the second intra prediction mode is the Planar or DC mode, and the second amplitude value corresponding to the second intra prediction mode is not zero.
  • In some embodiments, the first determination unit 21 is specifically configured to obtain the decoder-side intra mode derivation (DIMD) permission flag, where the DIMD permission flag is used to indicate whether the current sequence is allowed to use the DIMD technology; and if the DIMD permission flag indicates that the current sequence is allowed to use the DIMD technology, determine the amplitude values of the N intra prediction modes.
  • In some embodiments, the first determination unit 21 is specifically configured to, if the DIMD permission flag indicates that the current sequence is allowed to use the DIMD technology, determine the amplitude values of the N intra prediction modes in the adjacent reconstructed sample area of the current block.
  • In some embodiments, the second determination unit 22 is specifically configured to determine the intra prediction mode with the largest amplitude value among the N intra prediction modes as the first intra prediction mode, and determine the intra prediction mode with the second largest amplitude value among the N intra prediction modes as the second intra prediction mode.
  • the predicting unit 23 is specifically configured to determine a first encoding cost corresponding to the first predictive value according to the first predictive value;
  • the candidate prediction set includes other intra-frame prediction modes in the N intra-frame prediction modes except the first intra-frame prediction mode and the second intra-frame prediction mode.
  • the prediction unit 23 is further configured to set the second flag to be true and then write the code stream if the first encoding cost is the minimum encoding cost of the first encoding cost and the second encoding cost;
  • or, if the first encoding cost is not the minimum encoding cost of the first encoding cost and the second encoding cost, set the second flag to false and then write it into the code stream;
  • the second flag is used to indicate whether the current block determines a target predictive value through at least one of the first intra-frame prediction mode, the second intra-frame prediction mode, and the third intra-frame prediction mode.
  • In some embodiments, the second flag is a decoder-side intra mode derivation (DIMD) enable flag.
  • In some embodiments, the second determination unit 22 is further configured to determine the type of the current frame where the current block is located; and if the type of the current frame is the target frame type, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the second determination unit 22 is further configured to determine the type of the current frame where the current block is located and the size of the current block;
  • if the type of the current frame is the first frame type and the size of the current block is greater than the first threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes;
  • or, if the type of the current frame is the second frame type and the size of the current block is greater than the second threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • In some embodiments, the second determination unit 22 is further configured to determine the quantization parameter corresponding to the current block; and if the quantization parameter is smaller than a third threshold, determine the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes.
  • the first flag multiplexes a third flag of the current sequence
  • the third flag is a sequence-level intra block copy IBC enable flag or a template matching prediction TMP enable flag.
  • the prediction unit 23 is further configured to write the first flag into the code stream.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • It should be understood that the device 20 shown in FIG. 13 may correspond to the corresponding subject executing the intra prediction method of the embodiment of the present application, and the foregoing and other operations and/or functions of each unit in the device 20 implement the corresponding processes of the encoding method and other methods; for brevity, they are not repeated here.
  • the functional unit may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software units.
  • Each step of the method embodiments of the present application can be completed by an integrated logic circuit of hardware in a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of the hardware and software units in a decoding processor.
  • the software unit may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Fig. 14 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 30 may be the video encoder or video decoder described in the embodiment of the present application, and the electronic device 30 may include:
  • a memory 33 and a processor 32 the memory 33 is used to store a computer program 34 and transmit the program code 34 to the processor 32 .
  • the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
  • the processor 32 can be used to execute the steps in the above-mentioned method 200 according to the instructions in the computer program 34 .
  • the processor 32 may include, but is not limited to:
  • a digital signal processor (DSP);
  • an application-specific integrated circuit (ASIC);
  • a field-programmable gate array (FPGA).
  • the memory 33 includes but is not limited to:
  • The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • By way of example but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM).
  • the computer program 34 can be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the present application.
  • the one or more units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30 .
  • the electronic device 30 may also include:
  • a transceiver 33, where the transceiver 33 can be connected to the processor 32 or the memory 33.
  • the processor 32 can control the transceiver 33 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 33 may include a transmitter and a receiver.
  • the transceiver 33 may further include antennas, and the number of antennas may be one or more.
  • The bus system includes not only a data bus but also a power bus, a control bus, and a status signal bus.
  • Fig. 15 is a schematic block diagram of a video codec system provided by an embodiment of the present application.
  • the video codec system 40 may include: a video encoder 41 and a video decoder 42, wherein the video encoder 41 is used to execute the video encoding method involved in the embodiment of the present application, and the video decoder 42 is used to execute The video decoding method involved in the embodiment of the present application.
  • the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the present application also provides a code stream, which is generated by the above coding method.
  • the code stream includes the first flag.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

Abstract

The present application provides an intra prediction method, device, system, and storage medium. A first intra prediction mode and a second intra prediction mode of the current block are determined from the amplitude values of N intra prediction modes; a weighted fusion condition of the current block is determined according to the amplitude values of the first intra prediction mode and the second intra prediction mode; and a target prediction value of the current block is determined according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode, and a third intra prediction mode. That is, the present application determines the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes and, on the basis of the determined condition, judges whether to perform weighted fusion prediction on the current block. This avoids the loss of prediction quality and the unnecessary noise introduced when weighted fusion prediction is applied to image content that does not need it, thereby improving the accuracy of intra prediction.

Description

Intra prediction method, device, system, and storage medium
Technical Field
The present application relates to the technical field of video coding and decoding, and in particular to an intra prediction method, device, system, and storage medium.
Background
Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, or video players. With the development of video technology, video data involves a large amount of data; to facilitate its transmission, video devices apply video compression technology so that video data can be transmitted or stored more efficiently.
Video is compressed through encoding, and the encoding process includes prediction, transform, and quantization. For example, a prediction block of the current block is determined through intra prediction and/or inter prediction; the prediction block is subtracted from the current block to obtain a residual block; the residual block is transformed to obtain transform coefficients; the transform coefficients are quantized to obtain quantized coefficients; and the quantized coefficients are encoded to form a code stream.
To improve the accuracy of intra prediction, two or more intra prediction modes can be used to perform weighted fusion prediction on the current block to obtain its prediction value. In some cases, weighted fusion prediction improves the prediction effect, but in other cases it instead degrades prediction quality; therefore, before weighted fusion, whether to perform it must be judged on the basis of a weighted fusion condition. The setting of the weighted fusion condition therefore directly affects the accuracy of intra prediction.
Summary
Embodiments of the present application provide an intra prediction method, device, system, and storage medium, in which the weighted fusion condition is determined from the amplitude values of intra prediction modes; performing intra prediction on the basis of this weighted fusion condition can improve the intra prediction effect.
In a first aspect, the present application provides an intra prediction method, including:
decoding a code stream, determining amplitude values of N intra prediction modes corresponding to the reconstructed area around a current block, and determining a first intra prediction mode and a second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1;
determining a weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether the current block is subjected to weighted prediction through the first intra prediction mode, the second intra prediction mode, and a third intra prediction mode; and
determining a target prediction value of the current block according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode.
In a second aspect, an embodiment of the present application provides an intra prediction method, including:
determining amplitude values of N intra prediction modes corresponding to the reconstructed area around a current block, and determining a first intra prediction mode and a second intra prediction mode of the current block according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1;
determining a weighted fusion condition of the current block according to the amplitude values of the first intra prediction mode and the second intra prediction mode, where the weighted fusion condition is used to judge whether the current block is subjected to weighted prediction through the first intra prediction mode, the second intra prediction mode, and a third intra prediction mode;
determining a first prediction value of the current block according to the weighted fusion condition and at least one of the first intra prediction mode, the second intra prediction mode, and the third intra prediction mode; and
determining a target prediction value of the current block according to the first prediction value.
In a third aspect, the present application provides an intra prediction device for executing the method of the first aspect or its implementations. Specifically, the device includes functional units for executing the method of the first aspect or its implementations.
In a fourth aspect, the present application provides an intra prediction device for executing the method of the second aspect or its implementations. Specifically, the device includes functional units for executing the method of the second aspect or its implementations.
In a fifth aspect, a video encoder is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of the first aspect or its implementations.
In a sixth aspect, a video decoder is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of the second aspect or its implementations.
In a seventh aspect, a video coding and decoding system is provided, including a video encoder and a video decoder. The video encoder is used to execute the method of the first aspect or its implementations, and the video decoder is used to execute the method of the second aspect or its implementations.
In an eighth aspect, a chip is provided for implementing the method of any one of the first to second aspects or their implementations. Specifically, the chip includes a processor, used to call and run a computer program from a memory, so that a device on which the chip is installed executes the method of any one of the first to second aspects or their implementations.
In a ninth aspect, a computer-readable storage medium is provided for storing a computer program, where the computer program causes a computer to execute the method of any one of the first to second aspects or their implementations.
In a tenth aspect, a computer program product is provided, including computer program instructions that cause a computer to execute the method of any one of the first to second aspects or their implementations.
In an eleventh aspect, a computer program is provided, which, when run on a computer, causes the computer to execute the method of any one of the first to second aspects or their implementations.
In a twelfth aspect, a code stream is provided, generated by any one of the methods of the first aspect or its implementations.
Based on the above technical solutions, the amplitude values of N intra prediction modes corresponding to the reconstructed area around the current block are determined, and the first intra prediction mode and the second intra prediction mode of the current block are determined according to the amplitude values of the N intra prediction modes, where N is a positive integer greater than 1; the weighted fusion condition of the current block is determined according to the amplitude values of the first and second intra prediction modes, and is used to judge whether the current block is subjected to weighted prediction through the first, second, and third intra prediction modes; and the target prediction value of the current block is determined according to the weighted fusion condition and at least one of the first, second, and third intra prediction modes. That is, the present application determines the weighted fusion condition of the current block according to the amplitude values of the first and second intra prediction modes and, on the basis of the determined condition, judges whether to perform weighted fusion prediction on the current block. This avoids the loss of prediction quality and the unnecessary noise introduced when weighted fusion prediction is applied to image content that does not need it, thereby improving the accuracy of intra prediction.
Brief Description of the Drawings
FIG. 1 is a schematic block diagram of a video coding and decoding system according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a video encoder according to an embodiment of the present application;
FIG. 3 is a schematic block diagram of a video decoder according to an embodiment of the present application;
FIG. 4 is a schematic diagram of intra prediction modes;
FIG. 5 is a schematic diagram of intra prediction modes;
FIG. 6 is a schematic diagram of intra prediction modes;
FIG. 7A is a schematic diagram of the DIMD template;
FIG. 7B is a histogram of amplitude values versus angular modes;
FIG. 7C is a schematic diagram of DIMD prediction;
FIG. 8 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application;
FIG. 11 is a schematic flowchart of an intra prediction method provided by an embodiment of the present application;
FIG. 12 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application;
FIG. 13 is a schematic block diagram of an intra prediction device provided by an embodiment of the present application;
FIG. 14 is a schematic block diagram of an electronic device provided by an embodiment of the present application;
FIG. 15 is a schematic block diagram of a video coding and decoding system provided by an embodiment of the present application.
Detailed Description
The present application can be applied to the fields of image coding and decoding, video coding and decoding, hardware video coding and decoding, dedicated-circuit video coding and decoding, real-time video coding and decoding, and so on. For example, the solutions of the present application can be combined with the audio video coding standard (AVS), e.g., the H.264/audio video coding (AVC) standard, the H.265/high efficiency video coding (HEVC) standard, and the H.266/versatile video coding (VVC) standard. Alternatively, the solutions of the present application can operate in combination with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions. It should be understood that the techniques of the present application are not limited to any particular coding standard or technology.
For ease of understanding, the video coding and decoding system according to the embodiments of the present application is first introduced with reference to FIG. 1.
FIG. 1 is a schematic block diagram of a video coding and decoding system according to an embodiment of the present application. It should be noted that FIG. 1 is only an example; the video coding and decoding system of the embodiments of the present application includes but is not limited to what is shown in FIG. 1. As shown in FIG. 1, the video coding and decoding system 100 includes an encoding device 110 and a decoding device 120. The encoding device is used to encode (which can be understood as compress) video data to generate a code stream and transmit the code stream to the decoding device. The decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
The encoding device 110 of the embodiments of the present application can be understood as a device with a video encoding function, and the decoding device 120 as a device with a video decoding function; that is, the encoding device 110 and the decoding device 120 cover a wide range of apparatuses, including, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, and in-vehicle computers.
In some embodiments, the encoding device 110 can transmit the encoded video data (e.g., the code stream) to the decoding device 120 via a channel 130. The channel 130 can include one or more media and/or devices capable of transmitting the encoded video data from the encoding device 110 to the decoding device 120.
In one example, the channel 130 includes one or more communication media that enable the encoding device 110 to transmit the encoded video data directly to the decoding device 120 in real time. In this example, the encoding device 110 can modulate the encoded video data according to a communication standard and transmit the modulated video data to the decoding device 120. The communication media include wireless communication media, such as the radio-frequency spectrum; optionally, they can also include wired communication media, such as one or more physical transmission lines.
In another example, the channel 130 includes a storage medium that can store the video data encoded by the encoding device 110. The storage media include a variety of locally accessible data storage media, such as optical discs, DVDs, and flash memory. In this example, the decoding device 120 can obtain the encoded video data from the storage medium.
In another example, the channel 130 can include a storage server that can store the video data encoded by the encoding device 110. In this example, the decoding device 120 can download the stored encoded video data from the storage server. Optionally, the storage server can store the encoded video data and transmit it to the decoding device 120; it may be, for example, a web server (e.g., for a website) or a file transfer protocol (FTP) server.
In some embodiments, the encoding device 110 includes a video encoder 112 and an output interface 113. The output interface 113 can include a modulator/demodulator (modem) and/or a transmitter.
In some embodiments, in addition to the video encoder 112 and the output interface 113, the encoding device 110 can also include a video source 111.
The video source 111 can include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface, and a computer graphics system, where the video input interface is used to receive video data from a video content provider and the computer graphics system is used to generate video data.
The video encoder 112 encodes the video data from the video source 111 to generate a code stream. The video data can include one or more pictures or a sequence of pictures. The code stream contains the encoding information of a picture or picture sequence in the form of a bitstream. The encoding information can include encoded picture data and associated data. The associated data can include a sequence parameter set (SPS), a picture parameter set (PPS), and other syntax structures. An SPS can contain parameters applied to one or more sequences. A PPS can contain parameters applied to one or more pictures. A syntax structure refers to a set of zero or more syntax elements arranged in a specified order in the code stream.
The video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113. The encoded video data can also be stored on a storage medium or storage server for subsequent reading by the decoding device 120.
In some embodiments, the decoding device 120 includes an input interface 121 and a video decoder 122.
In some embodiments, in addition to the input interface 121 and the video decoder 122, the decoding device 120 can also include a display device 123.
The input interface 121 includes a receiver and/or a modem. The input interface 121 can receive the encoded video data through the channel 130.
The video decoder 122 is used to decode the encoded video data to obtain decoded video data and transmit the decoded video data to the display device 123.
The display device 123 displays the decoded video data. The display device 123 can be integrated with the decoding device 120 or external to it. The display device 123 can include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.
In addition, FIG. 1 is only an example, and the technical solutions of the embodiments of the present application are not limited to FIG. 1; for example, the techniques of the present application can also be applied to single-sided video encoding or single-sided video decoding.
The video coding framework according to the embodiments of the present application is introduced below.
FIG. 2 is a schematic block diagram of a video encoder according to an embodiment of the present application. It should be understood that the video encoder 200 can be used for lossy compression of pictures as well as for lossless compression of pictures. The lossless compression can be visually lossless compression or mathematically lossless compression.
The video encoder 200 can be applied to picture data in luminance-chrominance (YCbCr, YUV) format. For example, the YUV ratio can be 4:2:0, 4:2:2, or 4:4:4, where Y denotes luminance (Luma), Cb (U) denotes blue chrominance, Cr (V) denotes red chrominance, and U and V together denote chrominance (Chroma), which describes color and saturation. In terms of color format, 4:2:0 means that every 4 pixels have 4 luminance components and 2 chrominance components (YYYYCbCr), 4:2:2 means that every 4 pixels have 4 luminance components and 4 chrominance components (YYYYCbCrCbCr), and 4:4:4 means full-pixel display (YYYYCbCrCbCrCbCrCbCr).
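As an illustrative aside (a worked example rather than part of the original text), the chroma subsampling ratios translate into plane sizes as follows:

```python
def yuv420_plane_sizes(width: int, height: int) -> tuple[int, int, int]:
    """For 4:2:0 sampling, each chroma plane has half the width and half
    the height of the luma plane."""
    y = width * height
    cb = cr = (width // 2) * (height // 2)
    return y, cb, cr

# e.g. a 1920x1080 picture: 2073600 luma samples and 518400 samples per chroma plane
```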
For example, the video encoder 200 reads video data and, for each picture in the video data, divides the picture into several coding tree units (CTUs); in some examples, a CTB can be called a "tree block", a "largest coding unit" (LCU), or a "coding tree block" (CTB). Each CTU can be associated with a pixel block of equal size within the picture. Each pixel can correspond to one luminance (luma) sample and two chrominance (chroma) samples; thus, each CTU can be associated with one luma sample block and two chroma sample blocks. A CTU size is, for example, 128×128, 64×64, or 32×32. A CTU can be further divided into several coding units (CUs) for coding; a CU can be a rectangular or square block. A CU can be further divided into prediction units (PUs) and transform units (TUs), so that coding, prediction, and transform are separated and processing is more flexible. In one example, a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
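A toy sketch of the quadtree division just described (the split decision is a stand-in for the encoder's actual rate-distortion choice):

```python
def quadtree_leaf_blocks(x, y, size, min_size, should_split):
    """Recursively divide a CTU at (x, y) into CU leaves; `should_split`
    is a placeholder callback for the encoder's split decision."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree_leaf_blocks(x + dx, y + dy, half, min_size, should_split)
    return blocks
```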
视频编码器及视频解码器可支持各种PU大小。假定特定CU的大小为2N×2N,视频编码器及视频解码器可支持2N×2N或N×N的PU大小以用于帧内预测,且支持2N×2N、2N×N、N×2N、N×N或类似大小的对称PU以用于帧间预测。视频编码器及视频解码器还可支持2N×nU、2N×nD、nL×2N及nR×2N的不对称PU以用于帧间预测。
在一些实施例中,如图2所示,该视频编码器200可包括:预测单元210、残差单元220、变换/量化单元230、反变换/量化单元240、重建单元250、环路滤波单元260、解码图像缓存270和熵编码单元280。需要说明的是,视频编码器200可包含更多、更少或不同的功能组件。
可选的,在本申请中,当前块(current block)可以称为当前编码单元(CU)或当前预测单元(PU)等。预测块也可称为预测图像块或图像预测块,重建图像块也可称为重建块或图像重建图像块。
在一些实施例中,预测单元210包括帧间预测单元211和帧内估计单元212。由于视频的一个帧中的相邻像素之间存在很强的相关性,在视频编解码技术中使用帧内预测的方法消除相邻像素之间的空间冗余。由于视频中的相邻帧之间存在着很强的相似性,在视频编解码技术中使用帧间预测方法消除相邻帧之间的时间冗余,从而提高编码效率。
帧间预测单元211可用于帧间预测,帧间预测可以参考不同帧的图像信息,帧间预测使用运动信息从参考帧中找到参考块,根据参考块生成预测块,用于消除时间冗余;帧间预测所使用的帧可以为P帧和/或B帧,P帧指的是向前预测帧,B帧指的是双向预测帧。运动信息包括参考帧所在的参考帧列表,参考帧索引,以及运动矢量。运动矢量可以是整像素的或者是分像素的,如果运动矢量是分像素的,那么需要在参考帧中使用插值滤波做出所需的分像素的块,这里把根据运动矢量找到的参考帧中的整像素或者分像素的块叫参考块。有的技术会直接把参考块作为预测块,有的技术会在参考块的基础上再处理生成预测块。在参考块的基础上再处理生成预测块也可以理解为把参考块作为预测块然后再在预测块的基础上处理生成新的预测块。
帧内估计单元212只参考同一帧图像的信息,预测当前图像块内的像素信息,用于消除空间冗余。帧内预测所使用的帧可以为I帧。例如对于4×4的当前块,当前块左边一列和上面一行的像素为当前块的参考像素,帧内预测使用这些参考像素对当前块进行预测。这些参考像素可能已经全部可得,即全部已经编解码。也可能有部分不可得,比如当前块是整帧的最左侧,那么当前块的左边的参考像素不可得。或者编解码当前块时,当前块左下方的部分还没有编解码,那么左下方的参考像素也不可得。对于参考像素不可得的情况,可以使用可得的参考像素或某些值或某些方法进行填充,或者不进行填充。
帧内预测有多种预测模式,例如,图4为帧内预测模式的示意图,如图4所示,HEVC使用的帧内预测模式有Planar、DC和33种角度模式共35种预测模式。图5为帧内预测模式的示意图,如图5所示,VVC使用的帧内模式有Planar、DC和65种角度模式共67种预测模式。图6为帧内预测模式的示意图,如图6所示,AVS3使用DC、Planar、Bilinear和63种角度模式共66种预测模式。
需要说明的是,随着角度模式的增加,帧内预测将会更加精确,也更加符合对高清以及超高清数字视频发展的需求。
残差单元220可基于CU的像素块及CU的PU的预测块来产生CU的残差块。举例来说,残差单元220可产生CU的残差块,使得残差块中的每一采样具有等于以下两者之间的差的值:CU的像素块中的采样,及CU的PU的预测块中的对应采样。
变换/量化单元230可量化变换系数。变换/量化单元230可基于与CU相关联的量化参数(QP)值来量化与CU的TU相关联的变换系数。视频编码器200可通过调整与CU相关联的QP值来调整应用于与CU相关联的变换系数的量化程度。
反变换/量化单元240可分别将逆量化及逆变换应用于量化后的变换系数,以从量化后的变换系数重建残差块。
重建单元250可将重建后的残差块的采样加到预测单元210产生的一个或多个预测块的对应采样,以产生与TU相关联的重建图像块。通过此方式重建CU的每一个TU的采样块,视频编码器200可重建CU的像素块。
环路滤波单元260可执行消块滤波操作以减少与CU相关联的像素块的块效应。
在一些实施例中,环路滤波单元260包括去块滤波单元和样点自适应补偿/自适应环路滤波(SAO/ALF)单元,其中去块滤波单元用于去方块效应,SAO/ALF单元用于去除振铃效应。
解码图像缓存270可存储重建后的像素块。帧间预测单元211可使用含有重建后的像素块的参考图像来对其它图像的PU执行帧间预测。另外,帧内估计单元212可使用解码图像缓存270中的重建后的像素块来对在与CU相同的图像中的其它PU执行帧内预测。
熵编码单元280可接收来自变换/量化单元230的量化后的变换系数。熵编码单元280可对量化后的变换系数执行一个或多个熵编码操作以产生熵编码后的数据。
图3是本申请实施例涉及的视频解码器的示意性框图。
如图3所示,视频解码器300包含:熵解码单元310、预测单元320、反量化/变换单元330、重建单元340、环路滤波单元350及解码图像缓存360。需要说明的是,视频解码器300可包含更多、更少或不同的功能组件。
视频解码器300可接收码流。熵解码单元310可解析码流以从码流提取语法元素。作为解析码流的一部分,熵解码单元310可解析码流中的经熵编码后的语法元素。预测单元320、反量化/变换单元330、重建单元340及环路滤波单元350可根据从码流中提取的语法元素来解码视频数据,即产生解码后的视频数据。
在一些实施例中,预测单元320包括帧间预测单元321和帧内估计单元322。
帧内估计单元322(也称为帧内预测单元)可执行帧内预测以产生PU的预测块。帧内估计单元322可使用帧内预测模式以基于空间相邻PU的像素块来产生PU的预测块。帧内估计单元322还可根据从码流解析的一个或多个语法元素来确定PU的帧内预测模式。
帧间预测单元321可根据从码流解析的语法元素来构造第一参考图像列表(列表0)及第二参考图像列表(列表1)。此外,如果PU使用帧间预测编码,则熵解码单元310可解析PU的运动信息。帧间预测单元321可根据PU的运动信息来确定PU的一个或多个参考块。帧间预测单元321可根据PU的一个或多个参考块来产生PU的预测块。
反量化/变换单元330(也称为反量化/变换单元)可逆量化(即,解量化)与TU相关联的变换系数。反量化/变换单元330可使用与TU的CU相关联的QP值来确定量化程度。
在逆量化变换系数之后,反量化/变换单元330可将一个或多个逆变换应用于逆量化变换系数,以便产生与TU相关联的残差块。
重建单元340使用与CU的TU相关联的残差块及CU的PU的预测块以重建CU的像素块。例如,重建单元340可将残差块的采样加到预测块的对应采样以重建CU的像素块,得到重建图像块。
环路滤波单元350可执行消块滤波操作以减少与CU相关联的像素块的块效应。
视频解码器300可将CU的重建图像存储于解码图像缓存360中。视频解码器300可将解码图像缓存360中的重建图像作为参考图像用于后续预测,或者,将重建图像传输给显示装置呈现。
由上述图2和图3可知,视频编解码的基本流程如下:在编码端,将一帧图像划分成块,对当前块,预测单元210使用帧内预测或帧间预测产生当前块的预测块。残差单元220可基于预测块与当前块的原始块计算残差块,例如将当前块的原始块减去预测块得到残差块,该残差块也可称为残差信息。该残差块经由变换/量化单元230变换与量化等过程,可以去除人眼不敏感的信息,以消除视觉冗余。可选的,经过变换/量化单元230变换与量化之前的残差块可称为时域残差块,经过变换/量化单元230变换与量化之后的时域残差块可称为频率残差块或频域残差块。熵编码单元280接收到变换/量化单元230输出的量化后的变换系数,可对该量化后的变换系数进行熵编码,输出码流。例如,熵编码单元280可根据目标上下文模型以及二进制码流的概率信息消除字符冗余。
在解码端,熵解码单元310可解析码流得到当前块的预测信息、量化系数矩阵等,预测单元320基于预测信息对当前块使用帧内预测或帧间预测产生当前块的预测块。反量化/变换单元330使用从码流得到的量化系数矩阵,对量化系数矩阵进行反量化、反变换得到残差块。重建单元340将预测块和残差块相加得到重建块。重建块组成重建图像,环路滤波单元350基于图像或基于块对重建图像进行环路滤波,得到解码图像。编码端同样需要和解码端类似的操作获得解码图像。该解码图像也可以称为重建图像,重建图像可以为后续的帧作为帧间预测的参考帧。
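上述“预测-残差-重建”的关系可以用下面这个极简的Python玩具示例直观说明(仅为示意:用一个量化步长q代替真实的变换、量化与熵编码过程,并非任何编解码器的实际实现):

```python
# 编码端:残差 = 原始块 - 预测块,再做(此处被极度简化的)变换与量化
def encode_block(orig, pred, q=8):
    resi = [o - p for o, p in zip(orig, pred)]
    return [round(r / q) for r in resi]          # 量化后的"系数"

# 解码端:反量化得到重建残差,重建 = 预测 + 重建残差
def decode_block(levels, pred, q=8):
    return [p + l * q for p, l in zip(pred, levels)]

orig = [100, 102, 98, 101]
pred = [99, 100, 99, 100]
recon = decode_block(encode_block(orig, pred), pred)
print(recon)  # 残差很小且被量化为0,故重建结果等于预测值,体现量化失真
```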
需要说明的是,编码端确定的块划分信息,以及预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息等在必要时携带在码流中。解码端通过解析码流及根据已有信息进行分析确定与编码端相同的块划分信息,预测、变换、量化、熵编码、环路滤波等模式信息或者参数信息,从而保证编码端获得的解码图像和解码端获得的解码图像相同。
当前块(current block)可以是当前编码单元(CU)或当前预测单元(PU)等。
上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。
由上述可知编码时,通用的混合编码框架会先进行预测,预测利用空间或者时间上的相关性能得到一个跟当前块相同或相似的图像。对一个块来说,预测块和当前块是完全相同的情况是有可能出现的,但是很难保证一个视频中的所有块都如此,特别是对自然视频,或者说相机拍摄的视频,因为有噪音的存在。而且视频中不规则的运动,扭曲形变,遮挡,亮度等的变化,很难被完全预测。所以混合编码框架会将当前块的原始图像减去预测图像得到残差图像,或者说当前块减去预测块得到残差块。残差块通常要比原始图像简单很多,因而预测可以显著提升压缩效率。对残差块也不是直接进行编码,而是通常先进行变换。变换是把残差图像从空间域变换到频率域,去除残差图像的相关性。残差图像变换到频率域以后,由于能量大多集中在低频区域,变换后的非零系数大多集中在左上角。接下来利用量化来进一步压缩。而且由于人眼对高频不敏感,高频区域可以使用更大的量化步长。
国际视频编码标准制定组织JVET已成立超越H.266/VVC编码模型研究的小组,并将该模型,即平台测试软件,命名为增强的压缩模型(Enhanced Compression Model,简称ECM)。ECM在VTM10.0的基础上开始接收更新和更高效的压缩算法,目前已超越VVC约13%的编码性能。ECM不仅扩大了特定分辨率的编码单元尺寸,同时也集成了许多帧内预测和帧间预测技术,本申请主要涉及帧内预测技术。
ECM是进一步提高VVC性能的工具以及工具间的组合的参考软件,它基于VTM-10.0,集成EE采纳的工具和技术。
在ECM的帧内编码中,与VTM(VVC的参考软件测试平台)类似的,有传统的帧内预测,残差的变换等过程。与VVC不同的是在帧内预测环节中,采纳了两项用于导出帧内预测模式的技术,分别是解码器端帧内模式导出(Decoder-side Intra Mode Derivation,DIMD)和基于模板的帧内模式导出(Template-based Intra Mode Derivation,TIMD)。
DIMD和TIMD技术可在解码端导出帧内预测模式,从而省去编码帧内预测模式的索引,以达到节省码字的作用。
DIMD技术的执行过程主要包括如下两个步骤:
第一步,导出帧内预测模式,在编解码端使用同样的预测模式强度计算方法。如图7A所示,DIMD以当前块周边已重建的像素点为模板,通过sobel算子在模板(Template)上的每个3X3区域上扫描并计算水平方向和竖直方向的梯度,根据水平和竖直方向上求得梯度Dx和Dy。再根据Dx和Dy求得每个位置上的幅度值amp=Dx+Dy,和角度值angular=arctan(Dy/Dx)。根据模板上每个位置的角度对应到传统的角度预测模式,累加相同角度模式的幅度值,得到如图7B所示的幅度值与角度模式的直方图。将图7B所示的直方图中幅度值最大的预测模式确定为第一帧内预测模式,将幅度值次大的预测模式确定为第二帧内预测模式。
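第一步的直方图统计与模式选取可以用如下Python示意来帮助理解(假设性示例:假定每个3X3位置的梯度Dx、Dy已由sobel算子求出;角度到角度预测模式的映射在参考软件中按查表精确进行,此处用均匀量化代替,模式编号从2开始也仅是本文假设):

```python
import math

def angle_to_mode(dx: float, dy: float, num_angular: int = 65) -> int:
    ang = math.atan2(dy, dx)                    # 对应文中 angular = arctan(Dy/Dx)
    idx = int((ang + math.pi) / (2 * math.pi) * num_angular)
    return 2 + min(idx, num_angular - 1)        # 假设角度模式编号为 2..num_angular+1

def derive_modes(gradients):
    """gradients: [(Dx, Dy), ...],且至少映射到两个不同模式;
    返回 (第一帧内预测模式, amp1, 第二帧内预测模式, amp2)。"""
    hist = {}
    for dx, dy in gradients:
        amp = abs(dx) + abs(dy)                 # 该位置的幅度值(此处取绝对值之和)
        mode = angle_to_mode(dx, dy)
        hist[mode] = hist.get(mode, 0) + amp    # 累加相同角度模式的幅度值
    (m1, a1), (m2, a2) = sorted(hist.items(), key=lambda kv: kv[1], reverse=True)[:2]
    return m1, a1, m2, a2
```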
第二步,导出预测块,在编解码端使用同样的预测块导出方式得到当前预测块。以ECM2.0为例,编码端判断以下两个条件,
条件1、第二预测模式的梯度不为0;
条件2、第一预测模式和第二预测模式均不为Planar或者DC预测模式。
若上述两个条件不同时成立,则仅使用第一帧内预测模式计算当前块的预测样本值。否则,即上述两个条件均成立,则使用加权求平均方式导出当前预测块。具体方法为,第一帧内预测模式、第二帧内预测模式和Planar模式的预测值通过加权可得到DIMD最终的预测结果,具体过程如图7C所示。示例性的,加权计算的过程如公式(1)和(2)所示:
w1=(1-w0)×amp1/(amp1+amp2),w2=(1-w0)×amp2/(amp1+amp2)      (1)
Pred=Pred_planar×w0+Pred_mode1×w1+Pred_mode2×w2     (2)
其中,w0,w1,w2分别是分配到Planar模式、第一帧内预测模式和第二帧内预测模式的权重,Pred_planar为Planar模式对应的预测值,Pred_mode1为第一帧内预测模式对应的预测值,Pred_mode2为第二帧内预测模式对应的预测值,Pred为DIMD对应的加权预测值,amp1为第一帧内预测模式对应的幅度值、amp2为第二帧内预测模式对应的幅度值。
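公式(1)(2)所描述的加权过程可以用如下Python示意来说明(假设性示例:w0取1/3仅为本文假设的预设权重,预测块以“列表的列表”表示):

```python
def dimd_blend(pred_planar, pred_mode1, pred_mode2, amp1, amp2, w0=1.0 / 3):
    s = amp1 + amp2
    w1 = (1 - w0) * amp1 / s        # 公式(1):剩余权重按幅度值比例分配
    w2 = (1 - w0) * amp2 / s
    # 公式(2):Pred = Pred_planar*w0 + Pred_mode1*w1 + Pred_mode2*w2
    return [[pp * w0 + p1 * w1 + p2 * w2
             for pp, p1, p2 in zip(rp, r1, r2)]
            for rp, r1, r2 in zip(pred_planar, pred_mode1, pred_mode2)]
```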
在一些实施例中,DIMD需要传输一个标志位到解码端来表示当前编码单元是否使用DIMD技术。
由上述可知,DIMD技术的预测模式导出能够减轻一定的语法元素传输负担,使得原本需要至少5个比特或更多的预测模式开销节省到1个比特。而且DIMD在获得预测模式信息后通过融合操作,将最优预测模式、次优预测模式以及Planar模式对应的预测块进行融合,产生新的预测块。该新的预测块既不是前述任一种预测模式所能够预测得到的,也不存在后续预测工具中能够得到一样的预测块。通过实验对比,可以发现该融合技术确实提升了预测效率。
但是,通过加权融合得到的预测值适用于自然场景的视频内容,却不适用于特定场景下的视频内容。前者视频内容中的物体通常都有较为模糊的边缘以及拍摄所产生的一些噪声,DIMD的融合技术能够得到更匹配于这些物体的预测值。而后者视频内容中的物体一般都有较为锐化和颜色鲜明的特性,这些视频内容通常为电脑录制或称为屏幕内容视频,DIMD的融合技术所产生的预测值在此类内容中显得多余并降低了预测质量,可以说是带来了噪声。也就是说,在一些情况下,使用加权融合预测可以提高预测效果,但在一些情况下,使用加权融合预测反而会降低预测质量,因此,在加权融合之前,需要基于加权融合条件判断是否进行加权融合。由此可知,加权融合条件的设定直接影响到帧内预测的准确性。但是,目前的加权融合条件过于宽泛,导致不需要加权融合的图像内容也进行加权融合,造成预测质量差。
为了解决上述技术问题,本申请根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,并基于确定的加权融合条件判断是否对当前块进行加权融合预测,可以避免对不需要加权融合预测的图像内容进行加权融合预测时,降低预测质量,引入不必要噪声的问题,进而提高了帧内预测的准确性。
需要说明的是,本申请实施例提供的帧内预测方法,除了可以应用于上述DIMD技术中外,还可以应用于任意允许采用两种或多种帧内预测模式进行加权融合预测的场景中。
下面结合具体的实施例,对本申请实施例提供的视频编解码方法进行介绍。
首先结合图8,以解码端为例,对本申请实施例提供的视频解码方法进行介绍。
图8为本申请实施例提供的帧内预测方法的一种流程示意图。本申请实施例应用于图1和图2所示视频解码器。如图8所示,本申请实施例的方法包括:
S401、解码码流,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,N为大于1的正整数。
在一些实施例中,当前块也可以称为当前解码块、当前解码单元、解码块、待解码块、待解码的当前块等。
在一些实施例中,当前块包括色度分量不包括亮度分量时,当前块可以称为色度块。
在一些实施例中,当前块包括亮度分量不包括色度分量时,当前块可以称为亮度块。
需要说明的是,视频解码器确定当前块允许采用多种帧内预测模式(例如允许采用DIMD技术)进行融合加权预测时,则视频解码器确定第一帧内预测模式和第二帧内预测模式。在判断该第一帧内预测模式和第二帧内预测模式符合加权融合条件时,则使用第一帧内预测模式对当前块进行预测,得到当前块的第一个预测值,使用第二帧内预测模式对当前块进行预测,得到当前块的第二个预测值,将第一个预测值和第二个预测值进行加权融合,得到当前块的目标预测值。可选的,除了确定上述第一个预测值和第二个预测值外,还可以确定第三个预测值,例如使用第三帧内预测模式对当前块进行预测,得到当前块的第三个预测值,将上述第一个预测值、第二个预测值和第三个预测值进行加权融合,得到当前块的目标预测值。可选的,上述第三帧内预测模式可以是预设的一种帧内预测模式,或者根据其他方式确定的,本申请对此不做限制。
在一些实施例中,若判断第一帧内预测模式和第二帧内预测模式不满足加权融合条件时,则使用第一帧内预测模式和第二帧内预测模式中的一个帧内预测模式,对当前块进行预测,得到当前块的目标预测值。
在一些实施例中,视频解码器确定当前块允许采用多种帧内预测模式进行融合加权预测的方式可以是:视频编码器在码流中携带第二标志,该第二标志用于指示当前块是否通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值。若视频编码器使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值,则将第二标志置为真,例如将第二标志的取值置为1,并将置为真的第二标志写入码流中,例如写入码流头中。这样视频解码器获得码流后,解码该码流,得到第二标志,若该第二标志为真,例如该第二标志的取值为1,则视频解码器确定当前块是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值,此时,视频解码器确定当前块的第一帧内预测模式和第二帧内预测模式。可选的,视频解码器确定第一帧内预测模式和第二帧内预测模式的方式与视频编码器确定第一帧内预测模式和第二帧内预测模式的方式相同。
若视频编码器不是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定当前块的目标预测值时,则将第二标志置为假,例如将第二标志的取值置为0,并将置为假的第二标志写入码流中,例如写入码流头中。视频解码器解码码流,得到第二标志,若该第二标志为假,例如该第二标志的取值为0,则视频解码器不确定当前块的第一帧内预测模式和第二帧内预测模式,而是遍历预设的其他帧内预测模式,确定出代价最小的帧内预测模式对当前块进行预测,得到当前块的目标预测值。
需要说明的是,本申请实施例主要涉及到当前块的目标预测值是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定的,也就是说,本申请主要讨论上述第二标志为真的情况。
在一种可能的实现方式中,若本申请采用DIMD技术,则上述第二标志可以为DIMD使能标志,例如为sps_dimd_enable_flag。也就是说,在本申请实施例中,视频解码器解码码流,首先得到DIMD的允许使用标志位,该DIMD的允许使用标志位为序列级标志位。该DIMD的允许使用标志位用于指示当前序列是否允许使用DIMD技术。若DIMD的允许使用标志位为真时,例如为1时,则确定该当前序列允许使用DIMD技术。接着,视频解码器继续解码码流,得到DIMD使能标志,该DIMD使能标志可以为序列级标志位。该DIMD使能标志用于指示当前块是否使用DIMD技术,若该DIMD使能标志为真,例如为1,则确定当前块使用DIMD技术,此时视频解码器执行上述S401确定当前块的第一帧内预测模式和第二帧内预测模式。
可选的,上述DIMD使能标志还可以是图像级序列,用于指示当前帧图像是否使用DIMD技术。
需要说明的是,DIMD使能标志的真假是由视频编码器确定的并写入码流中,例如,视频编码器采用DIMD技术确定当前块的目标预测值时,则将DIMD使能标志置为真,例如置为1,且写入码流,例如写入码流头。若视频编码器未采用DIMD技术确定当前块的目标预测值时,则将DIMD使能标志置为假,例如置为0,且写入码流,例如写入码流头中。这样视频解码器可以从码流中解析出DIMD使能标志,并根据DIMD使能标志来确定是否使用DIMD技术来确定当前块的目标预测值,进而保证解码端和编码端的一致,保证预测的可靠性。
视频解码器确定出需要通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个来确定当前块的目标预测值时,执行上述S401,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式。
需要说明的是,当前块周围已重建区域可以为当前块周围已重建区域中的任意预设区域。
示例性的,当前块周围已重建区域包括当前块上方m行已重建像素。
示例性的,当前块周围已重建区域包括当前块左侧k列已重建像素。
示例性的,当前块周围已重建区域包括当前块的上方以及左上方的m行已重建像素确定为当前块的模板区域。
示例性的,当前块周围已重建区域包括当前块的上方以及左上方的m行已重建像素、以及当前块左侧k列已重建像素,例如图7A中的L区域。
上述m与k可以相同,也可以不同,本申请对此不做限制。
上述m行像素可以与当前块相邻,也可以不相邻。
上述k列像素可以与当前块相邻,也可以不相邻。
在一些实施例中,确定周围已重建区域对应的N个帧内预测模式的幅度值的过程可以是:首先通过索贝尔(sobel)算子在当前块周围已重建区域上的每个nXn(例如3X3)区域上扫描并计算水平方向和竖直方向的梯度,根据水平和竖直方向上求得梯度Dx和Dy。
示例性的,使用3x3水平sobel滤波器和竖直sobel滤波器,分别计算模板区域上的一个3X3区域的水平梯度Dx和竖直梯度Dy。
例如,根据如下公式(4)计算水平梯度Dx,根据公式(5)计算竖直梯度Dy:
Dx=L_水平*A      (4)
Dy=L_竖直*A      (5)
其中,L_水平为水平sobel滤波器,L_竖直为竖直sobel滤波器,A为模板区域上的一个3X3区域。
根据上述公式可以确定出当前块周围已重建区域上每一个3X3区域的水平方向和竖直方向的梯度。接着,根据Dx和Dy求得每个位置上的幅度值amp=Dx+Dy,和角度值angular=arctan(Dy/Dx)。根据当前块周围已重建区域上每个位置的角度对应到传统的角度预测模式,累加相同角度模式的幅度值,得到如图7B所示的直方图。根据该直方图可以得到N个帧内预测模式。
可选的,可以将直方图中的所有帧内预测模式确定为N个帧内预测模式。
可选的,可以将直方图中幅度值大于一定预设值的帧内预测模式,确定为N个帧内预测模式。
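公式(4)(5)的梯度计算可以用如下Python示意来说明(假设性示例:核系数取常见的sobel定义,具体系数以标准文本或参考软件为准):

```python
import math

L_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # 水平sobel滤波器 L_水平
L_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # 竖直sobel滤波器 L_竖直

def grad_3x3(A):
    """A 为已重建区域上的一个 3X3 样本窗口,返回 (Dx, Dy, amp, angular)。"""
    Dx = sum(L_H[i][j] * A[i][j] for i in range(3) for j in range(3))  # 公式(4)
    Dy = sum(L_V[i][j] * A[i][j] for i in range(3) for j in range(3))  # 公式(5)
    amp = abs(Dx) + abs(Dy)                   # 该位置的幅度值
    angular = math.atan2(Dy, Dx)              # 该位置的角度值 arctan(Dy/Dx)
    return Dx, Dy, amp, angular
```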
在一些实施例中,若本申请应用于DIMD技术中,则上述当前块周围已重建区域为当前块的模板区域,如图7A所示,当前块的模板区域为当前块的周围已重建区域中与当前块相邻的区域。在一些实施例中,当前块的模板区域也称为当前块的相邻重建样本区域。此时,上述S401中确定当前块周围已重建区域对应的N个帧内预测模式的幅度值的过程是,若DIMD使能标志指示当前块使用DIMD技术时,确定当前块的模板区域对应的N个帧内预测模式的幅度值,其中,确定当前块的模板区域对应的N个帧内预测模式的幅度值与上述确定当前块周围已重建区域对应的N个帧内预测模式的幅度值的过程基本相同,只需将当前块周围已重建区域替换为当前块的模板区域即可。
根据上述方法,确定出当前块对应的N个帧内预测模式的幅度值后,基于这N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式。
在一些实施例中,基于这N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,包括如下几种方式:
方式1,将N个帧内预测模式中的任意一个帧内预测模式确定为第一帧内预测模式。将N个帧内预测模式中除第一帧内预测模式外的任意一个帧内预测模式确定为第二帧内预测模式。
方式2,将N个帧内预测模式中幅度值最大的帧内预测模式确定为第一帧内预测模式,将N个帧内预测模式中幅度值次大的帧内预测模式确定为第二帧内预测模式。
根据上述方法,确定出当前块的第一帧内预测模式和第二帧内预测模式后,执行如下S402和S403的步骤。
S402、根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
上述加权融合条件用于判断当前块是否通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式进行加权预测。
在本申请实施例中,视频解码器根据上述S401的步骤得到当前块的第一帧内预测模式和第二帧内预测模式后,并不是直接使用第一帧内预测模式和第二帧内预测模式对当前块进行加权预测,而是需要判断第一帧内预测模式和第二帧内预测模式是否满足当前块的加权融合条件。若第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行加权预测,例如使用第一帧内预测模式对当前块进行预测,得到第一个预测值,使用第二帧内预测模式对当前块进行预测,得到第二个预测值,使用第三帧内预测模式对当前块进行预测,得到第三个预测值,对第一个预测值、第二个预测值和第三个预测值进行加权,得到当前块的目标预测值。其中,第一个预测值、第二个预测值和第三预测值进行加权时的权重,可以根据第一帧内预测模式和第二帧内预测模式对应的幅度值确定。
若第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的一个,对当前块进行预测,得到当前块的目标预测值。例如,使用第一帧内预测模式和第二帧内预测模式中幅度值最大的预测模式对当前块进行预测,得到当前块的目标预测值。可选的,由上述S401的部分描述可知,第一帧内预测模式为N个预测模式中幅度值最大的帧内预测模式,第二帧内预测模式为N个预测模式中幅度值次大的帧内预测模式,也就是说,第一幅度值大于第二幅度值,因此,在第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式对当前块进行预测,得到当前块的目标预测值。
目前的加权融合条件的范围较广,例如,若第一帧内预测模式与第二帧内预测模式均不为Planar以及DC模式,且第二帧内预测模式的幅度值大于0时,即可进行加权融合预测。然而,一些图像内容(例如屏幕录制的图像内容)一般具有较为锐化和颜色鲜明的特性,对这些图像内容采用加权融合预测时,反而会降低预测质量。
为了解决上述技术问题,本申请根据第一帧内预测模式和第二帧内预测模式的幅度值来确定当前块的加权融合条件,该加权融合条件较为严格,可以降低对不适用加权融合预测的图像内容进行加权融合预测、进而降低图像质量的概率。
本申请对上述S402中根据第一帧内预测模式和第二帧内预测模式的幅度值来确定当前块的加权融合条件的方法不做限制,示例性的,包括但不限于如下几种:
方式1,将第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,确定为当前块的加权融合条件。
在该方式1中,若第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值大于或等于预设值1,也就是说第一帧内预测模式对应的幅度值远远大于第二帧内预测模式的幅度值,这说明第一帧内预测模式适用于当前块的概率远远大于第二帧内预测模式。此时,使用第一帧内预测模式对当前块进行预测时,可以达到较优的预测效率。若使用第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测时,反而会带来噪声,降低预测效果。
在该方式1中,若第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,也就是说第一帧内预测模式对应的幅度值与第二帧内预测模式的幅度值的差别不大,这说明第一帧内预测模式和第二帧内预测模式适用于当前块的概率基本相同。此时,只使用第一帧内预测模式对当前块进行预测时,预测效果不是最佳的,因此,可以通过第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测,以提高当前块的预测效果。
本申请对上述预设值1的具体取值不做限制,具体根据实际需要确定。
由上述可知,将第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,确定为当前块的加权融合条件,可以实现对需要加权融合预测的当前块进行加权融合预测,且可以降低对不需要进行加权融合预测的图像内容进行加权融合预测的概率,进而提高了帧内预测的准确性。
方式2,将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定当前块的加权融合条件。
在该方式2中,若第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值大于第一预设阈值,也就是说第一帧内预测模式对应的幅度值远远大于第二帧内预测模式的幅度值,这说明第一帧内预测模式适用于当前块的概率远远大于第二帧内预测模式。此时,使用第一帧内预测模式对当前块进行预测时,可以达到较优的预测效率。若使用第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测时,反而会带来噪声,降低预测效果。
在该方式2中,若第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,也就是说第一帧内预测模式对应的幅度值与第二帧内预测模式的幅度值的差别不大,这说明第一帧内预测模式和第二帧内预测模式适用于当前块的概率基本相同。此时,可以通过第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测,以提高当前块的预测效果。
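方式1与方式2两种加权融合条件的判断逻辑可以用如下Python示意来概括(假设性示例:预设值1与第一预设阈值均为示例参数,文中并未给出具体取值):

```python
def fusion_by_diff(amp1, amp2, preset1):
    # 方式1:第一、第二幅度值之差小于预设值1,才对当前块进行加权融合预测
    return (amp1 - amp2) < preset1

def fusion_by_ratio(amp1, amp2, thr1):
    # 方式2:第一、第二幅度值之比小于或等于第一预设阈值,才进行加权融合预测
    return amp2 > 0 and (amp1 / amp2) <= thr1
```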
在一些实施例中,为了进一步对加权融合过程进行限制,则在执行上述S402之前,首先需要判断第一帧内预测模式和第二帧内预测模式是否满足第一预设条件,本申请实施例对第一预设条件的具体内容不做限制,具体根据实际需要进行确定。示例性的,第一预设条件为第一帧内预测模式和第二帧内预测模式均不是Planar以及DC模式,且第二帧内预测模式对应的第二幅度值不为零。
在该实例中,若第一帧内预测模式和第二帧内预测模式满足第一预设条件时,则执行上述步骤S402,根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
若第一帧内预测模式和第二帧内预测模式不满足第一预设条件时,则不执行上述步骤S402,而是使用第一帧内预测模式和第二帧内预测模式中的一个对当前块进行预测,例如使用第一帧内预测模式对当前块进行预测,得到当前块的目标预测值。
目前的加权融合条件是固定的,也就是说无论图像内容是什么样的,当前块的加权融合条件固定不变。但是,对于一些图像内容,例如屏幕录制的图像内容,一般都有较为锐化和颜色鲜明的特性,对这些图像内容采用加权融合预测时,由于加权融合预测可以理解为一种模糊预测方法,会降低图像中的锐化和颜色鲜明度,进而降低预测质量,带来了噪声。
为了解决上述技术问题,本申请根据图像内容来确定当前块的加权融合条件。也就是说,本申请针对图像内容,提供差异化的加权融合条件,不同的图像内容对应的加权融合条件可以不同,进而保证了对需要进行加权融合预测的图像内容进行加权融合预测,以提高预测准确性。对于不需要进行加权融合预测的图像内容不进行加权融合预测,以避免引入不必要的噪声,保证预测质量。
一个序列包括一系列图像,这一系列图像通常是在同一个环境中产生的,因此,一个序列中的图像的图像内容基本一致。本申请中当前块的图像内容与当前序列的图像内容的类型一致,例如均为屏幕内容,或者摄像头采集的其他内容等,因此可以通过当前序列的图像内容确定出当前块的图像内容。
在一些实施例中,视频解码器可以通过图像识别的方法,得到当前序列的图像内容。例如,视频解码器对当前序列进行解码,首先采用已有的方式,解码出当前序列中的前几帧的重建图像,例如2帧。对前几帧的重建图像进行图像识别,得到前几帧的重建图像的图像内容的类型,将前几帧的重建图像的图像内容的类型作为当前序列的图像内容的类型。在一种示例中,视频解码器对前几帧的重建图像进行图像识别,得到前几帧的重建图像的图像内容的类型方法可以是神经网络模型的方法。例如,该神经网络模型为预先训练好的可以识别出图像内容的类型,视频解码器将前几帧的重建图像输入神经网络模型中,得到神经网络模型输出的前几帧的重建图像的图像内容的类型。可选的,视频解码器还可以采用其他的方式,确定出前几帧的重建图像的图像内容的类型,本申请对此不做限制。
在一些实施例中,视频解码器可以通过码流中的指示信息,得到当前序列的图像内容。例如,视频编码器将当前序列的图像内容的类型通过标志位的方式写入码流,视频解码器解码码流,得到该标志位,并通过该标志位,确定出当前序列的图像内容的类型,例如该标志位的取值为1时,指示当前序列的图像内容为第一图像内容,该标志位的取值为0时,指示当前序列的图像内容为第二图像内容,其中第一图像内容与第二图像内容不同。
根据上述方式,得到当前块对应的图像内容后,根据当前块对应的图像内容,确定当前块的加权融合条件。
在一些实施例中,若当前块对应的图像内容为第一图像内容时,则执行上述S402的步骤,即根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。其中根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件可以根据上述方式1或方式2实现,在此不再赘述。
在一些实施例中,上述S402中根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件包括如下步骤:
S402-A、解码码流,得到第一标志,该第一标志用于指示是否使用第一技术,该第一技术在第一图像内容下使用;
S402-B、若第一标志指示使用第一技术时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
本申请的第一图像内容可以为具有锐化和鲜明颜色特征的图像内容,例如为屏幕录制内容等。
在本申请的一些实施例中,可以理解为在当前块对应的图像内容为第一图像内容时,加权融合条件才可能发生变化,若当前块对应的图像内容不是第一图像内容时,则加权融合条件不发生变化。也就是说,若当前块对应的图像内容为第一图像内容时,采用的加权融合条件为第一融合条件,若当前块对应的图像内容非第一图像内容时,则采用的加权融合条件为第二融合条件。其中第一融合条件与第二融合条件不同,且第一融合条件是根据第一帧内预测模式和第二帧内预测模式的幅度值确定的,例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定第一融合条件。
基于此,为了提高视频解码器确定当前块的加权融合条件的效率,若视频编码器在确定当前块对应的图像内容为第一图像内容时,确定当前块可以使用第一技术,该第一技术可以理解为本申请实施例提供的技术,即根据第一帧内预测模式和第二帧内预测模式的幅度值确定当前块的加权融合条件。若视频编码器确定当前块可以使用第一技术时,则将第一标志置为真后编入码流,例如第一标志的取值为1。若视频编码器确定当前块对应的图像内容不是第一图像内容时,则确定当前块不可以使用第一技术,则将第一标志的取值置为假后编入码流,例如第一标志的取值为0。这样,视频解码器解码码流,得到该第一标志,进而根据该第一标志确定当前块的加权融合条件。例如,该第一标志的取值为1时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,若第一标志的取值为0时,则可以通过其他方式确定出当前块的加权融合条件。
可选的,上述第一标志可以为序列级标志,用于指示当前序列是否可以使用第一技术。
可选的,上述第一标志可以为图像级标志,用于指示当前图像是否可以使用第一技术。
可选的,可以在码流中增加新字段来表示第一标志。例如,用字段sps_DIMD_blendoff_flag来表示第一标志,该字段为全新的字段。
可选的,上述第一标志复用当前序列中的第三标志,也就是说,可以复用当前序列中已有的字段,无需增加新的字段,进而节约码字。例如,上述第三标志为帧内块复制(Intra-block copy,简称IBC)使能标志或者为模板匹配预测(Template matching prediction,简称TMP)使能标志等。
在一些实施例中,可以根据第一标志,从多个加权融合条件中确定出当前块的加权融合条件。例如,表1示出了第一标志所对应的加权融合条件。
表1
第一标志的取值 加权融合条件的类型
1 第一融合条件
0 第二融合条件
上述表1中示出了第一标志的不同取值所对应的加权融合条件,第一标志的取值为1,表示当前块对应的图像内容为第一图像内容,对应的加权融合条件为第一融合条件,可选的,该第一融合条件是根据第一帧内预测模式和第二帧内预测模式的幅度值确定的,例如该第一融合条件为:第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值。
如表1所示,第一标志的取值为0,表示当前块对应的图像内容不是第一图像内容,对应的加权融合条件为第二融合条件。基于表1,视频解码器解码码流,得到第一标志,并根据第一标志的取值,从上述表1中,查询到当前块的加权融合条件,例如,当第一标志的取值为1时,则确定当前块的加权融合条件为第一融合条件,当第一标志的取值为0时,则确定当前块的加权融合条件为第二融合条件。
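表1所示的对应关系可以用如下Python示意来说明(假设性示例:第二融合条件的具体内容由方案另行定义,此处仅以占位返回值表示):

```python
def fusion_allowed(first_flag: int, amp1: float, amp2: float, thr1: float) -> bool:
    if first_flag == 1:
        # 第一融合条件:基于幅度值比值的较严格条件
        return amp2 > 0 and (amp1 / amp2) <= thr1
    # 第二融合条件:此处仅作占位,实际条件由具体方案确定
    return True
```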
S403、根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的目标预测值。
根据上述方法,确定出当前块的加权融合条件后,根据该加权融合条件判断是否对当前块进行加权融合预测,例如若第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行加权融合预测。若第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式和/或第二帧内预测模式对当前块进行预测。
本申请对上述第三帧内预测模式的具体类型不做限制。
在一种示例中,上述第三帧内预测模式为上述直方图中,幅度值第三大的帧内预测模式。
在一种示例中,上述第三帧内预测模式为Planar或者DC模式。
本申请中,S403中根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的目标预测值的方式包括但不限于如下几种:
方式一,若当前块的加权融合条件为第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值时,则上述S403还包括如下S403-A1:
S403-A1、若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式,确定当前块的目标预测值。
本申请实施例中,当第一帧内预测模式和第二帧内预测模式不满足上述确定的当前块的加权融合条件,说明第一帧内预测模式对应的第一幅度值远远大于第二帧内预测模式对应的第二幅度值,此时使用第一帧内预测模式对当前块进行预测时,可以实现较优的预测效果,无需进行加权预测。
在一些实施例中,S403还包括如下S403-A2:
S403-A2、若第一幅度值与第二幅度值的比值小于或等于第一预设阈值时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的目标预测值。
例如,分别使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行预测,得到各自的预测值,进而将各帧内预测模式对应的预测值进行加权,得到当前块的目标预测值。
在一些实施例中,上述S403-A2中使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的目标预测值包括如下步骤:
S403-A21、确定第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重。
上述确定第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重的方式包括但不限于如下几种示例:
示例1,上述第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重为预设权重。
可选的,上述三种帧内预测模式对应的权重相同,例如分别为1/3。
可选的,由于第一帧内预测模式为N个帧内预测模式中幅度值最大的帧内预测模式,因此,第一帧内预测模式对应的权重可以大于其他两个预测模式的权重。
示例2,将预设权重确定为第三帧内预测模式的权重;根据第一幅度值和第二幅度值,确定第一帧内预测模式和第二帧内预测模式分别对应的权重。
可选的,确定第三帧内预测模式的权重为a,本申请对a的具体取值不做限制,例如为1/3。接着,根据第一幅度值和第二幅度值,确定第一帧内预测模式和第二帧内预测模式分别对应的权重,例如将第一幅度值与第一幅度值和第二幅度值的和值进行相比后,乘以1-a,得到第一帧内预测模式对应的权重,将第二幅度值与第一幅度值和第二幅度值的和值进行相比后,乘以1-a,得到第二帧内预测模式对应的权重。
S403-A22、确定分别使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行预测时的预测值。
具体是,使用第一帧内预测模式对当前块进行预测,得到第一个预测值,使用第二帧内预测模式对当前块进行预测,得到第二个预测值,使用第三帧内预测模式对当前块进行预测,得到第三个预测值。
S403-A23、根据第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的预测值和权重进行加权,得到当前块的目标预测值。
例如,将各帧内预测模式对应的预测值与权重进行相乘后,再相加,得到当前块的目标预测值。示例性的,可以根据上述公式(2)对第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的预测值进行加权,得到当前块的目标预测值。
本申请实施例中,视频解码器解码码流,得到第一标志,若第一标志指示使用第一技术时,则视频解码器根据第一帧内预测模式的第一幅度值和第二帧内预测模式的第二幅度值,确定当前块的加权融合条件,例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值确定为当前块的加权融合预测条件。若第一幅度值与第二幅度值的比值小于或等于第一预设阈值,即第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的目标预测值。若第一幅度值与第二幅度值的比值大于第一预设阈值,即第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式,确定当前块的目标预测值。也就是说,本申请中,将第一幅度值与第二幅度值的比值小于或等于第一预设阈值确定为当前块的加权融合条件,使得当前块的加权融合条件更加严格,进而降低了第一帧内预测模式和第二帧内预测模式满足本申请提出的加权融合条件的概率,可以降低对第一图像内容的当前块进行加权预测的概率,从而保证了第一图像内容的当前块的预测质量。
方式二,若当前块的加权融合条件为第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值时,则上述S403还包括如下S403-B1:
S403-B1、若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式和第二帧内预测模式,确定当前块的目标预测值。
在该方式二中,若第一幅度值与第二幅度值的比值大于第一预设阈值时,为了提高当前块的预测效果,则使用第一帧内预测模式和第二帧内预测模式,确定当前块的目标预测值。例如,使用第一帧内预测模式和第二帧内预测模式分别对当前块进行预测,并将这两种帧内预测模式各自预测得到的预测值进行加权,作为当前块的目标预测值。
可选的,上述第一权重和第二权重为预设值,例如第一权重与第二权重相同,均为1/2,或者,第一权重大于第二权重。
可选的,第一权重和第二权重是根据第一幅度值和第二幅度值确定的,此时,上述S403-B1中使用第一帧内预测模式和第二帧内预测模式,确定当前块的目标预测值包括如下步骤:
S403-B11、使用第一帧内预测模式对当前块进行预测,得到第一个预测值;
S403-B12、使用第二帧内预测模式对当前块进行预测,得到第二个预测值;
S403-B13、根据第一幅度值和第二幅度值,确定第一个预测值的第一权重和第二个预测值的第二权重。
例如,确定第一幅度值与第二幅度值的和值;将第一幅度值与和值的比值,确定为第一权重;将第二幅度值与和值的比值,确定为第二权重。
S403-B14、根据第一个预测值和第二个预测值,以及第一权重和第二权重,确定当前块的目标预测值。
例如,将第一个预测值与第一权重的乘积,与第二个预测值与第二权重的乘积的和值,确定为当前块的目标预测值。
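S403-B11至S403-B14的两模式加权过程可以用如下Python示意来说明(假设性示例,预测块以“列表的列表”表示):

```python
def blend_two(pred1, pred2, amp1, amp2):
    s = amp1 + amp2
    w1, w2 = amp1 / s, amp2 / s               # 第一权重、第二权重
    return [[a * w1 + b * w2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(pred1, pred2)]
```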
本申请实施例中,视频解码器解码码流,得到第一标志,若第一标志指示使用第一技术时,则视频解码器根据第一帧内预测模式的第一幅度值和第二帧内预测模式的第二幅度值,确定当前块的加权融合条件,例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值确定为当前块的加权融合预测条件。若第一幅度值与第二幅度值的比值小于或等于第一预设阈值,即第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的目标预测值。若第一幅度值与第二幅度值的比值大于第一预设阈值,即第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式和第二帧内预测模式,确定当前块的目标预测值。也就是说,本申请中,将第一幅度值与第二幅度值的比值小于或等于第一预设阈值确定为当前块的加权融合条件,使得当前块的加权融合条件更加严格,进而降低了第一帧内预测模式和第二帧内预测模式满足本申请提出的加权融合条件的概率,可以降低对第一图像内容的当前块进行加权预测的概率,从而保证了第一图像内容的当前块的预测质量。
在一些实施例中,通过当前块所在的当前帧的类型,限制是否采用本申请实施例的方法,也就是说,根据当前帧的类型,确定是否执行S402中的根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件的步骤。在该实施例中,规定了有些类型的帧可以使用本申请实施例的方法,有些类型的帧不能使用本申请实施例的方法,进而实现差异化执行。例如,若当前帧的类型为目标帧类型时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,例如I帧允许使用本申请的技术方案而B帧不允许使用。本申请对目标帧类型不做限制,具体根据实际需求确定。可选的,目标帧类型包括I帧、P帧、B帧中的至少一个。
在一些实施例中,通过帧类型和图像块大小,限制是否采用本申请实施例的方法。此时,视频解码器在执行本申请实施例的方法,首先确定当前块所在的当前帧的类型,以及当前块的大小;根据当前帧的类型和当前块的大小,确定是否根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
需要说明的是,在本申请的实施例中,当前块的大小可以包括当前块的高度和宽度,因此,视频解码器根据当前块的高度和宽度,决定是否执行上述S402的步骤。
示例性的,在本申请中,若当前帧的类型为第一帧类型,且当前块的大小大于第一阈值时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
示例性的,在本申请中,若当前帧的类型为第二帧类型,且当前块的大小大于第二阈值时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
可选的,当前第一帧类型与第二帧类型不相同。
可选的,上述第一阈值和第二阈值也不相同。
本申请对第一帧类型和第二帧类型的具体类型不做限制,对第一阈值和第二阈值的具体取值也不限制。
在一种具体的示例中,若第一帧类型为I帧,第二帧类型为B帧或P帧,则第二阈值与第一阈值不相同,也就是说,I帧和B帧(或P帧)规定的可适用的块大小可以不相同。
在一些实施例中,还可以通过量化参数,限制是否采用本申请实施例的方法。此时,视频解码器在执行本申请实施例的方法时,首先解码码流,得到当前块对应的量化参数,例如视频解码器根据帧级允许标志位或序列级QP允许标志位,得到当前块的量化参数,进而根据量化参数,确定是否根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
示例性的,若量化参数小于第三阈值,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。本申请对第三阈值的具体取值不做限制,具体根据实际需要确定。
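帧类型、块大小与量化参数三种限制方式可以综合成如下Python示意(假设性示例:帧类型的表示方式与各阈值thr1/thr2/thr3均为本文假设,文中仅说明“大于第一/第二阈值”“小于第三阈值”而未给出具体数值):

```python
def technique_allowed(frame_type, width, height, qp, thr1=64, thr2=256, thr3=32):
    if qp >= thr3:                    # 量化参数需小于第三阈值
        return False
    size = width * height
    if frame_type == "I":             # 第一帧类型:块大小需大于第一阈值
        return size > thr1
    if frame_type in ("B", "P"):      # 第二帧类型:块大小需大于第二阈值
        return size > thr2
    return False
```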
视频解码器根据上述方法,得到当前块的预测值后,解码码流得到当前块的残差值,将预测块与残差块相加,得到当前块的重建块。
本申请实施例的帧内预测方法,视频解码器解码码流,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,N为大于1的正整数;根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,加权融合条件用于判断当前块是否通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式进行加权预测;根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的目标预测值。即本申请根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,并基于确定的加权融合条件判断是否对当前块进行加权融合预测,可以避免对不需要加权融合预测的图像内容进行加权融合预测时,降低预测质量,引入不必要噪声的问题,进而提高了帧内预测的准确性。
下面对本申请提供的帧内预测方法与DIMD技术相结合时的解码过程进行介绍。
图9为本申请实施例提供的帧内预测方法的一种流程示意图,如图9所示,包括:
S501、解码码流,得到DIMD允许标志。
其中,DIMD允许标志用于指示当前解码器是否允许使用DIMD技术,或者用于指示当前序列是否允许使用DIMD技术。
S502、若DIMD允许标志指示当前解码器允许使用DIMD技术时,解码码流,得到DIMD使能标志。
其中,DIMD使能标志用于指示当前块是否使用DIMD技术。
S503、若DIMD使能标志指示当前块使用DIMD技术时,解码码流,得到第一标志。
其中,第一标志用于指示是否使用第一技术,该第一技术在第一图像内容下使用。
S504、确定当前块的模板区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式。
具体参照上述S401的描述,在此不再赘述。
S505、若第一标志指示当前块使用第一技术时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定第一融合条件。
具体参照上述S402的描述,在此不再赘述。
S506、若第一帧内预测模式与第二帧内预测模式均不为Planar以及DC模式,且第二帧内预测模式的幅度值大于0时,根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的目标预测值。
例如,若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式,确定当前块的目标预测值。或者,若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式和第二帧内预测模式,确定当前块的目标预测值。
再例如,若第一幅度值与第二幅度值的比值小于或等于第一预设阈值时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的目标预测值。
具体参照上述S403的描述,在此不再赘述。
S507、若第一帧内预测模式与第二帧内预测模式中存在Planar或DC模式,或第二帧内预测模式的幅度值小于等于0时,则使用第一帧内预测模式对当前块进行预测,得到当前块的目标预测值。
视频解码器根据上述方法,得到当前块的目标预测值后,解码码流得到当前块的残差值,将预测块与残差块相加,得到当前块的重建块。
本申请实施例,通过对DIMD技术中的加权融合条件进行调整,例如根据第一帧内预测模式和第二帧内预测模式的幅度值确定当前块的加权融合条件,降低了第一帧内预测模式和第二帧内预测模式满足本申请提出的加权融合条件的概率,可以降低对第一图像内容的当前块进行加权预测的概率,从而保证了第一图像内容的当前块的预测质量。
进一步的,将本申请实施例的方法集成到最新ECM2.0上后,在通测条件全帧内预测(All Intra,简称AI)和低延迟(Low Delay,简称LD)条件下进行测试,测试结果如表2和表3所示。
表2
(AI条件下各测试序列的编码性能测试结果,原文以图片形式给出)
表3
(LD条件下各测试序列的编码性能测试结果,原文以图片形式给出)
值得注意的是,现有JVET组织ECM软件的通测条件中,只有F类序列是屏幕内容编码序列,故本测试使用序列级开关(例如上述的第一标志),仅在屏幕内容(即第一图像内容)编码情况下允许使用,因此只有class F有性能上的波动。
从上述表2和3的测试中可以看到,本申请对于AI和LD均有编码性能上的提升,且较为稳定,AI有平均0.16%的编码性能提升,LD有平均0.18%的编码性能提升。此外,本申请的技术方案对软硬件的复杂度几乎没有任何影响,从参考软件的编解码时间上也可以看出没有程序运行时间上的波动。
上文对本申请实施例的解码方法进行介绍,在此基础上,下面对本申请实施例提供的编码方法进行介绍。
图10为本申请实施例提供的帧内预测方法的一种流程示意图。如图10所示,本申请实施例的方法包括:
S601、确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,N为大于1的正整数。
在视频编码过程中,视频编码器接收视频流,该视频流由一系列图像帧组成,针对视频流中的每一帧图像进行视频编码,视频编码器对图像帧进行块划分,得到当前块。
在一些实施例中,当前块也称为当前编码块、当前图像块、编码块、当前编码单元、当前待编码块、当前待编码的图像块等。
在块划分时,传统方法划分后的块既包含了当前块位置的色度分量,又包含了当前块位置的亮度分量。而分离树技术(dual tree)可以划分单独分量块,例如单独的亮度块和单独的色度块,其中亮度块可以理解为只包含当前块位置的亮度分量,色度块理解为只包含当前块位置的色度分量。这样相同位置的亮度分量和色度分量可以属于不同的块,划分可以有更大的灵活性。如果分离树用在CU划分中,那么有的CU既包含亮度分量又包含色度分量,有的CU只包含亮度分量,有的CU只包含色度分量。
在一些实施例中,本申请实施例的当前块只包括色度分量,可以理解为色度块。
在一些实施例中,本申请实施例的当前块只包括亮度分量,可以理解为亮度块。
在一些实施例中,该当前块即包括亮度分量又包括色度分量。
需要说明的是,视频编码器确定当前块允许采用多种帧内预测模式(例如允许采用DIMD技术)进行融合加权预测时,则视频编码器确定第一帧内预测模式和第二帧内预测模式。在判断该第一帧内预测模式和第二帧内预测模式符合加权融合条件时,则使用第一帧内预测模式对当前块进行预测,得到当前块的第一个预测值,使用第二帧内预测模式对当前块进行预测,得到当前块的第二个预测值,将第一个预测值和第二个预测值进行加权融合,得到当前块的目标预测值。可选的,除了确定上述第一个预测值和第二个预测值外,还可以确定第三个预测值,例如使用第三帧内预测模式对当前块进行预测,得到当前块的第三个预测值,将上述第一个预测值、第二个预测值和第三个预测值进行加权融合,得到当前块的目标预测值。可选的,上述第三个帧内预测模式可以是预设的一种帧内预测模式,或者根据其他方式确定的,本申请对此不做限制。
在一些实施例中,若判断第一帧内预测模式和第二帧内预测模式不满足加权融合条件时,则使用第一帧内预测模式和第二帧内预测模式中的一个帧内预测模式,对当前块进行预测,得到当前块的目标预测值。
在一些实施例中,视频编码器在码流中携带第二标志,该第二标志用于指示当前块是否通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值。若视频编码器使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值,则将第二标志置为真,例如将第二标志的取值置为1,并将置为真的第二标志写入码流中,例如写入码流头中。这样视频解码器获得码流后,解码该码流,得到第二标志,若该第二标志为真,例如该第二标志的取值为1,则视频解码器确定当前块是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定目标预测值,此时,视频解码器确定当前块的第一帧内预测模式和第二帧内预测模式。
若视频编码器不是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定当前块的目标预测值时,则将第二标志置为假,例如将第二标志的取值置为0,并将置为假的第二标志写入码流中,例如写入码流头中。视频解码器解码码流,得到第二标志,若该第二标志为假,例如该第二标志的取值为0,则视频解码器不确定当前块的第一帧内预测模式和第二帧内预测模式,而是遍历预设的其他帧内预测模式,确定出代价最小的帧内预测模式对当前块进行预测,得到当前块的目标预测值。
需要说明的是,本申请实施例主要涉及到当前块的目标预测值是通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个确定的,也就是说,本申请主要讨论上述第二标志为真的情况。
在一种可能的实现方式中,若本申请采用DIMD技术,则上述第二标志可以为DIMD使能标志,例如为sps_DIMD_enable_flag。也就是说,视频编码器获得DIMD的允许使用标志位,该DIMD的允许使用标志位为序列级标志位。该DIMD的允许使用标志位用于指示当前序列是否允许使用DIMD技术。若视频编码器确定该当前序列允许使用DIMD技术,即DIMD的允许使用标志位为真,例如为1。接着,视频编码器确定当前块的第一帧内预测模式和第二帧内预测模式,并执行本申请实施例的方法,视频编码器采用DIMD技术确定当前块的目标预测值时,则将DIMD使能标志置为真,例如置为1,且写入码流,例如写入码流头。若视频编码器未采用DIMD技术确定当前块的目标预测值时,则将DIMD使能标志置为假,例如置为0,且写入码流,例如写入码流头中。这样视频解码器可以从码流中解析出DIMD使能标志,并根据DIMD使能标志来确定是否使用DIMD技术来确定当前块的目标预测值,进而保证解码端和编码端的一致,保证预测的可靠性。
需要说明的是,当前块周围已重建区域可以为当前块周围已重建区域中的任意预设区域。
示例性的,当前块周围已重建区域包括当前块上方m行已重建像素。
示例性的,当前块周围已重建区域包括当前块左侧k列已重建像素。
示例性的,当前块周围已重建区域包括当前块的上方以及左上方的m行已重建像素确定为当前块的模板区域。
示例性的,当前块周围已重建区域包括当前块的上方以及左上方的m行已重建像素、以及当前块左侧k列已重建像素,例如图7A中的L区域。
上述m与k可以相同,也可以不同,本申请对此不做限制。
上述m行像素可以与当前块相邻,也可以不相邻。
上述k列像素可以与当前块相邻,也可以不相邻。
在一些实施例中,确定周围已重建区域对应的N个帧内预测模式的幅度值的过程可以是:首先通过sobel算子在当前块周围已重建区域上的每个nXn(例如3X3)区域上扫描并计算水平方向和竖直方向的梯度,根据水平和竖直方向上求得梯度Dx和Dy。
示例性的,使用3x3水平sobel滤波器和竖直sobel滤波器,分别计算模板区域上的一个3X3区域的水平梯度Dx和竖直梯度Dy。例如,根据上述公式(4)计算水平梯度Dx,根据公式(5)计算竖直梯度Dy。
确定出当前块周围已重建区域上每一个3X3区域的水平方向和竖直方向的梯度。接着,根据Dx和Dy求得每个位置上的幅度值amp=Dx+Dy,和角度值angular=arctan(Dy/Dx)。根据当前块周围已重建区域上每个位置的角度对应到传统的角度预测模式,累加相同角度模式的幅度值,得到如图7B所示的直方图。根据该直方图可以得到当前块对应的N个帧内预测模式。
可选的,可以将直方图中的所有帧内预测模式确定为N个帧内预测模式。
可选的,可以将直方图中幅度值大于一定预设值的帧内预测模式,确定为N个帧内预测模式。
在一些实施例中,若本申请应用于DIMD技术中,则上述当前块周围已重建区域为当前块的模板区域,如图7A所示,当前块的模板区域为当前块的周围已重建区域中与当前块相邻的区域。在一些实施例中,当前块的模板区域也称为当前块的相邻重建样本区域。此时,上述S601中确定当前块周围已重建区域对应的N个帧内预测模式的幅度值的过程是,确定当前块的模板区域对应的N个帧内预测模式的幅度值,其中,确定当前块的模板区域对应的N个帧内预测模式的幅度值与上述确定当前块周围已重建区域对应的N个帧内预测模式的幅度值的过程基本相同,只需将当前块周围已重建区域替换为当前块的模板区域即可。
根据上述方法,确定出当前块对应的N个帧内预测模式的幅度值后,基于这N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式。
在一些实施例中,基于这N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,包括如下几种方式:
方式1,将N个帧内预测模式中的任意一个帧内预测模式确定为第一帧内预测模式。将N个帧内预测模式中除第一帧内预测模式外的任意一个帧内预测模式确定为第二帧内预测模式。
方式2,将N个帧内预测模式中幅度值最大的帧内预测模式确定为第一帧内预测模式,将N个帧内预测模式中幅度值次大的帧内预测模式确定为第二帧内预测模式。
根据上述方法,确定出当前块的第一帧内预测模式和第二帧内预测模式后,执行如下S602和S603的步骤。
S602、根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
上述加权融合条件用于判断当前块是否通过第一帧内预测模式、第二帧内预测模式和第三帧内预测模式进行加权预测。
在本申请实施例中,视频编码器根据上述S601的步骤得到当前块的第一帧内预测模式和第二帧内预测模式后,并不是直接使用第一帧内预测模式和第二帧内预测模式对当前块进行加权预测,而是需要判断第一帧内预测模式和第二帧内预测模式是否满足当前块的加权融合条件。若第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行加权预测,例如使用第一帧内预测模式对当前块进行预测,得到第一个预测值,使用第二帧内预测模式对当前块进行预测,得到第二个预测值,使用第三帧内预测模式对当前块进行预测,得到第三个预测值,对第一个预测值、第二个预测值和第三个预测值进行加权,得到当前块的第一预测值。其中,第一个预测值、第二个预测值和第三个预测值进行加权时的权重,可以根据第一帧内预测模式和第二帧内预测模式对应的幅度值确定。
若第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的一个,对当前块进行预测,得到当前块的第一预测值。例如,使用第一帧内预测模式和第二帧内预测模式中幅度值最大的预测模式对当前块进行预测,得到当前块的第一预测值。可选的,由上述S601的部分描述可知,第一帧内预测模式为N个预测模式中幅度值最大的帧内预测模式,第二帧内预测模式为N个预测模式中幅度值次大的帧内预测模式,也就是说,第一幅度值大于第二幅度值,因此,在第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式对当前块进行预测,得到当前块的第一预测值。
目前的加权融合条件的范围较广,例如,若第一帧内预测模式与第二帧内预测模式均不为Planar以及DC模式,且第二帧内预测模式的幅度值大于0时,即可进行加权融合预测。然而,一些图像内容(例如屏幕录制的图像内容)一般具有较为锐化和颜色鲜明的特性,对这些图像内容采用加权融合预测时,反而会降低预测质量。
为了解决上述技术问题,本申请根据第一帧内预测模式和第二帧内预测模式的幅度值来确定当前块的加权融合条件,该加权融合条件较为严格,可以降低对不适用加权融合预测的图像内容进行加权融合预测、进而降低图像质量的概率。
本申请对上述S602中根据第一帧内预测模式和第二帧内预测模式的幅度值来确定当前块的加权融合条件的方法不做限制,示例性的,包括但不限于如下几种:
方式1,将第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,确定为当前块的加权融合条件。
在该方式1中,若第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值大于或等于预设值1,也就是说第一帧内预测模式对应的幅度值远远大于第二帧内预测模式的幅度值,这说明第一帧内预测模式适用于当前块的概率远远大于第二帧内预测模式。此时,使用第一帧内预测模式对当前块进行预测时,可以达到较优的预测效率。若使用第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测时,反而会带来噪声,降低预测效果。
在该方式1中,若第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,也就是说第一帧内预测模式对应的幅度值与第二帧内预测模式的幅度值的差别不大,这说明第一帧内预测模式和第二帧内预测模式适用于当前块的概率基本相同。此时,只使用第一帧内预测模式对当前块进行预测时,预测效果不是最佳的,因此,可以通过第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测,以提高当前块的预测效果。
本申请对上述预设值1的具体取值不做限制,具体根据实际需要确定。
由上述可知,将第一帧内预测模式的幅度值与第二帧内预测模式的幅度值之间的差值小于预设值1,确定为当前块的加权融合条件,可以实现对需要加权融合预测的当前块进行加权融合预测,且可以降低对不需要进行加权融合预测的图像内容进行加权融合预测的概率,进而提高了帧内预测的准确性。
方式2,将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定当前块的加权融合条件。
在该方式2中,若第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值大于第一预设阈值,也就是说第一帧内预测模式对应的幅度值远远大于第二帧内预测模式的幅度值,这说明第一帧内预测模式适用于当前块的概率远远大于第二帧内预测模式。此时,使用第一帧内预测模式对当前块进行预测时,可以达到较优的预测效率。若使用第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测时,反而会带来噪声,降低预测效果。
在该方式2中,若第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,也就是说第一帧内预测模式对应的幅度值与第二帧内预测模式的幅度值的差别不大,这说明第一帧内预测模式和第二帧内预测模式适用于当前块的概率基本相同。此时,可以通过第一帧内预测模式、第二帧内预测模式以及第三帧内预测模式对当前块进行加权预测,以提高当前块的预测效果。
在一些实施例中,为了进一步对加权融合过程进行限制,则在执行上述S602之前,首先需要判断第一帧内预测模式和第二帧内预测模式是否满足第一预设条件,本申请实施例对第一预设条件的具体内容不做限制,具体根据实际需要进行确定。示例性的,第一预设条件为第一帧内预测模式和第二帧内预测模式均不是Planar以及DC模式,且第二帧内预测模式对应的第二幅度值不为零。
在该实例中,若第一帧内预测模式和第二帧内预测模式满足第一预设条件时,则执行上述步骤S602,根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
若第一帧内预测模式和第二帧内预测模式不满足第一预设条件时,则不执行上述步骤S602,而是使用第一帧内预测模式和第二帧内预测模式中的一个对当前块进行预测,例如使用第一帧内预测模式对当前块进行预测,得到当前块的第一预测值。
目前的加权融合条件是固定的,也就是说无论图像内容是什么样的,当前块的加权融合条件固定不变。但是,对于一些图像内容,例如屏幕录制的图像内容,一般都有较为锐化和颜色鲜明的特性,对这些图像内容采用加权融合预测时,由于加权融合预测可以理解为一种模糊预测方法,会降低图像中的锐化和颜色鲜明度,进而降低预测质量,带来了噪声。
为了解决上述技术问题,本申请根据图像内容来确定当前块的加权融合条件。也就是说,本申请针对图像内容,提供差异化的加权融合条件,不同的图像内容对应的加权融合条件可以不同,进而保证了对需要进行加权融合预测的图像内容进行加权融合预测,以提高预测准确性。对于不需要进行加权融合预测的图像内容不进行加权融合预测,以避免引入不必要的噪声,保证预测质量。
一个序列包括一系列图像,这一系列图像通常是在同一个环境中产生的,因此,一个序列中的图像的图像内容基本一致。本申请中当前块的图像内容与当前序列的图像内容的类型一致,例如均为屏幕内容,或者摄像头采集的其他内容等。
在一些实施例中,根据当前序列的图像内容,确定当前块的加权融合条件,例如视频编码器在当前块对应的图像内容为第一图像内容时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。若当前块对应的图像内容为第二图像内容时,则根据其他方法确定当前块的加权融合条件。
在一些实施例中,视频编码器将当前序列的图像内容的类型通过标志位的方式写入码流,视频解码器解码码流,得到该标志位,并通过该标志位,确定出当前序列的图像内容的类型,例如该标志位的取值为1时,指示当前序列的图像内容为第一图像内容,该标志位的取值为0时,指示当前序列的图像内容为第二图像内容,其中第一图像内容与第二图像内容不同。
在一些实施例中,视频编码器将第一标志写入码流,第一标志用于指示是否使用第一技术,第一技术在第一图像内容下使用。
在本申请的一些实施例中,可以理解为在当前块对应的图像内容为第一图像内容时,加权融合条件才可能发生变化,若当前块对应的图像内容不是第一图像内容时,则加权融合条件不发生变化。也就是说,若当前块对应的图像内容为第一图像内容时,采用的加权融合条件为第一融合条件,若当前块对应的图像内容非第一图像内容时,则采用的加权融合条件为第二融合条件。其中第一融合条件与第二融合条件不同,且第一融合条件是根据第一帧内预测模式和第二帧内预测模式的幅度值确定的,例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定为第一融合条件。
基于此,为了提高视频解码器确定当前块的加权融合条件的效率,若视频编码器在确定当前块对应的图像内容为第一图像内容时,确定当前块可以使用第一技术,该第一技术可以理解为本申请实施例提供的技术,即根据第一帧内预测模式和第二帧内预测模式的幅度值确定当前块的加权融合条件。若视频编码器确定当前块可以使用第一技术时,则将第一标志置为真后编入码流,例如第一标志的取值为1。若视频编码器确定当前块对应的图像内容不是第一图像内容时,则确定当前块不可以使用第一技术,则将第一标志的取值置为假后编入码流,例如第一标志的取值为0。这样,视频解码器解码码流,得到该第一标志,进而根据该第一标志确定当前块的加权融合条件。例如,该第一标志的取值为1时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,若第一标志的取值为0时,则可以通过其他方式确定出当前块的加权融合条件。
可选的,上述第一标志可以为序列级标志,用于指示当前序列是否可以使用第一技术。
可选的,上述第一标志可以为图像级标志,用于指示当前图像是否可以使用第一技术。
可选的,可以在码流中增加新字段来表示第一标志。例如,用字段sps_DIMD_blendoff_flag来表示第一标志,该字段为全新的字段。
可选的,上述第一标志复用当前序列中的第三标志,也就是说,可以复用当前序列中已有的字段,无需增加新的字段,进而节约码字。例如,上述第三标志为帧内块复制(Intra-block copy,简称IBC)使能标志或者为模板匹配预测(Template matching prediction,简称TMP)使能标志等。
S603、根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的第一预测值。
根据上述方法,确定出当前块的加权融合条件后,根据该加权融合条件判断是否对当前块进行加权融合预测,例如若第一帧内预测模式和第二帧内预测模式满足当前块的加权融合条件时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行加权融合预测。若第一帧内预测模式和第二帧内预测模式不满足当前块的加权融合条件时,则使用第一帧内预测模式和/或第二帧内预测模式对当前块进行预测。
本申请对上述第三帧内预测模式的具体类型不做限制。
在一种示例中,上述第三帧内预测模式为上述直方图中,幅度值第三大的帧内预测模式。
在一种示例中,上述第三帧内预测模式为Planar或者DC模式。
本申请中,S603中根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的第一预测值的方式包括但不限于如下几种:
方式一,若当前块的加权融合条件为第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值时,则上述S603还包括如下S603-A1:
S603-A1、若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式,确定当前块的第一预测值。
本申请实施例中,当第一帧内预测模式和第二帧内预测模式不满足上述确定的当前块的加权融合条件,说明第一帧内预测模式对应的第一幅度值远远大于第二帧内预测模式对应的第二幅度值,此时使用第一帧内预测模式对当前块进行预测时,可以实现较优的预测效果,无需进行加权预测。
在一些实施例中,S603还包括如下S603-A2:
S603-A2、若第一幅度值与第二幅度值的比值小于或等于第一预设阈值时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的第一预测值。
例如,分别使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行预测,得到各自的预测值,进而将各帧内预测模式对应的预测值进行加权,得到当前块的第一预测值。
在一些实施例中,上述S603-A2中使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的第一预测值包括如下步骤:
S603-A21、确定第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重。
上述确定第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重的方式包括但不限于如下几种示例:
示例1,上述第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的权重为预设权重。
可选的,这三种帧内预测模式对应的权重相同,例如分别为1/3。
可选的,由于第一帧内预测模式为N个帧内预测模式中幅度值最大的帧内预测模式,因此,第一帧内预测模式对应的权重可以大于其他两个预测模式的权重。
示例2,将预设权重确定为第三帧内预测模式的权重;根据第一幅度值和第二幅度值,确定第一帧内预测模式和第二帧内预测模式分别对应的权重。
可选的,确定第三帧内预测模式的权重为a,本申请对a的具体取值不做限制,例如为1/3。接着,根据第一幅度值和第二幅度值,确定第一帧内预测模式和第二帧内预测模式分别对应的权重,例如将第一幅度值与第一幅度值和第二幅度值的和值进行相比后,乘以1-a,得到第一帧预测模式对应的权重,将第二幅度值与第一幅度值和第二幅度值的和值进行相比后,乘以1-a,得到第二帧预测模式对应的权重。
S603-A22、确定分别使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式对当前块进行预测时的预测值。
具体是,使用第一帧内预测模式对当前块进行预测,得到第一个预测值,使用第二帧内预测模式对当前块进行预测,得到第二个预测值,使用第三帧内预测模式对当前块进行预测,得到第三个预测值。
S603-A23、根据第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的预测值和权重进行加权,得到当前块的第一预测值。
例如,将各帧内预测模式对应的预测值与权重进行相乘后,再相加,得到当前块的第一预测值。示例性的,可以根据上述公式(2)对第一帧内预测模式、第二帧内预测模式和第三帧内预测模式分别对应的预测值进行加权,得到当前块的第一预测值。
根据上述方法,可以根据第一帧内预测模式和/或第二帧内预测模式,确定出当前块的第一预测值,接着,执行如下S604的步骤。
S604、根据第一预测值,确定当前块的目标预测值。
在一些实施例中,可以直接将上述当前块的第一预测值,确定为当前块的目标预测值。
在一些实施例中,需要将第一预测值与其他预测模式对应的预测值的代价进行比较,根据代价确定当前块的目标预测值。具体的,上述S604包括如下步骤:
S604-A1、根据第一预测值,确定第一预测值对应的第一编码代价。
上述第一编码代价可以是RDO代价,可选的,还可以是SAD或SATD等近似代价,本申请对此不做限制。
S604-A2、确定候选预测集中的各帧内预测模式对当前块进行预测时的第二编码代价。
上述候选预测集中包括至少一个帧内预测模式,遍历候选预测集中的各帧内预测模式,使用各帧内预测模式分别对当前块进行编码预测,得到候选预测集中各帧内预测模式分别对应的第二编码代价。
S604-A3、将第一编码代价和第二编码代价中最小编码代价对应的预测值,确定为当前块的目标预测值。
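S604-A1至S604-A3的代价比较过程可以用如下Python示意来说明(假设性示例:代价可以是RDO代价,也可以是SAD/SATD等近似代价,此处不作区分):

```python
def choose_target_pred(pred_first, cost_first, candidates):
    """candidates: [(预测值, 第二编码代价), ...];
    返回 (目标预测值, 是否选中第一预测值)。"""
    best_pred, best_cost, use_first = pred_first, cost_first, True
    for pred, cost in candidates:
        if cost < best_cost:
            best_pred, best_cost, use_first = pred, cost, False
    # use_first 为真时,编码端可将第二标志(如DIMD使能标志)置为真后写入码流
    return best_pred, use_first
```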
在一些实施例中,若第一编码代价为第一编码代价和第二编码代价中的最小编码代价,此时将上述第一预测值确定为当前块的目标预测值。同时,将第二标志置为真后写入码流,例如将DIMD使能标志置为真,例如置为1后编入码流。
在一些实施例中,若第一编码代价不是第一编码代价和第二编码代价中的最小编码代价,则将第二标志置为假后写入码流,例如将DIMD使能标志置为假,例如置为0后编入码流。其中第二标志用于指示当前块是否通过第一帧内预测模式和第二帧内预测模式中的至少一个确定目标预测值。
在一些实施例中,通过当前块所在的当前帧的类型,限制是否采用本申请实施例的方法,也就是说,根据当前帧的类型,确定是否执行S602中的根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件的步骤。在该实施例中,规定了有些类型的帧可以使用本申请实施例的方法,有些类型的帧不能使用本申请实施例的方法,进而实现差异化执行。例如,若当前帧的类型为目标帧类型时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,例如I帧允许使用本申请的技术方案而B帧不允许使用。本申请对目标帧类型不做限制,具体根据实际需求确定。可选的,目标帧类型包括I帧、P帧、B帧中的至少一个。
在一些实施例中,通过帧类型和图像块大小,限制是否采用本申请实施例的方法。此时,视频编码器在执行本申请实施例的方法,首先确定当前块所在的当前帧的类型,以及当前块的大小;根据当前帧的类型和当前块的大小,确定是否根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
需要说明的是,在本申请的实施例中,当前块的大小可以包括当前块的高度和宽度,因此,视频解码器根据当前块的高度和宽度,决定是否执行上述S602的步骤。
示例性的,在本申请中,若当前帧的类型为第一帧类型,且当前块的大小大于第一阈值时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
示例性的,在本申请中,若当前帧的类型为第二帧类型,且当前块的大小大于第二阈值时,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
可选的,当前第一帧类型与第二帧类型不相同。
可选的,上述第一阈值和第二阈值也不相同。
本申请对第一帧类型和第二帧类型的具体类型不做限制,对第一阈值和第二阈值的具体取值也不限制。
在一种具体的示例中,若第一帧类型为I帧,第二帧类型为B帧或P帧,则第二阈值与第一阈值不相同,也就是说,I帧和B帧(或P帧)规定的可适用的块大小可以不相同。
在一些实施例中,还可以通过量化参数,限制是否采用本申请实施例的方法。此时,视频编码器在执行本申请实施例的方法,首先获得当前块对应的量化参数,例如视频编码器根据帧级允许标志位或序列级QP允许标志位,得到当前块的量化参数,进而根据量化参数,确定是否根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
示例性的,若量化参数小于第三阈值,则根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。本申请对第三阈值的具体取值不做限制,具体根据实际需要确定。
视频编码器根据上述方法,得到当前块的目标预测值后,根据当前块的目标预测值与原始值得到当前块的残差值,对残差值进行变换和量化等处理后进行编码,得到码流。
本申请实施例的帧内预测方法,视频编码器通过确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式;根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件;根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的第一预测值;根据第一预测值,确定当前块的目标预测值。即本申请根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件,并基于确定的加权融合条件判断是否对当前块进行加权融合预测,可以避免对不需要加权融合预测的图像内容进行加权融合预测时,降低预测质量,引入不必要噪声的问题,进而提高了帧内预测的准确性。
下面对本申请提供的帧内预测方法与DIMD技术相结合时的编码过程进行介绍。
图11为本申请实施例提供的帧内预测方法的一种流程示意图,如图11所示,包括:
S701、获得DIMD允许标志。
其中,DIMD允许标志用于指示当前序列是否允许使用DIMD技术。
S702、若DIMD允许标志指示当前序列允许使用DIMD技术时,则确定当前块的模板区域对应的N个帧内预测模式的幅度值,并根据N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式。
S703、根据第一帧内预测模式和第二帧内预测模式的幅度值,确定当前块的加权融合条件。
例如将第一帧内预测模式的第一幅度值与第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定第一融合条件。
S704、若第一帧内预测模式与第二帧内预测模式均不为Planar以及DC模式,且第二帧内预测模式的幅度值大于0时,根据加权融合条件,以及第一帧内预测模式、第二帧内预测模式和第三帧内预测模式中的至少一个,确定当前块的第一预测值。
例如,若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式,确定当前块的第一预测值。或者,若第一幅度值与第二幅度值的比值大于第一预设阈值时,则使用第一帧内预测模式和第二帧内预测模式,确定当前块的第一预测值。
再例如,若第一幅度值与第二幅度值的比值小于或等于第一预设阈值时,则使用第一帧内预测模式、第二帧内预测模式和第三帧内预测模式,确定当前块的第一预测值。
S705、若第一帧内预测模式与第二帧内预测模式中存在Planar或DC模式,或第二帧内预测模式的幅度值小于等于0时,则使用第一帧内预测模式对当前块进行预测,得到当前块的第一预测值。
S706、根据第一预测值,确定DIMD模式对应的第一编码代价。
可选的,上述第一编码代价为率失真代价。
S707、确定候选预测集中的各帧内预测模式对当前块进行预测时的第二编码代价。
S708、若第一编码代价为第一编码代价和第二编码代价中的最小代价时,则将第一预测值确定为当前块的目标预测值,且将DIMD使能标志置为1后编入码流。
其中,DIMD使能标志用于指示当前块是否使用DIMD技术。
S709、若第一编码代价不是第一编码代价和第二编码代价中的最小代价时,则将最小第二编码代价对应的预测值确定为当前块的目标预测值,且将DIMD使能标志置为0后编入码流。
S710、将第一标志置为1后编入码流。
其中,第一标志用于指示是否使用第一技术,该第一技术在第一图像内容下使用。
本申请实施例,通过对DIMD技术中的加权融合条件进行调整,例如根据第一帧内预测模式和第二帧内预测模式的幅度值确定当前块的加权融合条件,降低了第一帧内预测模式和第二帧内预测模式满足本申请提出的加权融合条件的概率,可以降低对第一图像内容的当前块进行加权预测的概率,从而保证了第一图像内容的当前块的预测质量。
应理解,图8至图11仅为本申请的示例,不应理解为对本申请的限制。
以上结合附图详细描述了本申请的优选实施方式,但是,本申请并不限于上述实施方式中的具体细节,在本申请的技术构思范围内,可以对本申请的技术方案进行多种简单变型,这些简单变型均属于本申请的保护范围。例如,在上述具体实施方式中所描述的各个具体技术特征,在不矛盾的情况下,可以通过任何合适的方式进行组合,为了避免不必要的重复,本申请对各种可能的组合方式不再另行说明。又例如,本申请的各种不同的实施方式之间也可以进行任意组合,只要其不违背本申请的思想,其同样应当视为本申请所公开的内容。
还应理解,在本申请的各种方法实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。另外,本申请实施例中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系。具体地,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
上文结合图8至图11,详细描述了本申请的方法实施例,下文结合图12至图14,详细描述本申请的装置实施例。
图12是本申请一实施例提供的帧内预测装置的示意性框图。
如图12所示,帧内预测装置10包括:
解码单元11,用于解码码流,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
确定单元12,用于根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
预测单元13,用于根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值。
在一些实施例中,确定单元12,具体用于在所述当前块对应的图像内容为第一图像内容时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,确定单元12,具体用于解码所述码流,得到第一标志,所述第一标志用于指示是否使用第一技术,所述第一技术在第一图像内容下使用;若所述第一标志指示使用所述第一技术时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,确定单元12,具体用于将所述第一帧内预测模式的第一幅度值与所述第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定所述当前块的加权融合条件。
在一些实施例中,预测单元13,具体用于若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式,确定所述当前块的目标预测值。
在一些实施例中,预测单元13,具体用于若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的目标预测值。
在一些实施例中,预测单元13,具体用于使用所述第一帧内预测模式对所述当前块进行预测,得到第一个预测值;
使用所述第二帧内预测模式对所述当前块进行预测,得到第二个预测值;
根据所述第一幅度值和所述第二幅度值,确定所述第一个预测值的第一权重和所述第二个预测值的第二权重;
根据所述第一个预测值和所述第二个预测值,以及所述第一权重和所述第二权重,确定所述当前块的目标预测值。
在一些实施例中,预测单元13,具体用于确定所述第一幅度值与所述第二幅度值的和值;将所述第一幅度值与所述和值的比值,确定为所述第一权重;将所述第二幅度值与所述和值的比值,确定为所述第二权重。
在一些实施例中,预测单元13,具体用于若所述第一幅度值与所述第二幅度值的比值小于或等于所述第一预设阈值时,则使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的目标预测值。
在一些实施例中,预测单元13,具体用于确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重;确定分别使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式对所述当前块进行预测时的预测值;根据所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的预测值和权重进行加权,得到所述当前块的目标预测值。
可选的,所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重为预设权重。
在一些实施例中,预测单元13,具体用于将预设权重确定为所述第三帧内预测模式的权重;
根据所述第一幅度值和所述第二幅度值,确定所述第一帧内预测模式和所述第二帧内预测模式分别对应的权重。
在一些实施例中,确定单元12,具体用于若所述第一帧内预测模式和所述第二帧内预测模式满足第一预设条件时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,预测单元13,还用于若所述第一帧内预测模式和所述第二帧内预测模式不满足所述第一预设条件时,则使用所述第一帧内预测模式,确定所述当前块的目标预测值。
在一些实施例中,所述第一预设条件为所述第一帧内预测模式和所述第二帧内预测模式均不是Planar以及DC模式,且所述第二帧内预测模式对应的第二幅度值不为零。
在一些实施例,解码单元11,具体用于解码所述码流,得到解码端帧内模式导出DIMD使能标志,所述DIMD使能标志用于指示当前块是否使用所述DIMD技术;若所述DIMD使能标志指示所述当前块使用所述DIMD技术时,则确定当前块周围已重建区域对应的N个帧内预测模式的幅度值。
在一些实施例,解码单元11,具体用于若所述DIMD使能标志指示所述当前块使用所述DIMD技术时,确定所述当前块的模板区域对应的N个帧内预测模式的幅度值。
在一些实施例,解码单元11,具体用于将所述N个帧内预测模式中幅度值最大的帧内预测模式,确定为所述第一帧内预测模式;将所述N个帧内预测模式中幅度值次大的帧内预测模式,确定为所述第二帧内预测模式。
在一些实施例,解码单元11,还用于解码码流,得到第二标志,所述第二标志用于指示所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个确定目标预测值;若所述第二标志为真时,则确定当前块周围已重建区域对应的N个帧内预测模式的幅度值。
可选的,所述第二标志为基于模板的帧内模式导出DIMD使能标志。
在一些实施例,解码单元11,还用于确定所述当前块所在的当前帧的类型;若所述当前帧的类型为目标帧类型时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述目标帧类型包括I帧、P帧、B帧中的至少一个。
在一些实施例,解码单元11,还用于确定所述当前块所在的当前帧的类型,以及所述当前块的大小;若所述当前帧的类型为第一帧类型,且所述当前块的大小大于第一阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件;若所述当前帧的类型为第二帧类型,且所述当前块的大小大于第二阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例,解码单元11,还用于确定所述当前块对应的量化参数;若所述量化参数小于第三阈值,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
可选的,所述第一标志复用所述当前序列的第三标志,所述第三标志为序列级的帧内块复制IBC使能标志或者为模板匹配预测TMP使能标志。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图12所示的装置10可以执行本申请实施例的帧内预测方法,并且装置10中的各个单元的前述和其它操作和/或功能分别为了实现上述帧内预测方法等各个方法中的相应流程,为了简洁,在此不再赘述。
图13是本申请一实施例提供的帧内预测装置的示意性框图。
如图13所示,该帧内预测装置20可包括:
第一确定单元21,用于确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
第二确定单元22,用于根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
预测单元23,用于根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值;根据所述第一预测值,确定所述当前块的目标预测值。
在一些实施例中,第一确定单元21,用于在所述当前块对应的图像内容为第一图像内容时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,预测单元23,还用于将第一标志写入码流,所述第一标志用于指示是否使用第一技术,所述第一技术在所述第一图像内容下使用。
在一些实施例中,第二确定单元22,具体用于将所述第一帧内预测模式的第一幅度值与所述第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定所述当前块的加权融合条件。
在一些实施例中,预测单元23,具体用于若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式,确定所述当前块的第一预测值。
在一些实施例中,预测单元23,具体用于若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的第一预测值。
在一些实施例中,预测单元23,具体用于确定使用所述第一帧内预测模式对所述当前块进行预测时的第一预测值;
确定使用所述第二帧内预测模式对所述当前块进行预测时的第二预测值;
根据所述第一幅度值和所述第二幅度值,确定所述第一预测值的第一权重和所述第二预测值的第二权重;
根据所述第一预测值和所述第二预测值,以及所述第一权重和所述第二权重,确定所述当前块的第一预测值。
在一些实施例中,预测单元23,具体用于确定所述第一幅度值与所述第二幅度值的和值;
将所述第一幅度值与所述和值的比值,确定为所述第一权重;
将所述第二幅度值与所述和值的比值,确定为所述第二权重。
在一些实施例中,预测单元23,具体用于若所述第一幅度值与所述第二幅度值的比值小于或等于所述第一预设阈值时,则使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的第一预测值。
在一些实施例中,预测单元23,具体用于确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重;
确定分别使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式对所述当前块进行预测时的预测值;
根据所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的预测值和权重进行加权,得到所述当前块的第一预测值。
可选的,所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重为预设权重。
在一些实施例中,预测单元23,具体用于将预设权重确定为所述第三帧内预测模式的权重;
根据所述第一幅度值和所述第二幅度值,确定所述第一帧内预测模式和所述第二帧内预测模式分别对应的权重。
在一些实施例中,第二确定单元22,具体用于若所述第一帧内预测模式和所述第二帧内预测模式满足第一预设条件时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,预测单元23,还用于若所述第一帧内预测模式和所述第二帧内预测模式不满足所述第一预设条件时,则使用所述第一帧内预测模式,确定所述当前块的第一预测值。
在一些实施例中,所述第一预设条件为所述第一帧内预测模式和所述第二帧内预测模式均不是Planar以及DC模式,且所述第二帧内预测模式对应的第二幅度值不为零。
在一些实施例中,第一确定单元21,具体用于获取解码端帧内模式导出DIMD允许使用标志,所述DIMD允许使用标志用于指示当前序列是否允许使用所述DIMD技术;若所述DIMD允许使用标志指示所述当前序列允许使用所述DIMD技术时,则确定所述N个帧内预测模式的幅度值。
在一些实施例中,第一确定单元21,具体用于若所述DIMD允许使用标志指示所述当前序列允许使用所述DIMD技术时,在所述当前块的相邻重建样本区域中,确定所述N个帧内预测模式的幅度值。
在一些实施例中,第二确定单元22,具体用于将所述N个帧内预测模式中幅度值最大的帧内预测模式,确定为所述第一帧内预测模式;将所述N个帧内预测模式中幅度值次大的帧内预测模式,确定为所述第二帧内预测模式。
在一些实施例中,预测单元23,具体用于根据所述第一预测值,确定所述第一预测值对应的第一编码代价;
确定候选预测集中的各帧内预测模式对所述当前块进行预测时的第二编码代价;
将所述第一编码代价和所述第二编码代价中最小编码代价对应的预测值,确定为所述当前块的目标预测值。
可选的,所述候选预测集包括所述N个帧内预测模式中除所述第一帧内预测模式和所述第二帧内预测模式之外的其他帧内预测模式。
在一些实施例中,预测单元23,还用于若所述第一编码代价为第一编码代价和第二编码代价中的最小编码代价,则将第二标志置为真后写入码流;
若所述第一编码代价不是第一编码代价和第二编码代价中的最小编码代价,则将所述第二标志置为假后写入码流;
其中所述第二标志用于指示所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个确定目标预测值。
可选的,所述第二标志为基于模板的解码端帧内模式导出DIMD使能标志。
在一些实施例中,第二确定单元22,还用于确定所述当前块所在的当前帧的类型;
若所述当前帧的类型为目标帧类型时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述目标帧类型包括I帧、P帧、B帧中的至少一个。
在一些实施例中,第二确定单元22,还用于确定所述当前块所在的当前帧的类型,以及所述当前块的大小;
若所述当前帧的类型为第一帧类型,且所述当前块的大小大于第一阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件;
若所述当前帧的类型为第二帧类型,且所述当前块的大小大于第二阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
在一些实施例中,第二确定单元22,还用于确定所述当前块对应的量化参数;
若所述量化参数小于第三阈值,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
可选的,所述第一标志复用所述当前序列的第三标志,所述第三标志为序列级的帧内块复制IBC使能标志或者为模板匹配预测TMP使能标志。
在一些实施例中,预测单元23,还用于将所述第一标志写入所述码流。
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图13所示的装置20可以对应于执行本申请实施例的帧内预测方法中的相应主体,并且装置20中的各个单元的前述和其它操作和/或功能分别为了实现编码方法等各个方法中的相应流程,为了简洁,在此不再赘述。
上文中结合附图从功能单元的角度描述了本申请实施例的装置和系统。应理解,该功能单元可以通过硬件形式实现,也可以通过软件形式的指令实现,还可以通过硬件和软件单元组合实现。具体地,本申请实施例中的方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路和/或软件形式的指令完成,结合本申请实施例公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件单元组合执行完成。可选地,软件单元可以位于随机存储器,闪存、只读存储器、可编程只读存储器、电可擦写可编程存储器、寄存器等本领域的成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法实施例中的步骤。
图14是本申请实施例提供的电子设备的示意性框图。
如图14所示,该电子设备30可以为本申请实施例所述的视频编码器,或者视频解码器,该电子设备30可包括:
存储器33和处理器32,该存储器33用于存储计算机程序34,并将该程序代码34传输给该处理器32。换言之,该处理器32可以从存储器33中调用并运行计算机程序34,以实现本申请实施例中的方法。
例如,该处理器32可用于根据该计算机程序34中的指令执行上述方法200中的步骤。
在本申请的一些实施例中,该处理器32可以包括但不限于:
通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等等。
在本申请的一些实施例中,该存储器33包括但不限于:
易失性存储器和/或非易失性存储器。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。
在本申请的一些实施例中,该计算机程序34可以被分割成一个或多个单元,该一个或者多个单元被存储在该存储器33中,并由该处理器32执行,以完成本申请提供的方法。该一个或多个单元可以是能够完成特定功能的一系列计算机程序指令段,该指令段用于描述该计算机程序34在该电子设备30中的执行过程。
如图14所示,该电子设备30还可包括:
收发器33,该收发器33可连接至该处理器32或存储器33。
其中,处理器32可以控制该收发器33与其他设备进行通信,具体地,可以向其他设备发送信息或数据,或接收其他设备发送的信息或数据。收发器33可以包括发射机和接收机。收发器33还可以进一步包括天线,天线的数量可以为一个或多个。
应当理解,该电子设备30中的各个组件通过总线系统相连,其中,总线系统除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。
图15是本申请实施例提供的视频编解码系统的示意性框图。
如图15所示,该视频编解码系统40可包括:视频编码器41和视频解码器42,其中视频编码器41用于执行本申请实施例涉及的视频编码方法,视频解码器42用于执行本申请实施例涉及的视频解码方法。
本申请还提供了一种计算机存储介质,其上存储有计算机程序,该计算机程序被计算机执行时使得该计算机能够执行上述方法实施例的方法。或者说,本申请实施例还提供一种包含指令的计算机程序产品,该指令被计算机执行时使得计算机执行上述方法实施例的方法。
本申请还提供了一种码流,该码流是通过上述编码方式生成的。可选的,该码流中包括第一标志。
当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。该计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行该计算机程序指令时,全部或部分地产生按照本申请实施例该的流程或功能。该计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。该计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,该计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。该计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如数字视频光盘(digital video disc,DVD))、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。例如,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
以上内容,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以该权利要求的保护范围为准。

Claims (59)

  1. 一种帧内预测方法,其特征在于,包括:
    解码码流,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
    根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
    根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式对应的幅度值,确定所述当前块的加权融合条件,包括:
    在所述当前块对应的图像内容为第一图像内容时,则根据所述第一帧内预测模式和第二帧内预测模式对应的幅度值,确定所述当前块的加权融合条件。
  3. 根据权利要求1所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式对应的幅度值,确定所述当前块的加权融合条件,包括:
    解码所述码流,得到第一标志,所述第一标志用于指示是否使用第一技术,所述第一技术在第一图像内容下使用;
    若所述第一标志指示使用所述第一技术时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  4. 根据权利要求1-3任一项所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,包括:
    将所述第一帧内预测模式的第一幅度值与所述第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值,确定所述当前块的加权融合条件。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式,确定所述当前块的目标预测值。
  6. 根据权利要求4所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的目标预测值。
  7. 根据权利要求6所述的方法,其特征在于,所述使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的目标预测值,包括:
    使用所述第一帧内预测模式对所述当前块进行预测,得到第一个预测值;
    使用所述第二帧内预测模式对所述当前块进行预测,得到第二个预测值;
    根据所述第一幅度值和所述第二幅度值,确定所述第一个预测值的第一权重和所述第二个预测值的第二权重;
    根据所述第一个预测值和所述第二个预测值,以及所述第一权重和所述第二权重,确定所述当前块的目标预测值。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述第一幅度值和所述第二幅度值,确定所述第一个预测值的第一权重和所述第二个预测值的第二权重,包括:
    确定所述第一幅度值与所述第二幅度值的和值;
    将所述第一幅度值与所述和值的比值,确定为所述第一权重;
    将所述第二幅度值与所述和值的比值,确定为所述第二权重。
  9. 根据权利要求4所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值小于或等于所述第一预设阈值时,则使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的目标预测值。
  10. 根据权利要求9所述的方法,其特征在于,所述使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的目标预测值,包括:
    确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重;
    确定分别使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式对所述当前块进行预测时的预测值;
    根据所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的预测值和权重进行加权,得到所述当前块的目标预测值。
  11. 根据权利要求10所述的方法,其特征在于,所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重为预设权重。
  12. 根据权利要求10所述的方法,其特征在于,所述确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重,包括:
    将预设权重确定为所述第三帧内预测模式的权重;
    根据所述第一幅度值和所述第二幅度值,确定所述第一帧内预测模式和所述第二帧内预测模式分别对应的权重。
  13. 根据权利要求1-3任一项所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,包括:
    若所述第一帧内预测模式和所述第二帧内预测模式满足第一预设条件时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  14. 根据权利要求13所述的方法,其特征在于,所述方法还包括:
    若所述第一帧内预测模式和所述第二帧内预测模式不满足所述第一预设条件时,则使用所述第一帧内预测模式,确定所述当前块的目标预测值。
  15. 根据权利要求13或14所述的方法,其特征在于,所述第一预设条件为所述第一帧内预测模式和所述第二帧内预测模式均不是Planar以及DC模式,且所述第二帧内预测模式对应的第二幅度值不为零。
  16. 根据权利要求1-3任一项所述的方法,其特征在于,所述确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,包括:
    解码所述码流,得到解码端帧内模式导出DIMD使能标志,所述DIMD使能标志用于指示当前块是否使用所述DIMD技术;
    若所述DIMD使能标志指示所述当前块使用所述DIMD技术时,则确定当前块周围已重建区域对应的N个帧内预测模式的幅度值。
  17. 根据权利要求16所述的方法,其特征在于,所述若所述DIMD使能标志指示所述当前块使用所述DIMD技术时,则确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,包括:
    若所述DIMD使能标志指示所述当前块使用所述DIMD技术时,确定所述当前块的模板区域对应的N个帧内预测模式的幅度值。
  18. 根据权利要求1-3任一项所述的方法,其特征在于,所述根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,包括:
    将所述N个帧内预测模式中幅度值最大的帧内预测模式,确定为所述第一帧内预测模式;
    将所述N个帧内预测模式中幅度值次大的帧内预测模式,确定为所述第二帧内预测模式。
  19. 根据权利要求1-3任一项所述的方法,其特征在于,所述方法还包括:
    解码码流,得到第二标志,所述第二标志用于指示所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个确定目标预测值;
    若所述第二标志为真时,则确定当前块周围已重建区域对应的N个帧内预测模式的幅度值。
  20. 根据权利要求19所述的方法,其特征在于,所述第二标志为基于模板的帧内模式导出DIMD使能标志。
  21. 根据权利要求1-3任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块所在的当前帧的类型;
    若所述当前帧的类型为目标帧类型时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述目标帧类型包括I帧、P帧、B帧中的至少一个。
  22. 根据权利要求1-3任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块所在的当前帧的类型,以及所述当前块的大小;
    若所述当前帧的类型为第一帧类型,且所述当前块的大小大于第一阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件;
    若所述当前帧的类型为第二帧类型,且所述当前块的大小大于第二阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  23. 根据权利要求1-3任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块对应的量化参数;
    若所述量化参数小于第三阈值,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  24. 根据权利要求3所述的方法,其特征在于,所述第一标志复用所述当前序列的第三标志,所述第三标志为序列级的帧内块复制IBC使能标志或者为模板匹配预测TMP使能标志。
  25. 根据权利要求1-3任一项所述的方法,其特征在于,所述第三帧内预测模式为Planar模式。
  26. 一种帧内预测方法,其特征在于,包括:
    确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
    根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
    根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值;
    根据所述第一预测值,确定所述当前块的目标预测值。
  27. 根据权利要求26所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,包括:
    在所述当前块对应的图像内容为第一图像内容时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  28. 根据权利要求27所述的方法,其特征在于,所述方法还包括:
    将第一标志写入码流,所述第一标志用于指示是否使用第一技术,所述第一技术在所述第一图像内容下使用。
  29. 根据权利要求26-28任一项所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,包括:
    将所述第一帧内预测模式的第一幅度值与所述第二帧内预测模式的第二幅度值的比值小于或等于第一预设阈值, 确定所述当前块的加权融合条件。
  30. 根据权利要求29所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式,确定所述当前块的第一预测值。
  31. 根据权利要求29所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值大于所述第一预设阈值时,则使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的第一预测值。
  32. 根据权利要求31所述的方法,其特征在于,所述使用所述第一帧内预测模式和所述第二帧内预测模式,确定所述当前块的第一预测值,包括:
    确定使用所述第一帧内预测模式对所述当前块进行预测时的第一预测值;
    确定使用所述第二帧内预测模式对所述当前块进行预测时的第二预测值;
    根据所述第一幅度值和所述第二幅度值,确定所述第一预测值的第一权重和所述第二预测值的第二权重;
    根据所述第一预测值和所述第二预测值,以及所述第一权重和所述第二权重,确定所述当前块的第一预测值。
  33. 根据权利要求32所述的方法,其特征在于,所述根据所述第一幅度值和所述第二幅度值,确定所述第一预测值的第一权重和所述第二预测值的第二权重,包括:
    确定所述第一幅度值与所述第二幅度值的和值;
    将所述第一幅度值与所述和值的比值,确定为所述第一权重;
    将所述第二幅度值与所述和值的比值,确定为所述第二权重。
  34. 根据权利要求29所述的方法,其特征在于,所述根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值,包括:
    若所述第一幅度值与所述第二幅度值的比值小于或等于所述第一预设阈值时,则使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的第一预测值。
  35. 根据权利要求34所述的方法,其特征在于,所述使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式,确定所述当前块的第一预测值,包括:
    确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重;
    确定分别使用所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式对所述当前块进行预测时的预测值;
    根据所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的预测值和权重进行加权,得到所述当前块的第一预测值。
  36. 根据权利要求35所述的方法,其特征在于,所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重为预设权重。
  37. 根据权利要求35所述的方法,其特征在于,所述确定所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式分别对应的权重,包括:
    将预设权重确定为所述第三帧内预测模式的权重;
    根据所述第一幅度值和所述第二幅度值,确定所述第一帧内预测模式和所述第二帧内预测模式分别对应的权重。
  38. 根据权利要求26-28任一项所述的方法,其特征在于,所述根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,包括:
    若所述第一帧内预测模式和所述第二帧内预测模式满足第一预设条件时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  39. 根据权利要求38所述的方法,其特征在于,所述方法还包括:
    若所述第一帧内预测模式和所述第二帧内预测模式不满足所述第一预设条件时,则使用所述第一帧内预测模式,确定所述当前块的第一预测值。
  40. 根据权利要求38所述的方法,其特征在于,所述第一预设条件为所述第一帧内预测模式和所述第二帧内预测模式均不是Planar以及DC模式,且所述第二帧内预测模式对应的第二幅度值不为零。
  41. 根据权利要求26-28任一项所述的方法,其特征在于,所述确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,包括:
    获取解码端帧内模式导出DIMD允许使用标志,所述DIMD允许使用标志用于指示当前序列是否允许使用所述DIMD技术;
    若所述DIMD允许使用标志指示所述当前序列允许使用所述DIMD技术时,则确定所述N个帧内预测模式的幅度值。
  42. 根据权利要求41所述的方法,其特征在于,所述若所述DIMD允许使用标志指示所述当前序列允许使用所述DIMD技术时,则确定所述N个帧内预测模式的幅度值,包括:
    若所述DIMD允许使用标志指示所述当前序列允许使用所述DIMD技术时,在所述当前块的相邻重建样本区域中,确定所述N个帧内预测模式的幅度值。
  43. 根据权利要求42所述的方法,其特征在于,所述根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,包括:
    将所述N个帧内预测模式中幅度值最大的帧内预测模式,确定为所述第一帧内预测模式;
    将所述N个帧内预测模式中幅度值次大的帧内预测模式,确定为所述第二帧内预测模式。
  44. 根据权利要求41所述的方法,其特征在于,所述根据所述当前块的第一预测值,确定所述当前块的目标预测值,包括:
    根据所述第一预测值,确定所述第一预测值对应的第一编码代价;
    确定候选预测集中的各帧内预测模式对所述当前块进行预测时的第二编码代价;
    将所述第一编码代价和所述第二编码代价中最小编码代价对应的预测值,确定为所述当前块的目标预测值。
  45. 根据权利要求44所述的方法,其特征在于,所述候选预测集包括所述N个帧内预测模式中除所述第一帧内预测模式和所述第二帧内预测模式之外的其他帧内预测模式。
  46. 根据权利要求44所述的方法,其特征在于,所述方法还包括:
    若所述第一编码代价为所述第一编码代价和所述第二编码代价中的最小编码代价,则将第二标志置为真后写入码流;
    若所述第一编码代价不是所述第一编码代价和所述第二编码代价中的最小编码代价,则将所述第二标志置为假后写入码流;
    其中所述第二标志用于指示所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个确定目标预测值。
  47. 根据权利要求46所述的方法,其特征在于,所述第二标志为基于模板的解码端帧内模式导出DIMD使能标志。
  48. 根据权利要求26-28任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块所在的当前帧的类型;
    若所述当前帧的类型为目标帧类型时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述目标帧类型包括I帧、P帧、B帧中的至少一个。
  49. 根据权利要求26-28任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块所在的当前帧的类型,以及所述当前块的大小;
    若所述当前帧的类型为第一帧类型,且所述当前块的大小大于第一阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件;
    若所述当前帧的类型为第二帧类型,且所述当前块的大小大于第二阈值时,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  50. 根据权利要求26-28任一项所述的方法,其特征在于,所述方法还包括:
    确定所述当前块对应的量化参数;
    若所述量化参数小于第三阈值,则根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件。
  51. 根据权利要求28所述的方法,其特征在于,所述第一标志复用所述当前序列的第三标志,所述第三标志为序列级的帧内块复制IBC使能标志或者为模板匹配预测TMP使能标志。
  52. 根据权利要求26-28任一项所述的方法,其特征在于,所述第三帧内预测模式为Planar模式。
  53. 一种帧内预测装置,其特征在于,包括:
    解码单元,用于解码码流,确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
    确定单元,用于根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
    预测单元,用于根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的目标预测值。
  54. 一种帧内预测装置,其特征在于,包括:
    第一确定单元,用于确定当前块周围已重建区域对应的N个帧内预测模式的幅度值,并根据所述N个帧内预测模式的幅度值,确定当前块的第一帧内预测模式和第二帧内预测模式,所述N为大于1的正整数;
    第二确定单元,用于根据所述第一帧内预测模式和第二帧内预测模式的幅度值,确定所述当前块的加权融合条件,所述加权融合条件用于判断所述当前块是否通过所述第一帧内预测模式、所述第二帧内预测模式和第三帧内预测模式进行加权预测;
    预测单元,用于根据所述加权融合条件,以及所述第一帧内预测模式、所述第二帧内预测模式和所述第三帧内预测模式中的至少一个,确定所述当前块的第一预测值;并根据所述第一预测值,确定所述当前块的目标预测值。
  55. 一种视频解码器,其特征在于,包括处理器和存储器;
    所述存储器用于存储计算机程序;
    所述处理器用于调用并运行所述存储器中存储的计算机程序,以实现上述权利要求1至25任一项所述的方法。
  56. 一种视频编码器,其特征在于,包括处理器和存储器;
    所述存储器用于存储计算机程序;
    所述处理器用于调用并运行所述存储器中存储的计算机程序,以实现如上述权利要求26至52任一项所述的方法。
  57. 一种视频编解码系统,其特征在于,包括:视频编码器和视频解码器,所述视频解码器用于实现上述权利要求1至24任一项所述的方法,所述视频编码器用于实现如上述权利要求26至52任一项所述的方法。
  58. 一种计算机可读存储介质,其特征在于,用于存储计算机程序;
    所述计算机程序使得计算机执行如上述权利要求1至25或26至52任一项所述的方法。
  59. 一种码流,其特征在于,所述码流是通过如上述权利要求26至52任一项所述的方法生成的。
PCT/CN2021/142114 2021-12-28 2021-12-28 帧内预测方法、设备、系统、及存储介质 WO2023122969A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/142114 WO2023122969A1 (zh) 2021-12-28 2021-12-28 帧内预测方法、设备、系统、及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/142114 WO2023122969A1 (zh) 2021-12-28 2021-12-28 帧内预测方法、设备、系统、及存储介质

Publications (1)

Publication Number Publication Date
WO2023122969A1 true WO2023122969A1 (zh) 2023-07-06

Family

ID=86996932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/142114 WO2023122969A1 (zh) 2021-12-28 2021-12-28 帧内预测方法、设备、系统、及存储介质

Country Status (1)

Country Link
WO (1) WO2023122969A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200221084A1 (en) * 2017-06-19 2020-07-09 Lg Electronics Inc. Intra prediction mode based image processing method, and apparatus therefor
CN110741643A (zh) * 2017-07-11 2020-01-31 谷歌有限责任公司 用于视频代码化的复合帧内预测
CN112514378A (zh) * 2018-09-28 2021-03-16 Jvc建伍株式会社 图像解码装置、图像解码方法以及图像解码程序
CN113691809A (zh) * 2021-07-07 2021-11-23 浙江大华技术股份有限公司 帧内预测方法及编、解码方法、电子设备及存储介质

Similar Documents

Publication Publication Date Title
BR112021009848A2 (pt) codificador, decodificador e métodos correspondentes para predição inter
CN113748677A (zh) 编码器、解码器及对应的帧内预测方法
CN113170143B (zh) 一种编码器、解码器及去块滤波器的边界强度的对应推导方法
CN112954367B (zh) 使用调色板译码的编码器、解码器和相应方法
CN114885159B (zh) 位置相关预测组合的模式相关和大小相关块级限制的方法和装置
WO2020147782A1 (en) An encoder, a decoder and corresponding methods of deblocking filter adaptation
CN116405686A (zh) 图像重建方法和装置
CN113784126A (zh) 图像编码方法、装置、设备及存储介质
CN115695784A (zh) 对图像的块进行编码的方法,编码设备和计算机可读介质
CN114902661A (zh) 用于跨分量线性模型预测的滤波方法和装置
CN113330743A (zh) 编码器、解码器及去块效应滤波器自适应的对应方法
CN116567207B (zh) 用于帧内预测的方法和装置
CN114913249A (zh) 编码、解码方法和相关设备
CN113711601A (zh) 用于推导当前块的插值滤波器索引的方法和装置
CN113170118A (zh) 视频译码中进行色度帧内预测的方法及装置
WO2023044868A1 (zh) 视频编解码方法、设备、系统、及存储介质
CN114205582B (zh) 用于视频编解码的环路滤波方法、装置及设备
WO2023122969A1 (zh) 帧内预测方法、设备、系统、及存储介质
WO2023122968A1 (zh) 帧内预测方法、设备、系统、及存储介质
WO2023236113A1 (zh) 视频编解码方法、装置、设备、系统及存储介质
WO2023184250A1 (zh) 视频编解码方法、装置、设备、系统及存储介质
WO2023220946A1 (zh) 视频编解码方法、装置、设备、系统及存储介质
WO2022155922A1 (zh) 视频编解码方法与系统、及视频编码器与视频解码器
WO2023220970A1 (zh) 视频编码方法、装置、设备、系统、及存储介质
WO2024007128A1 (zh) 视频编解码方法、装置、设备、系统、及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21969348

Country of ref document: EP

Kind code of ref document: A1