WO2021238546A1 - Video encoding method, video playback method, related device, and medium - Google Patents

Video encoding method, video playback method, related device, and medium

Info

Publication number
WO2021238546A1
Authority
WO
WIPO (PCT)
Prior art keywords
mode
prediction
prediction mode
target
information set
Application number
PCT/CN2021/089770
Other languages
English (en)
French (fr)
Inventor
张清
王诗涛
刘杉
Original Assignee
腾讯科技(深圳)有限公司
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2021238546A1
Priority to US17/719,691 (published as US20220239904A1)

Classifications

    • H04N: Pictorial communication, e.g. television; H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/176: adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/105: selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/107: selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/109: selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/14: coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/154: measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/17: adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/46: embedding additional information in the video signal during the compression process
    • H04N19/51: motion estimation or motion compensation
    • H04N19/52: processing of motion vectors by encoding by predictive encoding
    • H04N19/587: predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H04N19/96: tree coding, e.g. quad-tree coding

Definitions

  • This application relates to the field of Internet technology, in particular to the field of image processing technology, and more particularly to a video encoding method, a video playback method, a video encoding apparatus, a video playback apparatus, a video encoding device, a video playback device, and a computer storage medium.
  • Video coding usually divides the image to be encoded into multiple image blocks, and obtains the bitstream data of the image to be encoded by encoding each image block.
  • The embodiments of the present invention provide a video encoding method, a video playback method, related devices, and media.
  • A video encoding method includes: acquiring a target prediction unit in a target image block and a mode information set of the target prediction unit, the mode information set including multiple candidate prediction modes and the mode costs of the candidate prediction modes; performing abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain a detection result corresponding to the at least one candidate prediction mode; calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result to obtain a calibrated mode information set; selecting a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set; and using the target prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block.
  • a video encoding device includes:
  • An obtaining unit configured to obtain a target prediction unit in a target image block and a mode information set of the target prediction unit, the mode information set including multiple candidate prediction modes and mode costs of various candidate prediction modes;
  • An encoding unit configured to perform abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set, to obtain a detection result corresponding to the at least one candidate prediction mode;
  • the encoding unit is further configured to calibrate the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode, to obtain a calibrated mode information set;
  • the encoding unit is also configured to select a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set;
  • the encoding unit is further configured to use the target prediction mode to perform prediction processing on the target prediction unit to obtain the coded data of the target image block.
  • A computer device includes a memory and one or more processors; the memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the video encoding method described above.
  • One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the above video encoding method.
  • A computer program product or computer program includes computer-readable instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer-readable instructions from the computer-readable storage medium and executes them, so that the computer device performs the steps of the above video encoding method.
  • A video playback method, executed by a video playback device, includes: acquiring the bitstream data of each frame image in the image frame sequence corresponding to a target video, where the bitstream data of each frame image includes the coded data of multiple image blocks, and the coded data of each image block in the frame images other than the first frame image in the image frame sequence is obtained by the above video encoding method; decoding the bitstream data of each frame image to obtain each frame image; and sequentially displaying the frame images in the playback interface.
  • a video playback device includes:
  • an acquiring unit, configured to acquire the bitstream data of each frame image in the image frame sequence corresponding to the target video, where the bitstream data of each frame image includes the coded data of multiple image blocks, and the coded data of each image block in the frame images other than the first frame image in the image frame sequence is obtained by the above video encoding method;
  • a decoding unit, configured to decode the bitstream data of each frame image to obtain each frame image;
  • a display unit, configured to sequentially display the frame images in the playback interface.
  • A computer device includes a memory and one or more processors; the memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the above video playback method.
  • One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the above video playback method.
  • A computer program product or computer program includes computer-readable instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer-readable instructions from the computer-readable storage medium and executes them, so that the computer device performs the steps of the above video playback method.
  • FIG. 1a is a schematic diagram of the architecture of an image processing system provided by an embodiment of the present invention.
  • FIG. 1b is a schematic diagram of an image processing flow provided by an embodiment of the present invention.
  • FIG. 1c is a schematic diagram of dividing a frame image into image blocks according to an embodiment of the present invention.
  • FIG. 1d is a schematic diagram of the division of an inter prediction mode according to an embodiment of the present invention.
  • FIG. 1e is a schematic diagram of encoding a frame image according to an embodiment of the present invention.
  • FIG. 1f is another schematic diagram of encoding a frame image provided by an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a video encoding method provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a video encoding method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a video playback method provided by an embodiment of the present invention.
  • FIG. 5 is an application scene diagram of a video encoding method and a video playback method provided by an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a video encoding device provided by an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a video encoding device provided by an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a video playback device provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a video playback device provided by an embodiment of the present invention.
  • An embodiment of the present invention involves an image processing system; as shown in FIG. 1a, the image processing system at least includes a video encoding device 11, a video playback device (video decoding device) 12, and a transmission medium 13.
  • The video encoding device 11 may be a server or a terminal. The server here may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms.
  • The terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer, and so on.
  • the video encoding device 11 may at least include an encoder, and the encoder is used to execute a series of encoding procedures.
  • The video playback device 12 may be any device with a video playback function, such as a terminal (for example, a smart phone, a tablet computer, a notebook computer, or a smart watch), or a projector or other device that can project video images onto a screen for playback.
  • the video playback device 12 may at least include a decoder, and the decoder is used to execute a series of decoding processes.
  • The transmission medium 13 refers to the space or entity through which data is transmitted, and is mainly used to transmit data between the video encoding device 11 and the video playback device 12; it may include, but is not limited to, network media such as mobile networks, wireless networks, and wired networks, or removable hardware media with read and write functions such as USB (Universal Serial Bus) flash drives and portable hard drives.
  • FIG. 1a only exemplarily represents the architecture of the image processing system involved in the embodiment of the present invention, and does not limit the specific architecture of the image processing system.
  • For example, in other embodiments, the video encoding device 11 and the video playback device 12 may be the same device; for another example, the number of video playback devices 12 may not be limited to one; and so on.
  • the encoder in the video encoding device 11 may encode the current frame image based on the mainstream video encoding standard to obtain the bitstream data of the current frame image.
  • Mainstream video coding standards may include, but are not limited to, H.264, H.265, VVC (Versatile Video Coding), AVS3 (Audio Video coding Standard 3), and so on.
  • H.265 is also called HEVC (High Efficiency Video Coding).
  • the general coding process is as follows:
  • (1) The current frame image to be encoded may be divided into several image blocks (also called CUs (Coding Units)), where an image block is the basic unit of video coding.
  • Specifically, the current frame image to be encoded may be divided into a number of non-overlapping LCUs (Largest Coding Units); then, according to the characteristics of each LCU, the corresponding LCU can be further divided to obtain several CUs, as shown in FIG. 1c. It should be understood that FIG. 1c is only an exemplary representation of the LCU division method and does not limit it; FIG. 1c shows the LCU being equally divided into multiple CUs, but in practice the LCU can also be non-equally divided into multiple CUs.
  • Each CU can correspond to a prediction mode of one mode type, such as an inter prediction mode (Inter mode) and an intra prediction mode (Intra mode).
  • The intra prediction mode mainly searches for a reference image block among the encoded image blocks in the current frame image and uses the decoding information of the reference image block for prediction; the intra prediction mode needs to transmit the corresponding prediction mode and residual information to the decoder.
  • The inter prediction mode is mainly based on motion estimation (ME) and motion compensation (MC) of the current image block: it searches the encoded frame images in the image frame sequence for a reference image block matching the current image block, and uses the decoding information of the reference image block to perform prediction based on an MV (Motion Vector).
  • the inter prediction mode may include at least AMVP (Advanced Motion Vector Prediction, Advanced Motion Vector Prediction) mode and Merge mode.
  • The AMVP mode needs to transmit the motion vector data (MVD, Motion Vector Data) of the current image block, the index information of the reference image block, and the residual information of the current image block to the decoder; the MVD here refers to the difference between the motion vector obtained by motion estimation (ME) and the MVP (predicted motion vector).
  • the Merge mode does not need to transmit the motion vector data (MVD) of the current image block, and it can be subdivided into ordinary Merge mode and SKIP mode.
  • the SKIP mode here is a special case of the Merge mode.
  • The difference between the SKIP mode and the ordinary Merge mode is that the ordinary Merge mode needs to transmit the index information of the reference image block and the residual information of the current image block to the decoder, while the SKIP mode only needs to transmit the index information of the reference image block to the decoder, without transmitting the residual information of the current image block.
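  • For illustration only, the following C++ sketch summarizes which information each inter prediction mode transmits according to the description above; the struct and constant names are hypothetical and are not defined by the patent.

```cpp
// Illustrative summary (not text from the patent) of the syntax elements each
// inter prediction mode transmits. Real codecs signal these through
// standardized bitstream syntax; this struct only mirrors the prose above.
struct InterModeSignaling {
    bool sends_mvd;        // motion vector data (MVD)
    bool sends_ref_index;  // index information of the reference image block
    bool sends_residual;   // residual information of the current image block
};

// AMVP: MVD + reference index + residual.
constexpr InterModeSignaling kAmvpMode{true, true, true};
// Ordinary Merge: reference index + residual (the MV is derived from candidates).
constexpr InterModeSignaling kMergeMode{false, true, true};
// SKIP: reference index only; the MV is derived and no residual is transmitted.
constexpr InterModeSignaling kSkipMode{false, true, false};
```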
  • (2) The prediction mode is used to predict the pixel value of each pixel in the current image block to obtain the prediction block corresponding to the current image block; the prediction block includes the predicted value of each pixel.
  • the current image block can be further divided into one or more prediction units (PU), and a mode decision is made to dynamically determine the prediction mode of each prediction unit corresponding to the current image block according to the characteristics of the input signal.
  • Specifically, the mode type, such as the inter prediction type or the intra prediction type, may be determined first according to the characteristics of the current image block; then the corresponding prediction mode is selected from the prediction modes of that mode type according to the characteristics of each prediction unit.
  • If the determined mode type is the intra prediction type, the prediction modes of the prediction units corresponding to the current image block are all intra prediction modes; if the determined mode type is the inter prediction type, the prediction mode of each prediction unit corresponding to the current image block may be the AMVP mode, the ordinary Merge mode, or the SKIP mode; in this case, the prediction modes of the prediction units corresponding to the current image block may be the same or different.
  • the corresponding prediction mode can be used to perform prediction processing on each prediction unit to obtain the prediction result of each prediction unit. Then, the prediction result of each prediction unit is combined to obtain the prediction block corresponding to the current image block.
  • (3) The residual block of the current image block is calculated according to the prediction block and the current image block, where the residual block includes the difference between the predicted value and the actual pixel value of each pixel in the current image block; then transformation, quantization, and entropy coding are performed sequentially on the residual block to obtain the coded data of the current image block.
  • Steps (2) and (3) of the above encoding process are executed iteratively until every image block in the current frame image has been encoded; at this point, the coded data of each image block contained in the current frame image is obtained, and thus the bitstream data of the current frame image is obtained.
  • The video encoding device 11 obtains the code stream data and encoding information of the current frame image after the foregoing encoding stage and transmits them to the video playback device 12, so that the video playback device 12 uses the encoding information to decode the encoded data of each image block through the decoding stage and obtain the current frame image.
  • the code stream data includes the encoded data of each image block in the current frame image;
  • The encoding information includes at least the transmission information specified by the prediction mode used when predicting the prediction units of each image block in the current frame image, such as the transmission information specified by the AMVP mode (the motion vector data of the current image block, the index information of the reference image block, and the residual information of the current image block), the transmission information specified by the SKIP mode (the index information of the reference image block), and so on.
  • After receiving the code stream data and coding information of the current frame image, the video playback device 12 can sequentially decode the coded data of each image block in the code stream data according to the coding information.
  • the decoding process for any image block is specifically as follows: decoding, inverse quantization, and inverse transform processing are sequentially performed on the encoded data of the current image block to obtain the residual block of the current image block. Then, the prediction mode used by each prediction unit of the current image block can be determined according to the transmission information corresponding to the current image block in the coding information, and the image block can be obtained according to the determined prediction mode and residual block.
  • After the coded data of all image blocks have been decoded, each image block of the current frame image can be obtained, thereby obtaining the current frame image.
  • After obtaining the current frame image, the video playback device 12 can display the current frame image on the playback interface.
  • The mode decision process involved in the encoding stage usually involves multiple prediction modes. If the prediction mode selected for a prediction unit in the mode decision process is not appropriate, and the quantization parameter (QP) used in the transformation and quantization of the residual block is large, the encoded image block is likely to contain abnormal distortion points, such as pixels with a distortion as high as 100 or more, which in turn causes dirty points to appear in the image blocks decoded in the decoding stage and affects the subjective quality of the image blocks and frame images, as shown in FIG. 1e. Based on this, the embodiment of the present invention proposes a video coding scheme.
  • The video coding scheme is mainly used to guide the encoder to make mode decisions during the encoding process and, by selecting an appropriate prediction mode for the prediction unit, to reduce the probability that the encoded image block contains abnormal distortion points. An abnormal distortion point here refers to a pixel where the absolute value of the difference between the pixel value obtained by decoding and the pixel value before encoding is greater than a certain threshold.
  • the scheme principle of the video coding scheme is roughly as follows:
  • The target frame image can be divided into one or more image blocks, an image block can be selected from the target frame image as the target image block to be encoded, and the target image block can then be further divided into one or more prediction units.
  • For any prediction unit, the mode cost of each prediction mode can be obtained first, and it can be detected whether the prediction unit has abnormal distortion points in at least one prediction mode. If it does not, a mode decision algorithm is used to select a prediction mode for the prediction unit from the multiple prediction modes; the mode decision algorithm here is used to select the prediction mode with the smallest mode cost. If it does, the mode decision algorithm is adjusted.
  • Adjusting the mode decision algorithm here means first calibrating the mode cost of the prediction mode corresponding to the abnormal distortion points, and then selecting a prediction mode for the prediction unit according to the calibrated mode cost of the calibrated prediction mode and the mode costs of the uncalibrated prediction modes. After a prediction mode is selected for the prediction unit, the selected prediction mode can be used to perform prediction processing on the prediction unit, and the above steps are iterated to perform prediction processing on each prediction unit in the target image block, thereby obtaining the coded data of the target image block. After the encoded data of the target image block is obtained, another image block can be selected from the target frame image as a new target image block, and the above steps are performed to obtain the encoded data of the new target image block. After all image blocks are encoded, the code stream data of the target frame image can be obtained.
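  • To make the flow concrete, the following C++ sketch (not part of the patent) walks through the adjusted mode decision for one prediction unit, assuming the mode costs and abnormal-distortion detection flags are already available; the amplification-based calibration shown here is only one of the penalties described later, and all names are illustrative.

```cpp
// Minimal sketch of the scheme principle: if no candidate mode triggered an
// abnormal distortion point, the ordinary decision (smallest cost) is used;
// otherwise the affected costs are calibrated (amplified) before comparing.
// Assumes the candidate list is non-empty.
#include <cstddef>
#include <vector>

struct ModeInfo {
    double cost;    // mode cost from the cost function
    bool abnormal;  // detection result for this prediction unit
};

// Returns the index of the prediction mode selected for this prediction unit.
int DecidePredictionMode(std::vector<ModeInfo> modes, double penalty_factor) {
    bool any_abnormal = false;
    for (const ModeInfo& m : modes) any_abnormal = any_abnormal || m.abnormal;
    if (any_abnormal) {
        // Adjusted mode decision: calibrate the costs of the modes that
        // produced abnormal distortion points, then compare as usual.
        for (ModeInfo& m : modes) {
            if (m.abnormal) m.cost *= (1.0 + penalty_factor);
        }
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < modes.size(); ++i) {
        if (modes[i].cost < modes[best].cost) best = i;
    }
    return static_cast<int>(best);
}
```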
  • When a frame image is encoded with the video encoding scheme of the embodiment of the present invention, an encoded frame image such as the one shown in the lower diagram of FIG. 1f can be obtained.
  • It can be seen that the coding scheme proposed by the embodiment of the present invention adds an abnormal distortion point detection mechanism to the mode decision process, so that the mode decision process can be corrected based on the detection result of abnormal distortion points, which effectively reduces the number of abnormal distortion points generated in the target frame image after encoding and improves the subjective quality of the target frame image.
  • the target prediction unit may be detected for abnormal distortion points in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode.
  • The mode cost of at least one candidate prediction mode in the mode information set can then be calibrated according to the detection result corresponding to the at least one candidate prediction mode, so that the mode cost of each candidate prediction mode in the calibrated mode information set more accurately reflects the code rate and distortion of the corresponding candidate prediction mode; thus, according to the mode cost of each candidate prediction mode in the calibrated mode information set, a target prediction mode that is better suited to the target prediction unit can be selected from the multiple candidate prediction modes.
  • a suitable target prediction mode can be used to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, which can reduce the probability of distortion of the target image block after encoding to a certain extent.
  • Since the embodiment of the present invention reduces the probability of distortion mainly by modifying the mode decision process to select a suitable target prediction mode, it can effectively improve the image compression quality and the subjective quality of the target image block without substantially affecting the compression efficiency or coding complexity.
  • The embodiment of the present invention may first obtain the code stream data of each frame image in the image frame sequence corresponding to the target video, where the code stream data of each frame image includes the coded data of multiple image blocks. Secondly, the code stream data of each frame image can be decoded to obtain each frame image, and the frame images can be displayed in sequence in the playback interface. Since the coded data of each image block in the frame images other than the first frame image in the image frame sequence corresponding to the target video is obtained by the above video encoding method, the probability of distortion of each image block can be effectively reduced; therefore, when each frame image is displayed in the playback interface, the probability of dirty points appearing in each frame image can be reduced to a certain extent, and the subjective quality of each frame image can be improved.
  • an embodiment of the present invention proposes a video encoding method; the video encoding method may be executed by the above-mentioned video encoding device, and specifically may be executed by an encoder in the video encoding device.
  • the video encoding method may include the following steps S201-S205:
  • S201 Acquire a target prediction unit and a mode information set of the target prediction unit in a target image block.
  • the target prediction unit may be any prediction unit in the target image block.
  • The mode information set of the target prediction unit may include multiple candidate prediction modes and the mode costs of the candidate prediction modes. The mode cost here is used to reflect the bit rate and distortion caused by using the corresponding candidate prediction mode to predict the target prediction unit, and may include, but is not limited to, a rate-distortion cost.
  • the multiple candidate prediction modes may include at least: intra prediction mode and inter prediction mode; the inter prediction mode may include at least the following modes: a first prediction mode, a second prediction mode, and a third prediction mode.
  • The so-called first prediction mode refers to the mode in which only the index information of the reference image block related to the target image block needs to be transmitted, which can specifically be the SKIP mode mentioned above.
  • The second prediction mode refers to the mode in which the residual information of the target image block and the index information of the reference image block related to the target image block need to be transmitted, which can specifically be the ordinary Merge mode mentioned above.
  • The third prediction mode refers to the mode in which the residual information of the target image block, the motion vector data of the target image block, and the index information of the reference image block related to the target image block need to be transmitted, which can specifically be the aforementioned AMVP mode.
  • S202 Perform abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set, to obtain a detection result corresponding to the at least one candidate prediction mode.
  • In a specific implementation, abnormal distortion point detection can be performed on the target prediction unit in each candidate prediction mode in the mode information set to obtain the detection result corresponding to each candidate prediction mode; that is, in this implementation, the at least one candidate prediction mode may include the intra prediction mode and the inter prediction modes.
  • the detection result of each candidate prediction mode can be used to indicate whether the target prediction unit has abnormal distortion points in the candidate prediction mode.
  • For ease of description, a reference prediction mode is taken as an example, where the reference prediction mode may be any candidate prediction mode.
  • The reference prediction mode is used to predict the pixel value of each pixel in the target prediction unit to obtain the predicted value of each pixel, and the absolute value of the residual between the pixel value and the predicted value of each pixel in the target prediction unit is calculated. If there are pixels in the target prediction unit whose residual absolute value is greater than the target threshold, the detection result corresponding to the reference prediction mode is determined to indicate that the target prediction unit has abnormal distortion points in the reference prediction mode; if there are no pixels in the target prediction unit whose residual absolute value is greater than the target threshold, the detection result corresponding to the reference prediction mode is determined to indicate that the target prediction unit does not have abnormal distortion points in the reference prediction mode.
  • The probability of generating abnormal distortion points when predicting with the intra prediction mode is small, while the probability of generating abnormal distortion points when predicting with the inter prediction modes is larger, especially for the SKIP mode among the inter prediction modes.
  • When the SKIP mode is used for prediction, the MV is derived from other reference image blocks and no residual information is transmitted; therefore, although the SKIP mode can greatly save bit rate and improve coding efficiency, in some special scenarios (such as screen sharing scenes and live video streaming scenes) it tends to cause excessive local distortion, which increases the probability of abnormal distortion points in the target image block.
  • Therefore, the embodiment of the present invention can perform abnormal distortion point detection on the target prediction unit only in each of the modes included in the inter prediction modes, and obtain the detection result corresponding to each such mode; that is, the at least one candidate prediction mode may be the inter prediction modes.
  • In this way, the operations of detecting abnormal distortion points on the target prediction unit in the intra prediction mode are reduced, which can effectively save processing resources and improve the encoding speed.
  • As mentioned above, an abnormal distortion point refers to a pixel whose absolute difference between the pixel value obtained by decoding and the pixel value before encoding is greater than a certain threshold. Therefore, in one embodiment, the embodiment of the present invention may use the difference between the predicted value of a pixel and its actual pixel value (that is, the pixel value before encoding) to determine whether the pixel is an abnormal pixel.
  • In this case, any candidate prediction mode can be used to predict the pixel value of each pixel in the target prediction unit; if the difference between the predicted value and the actual pixel value of at least one pixel is large, it can be determined that the target prediction unit has abnormal distortion points in that candidate prediction mode; if the difference between the predicted value and the actual pixel value of every pixel is small, it can be determined that the target prediction unit does not have abnormal distortion points in that candidate prediction mode.
  • In another embodiment, when the at least one candidate prediction mode is an inter prediction mode, whether a pixel is an abnormal pixel can also be determined according to the difference between the motion compensation value of the pixel and its actual pixel value (that is, the pixel value before encoding), so as to improve the accuracy of the detection result; the motion compensation value here is equal to the sum of the predicted value of the pixel and the residual obtained after inverse transformation and dequantization of the residual information. It should be noted that, since the first prediction mode does not transmit residual information, the motion compensation value and the predicted value of a pixel in the first prediction mode are equal.
  • In this case, the detection principle for detecting abnormal distortion points of the target prediction unit in any of the inter prediction modes can be as follows: that mode can be used to predict the pixel value of each pixel in the target prediction unit, and the motion compensation value of each pixel is calculated according to the predicted value and the residual information of each pixel; if the difference between the motion compensation value and the actual pixel value of at least one pixel is large, it can be determined that the target prediction unit has abnormal distortion points in that mode; if the difference between the motion compensation value and the actual pixel value of every pixel is small, it can be determined that the target prediction unit does not have abnormal distortion points in that mode.
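  • For illustration, the following C++ sketch (not from the patent) checks a prediction unit for abnormal distortion points under the assumptions above: the reconstruction compared against the actual pixel is the prediction plus the dequantized residual, which reduces to the prediction itself for the first prediction mode; the example threshold value and all names are assumptions.

```cpp
// Sketch of the abnormal distortion point check: a pixel is abnormal when the
// absolute difference between its actual value and its reconstruction exceeds
// the target threshold. For SKIP, pass a zero residual so the reconstruction
// equals the prediction.
#include <cstddef>
#include <cstdlib>
#include <vector>

bool HasAbnormalDistortionPoint(const std::vector<int>& actual_pixels,
                                const std::vector<int>& predicted_pixels,
                                const std::vector<int>& dequantized_residual,
                                int target_threshold /* e.g. 30 for 8-bit video */) {
    for (std::size_t i = 0; i < actual_pixels.size(); ++i) {
        // Motion compensation value; equals the prediction when the residual is 0.
        const int reconstructed = predicted_pixels[i] + dequantized_residual[i];
        if (std::abs(actual_pixels[i] - reconstructed) > target_threshold) {
            return true;  // at least one abnormal distortion point found
        }
    }
    return false;
}
```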
  • S203 Calibrate the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode, to obtain a calibrated mode information set.
  • In a specific implementation, the detection results of the candidate prediction modes detected in the at least one candidate prediction mode may be traversed sequentially.
  • The mode cost of the currently traversed candidate prediction mode in the mode information set can be calibrated according to the detection result of the currently traversed candidate prediction mode. Specifically, if the detection result of the currently traversed candidate prediction mode indicates that the target prediction unit does not have abnormal distortion points in the currently traversed candidate prediction mode, the mode cost of the currently traversed candidate prediction mode is kept unchanged in the mode information set; that is, in this case, the calibrated mode cost of the currently traversed candidate prediction mode is the same as its mode cost before calibration.
  • If the detection result of the currently traversed candidate prediction mode indicates that the target prediction unit has abnormal distortion points in the currently traversed candidate prediction mode, at least one of the following penalty processes can be performed on the mode cost of the currently traversed candidate prediction mode in the mode information set: amplifying the mode cost of the currently traversed candidate prediction mode, or adding a forbidden flag to the currently traversed candidate prediction mode; that is, in this case, the calibrated mode cost of the currently traversed candidate prediction mode and its mode cost before calibration may be the same or different.
  • The above traversal steps are iterated until all the candidate prediction modes detected in step S202 have been traversed, and a calibrated mode information set is then obtained; the calibrated mode information set includes the calibrated mode costs of the candidate prediction modes detected in step S202 and the mode costs of the candidate prediction modes that were not detected in step S202.
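  • The calibration traversal can be sketched as follows (illustrative C++, not the patent's text), assuming the detection results are available as per-mode flags; either penalty, cost amplification or the forbidden flag, or both, can be applied.

```cpp
// Sketch of the calibration traversal: modes whose detection result reports
// abnormal distortion points are penalized; all other mode costs are copied
// unchanged into the calibrated mode information set.
#include <cstddef>
#include <vector>

struct CalibratedEntry {
    double cost;     // mode cost, possibly amplified
    bool forbidden;  // forbidden flag added during calibration
};

std::vector<CalibratedEntry> CalibrateModeInfoSet(
        const std::vector<double>& mode_cost,
        const std::vector<bool>& has_abnormal_point,
        double penalty_factor,
        bool also_add_forbidden_flag) {
    std::vector<CalibratedEntry> calibrated;
    for (std::size_t i = 0; i < mode_cost.size(); ++i) {
        CalibratedEntry entry{mode_cost[i], false};
        if (has_abnormal_point[i]) {
            entry.cost *= (1.0 + penalty_factor);       // penalty 1: amplified cost
            entry.forbidden = also_add_forbidden_flag;  // penalty 2: forbidden flag
        }
        calibrated.push_back(entry);
    }
    return calibrated;
}
```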
  • S204 Select a target prediction mode from multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set.
  • the candidate prediction mode with the smallest mode cost in the calibrated mode information set can be selected as the target prediction mode.
  • If the candidate prediction mode with the smallest mode cost in the calibrated mode information set has a forbidden flag, the candidate prediction mode with the second smallest mode cost in the calibrated mode information set can be selected as the target prediction mode.
  • If the candidate prediction mode with the second smallest mode cost in the calibrated mode information set also has a forbidden flag, the candidate prediction mode with the third smallest mode cost in the calibrated mode information set can be selected as the target prediction mode, and so on by analogy.
  • In another embodiment, alternate prediction modes can first be selected from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set; an alternate prediction mode here refers to a candidate prediction mode whose mode cost in the calibrated mode information set is greater than a cost threshold, and the cost threshold can be set according to an empirical value. Then, one alternate prediction mode can be randomly selected from the selected alternate prediction modes as the target prediction mode.
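  • As an illustration of the selection rule of step S204, the following C++ sketch (not part of the patent) picks the candidate with the smallest calibrated cost while skipping candidates that carry a forbidden flag, which yields the second or third smallest cost when the earlier candidates are forbidden; names are illustrative.

```cpp
// Sketch of step S204: smallest calibrated cost wins, forbidden modes are
// skipped. Returns -1 if every candidate is forbidden (a caller could then
// fall back to the alternate-mode selection described above).
#include <cstddef>
#include <vector>

int PickTargetMode(const std::vector<double>& calibrated_cost,
                   const std::vector<bool>& forbidden) {
    int best = -1;
    double best_cost = 0.0;
    for (std::size_t i = 0; i < calibrated_cost.size(); ++i) {
        if (forbidden[i]) continue;
        if (best < 0 || calibrated_cost[i] < best_cost) {
            best = static_cast<int>(i);
            best_cost = calibrated_cost[i];
        }
    }
    return best;
}
```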
  • S205 Perform prediction processing on the target prediction unit using the target prediction mode to obtain coded data of the target image block.
  • In a specific implementation, the target prediction mode can be used to predict the pixel value of each pixel in the target prediction unit to obtain the prediction result of the target prediction unit; the prediction result of the target prediction unit here may include the predicted value of each pixel in the target prediction unit.
  • Then, the residual block of the target image block can be calculated according to the prediction result; finally, the residual block is sequentially transformed, quantized, and entropy coded to obtain the coded data of the target image block.
  • the target prediction unit may be detected for abnormal distortion points in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode.
  • The mode cost of at least one candidate prediction mode in the mode information set can then be calibrated according to the detection result corresponding to the at least one candidate prediction mode, so that the mode cost of each candidate prediction mode in the calibrated mode information set more accurately reflects the code rate and distortion of the corresponding candidate prediction mode; thus, according to the mode cost of each candidate prediction mode in the calibrated mode information set, a target prediction mode that is better suited to the target prediction unit can be selected from the multiple candidate prediction modes.
  • a suitable target prediction mode can be used to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, which can reduce the probability of distortion of the target image block after encoding to a certain extent.
  • Since the embodiment of the present invention reduces the probability of distortion mainly by modifying the mode decision process to select a suitable target prediction mode, it can effectively improve the image compression quality and the subjective quality of the target image block without substantially affecting the compression efficiency or coding complexity.
  • FIG. 3 is a schematic flowchart of another video encoding method provided by an embodiment of the present invention.
  • the video encoding method may be executed by the video encoding device mentioned above, and specifically may be executed by an encoder in the video encoding device.
  • the video encoding method may include the following steps S301-S307:
  • S301 Acquire a target prediction unit and a mode information set of the target prediction unit in a target image block.
  • In a specific implementation, the target image block can be divided into at least one prediction unit; then a prediction unit that has not undergone prediction processing is selected from the at least one prediction unit as the target prediction unit. After the target prediction unit is determined, the mode information set matching the target prediction unit can also be obtained. From the foregoing, the mode information set of the target prediction unit includes multiple candidate prediction modes and the mode costs of the candidate prediction modes. Accordingly, the specific implementation for obtaining the mode information set matching the target prediction unit may include the following steps.
  • First, multiple candidate prediction modes that match the target prediction unit can be determined. Specifically, it can be detected whether the target image block belongs to an I Slice (Intra Slice). An I Slice usually only includes I macroblocks, and I macroblocks can only use the encoded pixels in the current frame image as a reference for intra prediction; therefore, if the target image block belongs to an I Slice, the intra prediction mode can be used directly to perform prediction processing on the target prediction unit, and the subsequent steps are not performed. If the target image block does not belong to an I Slice, it means that the target prediction unit can be predicted in either the intra prediction mode or the inter prediction modes; therefore, each of the intra prediction mode and the inter prediction modes can be selected as the multiple candidate prediction modes that match the target prediction unit.
  • Secondly, a cost function can be used to calculate the mode cost of each candidate prediction mode. The cost function here may include, but is not limited to, an RDO (Rate-Distortion Optimized) mode cost function, such as the cost function shown in formula 1.1 below, or a non-RDO mode cost function, such as the cost function shown in formula 1.2 below, and so on. Then, the calculated mode cost of each candidate prediction mode and the corresponding candidate prediction mode can be added to the mode information set.
  • In formula 1.1, cost represents the mode cost of the candidate prediction mode; HAD represents the sum of the absolute values of the coefficients of the residual signal of the target prediction unit after the Hadamard transform; λ represents the Lagrangian coefficient; and R represents the number of bits required to encode the candidate prediction mode (that is, the bit rate).
  • In formula 1.2, cost still represents the mode cost of the candidate prediction mode; SAD represents the sum of absolute differences between the target prediction unit and the prediction result obtained by using the candidate prediction mode to predict the pixel values of the pixels in the target prediction unit; 4R represents the estimated number of bits after using the candidate prediction mode; and λ(QP) represents an exponential function related to the quantization parameter (QP).
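  • Since the formulas themselves are not reproduced in this text, the following C++ sketch assumes the conventional forms suggested by the listed terms, namely cost = HAD + λ·R for formula 1.1 and cost = SAD + 4R·λ(QP) for formula 1.2; treat it as an illustrative stand-in rather than the patent's exact definition.

```cpp
// Hedged sketch of the two cost functions referenced above; the exact
// formulas are not given here, so these are assumed conventional forms.

// Formula 1.1 (RDO-style): had is the sum of absolute Hadamard-transformed
// residual coefficients, lambda the Lagrangian coefficient, bits the rate R.
double RdoModeCost(double had, double lambda, double bits) {
    return had + lambda * bits;
}

// Formula 1.2 (non-RDO): sad is the sum of absolute differences between the
// prediction and the target prediction unit, estimated_bits the "4R" term,
// lambda_qp the exponential function of the quantization parameter QP
// (its exact form is not specified in this text).
double NonRdoModeCost(double sad, double estimated_bits, double lambda_qp) {
    return sad + estimated_bits * lambda_qp;
}
```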
  • S302 Perform abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set, to obtain a detection result corresponding to the at least one candidate prediction mode.
  • Take the case where the at least one candidate prediction mode is the inter prediction modes as an example; the inter prediction modes include at least the following modes: the first prediction mode (SKIP mode), the second prediction mode (ordinary Merge mode), and the third prediction mode (AMVP mode). Since the principle of detecting abnormal distortion points of the target prediction unit in each of the inter prediction modes is similar, for ease of explanation, the following uses a reference prediction mode as an example to describe the implementation of step S302, where the reference prediction mode is any one of the inter prediction modes; that is, the reference prediction mode may be the first prediction mode, the second prediction mode, or the third prediction mode.
  • In the detection, DIFF(x, y) represents the residual (that is, the difference) between the actual pixel value and the predicted value of the pixel at position (x, y) in the target prediction unit, ABS represents taking the absolute value, and TH represents the target threshold. If the absolute value of the residual between the actual pixel value and the predicted value of the pixel at position (x, y) in the target prediction unit is greater than the target threshold, that is, ABS(DIFF(x, y)) > TH, the pixel can be determined to be an abnormal pixel; otherwise, the pixel can be determined to be a normal pixel. Based on this, in the process of specifically performing step S302, the reference prediction mode may be used to predict the pixel value of each pixel in the target prediction unit to obtain the predicted value of each pixel.
  • Then, the absolute value of the residual between the pixel value and the predicted value of each pixel in the target prediction unit can be calculated. If there are pixels in the target prediction unit whose residual absolute value is greater than the target threshold, the detection result corresponding to the reference prediction mode is determined to indicate that the target prediction unit has abnormal distortion points in the reference prediction mode; if there are no pixels in the target prediction unit whose residual absolute value is greater than the target threshold, it can be determined that the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode.
  • In a specific implementation, the target threshold may be set in at least the following two ways:
  • In the first way, the above target threshold may be set to a uniform fixed value based on empirical values; that is, in this case, regardless of whether the reference prediction mode is the first prediction mode, the second prediction mode, or the third prediction mode, the target threshold used for determining abnormal distortion points is the same.
  • In the second way, the detection criteria for abnormal distortion points in the second prediction mode and the third prediction mode can be relaxed; in this case, the target threshold can be set separately for different reference prediction modes according to empirical values; that is, the target threshold can be associated with the reference prediction mode.
  • the target threshold may be equal to the first threshold.
  • The first threshold is greater than an invalid value and less than the maximum value of the pixel value range. The invalid value here can be set according to an empirical value, for example, 0; the maximum value of the pixel value range can be determined according to the pixel bit width, where the pixel bit width refers to the number of bits used when a pixel is transmitted or displayed.
  • For example, the value range of the first threshold (TH1) can be 0 < TH1 < 2^BITDEPTH, where BITDEPTH represents the pixel bit width, ^ represents the power operation, and 2^BITDEPTH represents the maximum value of the pixel value range. For example, if the pixel bit width is 8, the maximum value of the pixel value range is 2 to the 8th power (that is, 256); the first threshold can then be any value between 0 and 256, for example, 30.
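  • For illustration, the following C++ sketch (not part of the patent) shows the two ways of setting the target threshold described above; the concrete numbers are placeholders, and the only stated constraint is that the threshold must lie strictly between 0 and 2^BITDEPTH.

```cpp
// Sketch of target-threshold selection: either one uniform fixed value for
// every inter mode (first way), or relaxed (larger) per-mode values for the
// modes that transmit residual information (second way). Numbers are placeholders.
#include <algorithm>

enum class InterMode { kSkip, kMerge, kAmvp };  // first/second/third prediction mode

int TargetThreshold(InterMode mode, int bit_depth, bool per_mode) {
    const int max_value = 1 << bit_depth;  // 2^BITDEPTH, e.g. 256 for 8-bit video
    int th = 30;                           // example uniform value
    if (per_mode && (mode == InterMode::kMerge || mode == InterMode::kAmvp)) {
        th *= 2;                           // relaxed criterion (placeholder value)
    }
    return std::min(th, max_value - 1);    // keep the threshold inside the range
}
```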
  • In another embodiment, in the process of specifically performing step S302, the reference prediction mode can also be used to predict the pixel value of each pixel in the target prediction unit to obtain the predicted value of each pixel, and the motion compensation value of each pixel is calculated according to the predicted value and the residual information of each pixel. Second, the absolute value of the difference between the pixel value and the motion compensation value of each pixel in the target prediction unit can be calculated.
  • If there are pixels in the target prediction unit whose absolute difference is greater than the target threshold, the detection result corresponding to the reference prediction mode can be determined to indicate that the target prediction unit has abnormal distortion points in the reference prediction mode; if there are no pixels in the target prediction unit whose absolute difference is greater than the target threshold, it can be determined that the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode.
  • The target threshold in this embodiment can be set to a uniform fixed value.
  • S303 Perform complexity analysis on the target prediction unit to obtain the prediction complexity of the target prediction unit.
  • gradient operations may be performed on the pixel values included in the target prediction unit to obtain the image gradient value of the target prediction unit; the image gradient value is used as the prediction complexity of the target prediction unit.
  • variance calculation or mean calculation may be performed on the pixel values included in the target prediction unit, and the variance or mean value obtained by the calculation can be used as the prediction complexity of the target prediction unit.
  • after the prediction complexity of the target prediction unit is obtained, whether the target prediction unit satisfies a preset condition can be detected according to the prediction complexity; the preset condition may at least include: the prediction complexity is less than or equal to a complexity threshold, and the target prediction unit has abnormal distortion points in at least one of the inter prediction modes. If the condition is satisfied, step S304 can be performed; if it is not satisfied, steps S305-S307 can be performed.
  • S304, if it is determined according to the prediction complexity that the target prediction unit satisfies the preset condition, the intra prediction mode is used to perform prediction processing on the target prediction unit to obtain the coded data of the target image block.
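A minimal sketch of the complexity analysis and the early-intra decision of steps S303-S304 is given below; the gradient-based complexity measure is one of the options named in the text, and the complexity threshold value is an assumed placeholder.

```python
import numpy as np

def prediction_complexity(unit: np.ndarray) -> float:
    """Gradient-based complexity: mean absolute horizontal plus vertical
    pixel differences. The text also allows using the variance or the mean
    of the pixel values instead."""
    unit = unit.astype(np.int32)
    return float(np.abs(np.diff(unit, axis=1)).mean()
                 + np.abs(np.diff(unit, axis=0)).mean())

def choose_intra_early(unit: np.ndarray,
                       inter_has_abnormal_distortion: bool,
                       complexity_threshold: float = 8.0) -> bool:
    """Preset condition: low prediction complexity AND abnormal distortion
    in at least one inter mode -> predict directly with the intra mode."""
    return (prediction_complexity(unit) <= complexity_threshold
            and inter_has_abnormal_distortion)
```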
  • S305, if it is determined according to the prediction complexity that the target prediction unit does not satisfy the preset condition, the mode cost of the at least one candidate prediction mode in the mode information set is calibrated according to the corresponding detection result. Since the calibration principle is similar for each detected candidate prediction mode, the reference prediction mode is taken as an example below; the reference prediction mode is any of the inter prediction modes, that is, it may be the first prediction mode, the second prediction mode, or the third prediction mode.
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode, the mode cost of the reference prediction mode in the mode information set can be kept unchanged to obtain the calibrated mode information set.
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit has abnormal distortion points in the reference prediction mode, the cost adjustment strategy of the reference prediction mode can be used to adjust the mode cost of the reference prediction mode in the mode information set to obtain the calibrated mode information set.
  • specifically, if the reference prediction mode is the second prediction mode or the third prediction mode, adopting the cost adjustment strategy of the reference prediction mode to adjust the mode cost of the reference prediction mode in the mode information set may include: using a penalty factor to amplify the mode cost of the reference prediction mode to obtain the calibrated mode cost of the reference prediction mode, as sketched below.
  • the penalty factor is any value greater than 0; its specific value can be set according to empirical values.
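One plausible reading of this amplification, sketched below, multiplies the cost by (1 + penalty factor); the dictionary of mode costs, the mode names and the penalty value are illustrative assumptions, not values fixed by the text.

```python
def calibrate_cost_with_penalty(mode_costs: dict, reference_mode: str,
                                penalty_factor: float = 4.0) -> dict:
    """Amplify the mode cost of a reference mode (second or third prediction
    mode) that produced abnormal distortion points; other costs are unchanged."""
    calibrated = dict(mode_costs)
    calibrated[reference_mode] = mode_costs[reference_mode] * (1.0 + penalty_factor)
    return calibrated

costs = {"INTRA": 1200.0, "SKIP": 800.0, "MERGE": 900.0, "AMVP": 1000.0}
print(calibrate_cost_with_penalty(costs, "MERGE"))   # MERGE cost becomes 4500.0
```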
  • it should be noted that, in this case, if the target prediction mode subsequently selected in step S306 is the second prediction mode or the third prediction mode, the video encoding device (or encoder) may also be forced to transmit the residual value of each pixel of the target image block, without transform and quantization, to the decoder, so that the decoder can decode according to the residual value.
  • if the reference prediction mode is the first prediction mode, adopting the cost adjustment strategy of the reference prediction mode to adjust the mode cost of the reference prediction mode in the mode information set may also include the following several implementation manners:
  • in the first implementation manner, a preset cost can be obtained; the preset cost is greater than the mode cost of each candidate prediction mode other than the first prediction mode in the calibrated mode information set, and is greater than the mode cost of the first prediction mode in the mode information set (that is, greater than the mode cost of the first prediction mode before calibration). Then, in the mode information set, the mode cost of the reference prediction mode is adjusted to the preset cost.
  • in this implementation, step S306 may be implemented as follows: from the multiple candidate prediction modes, directly select the candidate prediction mode with the smallest mode cost in the calibrated mode information set as the target prediction mode.
  • it can be seen that, in this implementation, if the target prediction unit has abnormal distortion points in the first prediction mode, the mode cost of the first prediction mode is effectively set to an infinitely large value, so that the first prediction mode will not be selected when the target prediction mode is subsequently selected in ascending order of mode cost (a sketch of this adjustment follows below).
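A sketch of this first implementation, under the assumption that the first prediction mode is keyed as "SKIP" in a plain cost dictionary:

```python
def calibrate_skip_with_preset_cost(mode_costs: dict) -> dict:
    """Replace the first (SKIP-style) mode's cost with a preset cost larger
    than every other candidate's cost and larger than its own original cost,
    so that it can never win the minimum-cost selection of step S306."""
    calibrated = dict(mode_costs)
    calibrated["SKIP"] = max(mode_costs.values()) + 1.0   # effectively "infinite"
    return calibrated

def select_min_cost_mode(calibrated_costs: dict) -> str:
    """Step S306 in this implementation: plain minimum-cost selection."""
    return min(calibrated_costs, key=calibrated_costs.get)
```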
  • in the second implementation manner, the mode cost of the first prediction mode in the mode information set can be kept unchanged, and a prohibition flag is added to the first prediction mode; the prohibition flag indicates that the first prediction mode is forbidden from being used to perform prediction processing on the target prediction unit.
  • in this implementation, step S306 may be implemented as follows: if the candidate prediction mode with the smallest mode cost in the calibrated mode information set is not the first prediction mode, or if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode does not carry the prohibition flag, the candidate prediction mode with the smallest mode cost may be used as the target prediction mode.
  • if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode carries the prohibition flag, the candidate prediction mode with the second-smallest mode cost in the calibrated mode information set can be selected as the target prediction mode. It can be seen that, in this implementation, if the target prediction unit has abnormal distortion points in the first prediction mode, the prohibition flag ensures that the first prediction mode will not be selected when the target prediction mode is subsequently selected (see the sketch below).
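A sketch of the second implementation, again with assumed mode names; the prohibition flag is modelled as membership in a `prohibited` set:

```python
def select_mode_with_prohibition_flag(calibrated_costs: dict,
                                      prohibited: set) -> str:
    """Pick the smallest-cost candidate; if that candidate carries the
    prohibition flag, fall back to the next-smallest candidate, and so on."""
    for mode in sorted(calibrated_costs, key=calibrated_costs.get):
        if mode not in prohibited:
            return mode
    raise ValueError("every candidate prediction mode is prohibited")

costs = {"INTRA": 1200.0, "SKIP": 800.0, "MERGE": 900.0, "AMVP": 1000.0}
print(select_mode_with_prohibition_flag(costs, prohibited={"SKIP"}))   # MERGE
```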
  • in the third implementation manner, the mode cost of the first prediction mode in the mode information set can be kept unchanged, and no other processing is performed on the first prediction mode.
  • in this implementation, step S306 may be implemented as follows: if the candidate prediction mode with the smallest mode cost in the calibrated mode information set is not the first prediction mode, the candidate prediction mode with the smallest mode cost is directly used as the target prediction mode. If the candidate prediction mode with the smallest mode cost in the calibrated mode information set is the first prediction mode, the detection result of the first prediction mode can be queried again.
  • if the detection result of the first prediction mode indicates that the target prediction unit has abnormal distortion points in the first prediction mode, the candidate prediction mode with the second-smallest mode cost in the calibrated mode information set can be selected as the target prediction mode; if the detection result of the first prediction mode indicates that the target prediction unit does not have abnormal distortion points in the first prediction mode, the first prediction mode can be used as the target prediction mode (see the sketch below).
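And a sketch of the third implementation, where the detection result of the first prediction mode is consulted again only when it wins the cost comparison (mode names are assumptions):

```python
def select_mode_with_recheck(calibrated_costs: dict,
                             skip_has_abnormal_distortion: bool) -> str:
    """Costs are left untouched; if the SKIP-style first mode has the smallest
    cost, its detection result is queried again and the second-smallest
    candidate is used instead when abnormal distortion was found."""
    ranked = sorted(calibrated_costs, key=calibrated_costs.get)
    if ranked[0] == "SKIP" and skip_has_abnormal_distortion:
        return ranked[1]
    return ranked[0]
```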
  • S306 Select a target prediction mode from multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set.
  • S307 Perform prediction processing on the target prediction unit using the target prediction mode to obtain coded data of the target image block.
  • the target prediction unit may be detected for abnormal distortion points in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode.
  • the mode cost of the at least one candidate prediction mode in the mode information set can be calibrated according to the detection result corresponding to the at least one candidate prediction mode; the mode cost of each candidate prediction mode in the calibrated mode information set can then more accurately reflect the code rate and distortion of the corresponding candidate prediction mode, so that, according to the mode cost of each candidate prediction mode in the calibrated mode information set, a target prediction mode that is more suitable for the target prediction unit can be selected from the multiple candidate prediction modes.
  • then, a suitable target prediction mode can be used to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, which can reduce, to a certain extent, the probability of distortion of the target image block after encoding; moreover, since the embodiment of the present invention reduces the probability of distortion mainly by modifying the mode decision process to select a suitable target prediction mode, it can effectively improve the image compression quality and the subjective quality of the target image block without substantially affecting the compression efficiency and coding complexity.
  • the embodiment of the present invention also proposes a video playback method; the video playback method may be executed by the aforementioned video playback device.
  • the video playback method may include the following steps S401-S403:
  • S401 Obtain code stream data of each frame image in the image frame sequence corresponding to the target video.
  • the target video may include, but is not limited to: screen sharing video, web conference video, web live video, film and television drama video, short video, and so on.
  • the bitstream data of each frame image in the image frame sequence corresponding to the target video may include the coded data of multiple image blocks; and the coded data of each image block in the frame images other than the first frame image in the image frame sequence can be obtained by encoding with the video encoding method shown in FIG. 2 or FIG. 3.
  • the video playback device can obtain the bitstream data of each frame image in the image frame sequence corresponding to the target video from the video encoding device.
  • in one implementation, the bitstream data of each frame image in the image frame sequence corresponding to the target video can be obtained by real-time encoding.
  • in this case, the video playback device can obtain the bitstream data of each frame image from the video encoding device in real time; that is to say, every time the video encoding device finishes encoding the bitstream data of a frame of image, the bitstream data of that frame can be transmitted to the video playback device for decoding and playback.
  • the bitstream data of each frame image in the image frame sequence corresponding to the target video can also be obtained by offline encoding in advance.
  • in this case, the video playback device can also obtain the bitstream data of all frame images in the image frame sequence from the video encoding device at one time. That is to say, in this embodiment, the video encoding device can first encode the bitstream data of all frame images and then transmit the bitstream data of all frame images to the video playback device for decoding and playback.
  • S402 Decode the bitstream data of each frame of image to obtain each frame of image.
  • steps S402-S403 can refer to the relevant content of the decoding stage mentioned in the foregoing image processing flow, which will not be repeated here. It should also be noted that if the bitstream data of each frame image in the image frame sequence corresponding to the target video is encoded in real time and transmitted to the video playback device in real time, then every time the video playback device receives the bitstream data of one frame of image, it can execute steps S402-S403 to display that frame image in real time, thereby realizing real-time playback of the target video (see the sketch below).
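A minimal control-flow sketch of this real-time playback loop (the receive/decode/display callables are hypothetical stand-ins for an actual network receiver, decoder and renderer):

```python
def play_stream(receive_frame_bitstream, decode_frame, display_frame):
    """Each time the bitstream data of one frame arrives (S401), it is decoded
    (S402) and displayed in the playback interface (S403) immediately."""
    while True:
        bitstream = receive_frame_bitstream()   # per-frame code stream data
        if bitstream is None:                   # end of the target video
            break
        display_frame(decode_frame(bitstream))
```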
  • the embodiment of the present invention may first obtain the code stream data of each frame image in the image frame sequence corresponding to the target video, and the code stream data of each frame image includes the coded data of multiple image blocks. Secondly, the code stream data of each frame of image can be decoded to obtain each frame of image; and each frame of image can be displayed in sequence in the playback interface. Since the encoding data of each image block in the image frame sequence corresponding to the target video except the first frame image is encoded by the above-mentioned video encoding method; therefore, the probability of distortion of each image block can be effectively reduced, thereby When displaying each frame of image in the playback interface, it can reduce the probability of dirty spots in each frame of image to a certain extent, and improve the subjective quality of each frame of image.
  • the video encoding method and video playback method proposed in the embodiments of the present invention can be used in various application scenarios; such as screen sharing scenes in video conferences, live video live scenes, video play scenes of film and television dramas, and so on.
  • the following takes a screen sharing scene in a video conference as an example to describe specific applications of the video encoding method and video playback method proposed in the embodiments of the present invention:
  • when the first communication client used by user A detects that the screen sharing function is turned on, it can obtain the display content of the terminal screen corresponding to user A in real time, and generate the current frame image of the screen sharing video based on the display content obtained in real time. Then, the first communication client can encode the current frame image to obtain the bitstream data of the current frame image. Specifically, the current frame image can be divided into multiple image blocks, and the video encoding method shown in FIG. 2 or FIG. 3 can be used to encode each image block to obtain the coded data of each image block;
  • the coded data of the image blocks is then combined to obtain the code stream data of the current frame image.
  • the first communication client can transmit the code stream data of the current frame image to the second communication client used by other users through the server.
  • after receiving the code stream data of the current frame image, the second communication client used by the other users can use the video playback method shown in FIG. 4 to decode the data to obtain the current frame image, and display the current frame image in the user interface, as shown in FIG. 5.
  • using the video encoding method and video playback method proposed in the embodiments of the present invention can effectively reduce the probability of abnormal distortion points in the screen sharing scene, effectively improve the compressed video quality of the screen sharing video, and improve the subjective quality of the screen sharing video.
  • the embodiment of the present invention also discloses a video encoding device, and the video encoding device may be a computer readable instruction (including program code) running in a video encoding device.
  • the video encoding device can execute the methods shown in FIGS. 2 to 3. Referring to Fig. 6, the video encoding device may run the following units:
  • the obtaining unit 601 is configured to obtain a target prediction unit in a target image block and a mode information set of the target prediction unit, where the mode information set includes multiple candidate prediction modes and mode costs of various candidate prediction modes;
  • the encoding unit 602 is configured to perform abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set, to obtain a detection result corresponding to the at least one candidate prediction mode;
  • the encoding unit 602 is further configured to calibrate the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain calibrated mode information set;
  • the encoding unit 602 is further configured to select a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set;
  • the encoding unit 602 is further configured to use the target prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block.
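Purely as a structural sketch of how the two units of FIG. 6 cooperate (all method names on the units are placeholders for the procedures described in the text, not an API defined by the patent):

```python
class VideoEncodingApparatus:
    """Obtaining unit 601 supplies the prediction unit and its mode information
    set; encoding unit 602 runs detection, calibration, selection and prediction."""

    def __init__(self, obtaining_unit, encoding_unit):
        self.obtaining_unit = obtaining_unit   # unit 601
        self.encoding_unit = encoding_unit     # unit 602

    def encode_block(self, image_block):
        unit, mode_info = self.obtaining_unit.get(image_block)
        detections = self.encoding_unit.detect_abnormal_distortion(unit, mode_info)
        calibrated = self.encoding_unit.calibrate_costs(mode_info, detections)
        target_mode = self.encoding_unit.select_mode(calibrated)
        return self.encoding_unit.predict(unit, target_mode)
```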
  • the at least one candidate prediction mode is an inter prediction mode
  • the inter prediction mode includes at least one of the following modes: a first prediction mode, a second prediction mode, and a third prediction mode;
  • the first prediction mode refers to a mode in which index information of a reference image block related to the target image block needs to be transmitted
  • the second prediction mode refers to a mode in which the residual information of the target image block and the index information of the reference image block related to the target image block need to be transmitted;
  • the third prediction mode refers to a mode that needs to transmit the residual information of the target image block, the motion vector data of the target image block, and the index information of the reference image block related to the target image block.
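The difference between the three inter modes is what must be written to the bitstream. A small illustrative data structure (field names and types are assumptions) makes this concrete:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InterModeSignalling:
    """First (SKIP-like) mode: reference-block index only; second (merge-like)
    mode: index plus residual; third (AMVP-like) mode: index, residual and
    motion vector data."""
    reference_block_index: int
    residual: Optional[bytes] = None            # omitted by the first mode
    motion_vector_data: Optional[bytes] = None  # present only for the third mode

skip_info = InterModeSignalling(reference_block_index=3)
amvp_info = InterModeSignalling(reference_block_index=3,
                                residual=b"...", motion_vector_data=b"...")
```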
  • when the encoding unit 602 performs abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode,
  • it can be specifically used to:
  • use a reference prediction mode to predict the pixel value of each pixel in the target prediction unit to obtain the predicted value of each pixel;
  • the reference prediction mode is any one of the inter prediction modes, or the reference prediction mode is any one of the at least one candidate prediction mode;
  • calculate the absolute value of the residual between the pixel value and the predicted value of each pixel; if there is a pixel whose residual absolute value is greater than a target threshold, determine that the detection result corresponding to the reference prediction mode indicates that the target prediction unit has abnormal distortion points in the reference prediction mode; if there is no such pixel, determine that the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode.
  • optionally, the target threshold is associated with the reference prediction mode;
  • if the reference prediction mode is the first prediction mode in the inter prediction modes, the target threshold is equal to a first threshold; the first threshold is greater than an invalid value and less than the maximum value of the pixel value range;
  • if the reference prediction mode is the second prediction mode or the third prediction mode in the inter prediction modes, the target threshold is equal to a second threshold; the second threshold is greater than or equal to the first threshold and less than the maximum value of the pixel value range.
  • when the encoding unit 602 is configured to calibrate the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain the calibrated mode information set, it can be specifically used to:
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode, keep the mode cost of the reference prediction mode unchanged in the mode information set to obtain the calibrated mode information set; the reference prediction mode is any one of the inter prediction modes;
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit has abnormal distortion points in the reference prediction mode, use the cost adjustment strategy of the reference prediction mode to adjust the mode cost of the reference prediction mode in the mode information set to obtain the calibrated mode information set.
  • when the encoding unit 602 is used to adopt the cost adjustment strategy of the reference prediction mode to adjust the mode cost of the reference prediction mode in the mode information set, it may be specifically configured to:
  • if the reference prediction mode is the second prediction mode or the third prediction mode, use a penalty factor to amplify the mode cost of the reference prediction mode to obtain the calibrated mode cost of the reference prediction mode.
  • the encoding unit 602 when used to adjust the mode cost of the reference prediction mode in the mode information set by adopting the cost adjustment strategy of the reference prediction mode, it may also be used to:
  • if the reference prediction mode is the first prediction mode, obtain a preset cost; the preset cost is greater than the mode cost of each candidate prediction mode other than the first prediction mode in the calibrated mode information set, and greater than the mode cost of the first prediction mode in the mode information set;
  • the mode cost of the reference prediction mode is adjusted to the preset cost.
  • when the encoding unit 602 is configured to select a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set, it may be specifically used to:
  • select the candidate prediction mode with the smallest mode cost in the calibrated mode information set as the target prediction mode.
  • the encoding unit 602 when used to adjust the mode cost of the reference prediction mode in the mode information set by adopting the cost adjustment strategy of the reference prediction mode, it may also be used to:
  • if the reference prediction mode is the first prediction mode, keep the mode cost of the first prediction mode in the mode information set unchanged;
  • a prohibition flag is added to the first prediction mode, where the prohibition flag indicates that the first prediction mode is forbidden to perform prediction processing on the target prediction unit.
  • when the encoding unit 602 is configured to select a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set, it may be specifically used to:
  • if the candidate prediction mode with the smallest mode cost in the calibrated mode information set is not the first prediction mode, or if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode does not carry the prohibition flag, use the candidate prediction mode with the smallest mode cost as the target prediction mode;
  • if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode carries the prohibition flag, select the candidate prediction mode with the second-smallest mode cost in the calibrated mode information set as the target prediction mode.
  • the multiple candidate prediction modes include: an intra prediction mode and an inter prediction mode, and the inter prediction mode includes at least one mode; correspondingly, the encoding unit 602 may also be used for:
  • perform complexity analysis on the target prediction unit to obtain the prediction complexity of the target prediction unit; if it is determined according to the prediction complexity that the target prediction unit satisfies a preset condition, use the intra prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block; wherein,
  • the preset condition includes: the prediction complexity is less than or equal to the complexity threshold, and the target prediction unit has abnormal distortion points in at least one of the inter prediction modes;
  • if it is determined according to the prediction complexity that the target prediction unit does not satisfy the preset condition, perform the step of calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain the calibrated mode information set.
  • each step involved in the method shown in FIGS. 2 to 3 may be executed by each unit in the video encoding device shown in FIG. 6.
  • step S201 shown in FIG. 2 may be performed by the acquiring unit 601 shown in FIG. 6, and steps S202-S205 may be performed by the encoding unit 602 shown in FIG. 6;
  • steps shown in FIG. 3 S301 may be performed by the acquiring unit 601 shown in FIG. 6, and steps S302-S307 may be performed by the encoding unit 602 shown in FIG. 6.
  • the units in the video encoding device shown in FIG. 6 can be separately or jointly combined into one or several other units, or one or more of the units can be further split into multiple functionally smaller units; this can realize the same operations without affecting the realization of the technical effects of the embodiments of the present invention.
  • the above-mentioned units are divided based on logical functions. In practical applications, the function of one unit can also be realized by multiple units, or the functions of multiple units can be realized by one unit. In other embodiments of the present invention, the video encoding device may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by multiple units in cooperation.
  • in other embodiments of the present invention, the video encoding device may also be constructed by running, on a general-purpose computing device, such as a computer that includes a central processing unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM) and other processing elements and storage elements, computer-readable instructions (including program code) capable of executing the steps involved in the corresponding methods shown in FIGS. 2 to 3, so as to construct the video encoding device shown in FIG. 6 and to implement the video encoding method of the embodiments of the present invention.
  • the computer-readable instructions may be recorded on, for example, a computer-readable recording medium, and loaded into the aforementioned computing device through the computer-readable recording medium, and run in it.
  • the target prediction unit may be detected for abnormal distortion points in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode.
  • the mode cost of the at least one candidate prediction mode in the mode information set can be calibrated according to the detection result corresponding to the at least one candidate prediction mode; the mode cost of each candidate prediction mode in the calibrated mode information set can then more accurately reflect the code rate and distortion of the corresponding candidate prediction mode, so that, according to the mode cost of each candidate prediction mode in the calibrated mode information set, a target prediction mode that is more suitable for the target prediction unit can be selected from the multiple candidate prediction modes.
  • then, a suitable target prediction mode can be used to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, which can reduce, to a certain extent, the probability of distortion of the target image block after encoding; moreover, since the embodiment of the present invention reduces the probability of distortion mainly by modifying the mode decision process to select a suitable target prediction mode, it can effectively improve the image compression quality and the subjective quality of the target image block without substantially affecting the compression efficiency and coding complexity.
  • the video encoding device includes at least a processor 701, an input interface 702, an output interface 703, a computer storage medium 704, and an encoder 705.
  • the computer storage medium 704 may be stored in the memory of the video encoding device, the computer storage medium 704 is used to store computer readable instructions, the computer readable instructions include program instructions, and the processor 701 is used to execute the computer storage medium 704 Stored program instructions.
  • the processor 701 (or CPU (Central Processing Unit, central processing unit)) is the computing core and control core of the video encoding device.
  • the processor 701 described in the embodiment of the present invention may be used to perform a series of video encoding processing on the target image block, including: obtaining the target prediction unit in the target image block and the mode information set of the target prediction unit, where the mode information set includes multiple candidate prediction modes and the mode costs of the candidate prediction modes; performing abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode; calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain a calibrated mode information set; selecting a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set; and using the target prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, and so on.
  • the embodiment of the present invention also provides a computer storage medium (Memory), which is a memory device in a video encoding device and is used to store programs and data.
  • the computer storage medium herein may include a built-in storage medium in the video encoding device, or of course, may also include an extended storage medium supported by the video encoding device.
  • the computer storage medium provides storage space, and the storage space stores the operating system of the video encoding device.
  • one or more instructions suitable for being loaded and executed by the processor 701 are also stored in the storage space, and these instructions may be one or more computer-readable instructions (including program codes).
  • the computer storage medium here can be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory; optionally, it can also be at least one computer storage medium located far away from the aforementioned processor.
  • the processor 701 can load and execute one or more first instructions stored in the computer storage medium to implement the corresponding steps of the method in the above-mentioned video encoding method embodiment; in specific implementation, one or more first instructions in the computer storage medium are loaded by the processor 701 to execute the following steps:
  • obtaining a target prediction unit in a target image block and a mode information set of the target prediction unit, where the mode information set includes multiple candidate prediction modes and the mode costs of the candidate prediction modes; performing abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain a detection result corresponding to the at least one candidate prediction mode; calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result to obtain a calibrated mode information set; selecting a target prediction mode from the multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set; and
  • using the target prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block.
  • the at least one candidate prediction mode is an inter prediction mode
  • the inter prediction mode includes at least the following modes: a first prediction mode, a second prediction mode, and a third prediction mode;
  • the first prediction mode refers to a mode in which index information of a reference image block related to the target image block needs to be transmitted
  • the second prediction mode refers to a mode in which the residual information of the target image block and the index information of the reference image block related to the target image block need to be transmitted;
  • the third prediction mode refers to a mode that needs to transmit the residual information of the target image block, the motion vector data of the target image block, and the index information of the reference image block related to the target image block.
  • optionally, when performing abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode, the one or more first instructions are loaded by the processor 701 to specifically execute: using a reference prediction mode to predict the pixel value of each pixel in the target prediction unit to obtain the predicted value of each pixel;
  • the reference prediction mode is any mode in the inter prediction modes;
  • calculating the absolute value of the residual between the pixel value and the predicted value of each pixel; if there is a pixel whose residual absolute value is greater than a target threshold, determining that the detection result corresponding to the reference prediction mode indicates that the target prediction unit has abnormal distortion points in the reference prediction mode; if there is no such pixel, determining that the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode.
  • optionally, the target threshold is associated with the reference prediction mode;
  • if the reference prediction mode is the first prediction mode in the inter prediction modes, the target threshold is equal to a first threshold; the first threshold is greater than an invalid value and less than the maximum value of the pixel value range;
  • if the reference prediction mode is the second prediction mode or the third prediction mode in the inter prediction modes, the target threshold is equal to a second threshold; the second threshold is greater than or equal to the first threshold and less than the maximum value of the pixel value range.
  • optionally, when the mode cost of the at least one candidate prediction mode in the mode information set is calibrated according to the detection result corresponding to the at least one candidate prediction mode to obtain a calibrated mode information set, the one or more first instructions can be loaded by the processor 701 to specifically execute:
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit does not have abnormal distortion points in the reference prediction mode, keeping the mode cost of the reference prediction mode unchanged in the mode information set to obtain the calibrated mode information set; the reference prediction mode is any one of the inter prediction modes;
  • if the detection result corresponding to the reference prediction mode indicates that the target prediction unit has abnormal distortion points in the reference prediction mode, using the cost adjustment strategy of the reference prediction mode to adjust the mode cost of the reference prediction mode in the mode information set to obtain the calibrated mode information set.
  • optionally, the one or more first instructions are loaded and executed by the processor 701 to:
  • if the reference prediction mode is the second prediction mode or the third prediction mode, use a penalty factor to amplify the mode cost of the reference prediction mode to obtain the calibrated mode cost of the reference prediction mode.
  • optionally, the one or more first instructions are loaded and executed by the processor 701 to:
  • if the reference prediction mode is the first prediction mode, obtain a preset cost; the preset cost is greater than the mode cost of each candidate prediction mode other than the first prediction mode in the calibrated mode information set, and greater than the mode cost of the first prediction mode in the mode information set; and
  • adjust the mode cost of the reference prediction mode to the preset cost.
  • select the candidate prediction mode with the smallest mode cost in the calibrated mode information set as the target prediction mode.
  • optionally, the one or more first instructions are loaded and executed by the processor 701 to:
  • if the reference prediction mode is the first prediction mode, keep the mode cost of the first prediction mode in the mode information set unchanged; and
  • add a prohibition flag to the first prediction mode, where the prohibition flag indicates that the first prediction mode is forbidden from being used to perform prediction processing on the target prediction unit.
  • if the candidate prediction mode with the smallest mode cost in the calibrated mode information set is not the first prediction mode, or if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode does not carry the prohibition flag, use the candidate prediction mode with the smallest mode cost as the target prediction mode;
  • if the candidate prediction mode with the smallest mode cost is the first prediction mode and the first prediction mode carries the prohibition flag, select the candidate prediction mode with the second-smallest mode cost in the calibrated mode information set as the target prediction mode.
  • the multiple candidate prediction modes include: intra prediction mode and inter prediction mode; correspondingly, the one or more first instructions may also be loaded and specifically executed by the processor 701:
  • perform complexity analysis on the target prediction unit to obtain the prediction complexity of the target prediction unit; if it is determined according to the prediction complexity that the target prediction unit satisfies a preset condition, use the intra prediction mode to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block; wherein,
  • the preset condition includes: the prediction complexity is less than or equal to the complexity threshold, and the target prediction unit has abnormal distortion points in at least one of the inter prediction modes;
  • if it is determined according to the prediction complexity that the target prediction unit does not satisfy the preset condition, perform the step of calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain the calibrated mode information set.
  • the target prediction unit may be detected for abnormal distortion points in at least one candidate prediction mode in the mode information set to obtain the detection result corresponding to the at least one candidate prediction mode.
  • the mode cost of the at least one candidate prediction mode in the mode information set can be calibrated according to the detection result corresponding to the at least one candidate prediction mode; the mode cost of each candidate prediction mode in the calibrated mode information set can then more accurately reflect the code rate and distortion of the corresponding candidate prediction mode, so that, according to the mode cost of each candidate prediction mode in the calibrated mode information set, a target prediction mode that is more suitable for the target prediction unit can be selected from the multiple candidate prediction modes.
  • then, a suitable target prediction mode can be used to perform prediction processing on the target prediction unit to obtain the encoded data of the target image block, which can reduce, to a certain extent, the probability of distortion of the target image block after encoding; moreover, since the embodiment of the present invention reduces the probability of distortion mainly by modifying the mode decision process to select a suitable target prediction mode, it can effectively improve the image compression quality and the subjective quality of the target image block without substantially affecting the compression efficiency and coding complexity.
  • the embodiment of the present invention also discloses a video playback device.
  • the video playback device may be a computer readable instruction (including program code) running in a video playback device.
  • the video playback device can execute the method shown in FIG. 4. Referring to Figure 8, the video playback device can run the following units:
  • the acquiring unit 801 is configured to acquire the bitstream data of each frame image in the image frame sequence corresponding to the target video.
  • the bitstream data of each frame image includes the coded data of multiple image blocks; the coded data of each image block in the frame images other than the first frame image of the image frame sequence is obtained by using the video coding method shown in FIG. 2 or FIG. 3;
  • the decoding unit 802 is configured to decode the bitstream data of each frame image to obtain the each frame image;
  • the display unit 803 is configured to sequentially display the frames of images in the playback interface.
  • each step involved in the method shown in FIG. 4 may be executed by each unit in the video playback device shown in FIG. 8.
  • steps S401-S403 shown in FIG. 4 may be performed by the acquiring unit 801, decoding unit 802, and display unit 803 shown in FIG. 8 respectively.
  • the units in the video playback device shown in FIG. 8 can be separately or jointly combined into one or several other units, or one or more of the units can be further split into multiple functionally smaller units; this can realize the same operations without affecting the realization of the technical effects of the embodiments of the present invention.
  • the above-mentioned units are divided based on logical functions.
  • the function of one unit can also be realized by multiple units, or the functions of multiple units can be realized by one unit.
  • the video playback device may also include other units.
  • these functions may also be implemented with the assistance of other units, and may be implemented by multiple units in cooperation.
  • in other embodiments of the present invention, the video playback device may also be constructed by running, on a general-purpose computing device, such as a computer that includes a central processing unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM) and other processing elements and storage elements, computer-readable instructions (including program code) capable of executing the steps involved in the corresponding method shown in FIG. 4, so as to construct the video playback device shown in FIG. 8 and to implement the video playback method of the embodiments of the present invention.
  • the computer-readable instructions may be recorded on, for example, a computer-readable recording medium, and loaded into the aforementioned computing device through the computer-readable recording medium, and run in it.
  • the embodiment of the present invention also provides a video playback device.
  • the video playback device at least includes a processor 901, an input interface 902, an output interface 903, a computer storage medium 904, and a decoder 905.
  • the computer storage medium 904 may be stored in the memory of the video playback device, the computer storage medium 904 is used to store computer readable instructions, the computer readable instructions include program instructions, and the processor 901 is used to execute the computer storage medium 904 Stored program instructions.
  • the processor 901 (or CPU (Central Processing Unit, central processing unit)) is the computing core and control core of the video playback device.
  • the processor 901 described in the embodiment of the present invention may be used to perform a series of video playback processing on the target video, including: acquiring the code stream data of each frame image in the image frame sequence corresponding to the target video, where the code stream data of each frame image includes the coded data of multiple image blocks, and the coded data of each image block in the frame images other than the first frame image in the image frame sequence is obtained by encoding with the video encoding method shown in FIG. 2 or FIG. 3; decoding the code stream data of each frame image to obtain each frame image; and sequentially displaying the frames of images in the playback interface, and so on.
  • the embodiment of the present invention also provides a computer storage medium (Memory).
  • the computer storage medium is a memory device in a video playback device for storing programs and data. It is understandable that the computer storage medium herein may include a built-in storage medium in the video playback device, or of course, may also include an extended storage medium supported by the video playback device.
  • the computer storage medium provides storage space, and the storage space stores the operating system of the video playback device.
  • one or more instructions suitable for being loaded and executed by the processor 901 are stored in the storage space, and these instructions may be one or more computer-readable instructions (including program codes).
  • the computer storage medium here can be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory; optionally, it can also be at least one computer storage medium located far away from the aforementioned processor.
  • the processor 901 can load and execute one or more second instructions stored in the computer storage medium to implement the corresponding steps of the method in the above-mentioned video playback method embodiment; in specific implementation, one or more second instructions in the computer storage medium are loaded by the processor 901 to execute the following steps:
  • acquiring the bitstream data of each frame image in the image frame sequence corresponding to the target video, where the bitstream data of each frame image includes the coded data of multiple image blocks;
  • the coded data of each image block in the frame images other than the first frame image is coded using the video coding method shown in FIG. 2 or FIG. 3; decoding the bitstream data of each frame image to obtain each frame image; and
  • sequentially displaying the frames of images in the playback interface.
  • the embodiment of the present invention may first obtain the code stream data of each frame image in the image frame sequence corresponding to the target video, and the code stream data of each frame image includes the coded data of multiple image blocks. Secondly, the code stream data of each frame of image can be decoded to obtain each frame of image; and each frame of image can be displayed in sequence in the playback interface. Since the encoding data of each image block in the image frame sequence corresponding to the target video except the first frame image is encoded by the above-mentioned video encoding method; therefore, the probability of distortion of each image block can be effectively reduced, thereby When displaying each frame of image in the playback interface, it can reduce the probability of dirty spots in each frame of image to a certain extent, and improve the subjective quality of each frame of image.
  • a computer device including a memory and one or more processors, in which computer-readable instructions are stored, and the one or more processors implement the above-mentioned methods when the computer-readable instructions are executed. Steps in the embodiment.
  • one or more non-volatile computer-readable storage media storing computer-readable instructions are also provided; when the computer-readable instructions are executed by one or more processors, the steps in the foregoing method embodiments are implemented.
  • a computer program product or computer program includes computer-readable instructions stored in a computer-readable storage medium.
  • the processor reads the computer-readable instructions from the computer-readable storage medium, and the processor executes the computer-readable instructions so that the computer device executes the steps in the foregoing method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application discloses a video encoding method, a video playback method, related devices and media. The video encoding method includes: obtaining a target prediction unit in a target image block and a mode information set; performing abnormal distortion point detection on the target prediction unit in at least one candidate prediction mode in the mode information set to obtain a detection result corresponding to the at least one candidate prediction mode; calibrating the mode cost of the at least one candidate prediction mode in the mode information set according to the detection result corresponding to the at least one candidate prediction mode to obtain a calibrated mode information set; selecting a target prediction mode from multiple candidate prediction modes according to the mode cost of each candidate prediction mode in the calibrated mode information set; and performing prediction processing on the target prediction unit using the target prediction mode to obtain the encoded data of the target image block.

Description

视频编码方法、视频播放方法、相关设备及介质
本申请要求于2020年05月25日提交中国专利局,申请号为202010452023.1,申请名称为“视频编码方法、视频播放方法、相关设备及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及互联网技术领域,具体涉及图像处理技术领域,尤其涉及一种视频编码方法、一种视频播放方法、一种视频编码装置、一种视频播放装置、一种视频编码设备、一种视频播放设备及一种计算机存储介质。
背景技术
视频编码通常是将待编码图像划分成多个图像块,通过对各图像块进行编码来得到待编码图像的码流数据。在对任一图像块进行编码的过程中,通常需要先采用预测模式对该图像块对应的各预测单元进行预测,以得到图像块的残差块;然后对残差块进行后续的变换量化等处理,从而得到图像块的编码数据。经研究表明,在对图像块的编码过程中,若针对预测单元所选取的预测模式不合适,则容易导致图像块在编码后产生较大的失真,使得图像块的主观质量较低。
发明内容
本发明实施例提供了一种视频编码方法、视频播放方法、相关设备及介质。
一种视频编码方法,包括:
获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;及
采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
一种视频编码装置,该视频编码装置包括:
获取单元,用于获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
编码单元,用于在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
所述编码单元,还用于根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
所述编码单元,还用于根据校准后的模式信息集中的各候选预测模式的模式代价,从所 述多种候选预测模式中选取目标预测模式;及
所述编码单元,还用于采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
一种计算机设备,包括存储器和一个或多个处理器,所述存储器存储有计算机可读指令,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器执行上述视频编码方法的步骤。
一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,其上存储有计算机可读指令,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行上述视频编码方法的步骤。
一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机可读指令,所述计算机可读指令存储在计算机可读存储介质中,计算机设备的处理器从所述计算机可读存储介质读取所述计算机可读指令,所述处理器执行所述计算机可读指令,使得所述计算机设备执行上述视频编码方法的步骤。
一种视频播放方法,由视频播放设备执行,包括:
获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用上述的视频编码方法编码得到;
对所述各帧图像的码流数据进行解码,得到所述各帧图像;及
在播放界面中依次显示所述各帧图像。
一种视频播放装置,该视频播放装置包括:
获取单元,用于获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用上述的视频编码方法编码得到;
解码单元,用于对所述各帧图像的码流数据进行解码,得到所述各帧图像;及
显示单元,用于在播放界面中依次显示所述各帧图像。
一种计算机设备,包括存储器和一个或多个处理器,所述存储器存储有计算机可读指令,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器执行上述视频播放方法的步骤。
一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,其上存储有计算机可读指令,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行上述视频播放方法的步骤。
一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机可读指令,所述计算机可读指令存储在计算机可读存储介质中,计算机设备的处理器从所述计算机可读存储介质读取所述计算机可读指令,所述处理器执行所述计算机可读指令,使得所述计算机设备执行上述视频播放方法的步骤。
附图说明
为了更清楚地说明本发明实施例技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术 人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1a是本发明实施例提供的一种图像处理系统的架构示意图;
图1b是本发明实施例提供的一种图像处理流程的示意图;
图1c是本发明实施例提供的一种将帧图像划分成图像块的示意图;
图1d是本发明实施例提供的一种帧间预测模式的划分示意图;
图1e是本发明实施例提供的一种对帧图像进行编码的示意图;
图1f是本发明实施例提供的另一种对帧图像进行编码的示意图;
图2是本发明实施例提供的一种视频编码方法的流程示意图;
图3是本发明实施例提供的一种视频编码方法的流程示意图;
图4是本发明实施例提供的一种视频播放方法的流程示意图;
图5是本发明实施例提供的一种视频编码方法以及视频播放方法的应用场景图;
图6是本发明实施例提供的一种视频编码装置的结构示意图;
图7是本发明实施例提供的一种视频编码设备的结构示意图;
图8是本发明实施例提供的一种视频播放装置的结构示意图;
图9是本发明实施例提供的一种视频播放设备的结构示意图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述。
在本发明实施例中,涉及了一种图像处理系统;参见图1a所示,该图像处理系统至少包括:视频编码设备11、视频播放设备(视频解码设备)12以及传输媒介13。其中,视频编码设备11可以是服务器或者终端;此处的服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、CDN(Content Delivery Network,内容分发网络)、以及大数据和人工智能平台等基础云计算服务的云服务器,等等;终端可以是智能手机、平板电脑、笔记本电脑、台式计算机,等等。视频编码设备11的内部可至少包括编码器,该编码器用于执行一系列的编码流程。视频播放设备12可以是具有视频播放功能的任意设备,如智能手机、平板电脑、笔记本电脑、智能手表等终端,或者投影仪、投影机等可将视频图像投射到幕布上播放的设备。视频播放设备12中可至少包括解码器,该解码器用于执行一系列的解码流程。传输媒介13是指数据传输所经由的空间或实体,主要用于在视频编码设备11和视频播放设备12之间传输数据;其具体可以包括但不限于:移动网络、无线网络、有线网络等网络介质、或者诸如U盘(Universal Serial Bus,优盘)、移动硬盘等具有读写功能的可移动硬件介质。应理解的是,图1a只是示例性地表征本发明实施例所涉及的图像处理系统的架构,并不对图像处理系统的具体架构进行限定。例如,在其他实施例中,视频编码设备11和视频播放设备12也可以是同一个设备;再如,在其他实施例中,视频播放设备12的数量也可不局限于一个,等等。
在上述的图像处理系统中,针对图像帧序列中的任一帧图像的处理流程可一并参见图1b所示,其大致包括如下几个阶段:
(1)编码阶段:
视频编码设备11中的编码器在获取到待编码的当前帧图像后,可基于主流的视频编码 标准对该当前帧图像进行编码,得到当前帧图像的码流数据。其中,主流的视频编码标准可包括但不限于:H.264、H.265、VVC(Versatile Video Coding,多功能视频编码标准)、AVS3(Audio Video coding Standard 3,音视频编码标准),等等;此处的H.265又可称为HEVC(High Efficiency Video Coding,高性能视频编码标准)。以H.265为例,其大致的编码流程具体如下:
①将待编码的当前帧图像划分成若干个图像块(或称为CU(Coding Unit,编码单元)),此处的图像块是指视频编码的基本单元。具体实现中,可先将待编码的当前帧图像划分为若干个互不重叠的LCU(Largest Coding Unit,最大编码单元)。然后,可根据每个LCU的特点将相应的LCU进一步划分得到若干个CU,如图1c所示;应理解的是,图1c只是示例性表示LCU的划分方式,并不对其进行限定;如图1c表示的是将LCU平均地划分为多个CU,但实际也可将LCU非平均地划分为多个CU。每个CU可对应一种模式类型的预测模式,如帧间预测模式(Inter模式)和帧内预测模式(Intra模式)。其中,帧内预测模式主要从当前帧图像中已编码的图像块中查找参考图像块,并利用参考图像块的解码信息进行预测;帧内预测模式需将相应的预测模式以及残差信息传输至解码器。帧间预测模式主要根据当前图像块的运动估计(ME,Motion Estimation)和运动补偿(MC,Motion Compensation),从图像帧序列中已编码的帧图像中查找与当前图像块相匹配的参考图像块,并利用参考图像块的解码信息进行基于MV(Motion Vector,运动矢量)的预测。
参见图1d所示,帧间预测模式可至少包括AMVP(Advanced Motion Vector Prediction,高级运动向量预测)模式和Merge模式。其中,AMVP模式需传输当前图像块的运动矢量数据(MVD,Motion Vector Data)、参考图像块的索引信息以及当前图像块的残差信息至解码器;此处的MVD是指MVP(预测得到的运动矢量)与基于运动估计(ME)得到的运动矢量之间的差值。而Merge模式则无需传输当前图像块的运动矢量数据(MVD),其又可细分为普通Merge模式和SKIP模式。此处的SKIP模式是Merge模式中的一种特殊情况,SKIP模式和普通Merge模式的区别在于:普通Merge模式需传输参考图像块的索引信息以及当前图像块的残差信息至解码器,而SKIP模式则只需传输参考图像块的索引信息至解码器,无需传输当前图像块的残差信息。
②针对待编码的当前图像块,采用预测模式对该当前图像块中的各像素进行像素值预测,得到该当前图像块所对应的预测块;预测块中包括各像素的预测值。在具体实现中,可将当前图像块进一步划分为一个或多个预测单元(Prediction Unit,PU),并进行模式决策以根据输入信号的特征动态决策当前图像块对应的各预测单元的预测模式。具体的,可先根据当前图像块的特征确定模式类型,如帧间预测类型或者帧内预测类型;然后分别根据各预测单元的特征从该模式类型的预测模式中选取相应的预测模式。若确定的模式类型为帧内预测类型,则当前图像块对应的各预测单元的预测模式均为帧内预测模式;若确定的模式类型为帧间预测类型,则当前图像块对应的各预测单元的预测模式可为AMVP模式、普通Merge模式或者SKIP模式;在此情况下,当前图像块对应的各预测单元的预测模式可以相同,也可以不同。在确定各预测单元的预测模式后,便可采用相应的预测模式对各预测单元进行预测处理,得到每个预测单元的预测结果。然后,采用各预测单元的预测结果组合得到当前图像块所对应的预测块。
③根据预测块和当前图像块计算得到当前图像块的残差块,此处的残差块包括当前图像块中的各像素的预测值和实际像素值之间的差值;然后对残差块依次执行变换、量化以及熵 编码处理,得到当前图像块的编码数据。迭代执行上述编码流程所涉及的步骤②-③,直至当前帧图像中的各图像块均被编码;此时便可得到当前帧图像所包含的各个图像块的编码数据,从而得到当前帧图像的码流数据。
(2)传输阶段:
视频编码设备11经过上述编码阶段,得到当前帧图像的码流数据以及编码信息传输至视频显示设备12,以使得该视频显示设备12通过解码阶段采用编码信息对各个图像块的编码数据进行解码,得到当前帧图像。其中,码流数据包括当前帧图像中的各图像块的编码数据;编码信息至少包括在对当前帧图像中的各图像块的预测单元进行预测时,采用的预测模式所规定的传输信息,如AMVP模式规定的当前图像块的运动矢量数据、参考图像块的索引信息以及当前图像块的残差信息等传输信息,SKIP模式规定的参考图像块的索引信息等传输信息,等等。
(3)解码阶段:
视频播放设备12在接收到当前帧图像的码流数据以及编码信息后,便可根据编码信息依次对码流数据中的各图像块的编码数据进行解码。针对任一图像块的解码流程具体如下:对当前图像块的编码数据依次执行解码、反量化以及反变换处理,得到当前图像块的残差块。然后,可根据编码信息中当前图像块所对应的传输信息确定当前图像块的各预测单元所使用的预测模式,并根据确定的预测模式和残差块得到图像块。迭代执行上述解码流程所涉及的各步骤,可得到当前帧图像的各个图像块,从而得到当前帧图像。在得到当前帧图像后,视频播放设备12便可在播放界面显示该当前帧图像。
由上述图像处理流程可知,编码阶段所涉及的模式决策过程通常涉及多种预测模式。若在模式决策过程中针对预测单元所选取的预测模式不合适,且对残差块进行变换量化时所涉及的量化参数(Quantization Parameter,QP)较大时,则会导致图像块在编码后容易产生一些特别异常的失真点,如失真高达100+的像素点,进而使得在通过解码阶段解码得到的图像块中出现一些脏点,影响图像块和帧图像的主观质量,如图1e所示。基于此,本发明实施例提出了一种视频编码方案,该视频编码方案主要用于在编码过程中指导编码器进行模式决策,通过为预测单元选取合适的预测模式来减少图像块在编码后产生异常失真点的概率,此处的异常失真点是指解码得到的像素值和编码前的像素值之间的差值绝对值大于某个阈值的像素点。在具体实现中,该视频编码方案的方案原理大致如下:
对于待编码的目标帧图像,可将目标帧图像划分成一个或多个图像块,并可从目标帧图像中选取一个图像块作为待编码的目标图像块,然后将目标图像块进一步划分为一个或多个预测单元。针对目标图像块中的任一预测单元,在通过模式决策为该预测单元选取预测模式时,可先获取各预测模式的模式代价,并检测该预测单元在至少一个预测模式下是否存在异常失真点。若不存在,则采用模式决策算法从多个预测模式中为预测单元选取一个预测模式,此处的模式决策算法用于指示选取模式代价最小的预测模式。若存在,则调整模式决策算法,此处的调整模式决策算法是指:先对存在异常失真点所对应的预测模式的模式代价进行校准,然后根据被校准的预测模式的校准后的模式代价以及未被校准的预测模式的模式代价,为预测单元选取预测模式。在为预测单元选取了预测模式后,便可采用被选取的预测模式对预测单元进行预测处理,迭代上述步骤以对目标图像块中的各预测单元进行预测处理,从而得到目标图像块的编码数据。在得到目标图像块的编码数据后,便可从目标帧图像中重新选取一个图像块作为新的目标图像块,并执行上述步骤得到新的目标图像块的编码数据,在目标帧 图像中的各图像块均被编码后,便可得到目标帧图像的码流数据。
为了更清楚地说明本发明实施例所提出的视频编码方案的有益效果,仍以目标帧图像为图1e中上侧图所示的原始帧图像为例,采用本发明实施例的视频编码方案对其进行编码,可得到如图1f中下侧图所示的编码后的帧图像。通过对比图1e中的下侧图和图1f中的下侧图可见,本发明实施例所提出的编码方案通过在模式决策过程中增加异常失真点检测机制,可实现根据异常失真点检测结果来修正模式决策过程,从而有效较少目标图像帧在编码后所产生的异常失真点的数量,提升目标图像帧的主观质量。
本发明实施例在编码过程中,可先在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。其次,可根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准,使得校准后的模式信息集中的各候选预测模式的模式代价更能准确地反映相应候选预测模式所对应的码率和失真,从而使得能够根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取出更适合目标预测单元的目标预测模式。然后,可采用合适的目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据,这样可在一定程度上减少目标图像块在编码后出现失真的概率。并且,由于本发明实施例主要是通过修正模式决策过程来选取合适的目标预测模式的方式减少失真概率的,因此可实现在基本不影响压缩效率以及编码复杂度的情况下,有效改善图像压缩质量,提升目标图像块的主观质量。
本发明实施例可先获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据。其次,可对各帧图像的码流数据进行解码,得到各帧图像;并在播放界面中依次显示各帧图像。由于目标视频对应的图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据均采用上述的视频编码方法编码得到的;因此可有效减少各图像块出现失真的概率,从而使得在播放界面中显示各帧图像时,能够在一定程度上减少各帧图像出现脏点的概率,提升各帧图像的主观质量。
基于上述视频编码方案的描述,本发明实施例提出一种视频编码方法;该视频编码方法可以由上述所提及的视频编码设备执行,具体可由视频编码设备中的编码器执行。请参见图2,该视频编码方法可包括以下步骤S201-S205:
S201,获取目标图像块中的目标预测单元及目标预测单元的模式信息集。
在本发明实施例中,目标预测单元可以是目标图像块中的任一预测单元。目标预测单元的模式信息集可包括多种候选预测模式及各种候选预测模式的模式代价,此处的模式代价可用于反映采用候选预测模式对目标预测单元进行预测所带来的码率和失真,其可以包括但不限于率失真代价。其中,多种候选预测模式可至少包括:帧内预测模式和帧间预测模式;帧间预测模式可至少包括以下模式:第一预测模式、第二预测模式和第三预测模式。所谓的第一预测模式是指需传输与目标图像块相关的参考图像块的索引信息的模式,其具体可以是前述所提及的SKIP模式;第二预测模式是指需传输目标图像块的残差信息以及与目标图像块相关的参考图像块的索引信息的模式,其具体可以是前述所提及的普通merge模式;第三预测模式是指需传输目标图像块的残差信息、目标图像块的运动矢量数据,以及与目标图像块相关的参考图像块的索引信息的模式,其具体可以是前述所提及的AMVP模式。
S202,在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测, 得到至少一种候选预测模式对应的检测结果。
在一种具体实现中,可在模式信息集中的各候选预测模式下均对目标预测单元进行异常失真点检测,得到各候选预测模式所对应的检测结果;即在此具体实现中,至少一种候选预测模式可包括帧内预测模式和帧间预测模式。其中,每个候选预测模式的检测结果可用于指示:目标预测单元在该候选预测模式下是否存在异常失真点。具体的,以参考预测模式为例进行说明,参考预测模式可以是任意一种候选预测模式。采用参考预测模式对目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值,计算目标预测单元中的各像素点的像素值和预测值之间的残差绝对值,若目标预测单元中存在残差绝对值大于目标阈值的像素点,则确定参考预测模式对应的检测结果指示目标预测单元在参考预测模式下存在异常失真点,若目标预测单元中不存在残差绝对值大于目标阈值的像素点,则确定参考预测模式对应的检测结果指示目标预测单元在所述参考预测模式下不存在异常失真点。
再一种具体实现中,经研究表明:采用帧内预测模式进行预测而导致产生异常失真点的概率较小;采用帧间预测模式进行预测而导致产生异常失真点的概率较大,尤其是帧间预测模式中的SKIP模式。由于采用SKIP模式进行预测时,MV是从其他参考图像块推导出的,且不传输残差信息;因此虽然SKIP模式可极大地节省码率,提升编码效率,但是在一些特殊场景下(如屏幕分享场景、视频直播场景等)容易造成局部点失真过大,使得目标图像块中产生异常失真点的概率较大。基于此研究结果,本发明实施例可只在帧间预测模式所包括的各模式下对目标预测单元进行异常失真点检测,得到帧间预测模式所包括的各模式对应的检测结果,也就是,至少一种候选预测模式可为帧间预测模式。这样,由于不对帧内预测模式执行任何检测处理,这样可通过减少执行在帧内预测模式下对目标预测单元进行异常失真点检测的操作,可有效节省处理资源,提升编码速度。
由前述可知,异常失真点是指解码得到的像素值和编码前的像素值之间的差值绝对值大于某个阈值的像素点。因此在一种实施方式中,本发明实施例可采用像素点的预测值和实际像素值(即编码前的像素值)之间的差异来判断该像素点是否为异常像素点。基于此,在任一候选预测模式下对目标预测单元进行异常失真点检测的检测原理如下:可采用该任一候选预测模式对目标预测单元中的各像素点进行像素值预测,若存在至少一个像素点的预测值和实际像素值之间的差异较大,则可确定目标预测单元在该任一候选预测模式下存在异常失真点;若各像素点的预测值和实际像素值之间的差异均较小,则可确定目标预测单元在该任一候选预测模式下不存在异常失真点。
再一种实施方式中,当至少一个候选预测模式为帧间预测模式时,还可根据像素点的运动补偿值和实际像素值(即编码前的像素值)之间的差异来判断该像素点是否为异常像素点,以提高检测结果的准确性;此处的运动补偿值等于像素点的预测值和对残差信息进行反变换反量化后所得到的残差的总和。需说明的是,由于第一预测模式不传输残差信息,因此在第一预测模式下的像素点的运动补偿值和预测值是相等的。基于此,在帧间预测模式中的任一模式下对目标预测单元进行异常失真点检测的检测原理还可如下:可采用任一模式对目标预测单元中的各像素点进行像素值预测,根据各像素点的预测值和残差信息计算得到各像素点的运动补偿值;若存在至少一个像素点的运动补偿值和实际像素值之间的差异较大,则可确定目标预测单元在任一模式下存在异常失真点;若各像素点的运动补偿值和实际像素值之间的差异均较小,则可确定目标预测单元在任一模式下不存在异常失真点。
S203,根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预 测模式的模式代价进行校准,得到校准后的模式信息集。
在具体实现中,可依次遍历被检测的至少一种候选预测模式中的各候选预测模式的检测结果。在每次遍历流程中,可根据当前被遍历的候选预测模式的检测结果,对模式信息集中的当前被遍历的候选预测模式的模式代价进行校准。具体的,若当前被遍历的候选预测模式的检测结果指示目标预测单元在该当前被遍历的候选预测模式下不存在异常失真点,则在模式信息中保持该当前被遍历的候选预测模式的模式代价不变,即在此情况下,当前被遍历的候选预测模式的校准后的模式代价与校准前的模式代价相同。若当前被遍历的候选预测模式的检测结果指示目标预测单元在该当前被遍历的候选预测模式下存在异常失真点,则可在模式信息集中对该当前被遍历的候选预测模式的模式代价执行以下至少一种惩罚处理:对当前被遍历的候选预测模式的模式代价进行放大处理,以及为该当前被遍历的候选预测模式添加禁用标识,即在此情况下,当前被遍历的候选预测模式的校准后的模式代价与校准前的模式代价可能相同,可能不同。迭代上述遍历步骤,直至在步骤S202中被检测的各候选预测模式均被遍历,则可得到校准后的模式信息集;该校准后的模式信息集中包括在步骤S202中被检测的各候选预测模式的校准后的模式代价,以及在步骤S202中未被检测的候选预测模式的模式代价。
S204,根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取目标预测模式。
在一种具体实现中,可从多种候选预测模式中,选取校准后的模式信息集中的模式代价最小的候选预测模式作为目标预测模式。可选的,若校准后的模式信息集中的模式代价最小的候选预测模式具有禁用标识,则可选取校准后的模式信息集中的模式代价次小(即第二小)的候选预测模式作为目标预测模式。进一步的,若校准后的模式信息集中的模式代价次小的候选预测模式也具有禁用标识,则可选取校准后的模式信息集中的模式代价第三小的候选预测模式作为目标预测模式,以此类推。再一种具体实现中,可先根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中筛选出备用预测模式;此处的备用预测模式是指校准后的模式信息集中模式代价大于代价阈值的候选预测模式,该代价阈值可根据经验值设置。然后,可从筛选出的备用预测模式中随机选取一个备用预测模式作为目标预测模式。
S205,采用目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据。
在具体实现中,可采用目标预测模式对目标预测单元中的各像素点进行像素值预测,得到目标预测单元的预测结果;此处的目标预测单元的预测结果可包括目标预测单元中的各像素点的预测值。重复迭代上述步骤S201-S205,得到目标图像块中的各预测单元的预测结果,然后可采用各预测单元的预测结果组合得到目标图像块所对应的预测块,并根据目标图像块和预测块得到残差块,最后对残差块依次执行变换、量化和熵编码处理,得到目标图像块的编码数据。
本发明实施例在编码过程中,可先在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。其次,可根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准;使得校准后的模式信息集中的各候选预测模式的模式代价更能准确地反映相应候选预测模式所对应的码率和失真,从而使得能够根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取出更适合目标预测单元的目标预测模式。然后, 可采用合适的目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据,这样可在一定程度上减少目标图像块在编码后出现失真的概率。并且,由于本发明实施例主要是通过修正模式决策过程来选取合适的目标预测模式的方式减少失真概率的,因此可实现在基本不影响压缩效率以及编码复杂度的情况下,有效改善图像压缩质量,提升目标图像块的主观质量。
请参见图3,是本发明实施例提供的另一种视频编码方法的流程示意图。该视频编码方法可以由上述所提及的视频编码设备执行,具体可由视频编码设备中的编码器执行。请参见图3,该视频编码方法可包括以下步骤S301-S307:
S301,获取目标图像块中的目标预测单元及目标预测单元的模式信息集。
在具体实现中,可将目标图像块划分为至少一个预测单元;然后,从至少一个预测单元中选取一个未进行预测处理的预测单元作为目标预测单元。在确定目标预测单元后,还可获取与目标预测单元相匹配的模式信息集。由前述可知,目标预测单元的模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价,那么相应的,获取与目标预测单元相匹配的模式信息集的具体实施方式可以包括如下步骤:
首先,可确定与目标预测单元相匹配的多个候选预测模式。具体的,可检测目标图像块是否属于I Slice(Intra Slice,帧内条带)。由于I Slice中通常只包括I宏块,而I宏块只能利用当前帧图像中已编码的像素点作为参考进行帧内预测,因此若目标图像块属于I Slice,则可直接采用帧内预测模式对目标预测单元进行预测处理,不再执行后续步骤。若目标图像块不属于I Slice,则表明可采用帧内预测模式或者帧间预测模式中的任一模式对目标预测单元进行预测,因此,可选取帧内预测模式和帧间预测模式中的各模式,作为与目标预测单元相匹配的多个候选预测模式。在确定多个候选预测模式后,便可采用代价函数分别计算各候选预测模式的模式代价;此处的代价函数可包括但不限于:RDO(Rate-disto rtion Optimized,率失真优化)模式的代价函数,例如下述公式1.1所示的代价函数;非RDO模式的代价函数,例如下述公式1.2所示的代价函数,等等。然后,可将计算得到的各候选预测模式的模式代价和相应的候选预测模式添加至模式信息集中。
cost=HAD+λ*R式1.1
cost=SAD+4R*λ(QP)     式1.2
在式1.1中，cost表示候选预测模式的模式代价；HAD表示目标预测单元的残差信号经过Hadamard(哈达玛)变换后的系数绝对值之和；λ表示拉格朗日系数，R表示编码候选预测模式所需的比特数(即码率)。在式1.2中，cost仍表示候选预测模式的模式代价；SAD表示采用候选预测模式对目标预测单元中的各像素点进行像素值预测，所得到的预测结果和目标预测单元之间的绝对差值和；4R表示使用候选预测模式后所估计得到的比特数；λ(QP)表示与量化参数(QP)相关的指数函数。需要说明的是，上述式1.1和1.2均只是用于对代价函数进行举例，并非穷举。
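下面给出式1.1与式1.2的一段示意性Python草图（以4x4残差为例，哈达玛变换未做归一化；式1.2中的4R在此作为整体的估计比特数参数处理，均为示例性假设，并非限定实现）：

```python
import numpy as np

H4 = np.array([[1,  1,  1,  1],
               [1, -1,  1, -1],
               [1,  1, -1, -1],
               [1, -1, -1,  1]])      # 4x4 哈达玛变换矩阵(未归一化)

def rdo_cost(residual_4x4, rate_bits, lam):
    """式1.1：cost = HAD + λ*R，其中 HAD 为残差经哈达玛变换后的系数绝对值之和。"""
    had = np.abs(H4 @ residual_4x4 @ H4.T).sum()
    return had + lam * rate_bits

def non_rdo_cost(sad, estimated_bits, lam_qp):
    """式1.2：cost = SAD + 4R*λ(QP)，estimated_bits 对应式中的 4R(估计比特数)，
    lam_qp 为与量化参数相关的 λ(QP)。"""
    return sad + estimated_bits * lam_qp
```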
S302,在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。
在本发明实施例中,主要以至少一种候选预测模式为帧间预测模式为例进行说明;该帧间预测模式至少包括以下模式:第一预测模式(SKIP模式)、第二预测模式(普通Merge模式)和第三预测模式(AMVP模式)。由于在帧间预测模式中的各模式下对目标预测单元进行 异常失真点检测的原理类似,因此为便于阐述,下面以参考预测模式为例对步骤S302的实施方式进行说明,其中,参考预测模式为帧间预测模式中的任一模式,即参考预测模式可以是第一预测模式、第二预测模式或者第三预测模式。
在一种具体实现中,可采用下述式1.3来进行异常失真点的检测:
ABS(DIFF(x,y))>TH     式1.3
在上述式1.3中,DIFF(x,y)表示目标预测单元中位置(x,y)处的像素点的实际像素值与预测值之间的残差(即差值);ABS表示取绝对值,TH表示目标阈值。若目标预测单元中位置(x,y)处的像素点的实际像素值与预测值之间的残差绝对值大于目标阈值,则可确定该像素点为异常像素点;否则,则可确定该像素点为正常的像素点。基于此,在具体执行步骤S302的过程中,可先采用参考预测模式对目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值。其次,可计算目标预测单元中的各像素点的像素值和预测值之间的残差绝对值。若目标预测单元中存在残差绝对值大于目标阈值的像素点,则确定参考预测模式对应的检测结果指示目标预测单元在参考预测模式下存在异常失真点;若目标预测单元中不存在残差绝对值大于目标阈值的像素点,则可确定参考预测模式对应的检测结果指示目标预测单元在参考预测模式下不存在异常失真点。
其中，目标阈值可至少包括以下两种取值方式：在一种实施方式中，可以根据经验值将上述所提及的目标阈值设置为一个统一的固定值；即此情况下，无论参考预测模式是第一预测模式、第二预测模式或者第三预测模式，用于进行异常失真点判断的目标阈值均是相同的。再一种实施方式中，由于第二预测模式和第三预测模式需传输残差信息，因此可将第二预测模式和第三预测模式下的异常失真点的检测标准放宽一些；在此情况下，可根据经验值分别为不同的参考预测模式针对性地设置目标阈值；即目标阈值可与参考预测模式相关联。具体的，若参考预测模式为帧间预测模式中的第一预测模式，则目标阈值可等于第一阈值。其中，第一阈值大于无效数值且小于像素值域的最大值；此处无效数值可根据经验值设置，例如设置为0；像素值域的最大值可根据像素位宽确定，所谓的像素位宽是指表示一个像素值所占用的二进制位数。也就是说，第一阈值(TH1)的取值范围可为：0<TH1<(1<<BITDEPTH)；其中，BITDEPTH表示像素位宽，<<表示左移运算，1<<BITDEPTH(即2的BITDEPTH次方)表示像素值域的最大值；例如像素位宽为8，则像素值域的最大值便等于2的8次方(即256)；那么第一阈值便可以是0-256中的任意值，例如可以将第一阈值设置为30。若参考预测模式为帧间预测模式中的第二预测模式或者第三预测模式，则目标阈值可等于第二阈值；该第二阈值大于或等于第一阈值，且小于像素值域的最大值。即第二阈值(TH2)的取值范围可为：TH1<=TH2<(1<<BITDEPTH)。
再一种具体实现中,在具体执行步骤S302的过程中,还可先采用参考预测模式对目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值,并根据各像素点的预测值和残差信息计算得到各像素点的运动补偿值。其次,可计算目标预测单元中的各像素点的像素值和运动补偿值之间的差值绝对值。若目标预测单元中存在差值绝对值大于目标阈值的像素点,则可确定参考预测模式对应的检测结果指示目标预测单元在参考预测模式下存在异常失真点;若目标预测单元中不存在差值绝对值大于目标阈值的像素点,则可确定参考预测模式对应的检测结果指示目标预测单元在参考预测模式下不存在异常失真点。其中,此实施方式下的目标阈值可设置为统一的固定值。
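下面给出与式1.3及上文按参考预测模式选取目标阈值(TH1/TH2)对应的示意性Python草图（其中TH1=30、TH2=2*TH1仅为示例性经验取值，"SKIP"代表第一预测模式，像素数据假设为numpy数组，并非限定实现）：

```python
import numpy as np

def target_threshold(mode, bit_depth=8, th1=30):
    """第一预测模式(SKIP)使用较严格的TH1，第二/第三预测模式使用放宽后的TH2，
    且满足 0 < TH1 <= TH2 < (1 << bit_depth)。"""
    max_val = 1 << bit_depth                 # 像素值域的最大值，如 8bit 时为 256
    th2 = min(2 * th1, max_val - 1)          # TH2 >= TH1 的一种示意取法
    return th1 if mode == "SKIP" else th2

def exists_abnormal_point(orig, compensated, mode, bit_depth=8):
    """式1.3：只要存在 ABS(DIFF(x,y)) > TH 的像素点，即判定存在异常失真点。"""
    diff = np.abs(orig.astype(np.int32) - compensated.astype(np.int32))
    return bool(np.any(diff > target_threshold(mode, bit_depth)))
```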
S303,对目标预测单元进行复杂度分析,得到目标预测单元的预测复杂度。
在一种具体实现中，可对目标预测单元所包含的像素值进行梯度运算，得到目标预测单元的图像梯度值；将该图像梯度值作为目标预测单元的预测复杂度。再一种具体实现中，可对目标预测单元所包含的像素值进行方差运算或均值运算，将运算得到的方差或均值作为目标预测单元的预测复杂度。应理解的是，本发明实施例只是示例性地列举了两种复杂度分析的具体实施方式，并非穷举。在分析得到目标预测单元的预测复杂度之后，便可根据预测复杂度检测目标预测单元是否满足预设条件；其中，预设条件可至少包括：预测复杂度小于或等于复杂度阈值，且目标预测单元在帧间预测模式中的至少一种模式下存在异常失真点。若满足，则可执行步骤S304；若不满足，则可执行步骤S305-S307。由此可见，本发明实施例通过加入预测复杂度等先验信息，可使得在根据预测复杂度确定目标预测单元满足预设条件时，跳过其他模式的决策过程，直接进行帧内预测；这样可有效加快模式决策过程，从而提升编码速度。
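下面给出复杂度分析及预设条件判断的示意性Python草图（其中以梯度幅值均值作为复杂度、complexity_threshold=8.0为假设的经验阈值，函数名为自拟示例，并非限定实现）：

```python
import numpy as np

def prediction_complexity(pu_pixels):
    """以图像梯度幅值的均值作为预测复杂度，也可替换为像素值的方差或均值。"""
    p = pu_pixels.astype(np.float64)
    gy, gx = np.gradient(p)
    return float(np.mean(np.abs(gx) + np.abs(gy)))

def meets_intra_shortcut(pu_pixels, inter_has_distortion, complexity_threshold=8.0):
    """预设条件：预测复杂度不超过复杂度阈值，且帧间预测模式中至少一种模式下
    存在异常失真点；满足时可跳过其余模式决策，直接进行帧内预测。"""
    return prediction_complexity(pu_pixels) <= complexity_threshold and inter_has_distortion
```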
S304,若根据预测复杂度确定目标预测单元满足预设条件,则采用帧内预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据。
S305,若根据预测复杂度确定目标预测单元不满足预设条件,则根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集。
由于根据各候选预测模式的检测结果对各候选预测模式的模式代价进行校准的原理类似;因此为便于阐述,下面以参考预测模式为例对步骤S305的实施方式进行说明;其中,参考预测模式为帧间预测模式中的任一模式;即参考预测模式可以是第一预测模式、第二预测模式或者第三预测模式。在具体实现中,若参考预测模式对应的检测结果指示目标预测单元在参考预测模式下不存在异常失真点,则可保持模式信息集中的所述参考预测模式的模式代价不变,以得到校准后的模式信息集。若参考预测模式对应的检测结果指示目标预测单元在参考预测模式下存在异常失真点,则可采用参考预测模式的代价调整策略对模式信息集中的参考预测模式的模式代价进行调整,以得到校准后的模式信息集。具体的,若参考预测模式为第二预测模式或者第三预测模式,则采用参考预测模式的代价调整策略对模式信息集中的参考预测模式的模式代价进行调整这一步骤可包括:采用惩罚因子对参考预测模式的模式代价进行放大处理,得到参考预测模式的校准后的模式代价。其中,惩罚因子为大于0的任意值;其具体取值可根据经验值设置。需说明的是,在此情况下,若后续通过步骤S306选取的目标预测模式为第二预测模式或者第三预测模式,则还可强制视频编码设备(或编码器)将未进行变换量化的目标图像块中的各像素的残差值传输至解码器,以便于解码器可根据该残差值进行解码。
若参考预测模式为第一预测模式,则采用参考预测模式的代价调整策略对模式信息集中的参考预测模式的模式代价进行调整这一步骤还可包括以下几种实施方式:
第一种实施方式，可获取预设代价；该预设代价大于校准后的模式信息集中除第一预测模式以外的各候选预测模式的模式代价，且大于第一预测模式在模式信息集中的模式代价(即大于第一预测模式在校准前的模式代价)。然后在模式信息集中，将参考预测模式的模式代价调整为预设代价。在此实施方式下，下述步骤S306的具体实施方式可以是：从多种候选预测模式中，直接选取校准后的模式信息集中模式代价最小的候选预测模式作为目标预测模式。由此可见，在此实施方式下，若目标预测单元在第一预测模式下存在异常失真点，则可通过将第一预测模式的模式代价设置为一个无穷大的值，使得后续在按照模式代价从小到大的顺序选取目标预测模式时不会选取到该第一预测模式。
第二种实施方式,可保持模式信息集中的第一预测模式的模式代价不变,并为第一预测模式添加禁用标识,该禁用标识指示禁止使用第一预测模式对目标预测单元进行预测处理。在此实施方式下,下述步骤S306的具体实施方式可以是:若校准后的模式信息集中模式代价最小的候选预测模式不为第一预测模式,或者模式代价最小的候选预测模式为第一预测模式且第一预测模式不具有禁用标识,则可将模式代价最小的候选预测模式作为目标预测模式。若模式代价最小的候选预测模式为第一预测模式,且第一预测模式具有禁用标识,则可选取校准后的模式信息集中模式代价次小的候选预测模式作为目标预测模式。由此可见,在此实施方式下,若目标预测单元在第一预测模式下存在异常失真点,则可通过添加禁用标识的方式,使得后续在选取目标预测模式时不会选取到该第一预测模式。
第三种实施方式,可保持模式信息集中的第一预测模式的模式代价不变;并不对第一预测模式执行任何处理。在此实施方式下,下述步骤S306的具体实施方式可以是:若校准后的模式信息集中模式代价最小的候选预测模式不为第一预测模式,则直接将模式代价最小的候选预测模式作为目标预测模式。若校准后的模式信息集中模式代价最小的候选预测模式为第一预测模式,则可再次查询第一预测模式的检测结果。若第一预测模式的检测结果指示目标预测单元在第一预测模式下存在异常失真点,则可选取校准后的模式信息集中模式代价次小的候选预测模式作为目标预测模式;若第一预测模式的检测结果指示目标预测单元在第一预测模式下不存在异常失真点,则可将该第一预测模式作为目标预测模式。
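下面给出第三种实施方式下选取逻辑的示意性Python草图（沿用前文假设的字典结构，"SKIP"代表第一预测模式，仅供理解，并非限定实现）：

```python
def select_with_recheck(calibrated, skip_has_distortion):
    """SKIP(第一预测模式)的模式代价保持不变；若代价最小的候选预测模式为SKIP，
    则再次查询其检测结果，存在异常失真点时改选代价次小的候选预测模式。"""
    ordered = sorted(calibrated.items(), key=lambda kv: kv[1]["cost"])
    best_mode = ordered[0][0]
    if best_mode == "SKIP" and skip_has_distortion:
        return ordered[1][0]    # 代价次小的候选预测模式
    return best_mode
```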
S306,根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取目标预测模式。
S307,采用目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据。
本发明实施例在编码过程中,可先在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。其次,可根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准;使得校准后的模式信息集中的各候选预测模式的模式代价更能准确地反映相应候选预测模式所对应的码率和失真,从而使得能够根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取出更适合目标预测单元的目标预测模式。然后,可采用合适的目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据,这样可在一定程度上减少目标图像块在编码后出现失真的概率;并且,由于本发明实施例主要是通过修正模式决策过程来选取合适的目标预测模式的方式减少失真概率的,因此可实现在基本不影响压缩效率以及编码复杂度的情况下,有效改善图像压缩质量,提升目标图像块的主观质量。
基于上述的视频编码方法实施例的相关描述,本发明实施例还提出了一种视频播放方法;该视频播放方法可以由上述所提及的视频播放设备执行。请参见图4,该视频播放方法可包括以下步骤S401-S403:
S401,获取目标视频对应的图像帧序列中的各帧图像的码流数据。
其中，目标视频可包括但不限于：屏幕分享视频、网络会议视频、网络直播视频、影视剧视频、短视频，等等。目标视频对应的图像帧序列中的每帧图像的码流数据中可包括多个图像块的编码数据；并且，图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据均可采用如图2或图3所示的视频编码方法编码得到。
在具体实现中,视频播放设备可从视频编码设备处获取目标视频对应的图像帧序列中的各帧图像的码流数据。在一种实施方式中,目标视频所对应的图像帧序列中的各帧图像的码流数据可以是实时编码得到的,在此情况下,视频播放设备可实时地从视频编码设备处获取各帧图像的码流数据。也就是说,在此实施方式下,视频编码设备每编码得到一帧图像的码流数据,便可将该帧图像的码流数据传输至视频播放设备进行解码播放。再一种实施方式中,目标视频所对应的图像帧序列中的各帧图像的码流数据也可以是预先离线编码得到的,在此情况下,视频播放设备也可从视频编码设备处一次性获取图像帧序列中的各帧图像的码流数据。也就是说,在此实施方式下,视频编码设备可在编码得到所有帧图像的码流数据后,再可将所有帧图像的码流数据传输至视频播放设备进行解码播放。
S402,对各帧图像的码流数据进行解码,得到各帧图像。
S403,在播放界面中依次显示各帧图像。
需要说明的是,步骤S402-S403的具体实施方式可参见前述图像处理流程所提及的解码阶段的相关内容,在此不再赘述。且还需说明的是,若目标视频所对应的图像帧序列中的各帧图像的码流数据是实时编码且实时传输至视频播放设备的,则视频播放设备每接收到一帧图像的码流数据,便可执行步骤S402-S403,以实现帧图像的实时显示,从而实现目标视频的实时播放。
本发明实施例可先获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据。其次,可对各帧图像的码流数据进行解码,得到各帧图像;并在播放界面中依次显示各帧图像。由于目标视频对应的图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据均采用上述的视频编码方法编码得到的;因此可有效减少各图像块出现失真的概率,从而使得在播放界面中显示各帧图像时,能够在一定程度上减少各帧图像出现脏点的概率,提升各帧图像的主观质量。
应理解的是,本发明实施例所提出的视频编码方法和视频播放方法可运用在各种应用场景中;如视频会议中的屏幕分享场景、视频直播场景、影视剧视频播放场景,等等。下面以视频会议中的屏幕分享场景为例,对本发明实施例所提出的视频编码方法和视频播放方法的具体应用进行阐述:
在多个用户使用具有视频会议的通信客户端(如企业微信客户端、腾讯会议客户端等)进行视频会议的过程中,若用户A想要和其他用户分享自己的屏幕内容,则可开启屏幕分享功能。用户A所使用的第一通信客户端在检测到屏幕分享功能被打开后,可实时地获取用户A所对应的终端屏幕中的显示内容,并根据实时获取到的显示内容生成屏幕分享视频的当前帧图像。然后,第一通信客户端可对该当前帧图像进行编码,得到该当前帧图像的码流数据。具体的,可将该当前帧图像划分成多个图像块,并采用图2或图3所示的视频编码方法对各图像块进行编码,得到各图像块的编码数据;并采用各图像块的编码数据组合得到当前帧图像的码流数据。在得到当前帧图像的码流数据后,第一通信客户端便可通过服务器将该当前帧图像的码流数据传输至其他用户所使用的第二通信客户端中。相应的,其他用户所使用的第二通信客户端在接收到第一通信客户端发送的当前帧图像的码流数据后,可采用如图4所示的视频播放方法对当前帧图像的码流数据进行解码,得到当前帧图像;并在用户界面中显示该当前帧图像,如图5所示。
由此可见,采用本发明实施例所提出的视频编码方法和视频播放方法,可有效减少屏幕分享场景中出现异常失真点的概率,可有效改善屏幕分享视频的压缩视频质量,提升屏幕分享视频的主观质量。
基于上述视频编码方法实施例的描述,本发明实施例还公开了一种视频编码装置,所述视频编码装置可以是运行于视频编码设备中的一个计算机可读指令(包括程序代码)。该视频编码装置可以执行图2-图3所示的方法。请参见图6,所述视频编码装置可以运行如下单元:
获取单元601,用于获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
编码单元602,用于在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
所述编码单元602,还用于根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
所述编码单元602,还用于根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;
所述编码单元602,还用于采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
在一种实施方式中，所述至少一种候选预测模式为帧间预测模式，所述帧间预测模式包括以下模式中的至少一种：第一预测模式、第二预测模式和第三预测模式；
所述第一预测模式是指需传输与所述目标图像块相关的参考图像块的索引信息的模式;
第二预测模式是指需传输所述目标图像块的残差信息以及与所述目标图像块相关的参考图像块的索引信息的模式;
第三预测模式是指需传输所述目标图像块的残差信息、所述目标图像块的运动矢量数据,以及与所述目标图像块相关的参考图像块的索引信息的模式。
再一种实施方式中,编码单元602在用于在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果时,可具体用于:
采用参考预测模式对所述目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值;所述参考预测模式为所述帧间预测模式中的任一模式,或所述参考预测模式为所述至少一种候选预测模式中的任意一种;
计算所述目标预测单元中的各像素点的像素值和预测值之间的残差绝对值;
若所述目标预测单元中存在残差绝对值大于目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点;
若所述目标预测单元中不存在残差绝对值大于所述目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在所述异常失真点。
再一种实施方式中,所述目标阈值与所述参考预测模式相关联;
若所述参考预测模式为所述帧间预测模式中的第一预测模式,则所述目标阈值等于第一阈值;所述第一阈值大于无效数值且小于像素值域的最大值;
若所述参考预测模式为所述帧间预测模式中的所述第二预测模式或者所述第三预测模式,则所述目标阈值等于第二阈值;所述第二阈值大于或等于所述第一阈值,且小于所述像素值域的最大值。
再一种实施方式中,编码单元602在用于根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集时,可具体用于:
若参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在异常失真点,则保持所述模式信息集中的所述参考预测模式的模式代价不变,以得到校准后的模式信息集;所述参考预测模式为所述帧间预测模式中的任一模式;
若所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点,则采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,以得到校准后的模式信息集。
再一种实施方式中,编码单元602在用于采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,可具体用于:
若所述参考预测模式为所述第二预测模式或者所述第三预测模式,则采用惩罚因子对所述参考预测模式的模式代价进行放大处理,得到所述参考预测模式的校准后的模式代价。
再一种实施方式中,编码单元602在用于采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,还可用于:
若所述参考预测模式为所述第一预测模式,则获取预设代价;所述预设代价大于所述校准后的模式信息集中除所述第一预测模式以外的各候选预测模式的模式代价,且大于所述第一预测模式在所述模式信息集中的模式代价;
在所述模式信息集中,将所述参考预测模式的模式代价调整为所述预设代价。
再一种实施方式中,编码单元602在用于根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式时,可具体用于:
从所述多种候选预测模式中，选取所述校准后的模式信息集中模式代价最小的候选预测模式作为目标预测模式。
再一种实施方式中,编码单元602在用于采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,还可用于:
若所述参考预测模式为所述第一预测模式,则保持所述模式信息集中的所述第一预测模式的模式代价不变;
为所述第一预测模式添加禁用标识,所述禁用标识指示禁止使用所述第一预测模式对所述目标预测单元进行预测处理。
再一种实施方式中,编码单元602在用于根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式时,可具体用于:
若所述校准后的模式信息集中模式代价最小的候选预测模式不为所述第一预测模式,或者所述模式代价最小的候选预测模式为第一预测模式且所述第一预测模式不具有所述禁用标识,则将所述模式代价最小的候选预测模式作为目标预测模式;
若所述模式代价最小的候选预测模式为第一预测模式,且所述第一预测模式具有所述禁用标识,则选取所述校准后的模式信息集中模式代价次小的候选预测模式作为目标预测模式。
再一种实施方式中，所述多种候选预测模式包括：帧内预测模式和帧间预测模式，所述帧间预测模式包括至少一种模式；相应的，编码单元602还可用于：
对所述目标预测单元进行复杂度分析,得到所述目标预测单元的预测复杂度;
若根据所述预测复杂度确定所述目标预测单元满足预设条件,则采用所述帧内预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据;其中,所述预设条件包括:所述预测复杂度小于或等于所述复杂度阈值,且所述目标预测单元在所述帧间预测模式中的至少一种模式下存在异常失真点;
若根据所述预测复杂度确定所述目标预测单元不满足所述预设条件,则执行根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集的步骤。
根据本发明的一个实施例,图2-图3所示的方法所涉及的各个步骤均可以是由图6所示的视频编码装置中的各个单元来执行的。例如,图2中所示的步骤S201可由图6中所示的获取单元601来执行,步骤S202-S205可由图6中所示的编码单元602来执行;又如,图3中所示的步骤S301可由图6中所示的获取单元601来执行,步骤S302-S307可由图6中所示的编码单元602来执行。
根据本发明的另一个实施例，图6所示的视频编码装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成，或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成，这可以实现同样的操作，而不影响本发明的实施例的技术效果的实现。上述单元是基于逻辑功能划分的，在实际应用中，一个单元的功能也可以由多个单元来实现，或者多个单元的功能由一个单元实现。在本发明的其它实施例中，视频编码装置也可以包括其它单元，在实际应用中，这些功能也可以由其它单元协助实现，并且可以由多个单元协作实现。
根据本发明的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图2-图3中所示的相应方法所涉及的各步骤的计算机可读指令(包括程序代码),来构造如图6中所示的视频编码装置设备,以及来实现本发明实施例的视频编码方法。所述计算机可读指令可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
本发明实施例在编码过程中,可先在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。其次,可根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准;使得校准后的模式信息集中的各候选预测模式的模式代价更能准确地反映相应候选预测模式所对应的码率和失真,从而使得能够根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取出更适合目标预测单元的目标预测模式。然后,可采用合适的目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据,这样可在一定程度上减少目标图像块在编码后出现失真的概率;并且,由于本发明实施例主要是通过修正模式决策过程来选取合适的目标预测模式的方式减少失真概率的,因此可实现在基本不影响压缩效率以及编码复杂度的情况下,有效改善图像压缩质量,提升目标图像块的主观质量。
基于上述视频编码方法实施例以及视频编码装置实施例的描述,本发明实施例还提供一 种视频编码设备。请参见图7,该视频编码设备至少包括处理器701、输入接口702、输出接口703、计算机存储介质704以及编码器705。其中,计算机存储介质704可以存储在视频编码设备的存储器中,计算机存储介质704用于存储计算机可读指令,所述计算机可读指令包括程序指令,所述处理器701用于执行计算机存储介质704存储的程序指令。处理器701(或称CPU(Central Processing Unit,中央处理器))是视频编码设备的计算核心以及控制核心,其适于实现一条或多条指令,具体适于加载并执行一条或多条指令从而实现相应方法流程或相应功能;在一个实施例中,本发明实施例所述的处理器701可以用于对目标图像块进行一系列的视频编码,包括:获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据,等等。
本发明实施例还提供了一种计算机存储介质（Memory），所述计算机存储介质是视频编码设备中的记忆设备，用于存放程序和数据。可以理解的是，此处的计算机存储介质既可以包括视频编码设备中的内置存储介质，当然也可以包括视频编码设备所支持的扩展存储介质。计算机存储介质提供存储空间，该存储空间存储了视频编码设备的操作系统。并且，在该存储空间中还存放了适于被处理器701加载并执行的一条或多条的指令，这些指令可以是一个或一个以上的计算机可读指令（包括程序代码）。需要说明的是，此处的计算机存储介质可以是高速RAM存储器，也可以是非易失性存储器（non-volatile memory），例如至少一个磁盘存储器；可选的还可以是至少一个位于远离前述处理器的计算机存储介质。
在一个实施例中,可由处理器701加载并执行计算机存储介质中存放的一条或多条第一指令,以实现上述有关视频编码方法实施例中的方法的相应步骤;具体实现中,计算机存储介质中的一条或多条第一指令由处理器701加载并执行如下步骤:
获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;
采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
在一种实施方式中,所述至少一种候选预测模式为帧间预测模式,所述帧间预测模式至少包括以下模式:第一预测模式、第二预测模式和第三预测模式;
所述第一预测模式是指需传输与所述目标图像块相关的参考图像块的索引信息的模式;
第二预测模式是指需传输所述目标图像块的残差信息以及与所述目标图像块相关的参考图像块的索引信息的模式；
第三预测模式是指需传输所述目标图像块的残差信息、所述目标图像块的运动矢量数据,以及与所述目标图像块相关的参考图像块的索引信息的模式。
再一种实施方式中,在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果时,所述一条或多条第一指令由处理器701加载并具体执行:
采用参考预测模式对所述目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值;所述参考预测模式为所述帧间预测模式中的任一模式;
计算所述目标预测单元中的各像素点的像素值和预测值之间的残差绝对值;
若所述目标预测单元中存在残差绝对值大于目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点;
若所述目标预测单元中不存在残差绝对值大于所述目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在所述异常失真点。
再一种实施方式中,所述目标阈值与所述参考预测模式相关联;
若所述参考预测模式为所述帧间预测模式中的第一预测模式,则所述目标阈值等于第一阈值;所述第一阈值大于无效数值且小于像素值域的最大值;
若所述参考预测模式为所述帧间预测模式中的所述第二预测模式或者所述第三预测模式,则所述目标阈值等于第二阈值;所述第二阈值大于或等于所述第一阈值,且小于所述像素值域的最大值。
再一种实施方式中,在根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集时,所述一条或多条第一指令可由处理器701加载并具体执行:
若参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在异常失真点,则保持所述模式信息集中的所述参考预测模式的模式代价不变,以得到校准后的模式信息集;所述参考预测模式为所述帧间预测模式中的任一模式;
若所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点,则采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,以得到校准后的模式信息集。
再一种实施方式中,在采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,所述一条或多条第一指令由处理器701加载并具体执行:
若所述参考预测模式为所述第二预测模式或者所述第三预测模式,则采用惩罚因子对所述参考预测模式的模式代价进行放大处理,得到所述参考预测模式的校准后的模式代价。
再一种实施方式中,在采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,所述一条或多条第一指令由处理器701加载并具体执行:
若所述参考预测模式为所述第一预测模式,则获取预设代价;所述预设代价大于所述校准后的模式信息集中除所述第一预测模式以外的各候选预测模式的模式代价,且大于所述第一预测模式在所述模式信息集中的模式代价;
在所述模式信息集中,将所述参考预测模式的模式代价调整为所述预设代价。
再一种实施方式中,在根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式时,所述一条或多条第一指令由处理器701加载并具体执行:
从所述多种候选预测模式中，选取所述校准后的模式信息集中模式代价最小的候选预测模式作为目标预测模式。
再一种实施方式中,在采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整时,所述一条或多条第一指令由处理器701加载并具体执行:
若所述参考预测模式为所述第一预测模式,则保持所述模式信息集中的所述第一预测模式的模式代价不变;
为所述第一预测模式添加禁用标识,所述禁用标识指示禁止使用所述第一预测模式对所述目标预测单元进行预测处理。
再一种实施方式中,在根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式时,所述一条或多条第一指令由处理器701加载并具体执行:
若所述校准后的模式信息集中模式代价最小的候选预测模式不为所述第一预测模式,或者所述模式代价最小的候选预测模式为第一预测模式且所述第一预测模式不具有所述禁用标识,则将所述模式代价最小的候选预测模式作为目标预测模式;
若所述模式代价最小的候选预测模式为第一预测模式,且所述第一预测模式具有所述禁用标识,则选取所述校准后的模式信息集中模式代价次小的候选预测模式作为目标预测模式。
再一种实施方式中,所述多种候选预测模式包括:帧内预测模式和帧间预测模式;相应的,所述一条或多条第一指令还可由处理器701加载并具体执行:
对所述目标预测单元进行复杂度分析,得到所述目标预测单元的预测复杂度;
若根据所述预测复杂度确定所述目标预测单元满足预设条件,则采用所述帧内预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据;其中,所述预设条件包括:所述预测复杂度小于或等于所述复杂度阈值,且所述目标预测单元在所述帧间预测模式中的至少一种模式下存在异常失真点;
若根据所述预测复杂度确定所述目标预测单元不满足所述预设条件,则执行根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集的步骤。
本发明实施例在编码过程中,可先在模式信息集中的至少一种候选预测模式下对目标预测单元进行异常失真点检测,得到至少一种候选预测模式对应的检测结果。其次,可根据至少一种候选预测模式对应的检测结果,对模式信息集中的至少一种候选预测模式的模式代价进行校准;使得校准后的模式信息集中的各候选预测模式的模式代价更能准确地反映相应候选预测模式所对应的码率和失真,从而使得能够根据校准后的模式信息集中的各候选预测模式的模式代价,从多种候选预测模式中选取出更适合目标预测单元的目标预测模式。然后,可采用合适的目标预测模式对目标预测单元进行预测处理,以得到目标图像块的编码数据,这样可在一定程度上减少目标图像块在编码后出现失真的概率;并且,由于本发明实施例主要是通过修正模式决策过程来选取合适的目标预测模式的方式减少失真概率的,因此可实现 在基本不影响压缩效率以及编码复杂度的情况下,有效改善图像压缩质量,提升目标图像块的主观质量。
基于上述视频播放方法实施例的描述,本发明实施例还公开了一种视频播放装置,所述视频播放装置可以是运行于视频播放设备中的一个计算机可读指令(包括程序代码)。该视频播放装置可以执行图4所示的方法。请参见图8,所述视频播放装置可以运行如下单元:
获取单元801,用于获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用图2或图3所示的视频编码方法编码得到;
解码单元802,用于对所述各帧图像的码流数据进行解码,得到所述各帧图像;
显示单元803,用于在播放界面中依次显示所述各帧图像。
根据本发明的一个实施例，图4所示的方法所涉及的各个步骤均可以是由图8所示的视频播放装置中的各个单元来执行的。例如，图4中所示的步骤S401-S403可分别由图8中所示的获取单元801、解码单元802以及显示单元803来执行。根据本发明的另一个实施例，图8所示的视频播放装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成，或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成，这可以实现同样的操作，而不影响本发明的实施例的技术效果的实现。上述单元是基于逻辑功能划分的，在实际应用中，一个单元的功能也可以由多个单元来实现，或者多个单元的功能由一个单元实现。在本发明的其它实施例中，视频播放装置也可以包括其它单元，在实际应用中，这些功能也可以由其它单元协助实现，并且可以由多个单元协作实现。
根据本发明的另一个实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图4中所示的相应方法所涉及的各步骤的计算机可读指令(包括程序代码),来构造如图8中所示的视频播放装置设备,以及来实现本发明实施例的视频播放方法。所述计算机可读指令可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
基于上述视频播放方法实施例以及视频播放装置实施例的描述,本发明实施例还提供一种视频播放设备。请参见图9,该视频播放设备至少包括处理器901、输入接口902、输出接口903、计算机存储介质904以及解码器905。其中,计算机存储介质904可以存储在视频播放设备的存储器中,计算机存储介质904用于存储计算机可读指令,所述计算机可读指令包括程序指令,所述处理器901用于执行计算机存储介质904存储的程序指令。处理器901(或称CPU(Central Processing Unit,中央处理器))是视频播放设备的计算核心以及控制核心,其适于实现一条或多条指令,具体适于加载并执行一条或多条指令从而实现相应方法流程或相应功能;在一个实施例中,本发明实施例所述的处理器901可以用于对目标视频进行一系列的视频播放,包括:获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用图2或图3所示的视频编码方法编码得到;对所述各帧图像的码流数据进行解码,得到所述各帧图像;在播放界面中依次显示所述各帧图像,等等。
本发明实施例还提供了一种计算机存储介质（Memory），所述计算机存储介质是视频播放设备中的记忆设备，用于存放程序和数据。可以理解的是，此处的计算机存储介质既可以包括视频播放设备中的内置存储介质，当然也可以包括视频播放设备所支持的扩展存储介质。计算机存储介质提供存储空间，该存储空间存储了视频播放设备的操作系统。并且，在该存储空间中还存放了适于被处理器901加载并执行的一条或多条的指令，这些指令可以是一个或一个以上的计算机可读指令（包括程序代码）。需要说明的是，此处的计算机存储介质可以是高速RAM存储器，也可以是非易失性存储器（non-volatile memory），例如至少一个磁盘存储器；可选的还可以是至少一个位于远离前述处理器的计算机存储介质。
在一个实施例中,可由处理器901加载并执行计算机存储介质中存放的一条或多条第二指令,以实现上述有关视频播放方法实施例中的方法的相应步骤;具体实现中,计算机存储介质中的一条或多条第二指令由处理器901加载并执行如下步骤:
获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用图2或图3所示的视频编码方法编码得到;
对所述各帧图像的码流数据进行解码,得到所述各帧图像;
在播放界面中依次显示所述各帧图像。
本发明实施例可先获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据。其次,可对各帧图像的码流数据进行解码,得到各帧图像;并在播放界面中依次显示各帧图像。由于目标视频对应的图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据均采用上述的视频编码方法编码得到的;因此可有效减少各图像块出现失真的概率,从而使得在播放界面中显示各帧图像时,能够在一定程度上减少各帧图像出现脏点的概率,提升各帧图像的主观质量。
在一个实施例中,还提供了一种计算机设备,包括存储器和一个或多个处理器,存储器中存储有计算机可读指令,该一个或多个处理器执行计算机可读指令时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一种一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,存储有计算机可读指令,该计算机可读指令被一个或多个处理器执行时实现上述各方法实施例中的步骤。
在一个实施例中,提供了一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机可读指令,所述计算机可读指令存储在计算机可读存储介质中,计算机设备的处理器从所述计算机可读存储介质读取所述计算机可读指令,所述处理器执行所述计算机可读指令,使得所述计算机设备执行上述各方法实施例中的步骤。
以上所揭露的仅为本发明较佳实施例而已,当然不能以此来限定本发明之权利范围,因此依本发明权利要求所作的等同变化,仍属本发明所涵盖的范围。

Claims (17)

  1. 一种视频编码方法,包括:
    获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
    在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
    根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
    根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;及
    采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
  2. 根据权利要求1所述的方法,其特征在于,所述在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果,包括:
    采用参考预测模式对所述目标预测单元中的各像素点进行像素值预测,得到各像素点的预测值;所述参考预测模式为所述至少一种候选预测模式中的任意一种;
    计算所述目标预测单元中的各像素点的像素值和预测值之间的残差绝对值;
    若所述目标预测单元中存在残差绝对值大于目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点;及
    若所述目标预测单元中不存在残差绝对值大于所述目标阈值的像素点,则确定所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在所述异常失真点。
  3. 根据权利要求2所述的方法,其特征在于,所述至少一种候选预测模式为帧间预测模式,所述帧间预测模式包括以下模式中的至少一种:第一预测模式、第二预测模式和第三预测模式;
    所述第一预测模式是指需传输与所述目标图像块相关的参考图像块的索引信息的模式;
    第二预测模式是指需传输所述目标图像块的残差信息以及与所述目标图像块相关的参考图像块的索引信息的模式;及
    第三预测模式是指需传输所述目标图像块的残差信息、所述目标图像块的运动矢量数据,以及与所述目标图像块相关的参考图像块的索引信息的模式。
  4. 根据权利要求3所述的方法,其特征在于,所述目标阈值与所述参考预测模式相关联;
    若所述参考预测模式为所述帧间预测模式中的第一预测模式,则所述目标阈值等于第一阈值;所述第一阈值大于无效数值且小于像素值域的最大值;及
    若所述参考预测模式为所述帧间预测模式中的所述第二预测模式或者所述第三预测模式，则所述目标阈值等于第二阈值；所述第二阈值大于或等于所述第一阈值，且小于所述像素值域的最大值。
  5. 根据权利要求3所述的方法,其特征在于,所述根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集,包括:
    若参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下不存在异常失真点,则保持所述模式信息集中的所述参考预测模式的模式代价不变,以得到校准后的模式信息集;所述参考预测模式为所述帧间预测模式中的任一模式;及
    若所述参考预测模式对应的检测结果指示所述目标预测单元在所述参考预测模式下存在异常失真点,则采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,以得到校准后的模式信息集。
  6. 根据权利要求5所述的方法,其特征在于,所述采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,包括:
    若所述参考预测模式为所述第二预测模式或者所述第三预测模式,则采用惩罚因子对所述参考预测模式的模式代价进行放大处理,得到所述参考预测模式的校准后的模式代价。
  7. 根据权利要求6所述的方法,其特征在于,所述采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,还包括:
    若所述参考预测模式为所述第一预测模式,则获取预设代价;所述预设代价大于所述校准后的模式信息集中除所述第一预测模式以外的各候选预测模式的模式代价,且大于所述第一预测模式在所述模式信息集中的模式代价;及
    在所述模式信息集中,将所述参考预测模式的模式代价调整为所述预设代价。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式,包括:
    从所述多种候选预测模式中，选取所述校准后的模式信息集中模式代价最小的候选预测模式作为目标预测模式。
  9. 根据权利要求6所述的方法,其特征在于,所述采用所述参考预测模式的代价调整策略对所述模式信息集中的所述参考预测模式的模式代价进行调整,还包括:
    若所述参考预测模式为所述第一预测模式,则保持所述模式信息集中的所述第一预测模式的模式代价不变;及
    为所述第一预测模式添加禁用标识,所述禁用标识指示禁止使用所述第一预测模式对所述目标预测单元进行预测处理。
  10. 根据权利要求9所述的方法,其特征在于,所述根据所述校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式,包括:
    若所述校准后的模式信息集中模式代价最小的候选预测模式不为所述第一预测模式，或者所述模式代价最小的候选预测模式为第一预测模式且所述第一预测模式不具有所述禁用标识，则将所述模式代价最小的候选预测模式作为目标预测模式；及
    若所述模式代价最小的候选预测模式为第一预测模式,且所述第一预测模式具有所述禁用标识,则选取所述校准后的模式信息集中模式代价次小的候选预测模式作为目标预测模式。
  11. 根据权利要求1-10任一项所述的方法,其特征在于,所述多种候选预测模式包括帧内预测模式和帧间预测模式,所述帧间预测模式包括至少一种模式;所述方法还包括:
    对所述目标预测单元进行复杂度分析,得到所述目标预测单元的预测复杂度;
    若根据所述预测复杂度确定所述目标预测单元满足预设条件,则采用所述帧内预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据;其中,所述预设条件包括:所述预测复杂度小于或等于所述复杂度阈值,且所述目标预测单元在所述帧间预测模式中的至少一种模式下存在异常失真点;及
    若根据所述预测复杂度确定所述目标预测单元不满足所述预设条件,则执行根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集的步骤。
  12. 一种视频播放方法,由视频播放设备执行,包括:
    获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用如权利要求1-11任一项所述的视频编码方法编码得到;
    对所述各帧图像的码流数据进行解码,得到所述各帧图像;及
    在播放界面中依次显示所述各帧图像。
  13. 一种视频编码装置,包括:
    获取单元,用于获取目标图像块中的目标预测单元及所述目标预测单元的模式信息集,所述模式信息集包括多种候选预测模式及各种所述候选预测模式的模式代价;
    编码单元,用于在所述模式信息集中的至少一种候选预测模式下对所述目标预测单元进行异常失真点检测,得到所述至少一种候选预测模式对应的检测结果;
    所述编码单元,还用于根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集;
    所述编码单元,还用于根据校准后的模式信息集中的各候选预测模式的模式代价,从所述多种候选预测模式中选取目标预测模式;及
    所述编码单元,还用于采用所述目标预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据。
  14. 根据权利要求13所述的装置,其特征在于,所述多种候选预测模式包括帧内预测模式和帧间预测模式,所述帧间预测模式包括至少一种模式,所述编码单元,还用于对所述目标预测单元进行复杂度分析,得到所述目标预测单元的预测复杂度,若根据所述预测复杂度确定所述目标预测单元满足预设条件,则采用所述帧内预测模式对所述目标预测单元进行预测处理,以得到所述目标图像块的编码数据,其中,所述预设条件包括:所述预测复杂度小于或等于所述复杂度阈值,且所述目标预测单元在所述帧间预测模式中的至少一种模式下存在异常失真点,若根据所述预测复杂度确定所述目标预测单元不满足所述预设条件,则执 行根据所述至少一种候选预测模式对应的检测结果,对所述模式信息集中的所述至少一种候选预测模式的模式代价进行校准,得到校准后的模式信息集的步骤。
  15. 一种视频播放装置,包括:
    获取单元,用于获取目标视频对应的图像帧序列中的各帧图像的码流数据,每帧图像的码流数据中包括多个图像块的编码数据;所述图像帧序列中除首帧图像以外的其他帧图像中的各图像块的编码数据采用如权利要求1-11任一项所述的视频编码方法编码得到;
    解码单元,用于对所述各帧图像的码流数据进行解码,得到所述各帧图像;及
    显示单元,用于在播放界面中依次显示所述各帧图像。
  16. 一种计算机存储介质,其特征在于,所述计算机存储介质存储有一条或多条第一指令,所述一条或多条第一指令适于由处理器加载并执行如权利要求1-11任一项所述的视频编码方法;或者,所述计算机存储介质存储有一条或多条第二指令,所述一条或多条第二指令适于由处理器加载并执行如权利要求12所述的视频播放方法。
  17. 一种计算机设备,包括存储器和一个或多个处理器,所述存储器存储有计算机可读指令,其特征在于,所述计算机可读指令被所述一个或多个处理器执行时,使得所述一个或多个处理器实现权利要求1至11或12中任一项所述的方法的步骤。
PCT/CN2021/089770 2020-05-25 2021-04-26 视频编码方法、视频播放方法、相关设备及介质 WO2021238546A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/719,691 US20220239904A1 (en) 2020-05-25 2022-04-13 Video Encoding Method, Video Playback Method, Related Device, and Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010452023.1 2020-05-25
CN202010452023.1A CN111629206A (zh) 2020-05-25 2020-05-25 视频编码方法、视频播放方法、相关设备及介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/719,691 Continuation US20220239904A1 (en) 2020-05-25 2022-04-13 Video Encoding Method, Video Playback Method, Related Device, and Medium

Publications (1)

Publication Number Publication Date
WO2021238546A1 (zh)

Family

ID=72259166

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/089770 WO2021238546A1 (zh) 2020-05-25 2021-04-26 视频编码方法、视频播放方法、相关设备及介质

Country Status (3)

Country Link
US (1) US20220239904A1 (zh)
CN (1) CN111629206A (zh)
WO (1) WO2021238546A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111629206A (zh) * 2020-05-25 2020-09-04 腾讯科技(深圳)有限公司 视频编码方法、视频播放方法、相关设备及介质
CN116486259A (zh) * 2023-04-04 2023-07-25 自然资源部国土卫星遥感应用中心 遥感图像中的点目标的提取方法和装置
CN117440156A (zh) * 2023-09-22 2024-01-23 书行科技(北京)有限公司 视频编码方法、视频发布方法及相关产品

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104902271A (zh) * 2015-05-15 2015-09-09 腾讯科技(北京)有限公司 预测模式选择方法及装置
CN105898297A (zh) * 2016-04-29 2016-08-24 上海高智科技发展有限公司 一种基于hevc的快速模式选择方法及系统
US20170180748A1 (en) * 2012-04-17 2017-06-22 Texas Instruments Incorporated Memory bandwidth reduction for motion compensation in video coding
CN109547798A (zh) * 2018-12-17 2019-03-29 杭州当虹科技股份有限公司 一种快速的hevc帧间模式选择方法
CN111629206A (zh) * 2020-05-25 2020-09-04 腾讯科技(深圳)有限公司 视频编码方法、视频播放方法、相关设备及介质

Also Published As

Publication number Publication date
CN111629206A (zh) 2020-09-04
US20220239904A1 (en) 2022-07-28

Legal Events

Code Title Description
121  Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21812510; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122  Ep: pct application non-entry in european phase (Ref document number: 21812510; Country of ref document: EP; Kind code of ref document: A1)