CN113115075B - Method, device, equipment and storage medium for enhancing video image quality

Method, device, equipment and storage medium for enhancing video image quality

Info

Publication number
CN113115075B
CN113115075B (application CN202110309945.1A)
Authority
CN
China
Prior art keywords
image quality
video frame
target video
quality enhancement
frame
Prior art date
Legal status
Active
Application number
CN202110309945.1A
Other languages
Chinese (zh)
Other versions
CN113115075A (en)
Inventor
朱辉
陀健
章军海
黄浩填
曾招华
Current Assignee
Guangzhou Huya Technology Co Ltd
Original Assignee
Guangzhou Huya Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huya Technology Co Ltd filed Critical Guangzhou Huya Technology Co Ltd
Priority to CN202110309945.1A
Publication of CN113115075A
Application granted
Publication of CN113115075B
Status: Active
Anticipated expiration

Classifications

    • H04N 21/2343 - Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234363 - Reformatting by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H04N 21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440263 - Reformatting by altering the spatial resolution, e.g. for displaying on a connected PDA
    • G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4092 - Image resolution transcoding, e.g. by using client-server architectures
    • G06T 5/00 - Image enhancement or restoration
    • G06T 2207/10016 - Video; Image sequence
    • G06T 2207/20081 - Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure provides a method, apparatus, device, and storage medium for enhancing video image quality. A target video frame is input into an image quality enhancement model for processing, and the image quality enhancement effect of each non-target video frame is quickly constructed by combining the decoding information with the image quality enhancement data of the target video frame. The enhancement effect obtained for the non-target video frames is comparable to that of processing them with the image quality enhancement model but requires much less processing time, so the processing speed of the video is greatly increased and the method is suitable for scenes with real-time processing requirements, such as live video.

Description

Method, device, equipment and storage medium for enhancing video image quality
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for enhancing video image quality.
Background
Image quality refers to picture quality; if image information is lost during compression, the picture quality is correspondingly reduced. The basic principle of image quality enhancement is to analyze an image, determine data such as color information and object edges, and enhance that data to a certain degree through enhancement algorithms, such as sharpening object edges and improving contrast and saturation, so that the image quality of the image is improved. However, current image quality enhancement methods are not efficient when processing video and are difficult to apply to scenes with real-time requirements.
Disclosure of Invention
To overcome the problems in the related art, the present specification provides a method, apparatus, device, and storage medium for video image quality enhancement.
According to a first aspect of embodiments of the present disclosure, there is provided a method for enhancing video image quality, including:
acquiring a video frame set and decoding information from decoded data of a video, wherein the video frame set comprises a target video frame and at least one non-target video frame, and a time sequence difference value between the target video frame and the non-target video frame is smaller than or equal to a preset value;
for the set of video frames, performing the following:
inputting a target video frame into a trained image quality enhancement model, and acquiring the target video frame with enhanced image quality and image quality enhancement data of the target video frame by using the image quality enhancement model;
and predicting the image quality enhancement data of each non-target video frame based on the decoding information and the image quality enhancement data of the target video frame, and performing image quality enhancement processing on the non-target video frame by utilizing the image quality enhancement data of the non-target video frame.
In some examples, the obtaining the target video frame with enhanced image quality and the image quality enhancement data of the target video frame using the image quality enhancement model includes:
obtaining an image-quality-enhanced target video frame output by the image-quality enhancement model;
and obtaining the image quality enhancement data of the target video frame according to the comparison of the target video frame before and after the image quality enhancement.
In some examples, the non-target video frame includes at least one inter block, the decoding information includes inter information, and the inter information includes a motion vector of the inter block;
the image quality enhancement data of the inter block of the non-target video frame is obtained based on the following steps:
and searching the target video frame, based on the motion vector, for an inter-frame block that matches the inter-frame block of the non-target video frame, and taking the image quality enhancement data of the matched inter-frame block in the target video frame as the image quality enhancement data of the inter-frame block in the non-target video frame.
In some examples, the non-target video frame further comprises at least one intra block, and the decoding information further comprises intra information;
the image quality enhancement data of the intra-frame blocks of the non-target video frame are predicted based on the intra-frame information and the image quality enhancement data of the inter-frame blocks in the non-target video frame.
In some examples, the intra information includes a correlation parameter for predicting the intra block using a surrounding image block of the intra block in a decoding process, the correlation parameter including a sampling value of the surrounding image block, a filling direction of the sampling value, and a weighting coefficient of the sampling value;
the image quality enhancement data of the intra block of the non-target video frame is obtained based on the following steps:
obtaining sampling values of image quality enhancement data of surrounding image blocks of intra-frame blocks in the non-target video frame based on the sampling values in the related parameters and the image quality enhancement data of the inter-frame blocks in the non-target video frame;
and taking the filling direction and the weighting coefficient of the sampling value in the related parameters as the filling direction and the weighting coefficient of the sampling value of the image quality enhancement data, carrying out weighted calculation by utilizing the sampling value and the weighting coefficient of the image quality enhancement data, and filling according to the filling direction of the sampling value of the image quality enhancement data to obtain the image quality enhancement data of the intra-frame block.
In some examples, the decoding information further includes parameters of deblocking filtering;
after obtaining the image quality enhancement data of the non-target video frame, the method further comprises the following steps:
and eliminating the blocking effect in the image quality enhancement data of the non-target video frame by using the parameters of the deblocking filtering.
In some examples, the video includes live video received by a live client.
According to a second aspect of embodiments of the present specification, there is provided an apparatus for video image quality enhancement, comprising:
an acquisition module for acquiring a set of video frames of an encoded video and decoding information, the set of video frames comprising a target video frame and at least one non-target video frame, the video frames in the set of video frames being sequential in time;
the enhancement module is used for executing the following processing for the video frame set:
inputting a target video frame into a trained image quality enhancement model, and acquiring the target video frame with enhanced image quality and image quality enhancement data of the target video frame by using the image quality enhancement model;
and predicting the image quality enhancement data of each non-target video frame based on the decoding information and the image quality enhancement data of the target video frame, and performing image quality enhancement processing on the non-target video frame by utilizing the image quality enhancement data of the non-target video frame.
According to a third aspect of the embodiments of the present description, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the methods of the embodiments of the present description.
According to a fourth aspect of embodiments of the present specification, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements any of the methods of the embodiments of the present specification when the program is executed.
The technical solutions provided by the embodiments of the specification may have the following beneficial effects:
The embodiments of the specification disclose a method, an apparatus, a device, and a storage medium for enhancing video image quality. In the method, a target video frame is input into an image quality enhancement model for processing, and the image quality enhancement effect of each non-target video frame is quickly constructed by combining the decoding information with the image quality enhancement data of the target video frame. The enhancement effect obtained for the non-target video frames is comparable to that of processing them with the image quality enhancement model but requires much less processing time, so the processing speed of the video is greatly increased and the method is suitable for scenes with real-time processing requirements, such as live video.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the specification and together with the description, serve to explain the principles of the specification.
FIG. 1 is a flow chart illustrating a method of video image quality enhancement according to an exemplary embodiment of the present disclosure;
FIG. 2A is a schematic diagram of a target video frame before processing, according to an exemplary embodiment of the present disclosure;
FIG. 2B is a schematic diagram of a target video frame after processing, according to an exemplary embodiment of the present disclosure;
FIG. 2C is a schematic diagram of the image quality enhancement data of a target video frame, according to an exemplary embodiment of the present disclosure;
FIG. 3A is a schematic diagram of a non-target video frame before processing, according to an exemplary embodiment of the present disclosure;
FIG. 3B is a schematic diagram of the image quality enhancement data of a non-target video frame, according to an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a non-target video frame after processing, according to an exemplary embodiment of the present disclosure;
FIG. 5 is a hardware structure diagram of a computer device in which the apparatus for enhancing video image quality according to an embodiment of the present disclosure is located;
FIG. 6 is a block diagram of an apparatus for video image quality enhancement according to an exemplary embodiment of the present specification.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present specification. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present description as detailed in the accompanying claims.
The terminology used in the description presented herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the description. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the present description, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
Image quality enhancement processing analyzes an image to determine data such as color information and object edges, and enhances that data to a certain degree through enhancement algorithms, such as sharpening object edges and improving contrast and saturation, so that the image quality of the image is improved. This is important for enhancing the viewer's experience.
In the related art, image quality enhancement methods are mainly divided into methods based on traditional image processing techniques and methods based on deep learning. Methods based on traditional image processing directly process the image through guided filtering, bilateral filtering, and the like; this approach easily blurs image details, so the effect is poor. Methods based on deep learning process images through a trained deep learning model, and the processing effect is better than that of the former; however, because every frame must be re-analyzed during processing, the computation is large and the processing efficiency is low, making it difficult to apply to live scenes. On this basis, the present specification provides a method of video image quality enhancement to solve the above problems.
Next, embodiments of the present specification will be described in detail.
As shown in fig. 1, fig. 1 is a flowchart illustrating a method for video image quality enhancement according to an exemplary embodiment of the present disclosure, the method comprising:
in step 101, acquiring a video frame set and decoding information from decoded data of a video, wherein the video frame set comprises a target video frame and at least one non-target video frame, and a time sequence difference value between the target video frame and the non-target video frame is smaller than or equal to a preset value;
in step 102, for the set of video frames, the following processing is performed:
in step 1021, inputting a target video frame into a trained image quality enhancement model, and obtaining the target video frame with enhanced image quality and image quality enhancement data of the target video frame by using the image quality enhancement model;
in step 1022, based on the decoding information and the image quality enhancement data of the target video frame, image quality enhancement data of each of the non-target video frames is predicted, and image quality enhancement processing is performed on the non-target video frames using the image quality enhancement data of the non-target video frames.
The embodiment of the present disclosure provides a scheme for performing image quality enhancement processing on video, which can be applied to a client that needs to perform such processing, for example, a live client. Specifically, the image quality enhancement effect to be achieved may include tuning the color, brightness, contrast, and saturation of the picture, enhancing the resolution of the picture, or enhancing edge texture information and removing scratches, noise, and the like, which is not limited in this embodiment.
Video image data has strong correlation and therefore contains a large amount of redundant information; this redundant information is removed during video encoding to achieve compression and facilitate storage and transmission. In this embodiment, the correlation between video frames is obtained based on the decoding information, and the processing results of the trained image quality enhancement model on part of the video frames are combined to construct the image quality enhancement effects of the other video frames.
The decoded data of the video mentioned in step 101 may refer to data obtained after the client decodes the video stream through a decoding module such as a decoder. The video frame set is composed of video frames obtained after decoding the video stream, and comprises a target video frame and at least one non-target video frame, wherein the time sequence difference between the target video frame and the non-target video frame is smaller than or equal to a preset value. For example, suppose a group of video frames obtained after decoding the video stream is ordered as frame 1, frame 2, ..., frame 20 according to their timestamps. If the preset value is 1, frame 1 and frame 2 can be determined as one video frame set, frame 3 and frame 4 as another set, and so on; if the preset value is 2, frame 1, frame 2, and frame 3 may form one video frame set and frame 4, frame 5, and frame 6 another, or frame 1 and frame 3 may form one set and frame 2 and frame 4 another, and so on. A video frame set may also include more than one target video frame; for example, if a video frame set contains frame 1, frame 2, and frame 3, the target video frame may be any one or any two of the three frames, so there are six possible combinations of target and non-target video frames in the set. It will be appreciated that, since the time sequence difference between the target video frame and the non-target video frame does not exceed the preset value, a strong correlation between them can be maintained. The preset value may be set according to the needs of a specific scenario, which is not limited in this embodiment.
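For illustration only, the following sketch shows one possible way to realize the grouping described above. It assumes the decoded frames are already ordered by timestamp, groups every preset value + 1 consecutive frames into one video frame set, and takes the earliest frame of each set as the target video frame; these choices are illustrative and not mandated by the embodiment.

```python
# Minimal sketch of grouping decoded frames into video frame sets.
# Assumes `frames` is a list of decoded frames ordered by timestamp and
# that the first frame of each set serves as the target video frame.

def build_frame_sets(frames, preset_value=1):
    """Split frames into sets of preset_value + 1 consecutive frames."""
    set_size = preset_value + 1
    frame_sets = []
    for start in range(0, len(frames), set_size):
        group = frames[start:start + set_size]
        if not group:
            continue
        target = group[0]          # target video frame (earliest in the set)
        non_targets = group[1:]    # non-target video frames
        frame_sets.append((target, non_targets))
    return frame_sets

# Example: 20 frames with a preset value of 1 -> 10 sets of (target, [non-target]).
frame_sets = build_frame_sets(list(range(20)), preset_value=1)
assert len(frame_sets) == 10
```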
The decoding information refers to information utilized during the decoding of the video. In general, the encoding process of a video includes predictive encoding, transformation, quantization, entropy encoding, and so on, and the decoding process includes entropy decoding, inverse quantization, inverse transformation, predictive decoding, and so on; the information involved in these processes is the decoding information mentioned in this embodiment. Since the decoder parses out all of this information while decoding the video, the decoding information can be obtained through a callback function inserted inside the decoder, which passes the information out. Of course, in other embodiments, the decoding information may be obtained in other ways, such as from the encoding information of the encoding end, which is not limited in this specification.
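As a purely illustrative sketch of the callback idea, the snippet below collects inter information, intra information and deblocking parameters as blocks are decoded. The decoder interface and the field names (motion_vector, intra_params, deblock_params) are hypothetical; a real decoder would need to be patched or configured to expose this information in its own way.

```python
# Hypothetical callback for collecting decoding information; the decoder object
# and the keys of `info` are assumptions, not the API of any specific library.

collected = {"inter": [], "intra": [], "deblock": []}

def on_block_decoded(frame_idx, block_id, info):
    """Called by the (hypothetical) decoder for every reconstructed block."""
    if info.get("motion_vector") is not None:
        collected["inter"].append((frame_idx, block_id, info["motion_vector"]))
    if info.get("intra_params") is not None:
        collected["intra"].append((frame_idx, block_id, info["intra_params"]))
    if info.get("deblock_params") is not None:
        collected["deblock"].append((frame_idx, info["deblock_params"]))

# decoder = Decoder(stream, block_callback=on_block_decoded)   # hypothetical API
# frames = decoder.decode_all()
```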
Step 102 performs image quality enhancement processing on the video frames in the video frame set, and includes processing the target video frame in step 1021 and processing the non-target video frames in step 1022. For the target video frame, this embodiment performs image quality enhancement based on the trained image quality enhancement model. The image quality enhancement model mentioned in step 1021 may be trained on paired sample data consisting of low-quality image data and high-quality image data. The high-quality image data may be obtained by a professional performing image quality enhancement on the low-quality image data; alternatively, when the low-quality image data results from degradation operations such as imperfect recording and transmission of original image data, the high-quality image data may be the original image corresponding to the low-quality image data.
In different application scenarios, a corresponding image quality enhancement model can be trained with a large amount of sample data according to the image quality enhancement effect to be achieved. For example, in an application scenario that improves image resolution, a neural network may be trained on low-resolution and high-resolution image data as sample pairs to fit a function mapping low-resolution image data to high-resolution image data, thereby obtaining a corresponding image quality enhancement model. The image quality enhancement model in this embodiment may be obtained based on a deep learning algorithm, such as a convolutional neural network. How the model is trained may follow training means in the related art, which is not limited here.
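For illustration only, the following PyTorch sketch trains a small image quality enhancement model on paired low-quality/high-quality samples. The three-layer convolutional architecture, the residual formulation and the L1 loss are assumptions chosen for brevity, not the specific model prescribed by this embodiment.

```python
# Minimal training sketch (PyTorch), assuming paired tensors `low_q` / `high_q`
# of shape (N, 3, H, W) in [0, 1]; architecture and hyperparameters are illustrative.
import torch
import torch.nn as nn

class EnhanceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # Predict a residual and add it back: output = input + enhancement data.
        return x + self.body(x)

def train(model, low_q, high_q, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(low_q), high_q)
        loss.backward()
        opt.step()
    return model
```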
In some examples, obtaining the image-quality-enhanced target video frame and the image quality enhancement data of the target video frame using the image quality enhancement model, as mentioned in step 1021, includes: obtaining the image-quality-enhanced target video frame output by the image quality enhancement model; and obtaining the image quality enhancement data of the target video frame by comparing the target video frame before and after image quality enhancement. In other words, after the target video frame is input into the trained model, the model outputs the enhanced target video frame, and the difference between the frame before and after enhancement is the model's processing data for that frame, namely the image quality enhancement data of the target video frame.
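A minimal sketch of this difference-based definition of the image quality enhancement data is given below, assuming frames are 8-bit arrays and model_infer is any callable wrapping the trained model; the same fuse step is what later attaches predicted enhancement data to a non-target video frame.

```python
import numpy as np

def target_enhancement_data(target_frame, model_infer):
    """Run the model on the target frame and return (enhanced frame, enhancement data)."""
    # `model_infer` is any callable mapping an 8-bit frame to its enhanced version.
    enhanced = model_infer(target_frame).astype(np.int16)
    enhancement_data = enhanced - target_frame.astype(np.int16)  # per-pixel difference
    return enhanced.clip(0, 255).astype(np.uint8), enhancement_data

def fuse(frame, enhancement_data):
    """Fuse (predicted) enhancement data back onto a frame."""
    return (frame.astype(np.int16) + enhancement_data).clip(0, 255).astype(np.uint8)
```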
For the non-target video frames, this embodiment predicts the image quality enhancement data of each non-target video frame based on the image quality enhancement data of the target video frame and the decoding information, and then performs image quality enhancement on them. As mentioned above, there is a strong correlation between the target video frame and the non-target video frames, so their image quality enhancement data are also highly similar. Therefore, once the image quality enhancement data of the target video frame is obtained, the image quality enhancement data of the non-target video frames can be predicted in combination with the decoding information, which indicates the correlation between the target and non-target video frames. Since the image quality enhancement data of a non-target video frame is derived from the image quality enhancement data of the target video frame and the decoding information, the non-target video frame does not need to be fully analyzed; compared with inputting the non-target video frame into the image quality enhancement model, the processing time can be greatly shortened while an equivalent processing effect is maintained.
It will be appreciated that video encoding is typically block-based: an encoded image is usually divided into macroblocks (also referred to as image blocks), each typically 16×16 pixels in size. According to the predictive decoding scheme, macroblocks can be divided into inter blocks and intra blocks: an inter block is a macroblock reconstructed by inter prediction using a decoded picture as a reference picture, and an intra block is a macroblock reconstructed by intra prediction using already decoded pixels in the current frame as a reference.
In general, the macroblocks of a video frame may include only inter blocks, or both inter blocks and intra blocks. In some examples, the non-target video frame includes at least one inter block, and the decoding information includes inter information containing the motion vector of the inter block. The image quality enhancement data of an inter block of the non-target video frame is obtained as follows: based on the motion vector, search the target video frame for the inter block that matches the inter block of the non-target video frame, and take the image quality enhancement data of that inter block in the target video frame as the image quality enhancement data of the inter block in the non-target video frame. In video coding, a motion vector (Motion Vector, MV) refers to the relative displacement between the current coding block and the best matching block in its reference picture. In this embodiment, a macroblock matching an inter block of the non-target video frame can be found among the macroblocks of the target video frame based on the motion vector; that macroblock is the matching inter block of the target video frame. Because the inter block of the non-target video frame and the matching inter block of the target video frame are very similar, and similar blocks have similar enhancement effects, the image quality enhancement data of the inter block in the target video frame can be applied directly to the matching inter block in the non-target video frame. For example, suppose the target video frame is frame 1 with macroblocks block 1 to block 4, and the non-target video frame is frame 2 with macroblocks block 5 to block 8. If block 5 is an inter block of the non-target video frame, the inter block matching block 5 can be found in the target video frame based on the motion vector; if the result is block 1, the image quality enhancement data of block 1 in the target video frame can be directly used as the image quality enhancement data of block 5 in the non-target video frame.
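For illustration, the sketch below copies the enhancement data of the matching block in the target video frame to an inter block of the non-target video frame using the block's motion vector. It assumes integer-pixel motion vectors, a single reference frame and in-bounds coordinates; real codecs also use sub-pixel motion and multiple reference pictures.

```python
import numpy as np

def copy_inter_block_enhancement(target_enh, non_target_enh, block_xy, block_size, mv):
    """Copy enhancement data of the matching block in the target frame onto an
    inter block of the non-target frame.

    target_enh / non_target_enh: arrays of enhancement data (H x W or H x W x C).
    block_xy: (x, y) top-left corner of the inter block in the non-target frame.
    mv: (dx, dy) motion vector pointing to the best matching block in the target frame.
    """
    x, y = block_xy
    dx, dy = mv
    src_x, src_y = x + dx, y + dy  # position of the matching block in the target frame
    non_target_enh[y:y + block_size, x:x + block_size] = \
        target_enh[src_y:src_y + block_size, src_x:src_x + block_size]
    return non_target_enh
```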
In some examples, the non-target video frame further comprises at least one intra block, and the decoding information further comprises intra information; the image quality enhancement data of the intra blocks of the non-target video frame are predicted based on the intra information and the image quality enhancement data of the inter blocks in the non-target video frame. In video coding, intra prediction is a key step: a predicted macroblock is constructed using pixels to the left of and above the macroblock, and the difference between the predicted macroblock and the real macroblock is calculated; because the stored data is this difference rather than the pixel values of the real macroblock, the amount of data is reduced. The intra information is the information used during intra prediction, so it indicates the correlation between an intra block and its surrounding image blocks. In this embodiment, the intra information is used to indicate the correlation between the image quality enhancement data of an intra block and that of its surrounding image blocks.
When the non-target video frame includes both inter blocks and intra blocks, the image quality enhancement data of the inter blocks can first be obtained based on the inter information and the image quality enhancement data of the target video frame, and the image quality enhancement data of the intra blocks can then be obtained based on the intra information and the image quality enhancement data of the inter blocks of the non-target video frame. Here, the surrounding image blocks generally refer to the image blocks to the left of and above the current block. When a surrounding image block of an intra block of the non-target video frame is itself an intra block, the image quality enhancement data of the intra blocks whose surrounding image blocks are all inter blocks are obtained first, and then the image quality enhancement data of the intra blocks whose surrounding image blocks include intra blocks are obtained. For example, suppose the non-target video frame has nine macroblocks, block 1 through block 9, where block 1 to block 7 are inter blocks, block 8 and block 9 are intra blocks, and the surrounding image blocks of block 9 are block 6 and block 8. Since block 8 is also an intra block, the image quality enhancement data of block 8 is calculated first: its surrounding image blocks, block 5 and block 7, are inter blocks, so its image quality enhancement data can be obtained from the image quality enhancement data of block 5 and block 7 together with the intra information. After that, the image quality enhancement data of block 9 is obtained from the image quality enhancement data of block 6 and block 8 together with the intra information.
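The processing order described above (inter blocks first, then intra blocks whose referenced neighbours are already available) can be realized, for example, by the simple dependency-ordering sketch below; the data structures are illustrative.

```python
def intra_processing_order(intra_blocks, neighbors, available):
    """Return an order in which intra blocks can be processed so that every block
    is handled only after its referenced neighbours are available.

    intra_blocks: iterable of block ids that are intra coded.
    neighbors: dict mapping block id -> ids of the surrounding blocks it references.
    available: set of block ids whose enhancement data already exists (the inter blocks).
    """
    pending = set(intra_blocks)
    order = []
    while pending:
        ready = [b for b in pending if all(n in available for n in neighbors[b])]
        if not ready:
            raise ValueError("circular dependency between intra blocks")
        for b in sorted(ready):
            order.append(b)
            available.add(b)
            pending.remove(b)
    return order

# The nine-block example above: blocks 1-7 are inter blocks, 8 and 9 are intra blocks.
neighbors = {8: [5, 7], 9: [6, 8]}
print(intra_processing_order([8, 9], neighbors, available=set(range(1, 8))))  # [8, 9]
```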
Specifically, the intra information includes the relevant parameters used, during decoding, to predict the intra block from its surrounding image blocks; these parameters include the sampling values of the surrounding image blocks, the filling direction of the sampling values, and the weighting coefficients of the sampling values. The image quality enhancement data of an intra block of the non-target video frame is obtained as follows: based on the sampling values in the relevant parameters and the image quality enhancement data of the inter blocks in the non-target video frame, obtain the sampling values of the image quality enhancement data of the surrounding image blocks of the intra block; then, taking the filling direction and weighting coefficients of the sampling values in the relevant parameters as the filling direction and weighting coefficients of the sampling values of the image quality enhancement data, perform a weighted calculation with the sampling values and weighting coefficients of the image quality enhancement data, and fill according to the filling direction to obtain the image quality enhancement data of the intra block. It will be appreciated that intra prediction usually performs a weighted calculation on the selected reference pixels with the corresponding weights according to a prediction mode such as horizontal, vertical, or oblique prediction, so as to predict the intra block. In this embodiment, the three relevant parameters in the intra information indicate which macroblocks to the left and above were selected during intra prediction, whether the prediction direction is horizontal, vertical, or oblique, and the weights corresponding to the selected surrounding macroblocks. Therefore, after the image quality enhancement data of the inter blocks is obtained, the sampling values of the image quality enhancement data of the surrounding image blocks of the intra block, that is, the reference macroblock data selected for predicting the image quality enhancement data of the intra block, can be determined based on the sampling values in the relevant parameters. The filling direction and weighting coefficients of those sampling values are then used as the filling direction and weighting coefficients of the reference macroblock data, the reference macroblock data is weighted accordingly, and the result is filled in according to the filling direction, thereby obtaining the image quality enhancement data of the intra block.
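The following sketch gives a highly simplified version of this weighted prediction for the enhancement data, covering only horizontal and vertical filling; in practice the sampling positions, prediction direction and weight values all come from the intra information extracted by the decoder, and oblique (angular) modes would need corresponding handling.

```python
import numpy as np

def predict_intra_block_enhancement(samples, weights, block_shape, fill_direction):
    """Predict the enhancement data of an intra block from sampled enhancement data
    of its surrounding blocks.

    samples: 1-D array of enhancement-data samples taken from surrounding blocks.
    weights: weighting coefficients taken from the intra prediction parameters.
    fill_direction: 'horizontal' or 'vertical', a simplified stand-in for the
        filling direction carried in the relevant parameters.
    """
    samples = np.asarray(samples, dtype=np.float32)
    weights = np.asarray(weights, dtype=np.float32)
    weighted = samples * weights                 # weighted reference samples
    h, w = block_shape
    if fill_direction == "horizontal":
        # each row is filled with one weighted reference value (left neighbours)
        block = np.repeat(weighted[:h, None], w, axis=1)
    else:
        # each column is filled with one weighted reference value (top neighbours)
        block = np.repeat(weighted[None, :w], h, axis=0)
    return block.round().astype(np.int16)
```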
To further improve the image quality of the processed non-target video frames, in some examples the decoding information further includes the parameters of deblocking filtering; after the image quality enhancement data of the non-target video frame is obtained, the method further comprises: eliminating the blocking effect in the image quality enhancement data of the non-target video frame by using the parameters of the deblocking filtering. The parameters of deblocking filtering are mainly offsets, which adjust the filtering strength. When the filtering strength needs to be increased, a positive offset removes the blocking effect caused by suboptimal motion estimation and improper coding mode selection and improves the subjective quality of the image; when the filtering strength needs to be reduced, a negative offset protects image details from being blurred by the smoothing action of the filter. In video decoding, after the reconstructed image is obtained by predictive decoding, a loop filtering (also called deblocking filtering) step is usually performed, aimed at removing the blocking effect introduced mainly by the quantization that follows the DCT (Discrete Cosine Transform). In this embodiment, a certain blocking effect is also introduced during the inter prediction process, so eliminating the blocking effect in the predicted image quality enhancement data of the non-target video frame with the deblocking filtering parameters can improve the image quality of the processed non-target video frame.
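As a rough illustration of this deblocking step, the sketch below smooths the predicted enhancement data across block boundaries, with a strength parameter standing in for the deblocking offset. It is a stand-in for the codec's actual deblocking filter, whose boundary-strength decisions and offsets come from the bitstream.

```python
import numpy as np

def smooth_block_boundaries(enh, block_size=16, strength=0.5):
    """Reduce blocking artefacts in predicted enhancement data by averaging the
    two pixel rows/columns that meet at every block boundary.

    `strength` plays the role of the deblocking offset: larger values smooth more,
    0 disables the filtering entirely.
    """
    out = enh.astype(np.float32)
    h, w = out.shape[:2]
    for x in range(block_size, w, block_size):          # vertical boundaries
        avg = (out[:, x - 1] + out[:, x]) / 2.0
        out[:, x - 1] = (1 - strength) * out[:, x - 1] + strength * avg
        out[:, x] = (1 - strength) * out[:, x] + strength * avg
    for y in range(block_size, h, block_size):          # horizontal boundaries
        avg = (out[y - 1, :] + out[y, :]) / 2.0
        out[y - 1, :] = (1 - strength) * out[y - 1, :] + strength * avg
        out[y, :] = (1 - strength) * out[y, :] + strength * avg
    return out.round().astype(enh.dtype)
```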
As described above, the image quality enhancement data of the target video frame can be obtained by comparing the target video frame before and after the image quality enhancement processing. Correspondingly, after the image quality enhancement data of a non-target video frame is obtained, it is fused onto the non-target video frame to obtain the image-quality-enhanced non-target video frame.
The target video frame is input into the image quality enhancement model for processing, and the image quality enhancement effect of the non-target video frames is quickly constructed by combining the decoding information with the image quality enhancement data of the target video frame. The enhancement effect of the non-target video frames is therefore comparable to that of the image quality enhancement model while requiring much less processing time, so the processing speed of the video is greatly increased and video in a live scene can be processed in real time. That is, the video in this embodiment may include live video received by a live client. Of course, the method of this embodiment is equally applicable to other scenes that require real-time video processing.
To further explain the video image quality enhancement method of the present specification, a specific embodiment is described below:
when the image quality enhancement processing is carried out by applying a deep learning-based method, each frame of video frame is input into a trained image quality enhancement model for processing, and the average processing time length of each video frame is 30ms, so that the total processing time length of the video is 600ms, and the real-time requirement of a live broadcast scene cannot be met.
When the video image quality enhancement method of the present specification is applied to the live video, the processing flow is as follows:
S201, obtaining the decoding information and 10 video frame sets from the decoded data of the video, wherein each video frame set comprises two consecutive video frames, the earlier frame being the target video frame and the later frame being the non-target video frame; the decoding information comprises inter-frame information, intra-frame information and deblocking filtering parameters, the inter-frame information comprises the motion vectors of the inter-frame blocks, and the intra-frame information comprises the relevant parameters used to predict the intra-frame blocks from their surrounding image blocks during decoding;
S202, for each video frame set, executing the following steps S203 to S206;
S203, inputting the target video frame into the trained image quality enhancement model to obtain the processed target video frame output by the image quality enhancement model;
S204, obtaining the image quality enhancement data of the target video frame by comparing the target video frame before and after the image quality enhancement processing; referring to fig. 2A, 2B and 2C, fig. 2A is a schematic diagram of the target video frame before processing, fig. 2B is a schematic diagram of the target video frame after processing, and fig. 2C is a schematic diagram of the image quality enhancement data of the target video frame according to the embodiment of the present disclosure; as can be seen by comparing fig. 2A and fig. 2B, the image in fig. 2B has higher resolution and clearer details than the image in fig. 2A, so its image quality is better;
S205, the macroblocks of the non-target video frame comprise inter-frame blocks and intra-frame blocks, and the image quality enhancement data of the non-target video frame is obtained as follows:
for the inter-frame blocks of the non-target video frame, searching the target video frame, based on the motion vectors, for the inter-frame blocks matching those of the non-target video frame, and taking the image quality enhancement data of the matched inter-frame blocks in the target video frame as the image quality enhancement data of the inter-frame blocks in the non-target video frame;
for the intra-frame blocks of the non-target video frame, obtaining the sampling values of the image quality enhancement data of the surrounding image blocks of the intra-frame blocks based on the sampling values in the relevant parameters and the image quality enhancement data of the inter-frame blocks in the non-target video frame; taking the filling direction and the weighting coefficients of the sampling values in the relevant parameters as the filling direction and the weighting coefficients of the sampling values of the image quality enhancement data, performing a weighted calculation with the sampling values and weighting coefficients of the image quality enhancement data, and filling according to the filling direction of the sampling values of the image quality enhancement data to obtain the image quality enhancement data of the intra-frame blocks;
combining the image quality enhancement data of the inter-frame blocks and the intra-frame blocks to obtain the predicted image quality enhancement data of the non-target video frame, and eliminating the blocking effect in the predicted data by using the deblocking filtering parameters to obtain the image quality enhancement data of the non-target video frame; referring to fig. 3A and 3B, fig. 3A is a schematic diagram of the non-target video frame before processing, and fig. 3B is a schematic diagram of the image quality enhancement data of the non-target video frame according to the embodiment of the present disclosure;
S206, fusing the image quality enhancement data of the non-target video frame onto the non-target video frame to obtain the processed non-target video frame; referring to fig. 4, fig. 4 is a schematic diagram of the processed non-target video frame, showing the effect of fusing the image quality enhancement data shown in fig. 3B onto the non-target video frame shown in fig. 3A; it can be seen that the image quality enhancement effect of the non-target video frame is substantially consistent with the effect obtained by inputting the target video frame into the image quality enhancement model.
In this embodiment, the average processing time of each target video frame is 30 ms and the average processing time of each non-target video frame is 3 ms, so the total processing time of the video is 330 ms. Compared with the purely deep-learning-based approach, an equivalent processing effect is maintained while the processing time is greatly shortened, so the real-time requirement of a live scene can be met.
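The timing comparison above can be checked with the following simple calculation, using the figures given in this example.

```python
# Timing comparison for the 20-frame example above (values taken from this description).
frames_total = 20
per_frame_model_ms = 30      # model inference per frame
per_frame_predict_ms = 3     # constructing enhancement data from decoding information

baseline_ms = frames_total * per_frame_model_ms                    # 20 * 30 = 600 ms
proposed_ms = 10 * per_frame_model_ms + 10 * per_frame_predict_ms  # 300 + 30 = 330 ms
print(baseline_ms, proposed_ms)
```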
Corresponding to the embodiment of the method, the present specification also provides embodiments of the apparatus for enhancing video image quality and the terminal to which the apparatus is applied.
The embodiments of the video image quality enhancement apparatus of the present specification may be applied to a computer device, such as a server or a terminal device. The apparatus embodiments may be implemented by software, or by hardware or a combination of hardware and software. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the computer device in which it is located reading the corresponding computer program instructions from the non-volatile memory into memory and running them. In terms of hardware, fig. 5 shows a hardware structure diagram of a computer device in which the video image quality enhancement apparatus of an embodiment of the present disclosure is located. In addition to the processor 510, memory 530, network interface 520, and non-volatile memory 540 shown in fig. 5, the server or electronic device in which the apparatus 531 is located may generally include other hardware according to the actual function of the computer device, which will not be described here again.
Accordingly, the present specification embodiment also provides a computer storage medium having a program stored therein, which when executed by a processor, implements the method in any of the above embodiments.
Embodiments of the present description may take the form of a computer program product embodied on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having program code embodied therein. Computer-usable storage media include both permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to: phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, may be used to store information that may be accessed by the computing device.
As shown in fig. 6, fig. 6 is a block diagram of an apparatus for video image quality enhancement according to an exemplary embodiment of the present specification, the apparatus comprising:
an acquisition module 61, configured to acquire a video frame set and decoding information of an encoded video, the video frame set comprising a target video frame and at least one non-target video frame, the video frames in the set being sequential in time;
an enhancement module 62, configured to perform the following processing for the video frame set:
inputting a target video frame into a trained image quality enhancement model, and acquiring the target video frame with enhanced image quality and image quality enhancement data of the target video frame by using the image quality enhancement model;
and predicting the image quality enhancement data of each non-target video frame based on the decoding information and the image quality enhancement data of the target video frame, and performing image quality enhancement processing on the non-target video frame by utilizing the image quality enhancement data of the non-target video frame.
The implementation process of the functions and roles of each module in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present description. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Other embodiments of the present description will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This specification is intended to cover any variations, uses, or adaptations of the specification following, in general, the principles of the specification and including such departures from the present disclosure as come within known or customary practice within the art to which the specification pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the specification being indicated by the following claims.
It is to be understood that the present description is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present description is limited only by the appended claims.
The foregoing description of the preferred embodiments is provided for the purpose of illustration only, and is not intended to limit the scope of the disclosure, since any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the disclosure are intended to be included within the scope of the disclosure.

Claims (8)

1. A method of video quality enhancement, comprising:
acquiring a video frame set and decoding information from decoded data of a video, wherein the video frame set comprises a target video frame and at least one non-target video frame, and a time sequence difference value between the target video frame and the non-target video frame is smaller than or equal to a preset value;
for the video frame set, performing the following processing:
inputting the target video frame into a trained image quality enhancement model, and acquiring the image-quality-enhanced target video frame and image quality enhancement data of the target video frame by using the image quality enhancement model;
predicting the image quality enhancement data of each non-target video frame based on the decoding information and the image quality enhancement data of the target video frame, and performing image quality enhancement processing on each non-target video frame by using the image quality enhancement data of that non-target video frame;
wherein the non-target video frame comprises at least one inter block and at least one intra block, the decoding information comprises relevant parameters used for predicting the intra block from surrounding image blocks of the intra block during decoding, and the relevant parameters comprise sampling values of the surrounding image blocks, filling directions of the sampling values, and weighting coefficients of the sampling values;
the image quality enhancement data of the intra block of the non-target video frame are obtained by the following steps:
obtaining sampling values of the image quality enhancement data of the surrounding image blocks of the intra block in the non-target video frame based on the sampling values in the relevant parameters and the image quality enhancement data of the inter blocks in the non-target video frame; and
taking the filling direction and the weighting coefficients of the sampling values in the relevant parameters as the filling direction and the weighting coefficients of the sampling values of the image quality enhancement data, performing a weighted calculation using the sampling values and the weighting coefficients of the image quality enhancement data, and filling according to the filling direction of the sampling values of the image quality enhancement data to obtain the image quality enhancement data of the intra block.
2. The method according to claim 1, wherein the acquiring the image-quality-enhanced target video frame and the image quality enhancement data of the target video frame by using the image quality enhancement model comprises:
obtaining the image-quality-enhanced target video frame output by the image quality enhancement model; and
obtaining the image quality enhancement data of the target video frame by comparing the target video frame before and after the image quality enhancement.
3. The method of claim 1, wherein the decoding information comprises inter information, the inter information comprising a motion vector of the inter block;
the image quality enhancement data of the inter block of the non-target video frame are obtained by the following steps:
searching, based on the motion vector, the target video frame for an image block matching the inter block of the non-target video frame, and taking the image quality enhancement data of the matching image block in the target video frame as the image quality enhancement data of the inter block in the non-target video frame.
4. The method according to any one of claims 1 to 3, wherein the decoding information further comprises parameters of deblocking filtering;
after the image quality enhancement data of the non-target video frame are obtained, the method further comprises:
eliminating blocking artifacts in the image quality enhancement data of the non-target video frame by using the parameters of the deblocking filtering.
5. The method of claim 1, wherein the video comprises live video received by a live client.
6. An apparatus for video image quality enhancement, comprising:
an acquisition module, configured to acquire a video frame set of an encoded video and decoding information, the video frame set comprising a target video frame and at least one non-target video frame, the video frames in the video frame set being sequential in time;
an enhancement module, configured to perform the following processing for the video frame set:
inputting the target video frame into a trained image quality enhancement model, and acquiring the image-quality-enhanced target video frame and image quality enhancement data of the target video frame by using the image quality enhancement model;
predicting the image quality enhancement data of each non-target video frame based on the decoding information and the image quality enhancement data of the target video frame, and performing image quality enhancement processing on each non-target video frame by using the image quality enhancement data of that non-target video frame;
wherein the non-target video frame comprises at least one inter block and at least one intra block, the decoding information comprises relevant parameters used for predicting the intra block from surrounding image blocks of the intra block during decoding, and the relevant parameters comprise sampling values of the surrounding image blocks, filling directions of the sampling values, and weighting coefficients of the sampling values;
the image quality enhancement data of the intra block of the non-target video frame are obtained by the following steps:
obtaining sampling values of the image quality enhancement data of the surrounding image blocks of the intra block in the non-target video frame based on the sampling values in the relevant parameters and the image quality enhancement data of the inter blocks in the non-target video frame; and
taking the filling direction and the weighting coefficients of the sampling values in the relevant parameters as the filling direction and the weighting coefficients of the sampling values of the image quality enhancement data, performing a weighted calculation using the sampling values and the weighting coefficients of the image quality enhancement data, and filling according to the filling direction of the sampling values of the image quality enhancement data to obtain the image quality enhancement data of the intra block.
7. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 5.
8. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 5.
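
The claims above describe the processing in prose only; the following Python sketch is a non-authoritative illustration of the per-group flow of claims 1 and 2, under the assumption that the image quality enhancement data can be represented as a per-pixel residual between the target frame before and after enhancement. All names here (process_group, enhance_model, predict_enhancement_data) are hypothetical and are not taken from the patent.

import numpy as np

def process_group(frames, decoding_info, enhance_model, predict_enhancement_data):
    # frames[0] is the target frame of the group; frames[1:] are the
    # non-target frames whose temporal distance to the target frame is
    # within the preset value mentioned in claim 1.
    target = frames[0].astype(np.float32)

    # Claim 1: only the target frame is passed through the trained
    # image quality enhancement model.
    enhanced_target = enhance_model(target)

    # Claim 2: the enhancement data is obtained by comparing the target
    # frame before and after enhancement (here: a per-pixel residual).
    enhancement_data = enhanced_target - target

    outputs = [enhanced_target]
    for frame, info in zip(frames[1:], decoding_info[1:]):
        # Claim 1: predict each non-target frame's enhancement data from
        # the decoding information plus the target frame's enhancement
        # data, then apply it to the decoded non-target frame.
        frame_data = predict_enhancement_data(frame, info, enhancement_data)
        outputs.append(frame.astype(np.float32) + frame_data)
    return outputs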
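For the inter blocks of a non-target frame, claim 3 reuses the motion vector from the decoding information to locate a matching block in the target frame. A minimal sketch of that lookup, assuming integer-pixel motion vectors and in-bounds block coordinates (both simplifications, not statements about the patented method):

def inter_block_enhancement(target_enhancement, block_y, block_x,
                            block_h, block_w, motion_vector):
    # Claim 3 (illustrative reading): the motion vector from the decoding
    # information points to the matching block in the target frame, and that
    # block's enhancement data is reused for the inter block of the
    # non-target frame. target_enhancement is assumed to be a 2-D NumPy
    # array of enhancement data for the target frame.
    dy, dx = motion_vector
    src_y, src_x = block_y + dy, block_x + dx
    return target_enhancement[src_y:src_y + block_h,
                              src_x:src_x + block_w].copy()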
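For the intra blocks, claim 1 reuses the intra-prediction parameters (sampling values, filling direction, weighting coefficients) from the decoding information but applies them to the enhancement data of the surrounding blocks rather than to pixel values. The sketch below shows only a vertical and a horizontal fill and treats the weighting as a simple element-wise product; the number of supported directions and the exact weighting scheme are assumptions, not the claimed implementation.

import numpy as np

def intra_block_enhancement(neighbor_samples, weights, fill_direction,
                            block_h, block_w):
    # neighbor_samples: enhancement-data samples taken from the surrounding
    # image blocks (e.g. already-predicted inter blocks); weights and
    # fill_direction are reused unchanged from the intra prediction
    # parameters carried in the decoding information.
    weighted = (np.asarray(neighbor_samples, dtype=np.float32)
                * np.asarray(weights, dtype=np.float32))
    if fill_direction == "vertical":
        # Fill each column downward from the weighted samples above the block.
        return np.tile(weighted[:block_w], (block_h, 1))
    if fill_direction == "horizontal":
        # Fill each row rightward from the weighted samples left of the block.
        return np.tile(weighted[:block_h, None], (1, block_w))
    raise ValueError("only vertical/horizontal fills are shown in this sketch")

Per claim 4, the deblocking filtering parameters from the decoding information could then be applied to the assembled enhancement data of the non-target frame to suppress blocking artifacts at block boundaries before that data is added to the frame.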
CN202110309945.1A 2021-03-23 2021-03-23 Method, device, equipment and storage medium for enhancing video image quality Active CN113115075B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110309945.1A CN113115075B (en) 2021-03-23 2021-03-23 Method, device, equipment and storage medium for enhancing video image quality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110309945.1A CN113115075B (en) 2021-03-23 2021-03-23 Method, device, equipment and storage medium for enhancing video image quality

Publications (2)

Publication Number Publication Date
CN113115075A CN113115075A (en) 2021-07-13
CN113115075B CN113115075B (en) 2023-05-26

Family

ID=76710572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110309945.1A Active CN113115075B (en) 2021-03-23 2021-03-23 Method, device, equipment and storage medium for enhancing video image quality

Country Status (1)

Country Link
CN (1) CN113115075B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113902651B (en) * 2021-12-09 2022-02-25 环球数科集团有限公司 Video image quality enhancement system based on deep learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102811354A (en) * 2011-05-30 2012-12-05 深圳市快播科技有限公司 Video image quality enhanced playing method and on-demand terminal
CN111031346A (en) * 2019-10-28 2020-04-17 网宿科技股份有限公司 Method and device for enhancing video image quality

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8135073B2 (en) * 2002-12-19 2012-03-13 Trident Microsystems (Far East) Ltd Enhancing video images depending on prior image enhancements
US7760805B2 (en) * 2005-05-31 2010-07-20 Hewlett-Packard Development Company, L.P. Method of enhancing images extracted from video
US10448014B2 (en) * 2017-05-23 2019-10-15 Intel Corporation Content adaptive motion compensated temporal filtering for denoising of noisy video for efficient coding
CN108235058B (en) * 2018-01-12 2021-09-17 广州方硅信息技术有限公司 Video quality processing method, storage medium and terminal
CN111010495B (en) * 2019-12-09 2023-03-14 腾讯科技(深圳)有限公司 Video denoising processing method and device
CN111681177B (en) * 2020-05-18 2022-02-25 腾讯科技(深圳)有限公司 Video processing method and device, computer readable storage medium and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102811354A (en) * 2011-05-30 2012-12-05 深圳市快播科技有限公司 Video image quality enhanced playing method and on-demand terminal
CN111031346A (en) * 2019-10-28 2020-04-17 网宿科技股份有限公司 Method and device for enhancing video image quality

Also Published As

Publication number Publication date
CN113115075A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
Agustsson et al. Scale-space flow for end-to-end optimized video compression
JP6356912B2 (en) Method and apparatus for performing graph-based prediction using an optimization function
JP5063648B2 (en) Method and assembly for video coding where video coding includes texture analysis and texture synthesis, corresponding computer program and corresponding computer-readable recording medium
US9602814B2 (en) Methods and apparatus for sampling-based super resolution video encoding and decoding
JP4565010B2 (en) Image decoding apparatus and image decoding method
CN108989802B (en) HEVC video stream quality estimation method and system by utilizing inter-frame relation
CN110036637B (en) Method and device for denoising and vocalizing reconstructed image
CN112102212B (en) Video restoration method, device, equipment and storage medium
CN113632484A (en) Method and apparatus for bit width control of bi-directional optical flow
CN111010495A (en) Video denoising processing method and device
EP2018070A1 (en) Method for processing images and the corresponding electronic device
CN112235569B (en) Quick video classification method, system and device based on H264 compressed domain
US20130022099A1 (en) Adaptive filtering based on pattern information
US20110110431A1 (en) Method of coding and decoding a stream of images; associated devices
Liu et al. End-to-end neural video coding using a compound spatiotemporal representation
CN108810549B (en) Low-power-consumption-oriented streaming media playing method
CN110392265B (en) Inter-frame motion estimation method and device, electronic equipment and readable storage medium
CN113115075B (en) Method, device, equipment and storage medium for enhancing video image quality
Klopp et al. Utilising low complexity CNNs to lift non-local redundancies in video coding
Xiao et al. The interpretable fast multi-scale deep decoder for the standard HEVC bitstreams
US11778224B1 (en) Video pre-processing using encoder-aware motion compensated residual reduction
WO2016179261A1 (en) Methods and apparatus for optical blur modeling for improved video encoding
CN116193140A (en) LCEVC-based encoding method, decoding method and decoding equipment
CN113347425B (en) Information processing method and device, equipment and storage medium
KR101247024B1 (en) Method of motion estimation and compensation using in-loop preprocessing filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant