CN114513684B - Method for constructing video image quality enhancement model, video image quality enhancement method and device - Google Patents
- Publication number: CN114513684B (Application CN202011277818.XA)
- Authority
- CN
- China
- Prior art keywords
- enhancement
- enhancement model
- model
- sharpness
- training data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234363—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440263—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/21—Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention provides a method for constructing a video image quality enhancement model, together with a video image quality enhancement method and device. A sharpness enhancement model, a color enhancement model, and a resolution enhancement model are built with machine learning and integrated into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement. The transcoding program can then apply sharpness and/or color and/or resolution enhancement according to a user's video image quality enhancement request, satisfying different enhancement requests and improving the generality of the video image quality enhancement method.
Description
Technical Field
The present invention relates to the field of image processing, and in particular to a method for constructing a video image quality enhancement model and a method and apparatus for enhancing video image quality.
Background
As times have developed, people's expectations for video image quality have risen steadily. Yet a large amount of low-quality video still exists, caused by outdated shooting equipment, poor shooting technique, and damage to image quality during production, transcoding, and transmission. Such video seriously degrades the viewing experience and, under otherwise equal conditions, increases bitrate overhead. Improving the image quality of low-quality video is therefore important.
Video image quality enhancement methods generally fall into conventional methods and deep learning methods. Conventional methods mostly rely on enhancement rules established by experts from video attributes (brightness, color temperature, and so on); their effect depends on experience and their accuracy is low. Existing deep learning methods are mostly studied for a single specialized scenario, training one end-to-end model, and so generalize poorly.
Disclosure of Invention
In view of the above, the invention provides a method for constructing a video image quality enhancement model and a method and device for enhancing video image quality that achieve sharpness, color, and resolution enhancement of video frames with strong generality and good enhancement quality.
In order to achieve the above purpose, the specific technical scheme provided by the invention is as follows:
A method for constructing a video image quality enhancement model comprises the following steps:
acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, wherein the self-coding network comprises an encoder and a decoder, each composed of a convolutional neural network;
training a two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model;
training the convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model;
converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement.
Optionally, the self-coding network further includes a noise estimation sub-network, obtained by training a convolutional neural network with noise estimation training data.
Optionally, training the self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model includes:
inputting the sharpness enhancement model training data into the noise estimation sub-network to obtain noise values of the training data;
inputting the sharpness enhancement model training data and the noise values sequentially into the encoder and the decoder to obtain output data of the self-coding network;
inputting the output data of the self-coding network and the ground-truth reference image data of the sharpness enhancement model training data into a first loss function to obtain an output value of the first loss function;
and obtaining the sharpness enhancement model when the output value of the first loss function converges.
Optionally, the first loss function is a weighted sum of the least absolute deviation loss (L1 loss), the least squares error loss (L2 loss), and a smoothing loss (smooth loss).
Optionally, training the two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model includes:
inputting the color enhancement model training data into the two-way generative adversarial network to obtain output data of the network;
inputting the output data of the network and the ground-truth reference image data of the color enhancement model training data into a second loss function to obtain an output value of the second loss function, wherein the second loss function is a cycle consistency loss function;
and obtaining the color enhancement model when the output value of the second loss function converges.
Optionally, the convolutional neural network corresponding to the resolution enhancement model uses residual networks as basic modules, with a preset cascade mechanism added between the residual networks.
Optionally, training the convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model includes:
inputting the resolution enhancement model training data into the convolutional neural network to obtain output data of the network;
inputting the output data of the network and the ground-truth reference image data of the resolution enhancement model training data into a third loss function to obtain an output value of the third loss function, wherein the third loss function is a feature-pyramid-based function;
and obtaining the resolution enhancement model when the output value of the third loss function converges.
A video image quality enhancement method, comprising:
In response to receiving a video image quality enhancement request, parsing the request to obtain a video frame to be enhanced and enhancement processing options, wherein the enhancement processing options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain a video frame after video image quality enhancement processing, wherein the video image quality enhancement models are pre-constructed by the method for constructing a video image quality enhancement model disclosed in the embodiments above; the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
Optionally, when the video image quality enhancement request includes more than one enhancement processing option, inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the options to obtain the enhanced frame includes:
inputting the video frame to be enhanced into the corresponding video image quality enhancement models in the order sharpness enhancement, color enhancement, resolution enhancement, to obtain a video frame after video image quality enhancement processing.
A device for constructing a video image quality enhancement model, comprising:
a training data acquisition unit for acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
a sharpness enhancement model building unit for training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, wherein the self-coding network comprises an encoder and a decoder, each composed of convolutional neural networks;
a color enhancement model building unit for training a two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model;
a resolution enhancement model building unit for training a convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model;
and a model integration unit for converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement.
Optionally, the self-coding network further includes a noise estimation sub-network, obtained by training a convolutional neural network with noise estimation training data.
Optionally, the sharpness enhancement model building unit is specifically configured to:
input the sharpness enhancement model training data into the noise estimation sub-network to obtain noise values of the training data;
input the training data and the noise values sequentially into the encoder and the decoder to obtain output data of the self-coding network;
input the output data of the self-coding network and the ground-truth reference image data of the training data into the first loss function to obtain its output value;
and obtain the sharpness enhancement model when the output value of the first loss function converges.
Optionally, the first loss function is a weighted sum of the least absolute deviation loss (L1 loss), the least squares error loss (L2 loss), and a smoothing loss (smooth loss).
Optionally, the color enhancement model building unit is specifically configured to:
input the color enhancement model training data into the two-way generative adversarial network to obtain output data of the network;
input the output data of the network and the ground-truth reference image data of the color enhancement model training data into the second loss function, a cycle consistency loss function, to obtain its output value;
and obtain the color enhancement model when the output value of the second loss function converges.
Optionally, the convolutional neural network corresponding to the resolution enhancement model uses residual networks as basic modules, with a preset cascade mechanism added between the residual networks.
Optionally, the resolution enhancement model building unit is specifically configured to:
input the resolution enhancement model training data into the convolutional neural network to obtain its output data;
input the output data of the network and the ground-truth reference image data of the resolution enhancement model training data into the third loss function, a feature-pyramid-based function, to obtain its output value;
and obtain the resolution enhancement model when the output value of the third loss function converges.
A video image quality enhancement apparatus comprising:
an enhancement request parsing unit for, upon receiving a video image quality enhancement request, parsing the request to obtain the video frame to be enhanced and the enhancement processing options, wherein the enhancement processing options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
an enhancement processing unit for inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain a video frame after video image quality enhancement processing, wherein the video image quality enhancement models are pre-constructed by the method for constructing a video image quality enhancement model disclosed in the embodiments above; the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
Optionally, when the video image quality enhancement request includes more than one enhancement processing option, the enhancement processing unit is specifically configured to input the video frame to be enhanced into the corresponding models in the order sharpness enhancement, color enhancement, resolution enhancement, to obtain a video frame after video image quality enhancement processing.
Compared with the prior art, the invention has the following beneficial effects:
The disclosed method for constructing a video image quality enhancement model builds a sharpness enhancement model, a color enhancement model, and a resolution enhancement model by machine learning and integrates them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement. The transcoding program can then apply sharpness and/or color and/or resolution enhancement according to a user's video image quality enhancement request, satisfying different enhancement requests and improving the generality of the video image quality enhancement method.
Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for the embodiments are briefly introduced below. The drawings described below show only embodiments of the present invention; a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for constructing a video image quality enhancement model according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a self-coding network according to an embodiment of the present invention;
- FIG. 3 is a schematic diagram of a single-path generative adversarial network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a convolutional neural network model according to an embodiment of the present invention;
Fig. 5 is a flow chart of a video image quality enhancement method according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a device for constructing a video image quality enhancement model according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a video image quality enhancement device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without inventive effort shall fall within the protection scope of the present invention.
Referring to fig. 1, the embodiment of the invention discloses a method for constructing a video image quality enhancement model, which specifically comprises the following steps:
S101: acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
Sharpness enhancement mainly covers denoising and deblurring, where the noise includes Gaussian noise, compression noise, and the like, and the blur refers chiefly to the most common case, motion blur. The sharpness enhancement model training data is therefore drawn on the one hand from actual low-sharpness videos on the video platform, and on the other hand generated by data-expansion rules designed to match the distribution of real low-quality video, simulating different degradations such as adding random noise and filtering with different operators.
The resolution enhancement model training data likewise comes partly from real low-resolution videos of the video platform and partly from script simulation: the image size is reduced while random noise or filtering with different operators is added, producing data close to real low-resolution video.
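The patent does not give a concrete degradation script; the following is a minimal sketch of the kind of simulation described above, assuming OpenCV and NumPy, with illustrative (not patent-specified) noise levels, kernel sizes, and scale factor:

```python
import cv2
import numpy as np

def degrade(hq_frame: np.ndarray, scale: int = 2) -> np.ndarray:
    """Synthesize a low-quality frame from a high-quality reference."""
    h, w = hq_frame.shape[:2]
    # Downscale to mimic a low-resolution source.
    lq = cv2.resize(hq_frame, (w // scale, h // scale),
                    interpolation=cv2.INTER_AREA)
    # Add random Gaussian noise to mimic sensor/compression noise.
    sigma = np.random.uniform(1.0, 10.0)
    lq = lq.astype(np.float32) + np.random.normal(0.0, sigma, lq.shape)
    # Filter with a randomly sized blur operator to mimic motion/defocus blur.
    k = int(np.random.choice([3, 5, 7]))
    lq = cv2.GaussianBlur(lq, (k, k), 0)
    return np.clip(lq, 0, 255).astype(np.uint8)
```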
In this embodiment the color enhancement model is trained with a two-way generative adversarial network, end to end and unsupervised, so its training data is simply the video platform's film library: only a set of images with the desired color effect needs to be collected.
S102: training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, where the self-coding network includes an encoder and a decoder, each composed of convolutional neural networks;
The self-coding network used to train the sharpness enhancement model includes an encoder and a decoder. Referring to fig. 2, the left half is the encoder and the right half is the decoder, each with 11 convolutional layers and 2 pooling layers; the legend in fig. 2 distinguishes feature maps, convolutional layers, pooling layers, and upsampling layers. In the encoder, the input data is converted step by step through the convolutional and pooling layers into a feature map of spatial size 1×1 with 256 channels, which the decoder then converts back to the original size and channel count of the input data.
Skip connections are used extensively between feature maps in the network structure to combine information from different convolutional layers, which aids gradient propagation and accelerates convergence. Both the encoder and the decoder adopt a residual network (ResNet) as their basic module.
To handle videos with different kinds of low sharpness, a noise estimation sub-network is designed: before entering the encoder, training data is first fed into the noise estimation network, and the training data together with the noise values it outputs is then fed sequentially into the encoder and the decoder, yielding a robust output.
The noise estimation sub-network is an ordinary fully convolutional network; during training, its output is fitted to the ground-truth noise magnitude of the simulated noise map. Embedding this sub-network makes the whole model insensitive to the noise level of the input image.
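As a rough illustration of this arrangement, the sketch below is an assumption-laden stand-in (not the patent's exact 11-convolution/2-pooling configuration, and with skip connections omitted): a noise estimation sub-network predicts a per-pixel noise map that is concatenated with the input before the encoder.

```python
import torch
import torch.nn as nn

class NoiseEstimator(nn.Module):
    """Plain fully convolutional sub-network predicting a per-pixel noise map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, x):
        return self.net(x)

class SharpnessEnhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.noise_net = NoiseEstimator()
        # Encoder/decoder stand-ins; the patent's version uses residual
        # blocks plus skip connections between encoder and decoder maps.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(64, 3, 3, padding=1))

    def forward(self, x):
        noise = self.noise_net(x)                    # estimate noise first
        z = self.encoder(torch.cat([x, noise], 1))   # condition on noise map
        return self.decoder(z)
```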
To guarantee the training quality of the sharpness enhancement model, training is evaluated with the first loss function, a weighted sum of the least absolute deviation loss (L1 loss), the least squares error loss (L2 loss), and a smoothing loss (smooth loss); this both ensures fast network convergence and stabilizes model training.
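A minimal sketch of such a composite loss follows; the weights are assumed, and the smoothing term is interpreted here as a total-variation penalty on the output, one common reading of a smoothing loss:

```python
import torch.nn.functional as F

def first_loss(pred, target, w1=1.0, w2=1.0, ws=0.1):
    """Weighted sum of L1, L2, and a total-variation smoothing term."""
    l1 = F.l1_loss(pred, target)            # least absolute deviation
    l2 = F.mse_loss(pred, target)           # least squares error
    tv = (pred[..., :, 1:] - pred[..., :, :-1]).abs().mean() \
       + (pred[..., 1:, :] - pred[..., :-1, :]).abs().mean()
    return w1 * l1 + w2 * l2 + ws * tv
```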
The output of the training data after passing through the whole self-coding network is fed, together with the ground-truth reference image data, into the first loss function, and the result is back-propagated to update the network parameters. The optimizer is the Adam implementation built into the PyTorch framework, with beta1 = 0.9 and beta2 = 0.999; the batch size is set to 16 and the initial learning rate to 0.001. Supervised training runs for 100 epochs under a step-decay schedule in which the learning rate drops to 10% of its previous value every 20 epochs.
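Expressed as code, the training setup described above might look as follows; `model` and `train_loader` (a DataLoader built with batch size 16) are assumed to be defined, and `first_loss` is the sketch above:

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
# Step decay: multiply the learning rate by 0.1 every 20 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

for epoch in range(100):
    for lq, ref in train_loader:            # pairs of degraded/reference frames
        optimizer.zero_grad()
        loss = first_loss(model(lq), ref)   # compare against reference images
        loss.backward()                     # back-propagate the loss
        optimizer.step()
    scheduler.step()                        # decay once per epoch
```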
S103: training the two-way generative adversarial network with the color enhancement model training data to obtain the color enhancement model;
The two-way generative adversarial network (GAN) structure fully exploits the strengths of GANs in image generation and is trained end to end, unsupervised, on the video platform's film library. The structure of a single-path GAN is shown in fig. 3; the two-way GAN consists of two single-path GANs in parallel with coupling mechanisms added between them. The generator produces a color-enhanced image, and the discriminator distinguishes real target images from the enhanced images the generator produces. Color grading is treated here as an image translation problem, i.e. translating an image of one style into an image of another style. Following the style transfer algorithm CycleGAN, the second loss function adopts a cycle consistency loss, which greatly reduces instability during GAN training. Unlike other tasks, paired training data for color enhancement is hard to obtain; with unsupervised GAN training, only a set of images with the desired color effect needs to be collected, greatly lowering the difficulty of data collection. During training, the batch size is set to 4, the learning rate of both the generator and the discriminator is set to 0.00001, and the same step-decay schedule is applied to the learning rate for unsupervised training.
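A compact sketch of the cycle consistency term follows; `G` and `F_back` stand for the two generators (plain-to-enhanced and back), the adversarial terms are omitted, and the weight `lam` is an assumed value:

```python
import torch.nn.functional as F

def cycle_consistency_loss(x, y, G, F_back, lam=10.0):
    # Forward cycle: x -> G(x) -> F_back(G(x)) should reconstruct x;
    # backward cycle: y -> F_back(y) -> G(F_back(y)) should reconstruct y.
    return lam * (F.l1_loss(F_back(G(x)), x) + F.l1_loss(G(F_back(y)), y))
```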
S104: training the convolutional neural network with the resolution enhancement model training data to obtain the resolution enhancement model;
Referring to fig. 4, the convolutional neural network that trains the resolution enhancement model uses ResNet blocks as basic modules. To reduce the total parameter count, a cascade mechanism is added between the ResNet modules: the output of each intermediate layer is cascaded forward to higher layers and finally converges at the last convolutional layer. The first and last layers of the network are mean-shift layers with 1×1 convolution kernels that perform de-meaning and its inverse operation, respectively, and need no parameter updates during training. The other convolutional layers use 3×3 kernels with ReLU as the activation function, and the upsampling layer uses PixelShuffle to enlarge the output feature map.
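The sketch below illustrates the cascading idea under assumed channel counts: each residual block's output is concatenated with all earlier feature maps and fused by a 1×1 convolution, and PixelShuffle performs the upsampling (the mean-shift layers are omitted for brevity):

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)              # residual connection

class CascadeSR(nn.Module):
    def __init__(self, ch=64, n_blocks=3, scale=2):
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)
        self.blocks = nn.ModuleList([ResBlock(ch) for _ in range(n_blocks)])
        # 1x1 fusions merging all cascaded feature maps at each stage.
        self.fuse = nn.ModuleList(
            [nn.Conv2d(ch * (i + 2), ch, 1) for i in range(n_blocks)])
        self.up = nn.Sequential(
            nn.Conv2d(ch, ch * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),          # rearrange channels into space
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, x):
        feats = [self.head(x)]
        h = feats[0]
        for block, fuse in zip(self.blocks, self.fuse):
            feats.append(block(h))
            h = fuse(torch.cat(feats, dim=1))  # cascade all earlier outputs
        return self.up(h)
```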
The network's training is evaluated with a third loss function designed around the feature pyramid idea: the final expression is a weighted sum of terms over several intermediate layers and the final output layer. Shallow layers of a network carry more basic information such as textures and lines, while deep layers carry more semantic information; designing the loss with the feature pyramid idea lets fine details be rendered while super-resolving the image, so the mapping from low-resolution to high-resolution images is fully learned both globally and in detail.
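One way such a loss could be realized is sketched below, assuming each selected intermediate map has been projected to image space (e.g. by an auxiliary 1×1 convolution) so it can be compared against the target downsampled to the same resolution; the weights are illustrative:

```python
import torch.nn.functional as F

def pyramid_loss(intermediate_preds, final_pred, target, weights=(0.25, 0.5)):
    """Final-output term plus weighted terms on intermediate predictions."""
    loss = F.l1_loss(final_pred, target)
    for w, p in zip(weights, intermediate_preds):
        # Compare each intermediate prediction against the target
        # resized to the same spatial resolution.
        ref = F.interpolate(target, size=p.shape[-2:],
                            mode="bilinear", align_corners=False)
        loss = loss + w * F.l1_loss(p, ref)
    return loss
```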
The optimizer is again Adam; the batch size is set to 64 and the initial learning rate to 0.0001, again with the step-decay schedule, for supervised training.
S105: converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into the video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement.
The model training algorithms in this embodiment are developed with the PyTorch framework; once the network structures are designed, the models are trained on NVIDIA Tesla P40 GPUs. Training parameters are adjusted continuously according to the training output until each algorithm converges below the desired error. The trained models are converted to the pb format of the TensorFlow framework so they can be integrated into the FFmpeg transcoding flow. The final usage flow is roughly: source video, decode into video frames, segment into scenes, apply the selected model combination for enhancement, then merge the frames and output the video.
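At a high level, that flow could be organized as in the sketch below; `decode_frames`, `split_scenes`, and `encode_video` are hypothetical helper names standing in for the FFmpeg decode/encode steps, not a real FFmpeg API:

```python
ORDER = ("sharpness", "color", "resolution")

def enhance_video(src_path, dst_path, options, models):
    frames = decode_frames(src_path)         # source video -> frames
    out_frames = []
    for scene in split_scenes(frames):       # per-scene model selection
        for name in ORDER:                   # fixed enhancement order
            if name in options:
                scene = [models[name](f) for f in scene]
        out_frames.extend(scene)
    encode_video(out_frames, dst_path)       # merge frames -> output video
```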
This embodiment also discloses a video image quality enhancement method that performs enhancement with the video image quality enhancement models constructed in the above embodiment. Referring to fig. 5, the method includes the following steps:
S201: in response to receiving a video image quality enhancement request, parsing the request to obtain the video frame to be enhanced and the enhancement processing options, where the options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
That is, a user sends a video image quality enhancement request according to the enhancement needed for particular video frames; the enhancement processing options in the request may be any one, any two, or all three of the sharpness, color, and resolution enhancement options.
S202: inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain the video frame after enhancement processing, where the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
When the request includes more than one enhancement processing option, the video frame to be enhanced is passed through the corresponding models in the order sharpness enhancement, color enhancement, resolution enhancement to obtain the enhanced frame.
Taking a request that includes all three options as an example: the video frame to be enhanced is input into the sharpness enhancement model, the sharpness model's output is input into the color enhancement model, the color model's output is input into the resolution enhancement model, and the resolution model's output is the video frame after video image quality enhancement processing.
The video image quality enhancement method disclosed in this embodiment thus builds sharpness, color, and resolution enhancement models by machine learning, integrates them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement, and applies sharpness and/or color and/or resolution enhancement within the transcoding program according to a user's request, satisfying different enhancement requests and improving the generality of the method.
Based on the method for constructing a video image quality enhancement model disclosed in the foregoing embodiment, this embodiment correspondingly discloses a device for constructing a video image quality enhancement model. Referring to fig. 6, the device includes:
a training data acquisition unit 401 for acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
a sharpness enhancement model building unit 402 for training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, wherein the self-coding network comprises an encoder and a decoder, each composed of convolutional neural networks;
a color enhancement model building unit 403 for training a two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model;
a resolution enhancement model building unit 404 for training a convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model;
and a model integration unit 405 for converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement.
Optionally, the self-coding network further includes a noise estimation sub-network, obtained by training a convolutional neural network with noise estimation training data.
Optionally, the sharpness enhancement model building unit 402 is specifically configured to:
input the sharpness enhancement model training data into the noise estimation sub-network to obtain noise values of the training data;
input the training data and the noise values sequentially into the encoder and the decoder to obtain output data of the self-coding network;
input the output data of the self-coding network and the ground-truth reference image data of the training data into the first loss function to obtain its output value;
and obtain the sharpness enhancement model when the output value of the first loss function converges.
Optionally, the first loss function is a weighted sum of the least absolute deviation loss (L1 loss), the least squares error loss (L2 loss), and a smoothing loss (smooth loss).
Optionally, the color enhancement model building unit 403 is specifically configured to:
input the color enhancement model training data into the two-way generative adversarial network to obtain output data of the network;
input the output data of the network and the ground-truth reference image data of the color enhancement model training data into the second loss function, a cycle consistency loss function, to obtain its output value;
and obtain the color enhancement model when the output value of the second loss function converges.
Optionally, the convolutional neural network corresponding to the resolution enhancement model uses residual networks as basic modules, with a preset cascade mechanism added between the residual networks.
Optionally, the resolution enhancement model building unit 404 is specifically configured to:
input the resolution enhancement model training data into the convolutional neural network to obtain its output data;
input the output data of the network and the ground-truth reference image data of the resolution enhancement model training data into the third loss function, a feature-pyramid-based function, to obtain its output value;
and obtain the resolution enhancement model when the output value of the third loss function converges.
Based on the video image quality enhancement method disclosed in the foregoing embodiment, this embodiment correspondingly discloses a video image quality enhancement device. Referring to fig. 7, the device includes:
an enhancement request parsing unit 501 for, upon receiving a video image quality enhancement request, parsing the request to obtain the video frame to be enhanced and the enhancement processing options, where the options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
an enhancement processing unit 502 for inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain the enhanced video frame, where the models are pre-constructed by the method for constructing a video image quality enhancement model disclosed in the foregoing embodiment; the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
Optionally, when the video image quality enhancement request includes more than one enhancement processing option, the enhancement processing unit is specifically configured to input the video frame to be enhanced into the corresponding models in the order sharpness enhancement, color enhancement, resolution enhancement, to obtain the enhanced video frame.
The device for constructing a video image quality enhancement model and the video image quality enhancement device disclosed in this embodiment build a sharpness enhancement model, a color enhancement model, and a resolution enhancement model by machine learning and integrate them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement, so that the transcoding program can apply sharpness and/or color and/or resolution enhancement according to a user's video image quality enhancement request, satisfying different enhancement requests and improving the generality of the video image quality enhancement method.
The above embodiments may be combined in any manner, and features described in different embodiments in this specification may be replaced or combined with one another. The above description of the disclosed embodiments enables those skilled in the art to make or use the present application.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
1. A method for constructing a video image quality enhancement model, characterized by comprising the following steps:
acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, wherein the self-coding network comprises an encoder and a decoder, each composed of a convolutional neural network;
training a two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model;
training the convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model;
converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement;
wherein the self-coding network further comprises a noise estimation sub-network obtained by training the convolutional neural network with noise estimation training data;
and wherein training the self-coding network with the sharpness enhancement model training data to obtain the sharpness enhancement model comprises:
inputting the sharpness enhancement model training data into the noise estimation sub-network to obtain noise values of the training data;
inputting the sharpness enhancement model training data and the noise values sequentially into the encoder and the decoder to obtain output data of the self-coding network;
inputting the output data of the self-coding network and the ground-truth reference image data of the sharpness enhancement model training data into a first loss function to obtain an output value of the first loss function;
and obtaining the sharpness enhancement model when the output value of the first loss function converges.
2. The method of claim 1, wherein the first loss function is a weighted sum of the least absolute deviation loss (L1 loss), the least squares error loss (L2 loss), and a smoothing loss (smooth loss).
3. The method of claim 1, wherein training the two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model comprises:
inputting the color enhancement model training data into the two-way generative adversarial network to obtain output data of the network;
inputting the output data of the network and the ground-truth reference image data of the color enhancement model training data into a second loss function to obtain an output value of the second loss function, wherein the second loss function is a cycle consistency loss function;
and obtaining the color enhancement model when the output value of the second loss function converges.
4. The method of claim 1, wherein the convolutional neural network corresponding to the resolution enhancement model uses residual networks as basic modules, with a preset cascade mechanism added between the residual networks.
5. The method of claim 4, wherein training the convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model comprises:
inputting the resolution enhancement model training data into the convolutional neural network to obtain output data of the network;
inputting the output data of the network and the ground-truth reference image data of the resolution enhancement model training data into a third loss function to obtain an output value of the third loss function, wherein the third loss function is a feature-pyramid-based function;
and obtaining the resolution enhancement model when the output value of the third loss function converges.
6. A video image quality enhancement method, comprising:
in response to receiving a video image quality enhancement request, parsing the request to obtain a video frame to be enhanced and enhancement processing options, wherein the enhancement processing options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain a video frame after video image quality enhancement processing, wherein the video image quality enhancement models are pre-constructed by the method for constructing a video image quality enhancement model according to any one of claims 1-5, the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
7. The method according to claim 6, wherein when the video image quality enhancement request includes more than one enhancement processing option, inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain a video frame after video image quality enhancement processing comprises:
inputting the video frame to be enhanced into the video image quality enhancement models corresponding to the enhancement processing options in the order sharpness enhancement, color enhancement, resolution enhancement, to obtain a video frame after video image quality enhancement processing.
8. A device for constructing a video image quality enhancement model, characterized by comprising:
a training data acquisition unit for acquiring sharpness enhancement model training data, resolution enhancement model training data, and color enhancement model training data;
a sharpness enhancement model building unit for training a self-coding network with the sharpness enhancement model training data to obtain a sharpness enhancement model, wherein the self-coding network comprises an encoder and a decoder, each composed of convolutional neural networks;
a color enhancement model building unit for training a two-way generative adversarial network with the color enhancement model training data to obtain a color enhancement model;
a resolution enhancement model building unit for training a convolutional neural network with the resolution enhancement model training data to obtain a resolution enhancement model;
and a model integration unit for converting the sharpness enhancement model, the color enhancement model, and the resolution enhancement model into a preset format and integrating them into a video transcoding program in the order sharpness enhancement, color enhancement, resolution enhancement;
wherein the self-coding network further comprises a noise estimation sub-network obtained by training the convolutional neural network with noise estimation training data;
and wherein training the self-coding network with the sharpness enhancement model training data to obtain the sharpness enhancement model comprises:
inputting the sharpness enhancement model training data into the noise estimation sub-network to obtain noise values of the training data;
inputting the sharpness enhancement model training data and the noise values sequentially into the encoder and the decoder to obtain output data of the self-coding network;
inputting the output data of the self-coding network and the ground-truth reference image data of the sharpness enhancement model training data into a first loss function to obtain an output value of the first loss function;
and obtaining the sharpness enhancement model when the output value of the first loss function converges.
9. A video image quality enhancement apparatus, comprising:
an enhancement request parsing unit for, upon receiving a video image quality enhancement request, parsing the request to obtain a video frame to be enhanced and enhancement processing options, wherein the enhancement processing options are at least one of a sharpness enhancement option, a color enhancement option, and a resolution enhancement option;
and an enhancement processing unit for inputting the video frame to be enhanced into the video image quality enhancement model corresponding to the enhancement processing options to obtain a video frame after video image quality enhancement processing, wherein the video image quality enhancement models are pre-constructed by the method for constructing a video image quality enhancement model according to any one of claims 1-5, the sharpness enhancement option corresponds to the sharpness enhancement model, the color enhancement option to the color enhancement model, and the resolution enhancement option to the resolution enhancement model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011277818.XA CN114513684B (en) | 2020-11-16 | 2020-11-16 | Method for constructing video image quality enhancement model, video image quality enhancement method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011277818.XA CN114513684B (en) | 2020-11-16 | 2020-11-16 | Method for constructing video image quality enhancement model, video image quality enhancement method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114513684A CN114513684A (en) | 2022-05-17 |
CN114513684B (en) | 2024-05-28
Family
ID=81547015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011277818.XA Active CN114513684B (en) | 2020-11-16 | 2020-11-16 | Method for constructing video image quality enhancement model, video image quality enhancement method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114513684B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108235058A (en) * | 2018-01-12 | 2018-06-29 | 广州华多网络科技有限公司 | Video quality processing method, storage medium and terminal |
CN110020684A (en) * | 2019-04-08 | 2019-07-16 | 西南石油大学 | A kind of image de-noising method based on residual error convolution autoencoder network |
CN110189268A (en) * | 2019-05-23 | 2019-08-30 | 西安电子科技大学 | Underwater picture color correcting method based on GAN network |
CN110263801A (en) * | 2019-03-08 | 2019-09-20 | 腾讯科技(深圳)有限公司 | Image processing model generation method and device, electronic equipment |
CN111882489A (en) * | 2020-05-15 | 2020-11-03 | 东北石油大学 | Super-resolution graph recovery method for simultaneously enhancing underwater images |
Also Published As
Publication number | Publication date |
---|---|
CN114513684A (en) | 2022-05-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |