CN115564655A - Video super-resolution reconstruction method, system and medium based on deep learning


Info

Publication number
CN115564655A
Authority
CN
China
Prior art keywords
video
module
super
resolution
ith
Prior art date
Legal status
Pending
Application number
CN202211392882.1A
Other languages
Chinese (zh)
Inventor
季栋浩 (Ji Donghao)
潘金山 (Pan Jinshan)
Current Assignee
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202211392882.1A
Publication of CN115564655A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4046 Scaling of whole images or parts thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence


Abstract

The invention relates to a video super-resolution reconstruction method, system and medium based on deep learning, and belongs to the technical field of video processing. The method comprises the following steps: inputting each frame image of a video to be processed into a super-resolution model to obtain a super-resolution image corresponding to each frame image; and obtaining a super-resolution video corresponding to the video to be processed from the super-resolution images corresponding to its frame images. The super-resolution model is obtained by training a BasicVSR model with the video to be trained as input, the corresponding super-resolution video as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module. The invention can improve the quality of high-resolution video images.

Description

Video super-resolution reconstruction method, system and medium based on deep learning
Technical Field
The invention relates to the technical field of video processing, in particular to a method, a system and a medium for reconstructing video super-resolution based on deep learning.
Background
Resolution is a set of performance parameters used to evaluate how much detail information an image contains, including temporal resolution, spatial resolution and gray-level resolution, and it reflects the ability of the imaging system to capture the detail of the actual object. Compared with low-resolution images, high-resolution images typically have higher pixel density, richer texture detail and higher fidelity. In practice, however, an ideal high-resolution image with sharp edges and no blocking or blurring usually cannot be obtained directly, owing to constraints such as the acquisition equipment and environment, the network transmission medium and bandwidth, and the degradation process of the video itself. The most direct way to improve image resolution is to improve the optical hardware in the acquisition system, but because the manufacturing process is difficult to improve substantially and the manufacturing cost is very high, solving the problem of low image resolution physically is often too expensive.
Video super-resolution reconstruction refers to restoring a given low-resolution video to the corresponding high-resolution video through a specific algorithm. Compared with single-image super-resolution, video super-resolution can exploit information from adjacent frames to achieve a better result. Traditional super-resolution algorithms, such as interpolation, blur the edges of the reconstructed high-resolution frames, so their effect is poor.
Disclosure of Invention
The invention aims to provide a video super-resolution reconstruction method, a system and a medium based on deep learning, which can improve the quality of a high-resolution video image.
In order to achieve the purpose, the invention provides the following scheme:
a video super-resolution reconstruction method based on deep learning comprises the following steps:
constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
acquiring a video to be processed;
inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
Optionally, the BasicVSR model includes a forward branch, a backward branch, and an up-sampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
Optionally, the forward branch includes N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1) th backward propagation module; a second input end of the ith backward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i-1) th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
Optionally, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are sequentially connected.
Optionally, the frequency loss function is specifically:

$$\mathcal{L}_{freq} = \left( \left\| \mathcal{F}(\hat{I}) - \mathcal{F}(I) \right\|^2 + \varepsilon^2 \right)^{\alpha}$$

where $\mathcal{L}_{freq}$ represents the frequency loss function, $\hat{I}$ represents the image generated by inputting the video to be trained into the BasicVSR model, $I$ represents the super-resolution image corresponding to the video to be trained, $\varepsilon$ represents a first constant, $\alpha$ represents a second constant, $\mathcal{F}(\hat{I})$ represents the fast Fourier transform of $\hat{I}$, and $\mathcal{F}(I)$ represents the fast Fourier transform of $I$.
A video super-resolution reconstruction system based on deep learning comprises:
the construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
the acquisition module is used for acquiring a video to be processed;
the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
Optionally, the BasicVSR model includes a forward branch, a backward branch, and an up-sampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
Optionally, the forward branch includes N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1) th backward propagation module; a second input end of the ith backward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i-1) th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
Optionally, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are sequentially connected.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the video super-resolution reconstruction method based on deep learning described above.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects: the invention uses GDFN module to achieve better feature fusion effect, and can improve the quality of high-resolution video image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a video super-resolution reconstruction method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a detailed architecture diagram of the BasicVSR model;
FIG. 3 is a detailed block diagram of a forward propagation module;
FIG. 4 is a detailed block diagram of the back propagation module;
FIG. 5 is a detailed block diagram of a GDFN module;
fig. 6 is a detailed block diagram of the video super-resolution system.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
With the rise of deep learning, video super-resolution technology based on deep learning has developed rapidly. The invention provides a video super-resolution reconstruction method based on deep learning. The super-resolution model of the invention uses a recurrent network architecture to propagate information between video frames, uses a GDFN module to improve the effect of feature fusion, and adds a frequency loss function to optimize the network, so that the model combines good performance with a low parameter count and high computational efficiency.
The embodiment of the invention provides a video super-resolution reconstruction method based on deep learning, which, as shown in fig. 1, comprises the following steps:
step 101: constructing a hyper-resolution model; the super-resolution model is obtained by training a BasicVSR model by taking an image corresponding to each frame of a video to be trained as input, taking a super-resolution image corresponding to each frame of the video to be trained as output and taking the minimum frequency loss function as a target; the forward and backward branches of the basicsvsr model each include a GDFN module.
Step 102: and acquiring a video to be processed.
Step 103: and inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed.
Step 104: obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
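Steps 101 to 104 can be summarized in a short inference sketch. The following is a minimal, illustrative Python/PyTorch example, assuming a trained super-resolution model is available as a module `model` mapping a low-resolution frame sequence to super-resolution frames; the function name, tensor layout and fixed frame rate are illustrative assumptions, not part of the patent:

```python
import cv2
import torch

def super_resolve_video(model, in_path, out_path, fps=25.0):
    """Steps 102-104: read the video to be processed, super-resolve every frame,
    and assemble the per-frame results into the super-resolution video."""
    model.eval()
    cap = cv2.VideoCapture(in_path)                 # step 102: acquire the video to be processed
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0)
    cap.release()

    lr = torch.stack(frames).unsqueeze(0)           # (1, T, C, H, W) low-resolution sequence
    with torch.no_grad():
        sr = model(lr)                              # step 103: super-resolution image per frame

    _, t, _, h, w = sr.shape
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for i in range(t):                              # step 104: assemble the super-resolution video
        img = (sr[0, i].clamp(0, 1) * 255).byte().permute(1, 2, 0).cpu().numpy()
        writer.write(img)
    writer.release()
```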
In practical application, the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
In practical applications, as shown in fig. 2, the forward branch includes N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1.
The first input end of the ith forward propagation module is connected with the first output end of the (i - 1)th forward propagation module and is used for inputting the forward-propagation feature $h_{i-1}^{f}$ of the (i - 1)th frame image output by the (i - 1)th forward propagation module. The second input end of the ith forward propagation module is used for inputting the ith frame image $x_i$ and the (i - 1)th frame image $x_{i-1}$ of the video to be processed. The first output end of the ith forward propagation module is connected with the first input end of the (i + 1)th forward propagation module and outputs the forward-propagation feature $h_i^{f}$ of the ith frame image. The second output end of the ith forward propagation module is connected with the first input end of the ith up-sampling module and also outputs $h_i^{f}$.

The first input end of the ith backward propagation module is connected with the first output end of the (i + 1)th backward propagation module and is used for inputting the backward-propagation feature $h_{i+1}^{b}$ of the (i + 1)th frame image output by the (i + 1)th backward propagation module. The second input end of the ith backward propagation module is used for inputting the ith frame image $x_i$ and the (i + 1)th frame image $x_{i+1}$ of the video to be processed. The first output end of the ith backward propagation module is connected with the first input end of the (i - 1)th backward propagation module and outputs the backward-propagation feature $h_i^{b}$ of the ith frame image. The second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module and also outputs $h_i^{b}$. The output end of the ith up-sampling module outputs the super-resolution image $hr_i$ corresponding to the ith frame image.
In practical applications, as shown in fig. 3 and 4, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
Taking the ith forward propagation module as an example, the forward-propagation workflow is as follows: first, the optical flow estimation module computes the forward optical flow $s_i^{f}$ between $x_{i-1}$ and $x_i$; $s_i^{f}$ is then used to spatially warp $h_{i-1}^{f}$, giving the forward-propagation feature $\bar{h}_{i-1}^{f}$ of the (i - 1)th frame image aligned with the ith frame image; next, the GDFN module fuses $\bar{h}_{i-1}^{f}$ with $x_i$ to obtain the fused feature $\tilde{h}_i^{f}$; finally, $\tilde{h}_i^{f}$ is fed into the depth residual block to obtain $h_i^{f}$.
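A minimal sketch of one forward propagation module follows (the backward module is identical except that it takes $x_{i+1}$ and $h_{i+1}^{b}$). The flow network, the number of residual blocks and the channel width are assumptions; `flow_warp` implements the spatial warping with a sampling grid, and `GDFN` is sketched in the section on the GDFN module below:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flow_warp(feat, flow):
    """Spatially warp `feat` (N, C, H, W) with optical flow `flow` (N, 2, H, W)."""
    n, _, h, w = feat.shape
    gy, gx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((gx, gy)).float().to(feat.device)             # base pixel coordinates
    coords = grid.unsqueeze(0) + flow                                 # flow-displaced coordinates
    gx_n = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0                   # normalize to [-1, 1]
    gy_n = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(feat, torch.stack((gx_n, gy_n), dim=3), align_corners=True)

class PropagationModule(nn.Module):
    """Optical flow estimation -> spatial warping -> GDFN fusion -> depth residual blocks."""
    def __init__(self, flow_net, channels=64, num_blocks=10):
        super().__init__()
        self.flow_net = flow_net                         # optical flow estimation module
        self.embed = nn.Conv2d(channels + 3, channels, 3, padding=1)
        self.gdfn = GDFN(channels)                       # feature-fusion module (sketched below)
        self.resblocks = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                          nn.Conv2d(channels, channels, 3, padding=1))
            for _ in range(num_blocks)])

    def forward(self, x_i, x_neighbor, h_prev):
        flow = self.flow_net(x_neighbor, x_i)            # s_i between the adjacent frames
        h_aligned = flow_warp(h_prev, flow)              # align the propagated feature to frame i
        h = self.gdfn(self.embed(torch.cat([h_aligned, x_i], dim=1)))  # fuse with x_i
        return h + self.resblocks(h)                     # depth residual block stack
```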
Taking the ith backward propagation module as an example, the backward-propagation workflow is as follows: first, the optical flow estimation module computes the backward optical flow $s_i^{b}$ between $x_{i+1}$ and $x_i$; $s_i^{b}$ is then used to spatially warp $h_{i+1}^{b}$, giving the backward-propagation feature $\bar{h}_{i+1}^{b}$ of the (i + 1)th frame image aligned with the ith frame image; next, the GDFN module fuses $\bar{h}_{i+1}^{b}$ with $x_i$ to obtain the fused feature $\tilde{h}_i^{b}$; finally, $\tilde{h}_i^{b}$ is fed into the depth residual block to obtain $h_i^{b}$.

Then $h_i^{f}$ and $h_i^{b}$ are fused to obtain the final feature map, which is up-sampled by the pixel-shuffle technique and passed through the reconstruction network to obtain the final high-resolution video.
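A sketch of the ith up-sampling module under this description, assuming 4x super-resolution via two pixel-shuffle stages plus a bilinearly up-sampled residual; the layer widths are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsampleModule(nn.Module):
    """Fuse h_i^f and h_i^b, up-sample by pixel-shuffle, and reconstruct hr_i."""
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.scale = scale
        self.fuse = nn.Conv2d(2 * channels, channels, 1)     # final feature map for frame i
        self.up = nn.Sequential(
            nn.Conv2d(channels, channels * 4, 3, padding=1), nn.PixelShuffle(2),  # x2
            nn.Conv2d(channels, channels * 4, 3, padding=1), nn.PixelShuffle(2),  # x4 in total
            nn.Conv2d(channels, 3, 3, padding=1))            # reconstruction to an RGB frame

    def forward(self, h_f, h_b, x_i):
        feat = self.fuse(torch.cat([h_f, h_b], dim=1))
        base = F.interpolate(x_i, scale_factor=self.scale, mode="bilinear", align_corners=False)
        return self.up(feat) + base                          # hr_i
```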
In practical applications, the specific structure of the GDFN module is shown in fig. 5. The feature fusion module GDFN uses depth-wise convolutions to encode information from spatially neighboring pixel positions, which helps the network learn to fuse features effectively. After a normalization operation (Norm), the input feature is split into two branches along the channel dimension, and each branch passes through a 1x1 convolution followed by a 3x3 depth-wise convolution. One branch is activated with a GELU function and multiplied element-wise with the other branch; a final 1x1 convolution restores the channel count, and the result is added to the original input to give the output.
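A minimal PyTorch sketch of the GDFN block as described above (following the gated depth-wise convolution design of Restormer's GDFN); the channel expansion factor is an illustrative assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GDFN(nn.Module):
    """Norm -> 1x1 conv -> 3x3 depth-wise conv -> GELU-gated product -> 1x1 conv -> residual add."""
    def __init__(self, channels=64, expansion=2):
        super().__init__()
        hidden = channels * expansion
        self.norm = nn.LayerNorm(channels)
        self.proj_in = nn.Conv2d(channels, hidden * 2, 1)    # 1x1 conv producing the two branches
        self.dwconv = nn.Conv2d(hidden * 2, hidden * 2, 3,
                                padding=1, groups=hidden * 2)  # 3x3 depth-wise convolution
        self.proj_out = nn.Conv2d(hidden, channels, 1)       # 1x1 conv restoring the channel count

    def forward(self, x):
        y = self.norm(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)   # Norm over the channel dim
        y1, y2 = self.dwconv(self.proj_in(y)).chunk(2, dim=1)      # split into two branches
        y = F.gelu(y1) * y2                                        # GELU gating, element-wise product
        return x + self.proj_out(y)                                # add back the original input
```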
In practical applications, the invention uses the frequency loss function to help recover more image detail. With commonly used loss functions, the model tends to over-smooth the video frames in order to reduce the loss value; the details that are lost correspond to the high-frequency components of the signal. The frequency loss function therefore penalizes differences in frequency space, yielding clearer and sharper video. The frequency loss function is specifically:
$$\mathcal{L}_{freq} = \left( \left\| \mathcal{F}(\hat{I}) - \mathcal{F}(I) \right\|^2 + \varepsilon^2 \right)^{\alpha}$$

where $\mathcal{L}_{freq}$ represents the frequency loss function, $\hat{I}$ represents the image generated by inputting the video to be trained into the BasicVSR model, $I$ represents the super-resolution image corresponding to the video to be trained, $\varepsilon$ represents a first constant, $\alpha$ represents a second constant, $\mathcal{F}(\hat{I})$ represents the fast Fourier transform of $\hat{I}$, and $\mathcal{F}(I)$ represents the fast Fourier transform of $I$.
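A sketch of this loss in PyTorch, reading the formula as a generalized Charbonnier penalty in frequency space (the exact placement of the constants is an assumption; `eps` and `alpha` stand for the first and second constants, with illustrative default values):

```python
import torch

def frequency_loss(sr, gt, eps=1e-3, alpha=0.5):
    """Penalize the difference between the FFTs of the generated and target frames."""
    sr_f = torch.fft.fft2(sr, dim=(-2, -1))      # fast Fourier transform of the model output
    gt_f = torch.fft.fft2(gt, dim=(-2, -1))      # fast Fourier transform of the ground truth
    diff2 = (sr_f - gt_f).abs() ** 2             # squared magnitude of the frequency difference
    return ((diff2 + eps ** 2) ** alpha).mean()
```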
For the above method, the embodiment of the invention also provides a video super-resolution reconstruction system based on deep learning, which comprises:
The construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module.
And the acquisition module is used for acquiring the video to be processed.
And the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed.
And the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
In practical application, the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
In practical application, the forward branch comprises N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1.
The first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; and the second output end of the ith forward propagation module is connected with the first input end of the ith up-sampling module.
The first input end of the ith backward propagation module is connected with the first output end of the (i + 1)th backward propagation module; a second input end of the ith backward propagation module is used for inputting an ith frame image and an (i + 1)th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i - 1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
In practical applications, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
The embodiment of the present invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the method for reconstructing super-resolution video based on deep learning according to the above embodiment is implemented.
The embodiment of the invention also provides a video super-resolution demonstration system:
In order to better show the super-resolution performance of the model, the model is converted using Open Neural Network Exchange (ONNX), so that the video super-resolution task can run in an environment without the model's dependency libraries installed. The demonstration interface is built with PyQt. As shown in fig. 6, Select is the video selection button: clicking Select chooses the video file to be processed. Original Video is the original video player, Modified Video is the super-resolution video player, and Model A, Model B and Model C are super-resolution algorithm selection buttons; pressing a button processes and plays the video with the corresponding super-resolution algorithm to obtain the corresponding video.
The method is an improvement based on the BasicVSR model: compared with the existing BasicVSR, it achieves a better feature-fusion effect by using the GDFN module, and it reduces the loss of high-frequency components in the super-resolution result by using the frequency loss function.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. A video super-resolution reconstruction method based on deep learning is characterized by comprising the following steps:
constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
acquiring a video to be processed;
inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
2. The method for super-resolution reconstruction of videos based on deep learning of claim 1, wherein the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
3. The method for reconstructing super-resolution video based on deep learning of claim 2, wherein the forward branch comprises N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1)th backward propagation module; a second input end of the ith backward propagation module is used for inputting an ith frame image and an (i + 1)th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i - 1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
4. The method as claimed in claim 3, wherein the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
5. The method for reconstructing video super-resolution based on deep learning of claim 1, wherein the frequency loss function is specifically:
$$\mathcal{L}_{freq} = \left( \left\| \mathcal{F}(\hat{I}) - \mathcal{F}(I) \right\|^2 + \varepsilon^2 \right)^{\alpha}$$

where $\mathcal{L}_{freq}$ represents the frequency loss function, $\hat{I}$ represents the image generated by inputting the video to be trained into the BasicVSR model, $I$ represents the super-resolution image corresponding to the video to be trained, $\varepsilon$ represents a first constant, $\alpha$ represents a second constant, $\mathcal{F}(\hat{I})$ represents the fast Fourier transform of $\hat{I}$, and $\mathcal{F}(I)$ represents the fast Fourier transform of $I$.
6. A video super-resolution reconstruction system based on deep learning is characterized by comprising:
the construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
the acquisition module is used for acquiring a video to be processed;
the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
7. The deep learning-based video super-resolution reconstruction system of claim 6, wherein the BasicVSR model comprises a forward branch, a backward branch and an upsampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
8. The deep learning-based video super-resolution reconstruction system according to claim 7, wherein the forward branch comprises N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1)th backward propagation module; a second input end of the ith backward propagation module is used for inputting an ith frame image and an (i + 1)th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i - 1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
9. The deep learning-based video super-resolution reconstruction system of claim 8, wherein the forward propagation module and the backward propagation module each comprise an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are connected in sequence.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the deep learning-based video super-resolution reconstruction method according to any one of claims 1 to 5.
CN202211392882.1A 2022-11-08 2022-11-08 Video super-resolution reconstruction method, system and medium based on deep learning Pending CN115564655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211392882.1A CN115564655A (en) 2022-11-08 2022-11-08 Video super-resolution reconstruction method, system and medium based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211392882.1A CN115564655A (en) 2022-11-08 2022-11-08 Video super-resolution reconstruction method, system and medium based on deep learning

Publications (1)

Publication Number Publication Date
CN115564655A (en) 2023-01-03

Family

ID=84769542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211392882.1A Pending CN115564655A (en) 2022-11-08 2022-11-08 Video super-resolution reconstruction method, system and medium based on deep learning

Country Status (1)

Country Link
CN (1) CN115564655A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118037549A (en) * 2024-04-11 2024-05-14 华南理工大学 Video enhancement method and system based on video content understanding



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination