CN115564655A - Video super-resolution reconstruction method, system and medium based on deep learning - Google Patents
- Publication number
- CN115564655A (application CN202211392882.1A)
- Authority
- CN
- China
- Prior art keywords
- video
- module
- super
- resolution
- ith
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Abstract
The invention relates to a video super-resolution reconstruction method, system and medium based on deep learning, in the technical field of video processing. The method comprises the following steps: inputting each frame image of a video to be processed into a super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed; and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image. The super-resolution model is obtained by training a BasicVSR model with the video to be trained as input, the super-resolution video corresponding to the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward and backward branches of the BasicVSR model each include a GDFN module. The invention can improve the quality of high-resolution video images.
Description
Technical Field
The invention relates to the technical field of video processing, in particular to a method, a system and a medium for reconstructing video super-resolution based on deep learning.
Background
Resolution is a set of performance parameters that measures how much detail an image contains, including temporal resolution, spatial resolution and gray-level resolution, and it reflects the ability of an imaging system to capture the detail of a scene. Compared with low-resolution images, high-resolution images typically offer greater pixel density, richer texture detail and higher fidelity. In practice, however, an ideal high-resolution image with sharp edges and no blocking or blurring cannot be obtained directly, owing to constraints such as the acquisition equipment and environment, the network transmission medium and bandwidth, and the video degradation model itself. The most direct way to improve image resolution is to improve the optical hardware of the acquisition system, but because the manufacturing process is hard to improve substantially and manufacturing costs are very high, solving the problem of low image resolution purely in hardware is usually too expensive.
Video super-resolution reconstruction refers to restoring a given low-resolution video into a corresponding high-resolution video through a specific algorithm. Compared with single-image super-resolution, video super-resolution can exploit information from adjacent frames to achieve a better result. Traditional super-resolution algorithms, such as interpolation, blur the edges of the high-resolution video frames and give poor results.
Disclosure of Invention
The invention aims to provide a video super-resolution reconstruction method, a system and a medium based on deep learning, which can improve the quality of a high-resolution video image.
In order to achieve the purpose, the invention provides the following scheme:
a video super-resolution reconstruction method based on deep learning comprises the following steps:
constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
acquiring a video to be processed;
inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame of image of the video to be processed.
Optionally, the BasicVSR model includes a forward branch, a backward branch and an up-sampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
Optionally, the forward branch includes N forward propagation modules; the backward branch includes N backward propagation modules; the up-sampling branch includes N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i+1)th backward propagation module; the second input end of the ith backward propagation module is used for inputting the ith frame image and the (i+1)th frame image of the video to be processed; the first output end of the ith backward propagation module is connected with the first input end of the (i-1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
Optionally, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are sequentially connected.
Optionally, the frequency loss function is specifically:

L_freq = (‖F(Î) - F(I)‖^2 + ε^2)^α

wherein L_freq represents the frequency loss function, Î represents the image generated by inputting the video to be trained into the BasicVSR model, I represents the super-resolution image corresponding to the video to be trained, ε represents a first constant, α represents a second constant, F(Î) represents the fast Fourier transform of Î, and F(I) represents the fast Fourier transform of I.
A video super-resolution reconstruction system based on deep learning comprises:
the construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model both include GDFN modules;
the acquisition module is used for acquiring a video to be processed;
the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
Optionally, the BasicVSR model includes a forward branch, a backward branch and an up-sampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
Optionally, the forward branch includes N forward propagation modules; the backward branch includes N backward propagation modules; the up-sampling branch includes N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i+1)th backward propagation module; the second input end of the ith backward propagation module is used for inputting the ith frame image and the (i+1)th frame image of the video to be processed; the first output end of the ith backward propagation module is connected with the first input end of the (i-1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
Optionally, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are sequentially connected.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the deep-learning-based video super-resolution reconstruction method described above.
According to the specific embodiments provided herein, the invention discloses the following technical effects: the invention uses the GDFN module to achieve a better feature fusion effect and can thereby improve the quality of high-resolution video images.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a video super-resolution reconstruction method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a detailed architecture diagram of the BasicVSR model;
FIG. 3 is a detailed block diagram of a forward propagation module;
FIG. 4 is a detailed block diagram of the back propagation module;
FIG. 5 is a detailed block diagram of a GDFN module;
fig. 6 is a detailed block diagram of the video super-resolution system.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
With the rise of deep learning, video super-resolution technology based on deep learning has been developing rapidly. The invention provides a video super-resolution reconstruction method based on deep learning. The super-resolution model of the invention uses a recurrent network architecture to pass information between video frames, uses a GDFN module to improve the effect of feature fusion, and adds a frequency loss function to optimize the network, so that the super-resolution model combines good performance, a low parameter count and high computational efficiency.
The embodiment of the invention provides a video super-resolution reconstruction method based on deep learning, which comprises the following steps:
step 101: constructing a hyper-resolution model; the super-resolution model is obtained by training a BasicVSR model by taking an image corresponding to each frame of a video to be trained as input, taking a super-resolution image corresponding to each frame of the video to be trained as output and taking the minimum frequency loss function as a target; the forward and backward branches of the basicsvsr model each include a GDFN module.
Step 102: and acquiring a video to be processed.
Step 103: and inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed.
Step 104: and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame of image of the video to be processed.
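Steps 101 to 104 amount to a simple per-frame loop at inference time. The sketch below illustrates the shape of that loop with a stand-in upscaler in place of the trained model; the function name, the stand-in model and the 4x scale factor are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def super_resolve_video(frames, model):
    """Steps 101-104 as a loop: feed each frame of the video to be processed
    into the (here hypothetical) super-resolution model, then assemble the
    per-frame outputs back into a video."""
    return np.stack([model(f) for f in frames])

# Stand-in "model": 4x nearest-neighbour upscaling in place of the trained network.
toy_model = lambda f: np.repeat(np.repeat(f, 4, axis=0), 4, axis=1)
video = [np.zeros((8, 8)) for _ in range(5)]
sr_video = super_resolve_video(video, toy_model)  # 5 frames of 32x32 output
```

In the patent's actual model the frames are not independent: the bidirectional branches described below pass features between them.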
In practical application, the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
In practical applications, as shown in fig. 2, the forward branch includes N forward propagation modules; the backward branch includes N backward propagation modules; the up-sampling branch includes N up-sampling modules; N is a positive integer greater than 1.
The first input end of the ith forward propagation module is connected with the first output end of the (i-1)th forward propagation module and is used for inputting the forward propagation feature h_(i-1)^f of the (i-1)th frame image output by the (i-1)th forward propagation module. The second input end of the ith forward propagation module is used for inputting the ith frame image x_i and the (i-1)th frame image x_(i-1) of the video to be processed. The first output end of the ith forward propagation module is connected with the first input end of the (i+1)th forward propagation module and is used for outputting the forward propagation feature h_i^f of the ith frame image. The second output end of the ith forward propagation module is connected with the first input end of the ith up-sampling module, to which it also outputs h_i^f.

The first input end of the ith backward propagation module is connected with the first output end of the (i+1)th backward propagation module and is used for inputting the backward propagation feature h_(i+1)^b of the (i+1)th frame image output by the (i+1)th backward propagation module. The second input end of the ith backward propagation module is used for inputting the ith frame image x_i and the (i+1)th frame image x_(i+1) of the video to be processed. The first output end of the ith backward propagation module is connected with the first input end of the (i-1)th backward propagation module and is used for outputting the backward propagation feature h_i^b of the ith frame image. The second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module, to which it outputs h_i^b. The output end of the ith up-sampling module outputs the super-resolution image hr_i corresponding to the ith frame image.
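The wiring above describes a recurrent chain: each propagation module consumes the previous module's feature and the current frame, and hands a new feature both to the next module and to its up-sampling module. A minimal sketch of one such branch, with a toy averaging update standing in for the warp, GDFN fusion and residual computation of the actual modules:

```python
import numpy as np

def propagate(frames, direction="forward"):
    """Toy version of one propagation branch: the ith module receives the
    previous module's feature on its first input and the current frame on its
    second input, and emits a new feature on both of its outputs."""
    n = len(frames)
    feats = [None] * n
    h = np.zeros_like(frames[0])            # initial propagation feature
    order = range(n) if direction == "forward" else range(n - 1, -1, -1)
    for i in order:
        # stands in for: optical-flow warp, GDFN fusion with frames[i], residual blocks
        h = 0.5 * (h + frames[i])
        feats[i] = h                        # second output, consumed by the ith up-sampling module
    return feats

frames = [np.full((2, 2), float(v)) for v in (1, 2, 3)]
fwd = propagate(frames, "forward")    # visits i = 0, 1, 2
bwd = propagate(frames, "backward")   # visits i = 2, 1, 0
```

The only structural point the sketch makes is the direction of the recurrence; the real update inside each module is described next.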
In practical applications, as shown in fig. 3 and 4, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
Taking the ith forward propagation module as an example, the workflow of forward propagation is as follows: first, the optical flow estimation module computes the forward optical flow s_i^f between x_(i-1) and x_i; s_i^f is then used to spatially warp h_(i-1)^f, giving the feature of the (i-1)th frame aligned with the ith frame; the GDFN module fuses this aligned feature with x_i to obtain a fused feature, which is fed into the depth residual block to obtain h_i^f.
Taking the ith backward propagation module as an example, the workflow of backward propagation is as follows: first, the optical flow estimation module computes the backward optical flow s_i^b between x_(i+1) and x_i; s_i^b is then used to spatially warp h_(i+1)^b, giving the feature of the (i+1)th frame aligned with the ith frame; the GDFN module fuses this aligned feature with x_i to obtain a fused feature, which is fed into the depth residual block to obtain h_i^b.
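The spatial-warp alignment step in both workflows is backward warping of a feature map by the estimated optical flow. A minimal single-channel bilinear-warping sketch (an illustration of the operation, not the patent's implementation) is:

```python
import numpy as np

def flow_warp(img, flow):
    """Backward-warp a single-channel image by a per-pixel flow field with
    bilinear sampling: output[y, x] samples img at (y + flow_y, x + flow_x)."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sx = np.clip(xs + flow[..., 0], 0, w - 1)   # sampling coordinates, clipped
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = sx - x0, sy - y0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy

img = np.arange(9, dtype=float).reshape(3, 3)
shift = np.zeros((3, 3, 2))
shift[..., 0] = 1.0                  # a constant flow of +1 pixel in x
warped = flow_warp(img, shift)       # columns shift left by one (border clamped)
```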
Then h_i^f and h_i^b are fused to obtain the final feature map, which is up-sampled by the pixel-shuffle technique and passed through the reconstruction network to obtain the final high-resolution video.
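The pixel-shuffle up-sampling mentioned above rearranges channels into spatial positions. A small sketch of the operation, laid out the same way as the standard pixel shuffle used in sub-pixel convolution:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r), as used by
    the up-sampling branch (same layout as torch.nn.functional.pixel_shuffle)."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)      # (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```

Each output pixel block of size r by r is filled from r*r consecutive input channels, so resolution grows without interpolation.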
In practical application, the specific structure of the GDFN module is shown in fig. 5. The feature fusion module GDFN uses depth-wise convolution to encode information from spatially adjacent pixel positions, which helps the network learn to fuse features effectively. After a normalization operation (Norm), the input feature is split into two parts along the channel dimension; each part is passed through a 1x1 convolution and a 3x3 convolution, one branch is activated by a GELU activation function and multiplied element-wise with the other branch, and after a 1x1 convolution restores the channel count, the result is added to the original input to obtain the final output.
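The gating at the heart of this description (splitting the feature along channels, applying GELU to one branch, and taking the element-wise product with the other) can be sketched as follows; the surrounding convolutions, normalization and residual connection are omitted:

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def gdfn_gate(x):
    """Gating core of the GDFN block: split the (C, H, W) feature along the
    channel axis, pass one half through GELU, and multiply element-wise with
    the other half. The 1x1/3x3 convolutions and residual add are omitted."""
    c = x.shape[0] // 2
    return gelu(x[:c]) * x[c:]
```

The gate lets one branch decide, per position, how much of the other branch's signal passes through, which is what gives the fusion its selectivity.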
In practical applications, the invention uses the frequency loss function to help recover more image detail. With commonly used loss functions, the model tends to over-smooth the video frames in order to reduce the loss value, and the details lost in this way correspond to the high-frequency components of the signal. The frequency loss function therefore reduces the difference in frequency space, yielding clearer and sharper video. The frequency loss function is specifically:
L_freq = (‖F(Î) - F(I)‖^2 + ε^2)^α

wherein L_freq represents the frequency loss function, Î represents the image generated by inputting the video to be trained into the BasicVSR model, I represents the super-resolution image corresponding to the video to be trained, ε represents a first constant, α represents a second constant, F(Î) represents the fast Fourier transform of Î, and F(I) represents the fast Fourier transform of I.
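Because the original equation is only partially recoverable from the text, the sketch below shows one plausible Charbonnier-style reading of the frequency loss, with `eps` and `alpha` standing in for the first and second constants; it is an illustration consistent with the symbols defined above, not the patent's definitive formula:

```python
import numpy as np

def frequency_loss(pred, target, eps=1e-3, alpha=0.5):
    """One possible reading of the frequency loss: take the 2D FFT of both
    images and apply a Charbonnier-style penalty (constants eps and alpha)
    to the spectral difference."""
    diff = np.fft.fft2(pred) - np.fft.fft2(target)
    return np.mean((np.abs(diff) ** 2 + eps ** 2) ** alpha)
```

Identical images give the floor value set by eps, and any mismatch (including a pure brightness shift, which lands in the DC bin) raises the loss.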
The embodiment of the invention also provides a video super-resolution reconstruction system based on deep learning aiming at the method, which comprises the following steps:
the construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward and backward branches of the BasicVSR model each include a GDFN module.
And the acquisition module is used for acquiring the video to be processed.
And the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed.
And the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
In practical application, the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
In practical application, the forward branch includes N forward propagation modules; the backward branch includes N backward propagation modules; the up-sampling branch includes N up-sampling modules; N is a positive integer greater than 1.
The first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; and the second output end of the ith forward propagation module is connected with the first input end of the ith up-sampling module.
The first input end of the ith backward propagation module is connected with the first output end of the (i+1)th backward propagation module; the second input end of the ith backward propagation module is used for inputting the ith frame image and the (i+1)th frame image of the video to be processed; the first output end of the ith backward propagation module is connected with the first input end of the (i-1)th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
In practical applications, the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
The embodiment of the present invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the method for reconstructing super-resolution video based on deep learning according to the above embodiment is implemented.
The embodiment of the invention also provides a video super-resolution system, described as follows:
in order to better show the hyper-parting performance of the model, the model is converted by using open neural network exchange (ONNX), so that the video hyper-parting task can be carried out in an environment without installing a model dependency library. The remove system interface is built using pyqt. As shown in fig. 6, in the figure, select is a Video selection button, click select selects a Video file to be processed, originalVideo is an original Video player, modified Video is a super-score Video player, model a, model b, and model c are super-score algorithm selection buttons, and press a button will process and play the Video using the corresponding super-score algorithm, so as to obtain the corresponding Video.
The method is an improvement on the BasicVSR model. Compared with the existing BasicVSR, it achieves a better feature fusion effect by using the GDFN module, and it uses a frequency loss function to reduce the loss of high-frequency components in the super-resolution result.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.
Claims (10)
1. A video super-resolution reconstruction method based on deep learning is characterized by comprising the following steps:
constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward and backward branches of the BasicVSR model each include a GDFN module;
acquiring a video to be processed;
inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
and obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame of image of the video to be processed.
2. The method for super-resolution reconstruction of videos based on deep learning of claim 1, wherein the BasicVSR model comprises a forward branch, a backward branch and an up-sampling branch; and the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
3. The method for reconstructing super-resolution video based on deep learning of claim 2, wherein the forward branch comprises N forward propagation modules; the backward branch comprises N backward propagation modules; the up-sampling branch comprises N up-sampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1) th backward propagation module; a second input end of the ith backward propagation module is used for inputting the ith frame image and the (i + 1) th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i-1) th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
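The wiring of claims 2 and 3 can be sketched as a forward pass and a backward pass whose per-index features both feed an upsampling step (a toy sketch; `prop` and `upsample` are hypothetical stand-ins for one propagation module and one upsampling module, and the backward branch is assumed, as in BasicVSR, to read the (i+1)-th neighbouring frame):

```python
import numpy as np

def bidirectional_propagate(frames, prop, upsample):
    """Toy wiring of forward/backward branches feeding the upsampling branch.

    prop(hidden, cur, ref) stands in for one propagation module: it receives
    the hidden feature from the neighbouring module plus the current and
    reference frames. upsample(fwd, bwd) stands in for one upsampling module.
    """
    n = len(frames)
    zeros = np.zeros_like(frames[0])
    fwd = [None] * n
    h = zeros
    for i in range(n):                        # i-th module: frames i and i-1
        h = prop(h, frames[i], frames[i - 1] if i > 0 else zeros)
        fwd[i] = h
    bwd = [None] * n
    h = zeros
    for i in range(n - 1, -1, -1):            # i-th module: frames i and i+1
        h = prop(h, frames[i], frames[i + 1] if i < n - 1 else zeros)
        bwd[i] = h
    # i-th upsampling module fuses the i-th forward and backward features.
    return [upsample(f, b) for f, b in zip(fwd, bwd)]

outs = bidirectional_propagate(
    [np.full((4, 4), float(t)) for t in range(3)],
    prop=lambda h, cur, ref: h + cur,         # toy propagation: accumulate
    upsample=lambda f, b: f + b,              # toy fusion of the two branches
)
```

Only the connection pattern is claimed; the real modules perform optical flow estimation, warping, GDFN filtering, and residual refinement rather than the toy arithmetic above.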
4. The method as claimed in claim 3, wherein the forward propagation module and the backward propagation module each include an optical flow estimation module, a spatial warping module, and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module, and the depth residual block are connected in sequence.
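Inside each propagation module of claim 4, the optical-flow step aligns the reference frame's features before the GDFN and residual blocks. The warping step alone can be sketched like this (an integer-offset toy; real spatial warping modules use sub-pixel bilinear sampling, and the flow field here is hypothetical):

```python
import numpy as np

def warp(feat, flow):
    """Backward-warp a feature map by an integer optical flow field (toy).

    feat: H x W array; flow: H x W x 2 integer (dy, dx) offsets giving, for
    each output pixel, where to sample in the source feature map. Offsets
    are clipped at the border so every sample stays inside the map.
    """
    h, w = feat.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys + flow[..., 0], 0, h - 1)
    src_x = np.clip(xs + flow[..., 1], 0, w - 1)
    return feat[src_y, src_x]

f = np.arange(9.0).reshape(3, 3)
shift = np.zeros((3, 3, 2), dtype=int)
shift[..., 1] = 1            # every pixel samples from one column to the right
warped = warp(f, shift)
```

In the claimed module this warped feature would then pass through the GDFN module and the depth residual block in sequence.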
5. The method for reconstructing video super-resolution based on deep learning of claim 1, wherein the frequency loss function is specifically:
L_freq = (|F(Î) − F(I)|² + ε²)^α
wherein L_freq represents the frequency loss function, Î represents the image generated by inputting the video to be trained into the BasicVSR model, I represents the super-resolution image corresponding to the video to be trained, ε represents a first constant, α represents a second constant, F(Î) represents the fast Fourier transform of Î, and F(I) represents the fast Fourier transform of I.
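A loss of this shape, i.e. a robust Charbonnier-type penalty applied to the FFT difference, can be sketched as follows (the exact functional form, and the values of the constants ε and α, are assumptions for illustration; the claim only names the quantities involved):

```python
import numpy as np

def frequency_loss(pred, target, eps=1e-3, alpha=0.5):
    """Charbonnier-style loss in the frequency domain (sketch of claim 5).

    pred, target: H x W arrays. Both images are mapped to the frequency
    domain with a 2-D FFT, and the magnitude of the complex spectrum
    difference is penalised with the robust (x^2 + eps^2)^alpha form,
    where eps and alpha play the roles of the claim's first and second
    constants.
    """
    diff = np.fft.fft2(pred) - np.fft.fft2(target)  # complex spectrum diff
    return np.mean((np.abs(diff) ** 2 + eps ** 2) ** alpha)

x = np.random.rand(16, 16)
```

Penalising the spectrum rather than raw pixels pushes the model to recover high-frequency detail that per-pixel losses tend to blur away.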
6. A video super-resolution reconstruction system based on deep learning is characterized by comprising:
the construction module is used for constructing a super-resolution model; the super-resolution model is obtained by training a BasicVSR model with the image corresponding to each frame of a video to be trained as input, the super-resolution image corresponding to each frame of the video to be trained as output, and minimization of a frequency loss function as the training objective; the forward branch and the backward branch of the BasicVSR model each include a GDFN module;
the acquisition module is used for acquiring a video to be processed;
the super-resolution image determining module is used for inputting each frame image of the video to be processed into the super-resolution model to obtain a super-resolution image corresponding to each frame image of the video to be processed;
the super-resolution video determination module is used for obtaining a super-resolution video corresponding to the video to be processed according to the super-resolution image corresponding to each frame image of the video to be processed.
7. The deep learning-based video super-resolution reconstruction system of claim 6, wherein the BasicVSR model comprises a forward branch, a backward branch and an upsampling branch; the output ends of the forward branch and the backward branch are connected with the input end of the up-sampling branch.
8. The deep learning-based video super-resolution reconstruction system according to claim 7, wherein the forward branch comprises N forward propagation modules; the backward branch comprises N backward propagation modules; the upsampling branch comprises N upsampling modules; N is a positive integer greater than 1;
the first input end of the ith forward propagation module is connected with the first output end of the (i-1) th forward propagation module; a second input end of the ith forward propagation module is used for inputting an ith frame image and an (i-1) th frame image of the video to be processed; a first output end of the ith forward propagation module is connected with a first input end of the (i + 1) th forward propagation module; a second output end of the ith forward propagation module is connected with a first input end of the ith up-sampling module;
the first input end of the ith backward propagation module is connected with the first output end of the (i + 1) th backward propagation module; a second input end of the ith backward propagation module is used for inputting the ith frame image and the (i + 1) th frame image of the video to be processed; a first output end of the ith backward propagation module is connected with a first input end of the (i-1) th backward propagation module; and the second output end of the ith backward propagation module is connected with the second input end of the ith up-sampling module.
9. The deep learning-based video super-resolution reconstruction system of claim 8, wherein the forward propagation module and the backward propagation module each comprise an optical flow estimation module, a spatial warping module and a depth residual block, and the optical flow estimation module, the spatial warping module, the GDFN module and the depth residual block are connected in sequence.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the deep learning-based video super-resolution reconstruction method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211392882.1A CN115564655A (en) | 2022-11-08 | 2022-11-08 | Video super-resolution reconstruction method, system and medium based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115564655A true CN115564655A (en) | 2023-01-03 |
Family
ID=84769542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211392882.1A Pending CN115564655A (en) | 2022-11-08 | 2022-11-08 | Video super-resolution reconstruction method, system and medium based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115564655A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118037549A (en) * | 2024-04-11 | 2024-05-14 | 华南理工大学 | Video enhancement method and system based on video content understanding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zeng et al. | Learning image-adaptive 3d lookup tables for high performance photo enhancement in real-time | |
CN109903228B (en) | Image super-resolution reconstruction method based on convolutional neural network | |
Kim et al. | Deep sr-itm: Joint learning of super-resolution and inverse tone-mapping for 4k uhd hdr applications | |
CN108022212B (en) | High-resolution picture generation method, generation device and storage medium | |
CN111028150B (en) | Rapid space-time residual attention video super-resolution reconstruction method | |
CN111898701B (en) | Model training, frame image generation and frame insertion methods, devices, equipment and media | |
KR101137753B1 (en) | Methods for fast and memory efficient implementation of transforms | |
AU2011216119B2 (en) | Generic platform video image stabilization | |
CN110222758B (en) | Image processing method, device, equipment and storage medium | |
CN111260560B (en) | Multi-frame video super-resolution method fused with attention mechanism | |
US7418130B2 (en) | Edge-sensitive denoising and color interpolation of digital images | |
US9462220B2 (en) | Auto-regressive edge-directed interpolation with backward projection constraint | |
CN112581361B (en) | Training method of image style migration model, image style migration method and device | |
CN113096013B (en) | Blind image super-resolution reconstruction method and system based on imaging modeling and knowledge distillation | |
CN111784570A (en) | Video image super-resolution reconstruction method and device | |
CN110717868A (en) | Video high dynamic range inverse tone mapping model construction and mapping method and device | |
CN116681584A (en) | Multistage diffusion image super-resolution algorithm | |
KR20200132682A (en) | Image optimization method, apparatus, device and storage medium | |
CN115564655A (en) | Video super-resolution reconstruction method, system and medium based on deep learning | |
CN117333398A (en) | Multi-scale image denoising method and device based on self-supervision | |
CN114926336A (en) | Video super-resolution reconstruction method and device, computer equipment and storage medium | |
CN116895037A (en) | Frame insertion method and system based on edge information and multi-scale cross fusion network | |
CN116208812A (en) | Video frame inserting method and system based on stereo event and intensity camera | |
CN115841523A (en) | Double-branch HDR video reconstruction algorithm based on Raw domain | |
CN115170402A (en) | Frame insertion method and system based on cyclic residual convolution and over-parameterized convolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||