CN113487481B - Circular video super-resolution method based on information construction and multi-density residual block

Info

Publication number
CN113487481B
Authority
CN
China
Prior art keywords
resolution
information
convolution
output
super
Prior art date
Legal status
Expired - Fee Related
Application number
CN202110746815.4A
Other languages
Chinese (zh)
Other versions
CN113487481A (en)
Inventor
于明
王书韵
薛翠红
郭迎春
朱叶
于洋
师硕
阎刚
刘依
Assignee
Hebei University of Technology
Tianjin University of Technology
Priority date
Filing date
Publication date
Application filed by Hebei University of Technology, Tianjin University of Technology filed Critical Hebei University of Technology
Priority to CN202110746815.4A
Publication of CN113487481A
Application granted
Publication of CN113487481B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks


Abstract

The invention relates to a circular video super-resolution method based on information construction and multiple dense residual blocks. An introduced information construction module constructs and fills prior information for the initial recurrent neural network, so that the early recurrent steps have enough information to reconstruct the early frames; at the same time, the multiple dense residual blocks extract features representing deep information, which propagate through the recurrent network in the hidden state. The method comprises the following steps: preprocessing the video data to obtain the corresponding low-resolution frame sequence; constructing an information construction module which takes the first m frames of the low-resolution frame sequence as input and outputs the initial hidden information h_0 and the initial pre-output information o_0; and constructing a recurrent neural network of multiple dense residual blocks, into which the two outputs of the information construction module are fed to obtain the super-resolution frame sequence. The method is also capable of handling online super-resolution tasks.

Description

Circular video super-resolution method based on information construction and multi-density residual block
Technical Field
The technical scheme of the invention relates to super-resolution reconstruction of videos, in particular to a circular video super-resolution method based on information construction and multiple density residual blocks.
Background
With the advent of the 5G era, video has gradually replaced images as the mainstream information transmitted over the internet, and ever more ultra-high-definition video data is quietly changing people's lives. However, how to deliver this large amount of high-definition video to users quickly and clearly is undoubtedly a problem that application vendors must solve. High-quality video is usually compressed and encoded, transmitted over a channel, and then decoded at the terminal. This process is lossy: pixels may be lost during encoding and decoding, or during transmission, which greatly reduces the quality of the decoded video and harms the user experience. Super-resolution is a technique that restores a low-resolution video to a high-resolution video by computing and filling in the missing information; processing the decoded video with a super-resolution algorithm can therefore improve video quality and mitigate the quality loss incurred during encoding, decoding and transmission.
Unlike single-image super-resolution, video data consists of consecutive frames, and making full use of spatio-temporal information is the key to recovering video detail. Traditional reconstruction-based and learning-based super-resolution methods restore ultra-high-definition video unsatisfactorily, and existing unidirectional recurrent video super-resolution methods reconstruct the initial frames extremely poorly because rich detail information is not available when the early frames are reconstructed, which greatly harms the super-resolution effect and the user experience. Bidirectional recurrent methods cannot output reconstructed video frames promptly and are therefore severely limited in application fields such as video transmission. For example, CN111587447A discloses a frame-recurrent video super-resolution method that uses explicit motion estimation and optical flow to warp and motion-compensate the current frame, mining temporal information before recursion and iteration to achieve super-resolution reconstruction. Its disadvantage is that explicit motion compensation depends on the accuracy of the optical flow computed by motion estimation; once the optical flow deviates, severe artifacts and distortion arise, seriously degrading the super-resolution result. CN109102462A discloses a deep-learning video super-resolution reconstruction method that first uses a bidirectional recurrent network to extract forward and backward features and then integrates temporal and spatial information with deep 3D back-projection. Its disadvantages are: the bidirectional recurrent network must process all video frames before feeding the next module, so it cannot run online in a video super-resolution task; the large number of 3D convolutions used to integrate spatio-temporal information greatly increases computational complexity and burden; and the separated extract-then-integrate design cannot reuse the generated feature information, which greatly reduces computational efficiency, so that processing higher-resolution or longer videos in particular is slow and demands much memory. CN111260560A discloses a multi-frame video super-resolution method fused with an attention mechanism, which connects a 3D-convolution feature alignment module and a deformable-convolution feature alignment module together as an implicit alignment module and then reconstructs features with several residual blocks augmented with spatial and channel attention. Its disadvantages are: first, both the 3D-convolution and deformable-convolution alignment modules are extremely time-consuming, greatly increasing the parameter count and the super-resolution time and seriously harming the efficiency of the super-resolution process; second, when features are reconstructed with residual blocks, the local features generated by the blocks cannot be effectively reused, wasting computing resources, and the added attention mechanism contributes little to feature reconstruction while increasing the computational load.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the technical problem to be solved by the invention is as follows: a circular video super-resolution method based on information construction and multi-density residual block. The method uses an information construction module to construct and fill prior information for the initial recurrent neural network, so that the early recurrent steps have enough information to reconstruct the early frames; this overcomes the poor early-frame reconstruction of unidirectional recurrent methods while, compared with bidirectional recurrent networks, reducing memory occupation and computational cost, so the method can handle online super-resolution tasks. Multiple dense residual blocks extract and integrate features representing deep information, which propagate through the recurrent network in the hidden state, overcoming the insufficient integration of spatio-temporal information in the prior art.
The technical scheme adopted by the invention to solve the technical problem is as follows: an introduced information construction module constructs and fills prior information for the initial recurrent neural network, so that the early recurrent steps have enough information to reconstruct the early frames; at the same time, the multi-dense residual block extracts features representing deep information, which propagate through the recurrent network in the hidden state. The method comprises the following steps:
preprocessing the video data to obtain the corresponding low-resolution frame sequence I^LR = {i_1^LR, i_2^LR, …, i_n^LR};
constructing an information construction module: the information construction module comprises feature extraction and channel attention, a fast information architecture adjustment block and a structure simulation block; the input is the first m frames of the low-resolution frame sequence, and the outputs are the initial hidden information h_0 and the initial pre-output information o_0;
the process of feature extraction and channel attention is as follows: first, for each of the first m frames (1 ≤ m ≤ n) of the low-resolution frame sequence I^LR, features are extracted by convolution to obtain the shallow features f_1, f_2, …, f_m; the shallow features are then superimposed along the channel dimension and input into the channel attention module SE to obtain the shallow feature screening set K;
the specific structure of the fast information architecture adjustment block is as follows: the shallow feature screening set K passes through a convolution and ReLU activation function and is then input into serially stacked residual blocks for deep feature extraction, outputting deep information features; each residual block consists of two convolutions and a skip connection;
the specific operation of the structure simulation block is as follows: the deep information features output by the fast information architecture adjustment block are processed by a convolution layer and a ReLU activation function to obtain the initial hidden information; meanwhile, the deep information features are reduced in feature dimension by another convolution layer, fed into a sub-pixel convolution layer, and added to the low-resolution initial image upsampled by a factor of r to obtain the initial pre-output information;
constructing the recurrent neural network of the multi-dense residual block, into which the two outputs of the information construction module are input; through t loops, the t-th recurrent step outputs h_t, o_t and the super-resolution image i_t^SR; if t < n at this point, the (t+1)-th recurrent step proceeds; if t = n, the super-resolution computation of all n frames is complete and the super-resolution frame sequence I^SR = {i_1^SR, …, i_n^SR} is obtained.
The method comprises the following specific steps:
Firstly, the video data is preprocessed:
the video data is randomly rotated or flipped using a random function and a user-defined threshold, and frames are extracted from the rotated or flipped video data to obtain the video frame sequence V = (v_1, …, v_n), where n is the total number of frames of the video and v_1, v_n are the first and last frames of V, respectively; all video frames are average-cropped so that their width and height are unified to 256 × 256 pixels, obtaining the high-resolution frame sequence I^HR = {i_1^HR, …, i_n^HR} corresponding to the video frame sequence V;
the obtained high-resolution frame sequence I^HR is then subjected to Gaussian blur with variance σ = 1.6 and blur radius 3, and to downsampling with sampling factor r, obtaining the corresponding low-resolution frame sequence I^LR = {i_1^LR, …, i_n^LR};
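As an illustration of this first step, the following is a minimal Python/PyTorch sketch of the preprocessing, assuming torchvision's gaussian_blur and center_crop, an (n, c, H, W) tensor layout, and that blur radius 3 corresponds to a 7 × 7 kernel; the helper names blur_down and preprocess are hypothetical, not named in the patent.

import torch
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur, center_crop

def blur_down(hr_frames: torch.Tensor, r: int = 4) -> torch.Tensor:
    """hr_frames: (n, c, H, W) high-resolution frame sequence I^HR."""
    # Gaussian blur with sigma = 1.6; blur radius 3 -> 7x7 kernel (assumed mapping).
    blurred = gaussian_blur(hr_frames, kernel_size=7, sigma=1.6)
    # Downsample by the sampling factor r (bicubic is one common choice).
    return F.interpolate(blurred, scale_factor=1 / r, mode="bicubic",
                         align_corners=False)

def preprocess(video_frames: torch.Tensor, r: int = 4):
    """Returns (I^HR, I^LR) from a raw frame sequence V of shape (n, c, h, w)."""
    hr = center_crop(video_frames, [256, 256])   # "average cropping" to 256x256
    return hr, blur_down(hr, r)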
And secondly, using partial low-resolution frame sequences to carry out information construction:
the process of feature extraction and channel attention is as follows: firstly, obtaining an initial n-frame low-resolution frame sequence I in a first stepLRIn the previous m (m is more than or equal to 1 and less than or equal to n, m is 7 in the embodiment) frames, each frame is subjected to convolution processing to carry out feature extraction, and shallow layer features are obtained
Figure BDA0003144575400000032
Then, shallow features are overlapped on channel dimensions and input into a channel attention module (SE), and the operation process of the SE is divided into two steps of compression and excitation: compressing through a given feature map (shallow features)
Figure BDA0003144575400000033
) Performing global average pooling to obtain global compression characteristic quantity of the current characteristic diagram, exciting a two-layer fully-connected bottleneck layer structure with a ReLU activation function in the middle to obtain a weight of each channel in the characteristic diagram, and performing weighting on the characteristic diagram and the weight to output a shallow characteristic screening set K; as shown in the following equation (2):
Figure BDA0003144575400000034
in the formula (2), [, ] represents the superposition operation of the channel dimensions, and SE (-) is the channel attention operation;
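For illustration, a minimal PyTorch sketch of the channel attention of equation (2); the reduction ratio 16 and the final sigmoid gate follow the standard Squeeze-and-Excitation design and are assumptions, as the patent specifies only the pooling and the two-layer fully connected bottleneck with ReLU.

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # compression: global average pooling
        self.fc = nn.Sequential(                     # excitation: two-layer FC bottleneck
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # weight the feature map per channel

# Shallow features f_1..f_m superimposed along the channel dimension, then screened:
# K = SEBlock(m * feat_channels)(torch.cat([f1, ..., fm], dim=1))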
Then the obtained shallow feature screening set K passes through the fast information architecture adjustment block to obtain deep information features, and the structure simulation block constructs from them the initial hidden information (h_0) and the initial pre-output information (o_0), wherein:
the specific process of the fast information architecture adjustment block is as follows: the shallow feature screening set K passes through a convolution and ReLU activation function and is then input into serially stacked residual blocks for deep feature extraction, outputting deep information features; each residual block consists of two convolutions and a skip connection.
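A minimal PyTorch sketch of the fast information architecture adjustment block as just described, assuming 128 feature channels and the 5 serially connected residual blocks mentioned for the embodiment of FIG. 3; the class names ResBlock and FIA are illustrative.

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)      # two convolutions plus a skip connection

class FIA(nn.Module):
    def __init__(self, in_ch: int, ch: int = 128, num_blocks: int = 5):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(*[ResBlock(ch) for _ in range(num_blocks)])

    def forward(self, k):
        return self.blocks(self.head(k))   # deep information features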
The specific operation of the structure simulation block is as follows: processing deep information features output by the rapid information architecture adjusting block through a convolution layer and a ReLU activation function to obtain initial hidden information, meanwhile, performing feature dimensionality reduction on the output deep information features through another convolution layer, feeding the output deep information features into a sub-pixel convolution layer, and adding the sub-pixel convolution layer and a low-resolution initial image subjected to r-time upsampling processing to obtain initial pre-output information;
h0expressed by equation (3):
h0=ReLU(Conv(FIA(K))) (3)
o0expressed by equation (4):
Figure BDA0003144575400000035
in formulae (3) and (4), Hspc(. H) is a sub-pixel convolution operation, FIA (. H) is a fast information frame adjustment block, + represents the pixel addition of the same channel, Hus() r times bilinear upsampling operation, or other upsampling methods, Conv is convolution operation, the convolution kernels of the convolution operations in the formula (3) and the formula (4) are the same in size, the convolution parameters of the convolution kernels are different, ReLU is activation function operation,
Figure BDA0003144575400000036
for a sequence of low resolution frames ILRThe first frame in (1);
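A minimal PyTorch sketch of the structure simulation block, equations (3) and (4), assuming 128 input channels and the embodiment values r = 4 and c = 3; PixelShuffle plays the role of the sub-pixel convolution layer H_spc, and the class name is illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class StructureSimulation(nn.Module):
    def __init__(self, ch: int = 128, r: int = 4, c: int = 3):
        super().__init__()
        self.conv_h = nn.Conv2d(ch, ch, 3, padding=1)          # branch for h_0
        self.conv_o = nn.Conv2d(ch, c * r * r, 3, padding=1)   # feature dimension reduction for o_0
        self.spc = nn.PixelShuffle(r)                          # sub-pixel convolution layer
        self.r = r

    def forward(self, deep_feat, i1_lr):
        h0 = F.relu(self.conv_h(deep_feat))                    # eq. (3)
        up = F.interpolate(i1_lr, scale_factor=self.r,
                           mode="bilinear", align_corners=False)
        o0 = self.spc(self.conv_o(deep_feat)) + up             # eq. (4)
        return h0, o0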
Thirdly, shallow feature extraction in the recurrent neural network of the multi-dense residual block:
Because the ordinal of the recurrent step corresponds one-to-one with the index of the low-resolution frame to be super-resolved, t uniformly denotes the ordinal of the recurrent step or the index of the low-resolution frame to be super-resolved; the parameters of the recurrent steps corresponding to different ordinals are shared, and t traverses from 1 to n in turn, 1 ≤ t ≤ n. In the t-th recurrent step, the t-th and (t-1)-th frames i_t^LR and i_{t-1}^LR of the low-resolution frame sequence I^LR obtained in the first step, the hidden state h_{t-1} output by the (t-1)-th recurrent step, and the output result o_{t-1} downsampled by a factor of r are concatenated along the channel dimension; when t = 1, h_{t-1}, o_{t-1} and i_{t-1}^LR are initialized, respectively, to the h_0 and o_0 computed in the second step and the first frame i_1^LR of the low-resolution frame sequence I^LR from the first step. The concatenated sequence is input into the shallow feature extraction module of the recurrent neural network of the multi-dense residual block to obtain the shallow features F_t; the specific process is equation (5):
F_t = H_sfe([h_{t-1}, Down(o_{t-1}), i_{t-1}^LR, i_t^LR])   (5)
in equation (5), F_t are the shallow features, Down(·) is the downsampling operation, [·,·] denotes concatenation along the channel dimension, and H_sfe(·) is the shallow feature extraction module, whose purpose is feature extraction and which comprises a convolution layer, a nonlinear ReLU activation layer and a downsampling operation: the output o_{t-1} is first processed by the downsampling operation, then input into the convolution layer together with h_{t-1}, i_{t-1}^LR and i_t^LR, and the shallow features F_t are finally output through the nonlinear ReLU activation layer;
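A minimal PyTorch sketch of equation (5), assuming 3-channel LR frames, 128 hidden channels and bilinear downsampling for Down(·) (the patent does not fix the downsampling method); the class name is illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowFeatureExtractor(nn.Module):
    def __init__(self, lr_ch: int = 3, hid_ch: int = 128, r: int = 4):
        super().__init__()
        self.r = r
        in_ch = hid_ch + lr_ch + 2 * lr_ch       # h_{t-1}, Down(o_{t-1}), two LR frames
        self.conv = nn.Conv2d(in_ch, hid_ch, 3, padding=1)

    def forward(self, h_prev, o_prev, lr_prev, lr_cur):
        o_down = F.interpolate(o_prev, scale_factor=1 / self.r,
                               mode="bilinear", align_corners=False)  # Down(o_{t-1})
        x = torch.cat([h_prev, o_down, lr_prev, lr_cur], dim=1)       # [.,.] in eq. (5)
        return F.relu(self.conv(x))                                   # F_t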
Fourthly, depth detail information is extracted and integrated through the multi-dense residual block:
The shallow features F_t obtained in the third step are subjected to depth detail information extraction and integration through the multi-dense residual block. Each multi-dense residual block comprises p dense residual blocks (RDBs) connected in series (8 to 12 dense residual blocks are used in the experiments). Each dense residual block operates as follows: first the shallow features F_t are input into a two-dimensional convolutional neural network with a 3 × 3 kernel, and the features output by the two-dimensional convolution are activated by a ReLU function to obtain the intermediate features F_t^1; then the intermediate features F_t^1 and the shallow features F_t are concatenated along the channel dimension and input into the same convolution-plus-activation structure to obtain the intermediate features F_t^2; and so on, until the intermediate features F_t^1, …, F_t^{c-1} and the shallow features F_t, concatenated along the channel dimension, are input into the same convolution-plus-activation structure to obtain the intermediate features F_t^c (c denotes the number of convolution layers within one dense residual block). The intermediate features F_t^1, …, F_t^c and the shallow features F_t are concatenated along the channel dimension and input into a feature integration layer to obtain a feature integration mapping, which is added to the shallow features originally input into the dense residual block to obtain the output O_t^1 of the first dense residual block, see equation (6):
O_t^1 = H_ff([F_t, F_t^1, …, F_t^c]) + F_t   (6)
in equation (6), H_ff(·) is the feature integration layer, which consists of a 1 × 1 two-dimensional convolutional neural network and only constrains the channels;
O_t^1 is input into the second dense residual block to obtain the output O_t^2 of the second dense residual block; by analogy, the final output O_t^p of the multi-dense residual block is obtained, see equation (7):
O_t^p = RDB_p(RDB_{p-1}(…RDB_1(F_t)…))   (7)
in equation (7), RDB_1, RDB_2, …, RDB_p are dense residual blocks with the same structure but unshared parameters;
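A minimal PyTorch sketch of one dense residual block and the p-block chain of equations (6) and (7); the 128-channel width and 32-channel intermediate growth follow FIG. 5, while c = 4 convolutions per block is an assumption (the patent does not fix c).

import torch
import torch.nn as nn

class RDB(nn.Module):
    def __init__(self, ch: int = 128, growth: int = 32, num_convs: int = 4):
        super().__init__()
        self.convs = nn.ModuleList()
        in_ch = ch
        for _ in range(num_convs):
            # each step: channel concatenation followed by 3x3 conv + ReLU
            self.convs.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, padding=1), nn.ReLU(inplace=True)))
            in_ch += growth
        self.fuse = nn.Conv2d(in_ch, ch, 1)       # H_ff: 1x1 feature integration layer

    def forward(self, f_t):
        feats = [f_t]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))   # F_t^1 ... F_t^c
        return self.fuse(torch.cat(feats, dim=1)) + f_t   # eq. (6)

# Equation (7): p serially connected RDBs with unshared parameters
# (p = 10 in the embodiment): multi_rdb = nn.Sequential(*[RDB() for _ in range(10)])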
the fifth step, hiding state htObtaining:
the output of the multi-density residual block obtained in the fourth step
Figure BDA00031445754000000415
Inputting the result into a hidden state generation module to obtain the hidden state output of the recurrent neural network of the t ordinal number, which is shown in formula (8):
Figure BDA00031445754000000416
h in the formula (8)hg() is a hidden state generating module, which is composed of a 3 × 3 two-dimensional convolution neural network and a ReLU activation function, and the convolution is used for outputting hidden states;
And sixthly, the final super-resolution image is obtained through the SR reconstruction module:
The output O_t^p of the multi-dense residual block obtained in the fourth step is simultaneously input into the SR reconstruction module to obtain the final super-resolution image. The process is as follows: the output O_t^p of the multi-dense residual block undergoes channel feature reduction and blending through a convolution layer and is then fed into a sub-pixel convolution layer, which rearranges all pixels of the H × W × r^2 c feature map into a super-resolution residual image of size rH × rW × c, where H and W both take the value 64, r is the sampling factor and c is the number of color channels; the super-resolution residual image is added to the low-resolution LR image i_t^LR upsampled by a factor of r to obtain the final output result o_t, i.e. the reconstructed super-resolution image i_t^SR, see equation (9):
i_t^SR = o_t = H_spc(Conv(O_t^p)) + H_us(i_t^LR)   (9)
the convolution layer in equation (9) uses a 3 × 3 convolution kernel, H_spc(·) is the sub-pixel convolution operation, and H_us(·) is the r-times bilinear upsampling operation;
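A minimal PyTorch sketch of the SR reconstruction module of equation (9), assuming the embodiment values c = 3 and r = 4; PixelShuffle implements the sub-pixel rearrangement from H × W × r^2 c to rH × rW × c, and the class name is illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SRReconstruction(nn.Module):
    def __init__(self, hid_ch: int = 128, c: int = 3, r: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(hid_ch, c * r * r, 3, padding=1)  # channel reduction/blending
        self.spc = nn.PixelShuffle(r)                           # sub-pixel convolution layer
        self.r = r

    def forward(self, o_tp, lr_cur):
        residual = self.spc(self.conv(o_tp))                    # SR residual image
        up = F.interpolate(lr_cur, scale_factor=self.r,
                           mode="bilinear", align_corners=False)  # H_us
        return residual + up                                    # o_t = i_t^SR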
Thus the outputs h_t, o_t and the super-resolution image i_t^SR of the t-th recurrent step are obtained. If t < n at this point, return to the third step to run the (t+1)-th recurrent step; if t = n, the super-resolution computation of all n frames is complete and the super-resolution frame sequence I^SR = {i_1^SR, …, i_n^SR} is obtained;
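Putting the third to sixth steps together, a minimal sketch of the recurrence over t = 1, …, n, using the illustrative modules sketched above; hidden_gen is assumed to be the 3 × 3 convolution of equation (8), followed here by the ReLU.

import torch
import torch.nn as nn
import torch.nn.functional as F

def run_recurrence(lr_frames, h0, o0, sfe, multi_rdb, hidden_gen, sr_rec):
    """lr_frames: list of n LR frames; returns the super-resolution sequence I^SR."""
    h_prev, o_prev, lr_prev = h0, o0, lr_frames[0]      # initialization for t = 1
    sr_frames = []
    for lr_cur in lr_frames:                            # t = 1 .. n, shared parameters
        f_t = sfe(h_prev, o_prev, lr_prev, lr_cur)      # eq. (5)
        o_tp = multi_rdb(f_t)                           # eqs. (6)-(7)
        h_t = F.relu(hidden_gen(o_tp))                  # eq. (8): 3x3 conv + ReLU
        o_t = sr_rec(o_tp, lr_cur)                      # eq. (9)
        sr_frames.append(o_t)
        h_prev, o_prev, lr_prev = h_t, o_t, lr_cur
    return sr_frames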
Seventhly, the loss for the input video frames is calculated:
The recurrent neural network model based on information construction and multi-dense residual blocks is built through the first to sixth steps; the first super-resolution frame is computed, and the subsequent super-resolution frames are obtained by looping through the third to sixth steps. The difference between the obtained final super-resolution frame sequence I^SR and the high-resolution frame sequence I^HR obtained in the first step is measured; the L1 loss function is adopted during training:
L = (1/n) · Σ_{t=1}^n || i_t^SR − i_t^HR ||_1   (10)
in equation (10), L is the value of the computed L1 loss function;
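A minimal sketch of the training loss of equation (10), assuming PyTorch's mean-reduced L1Loss applied over the stacked frame sequences; the helper name is illustrative.

import torch
import torch.nn as nn

l1 = nn.L1Loss()

def sequence_loss(sr_frames, hr_frames):
    """sr_frames, hr_frames: lists of n tensors of shape (b, c, rH, rW)."""
    return l1(torch.stack(sr_frames), torch.stack(hr_frames))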
And the eighth step, the super-resolution image results are combined into a video:
According to the original video frame rate, the super-resolved frame sequence I^SR is synthesized into a video at the corresponding frame rate, completing the process of circular video super-resolution processing based on information construction and the multi-dense residual block.
The sampling factor r is 4 and the number of color channels c is 3; m is 5 to 7, and p is 8 to 12.
Compared with the prior art, the invention has the beneficial effects that:
the outstanding substantive features of the invention are as follows:
(1) The method of the invention proposes a novel module, called the information construction module. By extracting information from the first few frames it enables the early frames to be reconstructed, so no backward pass over all frames is needed; this greatly reduces computational complexity and time overhead while strengthening the reconstruction of the early frames. Moreover, it is a transferable component that can be applied to any unidirectional recurrent neural network to improve its reconstruction quality. Extensive experiments show that the circular video super-resolution method based on information construction and multi-density residual block proposed by the invention surpasses all current models in reconstruction quality, verifying the effectiveness of the module.
(2) The method of the invention provides an improved recurrent neural network. The multi-dense residual block is embedded into the recurrent neural network; because it extracts depth detail information and transmits it between frames implicitly, the accuracy of video super-resolution is greatly improved. The multi-dense residual block performs deep feature extraction and fusion in real time, realizes feature reuse, and maintains a high signal-to-noise ratio: the peak signal-to-noise ratio on the Vid4 benchmark test set exceeds 28 dB.
The invention has the remarkable advantages that:
(1) Compared with the CN111587447A method, the method of the invention uses an improved recurrent neural network to propagate a high-dimensional implicit motion state. Its outstanding substantive feature and remarkable progress is that no optical flow needs to be computed and no explicit motion estimation or motion compensation is performed, so the super-resolution effect and accuracy are better and faster than those of the optical-flow method.
(2) Compared with the CN109102462A method, the method of the invention uses the multi-dense residual block to extract and integrate detail information. Its outstanding substantive feature and remarkable progress is that no large number of 3D convolution blocks is needed to integrate and extract spatio-temporal information; instead, a method based on multiple dense residual blocks extracts and integrates the detail information, greatly reducing computational complexity. The method adopts an improved unidirectional recurrent neural network, is end-to-end, and has a low memory occupancy rate.
(3) Compared with the CN111260560A method, the method of the invention uses a recurrent neural network to integrate inter-frame information. Its outstanding substantive feature and remarkable progress is that no large number of 3D convolutions and deformable convolutions is needed for implicit alignment; instead, a unidirectional recurrent neural network transfers and integrates the inter-frame information, greatly reducing time and space complexity. Dense residual blocks are used as the modules for feature extraction and reconstruction, so thanks to feature reuse the super-resolution effect and efficiency are better than with plain residual blocks.
(4) The method uses dense residual blocks to extract and fuse detail information, making full use of the hidden-layer feature maps; the reuse of features raises the value of the generated features, so depth detail information is restored better, and in particular no annoying artifacts appear for objects with a large motion range in the video, giving a better super-resolution effect. Compared with RLSP, the vanishing-gradient problem is avoided while the super-resolution accuracy is greatly improved. In addition, RLSP reconstructs the early frames poorly because it does no extra processing of the early-frame information, whereas the invention uses the information construction module to fully extract the detail and structure information of the early frames, greatly improving their reconstruction.
drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a block diagram showing the flow of the super-resolution method of the circular video based on information construction and multi-density residual block.
Fig. 2 is a schematic structural diagram of a circular video super-resolution method based on information construction and multi-density residual block in the present invention.
Fig. 3 is a schematic structural diagram of an information construction module in the present invention.
FIG. 4 is a schematic diagram of a recurrent neural network structure of multiple dense residual blocks in the present invention.
Fig. 5 is a schematic structural diagram of one dense residual block in multiple dense residual blocks in the present invention.
Detailed Description
The embodiment shown in fig. 1 shows that the process of the method for super-resolution of a circular video based on information construction and multi-density residual block according to the present invention is as follows:
obtaining video → preprocessing video data → constructing information by using partial low-resolution frame sequence → extracting shallow layer feature in a multi-density residual cyclic neural network → extracting and integrating depth detail information by using a multi-density residual block → obtaining hidden state → obtaining final super-resolution image by using an SR reconstruction module → calculating loss of input video frame → combining super-resolution image results into video → completing the cyclic video super-resolution method based on the multi-density residual block.
The embodiment shown in fig. 2 shows the circular video super-resolution method based on information construction and multiple dense residual blocks, where the number of hidden state channels is 128, the number of output channels of the hidden layer is 48, the number of input and output channels is 3, and m, the number of input frames, is initialized to 7.
The embodiment shown in fig. 3 shows a schematic structural diagram of an information construction module, where m is an input frame number and is set to 7, 5 serially connected residual blocks are used in a multi-residual block in a fast information architecture adjustment block, and K is a shallow feature screening set. The information construction module comprises a feature extraction and channel attention, a rapid information architecture adjustment block and a structure simulation block, and input 1-m frames output input hidden states required by the initial recurrent neural network through the feature extraction and channel attention, the rapid information architecture adjustment and the structure simulation.
The embodiment shown in FIG. 4 is a schematic diagram of the recurrent neural network structure of the multi-dense residual block, i.e. the recurrent neural network framework of the multi-dense residual block in fig. 2; F_t denotes the shallow features and O_t^p is the final output of the multi-dense residual block.
Fig. 5 is a schematic structural diagram of one dense residual block among the multiple dense residual blocks of the invention, wherein the channel sizes of the input F_t and of the block output are all 128; to ensure the time efficiency of the invention, the number of channels of each intermediate feature F_t^j is taken as 32. All convolutions except the first one are channel-concatenation-plus-convolution operations.
Example 1
The method for super-resolution of the circular video based on the information construction and the multi-density residual block comprises the following steps:
Firstly, the video data is preprocessed:
the video data is randomly rotated or flipped using a random function and a user-defined threshold, and frames are extracted from the rotated or flipped video data to obtain the video frame sequence V = (v_1, …, v_n), where n is the total number of frames of the video and v_1, v_n are the first and last frames of V, respectively; all video frames are average-cropped so that their width and height are unified to 256 × 256 pixels, obtaining the high-resolution frame sequence I^HR = {i_1^HR, …, i_n^HR} corresponding to the video frame sequence V;
the obtained high-resolution frame sequence I^HR is then subjected to Gaussian blur with variance σ = 1.6 and blur radius 3, and to downsampling with sampling factor 4, obtaining the corresponding low-resolution frame sequence I^LR = {i_1^LR, …, i_n^LR}, as in the following equation (1):
I^LR = BlurDown(AverageCrop(V))   (1)
in equation (1), BlurDown(·) is Gaussian-blur downsampling and AverageCrop(·) is average cropping;
And secondly, information construction is performed using part of the low-resolution frame sequence:
the process of feature extraction and channel attention is as follows: first, for each of the first m frames (1 ≤ m ≤ n; m = 7 in this embodiment) of the initial n-frame low-resolution frame sequence I^LR obtained in the first step, features are extracted by convolution to obtain the shallow features f_1, f_2, …, f_m;
the shallow features are then superimposed along the channel dimension and input into the channel attention module (SE), whose operation is divided into two steps, compression and excitation: compression performs global average pooling on the given feature map (the superimposed shallow features [f_1, …, f_m]) to obtain the global compressed feature quantity of the current feature map; excitation passes it through a two-layer fully connected bottleneck structure with a ReLU activation function in between to obtain a weight for each channel of the feature map; the feature map is then weighted by these weights to output the shallow feature screening set K, as in equation (2):
K = SE([f_1, f_2, …, f_m])   (2)
in equation (2), [·,·] denotes the superposition operation along the channel dimension and SE(·) is the channel attention operation;
Then the obtained shallow feature screening set K passes through the fast information architecture adjustment block to obtain deep information features, and the structure simulation block constructs from them the initial hidden information (h_0) and the initial pre-output information (o_0), wherein:
the specific process of the fast information architecture adjustment block is as follows: the shallow feature screening set K passes through a convolution and ReLU activation function and is then input into serially stacked residual blocks for deep feature extraction, outputting deep information features; each residual block consists of two convolutions and a skip connection.
The specific operation of the structure simulation block is as follows: the deep information features output by the fast information architecture adjustment block are processed by a convolution layer and a ReLU activation function to obtain the initial hidden information; meanwhile, the deep information features are reduced in feature dimension by another convolution layer, fed into a sub-pixel convolution layer, and added to the low-resolution initial image upsampled by a factor of r to obtain the initial pre-output information;
h_0 is expressed by equation (3):
h_0 = ReLU(Conv(FIA(K)))   (3)
o_0 is expressed by equation (4):
o_0 = H_spc(Conv(FIA(K))) + H_us(i_1^LR)   (4)
in equations (3) and (4), H_spc(·) is the sub-pixel convolution operation, FIA(·) is the fast information architecture adjustment block, + denotes pixel-wise addition on the same channel, H_us(·) is the r-times bilinear upsampling operation (other upsampling methods may also be used), Conv is a convolution operation (the convolution kernels in equations (3) and (4) have the same size but different parameters), ReLU is the activation function operation, and i_1^LR is the first frame of the low-resolution frame sequence I^LR;
Thirdly, shallow feature extraction in the recurrent neural network of the multi-dense residual block:
Because the ordinal of the recurrent step corresponds one-to-one with the index of the low-resolution frame to be super-resolved, t is defined as the ordinal of the recurrent step or the index of the low-resolution frame to be super-resolved; the parameters of the recurrent steps corresponding to different ordinals are shared, and t traverses from 1 to n in turn, 1 ≤ t ≤ n. In the t-th recurrent step, the t-th and (t-1)-th frames i_t^LR and i_{t-1}^LR of the low-resolution frame sequence I^LR obtained in the first step, the hidden state h_{t-1} output by the (t-1)-th recurrent step, and the output result o_{t-1} downsampled by a factor of r are concatenated along the channel dimension; when t = 1, h_{t-1}, o_{t-1} and i_{t-1}^LR are initialized, respectively, to the h_0 and o_0 computed in the second step and the first frame i_1^LR of the low-resolution frame sequence I^LR from the first step, i.e. initially let i_0^LR = i_1^LR. The concatenated sequence is input into the shallow feature extraction module of the recurrent neural network of the multi-dense residual block to obtain the shallow features F_t; the specific process is equation (5):
F_t = H_sfe([h_{t-1}, Down(o_{t-1}), i_{t-1}^LR, i_t^LR])   (5)
in equation (5), F_t are the shallow features, Down(·) is the downsampling operation, [·,·] denotes concatenation along the channel dimension, and H_sfe(·) is the shallow feature extraction module for feature extraction, comprising a convolution layer, a nonlinear ReLU activation layer and a downsampling operation: the output o_{t-1} is first processed by the downsampling operation, then input into the convolution layer together with h_{t-1}, i_{t-1}^LR and i_t^LR, and the shallow features F_t are finally output through the nonlinear ReLU activation layer;
Fourthly, depth detail information is extracted and integrated through the multi-dense residual block:
The shallow features F_t obtained in the third step are subjected to depth detail information extraction and integration through the multi-dense residual block. Each multi-dense residual block comprises p dense residual blocks (RDBs) connected in series. Each dense residual block operates as follows: first the shallow features F_t are input into a two-dimensional convolutional neural network with a 3 × 3 kernel, and the features output by the two-dimensional convolution are activated by a ReLU function to obtain the intermediate features F_t^1; then the intermediate features F_t^1 and the shallow features F_t are concatenated along the channel dimension and input into the same convolution-plus-activation structure to obtain the intermediate features F_t^2; and so on, until the intermediate features F_t^1, …, F_t^{c-1} and the shallow features F_t, concatenated along the channel dimension, are input into the same convolution-plus-activation structure to obtain the intermediate features F_t^c. The intermediate features F_t^1, …, F_t^c and the shallow features F_t are concatenated along the channel dimension and input into a feature integration layer to obtain a feature integration mapping, which is added to the shallow features originally input into the dense residual block to obtain the output O_t^1 of the first dense residual block, see equation (6):
O_t^1 = H_ff([F_t, F_t^1, …, F_t^c]) + F_t   (6)
in equation (6), H_ff(·) is the feature integration layer, which consists of a 1 × 1 two-dimensional convolutional neural network and only constrains the channels;
O_t^1 is input into the second dense residual block to obtain the output O_t^2 of the second dense residual block; by analogy, the final output O_t^p of the multi-dense residual block is obtained, see equation (7):
O_t^p = RDB_p(RDB_{p-1}(…RDB_1(F_t)…))   (7)
in equation (7), RDB_1, RDB_2, …, RDB_p are dense residual blocks with the same structure but unshared parameters; in this embodiment, 10 dense residual blocks are provided in total, i.e. p = 10.
The fifth step, obtaining the hidden state h_t:
The output O_t^p of the multi-dense residual block obtained in the fourth step is input into the hidden state generation module to obtain the hidden state output of the t-th recurrent step, see equation (8):
h_t = H_hg(O_t^p)   (8)
in equation (8), H_hg(·) is the hidden state generation module, which consists of a 3 × 3 two-dimensional convolutional neural network and a ReLU activation function; the convolution is used to output the hidden state;
And sixthly, the final super-resolution image is obtained through the SR reconstruction module:
The output O_t^p of the multi-dense residual block obtained in the fourth step is simultaneously input into the SR reconstruction module to obtain the final super-resolution image. The process is as follows: the output O_t^p of the multi-dense residual block undergoes channel feature reduction and blending through a convolution layer and is then fed into a sub-pixel convolution layer, which rearranges all pixels of the H × W × r^2 c feature map into a super-resolution residual image of size rH × rW × c, where H and W both take the value 64, r is the sampling factor and c is the number of color channels; the super-resolution residual image is added to the low-resolution LR image i_t^LR upsampled by a factor of 4 to obtain the final output result o_t, i.e. the reconstructed super-resolution image i_t^SR, see equation (9):
i_t^SR = o_t = H_spc(Conv(O_t^p)) + H_us(i_t^LR)   (9)
the convolution layer in equation (9) uses a 3 × 3 convolution kernel, H_spc(·) is the sub-pixel convolution operation, and H_us(·) is the r-times bilinear upsampling operation;
Thus the outputs h_t, o_t and the super-resolution image i_t^SR of the t-th recurrent step are obtained. If t < n at this point, return to the third step to run the (t+1)-th recurrent step; if t = n, the super-resolution computation of all n frames is complete and the super-resolution frame sequence I^SR = {i_1^SR, …, i_n^SR} is obtained;
Seventhly, the loss for the input video frames is calculated:
The recurrent neural network model based on information construction and multi-dense residual blocks is built through the first to sixth steps; the first super-resolution frame is computed, and the subsequent super-resolution frames are obtained by looping through the third to sixth steps. The difference between the obtained final super-resolution frame sequence I^SR and the high-resolution frame sequence I^HR obtained in the first step is measured; the L1 loss function is adopted during training:
L = (1/n) · Σ_{t=1}^n || i_t^SR − i_t^HR ||_1   (10)
in equation (10), L is the value of the computed L1 loss function;
And the eighth step, the super-resolution image results are combined into a video:
According to the original video frame rate, the super-resolved frame sequence I^SR is synthesized into a video at the corresponding frame rate, completing the process of circular video super-resolution processing based on information construction and the multi-dense residual block.
In the circular video super-resolution method based on information construction and multiple dense residual blocks, RDB is the abbreviation of Residual Dense Block, SE (channel attention) is the abbreviation of Squeeze-and-Excitation, and ReLU (linear rectification function) is the abbreviation of Rectified Linear Unit; BlurDown downsampling and the like are all well known in the technical field.
Anything not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (5)

1. A circular video super-resolution method based on information construction and multi-density residual block, characterized in that: an introduced information construction module constructs and fills prior information for the initial recurrent neural network, so that the early recurrent steps have enough information to reconstruct the early frames, while the multi-dense residual block extracts features representing deep information and propagates them through the recurrent network in the hidden state; the method comprises the following steps:
preprocessing the video data to obtain the corresponding low-resolution frame sequence I^LR = {i_1^LR, …, i_n^LR};
constructing an information construction module: the information construction module comprises feature extraction and channel attention, a fast information architecture adjustment block and a structure simulation block; the input is the first m frames of the low-resolution frame sequence, and the outputs are the initial hidden information h_0 and the initial pre-output information o_0;
the process of feature extraction and channel attention is as follows: first, for each of the first m frames of the low-resolution frame sequence I^LR (1 ≤ m ≤ n), features are extracted by convolution to obtain the shallow features f_1, …, f_m; the shallow features are then superimposed along the channel dimension and input into the channel attention module SE to obtain the shallow feature screening set K;
the specific structure of the fast information architecture adjustment block is as follows: the shallow feature screening set K passes through a convolution and ReLU activation function and is then input into serially stacked residual blocks for deep feature extraction, outputting deep information features; each residual block consists of two convolutions and a skip connection;
the specific operation of the structure simulation block is as follows: the deep information features output by the fast information architecture adjustment block are processed by a convolution layer and a ReLU activation function to obtain the initial hidden information; meanwhile, the deep information features are reduced in feature dimension by another convolution layer, fed into a sub-pixel convolution layer, and added to the low-resolution initial image upsampled by a factor of r to obtain the initial pre-output information;
constructing the recurrent neural network of the multi-dense residual block, into which the two outputs of the information construction module are input; through t loops, the t-th recurrent step outputs h_t, o_t and the super-resolution image i_t^SR; if t < n at this point, the (t+1)-th recurrent step proceeds; if t = n, the super-resolution computation of all n frames is complete and the super-resolution frame sequence I^SR = {i_1^SR, …, i_n^SR} is obtained.
2. The circular video super-resolution method based on information construction and multi-density residual block as claimed in claim 1, wherein the operation of the channel attention module SE is divided into two steps, compression and excitation: compression performs global average pooling on the given shallow features [f_1, …, f_m] to obtain the global compressed feature quantity of the current feature map; excitation passes it through a two-layer fully connected bottleneck structure with a ReLU activation function in between to obtain a weight for each channel of the feature map; the feature map is then weighted by these weights to output the shallow feature screening set K, expressed by equation (2):
K = SE([f_1, f_2, …, f_m])   (2)
in equation (2), [·,·] denotes the superposition operation along the channel dimension and SE(·) is the channel attention operation.
3. The circular video super-resolution method based on information construction and multi-density residual block as claimed in claim 1, wherein the recurrent neural network of the multi-dense residual block comprises a shallow feature extraction module, a multi-dense residual block, a hidden state generation module and an SR reconstruction module,
the multi-dense residual block is used to extract and integrate depth detail information from the shallow output F_t of the shallow feature extraction module; each multi-dense residual block comprises p dense residual blocks, p being an integer not less than 2, and each dense residual block operates as follows: first the shallow features F_t are input into a two-dimensional convolutional neural network with a 3 × 3 kernel, and the features output by the two-dimensional convolution are activated by a ReLU function to obtain the intermediate features F_t^1 (a convolution-plus-activation structure is formed by a 3 × 3 two-dimensional convolutional neural network and a ReLU activation function); then the intermediate features F_t^1 and the shallow features F_t are concatenated along the channel dimension and input into the same convolution-plus-activation structure to obtain the intermediate features F_t^2; and so on, until the intermediate features F_t^1, …, F_t^{c-1} and the shallow features F_t, concatenated along the channel dimension, are input into the same convolution-plus-activation structure to obtain the intermediate features F_t^c; the intermediate features F_t^1, …, F_t^c and the shallow features F_t are concatenated along the channel dimension and input into a feature integration layer to obtain a feature integration mapping, which is added to the shallow features originally input into the dense residual block to obtain the output O_t^1 of the first dense residual block;
O_t^1 is taken as the input of the second dense residual block, which, operating as above, gives the output O_t^2 of the second dense residual block; by analogy, the final output O_t^p of the multi-dense residual block is obtained;
the final output O_t^p of the multi-dense residual block is input into the hidden state generation module to obtain the hidden state output h_t of the t-th recurrent step, and simultaneously the final output O_t^p of the multi-dense residual block is input into the SR reconstruction module to obtain the final super-resolution image i_t^SR;
the feature integration layer consists of a 1 × 1 two-dimensional convolutional neural network; the hidden state generation module consists of a 3 × 3 two-dimensional convolutional neural network and a ReLU activation function.
4. A circular video super-resolution method based on information construction and multi-density residual block comprises the following steps:
firstly, video data is preprocessed:
the video data is rotated or flipped using a random function and a self-defined threshold, and frames are extracted from the rotated or flipped video data to obtain a video frame sequence V = (v_1, …, v_n), where n is the total number of frames of the video and v_1 and v_n are the first and last frames of V, respectively; all video frames are subjected to average cropping so that their width and height are unified to 256 × 256 pixels, obtaining the high-resolution frame sequence I^{HR} = (I_1^{HR}, …, I_n^{HR}) corresponding to the video frame sequence V;
then the obtained high-resolution frame sequence I^{HR} is subjected to Gaussian blur processing with variance σ = 1.6 and blur radius 3, followed by downsampling with sampling multiple r, obtaining the corresponding low-resolution frame sequence I^{LR} = (I_1^{LR}, …, I_n^{LR});
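The degradation in this step can be sketched with OpenCV; a 7 × 7 kernel for blur radius 3 and bicubic interpolation for the r-times downsampling are assumptions, since the claim fixes only σ = 1.6, the radius, and the factor r.

```python
import cv2

def degrade(hr_frame, r=4):
    # Gaussian blur: sigma = 1.6; blur radius 3 -> 7x7 kernel (assumption)
    blurred = cv2.GaussianBlur(hr_frame, ksize=(7, 7), sigmaX=1.6)
    h, w = blurred.shape[:2]
    # downsample by the sampling multiple r (bicubic is our assumption)
    return cv2.resize(blurred, (w // r, h // r),
                      interpolation=cv2.INTER_CUBIC)
```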
secondly, information construction is carried out using a partial low-resolution frame sequence:
the process of feature extraction and channel attention is as follows: first, the first m frames of the initial n-frame low-resolution frame sequence I^{LR} obtained in the first step are taken, where 1 ≤ m ≤ n, and each frame is subjected to convolution processing to extract features, obtaining one shallow feature map per frame; the m shallow feature maps are then concatenated in the channel dimension and input into a channel attention module SE, whose operation is divided into two steps, squeeze and excitation: the squeeze step performs global average pooling on the given feature map, i.e. the concatenated shallow features, to obtain the global compressed feature quantity of the current feature map; the excitation step passes it through a two-layer fully-connected bottleneck structure with a ReLU activation function in the middle to obtain a weight for each channel of the feature map; weighting the feature map by these weights outputs the shallow feature screening set K;
then, the obtained shallow feature screening set K passes through a fast information architecture (FIA) adjusting block to obtain deep information features; the deep information features are used to construct the output information through a structure simulation block, which outputs the initial hidden information h_0 and the initial pre-output information o_0, wherein:
the specific operation of the fast information architecture adjusting block is as follows: the shallow feature screening set K is passed through a convolution and a ReLU activation function and then input into serially stacked residual blocks for deep feature extraction, outputting the deep information features, where each residual block consists of two convolutions and a skip connection;
the specific operation of the structure simulation block is as follows: the deep information features output by the fast information architecture adjusting block are processed by a convolution layer and a ReLU activation function to obtain the initial hidden information; meanwhile, the output deep information features undergo feature dimensionality reduction through another convolution layer and are fed into a sub-pixel convolution layer, whose output is added to the low-resolution initial image upsampled by a factor of r to obtain the initial pre-output information;
h_0 is expressed by formula (3):
h_0 = ReLU(Conv(FIA(K)))   (3)
o_0 is expressed by formula (4):
o_0 = H_spc(Conv(FIA(K))) + H_us(I_1^{LR})   (4)
in formulas (3) and (4), H_spc(·) is the sub-pixel convolution operation, FIA(·) is the fast information architecture adjusting block, + represents pixel-wise addition on the same channel, H_us(·) is the r-times bilinear upsampling operation, and Conv is the convolution operation, where the convolution kernels in formulas (3) and (4) have the same size but different parameters; ReLU is the activation function operation, and I_1^{LR} is the first frame of the low-resolution frame sequence I^{LR};
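Formulas (3) and (4) map directly onto a small module; in this sketch the channel width, the PixelShuffle realization of the sub-pixel convolution, and all names are illustrative assumptions, and fia_out stands for FIA(K):

```python
import torch.nn as nn
import torch.nn.functional as F

class StructureSimulation(nn.Module):
    def __init__(self, channels=64, colors=3, r=4):
        super().__init__()
        self.r = r
        self.hidden_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.reduce = nn.Conv2d(channels, colors * r * r, 3, padding=1)
        self.shuffle = nn.PixelShuffle(r)  # sub-pixel convolution H_spc

    def forward(self, fia_out, i1_lr):
        h0 = F.relu(self.hidden_conv(fia_out))                   # formula (3)
        up = F.interpolate(i1_lr, scale_factor=self.r,
                           mode='bilinear', align_corners=False)  # H_us
        o0 = self.shuffle(self.reduce(fia_out)) + up             # formula (4)
        return h0, o0
```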
thirdly, shallow feature extraction in the recurrent neural network of the multi-dense residual blocks:
since the ordinal number of the recurrent neural network corresponds one-to-one to the subscript of the low-resolution frame to be super-resolved, t is uniformly defined as the ordinal number of the recurrent neural network or the subscript of the low-resolution frame to be super-resolved, and the parameters of the recurrent neural networks corresponding to different ordinal numbers are shared; t traverses from 1 to n in turn, with 1 ≤ t ≤ n; in the recurrent neural network of the t-th ordinal number, the t-th and (t−1)-th frames I_t^{LR} and I_{t−1}^{LR} of the low-resolution frame sequence I^{LR} obtained in the first step, the hidden state h_{t−1} output by the (t−1)-th recurrent neural network, and the output result o_{t−1} downsampled by a factor of r are connected in series in the channel dimension; when t = 1, h_{t−1}, o_{t−1} and I_{t−1}^{LR} are initialized, respectively, to h_0 and o_0 computed in the second step and to the frame I_1^{LR} of the low-resolution frame sequence I^{LR} from the first step;
the series-connected sequence is input into the shallow feature extraction module of the recurrent neural network of the multi-dense residual blocks to obtain the shallow feature F_t; the specific process is formula (5):
F_t = H_sfe([I_{t−1}^{LR}, I_t^{LR}, h_{t−1}, Down(o_{t−1})])   (5)
in formula (5), F_t is the shallow feature, Down(·) is the downsampling operation, [·] denotes series connection in the channel dimension, and H_sfe(·) is the shallow feature extraction module, used for feature extraction and comprising a convolution layer and a nonlinear ReLU activation layer; after the output o_{t−1} is downsampled, it is input into the convolution layer together with h_{t−1}, I_{t−1}^{LR} and I_t^{LR}, and the shallow feature F_t is finally obtained through the nonlinear ReLU activation layer;
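A sketch of formula (5), assuming a 64-channel hidden state and bilinear interpolation for Down(·) (the claim says only "downsampling"); all names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowFeatureExtraction(nn.Module):
    def __init__(self, colors=3, hidden=64, r=4):
        super().__init__()
        self.r = r
        # inputs: I_{t-1}^LR, I_t^LR, h_{t-1}, Down(o_{t-1})
        in_ch = colors * 3 + hidden
        self.conv = nn.Conv2d(in_ch, hidden, 3, padding=1)

    def forward(self, lr_prev, lr_cur, h_prev, o_prev):
        o_down = F.interpolate(o_prev, scale_factor=1 / self.r,
                               mode='bilinear', align_corners=False)
        x = torch.cat([lr_prev, lr_cur, h_prev, o_down], dim=1)
        return F.relu(self.conv(x))  # shallow feature F_t
```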
fourthly, depth detail information is extracted and integrated through the multi-dense residual blocks:
the shallow feature F_t obtained in the third step is subjected to depth detail information extraction and integration through the multi-dense residual blocks, where each multi-dense residual block comprises p dense residual blocks connected in series in turn, and the specific operation of each dense residual block is as follows: first, the shallow feature F_t is input into a two-dimensional convolutional neural network with a 3 × 3 convolution kernel, and the features output by the two-dimensional convolution are activated by a ReLU activation function to obtain the intermediate feature F_t^1; then the intermediate feature F_t^1 and the shallow feature F_t are connected in series in the channel dimension and input into the same convolution-and-activation structure to obtain the intermediate feature F_t^2; and so on, until the intermediate features F_t^1, F_t^2, …, F_t^{q−1} and the shallow feature F_t, connected in series in the channel dimension, are input into the same convolution-and-activation structure to obtain the last intermediate feature F_t^q; the intermediate features F_t^1, F_t^2, …, F_t^q and the shallow feature F_t are connected in series in the channel dimension and input into a feature integration layer to obtain a feature integration mapping, and the obtained feature integration mapping is added to the shallow feature originally input into the dense residual block to obtain the output F_t^{RDB_1} of the first dense residual block, see formula (6):
F_t^{RDB_1} = H_ff([F_t, F_t^1, F_t^2, …, F_t^q]) + F_t   (6)
in formula (6), H_ff(·) is the feature integration layer, which consists of a 1 × 1 two-dimensional convolutional neural network;
F_t^{RDB_1} is input into the second dense residual block to obtain the output F_t^{RDB_2} of the second dense residual block; by analogy, the final output F_t^{RDB_p} of the multi-dense residual blocks is obtained, see formula (7):
F_t^{RDB_p} = RDB_p(RDB_{p−1}(…RDB_1(F_t)…))   (7)
in formula (7), RDB_1, RDB_2, …, RDB_p are all dense residual blocks with the same structure but unshared parameters;
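Formula (7) is a plain composition of p structurally identical blocks with unshared parameters; a sketch reusing the hypothetical DenseResidualBlock class given after claim 3, with p = 10 assumed from the middle of the range recited in claim 5:

```python
import torch.nn as nn

class MultiDenseResidualBlocks(nn.Module):
    def __init__(self, channels=64, p=10, q=3):
        super().__init__()
        # p blocks of identical structure; parameters are not shared
        self.blocks = nn.Sequential(
            *[DenseResidualBlock(channels, q) for _ in range(p)])

    def forward(self, f_t):
        return self.blocks(f_t)  # F_t^{RDB_p}, formula (7)
```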
fifthly, the hidden state h_t is obtained:
the output F_t^{RDB_p} of the multi-dense residual blocks obtained in the fourth step is input into the hidden state generation module to obtain the hidden state output of the recurrent neural network of the t-th ordinal number, see formula (8):
h_t = H_hg(F_t^{RDB_p})   (8)
in formula (8), H_hg(·) is the hidden state generation module, which consists of a 3 × 3 two-dimensional convolutional neural network and a ReLU activation function, the convolution being used for outputting the hidden state;
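Formula (8) reduces to a single 3 × 3 convolution followed by a ReLU; a minimal sketch with an assumed 64-channel width:

```python
import torch.nn as nn

# H_hg in formula (8): 3x3 convolution then ReLU
hidden_state_gen = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True))
# usage: h_t = hidden_state_gen(f_rdb_p)
```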
sixthly, the final super-resolution image is obtained through the SR reconstruction module:
at the same time, the output F_t^{RDB_p} of the multi-dense residual blocks obtained in the fourth step is input into the SR reconstruction module to obtain the final super-resolution image; the process is as follows: the output F_t^{RDB_p} of the multi-dense residual blocks undergoes channel feature reduction and blending through a convolution layer and is then fed into a sub-pixel convolution layer, which rearranges all pixels of the H × W × r²c feature map into a super-resolution residual image of size rH × rW × c, where H and W both take the value 64, r is the sampling multiple, and c is the number of color channels; the super-resolution residual image is added to the low-resolution LR image I_t^{LR} upsampled by a factor of r to obtain the final output result o_t, i.e. the reconstructed super-resolution image I_t^{SR}, see formula (9):
o_t = I_t^{SR} = H_spc(Conv(F_t^{RDB_p})) + H_us(I_t^{LR})   (9)
the convolution layer in formula (9) uses a convolution kernel of size 3 × 3, H_spc(·) is the sub-pixel convolution operation, and H_us(·) is the r-times bilinear upsampling operation;
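The reconstruction path of formula (9) in PyTorch: PixelShuffle performs exactly the H × W × r²c to rH × rW × c rearrangement described above, while the 64-channel input width and the names are assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class SRReconstruction(nn.Module):
    def __init__(self, channels=64, colors=3, r=4):
        super().__init__()
        self.r = r
        # channel reduction/blending conv, then sub-pixel layer H_spc
        self.reduce = nn.Conv2d(channels, colors * r * r, 3, padding=1)
        self.shuffle = nn.PixelShuffle(r)

    def forward(self, f_rdb_p, lr_cur):
        residual = self.shuffle(self.reduce(f_rdb_p))
        up = F.interpolate(lr_cur, scale_factor=self.r,
                           mode='bilinear', align_corners=False)  # H_us
        return residual + up  # o_t = I_t^SR, formula (9)
```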
thus the outputs h_t and o_t of the recurrent neural network of the t-th ordinal number and the super-resolution image I_t^{SR} are obtained; if t < n at this point, the procedure returns to the third step to operate the recurrent neural network of the (t + 1)-th ordinal number; if t = n, the super-resolution calculation of all n frames is complete, yielding the super-resolution frame sequence I^{SR} = (I_1^{SR}, …, I_n^{SR});
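The recurrence of the third to sixth steps can be sketched as a plain loop over t; the helper names and m = 5 are illustrative assumptions, with the modules standing for the sketches above:

```python
def super_resolve(lr_frames, info_construction, sfe, mdrb, hgen, rec):
    # lr_frames: list of n low-resolution tensors I_1^LR .. I_n^LR
    h, o = info_construction(lr_frames[:5])   # h_0, o_0 (m = 5 assumed)
    prev_lr = lr_frames[0]                    # I_0^LR initialized to I_1^LR
    sr_frames = []
    for lr in lr_frames:                      # t = 1 .. n
        f_t = sfe(prev_lr, lr, h, o)          # formula (5)
        f_deep = mdrb(f_t)                    # formulas (6)-(7)
        h = hgen(f_deep)                      # formula (8)
        o = rec(f_deep, lr)                   # formula (9): o_t = I_t^SR
        sr_frames.append(o)
        prev_lr = lr
    return sr_frames
```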
Seventh, the loss for the input video frame is calculated:
constructing a circular neural network model based on information construction and multiple density residual blocks through the first step to the sixth step, calculating a first frame with super-resolution, acquiring a subsequent super-resolution frame sequence through the circulation from the third step to the sixth step, and measuring the acquired final super-resolution frame sequence ISRWith the high-resolution frame sequence I obtained in the first stepHRThe difference between the two, the L1 loss function is adopted during training,
Figure FDA00034705992100000416
l in equation (10) is the value of the calculated L1 loss function;
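Formula (10) in PyTorch, assuming the frames are held as lists of tensors:

```python
import torch
import torch.nn.functional as F

def l1_video_loss(sr_frames, hr_frames):
    # L = (1/n) * sum_t || I_t^HR - I_t^SR ||_1, formula (10)
    return torch.stack(
        [F.l1_loss(sr, hr) for sr, hr in zip(sr_frames, hr_frames)]).mean()
```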
eighthly, the super-resolution image results are combined into a video:
according to the original video frame rate, the super-resolved frame sequence I^{SR} is synthesized into a video with the corresponding frame rate, completing the process of circular video super-resolution processing based on information construction and multi-density residual blocks.
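The eighth step can be sketched with OpenCV's VideoWriter; the mp4v codec is an assumption, as the claim fixes only the frame rate:

```python
import cv2

def frames_to_video(frames, path, fps):
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*'mp4v'),
                             fps, (w, h))
    for frame in frames:
        writer.write(frame)  # frames must be uint8 BGR images
    writer.release()
```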
5. The circular video super-resolution method based on information construction and multi-density residual block according to claim 4, wherein the sampling multiple r is 4 and the number of color channels c is 3; m is 5–7 and n is 8–12; the number p of dense residual blocks is 8–12.