CN109949224B - Deep learning-based cascade super-resolution reconstruction method and device - Google Patents

Info

Publication number
CN109949224B
Authority
CN
China
Prior art keywords
super
model
picture
resolution
resolution reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910143361.4A
Other languages
Chinese (zh)
Other versions
CN109949224A (en
Inventor
王玄音
马洪兵
刘刚
顾桂华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuetu Remote Sensing Technology Development Co ltd
Original Assignee
Beijing Yuetu Remote Sensing Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yuetu Remote Sensing Technology Development Co ltd
Priority to CN201910143361.4A
Publication of CN109949224A
Application granted
Publication of CN109949224B
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Abstract

The embodiment of the invention provides a cascade super-resolution reconstruction method and device based on deep learning. The super-resolution reconstruction model is obtained by training an initial construction model through machine learning. The initial construction model uses a plurality of cascade groups, each comprising a convolution layer structure formed by serially connected convolution layers and a parallel deconvolution layer structure formed by deconvolution layers connected in parallel. This design enhances the extraction of high- and low-frequency detail features and improves the reconstruction effect. Because the parallel deconvolution layer structure comprises a plurality of parallel deconvolution layers, the trained model can increase the resolution of the initial picture by different multiples, which broadens the application scenarios of the model and improves its applicability.

Description

Deep learning-based cascade super-resolution reconstruction method and device
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to a cascade super-resolution reconstruction method and device based on deep learning.
Background
Image super-resolution (SR) reconstruction is an important digital image processing technology that uses one or more low-resolution images (or motion sequences) to reconstruct a high-resolution, information-rich image through a corresponding algorithm. The technology breaks through the resolution limit of the image sensor and improves image resolution and quality without changing or upgrading the image acquisition hardware. The results facilitate visual interpretation of the earth surface, algorithmic identification of image targets, analysis of ground object types, and improvement of quantitative inversion accuracy, thereby increasing the information expression capability and utilization value of remote sensing images. In applications, super-resolution reconstruction of optical targets can provide technical support for ground target identification, interpretation of target type and number, and automatic target detection. Optical target super-resolution reconstruction can be implemented in various ways, for example in the frequency domain or in the spatial domain, and from a single frame or from multiple frames.
At present, there is also much research on deep learning-based super-resolution reconstruction, such as ESPCN, SRCNN, SRGAN, VDSR and FSRCNN, which realize image super-resolution reconstruction based on convolutional neural networks and improve the spatial resolution of image targets. Each of these methods has its own characteristics and shortcomings, but in general the training speed, reconstruction speed and processing accuracy still need improvement. Existing models struggle to balance training convergence time against processing quality, and an efficient, lightweight super-resolution reconstruction model is lacking, so they are of limited practical value in real-time super-resolution reconstruction tasks.
In the process of realizing the embodiment of the invention, the inventors found that existing models for super-resolution reconstruction have a single structure and a poor reconstruction effect, and can only increase the resolution by one specific multiple, so their applicability is low.
Disclosure of Invention
The technical problem to be solved by the invention is that existing models for super-resolution reconstruction have a single structure and a poor reconstruction effect, and can only increase the resolution by one specific multiple, so their applicability is low.
Aiming at the technical problems, the embodiment of the invention provides a cascade super-resolution reconstruction method based on deep learning, which comprises the following steps:
acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture;
inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
The embodiment of the invention provides a device for cascade super-resolution reconstruction based on deep learning, which comprises:
the acquisition module is used for acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture;
the processing module is used for inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
An embodiment of the present invention provides an electronic apparatus including:
at least one processor, at least one memory, a communication interface, and a bus; wherein,
the processor, the memory and the communication interface complete the communication with each other through the bus;
the communication interface is used for information transmission between the electronic device and communication devices of other electronic devices;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of the above.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the method of any one of the above.
The embodiment of the invention provides a cascade super-resolution reconstruction method and device based on deep learning. The super-resolution reconstruction model is obtained by training an initial construction model through machine learning. The initial construction model uses a plurality of cascade groups, each comprising a convolution layer structure formed by serially connected convolution layers and a parallel deconvolution layer structure formed by deconvolution layers connected in parallel. This design enhances the extraction of high- and low-frequency detail features and improves the reconstruction effect. Because the parallel deconvolution layer structure comprises a plurality of parallel deconvolution layers, the trained model can increase the resolution of the initial picture by different multiples, which broadens the application scenarios of the model and improves its applicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for cascade super-resolution reconstruction based on deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a super-resolution reconstruction model according to another embodiment of the present invention;
FIG. 3 is a graph of the function PReLU(x_i) provided by another embodiment of the invention;
FIG. 4 is a block diagram of an apparatus for cascade super-resolution reconstruction based on deep learning according to another embodiment of the present invention;
FIG. 5 is a block diagram of an electronic device according to another embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a schematic flow chart of the deep learning-based cascade super-resolution reconstruction method provided in this embodiment. Referring to FIG. 1, the method includes:
101: acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture;
102: inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
The method provided by this embodiment is executed by a device capable of calling or hosting the super-resolution reconstruction model, such as a server, a terminal device, or a device dedicated to super-resolution reconstruction of pictures. The super-resolution reconstruction model is a model that increases the resolution of the input initial picture. An initial picture with lower resolution is input into the model, the model outputs pictures whose resolution is increased by different multiples, and the picture whose multiple equals the target multiple is taken as the target picture. For example, suppose the target multiple is 4 and the model can increase the resolution of the input initial picture by 2, 3 and 4 times. After the initial picture is input, the model outputs 3 pictures with resolution increased by 2, 3 and 4 times respectively, and the picture with resolution increased by 4 times is the target picture.
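The selection of the target picture from the multi-scale outputs can be sketched as follows. This is a minimal illustration only: the function name `select_target_picture` and the dictionary keyed by multiple are assumptions made for the example, and strings stand in for real image data.

```python
def select_target_picture(outputs_by_multiple, target_multiple):
    """Return the model output whose upscaling multiple equals target_multiple.

    outputs_by_multiple is a hypothetical mapping {multiple: picture}.
    """
    if target_multiple not in outputs_by_multiple:
        supported = sorted(outputs_by_multiple)
        raise ValueError(
            f"target multiple {target_multiple} is not one of {supported}"
        )
    return outputs_by_multiple[target_multiple]


# Stand-in outputs of a model that upscales by 2, 3 and 4 times.
outputs = {2: "picture_x2", 3: "picture_x3", 4: "picture_x4"}
target = select_target_picture(outputs, 4)
```

Raising an error for an unsupported multiple is a design choice of this sketch; the source does not say how such a request is handled.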
The initial construction model is an untrained model structure built from convolution kernels and deconvolution kernels. The model obtained after repeatedly training the initial construction model is the super-resolution reconstruction model. Fig. 2 is a schematic structural diagram of the super-resolution reconstruction model provided in this embodiment, which is a DSRCNN super-resolution reconstruction model. As can be seen from fig. 2, the initial construction model used to train the super-resolution reconstruction model is composed of multiple cascade groups; 5 cascade groups are taken as an example, and only the first and fifth cascade groups are shown in fig. 2. Each cascade group includes a convolution layer structure, i.e., the 3 convolution layers conv-1, conv-2 and conv-3 in fig. 2. Each cascade group also includes a parallel deconvolution layer structure, i.e., the 3 parallel Dconv deconvolution layers in fig. 2. The feature map output by the convolution layer structure serves as the input to the parallel deconvolution layer structure.
The number of parallel deconvolution layers in the parallel deconvolution layer structure equals the number of different multiples at which the trained model can output super-resolution pictures. For example, since the parallel deconvolution layer structure in fig. 2 includes 3 parallel deconvolution layers, the super-resolution reconstruction model in fig. 2 can output pictures at 3 different multiples. Therefore, the number of multiples by which the super-resolution reconstruction model can increase picture resolution can be adjusted by changing the number of parallel deconvolution layers in the initial construction model, which gives the model good extensibility and applicability. The way the cascade groups are combined in the initial construction model, together with the way the convolution and deconvolution layers are combined within each cascade group, allows features to be extracted from multiple dimensions, which enhances the extraction of high- and low-frequency detail features and improves the reconstruction effect.
The embodiment provides a cascade super-resolution reconstruction method based on deep learning, which performs super-resolution reconstruction on an initial picture using a pre-trained super-resolution reconstruction model to obtain a target picture whose resolution is increased by the target multiple. The super-resolution reconstruction model is obtained by training an initial construction model through machine learning. The initial construction model uses a plurality of cascade groups, each comprising a convolution layer structure formed by serially connected convolution layers and a parallel deconvolution layer structure formed by deconvolution layers connected in parallel. This design enhances the extraction of high- and low-frequency detail features and improves the reconstruction effect. Because the parallel deconvolution layer structure comprises a plurality of parallel deconvolution layers, the trained model can increase the resolution of the initial picture by different multiples, which broadens the application scenarios of the model and improves its applicability.
Further, on the basis of the above-described embodiment,
for any non-first cascade group in the initial construction model, the input of the parallel deconvolution layer structure in that cascade group comprises the output of the convolution layer structure in that cascade group and a residual calculation result;
the residual calculation result is the result of a residual calculation between the output of the cascade group preceding the non-first cascade group and the initial picture.
Further, the residual calculation result is calculated by an Eltwise method.
In each non-first cascade group, an Eltwise operation is added before the deconvolution to perform the residual calculation, which speeds up computation in the deep convolutional network and avoids gradient explosion and overfitting. The residual calculation formula is given in formula image GDA0004130865180000061 (not reproduced in this text), in which the symbol shown in image GDA0004130865180000062 denotes the residual of the i-th feature map of the layer, y is the true value of the high-resolution image, and h(x) is the simulated reconstructed high-resolution image.
As shown in fig. 2, the input of the parallel deconvolution layer structure of the first cascade group (the first group in fig. 2) is the output of the convolution layers in the first cascade group. For each non-first cascade group, e.g., the fifth group in fig. 2, the input of the parallel deconvolution layer structure includes not only the output of the convolution layers in that cascade group but also the residual calculation result. The residual calculation result is the result of a residual calculation between the output of the previous cascade group (namely, the output of the parallel deconvolution layers of the previous cascade group) and the initial picture.
The embodiment provides a cascade super-resolution reconstruction method based on deep learning, wherein in a constructed cascade group, the input of each parallel deconvolution layer structure further comprises a residual calculation result, so that overfitting is avoided, and the quality of a super-resolution reconstructed picture is improved.
Further, on the basis of the above embodiments, obtaining the super-resolution reconstruction model from the initial construction model through machine learning training includes:
obtaining a sample picture and at least two preset multiples for improving the input picture by the super-resolution reconstruction model, and for each preset multiple, reducing the resolution of the sample picture by the preset multiple to obtain a picture corresponding to the preset multiple, so as to obtain a first picture set composed of pictures corresponding to each preset multiple;
taking the first picture set as an input sample of the initial construction model, taking a second picture set formed by the sample pictures as an output sample, training the initial construction model through a plurality of groups of the input samples and the output samples, and taking the trained model as the super-resolution reconstruction model;
wherein the number of sample pictures in the second set of pictures is equal to the number of pictures in the first set of pictures.
It should be noted that a preset multiple is a multiple by which the super-resolution reconstruction model can increase the resolution of a picture. The number of preset multiples is typically equal to the number of deconvolution layers in the parallel deconvolution layer structure of the initial construction model. The values of the preset multiples are set manually; for example, for an initial construction model whose parallel deconvolution layer structure comprises 3 deconvolution layers, the preset multiples are set to 2, 3 and 4 times, or to 2, 4 and 8 times.
When training the initial construction model, a sample picture with higher resolution is taken as the output sample, and the pictures obtained by reducing its resolution by the preset multiples are taken as the input sample, so that each sample picture yields one group of training samples. The initial construction model is trained with multiple groups of training samples to obtain the super-resolution reconstruction model. The first picture set is the set of pictures obtained by reducing the resolution of the sample picture by each preset multiple; for example, reducing the resolution of the sample picture by 2, 3 and 4 times gives 3 pictures, and these 3 pictures form the first picture set of the input sample. The second picture set consists of the sample picture itself, e.g., a set of 3 copies of the sample picture.
Specifically, the original high-resolution sample picture is downsampled by each preset multiple to obtain low-resolution images; the low-resolution images are the input samples and the original high-resolution sample picture is the output sample. The low-resolution images and the original high-resolution image at the corresponding locations thus form multiple sets of image mapping matrices for model training.
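The construction of one group of training samples can be sketched as follows. This is an illustration under stated assumptions: block averaging stands in for the interpolation-based downsampling described in the source, pictures are plain nested lists, and the names `downsample` and `build_training_group` are hypothetical.

```python
def downsample(image, factor):
    """Reduce resolution by an integer factor by averaging factor x factor blocks.

    Block averaging is an assumption of this sketch, not the source's method.
    """
    rows = len(image) // factor
    cols = len(image[0]) // factor
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def build_training_group(sample, multiples=(2, 3, 4)):
    """Return (first picture set, second picture set) for one sample picture."""
    first_set = [downsample(sample, m) for m in multiples]   # input samples
    second_set = [sample for _ in multiples]                 # output samples
    return first_set, second_set


# A 12 x 12 synthetic sample picture, divisible by 2, 3 and 4.
sample = [[float(r * 12 + c) for c in range(12)] for r in range(12)]
inputs, targets = build_training_group(sample)
sizes = [(len(p), len(p[0])) for p in inputs]
```

Here `sizes` holds the dimensions of the three low-resolution inputs, and `targets` contains as many copies of the sample picture as there are preset multiples, matching the requirement that both picture sets have the same number of pictures.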
Further, a pre-collected remote sensing image data set is taken as a sample picture, and the collected remote sensing image data set is preprocessed. Preprocessing includes original high resolution target image brightness/contrast enhancement, color space conversion, interpolation downsampling, affine transformation, multi-scale scaling, rotation, cropping.
Further, the preprocessing specifically comprises: enhancing the brightness and contrast of the high-resolution image so that the image target can be clearly distinguished; converting the image to the YCbCr color space and extracting the luminance (Y) component; downsampling the luminance component by a factor of 1/2 using cubic convolution (bicubic) interpolation; applying a random affine transformation to the four corner coordinates of the image target; and rotating the image in 90-degree steps and cropping it to the minimum circumscribed rectangle of the image target, to obtain a training sample set. During cropping, the minimum circumscribed rectangle of a building target is used as the extent and each target is cropped individually, giving a data set of 2000 target image samples. A mapping relation is then built between each reduced-resolution image and the corresponding original-resolution remote sensing image, generating the high-low resolution mapping matrix.
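Two of the preprocessing steps, luminance extraction and rotation in 90-degree steps, can be sketched as follows. The luminance weights are the common ITU-R BT.601 coefficients, which is an assumption because the source does not state which YCbCr variant is used; the function names are hypothetical and the image is a small nested list.

```python
def rgb_to_luma(r, g, b):
    """Luminance (Y) of an RGB pixel using BT.601 weights (assumed variant)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def rotate_90_clockwise(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def four_rotations(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees."""
    rotations = [image]
    for _ in range(3):
        rotations.append(rotate_90_clockwise(rotations[-1]))
    return rotations


tile = [[1, 2], [3, 4]]
augmented = four_rotations(tile)
white_luma = rgb_to_luma(255, 255, 255)
```

Rotating the last entry of `augmented` once more returns the original tile, which is a quick way to check the rotation helper.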
The embodiment provides a cascade super-resolution reconstruction method based on deep learning, which is used for obtaining training samples by carrying out resolution reduction processing of preset multiples on sample pictures, thereby realizing training of an initial construction model. The training convergence speed is high, and the reconstruction efficiency is high.
Further, on the basis of the above embodiments, the training process of the initial construction model further includes:
and acquiring an output picture set obtained after the input sample is input into the initial construction model of the current training, and taking the initial construction model of the current training as the super-resolution reconstruction model if the resolution deviation of each picture in the output picture set and the sample picture is within the tolerance range of a preset error function.
During training, whether the initial construction model needs further training can be determined by comparing the pictures output by the currently trained model with the sample picture. For example, the resolution deviation between an output picture and the sample picture is computed with an error function; if the deviation is within the preset tolerance range of the error function, the currently trained model is taken as the super-resolution reconstruction model; otherwise, training continues.
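The stopping check described above can be sketched as follows. Mean squared error is used here purely as an assumed error function, since the source does not name one; `training_finished` and the tolerance value are likewise hypothetical.

```python
def mean_squared_error(output, reference):
    """Mean squared pixel difference between two equally sized pictures."""
    squared = [
        (o - r) ** 2
        for out_row, ref_row in zip(output, reference)
        for o, r in zip(out_row, ref_row)
    ]
    return sum(squared) / len(squared)


def training_finished(output_set, sample, tolerance):
    """True when every output picture deviates from the sample by at most tolerance."""
    return all(mean_squared_error(out, sample) <= tolerance for out in output_set)


sample = [[10.0, 20.0], [30.0, 40.0]]
close = [[10.5, 19.5], [30.5, 39.5]]   # every pixel off by 0.5
far = [[0.0, 0.0], [0.0, 0.0]]         # far from the sample

done = training_finished([close, close, close], sample, tolerance=1.0)
not_done = training_finished([close, far, close], sample, tolerance=1.0)
```

A training loop would call such a check after each round and continue training while it returns False.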
The embodiment provides a cascade super-resolution reconstruction method based on deep learning, and the quality of pictures output by a model is guaranteed through the inspection of the model.
Further, on the basis of the foregoing embodiments, the inputting the initial picture into a pre-trained super-resolution reconstruction model includes:
and acquiring the number of pictures with different multiples which can be output by the super-resolution reconstruction model, taking the number of pictures as the set number of multiples, and inputting initial pictures with the number equal to the set number of multiples into a pre-trained super-resolution reconstruction model.
According to the training process of the initial construction model, the pictures input into the super-resolution reconstruction model are multiple pictures, and when the resolution of a certain initial picture is actually required to be improved through the super-resolution reconstruction model, the multiple initial pictures can be used as the input of the super-resolution reconstruction model, so that errors in the calculation process are avoided.
Further, on the basis of the above embodiments, the convolution layer structure in each cascade group includes at least two convolution layers connected in sequence, the output of the former convolution layer serves as the input of the latter convolution layer, and the feature map of each convolution layer is calculated using a PReLU function.
Further, on the basis of the above embodiments, the initial construction model includes 5 cascade groups connected in sequence, the convolution layer structure in each cascade group includes 3 convolution layers connected in sequence, and the parallel deconvolution layer structure in each cascade group includes 3 parallel deconvolution layers.
Further, the super-resolution reconstruction model is implemented with the Caffe deep learning framework, covering network construction, parameter setting, strategy setting, model calculation, format conversion, result output and other functions of the cascade multi-scale super-resolution reconstruction convolutional neural network. Caffe functions are called through an interface, and training and reconstruction calculation of the cascade multi-scale super-resolution model are performed on a GPU.
As shown in FIG. 2, the super-resolution reconstruction model is formed by 5 identical cascade groups. The super-resolution reconstruction result of each cascade group is used as the low-resolution input image of the next cascade group, and deep extraction of target features is achieved through successive convolution and deconvolution operations. Because the 5 cascade groups have identical structures, the model is equivalent to one cascade group executed 5 times in a loop, and the last cascade group outputs the final super-resolution reconstructed image.
Each cascade group consists of 3 convolution layers and a parallel deconvolution layer structure of 3 deconvolution layers. Because the 3 deconvolution layers are in parallel, they count as one layer of network depth, so each cascade group has depth 4 and the whole super-resolution reconstruction network with 5 cascade groups has depth 20.
The 3 convolution layers perform linear calculation on spatial-domain pixel values; the first convolution kernel is 3×3, the second is 5×5, and the third is 3×3. Each convolution layer extracts image feature information, and the high-dimensional feature map produced by each convolution layer is passed through the PReLU piecewise function (as shown in fig. 2, PReLU is applied to the feature map after each convolution layer), which realizes a nonlinear mapping between neurons of the same dimension (end-to-end image information mapping). The output dimension of each convolution layer's feature map is 64. The PReLU calculation formula is as follows:
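The depth arithmetic can be checked with a short sketch. The `CascadeGroup` class and `model_depth` function are hypothetical names used only to restate the counting rule that parallel deconvolution layers contribute a single layer of depth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CascadeGroup:
    conv_layers: int = 3              # serially connected convolution layers
    parallel_deconv_layers: int = 3   # parallel branches (x2, x3, x4)

    @property
    def depth(self):
        # Parallel branches sit side by side, so they add one layer of depth.
        return self.conv_layers + (1 if self.parallel_deconv_layers > 0 else 0)


def model_depth(groups):
    """Total network depth of sequentially connected cascade groups."""
    return sum(group.depth for group in groups)


groups = [CascadeGroup() for _ in range(5)]
total_depth = model_depth(groups)
```

With the default values each group has depth 4, so five groups give a total depth of 20.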
$$\mathrm{PReLU}(x_i) = \begin{cases} x_i, & x_i > 0 \\ a_i x_i, & x_i \le 0 \end{cases}$$
where x_i is the neuron input value on the i-th channel, i is the image channel index, and a_i is the learnable parameter, initialized to 0.25. FIG. 3 is a graph of the function PReLU(x_i); in FIG. 3, f(y) is PReLU(x_i) and y is x_i.
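As a small illustration, the standard PReLU rule (pass positive inputs through unchanged, scale non-positive inputs by the learnable coefficient) can be written as follows. The scalar function `prelu` is a sketch for a single value with the coefficient fixed at its initial value of 0.25; in a trained network the coefficient is learned.

```python
def prelu(x, a=0.25):
    """Parametric ReLU for one value: x if x > 0, otherwise a * x."""
    return x if x > 0 else a * x


values = [prelu(v) for v in (-2.0, 0.0, 3.0)]
```

Negative inputs are scaled by 0.25 rather than clipped to zero, which is the difference from a plain ReLU.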
The 3 deconvolution layers form a parallel structure. By establishing a multi-scale mixed mapping relation between high- and low-resolution images, the deep convolutional neural network upscales the low-resolution image by 2 times (denoted ×2), 3 times (denoted ×3) and 4 times (denoted ×4) simultaneously, so that the network parameters for all scales are trained in one model; a training sample set with multiple resolution scales is constructed for training the super-resolution model. Within the same cascade group, the three parallel deconvolution kernels differ in size: the first deconvolution kernel (×2) is 4×4, the second (×3) is 5×5, and the third (×4) is 6×6. Table 1 shows the core parameters of the super-resolution reconstruction model provided in this embodiment.
Table 1 Core parameters of the super-resolution reconstruction model provided in this embodiment
| Core parameter | Value |
| --- | --- |
| Network depth of each cascade group | 4 |
| Number of cascade groups in the convolutional neural network | 5 |
| Model network depth | 20 |
| First layer convolution kernel size | 3×3 |
| Second layer convolution kernel size | 5×5 |
| Third layer convolution kernel size | 3×3 |
| Fourth layer deconvolution kernel size (×2) | 4×4 |
| Fourth layer deconvolution kernel size (×3) | 5×5 |
| Fourth layer deconvolution kernel size (×4) | 6×6 |
| Training image padding value (first/third/fourth layer) | 1 |
| Training image padding value (second layer) | 2 |
| Convolution kernel stride (first/second/third layer) | 1 |
| Fourth layer deconvolution stride (×2) | 2 |
| Fourth layer deconvolution stride (×3) | 3 |
| Fourth layer deconvolution stride (×4) | 4 |
The convolution layers perform multi-stage feature extraction of image targets. The size of the output image changes after the input low-resolution image undergoes a convolution operation. Assuming that the original low-resolution image has width w_0 and length h_0, the image size after the convolution operation is:
Width of the image after convolution: $w = (w_0 + 2 \cdot \mathrm{pad} - \mathrm{kernel\_size}) / \mathrm{stride} + 1$
Length of the image after convolution: $h = (h_0 + 2 \cdot \mathrm{pad} - \mathrm{kernel\_size}) / \mathrm{stride} + 1$
The deconvolution layers perform the reverse operation on the multi-stage feature maps, enlarging the image by 2 times, 3 times and 4 times according to the preset super-resolution reconstruction multiples. The image size after the deconvolution operation is:
Width of the image after deconvolution: $w = (w_0 - 1) \cdot \mathrm{stride} + \mathrm{kernel\_size} - 2 \cdot \mathrm{pad}$
Length of the image after deconvolution: $h = (h_0 - 1) \cdot \mathrm{stride} + \mathrm{kernel\_size} - 2 \cdot \mathrm{pad}$
Here, stride is the step size of the convolution or deconvolution kernel, kernel_size is the size of the kernel, and pad is the number of pixel rows and columns of padding added on each side of the image.
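The size formulas above can be checked against the Table 1 parameters with a short sketch. The function and constant names below are hypothetical, and the use of integer (floor) division in the convolution formula is an assumption, since the text does not say how non-integer results are handled.

```python
def conv_output_size(size: int, kernel_size: int, stride: int, pad: int) -> int:
    """Output width or length of a convolution, per the formula in the text."""
    return (size + 2 * pad - kernel_size) // stride + 1


def deconv_output_size(size: int, kernel_size: int, stride: int, pad: int) -> int:
    """Output width or length of a deconvolution, per the formula in the text."""
    return (size - 1) * stride + kernel_size - 2 * pad


# (kernel_size, stride, pad) for the three convolution layers in Table 1.
CONV_LAYERS = [(3, 1, 1), (5, 1, 2), (3, 1, 1)]

# Scale factor -> (kernel_size, stride, pad) for the parallel deconvolution layers.
DECONV_BRANCHES = {2: (4, 2, 1), 3: (5, 3, 1), 4: (6, 4, 1)}


def cascade_group_output_sizes(size: int) -> dict:
    """Trace one image dimension through one cascade group."""
    for kernel_size, stride, pad in CONV_LAYERS:
        size = conv_output_size(size, kernel_size, stride, pad)
    return {
        scale: deconv_output_size(size, kernel_size, stride, pad)
        for scale, (kernel_size, stride, pad) in DECONV_BRANCHES.items()
    }
```

With these parameters the three convolution layers leave the image size unchanged, and each deconvolution branch multiplies it by exactly its scale factor: for an input dimension of 32, the branches give 64, 96 and 128.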
The method achieves higher accuracy and faster training and reconstruction speed when improving the resolution of building targets in remote sensing images. Accuracy was verified with quantitative objective indexes (such as PSNR, SSIM and MTF) and with subjective evaluation. The test results show that the spatial resolution of remote sensing images is improved by 2 times, 3 times and 4 times. Multi-scale super-resolution reconstruction is achieved with a single super-resolution model, and the training efficiency of the model is remarkably improved compared with that of single-scale super-resolution reconstruction models.
According to the embodiment of the invention, a cascaded multi-scale super-resolution reconstruction model is designed for optical images. The model adopts a multi-level cascade structure with mixed multi-scale reconstruction, which increases the network depth: the output image of each cascade group serves as the low-resolution input image of the next cascade group, so that multi-level cyclic reconstruction is performed on the low-resolution image. Each cascade group combines deconvolution with residual calculation, which effectively avoids over-fitting and local optima and improves both the training efficiency and the reconstruction effect of the super-resolution model. Adding 3 parallel deconvolution layers to the cascaded multi-scale super-resolution reconstruction model allows a low-resolution image to be upscaled by 2 times, 3 times and 4 times in one model at the same time, which greatly improves model training efficiency and enhances the applicability of image super-resolution reconstruction.
In a second aspect, fig. 4 is a block diagram of the device for cascade super-resolution reconstruction based on deep learning provided in this embodiment. Referring to fig. 4, the device includes an acquisition module 401 and a processing module 402, wherein:
the acquisition module 401 is configured to obtain an initial picture to be reconstructed with super resolution and a target multiple by which to increase the resolution of the initial picture;
the processing module 402 is configured to input the initial picture into a pre-trained super-resolution reconstruction model, obtain a target picture output by the super-resolution reconstruction model and matched with the target multiple, and output the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
The device for cascade super-resolution reconstruction based on deep learning provided in this embodiment is applicable to the method for cascade super-resolution reconstruction based on deep learning provided in the foregoing embodiment, and is not described herein again.
The embodiment provides a device for cascade super-resolution reconstruction based on deep learning, which performs super-resolution reconstruction on an initial picture through a super-resolution reconstruction model obtained through pre-training to obtain a target picture with improved resolution by a target multiple. The super-resolution reconstruction model is obtained by machine learning an initial construction model, wherein the initial construction model adopts a plurality of cascade groups, and each cascade group comprises a convolution layer structure formed by serially connected convolution layers and a parallel deconvolution layer structure formed by parallelly connected deconvolution layers. The design of the initial construction model enhances the high-low frequency detail feature extraction capability and improves the reconstruction effect. The parallel deconvolution layer structure comprises a plurality of parallel deconvolution layers, so that the trained model can improve the resolution of the initial picture by different multiples, the application scene of the model is enlarged, and the applicability is improved.
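The division of work between the two modules can be sketched as follows. This is only an illustrative outline under stated assumptions: the class names, method names, the `Picture` type, and the idea that the trained model returns one output per parallel deconvolution branch keyed by multiple are all hypothetical. The `placeholder_model` function is a nearest-neighbour stand-in used so the sketch runs; it is not the trained super-resolution model.

```python
from typing import Callable, Dict, List

Picture = List[List[float]]  # grayscale picture as rows of pixel values


def nearest_upscale(picture: Picture, factor: int) -> Picture:
    """Stand-in upscaler: repeats every pixel factor x factor times."""
    return [
        [value for value in row for _ in range(factor)]
        for row in picture
        for _ in range(factor)
    ]


def placeholder_model(picture: Picture) -> Dict[int, Picture]:
    """Stand-in for the trained model: one output per parallel branch."""
    return {factor: nearest_upscale(picture, factor) for factor in (2, 3, 4)}


class AcquisitionModule:
    """Obtains the initial picture and the target multiple."""

    def __init__(self, supported_multiples=(2, 3, 4)):
        self.supported_multiples = tuple(supported_multiples)

    def acquire(self, picture: Picture, target_multiple: int):
        if target_multiple not in self.supported_multiples:
            raise ValueError(f"unsupported target multiple: {target_multiple}")
        return picture, target_multiple


class ProcessingModule:
    """Runs the model and returns the output matching the target multiple."""

    def __init__(self, model: Callable[[Picture], Dict[int, Picture]]):
        self.model = model

    def process(self, picture: Picture, target_multiple: int) -> Picture:
        outputs = self.model(picture)
        return outputs[target_multiple]
```

A caller would pass the result of `AcquisitionModule().acquire(...)` to `ProcessingModule(placeholder_model).process(...)`; swapping `placeholder_model` for a real trained model is where an actual implementation would differ.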
In a third aspect, fig. 5 is a block diagram showing the structure of an electronic apparatus provided in the present embodiment.
Referring to fig. 5, the electronic device includes a processor 501, a memory 502, a communication interface 503, and a bus 504, wherein:
the processor 501, the memory 502, and the communication interface 503 perform communication with each other through the bus 504;
the communication interface 503 is used for information transmission between the electronic device and communication devices of other electronic devices;
the processor 501 is configured to invoke the program instructions in the memory 502 to perform the methods provided in the above method embodiments, for example, including: acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture; inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture; the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above-described method embodiments, for example, including: acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture; inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture; the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, are capable of performing the methods provided by the above-described method embodiments, for example, comprising: acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture; inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture; the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware associated with program instructions, where the foregoing program may be stored in a computer readable storage medium, and when executed, the program performs steps including the above method embodiments; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present invention, not to limit them. Although the embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for cascade super-resolution reconstruction based on deep learning, comprising the steps of:
acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture;
inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
2. The method of claim 1, wherein for any non-first cascade group in the initial construction model, the inputs to the parallel deconvolution layer structure in the non-first cascade group comprise the output of the convolution layer structure in the non-first cascade group and a residual calculation result;
and the residual calculation result is the result of residual calculation between the output of the previous cascade group of the non-first cascade group and the initial picture.
3. The method of claim 1, wherein the initially constructed model is trained by machine learning to obtain the super-resolution reconstruction model, comprising:
obtaining a sample picture and at least two preset multiples for improving the input picture by the super-resolution reconstruction model, and for each preset multiple, reducing the resolution of the sample picture by the preset multiple to obtain a picture corresponding to the preset multiple, so as to obtain a first picture set composed of pictures corresponding to each preset multiple;
taking the first picture set as an input sample of the initial construction model, taking a second picture set consisting of a corresponding number of sample pictures as an output sample, training the initial construction model through a plurality of groups of the input samples and the output samples, and taking the trained model as the super-resolution reconstruction model;
wherein the number of sample pictures in the second set of pictures is equal to the number of pictures in the first set of pictures.
4. A method according to claim 3, further comprising, in training the initial build model:
and acquiring an output picture set obtained after the input sample is input into the initial construction model of the current training, and taking the initial construction model of the current training as the super-resolution reconstruction model if the resolution deviation of each picture in the output picture set and the sample picture is within the tolerance range of a preset error function.
5. The method of claim 1, wherein the inputting the initial picture into a pre-trained super-resolution reconstruction model comprises:
and acquiring the number of pictures with different multiples which can be output by the super-resolution reconstruction model, taking the number of pictures as the set number of multiples, and inputting the initial pictures with the number equal to the set number of multiples into a pre-trained super-resolution reconstruction model.
6. The method of claim 1, wherein the convolution layer structure within each cascade group comprises at least two serially connected convolution layers, the output of a preceding convolution layer being the input of a subsequent convolution layer, and wherein the feature map of each convolution layer is calculated using a PReLU function.
7. The method of claim 1, wherein the initial construction model comprises 5 sequentially connected cascade groups, the convolution layer structure within each cascade group comprises 3 sequentially connected convolution layers, and the parallel deconvolution layer structure within each cascade group comprises 3 parallel deconvolution layers.
8. A device for cascade super-resolution reconstruction based on deep learning, comprising:
the acquisition module is used for acquiring an initial picture to be subjected to super-resolution reconstruction and a target multiple for improving the resolution of the initial picture;
the processing module is used for inputting the initial picture into a pre-trained super-resolution reconstruction model, acquiring a target picture which is output by the super-resolution reconstruction model and matched with the target multiple, and outputting the target picture;
the super-resolution reconstruction model is obtained by machine learning training from an initial construction model; the initial construction model comprises at least two cascade groups which are connected in sequence, and the output of the former cascade group is used as the input of the latter cascade group; each cascade group comprises a convolution layer structure and a parallel deconvolution layer structure, wherein the output of the convolution layer structure is used as the input of the parallel deconvolution layer structure, and the parallel deconvolution layer structure comprises at least two parallel deconvolution layers.
9. An electronic device, comprising:
at least one processor, at least one memory, a communication interface, and a bus; wherein:
the processor, the memory and the communication interface complete the communication with each other through the bus;
the communication interface is used for information transmission between the electronic device and communication devices of other electronic devices;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1-7.
10. A non-transitory computer readable storage medium storing computer instructions that cause the computer to perform the method of any one of claims 1 to 7.
CN201910143361.4A 2019-02-26 2019-02-26 Deep learning-based cascade super-resolution reconstruction method and device Active CN109949224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910143361.4A CN109949224B (en) 2019-02-26 2019-02-26 Deep learning-based cascade super-resolution reconstruction method and device

Publications (2)

Publication Number Publication Date
CN109949224A CN109949224A (en) 2019-06-28
CN109949224B true CN109949224B (en) 2023-06-30

Family

ID=67006906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910143361.4A Active CN109949224B (en) 2019-02-26 2019-02-26 Deep learning-based cascade super-resolution reconstruction method and device

Country Status (1)

Country Link
CN (1) CN109949224B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705699B (en) * 2019-10-18 2022-05-31 厦门美图之家科技有限公司 Super-resolution reconstruction method and device, electronic equipment and readable storage medium
CN111402131B (en) * 2020-03-10 2022-04-01 北京师范大学 Method for acquiring super-resolution land cover classification map based on deep learning
EP4128135A4 (en) * 2020-04-01 2023-06-07 BOE Technology Group Co., Ltd. Computer-implemented method, apparatus, and computer-program product
CN113552130A (en) * 2020-04-08 2021-10-26 台达电子工业股份有限公司 Flaw detection method and flaw detection device
TWI791970B (en) * 2020-04-08 2023-02-11 台達電子工業股份有限公司 Defect detection method and defect detection device
CN112946497A (en) * 2020-12-04 2021-06-11 广东电网有限责任公司 Storage battery fault diagnosis method and device based on fault injection deep learning
CN113012046B (en) * 2021-03-22 2022-12-16 华南理工大学 Image super-resolution reconstruction method based on dynamic packet convolution
CN113837946B (en) * 2021-10-13 2022-12-06 中国电子技术标准化研究院 Lightweight image super-resolution reconstruction method based on progressive distillation network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403415A (en) * 2017-07-21 2017-11-28 深圳大学 Compression depth plot quality Enhancement Method and device based on full convolutional neural networks
CN108921786A (en) * 2018-06-14 2018-11-30 天津大学 Image super-resolution reconstructing method based on residual error convolutional neural networks
CN109242771A (en) * 2018-08-16 2019-01-18 广州视源电子科技股份有限公司 A kind of super-resolution image reconstruction method and device, computer readable storage medium and computer equipment
CN109255755A (en) * 2018-10-24 2019-01-22 上海大学 Image super-resolution rebuilding method based on multiple row convolutional neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10803378B2 (en) * 2017-03-15 2020-10-13 Samsung Electronics Co., Ltd System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Image Super-Resolution via Progressive Cascading Residual Network;Namhyuk Ahn ET AL;《2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops》;20181216;904-912 *
一种深度级联网络结构的单帧超分辨重建算法;王飞等;《光电工程》;20180715(第07期);40-49 *
基于反馈残差网络的矿井图像超分辨率重建算法研究;宋玉龙;《中国优秀硕士学位论文全文数据库 工程科技I辑》;20190215(第2期);B021-230 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant