CN114581300A - Image super-resolution reconstruction method and device - Google Patents


Info

Publication number
CN114581300A
Authority
CN
China
Prior art keywords
residual
features
attention
cascade
convolution
Prior art date
Legal status
Pending
Application number
CN202210147765.2A
Other languages
Chinese (zh)
Inventor
史景伦 (Shi Jinglun)
李显惠 (Li Xianhui)
胡晨晨 (Hu Chenchen)
王骁行 (Wang Xiaoxing)
Current Assignee
Guangdong Weibo Intelligent Technology Co ltd
South China University of Technology SCUT
Original Assignee
Guangdong Weibo Intelligent Technology Co ltd
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by Guangdong Weibo Intelligent Technology Co ltd and South China University of Technology SCUT
Priority to CN202210147765.2A
Publication of CN114581300A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image super-resolution reconstruction method and device. The method comprises the following steps: extracting shallow features from a low-resolution input image; extracting, fusing and enhancing those features through a backbone network composed of m cascade residual groups, each built from multi-scale cascade attention residual modules, and a global skip connection, to obtain deep features; upsampling the deep features using sub-pixel convolution; and reconstructing a higher-resolution image from the upsampled features. The invention adopts a multi-scale cascade attention residual module that extracts, enhances and fuses features along dimensions such as receptive field, width and attention; by means of skip connections and cascaded residuals, low-frequency information is bypassed and features from different depths of the network are integrated, yielding richer detail. The method reconstructs images with richer detail and higher quality, and can be widely applied in the field of image super-resolution reconstruction.

Description

Image super-resolution reconstruction method and device
Technical Field
The invention relates to the field of image super-resolution reconstruction, in particular to an image super-resolution reconstruction method and device.
Background
With the rapid development of computer technology and artificial intelligence, high-resolution terms such as 8K and 100-megapixel keep entering public view, and the demand for high-resolution images keeps growing. In fields such as security monitoring, medical imaging, remote sensing and face recognition, images are important information carriers, and high-quality images provide richer detail and more usable information. Improving image resolution is therefore of great practical importance.
In recent years, the requirements on image resolution have kept rising, and super-resolution reconstruction, a low-level vision task, has become one of the research hot spots in computer vision. Super-resolution addresses the problem of reconstructing a high-resolution output image from a low-resolution input image; by continually enriching image detail, it makes the image clearer and visually better.
With the continued development of deep learning, deep neural networks have been widely applied to image super-resolution reconstruction and have achieved good results. However, current mainstream algorithms often require very deep architectures and long training times; the deeper the network, the harder it is to train and the more training tricks are needed. Meanwhile, the low-resolution input contains abundant low-frequency information that is treated equally across channels, which also hinders the learning of convolutional neural networks to some extent. Moreover, current super-resolution convolutional networks cannot fully exploit features at multiple scales, which limits their learning capability. It is therefore necessary to address these problems and achieve high-quality reconstruction.
Disclosure of Invention
To solve at least one of the technical problems in the prior art to some extent, the present invention aims to provide an image super-resolution reconstruction method and device based on a multi-scale cascade attention residual network.
The technical scheme adopted by the invention is as follows:
an image super-resolution reconstruction method comprises the following steps:
extracting shallow features of the low-resolution input image;
extracting, fusing and enhancing the shallow features through a backbone network composed of m cascade residual groups, built from multi-scale cascade attention residual modules, and a global skip connection, to obtain deep features;
upsampling the deep features using sub-pixel convolution;
and reconstructing the image from the upsampled features to obtain a higher-resolution image.
Further, extracting shallow features from the low-resolution input image includes:
defining a feature extraction component composed of one convolutional layer, and extracting original features from the low-resolution input image, as shown in the following formula:
F_0 = H_SFE(I_LR)  (1)
where H_SFE(·) denotes the convolution operation applied to low-resolution feature extraction, I_LR denotes the low-resolution input image, and F_0 denotes the shallow features extracted by the convolution.
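The following is a minimal PyTorch sketch of this shallow feature extraction step; the use of PyTorch, the 3-channel input, the 64 feature channels and the 3×3 kernel are assumptions for illustration, since the text only specifies a single convolutional layer.

import torch
import torch.nn as nn

class ShallowFeatureExtractor(nn.Module):
    """H_SFE: one convolution mapping the LR image to feature space (Eq. 1)."""
    def __init__(self, in_channels: int = 3, num_features: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, num_features, kernel_size=3, padding=1)

    def forward(self, i_lr: torch.Tensor) -> torch.Tensor:
        return self.conv(i_lr)  # F_0 = H_SFE(I_LR)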
Further, the cascade residual group comprises n multi-scale cascade attention residual modules, n feature concatenation units, n feature compression units, n short skip connections and 1 local skip connection, where each feature compression unit is a 1×1 convolution.
Further, the formula expression of the cascade residual group is as follows:
F_{m,1} = H_MCRAB(F_{m-1})  (2)
F_{m,2} = H_MCRAB(w_{1×1} * [F_{m,1}, F_{m-1}] + b)  (3)
F_{m,n} = H_MCRAB(w_{1×1} * [F_{m,n-1}, F_{m,n-2}] + b)  (4)
F_m = F_{m,n} + F_{m-1}  (5)
where F_{m,n} denotes the output feature of the n-th multi-scale cascade attention residual module in the m-th cascade residual group, H_MCRAB denotes the operation of the multi-scale cascade attention residual module, F_m denotes the output feature of the m-th cascade residual group, [·] denotes feature concatenation, w_{1×1} denotes the weights of the 1×1 convolution, and b denotes the bias of the convolution kernel.
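Under the same assumptions, one cascade residual group (Eqs. 2-5) could be sketched as follows, continuing the imports above; the mcrab_factory argument stands in for the multi-scale cascade attention residual module defined next, and concatenating only the two most recent features follows Eq. (4).

class CascadeResidualGroup(nn.Module):
    """One cascade residual group: n MCRABs, 1x1 compressions, local skip."""
    def __init__(self, mcrab_factory, num_blocks: int = 3, num_features: int = 64):
        super().__init__()
        self.blocks = nn.ModuleList([mcrab_factory() for _ in range(num_blocks)])
        # Feature compression units: 1x1 convolutions applied after each concatenation.
        self.compress = nn.ModuleList(
            nn.Conv2d(2 * num_features, num_features, kernel_size=1)
            for _ in range(num_blocks - 1))

    def forward(self, f_prev: torch.Tensor) -> torch.Tensor:
        feats = [f_prev, self.blocks[0](f_prev)]          # Eq. (2)
        for i, block in enumerate(self.blocks[1:]):
            cat = torch.cat([feats[-1], feats[-2]], dim=1)
            feats.append(block(self.compress[i](cat)))    # Eqs. (3)-(4)
        return feats[-1] + f_prev                         # Eq. (5): local skip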
Further, the multi-scale cascade attention residual module comprises attention residual units, an error feedback fusion unit, skip connections and cascade operations;
the formula expression of the multi-scale cascade attention residual module is as follows:
F_{3×3,in1} = w_{1×1} * F_{m,n-1} + b  (6)
F_{3,1} = H_{RAB,3×3}(F_{3×3,in1}) + F_{3×3,in1}  (7)
F_{3×3,in2} = w_{1×1} * [F_{3,1}, F_{3×3,in1}] + b  (8)
F_{3,2} = H_{RAB,3×3}(F_{3×3,in2}) + F_{3×3,in2}  (9)
F_{3×3,in3} = w_{1×1} * [F_{3,2}, F_{3,1}, F_{3×3,in1}] + b  (10)
F_{5×5,in1} = w_{1×1} * F_{m,n-1} + b  (11)
F_{5,1} = H_{RAB,5×5}(F_{5×5,in1}) + F_{5×5,in1}  (12)
F_{5×5,in2} = w_{1×1} * [F_{5,1}, F_{5×5,in1}] + b  (13)
F_{5,2} = H_{RAB,5×5}(F_{5×5,in2}) + F_{5×5,in2}  (14)
F_{5×5,in3} = w_{1×1} * [F_{5,2}, F_{5,1}, F_{5×5,in1}] + b  (15)
F_{m,n} = H_Confusion(F_{3×3,in3}, F_{5×5,in3}) + F_{m,n-1}  (16)
where F_{3×3,in1}, F_{3×3,in2}, F_{3×3,in3} denote the input features at different stages of the 3×3 scale, F_{3,1}, F_{3,2} denote the intermediate features at the 3×3 scale, F_{5×5,in1}, F_{5×5,in2}, F_{5×5,in3} denote the input features at different stages of the 5×5 scale, F_{5,1}, F_{5,2} denote the intermediate features at the 5×5 scale, H_{RAB,3×3}, H_{RAB,5×5} denote the attention residual units with 3×3 and 5×5 convolution kernels respectively, H_Confusion denotes the error feedback fusion unit, F_{m,n} denotes the output feature of the n-th multi-scale cascade attention residual module in the m-th cascade residual group, [·] denotes feature concatenation, w_{1×1} denotes the weights of the 1×1 convolution, and b denotes the bias of the convolution kernel.
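A minimal sketch of the module defined by Eqs. (6)-(16) follows, continuing the sketches above. AttentionResidualUnit and ErrorFeedbackFusion are assumed to follow the definitions sketched after Eqs. (17)-(25) below, and the channel count is again an illustrative assumption.

class MSCascadeAttentionResidual(nn.Module):
    """MCRAB: two cascaded attention-residual branches (3x3, 5x5) + fusion."""
    def __init__(self, num_features: int = 64):
        super().__init__()
        self.branch3 = self._make_branch(num_features, kernel_size=3)
        self.branch5 = self._make_branch(num_features, kernel_size=5)
        self.fusion = ErrorFeedbackFusion(num_features)

    @staticmethod
    def _make_branch(nf: int, kernel_size: int) -> nn.ModuleDict:
        return nn.ModuleDict({
            "in1": nn.Conv2d(nf, nf, 1),
            "rab1": AttentionResidualUnit(nf, kernel_size),
            "in2": nn.Conv2d(2 * nf, nf, 1),
            "rab2": AttentionResidualUnit(nf, kernel_size),
            "in3": nn.Conv2d(3 * nf, nf, 1),
        })

    @staticmethod
    def _run_branch(b: nn.ModuleDict, x: torch.Tensor) -> torch.Tensor:
        f_in1 = b["in1"](x)                                  # Eq. (6)/(11)
        f1 = b["rab1"](f_in1) + f_in1                        # Eq. (7)/(12)
        f_in2 = b["in2"](torch.cat([f1, f_in1], dim=1))      # Eq. (8)/(13)
        f2 = b["rab2"](f_in2) + f_in2                        # Eq. (9)/(14)
        return b["in3"](torch.cat([f2, f1, f_in1], dim=1))   # Eq. (10)/(15)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f3 = self._run_branch(self.branch3, x)
        f5 = self._run_branch(self.branch5, x)
        return self.fusion(f3, f5) + x                       # Eq. (16)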
Furthermore, the attention residual unit adopts wide activation, obtaining wider channel features without changing the number of parameters; a channel attention module enhances the channel features before activation, and finally a spatial attention unit applies spatial feature enhancement to the residual. Both attention mechanisms combine average pooling and max pooling;
the formula expression of the attention residual unit is as follows:
y = τ(H_CA(w_{k×k} * x + b))  (17)
F_r = H_SA(w_{k×k} * y + b)  (18)
where x and y denote the input and output features respectively, τ denotes the nonlinear activation function ReLU, w_{k×k} denotes the weights of the k×k convolution, H_CA denotes the channel attention unit, and H_SA denotes the spatial attention unit. The formula expression of each attention unit is as follows:
H_CA = σ(H_FC(τ(H_FC(P_Avg(x) + P_Max(x))))) * x  (19)
H_SA = σ(w_{7×7} * [P_Avg(x), P_Max(x)] + b) * x  (20)
where σ denotes the nonlinear activation function Sigmoid, w_{7×7} denotes the weights of the 7×7 convolution, P_Avg denotes the average pooling operation, P_Max denotes the max pooling operation, and H_FC denotes a fully connected layer.
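The attention residual unit and its two attention mechanisms (Eqs. 17-20) could be sketched as follows, continuing the sketches above. The wide-activation expansion ratio of 4 and the channel-attention reduction ratio of 16 are assumptions, since the text fixes neither, and 1×1 convolutions stand in for the fully connected layers H_FC.

class ChannelAttention(nn.Module):
    """H_CA, Eq. (19): shared MLP over summed average- and max-pooled stats."""
    def __init__(self, nf: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Conv2d(nf, nf // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(nf // reduction, nf, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = x.mean(dim=(2, 3), keepdim=True) + x.amax(dim=(2, 3), keepdim=True)
        return torch.sigmoid(self.fc(pooled)) * x

class SpatialAttention(nn.Module):
    """H_SA, Eq. (20): 7x7 conv over concatenated average and max maps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1))) * x

class AttentionResidualUnit(nn.Module):
    """H_RAB, Eqs. (17)-(18): widen, channel attention before ReLU, narrow, SA."""
    def __init__(self, nf: int, kernel_size: int = 3, expansion: int = 4):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv2d(nf, nf * expansion, kernel_size, padding=pad)
        self.ca = ChannelAttention(nf * expansion)
        self.conv2 = nn.Conv2d(nf * expansion, nf, kernel_size, padding=pad)
        self.sa = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = torch.relu(self.ca(self.conv1(x)))  # Eq. (17): CA before activation
        return self.sa(self.conv2(y))           # Eq. (18)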
Further, the formula expression of the error feedback fusion unit is as follows:
f_{feedback_3} = τ(w_{3×3} * F_{3×3,in3} + b) - τ(w_{3×3} * F_{5×5,in3} + b)  (21)
F_{3×3} = τ(w_{3×3} * f_{feedback_3} + b) + F_{3×3,in3}  (22)
f_{feedback_5} = τ(w_{3×3} * F_{5×5,in3} + b) - τ(w_{3×3} * F_{3×3,in3} + b)  (23)
F_{5×5} = τ(w_{3×3} * f_{feedback_5} + b) + F_{5×5,in3}  (24)
F_Confusion = w_{3×3} * [F_{3×3}, F_{5×5}] + b  (25)
where f_{feedback_3} and f_{feedback_5} denote the error feedback between the different scales, F_{3×3} and F_{5×5} denote the fused features after error feedback, τ denotes the nonlinear activation function ReLU, w_{3×3} denotes the weights of the 3×3 convolution, and F_Confusion denotes the multi-scale residual fusion feature.
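A sketch of the error feedback fusion unit (Eqs. 21-25) follows, continuing the sketches above: four 3×3 convolutions with ReLU compute the cross-scale error signals and corrections, and a fifth 3×3 convolution fuses the concatenated results. Whether the convolutions in Eqs. (21) and (23) share weights is not specified; this sketch reuses one projection per scale.

class ErrorFeedbackFusion(nn.Module):
    """H_Confusion: cross-scale error feedback, then concatenate and fuse."""
    def __init__(self, nf: int = 64):
        super().__init__()
        self.proj3 = nn.Conv2d(nf, nf, 3, padding=1)
        self.proj5 = nn.Conv2d(nf, nf, 3, padding=1)
        self.corr3 = nn.Conv2d(nf, nf, 3, padding=1)
        self.corr5 = nn.Conv2d(nf, nf, 3, padding=1)
        self.fuse = nn.Conv2d(2 * nf, nf, 3, padding=1)

    def forward(self, f3: torch.Tensor, f5: torch.Tensor) -> torch.Tensor:
        fb3 = torch.relu(self.proj3(f3)) - torch.relu(self.proj5(f5))  # Eq. (21)
        fb5 = torch.relu(self.proj5(f5)) - torch.relu(self.proj3(f3))  # Eq. (23)
        f3c = torch.relu(self.corr3(fb3)) + f3                         # Eq. (22)
        f5c = torch.relu(self.corr5(fb5)) + f5                         # Eq. (24)
        return self.fuse(torch.cat([f3c, f5c], dim=1))                 # Eq. (25)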
Further, extracting, fusing and enhancing the shallow features through a backbone network composed of m cascade residual groups, built from multi-scale cascade attention residual modules, and a global skip connection, to obtain the deep features includes:
obtaining the deep features from the shallow features through a backbone network consisting of m cascade residual groups, a residual feature extraction trunk formed by one 3×3 convolution connected in series, and a global skip connection; the specific formulas are as follows:
F_m = H_{crir,m}(F_{m-1}) = H_{crir,m}(H_{crir,m-1}(…(H_{crir,1}(F_0))…))  (26)
F_Res = τ(w_{3×3} * F_m + b)  (27)
F_DF = F_0 + F_Res  (28)
where H_{crir,m} denotes the operation of the m-th cascade residual group, F_Res denotes the final residual features after one 3×3 convolutional layer, F_DF denotes the deep features, finally composed of the shallow features and the residual features, and F_0 denotes the shallow features.
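The backbone of Eqs. (26)-(28) could be sketched as follows, continuing the sketches above; group_factory stands in for the cascade residual group defined earlier, and m = 3 follows the embodiment described below.

class Backbone(nn.Module):
    """Eqs. (26)-(28): m cascade residual groups, 3x3 tail conv, global skip."""
    def __init__(self, group_factory, m: int = 3, nf: int = 64):
        super().__init__()
        self.groups = nn.ModuleList([group_factory() for _ in range(m)])
        self.tail = nn.Conv2d(nf, nf, 3, padding=1)

    def forward(self, f0: torch.Tensor) -> torch.Tensor:
        f = f0
        for group in self.groups:          # Eq. (26)
            f = group(f)
        f_res = torch.relu(self.tail(f))   # Eq. (27)
        return f0 + f_res                  # Eq. (28): global skip connection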
Further, upsampling the deep features using sub-pixel convolution includes:
upsampling the deep-level features extracted through the backbone network using one sub-pixel convolution layer; the specific formula is as follows:
F_up = H_Sub_pixel(F_DF)  (29)
where H_Sub_pixel denotes the upsampling operation with sub-pixel convolution, and F_up denotes the upsampled output features.
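Sub-pixel upsampling (Eq. 29) is commonly realized as a channel-expanding convolution followed by PixelShuffle; the following sketch makes that assumption, with an assumed ×4 scale factor, continuing the sketches above.

class SubPixelUpsampler(nn.Module):
    """H_Sub_pixel: conv expands channels by scale^2, PixelShuffle rearranges."""
    def __init__(self, nf: int = 64, scale: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(nf, nf * scale * scale, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, f_df: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.conv(f_df))  # F_up = H_Sub_pixel(F_DF)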
Further, reconstructing the image from the upsampled features to obtain a higher-resolution image includes:
reconstructing the high-resolution image I_SR from the upsampling-predicted features; the specific formula is as follows:
I_SR = H_R(F_up)  (30)
where H_R denotes the convolution operation for reconstructing the high-resolution image I_SR.
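The reconstruction step (Eq. 30) then reduces the upsampled features to an image; a minimal sketch, assuming a 3-channel output and a 3×3 kernel, continuing the sketches above:

class Reconstructor(nn.Module):
    """H_R: one convolution mapping upsampled features to the SR image."""
    def __init__(self, nf: int = 64, out_channels: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(nf, out_channels, 3, padding=1)

    def forward(self, f_up: torch.Tensor) -> torch.Tensor:
        return self.conv(f_up)  # I_SR = H_R(F_up)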
Another technical scheme adopted by the invention is as follows:
an image super-resolution reconstruction apparatus comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The invention has the following beneficial effects: the invention adopts a multi-scale cascade attention residual module to extract, enhance and fuse features along dimensions such as receptive field, width and attention; meanwhile, skip connections and cascaded residuals bypass low-frequency information and integrate features from different depths of the network, yielding richer detail. By this method, images with richer detail and higher quality can be reconstructed.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings of the embodiments of the present invention or of the related prior art are described below. It should be understood that the drawings described below cover only some embodiments of the technical solutions of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram of a multi-scale cascaded attention residual network in an embodiment of the invention;
FIG. 2 is a diagram of an attention residual unit in an embodiment of the present invention;
FIG. 3 is a diagram of a channel attention unit and a spatial attention unit in an embodiment of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. Where "first" and "second" are used only to distinguish technical features, they are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or the precedence of the indicated technical features.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in FIG. 1, the present embodiment provides an image super-resolution reconstruction method based on a multi-scale cascade attention residual network. The method fuses and strengthens the extracted features through multi-scale cascade attention residual modules, learns deep residual features using a structure of cascaded residuals embedded in a global residual, and finally completes the reconstruction. It specifically comprises the following steps:
Step 1: shallow feature extraction is performed on the input image, specifically as follows:
using a feature extraction module consisting of one 3×3 convolutional layer, original features are extracted from the low-resolution input, as shown in the following formula:
F_0 = H_SFE(I_LR)  (1)
where H_SFE(·) denotes the convolution operation applied to low-resolution feature extraction, I_LR denotes the low-resolution input image, and F_0 denotes the shallow features extracted by the convolution.
Step 2: the attention residual units use convolution kernels of different sizes to extract features at different scales, so that the network can learn richer image information. As shown in FIG. 1, the shallow features are fed into a backbone network formed by m cascade residual groups and a global skip connection, where they undergo feature extraction and enhancement to obtain richer and deeper features. Each cascade residual group contains n multi-scale cascade attention residual modules; as shown in FIG. 1, each module contains 4 attention residual units, 1 error feedback fusion unit and 6 short skip connections. In this example m = 3 and n = 3, but the values of m and n do not limit the technical solution of the present invention. The attention residual units use 3×3 and 5×5 convolution kernels respectively, as shown in FIG. 2; the channel attention unit and spatial attention unit involved are shown in FIG. 3. As shown in FIG. 1, the error feedback fusion unit is a fusion module formed by four 3×3 convolutions each followed by ReLU, plus a feature concatenation unit and a 3×3 convolution; the feature compression unit is a 1×1 convolution, and residual learning makes the network easier to optimize. The specific process is as follows:
First, the shallow features extracted in step 1 pass through the attention residual units to obtain two different features; the two features are then fused by the error feedback fusion unit, and finally added to the shallow features to form a residual block. Specifically, the formulas are as follows:
F_{3×3,in1} = w_{1×1} * F_{m,n-1} + b  (2)
F_{3,1} = H_{RAB,3×3}(F_{3×3,in1}) + F_{3×3,in1}  (3)
F_{3×3,in2} = w_{1×1} * [F_{3,1}, F_{3×3,in1}] + b  (4)
F_{3,2} = H_{RAB,3×3}(F_{3×3,in2}) + F_{3×3,in2}  (5)
F_{3×3,in3} = w_{1×1} * [F_{3,2}, F_{3,1}, F_{3×3,in1}] + b  (6)
F_{5×5,in1} = w_{1×1} * F_{m,n-1} + b  (7)
F_{5,1} = H_{RAB,5×5}(F_{5×5,in1}) + F_{5×5,in1}  (8)
F_{5×5,in2} = w_{1×1} * [F_{5,1}, F_{5×5,in1}] + b  (9)
F_{5,2} = H_{RAB,5×5}(F_{5×5,in2}) + F_{5×5,in2}  (10)
F_{5×5,in3} = w_{1×1} * [F_{5,2}, F_{5,1}, F_{5×5,in1}] + b  (11)
F_{m,n} = H_Confusion(F_{3×3,in3}, F_{5×5,in3}) + F_{m,n-1}  (12)
where F_{3×3,in1}, F_{3×3,in2}, F_{3×3,in3} denote the input features at different stages of the 3×3 scale, F_{3,1}, F_{3,2} denote the intermediate features at the 3×3 scale, F_{5×5,in1}, F_{5×5,in2}, F_{5×5,in3} denote the input features at different stages of the 5×5 scale, F_{5,1}, F_{5,2} denote the intermediate features at the 5×5 scale, H_{RAB,3×3}, H_{RAB,5×5} denote the attention residual units with 3×3 and 5×5 convolution kernels respectively, and H_Confusion denotes the error feedback fusion unit. In this example, m ∈ [1, 3] and n ∈ [1, 3].
Step 3: the shallow features pass through a feature extraction network consisting of m cascade residual groups, a residual feature extraction trunk formed by one 3×3 convolution connected in series, and a global skip connection, to obtain the rich features required for upsampling. Specifically, the formulas are as follows:
F_m = H_{crir,m}(F_{m-1}) = H_{crir,m}(H_{crir,m-1}(…(H_{crir,1}(F_0))…))  (13)
F_Res = τ(w_{3×3} * F_m + b)  (14)
F_DF = F_0 + F_Res  (15)
where H_{crir,m} denotes the operation of the m-th cascade residual group, F_Res denotes the final residual features after one 3×3 convolutional layer, and F_DF denotes the deep features, finally composed of the shallow features and the residual features.
Step 4: a sub-pixel convolution layer is used to upsample the deep-level features extracted by the backbone network, producing the features for the high-resolution image. The expression is as follows:
F_up = H_Sub_pixel(F_DF)  (16)
where H_Sub_pixel denotes the upsampling operation with sub-pixel convolution, and F_up denotes the upsampled output features.
Step 5: image reconstruction, specifically as follows: the high-resolution image I_SR is reconstructed from the upsampling-predicted features. The expression is as follows:
I_SR = H_R(F_up)  (17)
where H_R denotes the convolution operation for reconstructing the high-resolution image I_SR.
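Assembling the sketches above end to end gives the following illustrative model; m = n = 3 and the 64 feature channels follow this embodiment, while the ×4 scale and the 48×48 input patch are assumptions.

class MSCascadeAttentionSRNet(nn.Module):
    """End-to-end sketch of steps 1-5 of the method."""
    def __init__(self, nf: int = 64, m: int = 3, n: int = 3, scale: int = 4):
        super().__init__()
        mcrab = lambda: MSCascadeAttentionResidual(nf)
        group = lambda: CascadeResidualGroup(mcrab, num_blocks=n, num_features=nf)
        self.sfe = ShallowFeatureExtractor(num_features=nf)
        self.backbone = Backbone(group, m=m, nf=nf)
        self.up = SubPixelUpsampler(nf=nf, scale=scale)
        self.rec = Reconstructor(nf=nf)

    def forward(self, i_lr: torch.Tensor) -> torch.Tensor:
        f0 = self.sfe(i_lr)        # step 1: shallow features
        f_df = self.backbone(f0)   # steps 2-3: deep features
        f_up = self.up(f_df)       # step 4: sub-pixel upsampling
        return self.rec(f_up)      # step 5: reconstruction

# Example: a 48x48 LR patch is reconstructed at x4 scale.
model = MSCascadeAttentionSRNet()
sr = model(torch.randn(1, 3, 48, 48))  # -> shape (1, 3, 192, 192)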
In summary, compared with the prior art, the invention has the following advantages and effects:
(1) The embodiment of the invention adopts a multi-scale cascade attention residual module to extract features at multiple scales from the image. With the number of parameters unchanged, the channel features before the activation function are widened; channel attention and spatial attention enhance the channel features and spatial features in the residual respectively; finally, the error feedback fusion unit uses the error between scales to strengthen the features of each scale and fuses the multi-scale enhanced features, so that the extracted features are richer.
(2) By adopting the structure of cascaded residuals embedded in a global residual, the network can bypass the low-frequency information in the low-resolution input, learn more high-frequency residual information and acquire rich detailed features. A very deep network need not be built, and a high-quality high-resolution reconstructed image can still be obtained.
The embodiment also provides an image super-resolution reconstruction apparatus, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The image super-resolution reconstruction device of the embodiment can execute the image super-resolution reconstruction method provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
The embodiment also provides a storage medium, which stores an instruction or a program capable of executing the image super-resolution reconstruction method provided by the embodiment of the method of the invention, and when the instruction or the program is executed, the method can be executed by any combination of the embodiment of the method, and the method has corresponding functions and beneficial effects.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be understood that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An image super-resolution reconstruction method, characterized by comprising the following steps:
extracting shallow features from the low-resolution input image;
extracting, fusing and enhancing the shallow features through a backbone network composed of m cascade residual groups, built from multi-scale cascade attention residual modules, and a global skip connection, to obtain deep features;
upsampling the deep features using sub-pixel convolution;
and reconstructing the image from the upsampled features to obtain a higher-resolution image.
2. The image super-resolution reconstruction method according to claim 1, wherein the extracting shallow features from the low-resolution input image comprises:
defining a feature extraction component composed of one convolutional layer, and extracting original features from the low-resolution input image, specifically as shown in the following formula:
F_0 = H_SFE(I_LR)  (1)
where H_SFE(·) denotes the convolution operation applied to low-resolution feature extraction, I_LR denotes the low-resolution input image, and F_0 denotes the shallow features extracted by the convolution.
3. The image super-resolution reconstruction method according to claim 1, wherein the cascade residual group comprises n multi-scale cascade attention residual modules, n feature concatenation units, n feature compression units, n short skip connections and 1 local skip connection, and each feature compression unit is a 1×1 convolution.
4. The image super-resolution reconstruction method according to claim 3, wherein the formula expression of the cascade residual group is as follows:
F_{m,1} = H_MCRAB(F_{m-1})  (2)
F_{m,2} = H_MCRAB(w_{1×1} * [F_{m,1}, F_{m-1}] + b)  (3)
F_{m,n} = H_MCRAB(w_{1×1} * [F_{m,n-1}, F_{m,n-2}] + b)  (4)
F_m = F_{m,n} + F_{m-1}  (5)
where F_{m,n} denotes the output feature of the n-th multi-scale cascade attention residual module in the m-th cascade residual group, H_MCRAB denotes the operation of the multi-scale cascade attention residual module, F_m denotes the output feature of the m-th cascade residual group, [·] denotes feature concatenation, w_{1×1} denotes the weights of the 1×1 convolution, and b denotes the bias of the convolution kernel.
5. The image super-resolution reconstruction method according to claim 3, wherein the multi-scale cascade attention residual module comprises attention residual units, an error feedback fusion unit, skip connections and cascade operations;
the formula expression of the multi-scale cascade attention residual module is as follows:
F_{3×3,in1} = w_{1×1} * F_{m,n-1} + b  (6)
F_{3,1} = H_{RAB,3×3}(F_{3×3,in1}) + F_{3×3,in1}  (7)
F_{3×3,in2} = w_{1×1} * [F_{3,1}, F_{3×3,in1}] + b  (8)
F_{3,2} = H_{RAB,3×3}(F_{3×3,in2}) + F_{3×3,in2}  (9)
F_{3×3,in3} = w_{1×1} * [F_{3,2}, F_{3,1}, F_{3×3,in1}] + b  (10)
F_{5×5,in1} = w_{1×1} * F_{m,n-1} + b  (11)
F_{5,1} = H_{RAB,5×5}(F_{5×5,in1}) + F_{5×5,in1}  (12)
F_{5×5,in2} = w_{1×1} * [F_{5,1}, F_{5×5,in1}] + b  (13)
F_{5,2} = H_{RAB,5×5}(F_{5×5,in2}) + F_{5×5,in2}  (14)
F_{5×5,in3} = w_{1×1} * [F_{5,2}, F_{5,1}, F_{5×5,in1}] + b  (15)
F_{m,n} = H_Confusion(F_{3×3,in3}, F_{5×5,in3}) + F_{m,n-1}  (16)
where F_{3×3,in1}, F_{3×3,in2}, F_{3×3,in3} denote the input features at different stages of the 3×3 scale, F_{3,1}, F_{3,2} denote the intermediate features at the 3×3 scale, F_{5×5,in1}, F_{5×5,in2}, F_{5×5,in3} denote the input features at different stages of the 5×5 scale, F_{5,1}, F_{5,2} denote the intermediate features at the 5×5 scale, H_{RAB,3×3}, H_{RAB,5×5} denote the attention residual units with 3×3 and 5×5 convolution kernels respectively, H_Confusion denotes the error feedback fusion unit, F_{m,n} denotes the output feature of the n-th multi-scale cascade attention residual module in the m-th cascade residual group, [·] denotes feature concatenation, w_{1×1} denotes the weights of the 1×1 convolution, and b denotes the bias of the convolution kernel.
6. The image super-resolution reconstruction method according to claim 5, wherein the attention residual unit adopts wide activation, obtaining wider channel features without changing the number of parameters; a channel attention module enhances the channel features before activation, and finally a spatial attention unit applies spatial feature enhancement to the residual, both attention mechanisms combining average pooling and max pooling;
the formula expression of the attention residual unit is as follows:
y = τ(H_CA(w_{k×k} * x + b))  (17)
F_r = H_SA(w_{k×k} * y + b)  (18)
where x and y denote the input and output features respectively, τ denotes the nonlinear activation function ReLU, w_{k×k} denotes the weights of the k×k convolution, H_CA denotes the channel attention unit, and H_SA denotes the spatial attention unit. The formula expression of each attention unit is as follows:
H_CA = σ(H_FC(τ(H_FC(P_Avg(x) + P_Max(x))))) * x  (19)
H_SA = σ(w_{7×7} * [P_Avg(x), P_Max(x)] + b) * x  (20)
where σ denotes the nonlinear activation function Sigmoid, w_{7×7} denotes the weights of the 7×7 convolution, P_Avg denotes the average pooling operation, P_Max denotes the max pooling operation, and H_FC denotes a fully connected layer.
7. The image super-resolution reconstruction method according to claim 5, wherein the formula expression of the error feedback fusion unit is as follows:
f_{feedback_3} = τ(w_{3×3} * F_{3×3,in3} + b) - τ(w_{3×3} * F_{5×5,in3} + b)  (21)
F_{3×3} = τ(w_{3×3} * f_{feedback_3} + b) + F_{3×3,in3}  (22)
f_{feedback_5} = τ(w_{3×3} * F_{5×5,in3} + b) - τ(w_{3×3} * F_{3×3,in3} + b)  (23)
F_{5×5} = τ(w_{3×3} * f_{feedback_5} + b) + F_{5×5,in3}  (24)
F_Confusion = w_{3×3} * [F_{3×3}, F_{5×5}] + b  (25)
where f_{feedback_3} and f_{feedback_5} denote the error feedback between the different scales, F_{3×3} and F_{5×5} denote the fused features after error feedback, τ denotes the nonlinear activation function ReLU, w_{3×3} denotes the weights of the 3×3 convolution, and F_Confusion denotes the multi-scale residual fusion feature.
8. The image super-resolution reconstruction method according to claim 1, wherein extracting, fusing and enhancing the shallow features through a backbone network composed of m cascade residual groups, built from multi-scale cascade attention residual modules, and a global skip connection to obtain the deep features comprises:
obtaining the deep features from the shallow features through a backbone network consisting of m cascade residual groups, a residual feature extraction trunk formed by one 3×3 convolution connected in series, and a global skip connection; the specific formulas are as follows:
F_m = H_{crir,m}(F_{m-1}) = H_{crir,m}(H_{crir,m-1}(…(H_{crir,1}(F_0))…))  (26)
F_Res = τ(w_{3×3} * F_m + b)  (27)
F_DF = F_0 + F_Res  (28)
where H_{crir,m} denotes the operation of the m-th cascade residual group, F_Res denotes the final residual features after one 3×3 convolutional layer, F_DF denotes the deep features, finally composed of the shallow features and the residual features, and F_0 denotes the shallow features.
9. The image super-resolution reconstruction method according to claim 1, wherein upsampling the deep features using sub-pixel convolution comprises:
upsampling the deep-level features extracted through the backbone network using one sub-pixel convolution layer; the specific formula is as follows:
F_up = H_Sub_pixel(F_DF)  (29)
where H_Sub_pixel denotes the upsampling operation with sub-pixel convolution, and F_up denotes the upsampled output features;
and wherein reconstructing the image from the upsampled features to obtain a higher-resolution image comprises:
reconstructing the high-resolution image I_SR from the upsampling-predicted features; the specific formula is as follows:
I_SR = H_R(F_up)  (30)
where H_R denotes the convolution operation for reconstructing the high-resolution image I_SR.
10. An image super-resolution reconstruction apparatus, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-9.
CN202210147765.2A 2022-02-17 2022-02-17 Image super-resolution reconstruction method and device Pending CN114581300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210147765.2A CN114581300A (en) 2022-02-17 2022-02-17 Image super-resolution reconstruction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210147765.2A CN114581300A (en) 2022-02-17 2022-02-17 Image super-resolution reconstruction method and device

Publications (1)

Publication Number Publication Date
CN114581300A (en) 2022-06-03

Family

ID=81773626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210147765.2A Pending CN114581300A (en) 2022-02-17 2022-02-17 Image super-resolution reconstruction method and device

Country Status (1)

Country Link
CN (1) CN114581300A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115358932A (en) * 2022-10-24 2022-11-18 山东大学 Multi-scale feature fusion face super-resolution reconstruction method and system
CN115546032A (en) * 2022-12-01 2022-12-30 泉州市蓝领物联科技有限公司 Single-frame image super-resolution method based on feature fusion and attention mechanism
CN116071243A (en) * 2023-03-27 2023-05-05 江西师范大学 Infrared image super-resolution reconstruction method based on edge enhancement
CN116503260A (en) * 2023-06-29 2023-07-28 北京建筑大学 Image super-resolution reconstruction method, device and equipment
CN116503260B (en) * 2023-06-29 2023-09-19 北京建筑大学 Image super-resolution reconstruction method, device and equipment
CN117173025A (en) * 2023-11-01 2023-12-05 华侨大学 Single-frame image super-resolution method and system based on cross-layer mixed attention transducer
CN117173025B (en) * 2023-11-01 2024-03-01 华侨大学 Single-frame image super-resolution method and system based on cross-layer mixed attention transducer


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination