CN111461987A - Network construction method, image super-resolution reconstruction method and system - Google Patents


Info

Publication number
CN111461987A
CN111461987A
Authority
CN
China
Prior art keywords
ffg
resolution
rcab
channels
rows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010250271.8A
Other languages
Chinese (zh)
Other versions
CN111461987B (en)
Inventor
孙旭
董晓宇
高连如
张兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202010250271.8A priority Critical patent/CN111461987B/en
Publication of CN111461987A publication Critical patent/CN111461987A/en
Application granted granted Critical
Publication of CN111461987B publication Critical patent/CN111461987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a network construction method, an image super-resolution reconstruction method and a system, comprising the following steps: constructing the SFnet by utilizing a preset first convolution layer, the SFS and an up-sampling module, and training the SFnet based on sample data to obtain a super-resolution network model; and inputting a first resolution image of M rows, N columns and A channels into the super-resolution network model for resolution improvement to obtain a second resolution image of r × M rows, r × N columns and A channels. In the scheme, in the process of processing the first resolution image, the SFS is utilized to extract and integrate multi-level information, exploiting the feature information of the first resolution image at both the global level and the local level, so that the super-resolution network model can improve the resolution of the first resolution image on the premise of ensuring high fidelity and obtain a second resolution image with high fidelity and high resolution.

Description

Network construction method, image super-resolution reconstruction method and system
Technical Field
The invention relates to the technical field of image processing, in particular to a network construction method, an image super-resolution reconstruction method and an image super-resolution reconstruction system.
Background
Super-resolution reconstruction (SR) refers to restoring a low-resolution image into a high-resolution image. With the development of science and technology, super-resolution reconstruction is widely applied in fields such as surveillance, medicine and remote sensing. Therefore, how to restore a low-resolution image into a high-resolution image has become an urgent problem to be solved.
Disclosure of Invention
In view of this, embodiments of the present invention provide a network construction method, an image super-resolution reconstruction method and an image super-resolution reconstruction system, so as to restore a low-resolution image to a high-resolution image.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
the first aspect of the embodiments of the present invention discloses a network construction method, which includes:
constructing a first convolution layer by using C convolution kernels with the size of t × t × A, wherein C and A are positive integers, and t is a positive odd number;
constructing a second-order feedforward structure SFS by utilizing G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer, wherein the FFG consists of B residual channel attention blocks RCAB, a second characteristic connection operation and a third convolution layer, the RCAB consists of a fourth convolution layer, a linear rectification function (ReLU) layer, a fifth convolution layer and a channel attention CA module, and G and B are positive integers;
connecting the output end of the first convolution layer with the input end of the SFS, and connecting the output end of the SFS with the input end of an up-sampling module to construct a second-order feedforward network SFnet, wherein the up-sampling module is used for executing an r-fold up-sampling operation, r is any real number greater than 1, and the convolution layer in the up-sampling module is composed of A × r² convolution kernels with the size of t × t × C;
and training the SFnet based on sample data to obtain a super-resolution network model.
Preferably, the constructing a second-order feedforward structure SFS by using the G first-order feedforward groups FFG, the first feature concatenation operation, and the second convolution layer includes:
sequentially connecting G FFGs, and respectively connecting an output end of each FFG with an input end of a first characteristic connection operation, wherein the input end of the 1 st FFG is connected with the input end of the first connection operation;
and connecting the output end of the first connection operation with the input end of a second convolution layer to construct the SFS, wherein the second convolution layer is formed by C convolution kernels with the size of t × t × (C × (G + 1)).
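As an illustration of the kernel shapes the construction above implies, the following sketch tabulates them for assumed toy hyper-parameters (t=3, A=3, C=64, G=10, B=20 are illustrative values, not fixed by the text):

```python
# Illustrative kernel-shape bookkeeping for the SFS construction. The
# hyper-parameter values below are assumptions for the example only.
# Each tuple is (number_of_kernels, height, width, input_channels).
t, A, C, G, B = 3, 3, 64, 10, 20

first_conv = (C, t, t, A)              # C kernels of t × t × A
second_conv = (C, t, t, C * (G + 1))   # fuses the G FFG outputs plus the SFS input
third_conv = (C, t, t, C * (B + 1))    # fuses the B RCAB outputs plus the FFG input

for name, (n, th, tw, cin) in [("first", first_conv),
                               ("second", second_conv),
                               ("third", third_conv)]:
    print(f"{name} conv: {n} kernels of {th}x{tw}x{cin}")
```

The (G + 1) and (B + 1) factors arise because each concatenation also carries the structure's own input feature alongside the G (or B) block outputs.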
Preferably, the process of constructing the FFG according to the B residual channel attention blocks RCAB, the second eigen-join operation, and the third convolution layer includes:
sequentially connecting B RCABs, and respectively connecting the output end of each RCAB with the input end of the second characteristic connection operation, wherein the input end of the 1 st RCAB is connected with the input end of the second characteristic connection operation;
and connecting the output end of the second characteristic connection operation with the input end of a third convolution layer to construct the FFG, wherein the third convolution layer is formed by C convolution kernels with the size of t × t × (C × (B + 1)).
Preferably, the process of constructing the RCAB from the fourth convolutional layer, the linear rectification function (ReLU) layer, the fifth convolutional layer, and the channel attention CA module includes:
sequentially connecting a fourth convolutional layer, a ReLU layer, a fifth convolutional layer and a CA module, wherein the input end of the fourth convolutional layer is connected with the output end of the CA module through an element-by-element summation unit to construct the RCAB, and the fourth convolutional layer and the fifth convolutional layer are each formed by C convolution kernels with the size of t × t × C.
Preferably, if a convolutional layer in the up-sampling module is composed of C × r² convolution kernels with the size of t × t × C, after connecting an output end of the first convolutional layer with an input end of the SFS and connecting an output end of the SFS with an input end of the up-sampling module, the method further includes:
and connecting the output end of the up-sampling module with the input end of a sixth convolutional layer to construct SFnet, wherein the sixth convolutional layer is composed of A convolutional kernels with the size of t × t × C.
The second aspect of the embodiment of the present invention discloses an image super-resolution reconstruction method, which is applied to a super-resolution network model constructed by the network construction method disclosed in the first aspect of the embodiment of the present invention, and the image super-resolution reconstruction method includes:
acquiring a first resolution image of M rows, N columns and A channels, wherein M, N and A are positive integers;
and inputting the first resolution image into the super-resolution network model for resolution improvement to obtain a second resolution image of r × M rows, r × N columns and A channels, wherein r is the improvement multiple of the resolution.
Preferably, the SFS is respectively connected to the first convolution layer and the up-sampling module, and the step of inputting the first resolution image into a super-resolution network model to perform resolution enhancement to obtain a second resolution image of r × M rows, r × N columns and A channels includes:
inputting the first resolution image into the first convolution layer to obtain initial features of M rows and N columns of C channels, wherein the first convolution layer is composed of C convolution kernels with the size of t × t × A, and C is a positive integer;
inputting the initial features into the SFS to obtain SFS features of M rows and N columns of C channels;
and inputting the SFS features into the up-sampling module to obtain a second resolution image of r × M rows, r × N columns and A channels, wherein the convolution layers in the up-sampling module are formed by A × r² convolution kernels with the size of t × t × C.
Preferably, G FFGs are sequentially connected and respective output ends thereof are respectively connected with a first feature connection operation, a first convolution layer is respectively connected with the first feature connection operation and the 1 st FFG, a second convolution layer is respectively connected with the first feature connection operation and the up-sampling module, the inputting of the initial feature into the SFS to obtain the SFS feature of M rows and N columns of C channels includes:
inputting the initial characteristic into the 1 st FFG to obtain the FFG characteristic of the C channels of M rows and N columns output by the 1 st FFG, and inputting the initial characteristic into the first characteristic connection operation;
inputting the FFG characteristic output by the y-th FFG into the (y+1)-th FFG to obtain the FFG characteristic output by the (y+1)-th FFG, and inputting the FFG characteristic output by the y-th FFG into the first characteristic connection operation, wherein y is an integer which is greater than or equal to 1 and less than or equal to G, and when y is equal to G, inputting the FFG characteristic output by the G-th FFG into the first characteristic connection operation;
integrating the initial features and all the FFG features by utilizing the first feature connection operation to obtain FFG integrated features of M rows, N columns and C × (G + 1) channels;
and inputting the FFG integrated features into the second convolution layer to obtain SFS features of M rows and N columns of C channels, wherein the second convolution layer is composed of C convolution kernels with the size of t × t × (C × (G + 1)).
Preferably, the B RCABs are sequentially connected and respective output ends thereof are respectively connected to a second feature connection operation, the second feature connection operation is connected to a third convolution layer, the inputting of the initial feature into the 1 st FFG to obtain the FFG feature of the M rows and N columns of C channels output by the 1 st FFG includes:
inputting the initial characteristic into the 1 st RCAB to obtain the RCAB characteristic of the C channels of M rows and N columns output by the 1 st RCAB, and inputting the initial characteristic into the second characteristic connection operation;
inputting the RCAB characteristic output by the z-th RCAB into the (z+1)-th RCAB to obtain the RCAB characteristic output by the (z+1)-th RCAB, and inputting the RCAB characteristic output by the (z+1)-th RCAB into the second characteristic connection operation, wherein z is an integer which is greater than or equal to 1 and less than or equal to B, and when z is equal to B, the RCAB characteristic output by the B-th RCAB is input into the second characteristic connection operation;
integrating the initial features and all the RCAB features by utilizing the second feature connection operation to obtain RCAB integrated features of M rows, N columns and C × (B + 1) channels;
and inputting the RCAB integrated features into the third convolution layer to obtain the FFG features of M rows and N columns of C channels output by the 1st FFG, wherein the third convolution layer is composed of C convolution kernels with the size of t × t × (C × (B + 1)).
Preferably, the fourth convolutional layer, the ReLU layer, the fifth convolutional layer and the CA module are sequentially connected, and the inputting of the initial features into the 1st RCAB to obtain the RCAB features of the M rows and N columns of C channels output by the 1st RCAB includes:
inputting the initial features into the fourth convolution layer to obtain first sub-features of M rows and N columns of C channels;
processing the first sub-feature by using the ReLU layer to obtain a second sub-feature of M rows and N columns of C channels;
inputting the second sub-features into the fifth convolution layer to obtain third sub-features of M rows and N columns of C channels;
processing the third sub-feature by using the CA module to obtain a fourth sub-feature of M rows and N columns of C channels;
performing element-by-element summation calculation on the fourth sub-feature and the initial feature to obtain RCAB features of M rows and N columns of C channels;
the fourth convolutional layer and the fifth convolutional layer are composed of C convolutional kernels of size t × t × C.
Preferably, if the convolution layers in the up-sampling module are composed of C × r² convolution kernels with the size of t × t × C, the SFnet further includes a sixth convolutional layer, and after the initial features are input into the SFS to obtain SFS features of M rows and N columns of C channels, the method further includes:
inputting the SFS features into the up-sampling module to obtain up-sampling module features of r × M rows, r × N columns and C channels;
and inputting the up-sampling module features into the sixth convolutional layer to obtain a second resolution image of r × M rows, r × N columns and A channels, wherein the sixth convolutional layer is composed of A convolution kernels with the size of t × t × C.
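The shape flow of this reconstruction path can be traced with a small bookkeeping sketch; the values M=32, N=48, A=3, C=64, r=2 below are illustrative assumptions, not values from the patent:

```python
# Shape bookkeeping for the reconstruction pipeline, following the variant in
# which the up-sampling module keeps C channels and a sixth convolutional
# layer maps back to A channels. All sizes are toy assumptions.
M, N, A, C, r = 32, 48, 3, 64, 2

shapes = {
    "input image": (M, N, A),
    "first conv": (M, N, C),                  # C kernels of t × t × A
    "SFS features": (M, N, C),                # spatial size is preserved
    "up-sampling module": (r * M, r * N, C),  # conv to C·r² channels + pixel shuffle
    "sixth conv (output)": (r * M, r * N, A), # A kernels of t × t × C
}
for stage, shape in shapes.items():
    print(f"{stage:20s} -> {shape}")
```

Only the up-sampling module changes the spatial size; every other stage preserves M × N and only adjusts the channel count.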
A third aspect of the embodiments of the present invention discloses a network construction system, including:
the first building unit is used for building a first convolution layer by utilizing C convolution kernels with the size of t × t × A, wherein C and A are positive integers, and t is a positive odd number;
the second construction unit is used for constructing a second-order feedforward structure SFS by utilizing G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer, wherein the FFG consists of B residual error channel attention blocks RCAB, a second characteristic connection operation and a third convolution layer, the RCAB consists of a fourth convolution layer, a linear rectification function Re L U layer, a fifth convolution layer and a channel attention CA module, and G and B are positive integers;
a third constructing unit, configured to connect an output end of the first convolution layer to an input end of the SFS, and connect an output end of the SFS to an input end of an upsampling module, so as to construct a second-order feedforward network SFnet, where the upsampling module is configured to perform r-fold upsampling operation, r is any real number greater than 1, and the convolution layer in the upsampling module is formed by (a × r) convolution kernels with a size of t × t × C;
and the training unit is used for training the SFnet based on the sample data to obtain a super-resolution network model.
The fourth aspect of the present invention discloses an image super-resolution reconstruction system, which is applied to a super-resolution network model constructed by the network construction method disclosed in the first aspect of the present invention, and the image super-resolution reconstruction system includes:
an acquisition unit, configured to acquire a first resolution image of M rows, N columns and A channels, wherein M, N and A are positive integers;
and a processing unit, configured to input the first resolution image into the super-resolution network model for resolution improvement to obtain a second resolution image of r × M rows, r × N columns and A channels, wherein r is the improvement multiple of the resolution.
Based on the network construction method, the image super-resolution reconstruction method and the system provided by the embodiments of the invention, the method includes: constructing the SFnet by utilizing a preset first convolution layer, the SFS and an up-sampling module, and training the SFnet based on sample data to obtain a super-resolution network model; and inputting a first resolution image of M rows, N columns and A channels into the super-resolution network model for resolution improvement to obtain a second resolution image of r × M rows, r × N columns and A channels. In this scheme, in the process of processing the first resolution image, the SFS is utilized to extract and integrate multi-level information and to exploit the feature information of the first resolution image at both the global level and the local level, so that the super-resolution network model can improve the resolution of the first resolution image on the premise of ensuring high fidelity and obtain a second resolution image with high fidelity and high resolution.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a convolution kernel according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a convolutional layer according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a CA module according to an embodiment of the present invention;
fig. 4 is a flowchart of a network construction method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an SFnet according to an embodiment of the present invention;
fig. 6 is another schematic structural diagram of an SFnet according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an RCAB according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an FFG according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an SFS provided by an embodiment of the present invention;
FIG. 10 is a flowchart of a super-resolution image reconstruction method according to an embodiment of the present invention;
FIG. 11 is a flowchart of obtaining a second resolution image according to an embodiment of the present invention;
FIG. 12 is a flow chart for obtaining SFS characteristics according to an embodiment of the present invention;
FIG. 13 is a flow chart for obtaining the 1 st FFG feature provided by the embodiments of the present invention;
fig. 14 is a block diagram illustrating a network construction system according to an embodiment of the present invention;
fig. 15 is a block diagram of a super-resolution image reconstruction system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
As can be known from the background art, the resolution reconstruction technology is widely applied to various fields such as monitoring, medicine and remote sensing, and how to restore a low-resolution image into a high-resolution image is a problem that needs to be solved urgently today.
Therefore, the embodiments of the present invention provide a network construction method, an image super-resolution reconstruction method and a system, in which a first resolution image of M rows, N columns and A channels is input into a pre-trained super-resolution network model for resolution improvement to obtain a second resolution image of r × M rows, r × N columns and A channels, so that a low-resolution image is restored into a high-resolution image.
It is understood that the low-resolution image refers to an image with a resolution lower than a first preset resolution, and the high-resolution image refers to an image with a resolution higher than a second preset resolution, and the definitions of the low-resolution image and the high-resolution image are not limited specifically herein.
In order to better understand the contents in the embodiments of the present invention, some of the operations related to the embodiments of the present invention are described in the following. Note that the contents of the 12 operations shown below are for illustration only.
The 1st operation content: convolution kernel. 1 convolution kernel with the size of t × t × C is a three-dimensional array of t × t × C, namely t × t × C weight values for performing the convolution operation, where C is a positive integer and t is a positive odd number.
The structure of the convolution kernel is shown in fig. 1, where the cuboid in fig. 1 indicates a three-dimensional array of t × t × C.
The 2nd operation content: convolution layer (Conv). n convolution kernels with the size of t × t × C form 1 convolution layer; combined with the 1st operation content, all convolution kernels in the convolution layer hold n × t × t × C weights for the convolution operation, that is, the convolution layer is a four-dimensional array w of n × t × t × C weights (w represents the convolution layer).
The convolutional layer structure is shown in fig. 2; in fig. 2, the convolutional layer is unrolled into n convolution kernels with the size of t × t × C.
The 3rd operation content: convolution. The convolution operation is carried out on an image X of M rows, N columns and C channels by using 1 convolution layer w consisting of n convolution kernels with the size of t × t × C; the convolution operation is as in formula (1), and the output image Y is an image of M rows, N columns and n channels.
Y = F_Conv(X, w) (1)
Note that M, N and n are positive integers, M is the length of image X, N is the width of image X, and channels may also be referred to as bands.
For example, for an image in RGB (Red-Green-Blue) format, the image has three bands of red, green and blue, i.e. the RGB image has the size of M rows, N columns and 3 bands (channels).
The 4th operation content: feature connection operation. The feature connection operation is: for n images X_1, X_2, …, X_n, each of M rows, N columns and C channels, the feature connection operation is performed on the images as shown in formula (2), and the output image Y is an image of M rows, N columns and n × C channels.
Y = F_Concatenate(X_1, X_2, …, X_n) (2)
For example: assuming that the sizes of the image 1 (feature 1) and the image 2 (feature 2) are both M rows and N columns of 64 channels, the feature connection operation is performed on the image 1 and the image 2, and the size of the obtained image Y is M rows and N columns of 128 channels.
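The feature connection operation of formula (2) can be reproduced in a short numpy sketch (toy sizes assumed); the 64 + 64 = 128 channel count matches the example above:

```python
import numpy as np

# Feature concatenation: two M×N×64 features stacked along the channel axis
# give one M×N×128 feature. Sizes here are toy assumptions.
M, N, C = 4, 6, 64
feature1 = np.zeros((M, N, C))
feature2 = np.ones((M, N, C))

Y = np.concatenate([feature1, feature2], axis=-1)  # channels: 64 + 64
print(Y.shape)  # (4, 6, 128)
```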
The 5th operation content: linear rectification function (Rectified Linear Unit, ReLU). For a value x, ReLU is defined as formula (3).
y = f_ReLU(x) = max(0, x) (3)
For a vector x = [x_1, …, x_c, …, x_C], ReLU is defined as formula (4), where the vector y = [y_1, …, y_c, …, y_C] and y_c = f_ReLU(x_c) = max(0, x_c).
y = F_ReLU(x) (4)
For an image X = [x_ijc]_{M×N×C} of M rows, N columns and C channels, ReLU is defined as formula (5), where the image Y = [y_ijc]_{M×N×C} and y_ijc = f_ReLU(x_ijc) = max(0, x_ijc).
Y = F_ReLU(X) (5)
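A minimal numpy sketch of formulas (3)-(5): the same max(0, x) rule applied element-wise to a scalar, a vector, and a small image (toy sizes assumed):

```python
import numpy as np

# ReLU applied element-wise at three granularities, as in formulas (3)-(5).
def f_relu(x):
    return np.maximum(0, x)

print(f_relu(-3.5))                        # scalar: 0.0
print(f_relu(np.array([-1.0, 0.0, 2.0])))  # vector: [0. 0. 2.]

X = np.arange(-6, 6).reshape(2, 3, 2)  # a tiny 2-row, 3-column, 2-channel image
Y = f_relu(X)
print(Y.min(), Y.shape)                # negatives clipped to 0, shape unchanged
```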
The 6th operation content: sigmoid function, whose role is to map the input quantity into the interval (0, 1).
For a value x, the sigmoid function is defined as formula (6), where y ∈ (0, 1).
y = f_sigmoid(x) = 1 / (1 + e^(−x)) (6)
For a vector x = [x_1, …, x_c, …, x_C], the sigmoid function is defined as formula (7), where y = [y_1, …, y_c, …, y_C], y_c = f_sigmoid(x_c) and y_c ∈ (0, 1).
y = F_sigmoid(x) (7)
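A short numpy check of formulas (6)-(7): every input is mapped into (0, 1), which is what lets the CA module turn channel statistics into per-channel weights:

```python
import numpy as np

# Sigmoid maps any real input into (0, 1); large magnitudes saturate.
def f_sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

print(f_sigmoid(0.0))                  # 0.5
y = f_sigmoid(np.array([-100.0, 0.0, 100.0]))
print(y[0] < 1e-6, y[2] > 1.0 - 1e-6)  # saturates toward 0 and 1
```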
The 7th operation content: element-by-element summation. Element-by-element summation means: for an image X = [x_ijc]_{M×N×C} of M rows, N columns and C channels and an image Y = [y_ijc]_{M×N×C} of M rows, N columns and C channels, the element-by-element summation operation (pixel-by-pixel summation) between the two is denoted as Z = X ⊕ Y.
The operation result Z = [z_ijc]_{M×N×C} of the element-by-element summation is still an image of M rows, N columns and C channels, where z_ijc = x_ijc + y_ijc.
The 8th operation content: element-by-element multiplication. Element-by-element multiplication means: for a vector x = [x_1, …, x_c, …, x_C] with dimension C and a vector y = [y_1, …, y_c, …, y_C], the element-by-element multiplication between the two is denoted as z = x ⊗ y, and the operation result is z = [z_1, …, z_c, …, z_C], where z_c = x_c × y_c.
For a vector x = [x_1, …, x_c, …, x_C] with dimension C and an image Y = [y_ijc]_{M×N×C} of M rows, N columns and C channels, the element-by-element multiplication between the two is denoted as Z = x ⊗ Y, where z_ijc = x_c × y_ijc.
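Both operations can be sketched in numpy (toy sizes assumed); the vector-times-image case is the one the CA module relies on, where a C-dimensional weight vector scales every pixel of the matching channel:

```python
import numpy as np

# Element-by-element summation (7th operation) and the vector-times-image
# element-by-element multiplication (8th operation), via broadcasting.
M, N, C = 2, 3, 4
X = np.ones((M, N, C))
Y = np.full((M, N, C), 2.0)

Z_sum = X + Y                       # z_ijc = x_ijc + y_ijc, shape unchanged
print(Z_sum.shape, Z_sum[0, 0, 0])  # (2, 3, 4) 3.0

w = np.arange(C, dtype=float)       # per-channel weights [0, 1, 2, 3]
Z_mul = w * X                       # z_ijc = x_c * y_ijc, broadcast over rows/columns
print(Z_mul[0, 0].tolist())         # [0.0, 1.0, 2.0, 3.0]
```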
The 9th operation content: sub-pixel convolution. The sub-pixel convolution performs merging and pixel rearrangement (pixel shuffle) operations on different channels of an image, sacrificing the number of image channels to enlarge the two dimensions of image length and width. When the resolution of the image is increased by a factor of r, for an image X of M rows, N columns and C × r² channels, the operation of the sub-pixel convolution is as in formula (8).
Y = F_PixelShuffle(X, r) (8)
In formula (8), r is the resolution enhancement factor (also referred to as the image upscaling factor or up-sampling rate), Y is the result of the sub-pixel convolution, and the size of Y is r × M rows, r × N columns and C channels.
For example, for an image X with the size of M rows, N columns and 1 × 2² channels, the operation in formula (8) is performed with r = 2, and the size of the operation result Y is 2M rows, 2N columns and 1 channel.
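A minimal channel-last pixel shuffle matching formula (8) can be written in numpy. The channel grouping order used here is one common depth-to-space convention, an assumption rather than something the text pins down:

```python
import numpy as np

# Pixel shuffle: an M×N image with C·r² channels becomes an (r·M)×(r·N)
# image with C channels.
def pixel_shuffle(X, r):
    M, N, Cr2 = X.shape
    C = Cr2 // (r * r)
    X = X.reshape(M, N, r, r, C)     # split channels into an r×r sub-pixel block
    X = X.transpose(0, 2, 1, 3, 4)   # interleave the blocks into rows and columns
    return X.reshape(M * r, N * r, C)

X = np.arange(3 * 4 * 4, dtype=float).reshape(3, 4, 4)  # M=3, N=4, C·r² = 1·2²
Y = pixel_shuffle(X, 2)
print(Y.shape)  # (6, 8, 1): 2M rows, 2N columns, 1 channel
```

This matches the worked example above: r = 2 trades 2² = 4 channels for a doubling of both spatial dimensions.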
The 10th operation content: channel-by-channel global average pooling. The operation of channel-by-channel global average pooling can be denoted as z = F_GP(X); for an image X = [x_ijc]_{M×N×C} of M rows, N columns and C channels, a C-dimensional channel-level statistical vector z = [z_1, …, z_c, …, z_C] can be obtained by the channel-by-channel global average pooling operation, and z_c is calculated as in formula (9).
z_c = (1 / (M × N)) × Σ_{i=1}^{M} Σ_{j=1}^{N} x_ijc (9)
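Formula (9) reduces each M×N channel to its mean; a numpy sketch with toy values makes this concrete:

```python
import numpy as np

# Channel-by-channel global average pooling: each entry z_c of the
# C-dimensional statistic is the mean over one M×N channel.
M, N = 4, 6
X = np.stack([np.full((M, N), 1.0),
              np.full((M, N), 2.0),
              np.full((M, N), 3.0)], axis=-1)  # M×N image with C = 3 channels

z = X.mean(axis=(0, 1))  # z_c = (1 / (M·N)) * sum over i, j of x_ijc
print(z.tolist())        # [1.0, 2.0, 3.0]
```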
The 11th operation content: channel attention (CA) module. For an image X = [x_ijc]_{M×N×C} of M rows, N columns and C channels, the CA module is denoted as X̃ = F_CA(X, w_up, w_down), which is specifically defined in formula (10).
X̃ = F_sigmoid(F_Conv(F_ReLU(F_Conv(F_GP(X), w_down)), w_up)) ⊗ X (10)
In formula (10), w_down is a convolution layer of C/r′ convolution kernels with the size of 1 × 1 × C, w_up is a convolution layer of C convolution kernels with the size of 1 × 1 × C/r′, and r′ is the vector dimension transformation factor in the CA module.
A schematic diagram of the structure of the CA module is shown in fig. 3.
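Because w_down and w_up are 1 × 1 convolutions acting on the pooled C-vector, they reduce to plain matrix multiplies, so the CA chain of formula (10) can be sketched in a few lines of numpy. The sizes (M = N = 4, C = 8, r′ = 2) and random weights are assumptions for the example:

```python
import numpy as np

# Numpy sketch of the CA module: global pooling -> w_down -> ReLU -> w_up
# -> sigmoid -> per-channel rescaling of the input image.
M, N, C, r_prime = 4, 4, 8, 2
rng = np.random.default_rng(1)
X = rng.standard_normal((M, N, C))
w_down = rng.standard_normal((C // r_prime, C))  # C -> C/r'
w_up = rng.standard_normal((C, C // r_prime))    # C/r' -> C

z = X.mean(axis=(0, 1))                                          # F_GP(X)
s = 1.0 / (1.0 + np.exp(-(w_up @ np.maximum(0.0, w_down @ z))))  # sigmoid(ReLU chain)
X_tilde = s * X                                                  # element-wise channel rescaling
print(X_tilde.shape, bool(((s > 0) & (s < 1)).all()))            # shape kept; weights in (0, 1)
```

The bottleneck through C/r′ dimensions keeps the attention branch cheap while still letting every channel weight depend on all channel statistics.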
The contents of the 12 th operation: an upsampling module (upscalemodule). When the resolution improvement multiple of the image is r, the up-sampling module is composed of n-C r2Convolutional layer F consisting of convolutional kernels of size t × t × CConv(X, w) convolution with sub-pixel FPixelShuffle(X, r). That is, the upsampling module may perform r times spatial size enlargement (also called resolution enhancement and upsampling) on the image (feature) of the input self, and the upsampling module is denoted as Fup(X,r)。
For image X with M rows, N columns and C channels, the upsampling operation of the upsampling module is as in equation (11).
Y=Fup(X,w,r)=FPixelShuffle(FConv(X,w),r)(11)
The result Y of equation (11) has r × M rows, r × N columns, and C channels.

For example, the 2× upsampling module F_up(X, 2) consists of a convolutional layer F_Conv(X, w) composed of C·2² = 4C convolution kernels of size t × t × C followed by a sub-pixel convolution F_PixelShuffle(X, 2); for an image (feature) X with M rows, N columns, and C channels, the result Y of equation (11) has size 2M × 2N × C.

It should be noted that 4× upsampling is implemented indirectly as two successive 2× upsampling operations, i.e., Y = F_up(X, 4) = F_up(F_up(X, 2), 2). Other upsampling factors follow analogously and are not illustrated here.
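The sub-pixel (pixel-shuffle) rearrangement at the heart of the upsampling module can be sketched in NumPy as follows. The channel ordering shown (channel-major within each r × r block) is one common convention and is an assumption here, as the text does not pin it down; the toy convolution is likewise a hypothetical stand-in:

```python
import numpy as np

def pixel_shuffle(x, r):
    """F_PixelShuffle(X, r): rearrange (M, N, C*r^2) -> (r*M, r*N, C).

    Each group of r^2 channels at one spatial position becomes an
    r x r spatial block."""
    M, N, Cr2 = x.shape
    C = Cr2 // (r * r)
    return (x.reshape(M, N, r, r, C)
             .transpose(0, 2, 1, 3, 4)     # (M, r, N, r, C)
             .reshape(M * r, N * r, C))

def upsample(x, conv, r):
    """F_up(X, r): a convolution to C*r^2 channels, then pixel shuffle;
    `conv` is any callable standing in for F_Conv."""
    return pixel_shuffle(conv(x), r)

# One 2x shuffle: M = N = 2, C = 1, r = 2.
x = np.arange(16, dtype=float).reshape(2, 2, 4)
y = pixel_shuffle(x, 2)                    # shape (4, 4, 1)

# 4x upsampling as two 2x steps, as in the text; the toy conv just
# replicates the single channel r^2 = 4 times.
toy_conv = lambda t: np.repeat(t, 4, axis=-1)
y4 = upsample(upsample(y, toy_conv, 2), toy_conv, 2)
```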
Referring to fig. 4, a flowchart of a network construction method provided in an embodiment of the present invention is shown, where the network construction method includes:
step S401, a first convolution layer is constructed by utilizing C convolution kernels with the size of t × t × A.
C and A are positive integers, and t is a positive odd number.

As noted above, a convolutional layer is composed of multiple convolution kernels; in implementing step S401, the first convolutional layer is constructed from C convolution kernels of size t × t × A.
Step S402: construct a second-order feed-forward structure (SFS) using G first-order feed-forward groups (FFGs), a first feature connection operation, and a second convolutional layer.

Note that each FFG is composed of B residual channel attention blocks (RCABs), a second feature connection operation, and a third convolutional layer; each RCAB is composed of a fourth convolutional layer, a ReLU layer, a fifth convolutional layer, and a CA module; G and B are positive integers.

In implementing step S402, an RCAB is first constructed from the fourth convolutional layer, the ReLU layer, the fifth convolutional layer, and the CA module.

That is, the RCABs and FFGs are constructed in advance, and the SFS is then constructed from the G FFGs, the first feature connection operation, and the second convolutional layer.
Step S403: and connecting the output end of the first convolution layer with the input end of the SFS, and connecting the output end of the SFS with the input end of the up-sampling module to construct a second-order feed forward network (SFnet).
It should be noted that the upsampling module is configured to perform an upsampling operation by a factor of r, where r is any real number greater than 1.
In the process of implementing step S403, the output terminal of the first convolution layer is connected to the input terminal of the SFS, and the output terminal of the SFS is connected to the input terminal of the up-sampling module, so as to construct SFnet.
That is, when SFnet processes an image, an image of M rows, N columns, and A channels is input to the first convolutional layer; after processing by the first convolutional layer, the SFS, and the upsampling module, the network outputs a processed image of r × M rows, r × N columns, and A channels.

When an image of M rows, N columns, and A channels is input into SFnet, the number of channels of the image output by SFnet must remain unchanged, i.e., it must also be A. Combined with the 12th operation content above, the convolutional layer in the upsampling module is then composed of A·r² convolution kernels of size t × t × C.

Alternatively, if the convolutional layer in the upsampling module is composed of C·r² convolution kernels of size t × t × C, i.e., the feature (image) output by the upsampling module has C channels, then, since the image output by SFnet must still have A channels, a sixth convolutional layer composed of A convolution kernels of size t × t × C can be connected after the upsampling module.
Referring to fig. 6 in conjunction with fig. 5, another schematic structural diagram of the SFnet is shown, and the SFnet further includes: a sixth convolutional layer 504.
That is, SFnet is composed of a first convolution layer, an SFS, an up-sampling module, and a sixth convolution layer. The output end of the first convolution layer is connected with the input end of the SFS, the output end of the SFS is connected with the input end of the up-sampling module, the output end of the up-sampling module is connected with the input end of the sixth convolution layer, and the sixth convolution layer is utilized to convert the number of the characteristic channels output by the up-sampling module into A.
In fig. 6, the convolutional layer in the upsampling module is composed of C·r² convolution kernels of size t × t × C, the sixth convolutional layer is composed of A convolution kernels of size t × t × C, and the output of the upsampling module is connected to the input of the sixth convolutional layer.

That is, an image of M rows, N columns, and A channels is input into the first convolutional layer; after processing by the first convolutional layer, the SFS, the upsampling module, and the sixth convolutional layer, the network outputs a processed image of r × M rows, r × N columns, and A channels.
Step S404: and training the SFnet based on sample data to obtain a super-resolution network model.
In the specific implementation process of step S404, after the SFnet is constructed, the SFnet is trained by using sample data until the SFnet converges, so as to obtain the super-resolution network model.
That is, inputting an image of M rows, N columns, and A channels into the super-resolution network model yields an image of r × M rows, r × N columns, and A channels; the structure of the super-resolution network model can be seen in figs. 5 and 6.
The details of training SFnet are described below.
Obtain a high-resolution sample image set containing J high-resolution images {I_HR^1, ..., I_HR^J}.

Reduce the resolution of each high-resolution image in the high-resolution sample image set by a factor of r to obtain a low-resolution sample image set containing J low-resolution images {I_LR^1, ..., I_LR^J}.

Input the i-th low-resolution image I_LR^i of the low-resolution sample image set into SFnet for resolution enhancement to obtain the super-resolution reconstructed image I_SR^i = F_SFnet(I_LR^i, W_SFnet, r). It will be appreciated that the i-th low-resolution image corresponds to the i-th high-resolution image I_HR^i in the high-resolution sample image set.

Initialize the model parameters in SFnet and, using the Adam algorithm with the goal of minimizing the error between I_SR^i and I_HR^i, minimize a loss function during training to optimize the model parameters in SFnet, obtaining the optimized parameter set W*_SFnet, i.e., SFnet converges. It should be noted that the elements of W*_SFnet are the weights of all convolutional layers in SFnet.

It will be appreciated that the error between I_SR^i and I_HR^i being minimal means that the value of the loss function is minimal.
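For reference, a single Adam update as used to optimize the SFnet weights looks as follows; the text names the algorithm but gives no hyperparameter values, so the ones below are common defaults, and the toy quadratic loss is only an illustration:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update step (hyperparameters are assumed defaults)."""
    m = b1 * m + (1 - b1) * grad           # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy use: drive a scalar "weight" toward the minimum of loss(w) = w^2,
# standing in for minimizing the SFnet loss over its convolution weights.
w = np.array([1.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 2001):
    grad = 2.0 * w                         # gradient of w^2
    w, m, v = adam_step(w, grad, m, v, t, lr=1e-2)
```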
It should be noted that the loss function may be set according to actual situations, such as the first loss function shown in formula (12) and the second loss function shown in formula (13).
L_1(W_SFnet) = (1/J) · Σ_{i=1}^{J} ‖ I_SR^i − I_HR^i ‖_1    (12)

L_2(W_SFnet) = (1/J) · Σ_{i=1}^{J} ‖ I_SR^i − I_HR^i ‖_2²    (13)
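Assuming the first and second loss functions are the usual L1 (mean absolute error) and L2 (mean squared error) reconstruction losses over the J image pairs, they can be sketched as:

```python
import numpy as np

def l1_loss(sr_images, hr_images):
    """Candidate first loss, eq. (12): mean absolute error over J pairs."""
    return float(np.mean([np.abs(sr - hr).mean()
                          for sr, hr in zip(sr_images, hr_images)]))

def l2_loss(sr_images, hr_images):
    """Candidate second loss, eq. (13): mean squared error over J pairs."""
    return float(np.mean([((sr - hr) ** 2).mean()
                          for sr, hr in zip(sr_images, hr_images)]))

# One toy pair: reconstruction all ones vs. reference all zeros.
sr = [np.ones((4, 4, 3))]
hr = [np.zeros((4, 4, 3))]
```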
In the embodiment of the invention, SFnet is constructed from the preset first convolutional layer, the SFS, and the upsampling module, and is trained on sample data to obtain the super-resolution network model. The model raises the resolution of an image of M rows, N columns, and A channels to obtain an image of r × M rows, r × N columns, and A channels. The SFS extracts and integrates multi-level information, exploiting the feature information of the image at both the global and the local level, so that the super-resolution network model improves image resolution while ensuring high fidelity, yielding a high-fidelity, high-resolution image.
The process of constructing the RCAB involved in step S402 of fig. 4 is shown in fig. 7, a schematic structural diagram of the RCAB provided in the embodiment of the present invention; the RCAB includes a fourth convolutional layer 701, a ReLU layer 702, a fifth convolutional layer 703, and a CA module 704.

In fig. 7, the fourth convolutional layer, the ReLU layer, the fifth convolutional layer, and the CA module are connected in sequence to construct an RCAB.

That is, the output of the fourth convolutional layer is connected to the input of the ReLU layer, the output of the ReLU layer is connected to the input of the fifth convolutional layer, and the output of the fifth convolutional layer is connected to the input of the CA module.
The input end of the fourth convolution layer is connected to the output end of the CA module through the element-by-element summation unit, that is, the image (feature) input to the fourth convolution layer and the image (feature) output by the CA module are subjected to element-by-element summation operation.
The fourth convolutional layer and the fifth convolutional layer are formed of C convolutional kernels having a size of t × t × C.
The operation of the RCAB can be denoted F_RCAB(X, W_RCAB), defined in equation (14).

F_RCAB(X, W_RCAB) = X + F_CA(F_Conv(δ(F_Conv(X, w_RCAB,1)), w_RCAB,2), w_RCAB,up, w_RCAB,down)    (14)

In equation (14), X is the image (feature) input to the RCAB, δ(·) is the ReLU activation, and the set W_RCAB = (w_RCAB,1, w_RCAB,2, w_RCAB,up, w_RCAB,down) contains the fourth convolutional layer, the fifth convolutional layer, and the two convolutional layers in the CA module that constitute the RCAB.
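Equation (14) can be sketched as follows. To keep the example short, the t × t convolutions are modelled as 1 × 1 convolutions (per-pixel channel mixing via matrix products); the residual connection and the CA rescaling are the parts being illustrated, and all weight values and sizes are illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rcab(x, w1, w2, w_down, w_up):
    """Sketch of eq. (14): X + F_CA(F_Conv(ReLU(F_Conv(X)))).

    t x t convs are modelled here as 1 x 1 convs (channel-mixing
    matmuls) purely to keep the sketch compact."""
    h = relu(x @ w1.T)                      # fourth conv + ReLU layer
    h = h @ w2.T                            # fifth conv
    z = h.mean(axis=(0, 1))                 # CA: global average pooling
    s = sigmoid(w_up @ relu(w_down @ z))    # CA: attention weights
    return x + h * s                        # element-wise residual sum

C, red = 4, 2                               # channels and CA factor r'
rng = np.random.default_rng(1)
x = rng.standard_normal((6, 6, C))
w1 = rng.standard_normal((C, C)) * 0.1
w2 = rng.standard_normal((C, C)) * 0.1
w_down = rng.standard_normal((C // red, C)) * 0.1
w_up = rng.standard_normal((C, C // red)) * 0.1
y = rcab(x, w1, w2, w_down, w_up)
```

Note the residual path: if the convolutional branch outputs zero, the block reduces to the identity, which is what makes deep stacks of RCABs trainable.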
The above-mentioned process of constructing the FFG related in step S402 in the embodiment of the present invention is shown in fig. 8, which is a schematic structural diagram of the FFG provided in the embodiment of the present invention, and the FFG includes: b RCABs 801, a second feature join operation 802, and a third convolution layer 803.
Note that RCAB-B in fig. 8 indicates the B-th RCAB.
Connect the B RCABs in sequence; that is, the output of the 1st RCAB is connected to the input of the 2nd RCAB, the output of the 2nd RCAB is connected to the input of the 3rd RCAB, and so on until all B RCABs are connected.

The output of each RCAB is connected to the input of the second feature connection operation. It should be noted that the input of the 1st RCAB is also connected to the second feature connection operation; that is, the image (feature) input to the 1st RCAB is likewise fed to the second feature connection operation.

The output of the second feature connection operation is connected to the input of the third convolutional layer, which is composed of C convolution kernels of size t × t × C·(B+1), to construct the FFG.
The operation of the FFG can be denoted F_FFG(X, W_FFG), defined in equation (15).

F_FFG(X, W_FFG) = F_Conv([X, X_1, X_2, ..., X_B], w_FFG)    (15)

In equation (15), X_b = F_RCAB(X_{b−1}, W_RCAB,b) for b = 1, ..., B, with X_0 = X, and [·] denotes the feature connection (concatenation) operation. X is the image (feature) input to the FFG, and the set W_FFG = (W_RCAB,1, W_RCAB,2, ..., W_RCAB,B, w_FFG) contains the convolutional layers of the B RCABs in the FFG and the third convolutional layer.
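The FFG of equation (15) chains B RCABs, concatenates the input and every intermediate output along the channel axis, and fuses the result back to C channels. A sketch with stand-in RCABs (the lambdas and the 1 × 1 fusion matrix are hypothetical, for shape-checking only):

```python
import numpy as np

def ffg(x, rcab_fns, w_fuse):
    """Sketch of eq. (15): chain B RCABs, concatenate the input plus all
    RCAB outputs (the second feature connection operation), then fuse
    back to C channels with the third convolutional layer (modelled as
    a 1 x 1 convolution for brevity).

    rcab_fns: list of B callables mapping (M, N, C) -> (M, N, C).
    w_fuse: (C, C*(B+1)) channel-fusion matrix."""
    feats = [x]
    for f in rcab_fns:
        feats.append(f(feats[-1]))           # RCABs applied in sequence
    cat = np.concatenate(feats, axis=-1)     # (M, N, C*(B+1))
    return cat @ w_fuse.T                    # back to (M, N, C)

C, B = 3, 2
rng = np.random.default_rng(2)
x = rng.standard_normal((5, 5, C))
rcabs = [lambda t: t + 0.1, lambda t: t * 0.9]   # toy RCAB stand-ins
w_fuse = rng.standard_normal((C, C * (B + 1))) * 0.1
y = ffg(x, rcabs, w_fuse)
```

The SFS of equation (16) has exactly the same structure one level up, with FFGs in place of RCABs and the second convolutional layer as the fusion step.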
The above-mentioned SFS construction process in step S402 in fig. 4 in the embodiment of the present invention is shown in fig. 9, which is a schematic structural diagram of the SFS provided in the embodiment of the present invention, and the SFS includes: g FFGs 901, a first feature join operation 902, and a second convolutional layer 903.
Note that FFG-G in fig. 9 indicates the G-th FFG.
The G FFGs are connected in sequence, that is, the output end of the 1 st FFG is connected with the input end of the 2 nd FFG, the output end of the 2 nd FFG is connected with the input end of the 3 rd FFG, and the like, the G FFGs are connected in sequence.
The output end of each FFG is connected to the input end of the first characteristic connection operation, and it should be noted that the input end of the 1 st FFG is connected to the input end of the first characteristic connection operation. That is, the image (feature) to which the 1 st FFG is input is also input to the first feature connection operation.
The output of the first feature connection operation is connected to the input of the second convolutional layer, which is composed of C convolution kernels of size t × t × C·(G+1), to construct the SFS.
The operation of the SFS can be denoted F_SFS(X, W_SFS), defined in equation (16).

F_SFS(X, W_SFS) = F_Conv([X, X_1, X_2, ..., X_G], w_SFS)    (16)

In equation (16), X_g = F_FFG(X_{g−1}, W_FFG,g) for g = 1, ..., G, with X_0 = X, and [·] denotes the feature connection (concatenation) operation. X is the image (feature) input to the SFS, and the set W_SFS = (W_FFG,1, W_FFG,2, ..., W_FFG,G, w_SFS) contains the convolutional layers of the G FFGs in the SFS and the second convolutional layer.
Combining the above description of figs. 7 to 9, the operation of SFnet can be written F_SFnet(X, W_SFnet, r). When the structure of SFnet is as shown in fig. 5, i.e., SFnet comprises the first convolutional layer, the SFS, and the upsampling module, F_SFnet(X, W_SFnet, r) is defined in equation (17).
F_SFnet(X, W_SFnet, r) = F_up(F_SFS(F_Conv(X, w_SFnet,1), W_SFS), w_up, r)    (17)
In equation (17), X is the image input to SFnet, and the set W_SFnet = (W_SFS, w_up, w_SFnet,1) contains the convolutional layers in the SFS, the convolutional layer w_up in the upsampling module, and the first convolutional layer w_SFnet,1.
When the structure of SFnet is as shown in fig. 6, i.e., SFnet comprises the first convolutional layer, the SFS, the upsampling module, and the sixth convolutional layer, F_SFnet(X, W_SFnet, r) is defined in equation (18).
F_SFnet(X, W_SFnet, r) = F_Conv(F_up(F_SFS(F_Conv(X, w_SFnet,1), W_SFS), w_up, r), w_SFnet,2)    (18)
The set W_SFnet = (W_SFS, w_up, w_SFnet,1, w_SFnet,2) contains the convolutional layers in the SFS, the convolutional layer w_up in the upsampling module, the first convolutional layer w_SFnet,1, and the sixth convolutional layer w_SFnet,2.
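Putting equations (17) and (18) together, the shapes flowing through SFnet can be checked with a small helper (the function name and example sizes are my own):

```python
def sfnet_shapes(M, N, A, C, r, with_sixth_conv):
    """Track (rows, cols, channels) through SFnet, per eqs. (17)/(18).

    The conv inside the upsampling module produces C*r^2 channels when a
    sixth convolutional layer follows (fig. 6 variant), or A*r^2
    channels when it does not (fig. 5 variant)."""
    out_c = C if with_sixth_conv else A
    shapes = [
        (M, N, A),                  # input image
        (M, N, C),                  # first convolutional layer
        (M, N, C),                  # SFS (spatial size and C preserved)
        (M, N, out_c * r * r),      # conv inside the upsampling module
        (r * M, r * N, out_c),      # sub-pixel convolution
    ]
    if with_sixth_conv:
        shapes.append((r * M, r * N, A))  # sixth conv restores A channels
    return shapes

fig5 = sfnet_shapes(32, 32, 3, 64, 2, with_sixth_conv=False)
fig6 = sfnet_shapes(32, 32, 3, 64, 2, with_sixth_conv=True)
```

Both variants end at (r·M, r·N, A), matching the requirement that the output channel count equal that of the input image.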
It should be noted that, for the operation contents related in fig. 4 to fig. 9, reference may be made to the related contents in the 12 operation contents, and details are not repeated herein.
In the embodiment of the invention, the FFGs are constructed from the pre-constructed RCABs, and the SFS is constructed from the FFGs. SFnet is then built with the SFS and trained to obtain the super-resolution network model. The SFS extracts and integrates multi-level information, exploiting the feature information of the image at both the global and the local level, so that the super-resolution network model improves image resolution while ensuring high fidelity, yielding a high-fidelity, high-resolution image.
Corresponding to the network construction method provided by the embodiment of the present invention, referring to fig. 10, an image super-resolution reconstruction method provided by the embodiment of the present invention is shown, the image super-resolution reconstruction method is applied to a super-resolution network model constructed by the network construction method, and the image super-resolution reconstruction method includes:
step S1001: a first resolution image of M rows, N columns and A channels is acquired.
M, N and A are positive integers.
It is understood that the first-resolution image is the image to be processed, i.e., a low-resolution image whose resolution needs to be raised, for example a first-resolution image of M rows, N columns, and 3 channels in RGB format.

Step S1002: input the first-resolution image into the super-resolution network model for resolution enhancement to obtain a second-resolution image of r × M rows, r × N columns, and A channels.
It should be noted that r is the improvement multiple of the resolution, and the super-resolution network model is constructed according to the network construction method.
The super-resolution network model comprises a first convolution layer, an SFS and an up-sampling module, and the specific structure can be seen in FIG. 5. Or, the super-resolution network model comprises a first convolution layer, an SFS, an up-sampling module and a sixth convolution layer, and the specific structure can be seen in fig. 6.
As can be seen from the above, after the resolution of the first resolution image with M rows, N columns and a channels is improved, the number of channels of the obtained second resolution image needs to be the same as the number of channels of the first resolution image.
That is, in implementing step S1002, the first-resolution image of M rows, N columns, and A channels is input into the super-resolution network model for resolution enhancement, yielding a second-resolution image of r × M rows, r × N columns, and A channels, i.e., a high-resolution image whose number of channels matches that of the low-resolution image.

For example, inputting the first-resolution image of M rows, N columns, and 3 channels in RGB format into the super-resolution network model for resolution enhancement yields a second-resolution image of r × M rows, r × N columns, and 3 channels.

In the embodiment of the invention, the first-resolution image of M rows, N columns, and A channels is input into the super-resolution network model for resolution enhancement. The SFS in the model extracts and integrates multi-level information, exploiting the feature information of the first-resolution image at both the global and the local level; the resolution of the first-resolution image is raised by a factor of r while ensuring high fidelity, yielding a second-resolution image of r × M rows, r × N columns, and A channels.
The process of obtaining the second resolution image in step S1002 in fig. 10 according to the above embodiment of the present invention, referring to fig. 11, shows a flowchart of obtaining the second resolution image according to the embodiment of the present invention, which includes the following steps:
it should be noted that, the structure of the super-resolution network model is as shown in fig. 5, that is, the super-resolution network model includes a first convolution layer, an SFS and an up-sampling module, and the SFS is connected to the first convolution layer and the up-sampling module respectively.
Step S1101: inputting the first resolution image into the first convolution layer to obtain the initial characteristics of M rows, N columns and C channels.
The first convolution layer is formed of C convolution kernels having a size of t × t × a, where C is a positive integer.
In implementing step S1101, combining the 1st through 3rd operation contents, the first-resolution image of M rows, N columns, and A channels is input into the first convolutional layer, which processes it to obtain the initial features of M rows, N columns, and C channels.

It is understood that, while being operated on inside the super-resolution network model, the data are referred to as features, and the final output of the model is referred to as an image.
Step S1102: and inputting the initial characteristics into the SFS to obtain the SFS characteristics of the C channels with M rows and N columns.
In the process of implementing step S1102 specifically, the initial characteristics of the M rows and N columns of C channels output by the first convolution layer are input into the SFS, and the SFS is used to process the initial characteristics to obtain the SFS characteristics of the M rows and N columns of C channels.
Step S1103: and inputting the SFS characteristics into the up-sampling module to obtain a second resolution image of the A channels in r M rows and r N columns.
In the structure of the super-resolution network model shown in fig. 5, the convolutional layer in the upsampling module is formed of A·r² convolution kernels of size t × t × C; that is, after the first-resolution image is input into the super-resolution network model, the number of channels of the output second-resolution image matches the number of channels of the first-resolution image.

In implementing step S1103, according to the 12th operation content, the SFS features of M rows, N columns, and C channels are input into the upsampling module, which processes them to obtain the second-resolution image of r × M rows, r × N columns, and A channels; that is, the features of r × M rows, r × N columns, and A channels output by the upsampling module are the second-resolution image.

Alternatively, if the convolutional layer in the upsampling module is composed of C·r² convolution kernels of size t × t × C, i.e., the feature output by the upsampling module has r × M rows, r × N columns, and C channels, then, to ensure that the number of channels of the second-resolution image matches that of the first-resolution image, a sixth convolutional layer composed of A convolution kernels of size t × t × C is connected after the upsampling module; the structure of the super-resolution network model is then as shown in fig. 6.

That is, the SFS features are input into the upsampling module to obtain upsampling-module features of r × M rows, r × N columns, and C channels, which are then input into the sixth convolutional layer to obtain the second-resolution image of r × M rows, r × N columns, and A channels.

In the embodiment of the invention, the SFS extracts and integrates multi-level information, exploiting the feature information of the first-resolution image at both the global and the local level; the resolution of the first-resolution image is raised by a factor of r while ensuring high fidelity, yielding a second-resolution image of r × M rows, r × N columns, and A channels.
The process of obtaining SFS characteristics in step S1102 in fig. 11 in the embodiment of the present invention described above, referring to fig. 12, shows a flowchart of obtaining SFS characteristics provided in the embodiment of the present invention, including the following steps:
it should be noted that the SFS includes G FFGs, a first feature connection operation and a second convolution layer, the G FFGs are sequentially connected and respective output ends are respectively connected with the first feature connection operation, the first convolution layer is respectively connected with the first feature connection operation and the 1 st FFG, and the second convolution layer is respectively connected with the first feature connection operation and the upsampling module.
The specific structure of the SFS can be seen from the content shown in fig. 9, and will not be described herein.
Step S1201: inputting the initial characteristic into the 1 st FFG to obtain the FFG characteristic of the C channels of M rows and N columns output by the 1 st FFG, and inputting the initial characteristic into the first characteristic connection operation.
In the process of implementing step S1201 specifically, the first resolution image of M rows and N columns of a channels is input into the first convolution layer to obtain the initial features of M rows and N columns of C channels, with reference to the structure diagram of SFnet shown in fig. 5 and the structure diagram of SFS shown in fig. 9.
The first convolution layer inputs the initial characteristics of the C channels in the M rows and the N columns into the 1 st FFG and the first characteristic connection operation respectively, and the 1 st FFG processes the initial characteristics to obtain the corresponding FFG characteristics of the C channels in the M rows and the N columns.
Step S1202: inputting the FFG feature output by the ith FFG into the (y +1) th FFG to obtain the FFG feature output by the (y +1) th FFG, and inputting the FFG feature output by the ith FFG into first feature connection operation.
Y is an integer of 1 to G inclusive.
In the process of specifically implementing step S1202, the FFG feature output by the 1 st FFG is input into the 2 nd FFG and the first feature connection operation, and the 2 nd FFG processes the FFG feature of the 1 st FFG to obtain its corresponding FFG feature. And inputting the FFG feature output by the 2 nd FFG into the 3 rd FFG and the first feature connection operation, and processing the FFG feature of the 2 nd FFG by the 3 rd FFG to obtain the corresponding FFG feature. And by analogy, inputting the FFG characteristic output by the y-th FFG into the y + 1-th FFG to obtain the FFG characteristic output by the y + 1-th FFG.
That is, the FFG characteristics received by each FFG of the 2 nd, and the subsequent FFGs are the FFG characteristics output by the previous FFG, and the FFG characteristics output by each FFG are input to the first characteristic connection operation.
Note that the number of FFGs is G; that is, when y equals G, the FFG feature output by the G-th FFG is input only to the first feature connection operation. Each FFG feature has a size of M rows, N columns, and C channels.

Step S1203: integrate the initial features and all the FFG features using the first feature connection operation to obtain FFG-integrated features of M rows, N columns, and C·(G+1) channels.

In implementing step S1203, as can be seen from steps S1201 and S1202, the first feature connection operation receives the initial features and the FFG features output by each FFG, i.e., (G+1) features in total.

According to the 4th operation content, the initial features and all the FFG features, i.e., the (G+1) features, are integrated using the first feature connection operation to obtain the FFG-integrated features of M rows, N columns, and C·(G+1) channels.
Step S1204: and inputting the FFG integration characteristic into the second convolution layer to obtain the SFS characteristic of M rows and N columns of C channels.
In step S1204, the FFG-integrated features of M rows, N columns, and C·(G+1) channels are input into the second convolutional layer, which processes them to obtain the SFS features of M rows, N columns, and C channels.

In the embodiment of the invention, the initial features and the FFG features output by each FFG are passed to the first feature connection operation and combined to obtain the SFS features of M rows, N columns, and C channels, exploiting the feature information of the first-resolution image at both the global and the local level.
The process of obtaining the FFG feature of the 1st FFG in step S1201 above is shown in fig. 13, a flowchart of obtaining the 1st FFG feature provided in the embodiment of the present invention, comprising the following steps:
note that the FFG includes: b RCABs, a second signature connection operation and a third convolution layer, wherein the B RCABs are connected in sequence and respective output ends are respectively connected with the second signature connection operation, the second signature connection operation is connected with the third convolution layer, and the structure of the FFG can refer to the content shown in fig. 8.
Step S1301: inputting the initial characteristics into the 1 st RCAB to obtain the RCAB characteristics of M rows and N columns of C channels output by the 1 st RCAB, and inputting the initial characteristics into a second characteristic connection operation.
With reference to the structure diagrams of SFnet and SFS shown in fig. 5 and fig. 9, in the process of implementing step S1301, the first convolution layer inputs the initial characteristics of the C channels in M rows and N columns into the 1 st FFG in the SFS, that is, inputs the initial characteristics into the 1 st RCAB in the 1 st FFG, and inputs the initial characteristics into the second characteristic connection operation.
It is understood that the RCAB includes a fourth convolutional layer, a ReLU layer, a fifth convolutional layer, and a CA module, connected in sequence; the structure of the RCAB is shown in fig. 7.

After the initial features are input into the 1st RCAB, the fourth convolutional layer processes them to obtain first sub-features of M rows, N columns, and C channels, and the ReLU layer processes the first sub-features to obtain second sub-features of M rows, N columns, and C channels.

The second sub-features are input into the fifth convolutional layer, which processes them to obtain third sub-features of M rows, N columns, and C channels. The CA module processes the third sub-features to obtain fourth sub-features of M rows, N columns, and C channels. Element-wise summation of the fourth sub-features and the initial features yields the RCAB features of M rows, N columns, and C channels; that is, inputting the initial features into the 1st RCAB yields the RCAB features of M rows, N columns, and C channels.
The fourth convolutional layer and the fifth convolutional layer are formed of C convolutional kernels having a size of t × t × C.
Step S1302: inputting the RCAB feature output by the z-th RCAB into the (z+1)-th RCAB to obtain the RCAB feature output by the (z+1)-th RCAB, and inputting the RCAB feature output by the (z+1)-th RCAB into the second feature connection operation.

Here, z is an integer greater than or equal to 1 and less than or equal to B.
In the process of implementing step S1302 specifically, the RCAB feature output by the 1 st RCAB is input to the 2 nd RCAB and second feature connection operation, and the 2 nd RCAB processes the RCAB feature output by the 1 st RCAB to obtain the RCAB feature corresponding to itself. Inputting the RCAB characteristics output by the 2 nd RCAB into the 3 rd RCAB and the second characteristics for connection operation, and processing the RCAB characteristics output by the 2 nd RCAB by the 3 rd RCAB to obtain the corresponding RCAB characteristics. And by analogy, inputting the RCAB characteristic output by the z-th RCAB into the z + 1-th RCAB to obtain the RCAB characteristic output by the z + 1-th RCAB.
It should be noted that the RCAB characteristics received by the 2 nd and the 2 nd subsequent RCABs are the RCAB characteristics output by the previous RCAB, and the RCAB characteristics output by each RCAB are input to the second characteristic connection operation.
It should be noted that the number of RCABs is B, that is, when z is equal to B, the RCAB feature output by the B-th RCAB is input to the second feature connection operation. The process of obtaining each RCAB feature can be seen in step S1301, and the size of each RCAB feature is M rows and N columns of C channels.
Step S1303: integrating the initial features and all the RCAB features using the second feature connection operation to obtain RCAB-integrated features of M rows, N columns, and C·(B+1) channels.

In implementing step S1303, as can be seen from steps S1301 and S1302, the second feature connection operation receives the initial features and the RCAB features output by each RCAB, i.e., (B+1) features in total.

According to the 4th operation content, the initial features and all the RCAB features are integrated using the second feature connection operation to obtain the RCAB-integrated features of M rows, N columns, and C·(B+1) channels.
Step S1304: inputting the RCAB integrated characteristics into the third convolution layer to obtain FFG characteristics of M rows and N columns of C channels output by the 1 st FFG.
It should be noted that the third convolution layer is formed by C convolution kernels with the size t × t × C(B+1). In implementing step S1304, the RCAB integrated features of M rows, N columns and C(B+1) channels are input into the third convolution layer, which processes them to obtain FFG features of M rows and N columns of C channels, that is, the FFG features output by the 1st FFG.
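Steps S1303 and S1304 amount to a channel-axis concatenation followed by a channel-reducing convolution. A numpy sketch with a 1×1 kernel (t = 1, chosen for brevity; the patent uses t × t kernels, and the values of C, M, N, B are illustrative) is:

```python
import numpy as np

rng = np.random.default_rng(0)
C, M, N, B = 16, 8, 8, 4

# (B+1) features: the initial feature plus the B RCAB outputs.
feats = [rng.standard_normal((C, M, N)) for _ in range(B + 1)]
cat = np.concatenate(feats, axis=0)        # second feature connection operation
assert cat.shape == (C * (B + 1), M, N)    # C(B+1) channels, M rows, N columns

# Third convolution layer: C kernels of size 1 x 1 x C(B+1) (t = 1 here).
w = rng.standard_normal((C, C * (B + 1)))
ffg_feature = np.einsum('oc,chw->ohw', w, cat)
assert ffg_feature.shape == (C, M, N)      # back to C channels
```

The same concatenate-then-reduce pattern recurs at the SFS level, where the first feature connection operation gathers (G+1) features and the second convolution layer reduces C(G+1) channels back to C.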
It is understood that steps S1301 to S1304 above compute the FFG features output by the 1st FFG. According to Figs. 5 and 9, the input of the 1st FFG in the SFS is connected to the output of the first convolution layer; that is, the input of the 1st RCAB in the 1st FFG is connected to the output of the first convolution layer.
Preferably, for the 2nd and subsequent FFGs, the feature input into the 1st RCAB of the FFG is the FFG feature output by the previous FFG. For example, the input feature of the 1st RCAB in the 2nd FFG is the FFG feature output by the 1st FFG, and the input feature of the 1st RCAB in the 3rd FFG is the FFG feature output by the 2nd FFG.
The process of obtaining the FFG features output by the 2nd and subsequent FFGs may refer to steps S1301 to S1304 above and is not repeated here.
In the embodiment of the invention, the initial features (or the FFG features output by the previous FFG) and the output of each RCAB are passed to the second feature connection operation and combined, so that FFG features of M rows and N columns of C channels are obtained and the feature information of the first-resolution image is fully utilized.
Corresponding to the network construction method provided by the above embodiment of the present invention, referring to fig. 14, an embodiment of the present invention further provides a structural block diagram of a network construction system, where the network construction system includes: a first building unit 1401, a second building unit 1402, a third building unit 1403, and a training unit 1404;
a first building unit 1401, configured to build a first convolutional layer using C convolution kernels of size t × t × A, where C and A are positive integers and t is a positive odd number.
A second constructing unit 1402, configured to construct the SFS using G FFGs, the first feature connection operation, and the second convolution layer, wherein the FFG is composed of B RCABs, the second feature connection operation, and the third convolution layer; the RCAB is composed of a fourth convolution layer, a ReLU layer, a fifth convolution layer, and a CA module; and G and B are positive integers.
In a specific implementation, the second building unit 1402 for building the SFS is specifically configured to connect G FFGs in sequence and connect the output of each FFG to the input of the first feature connection operation, wherein the input of the 1st FFG is also connected to the input of the first feature connection operation, and the output of the first feature connection operation is connected to the input of the second convolutional layer, which is formed by C convolution kernels with a size of t × t × C(G+1), so as to build the SFS.
In a specific implementation, the second building unit 1402 for building the FFG is specifically configured to connect the B RCABs in sequence and connect the output of each RCAB to the input of the second feature connection operation, with the output of the second feature connection operation connected to the input of the third convolutional layer, so as to build the FFG, wherein the input of the 1st RCAB is also connected to the input of the second feature connection operation, and the third convolutional layer is formed by C convolution kernels with a size of t × t × C(B+1).
In a specific implementation, the second building unit 1402 for building an RCAB is specifically configured to sequentially connect the fourth convolutional layer, the ReLU layer, the fifth convolutional layer, and the CA module, with the input end of the fourth convolutional layer connected to the output end of the CA module through an element-by-element summation unit; the fourth and fifth convolutional layers are formed by C convolution kernels with a size of t × t × C.
A third building unit 1403, configured to connect the output end of the first convolutional layer to the input end of the SFS and the output end of the SFS to the input end of the upsampling module, so as to build the SFnet, where the upsampling module performs an r-times upsampling operation, r is any real number greater than 1, and the convolution layer in the upsampling module is formed by (A*r) convolution kernels with a size of t × t × C.
Preferably, if the convolution layers in the upsampling module are formed by (C*r) convolution kernels with a size of t × t × C, then after the output of the SFS is connected to the input of the upsampling module, the third building unit 1403 is further configured to connect the output of the upsampling module to the input of the sixth convolutional layer to build the SFnet, where the sixth convolutional layer is formed by A convolution kernels with a size of t × t × C.
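If the up-sampling module is a sub-pixel (pixel-shuffle) layer — a common design, assumed here since the patent does not fix the mechanism, and one that for integer r strictly requires the preceding convolution to emit A·r² channels — the rearrangement looks like this:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (A * r * r, M, N) -> (A, r * M, r * N): sub-pixel upsampling."""
    A = x.shape[0] // (r * r)
    _, M, N = x.shape
    x = x.reshape(A, r, r, M, N)
    x = x.transpose(0, 3, 1, 4, 2)     # -> (A, M, r, N, r)
    return x.reshape(A, r * M, r * N)

A, r, M, N = 3, 2, 8, 8               # illustrative values
x = np.arange(A * r * r * M * N, dtype=float).reshape(A * r * r, M, N)
y = pixel_shuffle(x, r)
assert y.shape == (A, r * M, r * N)   # r*M rows, r*N columns, A channels
```

Each output A-channel pixel block of size r × r is assembled from r² input channels at the same low-resolution position, which is why the channel count must carry the factor r².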
And a training unit 1404, configured to train SFnet based on the sample data to obtain a super-resolution network model.
In the embodiment of the invention, the SFnet is constructed from the preset first convolution layer, the SFS and the up-sampling module, and is trained on sample data to obtain the super-resolution network model. The model raises the resolution of an image of M rows and N columns of A channels to obtain an image of r*M rows and r*N columns of A channels. The SFS extracts and integrates multi-level information, exploiting the feature information of the image at both the global and local levels, so that the super-resolution network model improves the resolution of the image while ensuring high fidelity, yielding a high-fidelity, high-resolution image.
Corresponding to the image super-resolution reconstruction method provided by the embodiment of the present invention, referring to fig. 15, the embodiment of the present invention further provides a structural block diagram of an image super-resolution reconstruction system, the image super-resolution reconstruction system is applied to the super-resolution network model constructed by the network construction method disclosed above, and the image super-resolution reconstruction system includes: an acquisition unit 1501 and a processing unit 1502;
an acquisition unit 1501, configured to acquire a first resolution image of M rows and N columns of A channels, where M, N and A are positive integers.
The processing unit 1502 is configured to input the first resolution image into the super-resolution network model for resolution enhancement, so as to obtain a second resolution image of r*M rows, r*N columns and A channels, where r is the resolution enhancement multiple.
In the embodiment of the invention, the first resolution image of M rows and N columns of A channels is input into the super-resolution network model for resolution enhancement. The SFS in the super-resolution network model extracts and integrates multi-level information, exploiting the feature information of the first resolution image at both the global and local levels, and the resolution of the first resolution image is raised by a factor of r while ensuring high fidelity, yielding a second resolution image of r*M rows and r*N columns of A channels.
Preferably, the SFS is connected to the first convolution layer and the up-sampling module respectively, and in combination with the content shown in fig. 15, the processing unit 1502 includes a first processing module, a second processing module and a third processing module, and the execution principle of each module is as follows:
the first processing module is used for inputting the first resolution image into a first convolution layer to obtain initial features of M rows and N columns of C channels, wherein the first convolution layer is composed of C convolution kernels with the size of t × t × A, and C is a positive integer.
And the second processing module is used for inputting the initial characteristics into the SFS to obtain the SFS characteristics of the C channels in M rows and N columns.
And the third processing module is used for inputting the SFS features into the up-sampling module to obtain a second resolution image of r*M rows and r*N columns of A channels, wherein the convolution layer in the up-sampling module is composed of (A*r) convolution kernels with the size of t × t × C.
Preferably, if the convolution layer in the up-sampling module is composed of (C*r) convolution kernels with the size of t × t × C, the SFnet further includes a sixth convolution layer. The third processing module is then further configured to input the SFS features into the up-sampling module to obtain up-sampling module features of r*M rows, r*N columns and C channels, and to input these features into the sixth convolution layer to obtain a second resolution image of r*M rows, r*N columns and A channels, wherein the sixth convolution layer is composed of A convolution kernels with the size of t × t × C.
In the embodiment of the invention, the extraction and integration of multi-level information are realized by using the SFS, the characteristic information of the first resolution image is utilized at the global level and the local level by using the SFS, and the resolution of the first resolution image is improved by r times on the premise of ensuring high fidelity, so that the second resolution image of r M rows, r N columns and A channels is obtained.
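The shape bookkeeping of the full reconstruction path — first convolution, SFS, up-sampling module, optional sixth convolution — can be checked with illustrative values (A, M, N, C, r below are assumptions, not values fixed by the patent):

```python
# End-to-end shape trace for the (C*r)-kernel up-sampling variant, assuming
# a sub-pixel design for the up-sampling module.
A, M, N, C, r = 3, 32, 32, 64, 4

first_conv_out = (C, M, N)              # C kernels of t x t x A
sfs_out        = (C, M, N)              # SFS preserves the feature shape
upsample_out   = (C, r * M, r * N)      # up-sampling module features
second_image   = (A, r * M, r * N)      # sixth conv: A kernels of t x t x C

assert second_image == (A, r * M, r * N)
```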
Preferably, the G FFGs are sequentially connected and output terminals thereof are respectively connected to the first feature connection operation, the first convolution layer is respectively connected to the first feature connection operation and the 1 st FFG, the second convolution layer is respectively connected to the first feature connection operation and the up-sampling module, and with reference to the content in fig. 15, the second processing module includes: the system comprises a first processing submodule, a second processing submodule, a third processing submodule and a fourth processing submodule, wherein the execution principle of each submodule is as follows:
and the first processing submodule is used for inputting the initial characteristic into the 1 st FFG to obtain the FFG characteristic of the C channels of the M rows and the N columns output by the 1 st FFG, and inputting the initial characteristic into the first characteristic connection operation.
In the FFG, the B RCAB are connected in sequence, the output ends of the B RCAB are respectively connected with a second characteristic connection operation, and the second characteristic connection operation is connected with a third convolution layer.
In one specific implementation, the first processing sub-module is specifically configured to: input the initial feature into the 1st RCAB to obtain RCAB features of M rows and N columns of C channels output by the 1st RCAB, and input the initial feature into the second feature connection operation; input the RCAB feature output by the z-th RCAB into the (z+1)-th RCAB to obtain the RCAB feature output by the (z+1)-th RCAB, and input the RCAB feature output by the (z+1)-th RCAB into the second feature connection operation, where z is an integer greater than or equal to 1 and less than or equal to B, and when z equals B the RCAB feature output by the B-th RCAB is input into the second feature connection operation; integrate the initial feature and all the RCAB features by the second feature connection operation to obtain RCAB integrated features of M rows, N columns and C(B+1) channels; and input the RCAB integrated features into the third convolution layer to obtain the FFG features of M rows and N columns of C channels output by the 1st FFG, the third convolution layer being composed of C convolution kernels with the size of t × t × C(B+1).
In a specific implementation, the first processing sub-module is specifically configured to: input the initial features into the fourth convolution layer to obtain first sub-features of M rows and N columns of C channels; process the first sub-features with the ReLU layer to obtain second sub-features of M rows and N columns of C channels; input the second sub-features into the fifth convolution layer to obtain third sub-features of M rows and N columns of C channels; process the third sub-features with the CA module to obtain fourth sub-features of M rows and N columns of C channels; and perform an element-by-element summation of the fourth sub-features and the initial features to obtain RCAB features of M rows and N columns of C channels. The fourth and fifth convolution layers are each composed of C convolution kernels with the size of t × t × C.
And the second processing submodule is used for inputting the FFG feature output by the y-th FFG into the (y+1)-th FFG to obtain the FFG feature output by the (y+1)-th FFG, and inputting the FFG feature output by the y-th FFG into the first feature connection operation, wherein y is an integer which is greater than or equal to 1 and less than or equal to G, and when y equals G, the FFG feature output by the G-th FFG is input into the first feature connection operation.
And the third processing submodule is used for integrating the initial features and all the FFG features by utilizing the first feature connection operation to obtain FFG integrated features of M rows, N columns and C(G+1) channels.
And the fourth processing submodule is used for inputting the FFG integrated features into the second convolution layer to obtain SFS features of M rows and N columns of C channels, wherein the second convolution layer is formed by C convolution kernels with the size of t × t × C(G+1).
In summary, embodiments of the present invention provide a network construction method, an image super-resolution reconstruction method, and an image super-resolution reconstruction system, where a preset first convolution layer, an SFS, and an upsampling module are used to construct an SFnet, and the SFnet is trained based on sample data to obtain a super-resolution network model. The method comprises the steps of inputting a first resolution image of M rows and N columns of A channels into a super-resolution network model for resolution improvement, extracting and integrating multi-level information by using SFS, and utilizing the characteristic information of the first resolution image at a global level and a local level by using the SFS, so that the resolution of the first resolution image is improved by the super-resolution network model on the premise of ensuring high fidelity, and a second resolution image of r M rows and r N columns of A channels with high fidelity and high resolution is obtained.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. A method of network construction, the method comprising:
constructing a first convolution layer by using C convolution kernels with the size of t × t × A, wherein C and A are positive integers, and t is a positive odd number;
constructing a second-order feedforward structure SFS by utilizing G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer, wherein the FFG consists of B residual channel attention blocks RCAB, a second characteristic connection operation and a third convolution layer, the RCAB consists of a fourth convolution layer, a linear rectification function ReLU layer, a fifth convolution layer and a channel attention CA module, and G and B are positive integers;
connecting the output end of the first convolution layer with the input end of the SFS, and connecting the output end of the SFS with the input end of an up-sampling module to construct a second-order feedforward network SFnet, wherein the up-sampling module is used for executing r times of up-sampling operation, r is any real number greater than 1, and the convolution layer in the up-sampling module is composed of (A r) convolution kernels with the size of t × t × C;
and training the SFnet based on sample data to obtain a super-resolution network model.
2. The network construction method according to claim 1, wherein the constructing a second-order feedforward structure SFS using the G first-order feedforward groups FFG, the first eigen join operation, and the second convolution layer comprises:
sequentially connecting G FFGs, and respectively connecting an output end of each FFG with an input end of a first characteristic connection operation, wherein the input end of the 1 st FFG is connected with the input end of the first connection operation;
and connecting the output end of the first connection operation with the input end of a second convolution layer to construct the SFS, wherein the second convolution layer is formed by C convolution kernels with the size of t × t × C (G + 1).
3. The method of claim 1, wherein the FFG is constructed according to the B residual channel attention blocks RCAB, the second eigen-join operation, and the third convolution layer, and comprises:
sequentially connecting B RCABs, and respectively connecting the output end of each RCAB with the input end of the second characteristic connection operation, wherein the input end of the 1 st RCAB is connected with the input end of the second characteristic connection operation;
and connecting the output end of the second characteristic connection operation with the input end of a third convolution layer to construct the FFG, wherein the third convolution layer is formed by C convolution kernels with the size of t × t × C (B + 1).
4. The network construction method according to claim 1, wherein the process of building the RCAB according to the fourth convolutional layer, the linear rectification function ReLU layer, the fifth convolutional layer, and the channel attention CA module comprises:
sequentially connecting the fourth convolutional layer, the ReLU layer, the fifth convolutional layer and the CA module, wherein the input end of the fourth convolutional layer is connected with the output end of the CA module through an element-by-element summation unit to construct the RCAB, and the fourth convolutional layer and the fifth convolutional layer are formed by C convolution kernels with the size of t × t × C.
5. The method of claim 1, wherein if the convolutional layer in the upsampling module is formed by (C r) convolutional kernels with size t × t × C, connecting the output of the first convolutional layer to the input of the SFS, and connecting the output of the SFS to the input of the upsampling module, further comprising:
and connecting the output end of the up-sampling module with the input end of a sixth convolutional layer to construct SFnet, wherein the sixth convolutional layer is composed of A convolutional kernels with the size of t × t × C.
6. An image super-resolution reconstruction method applied to a super-resolution network model constructed by the network construction method according to any one of claims 1 to 5, the image super-resolution reconstruction method comprising:
acquiring a first resolution image of channels A in M rows and N columns, wherein M, N and A are positive integers;
and inputting the first resolution image into a super-resolution network model for resolution improvement to obtain a second resolution image of r M rows, r N columns and A channels, wherein r is the improvement multiple of the resolution.
7. The image super-resolution reconstruction method of claim 6, wherein SFS is respectively connected to the first convolution layer and the up-sampling module, and the step of inputting the first resolution image into a super-resolution network model for resolution enhancement to obtain a second resolution image of r × M rows, r × N columns and A channels comprises:
inputting the first resolution image into the first convolution layer to obtain initial features of M rows and N columns of C channels, wherein the first convolution layer is composed of C convolution kernels with the size of t × t × A, and C is a positive integer;
inputting the initial features into the SFS to obtain SFS features of M rows and N columns of C channels;
and inputting the SFS features into the up-sampling module to obtain a second resolution image of r M rows, r N columns and A channels, wherein convolution layers in the up-sampling module are formed by (A r) convolution kernels with the size of t × t × C.
8. The image super-resolution reconstruction method of claim 7, wherein G FFGs are sequentially connected and respective output terminals thereof are respectively connected to a first feature connection operation, a first convolution layer is respectively connected to the first feature connection operation and the 1 st FFG, a second convolution layer is respectively connected to the first feature connection operation and the upsampling module, and the inputting of the initial feature into the SFS to obtain SFS features of M rows and N columns of C channels includes:
inputting the initial characteristic into the 1 st FFG to obtain the FFG characteristic of the C channels of M rows and N columns output by the 1 st FFG, and inputting the initial characteristic into the first characteristic connection operation;
inputting the FFG characteristic output by the y-th FFG into the (y+1)-th FFG to obtain the FFG characteristic output by the (y+1)-th FFG, inputting the FFG characteristic output by the y-th FFG into the first characteristic connection operation, wherein y is an integer which is greater than or equal to 1 and less than or equal to G, and when y is equal to G, inputting the FFG characteristic output by the G-th FFG into the first characteristic connection operation;
integrating the initial features and all FFG features by utilizing the first feature connection operation to obtain FFG integration features of channels with M rows and N columns and C (G + 1);
and inputting the FFG integration characteristics into the second convolution layer to obtain SFS characteristics of M rows and N columns of C channels, wherein the second convolution layer is composed of C convolution kernels with the size of t × t × C (G + 1).
9. The image super-resolution reconstruction method of claim 8, wherein B RCABs are connected in sequence and their respective output terminals are connected to a second feature connection operation, the second feature connection operation is connected to a third convolution layer, the inputting the initial feature into the 1 st FFG to obtain FFG features of M rows and N columns of C channels output by the 1 st FFG includes:
inputting the initial characteristic into the 1 st RCAB to obtain the RCAB characteristic of the C channels of M rows and N columns output by the 1 st RCAB, and inputting the initial characteristic into the second characteristic connection operation;
inputting the RCAB characteristic output by the z-th RCAB into the z + 1-th RCAB to obtain the RCAB characteristic output by the z + 1-th RCAB, and inputting the RCAB characteristic output by the z + 1-th RCAB into the second characteristic connection operation, wherein z is an integer which is greater than or equal to 1 and less than or equal to B, and when z is equal to B, the RCAB characteristic output by the B-th RCAB is input into the second characteristic connection operation;
integrating the initial features and all the RCAB features by utilizing the second feature connection operation to obtain RCAB integration features of channels with M rows and N columns and C (B + 1);
inputting the RCAB integrated features into the third convolution layer to obtain FFG features of M rows and N columns of C channels output by the 1st FFG, wherein the third convolution layer is composed of C convolution kernels with the size of t × t × C(B+1).
10. The image super-resolution reconstruction method according to claim 9, wherein a fourth convolution layer, a ReLU layer, a fifth convolution layer and a CA module are connected in sequence, and the inputting the initial features into the 1st RCAB to obtain the RCAB features of the C channels of M rows and N columns output by the 1st RCAB comprises:
inputting the initial features into the fourth convolution layer to obtain first sub-features of M rows and N columns of C channels;
processing the first sub-feature by using the ReLU layer to obtain a second sub-feature of the C channels in M rows and N columns;
inputting the second sub-features into the fifth convolution layer to obtain third sub-features of M rows and N columns of C channels;
processing the third sub-feature by using the CA module to obtain a fourth sub-feature of M rows and N columns of C channels;
performing element-by-element summation calculation on the fourth sub-feature and the initial feature to obtain RCAB features of M rows and N columns of C channels;
the fourth convolutional layer and the fifth convolutional layer are composed of C convolutional kernels of size t × t × C.
11. The image super-resolution reconstruction method of claim 7, wherein if the convolution layer in the up-sampling module is composed of (C r) convolution kernels with size t × t × C, the SFnet further includes a sixth convolution layer, and after the initial features are input into the SFS to obtain SFS features of M rows and N columns of C channels, the method further includes:
inputting the SFS characteristics into the up-sampling module to obtain up-sampling module characteristics of r M rows, r N columns and C channels;
and inputting the characteristics of the up-sampling module into the sixth convolutional layer to obtain a second resolution image of r M rows, r N columns and A channels, wherein the sixth convolutional layer is composed of A convolution kernels with the size of t × t × C.
12. A network construction system, characterized in that the network construction system comprises:
the first building unit is used for building a first convolution layer by utilizing C convolution kernels with the size of t × t × A, wherein C and A are positive integers, and t is a positive odd number;
the second construction unit is used for constructing a second-order feedforward structure SFS by utilizing G first-order feedforward groups FFG, a first characteristic connection operation and a second convolution layer, wherein the FFG consists of B residual error channel attention blocks RCAB, a second characteristic connection operation and a third convolution layer, the RCAB consists of a fourth convolution layer, a linear rectification function ReLU layer, a fifth convolution layer and a channel attention CA module, and G and B are positive integers;
a third constructing unit, configured to connect an output end of the first convolution layer to an input end of the SFS, and connect an output end of the SFS to an input end of an upsampling module, so as to construct a second-order feedforward network SFnet, where the upsampling module is configured to perform r-fold upsampling operation, r is any real number greater than 1, and the convolution layer in the upsampling module is formed by (a × r) convolution kernels with a size of t × t × C;
and the training unit is used for training the SFnet based on the sample data to obtain a super-resolution network model.
13. An image super-resolution reconstruction system applied to a super-resolution network model constructed by the network construction method according to any one of claims 1 to 5, the image super-resolution reconstruction system comprising:
an acquisition unit, configured to acquire a first resolution image of M rows and N columns of channels a, where M, N and a are positive integers;
and the processing unit is used for inputting the first resolution image into a super-resolution network model for resolution improvement to obtain a second resolution image of r M rows, r N columns and A channels, wherein r is an improvement multiple of resolution.
CN202010250271.8A 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system Active CN111461987B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010250271.8A CN111461987B (en) 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system


Publications (2)

Publication Number Publication Date
CN111461987A true CN111461987A (en) 2020-07-28
CN111461987B CN111461987B (en) 2023-11-24

Family

ID=71685849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010250271.8A Active CN111461987B (en) 2020-04-01 2020-04-01 Network construction method, image super-resolution reconstruction method and system

Country Status (1)

Country Link
CN (1) CN111461987B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0611969D0 (en) * 2006-06-16 2006-07-26 The Robert Gordon University Video content prioritisation
WO2020015167A1 (en) * 2018-07-17 2020-01-23 Xi'an Jiaotong University Image super-resolution and non-uniform blur removal method based on fusion network
CN109035146A (en) * 2018-08-09 2018-12-18 Fudan University Low-quality image super-resolution method based on deep learning
CN109919838A (en) * 2019-01-17 2019-06-21 South China University of Technology Ultrasound image super-resolution reconstruction method with enhanced contour sharpness based on an attention mechanism
CN110322402A (en) * 2019-04-30 2019-10-11 Wuhan University of Technology Medical image super-resolution reconstruction method based on a dense mixed attention network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Yining; QIN Pinle; LI Chuanpeng; CUI Yuhao: "Improved image super-resolution algorithm based on residual neural network", Journal of Computer Applications *

Also Published As

Publication number Publication date
CN111461987B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
US10019642B1 (en) Image upsampling system, training method thereof and image upsampling method
US10769757B2 (en) Image processing apparatuses and methods, image processing systems and training methods
US10311547B2 (en) Image upscaling system, training method thereof, and image upscaling method
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
CN108475415B (en) Method and system for image processing
EP3120322B1 (en) Method for processing input low-resolution (lr) image to output high-resolution (hr) image
CN113139907A (en) Generation method, system, device and storage medium for visual resolution enhancement
WO2019042139A1 (en) Image processing method, image processing apparatus, and a neural network training method
CN112750076B (en) Light field multi-view image super-resolution reconstruction method based on deep learning
CN110163801B (en) Image super-resolution and coloring method, system and electronic equipment
CN111242846B (en) Fine-grained scale image super-resolution method based on non-local enhancement network
CN109146788A (en) Super-resolution image reconstruction method and device based on deep learning
US11216913B2 (en) Convolutional neural network processor, image processing method and electronic device
CN111784582B (en) DEC-SE-based low-illumination image super-resolution reconstruction method
EP3776451A1 (en) Image processing apparatus, image processing method thereof, image processing system, and training method thereof
CN111951167B (en) Super-resolution image reconstruction method, super-resolution image reconstruction device, computer equipment and storage medium
CN113298716B (en) Image super-resolution reconstruction method based on convolutional neural network
CN113421187B (en) Super-resolution reconstruction method, system, storage medium and equipment
CN113939845A (en) Method, system and computer readable medium for improving image color quality
CN112907448A (en) Method, system, equipment and storage medium for super-resolution of any-ratio image
CN114298900A (en) Image super-resolution method and electronic equipment
CN109102463B (en) Super-resolution image reconstruction method and device
CN115661635A (en) Hyperspectral image reconstruction method based on Transformer fusion convolutional neural network
CN110782398B (en) Image processing method, generative countermeasure network system and electronic device
CN114897711A (en) Method, device and equipment for processing images in video and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant