CN117372261A - Resolution reconstruction method, device, equipment and medium based on convolutional neural network

Resolution reconstruction method, device, equipment and medium based on convolutional neural network

Info

Publication number: CN117372261A
Application number: CN202311639919.0A
Authority: CN (China)
Prior art keywords: feature map, target, processing, image, layer
Legal status: Granted; currently active
Other languages: Chinese (zh)
Other versions: CN117372261B (en)
Inventors: 徐华安, 杨雁清, 苏宇轩, 周立
Current assignee: Wuxi Unicomp Technology Co., Ltd.
Original assignee: Wuxi Unicomp Technology Co., Ltd.
Application filed by Wuxi Unicomp Technology Co., Ltd.; priority to CN202311639919.0A; application granted and published as CN117372261B.

Classifications

    • G06T 3/4046 - Scaling of whole images or parts thereof using neural networks
    • G06T 3/4053 - Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/4076 - Super-resolution using the original low-resolution images to iteratively correct the high-resolution images
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G06V 10/806 - Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 - Image or video recognition or understanding using neural networks

Abstract

The invention discloses a resolution reconstruction method, device, equipment and medium based on a convolutional neural network. The method comprises the following steps: inputting a target test set into a target super-resolution reconstruction model; determining, through a first convolution layer, an initial feature map corresponding to a target test image in the target test set; determining, through a densely connected symmetrical layer, a basic texture feature map and a target texture feature map corresponding to the initial feature map, and fusing them through a feature fusion layer to obtain a fusion feature map; determining, through a second convolution layer, a basic feature map corresponding to the fusion feature map; performing a first summation processing on the basic feature map and the initial feature map to obtain a first summation feature map, and determining, through a third convolution layer, a target feature map corresponding to the first summation feature map; and upsampling the target feature map through a sub-pixel convolution layer to obtain a resolution reconstruction map. By this technical scheme, super-resolution images can be reconstructed, and the generation rate and accuracy of super-resolution images are improved.

Description

Resolution reconstruction method, device, equipment and medium based on convolutional neural network
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a medium for reconstructing resolution based on a convolutional neural network.
Background
With the rapid development of X-ray technology, the number of industrial X-ray image processing tasks is also increasing, for example, defect detection for precision products such as semiconductors, integrated circuit (IC) components, castings, and new energy batteries, or detection and recognition of chips on circuit boards. In human visual perception, a high-resolution image expresses the spatial structure, detail characteristics, and edge texture of an image more accurately. Therefore, how to generate high-resolution X-ray images is important for industrial X-ray image processing tasks.
In the prior art, image processing methods based on convolutional neural networks (Convolutional Neural Network, CNN) are widely applied to image super-resolution reconstruction. Compared with traditional image enhancement methods, applying a convolutional neural network to super-resolution reconstruction offers stronger generalization and better results.
However, convolutional neural networks demand high computational power from computing devices, have complex model structures, and require large data sets for training. A lightweight convolutional neural network improves the operation speed, but its reconstruction results are insufficient. Therefore, how to quickly and accurately reconstruct super-resolution images is a problem to be solved at present.
Disclosure of Invention
The invention provides a resolution reconstruction method, device, equipment and medium based on a convolutional neural network, which can solve the problem of low generation rate and accuracy of super-resolution images.
According to an aspect of the present invention, there is provided a resolution reconstruction method based on a convolutional neural network, including:
acquiring a target test set, and inputting the target test set into a trained target super-resolution reconstruction model;
the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a characteristic fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer;
determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and carrying out fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fusion feature map;
determining a basic feature map which corresponds to the fusion feature map and meets the requirement of a second tensor through the second convolution layer;
Performing first summation processing on the basic feature map and the initial feature map to obtain a first summation feature map, and determining a target feature map which corresponds to the first summation feature map and meets the requirement of a third tensor through the third convolution layer;
and processing the target feature map through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
According to another aspect of the present invention, there is provided a resolution reconstruction apparatus based on a convolutional neural network, including:
the data input module is used for acquiring a target test set and inputting the target test set into the trained target super-resolution reconstruction model;
the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a characteristic fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
the first convolution processing module is used for determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer;
the fusion processing module is used for determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetrical layer, and carrying out fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fusion feature map;
The second convolution processing module is used for determining a basic feature map which corresponds to the fusion feature map and meets the second tensor requirement through the second convolution layer;
the third convolution processing module is used for carrying out first summation processing on the basic feature map and the initial feature map to obtain a first summation feature map, and determining a target feature map which corresponds to the first summation feature map and meets the third tensor requirement through the third convolution layer;
and the reconstruction image generation module is used for processing the target feature image through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction image corresponding to the target test image.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the convolutional neural network-based resolution reconstruction method of any one embodiment of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement a resolution reconstruction method based on a convolutional neural network according to any one of the embodiments of the present invention when executed.
According to the technical scheme, the target test set is input into a trained target super-resolution reconstruction model comprising a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected. An initial feature map which corresponds to a target test image in the target test set and meets the first tensor requirement is determined through the first convolution layer. A basic texture feature map and a target texture feature map corresponding to the initial feature map are determined through the densely connected symmetrical layer, and fused through the feature fusion layer to obtain a fusion feature map. A basic feature map which corresponds to the fusion feature map and meets the second tensor requirement is determined through the second convolution layer. A first summation processing is then performed on the basic feature map and the initial feature map to obtain a first summation feature map, and a target feature map which corresponds to the first summation feature map and meets the third tensor requirement is determined through the third convolution layer. Finally, the target feature map is upsampled through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image. Reconstruction of super-resolution images is thereby realized, and the generation rate and accuracy of super-resolution images are improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention; other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of a resolution reconstruction method based on a convolutional neural network according to a first embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a self-calibration convolution module according to a first embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a channel attention module according to a first embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a spatial attention module according to a first embodiment of the present invention.
Fig. 5 is a schematic structural diagram of an attention mechanism module according to a first embodiment of the present invention.
Fig. 6 is a schematic structural diagram of a target super-resolution reconstruction model according to a first embodiment of the present invention.
Fig. 7 is a flowchart of a resolution reconstruction method based on a convolutional neural network according to a second embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a resolution reconstruction device based on a convolutional neural network according to a third embodiment of the present invention.
Fig. 9 is a schematic structural diagram of an electronic device implementing a convolutional neural network-based resolution reconstruction method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," "third," "target," "initial," and the like in the description and claims of the present invention and in the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a resolution reconstruction method based on a convolutional neural network according to an embodiment of the present invention. The method may be performed by a resolution reconstruction device based on a convolutional neural network, which may be implemented in hardware and/or software and configured in an electronic device, for example a computer device. As shown in fig. 1, the method includes:
s110, acquiring a target test set, and inputting the target test set into a trained target super-resolution reconstruction model; the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a characteristic fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected.
Wherein the target test set may refer to a data set selected for high resolution image reconstruction. The target test set may be, for example, a dataset containing a large number of low resolution radiographic images.
The target super-resolution reconstruction model may refer to a model that processes a data image in a target test set to generate a high-resolution image.
S120, determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer.
The target test image may refer to a data image selected for high-resolution image reconstruction in the target test set. The first convolution layer may refer to the first convolution layer constructed in the target super-resolution reconstruction model. The first convolution layer is mainly used for extracting features in the target test image. In an embodiment of the present invention, the convolution kernel size of the first convolution layer may be 3×3.
Wherein the first tensor requirement may refer to a defined requirement of the output characteristic according to the size of the convolution kernel in the first convolution layer. By way of example, a series of nonlinear transformations may be included in the first convolution layer, such as convolution operations, batch normalization, and activation functions. The initial feature map may refer to a feature map output by the first convolution layer that meets the first tensor requirement.
Specifically, the first convolution layer first processes the target test image by the convolution formula $y = \sum_{i=1}^{D} w_i * x_i$, where $x_i$ denotes the input tensor of the $i$-th of $D$ channels and $w_i$ denotes the weight of the convolution kernel corresponding to the $i$-th channel. The output then meets the first tensor requirement $O = \frac{I - K + 2P}{S} + 1$, where $O$ denotes the size of the output tensor, $I$ the size of the input tensor, $K$ the size of the convolution kernel, $P$ the zero padding number, and $S$ the movement step size. Further, the batch normalization formula $\hat{x}_i = \gamma \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$ transforms the output tensor of each layer into a normal distribution with a mean of 0 and a variance of 1, where $x_i$ denotes the $i$-th feature of the output tensor, $\mu$ the feature mean of the output tensor, $\sigma^2$ the feature variance, $\gamma$ and $\beta$ the introduced learnable parameters, and $\epsilon$ a small constant added to the mini-batch variance. Finally, the Rectified Linear Unit (ReLU) activation function $f(y) = \max(0, y)$ applies a nonlinear transformation to the normalized output tensor $y$. An initial feature map meeting the first tensor requirement is thus obtained.
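As a minimal sketch only, this first convolution stage can be written in PyTorch as below; the channel counts, padding, stride, and input size are illustrative assumptions, since the patent fixes only the 3×3 kernel size.

```python
import torch
import torch.nn as nn

class FirstConv(nn.Module):
    """First convolution layer: 3x3 convolution + batch normalization + ReLU."""
    def __init__(self, in_channels=1, out_channels=64):
        super().__init__()
        # With K=3, P=1, S=1 the spatial size is preserved: O = (I - 3 + 2*1)/1 + 1 = I
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_channels)   # zero-mean, unit-variance per channel
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

# Example: a single-channel 256x256 test image yields a 64-channel initial feature map.
x = torch.randn(1, 1, 256, 256)
print(FirstConv()(x).shape)  # torch.Size([1, 64, 256, 256])
```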
S130, determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and fusing the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fused feature map.
The dense connection symmetric layer may refer to a layer that performs feature extraction on the initial feature map to obtain a base texture feature map and a target texture feature map. A base texture feature map may refer to an image that contains the basic texture features in the initial feature map. The target texture feature map may refer to an image in which feature refinement extraction is performed on the basis of the base texture feature map.
The feature fusion layer may refer to a layer that performs fusion processing on the basic texture feature map and the target texture feature map. The fused feature map may refer to a feature map generated by fusing the basic texture feature map and the target texture feature map.
In an optional embodiment, the determining, by the densely connected symmetrical layer, the basic texture feature map and the target texture feature map corresponding to the initial feature map includes: calibrating the initial feature map through a first self-calibration convolution module in the densely connected symmetrical layer to generate a basic texture feature map corresponding to the initial feature map; refining the basic texture feature map through a target attention mechanism module in the densely connected symmetrical layer to obtain a refined feature map; and performing a second summation processing on the refined feature map and the initial feature map to obtain a second summation feature map, and calibrating the second summation feature map through a second self-calibration convolution module in the densely connected symmetrical layer to obtain a target texture feature map corresponding to the initial feature map.
Wherein the first Self-calibrating convolution module may refer to a first layer Self-calibrating convolution (Self-Calibrated Convolutions Block, SCCB) module in a densely connected symmetrical layer. In general, the first self-calibrating convolution module may comprise three self-calibrating convolution modules. The second self-calibrating convolution module may refer to a second layer self-calibrating convolution module in the densely connected symmetrical layer. In general, the second self-calibrating convolution module may comprise three self-calibrating convolution modules.
The calibration process may refer to a process operation of performing fusion calibration on the multi-scale features and the spatial domain features in the initial feature map.
Fig. 2 is a schematic structural diagram of a self-calibration convolution module according to an embodiment of the present invention. Specifically, in the embodiment of the present invention, the self-calibration convolution module may comprise three branches. After the initial feature map enters the self-calibration convolution module, it is input into the three branches respectively. The first branch undergoes a 1×1 convolution operation, a 3×3 convolution operation, and a Leaky ReLU activation function. The second branch undergoes a 1×1 convolution operation and a 3×3 convolution operation, performs a matrix multiplication with the output of the first branch, and the result is input to the fusion layer. The third branch is input to the fusion layer after a 1×1 convolution operation and a 3×3 convolution operation. The result output by the fusion layer undergoes a 1×1 convolution operation and is then element-wise added to the residual-connected input of the SCCB module; this final result is the output of the SCCB module. The Leaky ReLU activation function is given by $f(x) = x$ for $x > 0$ and $f(x) = \alpha x$ for $x \le 0$, where $\alpha$ denotes a small non-zero slope and $x$ denotes the output of the 3×3 convolution operation in the first branch. Thus, the SCCB module halves the number of channels using the pixel convolutions of two branches; the first two branches compute pixel attention information and thereby extract multi-scale features, while the third branch restores spatial domain information. Finally, the attention information is fused with the spatial domain information by pixel convolution to purposefully restore lost texture information.
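The following is a hedged PyTorch sketch of the SCCB structure described above. The channel counts, the Leaky ReLU slope, concatenation as the fusion operation, and reading the inter-branch matrix multiplication as an element-wise product are all assumptions beyond what the patent states.

```python
import torch
import torch.nn as nn

class SCCB(nn.Module):
    """Self-Calibrated Convolutions Block: three branches + residual, per Fig. 2."""
    def __init__(self, channels=64):
        super().__init__()
        mid = channels // 2  # the pixel (1x1) convolutions halve the channel count
        def branch():
            return nn.Sequential(nn.Conv2d(channels, mid, 1),
                                 nn.Conv2d(mid, mid, 3, padding=1))
        self.branch1 = branch()                      # followed by Leaky ReLU
        self.branch2 = branch()                      # interacts with branch 1
        self.branch3 = branch()                      # restores spatial-domain info
        self.act = nn.LeakyReLU(negative_slope=0.1)  # slope alpha is an assumption
        self.fuse = nn.Conv2d(mid * 2, channels, 1)  # fusion + 1x1 conv before residual add

    def forward(self, x):
        b1 = self.act(self.branch1(x))
        # Attention-style interaction between branches 1 and 2 (element-wise product
        # here; the patent's figure calls this a matrix multiplication).
        b2 = self.branch2(x) * b1
        b3 = self.branch3(x)
        fused = self.fuse(torch.cat([b2, b3], dim=1))
        return fused + x  # residual connection with the block input
```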
The target attention mechanism module may refer to an attention mechanism module (Convolutional Block Attention Module, CBAM) in the dense connection symmetry layer that performs local detail feature refinement processing. The refinement process may refer to a process operation that performs a refinement fusion of channel attention features and spatial attention features in the base texture feature map. The refined feature map may refer to a feature map obtained by refining the basic texture feature map.
Wherein the second summation process may refer to adding the refined feature map to the initial feature map. The second summation feature map may refer to the feature map obtained by performing the second summation processing on the refined feature map and the initial feature map.
Specifically, after the initial feature map is input to the densely connected symmetrical layer, it is first calibrated by the first self-calibration convolution module to generate the basic texture feature map corresponding to the initial feature map. The basic texture feature map is then refined by the target attention mechanism module to generate a refined feature map, and the refined feature map and the initial feature map undergo the second summation processing to generate a second summation feature map. Finally, the second summation feature map is calibrated by the second self-calibration convolution module, which outputs the target texture feature map corresponding to the initial feature map. A basic texture feature map and a target texture feature map corresponding to the initial feature map can thus be generated, providing an effective basis for subsequent operations.
In an optional implementation manner, the refining the basic texture feature map through the target attention mechanism module in the dense connection symmetry layer to obtain a refined feature map includes: carrying out first pooling treatment on the basic texture feature map through a channel attention module in the target attention mechanism module to obtain a channel attention feature map corresponding to the basic texture feature map; performing first weighting processing on the channel attention feature map and the basic texture feature map to obtain a weighted feature map; performing second pooling processing on the channel attention feature map through a space attention module in the target attention mechanism module to obtain a space attention feature map corresponding to the channel attention feature map; and carrying out second weighting processing on the spatial attention characteristic diagram and the weighted characteristic diagram to obtain a refined characteristic diagram.
The channel attention module may refer to a module for extracting channel attention features in the basic texture feature map. The first pooling process may refer to a downsampling process operation of the underlying texture feature map with a pooling layer, a convolution layer, and an activation function. The channel attention feature map may refer to a feature map obtained by performing a first pooling process on the base texture feature map.
Fig. 3 is a schematic structural diagram of a channel attention module according to an embodiment of the present invention. Specifically, after the basic texture feature map of size H×W×C is input to the target attention mechanism module, two feature maps of size 1×1×C are first obtained through a global maximum pooling layer and a global average pooling layer respectively. The two feature maps are then each sent through a shared two-layer multilayer perceptron (Multilayer Perceptron, MLP) neural network, and the features output by the MLP are added element-wise and activated by a ReLU function to complete the first pooling process, thereby obtaining the channel attention feature map corresponding to the basic texture feature map.
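A minimal sketch of this channel attention module follows; the reduction ratio and the 1×1-convolution implementation of the shared two-layer MLP are assumptions. The final ReLU follows the patent's wording (the standard CBAM formulation uses a sigmoid here).

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention per Fig. 3: global max/avg pooling -> shared 2-layer MLP."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        # Shared two-layer MLP, implemented with 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.act = nn.ReLU()  # the patent activates the summed MLP outputs with ReLU

    def forward(self, x):                                    # x: (B, C, H, W)
        max_feat = torch.amax(x, dim=(2, 3), keepdim=True)   # global max pool, 1x1xC
        avg_feat = torch.mean(x, dim=(2, 3), keepdim=True)   # global average pool
        return self.act(self.mlp(max_feat) + self.mlp(avg_feat))  # (B, C, 1, 1)
```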
The first weighting process may refer to a weighting operation performed on the channel attention feature map and the base texture feature map. The weighted feature map may refer to a feature map generated by weighting a channel attention feature map and a base texture feature map.
The spatial attention module may refer to a module for extracting spatial attention features in the channel attention feature map. The second pooling process may refer to a downsampling process operation of the channel attention profile using the pooling layer, the convolution layer, and the activation function. The spatial attention profile may refer to a profile obtained by performing a second pooling process on the channel attention profile.
Fig. 4 is a schematic structural diagram of a spatial attention module according to an embodiment of the present invention. Specifically, the channel attention feature map output by the channel attention module serves as the input feature map of the spatial attention module. Two feature maps of size H×W×1 are first obtained through a global maximum pooling layer and a global average pooling layer, the two feature maps are then fused along the channel dimension and passed through a 7×7 convolution layer, and finally the second pooling process is completed by Sigmoid activation, yielding the spatial attention feature map corresponding to the channel attention feature map.
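A corresponding sketch of the spatial attention module; the channel-wise max/average pooling realizes the H×W×1 maps described above, and the 7×7 convolution and Sigmoid activation follow the text.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Spatial attention per Fig. 4: channel max/avg -> concat -> 7x7 conv -> sigmoid."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                               # x: (B, C, H, W)
        max_feat = torch.amax(x, dim=1, keepdim=True)   # (B, 1, H, W), the HxWx1 map
        avg_feat = torch.mean(x, dim=1, keepdim=True)   # (B, 1, H, W)
        fused = torch.cat([max_feat, avg_feat], dim=1)  # channel-based fusion
        return torch.sigmoid(self.conv(fused))          # (B, 1, H, W)
```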
The second weighting process may refer to a weighting operation performed on the spatial attention profile and the weighted profile.
Fig. 5 is a schematic structural diagram of an attention mechanism module according to an embodiment of the present invention. Specifically, after the basic texture feature map output by the first self-calibration convolution module is input to the target attention mechanism module, the channel attention feature map corresponding to the basic texture feature map is first obtained through the channel attention module. The channel attention feature map and the basic texture feature map then undergo the first weighting process to obtain a weighted feature map, and the channel attention feature map undergoes the second pooling process through the spatial attention module to obtain the spatial attention feature map corresponding to the channel attention feature map. Finally, the spatial attention feature map and the weighted feature map undergo the second weighting process to obtain the refined feature map.
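Putting the two modules together, a hedged sketch of the target attention mechanism module follows. It assumes the ChannelAttention and SpatialAttention sketches above, interprets each weighting process as an element-wise multiplication, and, like standard CBAM, derives the spatial attention from the channel-weighted map; none of these choices is fixed explicitly by the patent.

```python
import torch.nn as nn

class CBAM(nn.Module):
    """Target attention mechanism module per Fig. 5 (assumes the two sketches above)."""
    def __init__(self, channels=64):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):           # x: basic texture feature map, (B, C, H, W)
        ca_map = self.ca(x)         # channel attention feature map, (B, C, 1, 1)
        weighted = ca_map * x       # first weighting process
        # Standard CBAM derives the spatial attention from the channel-weighted map;
        # the patent text feeds the channel attention map itself.
        sa_map = self.sa(weighted)  # second pooling process
        return sa_map * weighted    # second weighting -> refined feature map
```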
And S140, determining a basic feature map which corresponds to the fusion feature map and meets the second tensor requirement through the second convolution layer.
The second convolution layer may refer to a second convolution layer constructed in the target super-resolution reconstruction model. The second convolution layer is mainly used for extracting the features in the fusion feature map. In the embodiment of the invention, the convolution kernel size of the second convolution layer may be 1×1, so that the calculation amount may be reduced by changing the number of channels.
Wherein the second tensor requirement may refer to a defined requirement for the output characteristic based on the size of the convolution kernel in the second convolution layer. It should be noted that, in the embodiment of the present invention, the specific rule of the second tensor requirement is consistent with the first tensor requirement. The base profile may refer to a profile output by the second convolution layer that meets the second tensor requirement.
And S150, performing first summation processing on the basic feature map and the initial feature map to obtain a first summation feature map, and determining a target feature map which corresponds to the first summation feature map and meets the requirement of a third tensor through the third convolution layer.
Wherein the first addition process may refer to adding the base feature map to the initial feature map. The first addition feature map may refer to a feature map obtained by performing a first addition process on the base feature map and the initial feature map.
The third convolution layer may refer to a third convolution layer constructed in the target super-resolution reconstruction model. The third convolution layer is mainly used for extracting the features in the first summation feature map. In an embodiment of the present invention, the convolution kernel size of the third convolution layer may be 3×3.
Wherein the third tensor requirement may refer to a defined requirement for the output characteristic based on the size of the convolution kernel in the third convolution layer. It should be noted that, in the embodiment of the present invention, the specific rule of the third tensor requirement is consistent with the first tensor requirement. The target feature map may refer to a feature map output by the third convolution layer that meets a third tensor requirement.
And S160, processing the target feature map through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
A sub-pixel convolution (Pixel Shuffle) layer may refer to a layer that improves the resolution of the target feature map. Illustratively, the sub-pixel convolution layer may be an upsampling layer that generates a high-resolution reconstruction map from the low-resolution target feature map through convolution and inter-channel recombination.
The up-sampling process may refer to an operation of expanding and amplifying the target feature map to generate a required size. The upsampling process may be implemented by an interpolation method or by a deconvolution method, which is not limited in this embodiment of the present invention. The resolution reconstruction map may refer to a reconstructed image with an improved resolution.
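As an illustration, sub-pixel upsampling can be sketched with PyTorch's PixelShuffle; the scale factor and channel counts below are assumptions.

```python
import torch
import torch.nn as nn

class SubPixelUpsample(nn.Module):
    """Sub-pixel convolution layer: a convolution expands the channels, then
    PixelShuffle regroups them into spatial resolution (scale x scale upsampling)."""
    def __init__(self, channels=64, out_channels=1, scale=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, out_channels * scale ** 2, 3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

# Example: a (1, 64, 128, 128) target feature map becomes a (1, 1, 256, 256)
# resolution reconstruction map for scale=2.
x = torch.randn(1, 64, 128, 128)
print(SubPixelUpsample()(x).shape)  # torch.Size([1, 1, 256, 256])
```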
According to the technical scheme, the target test set is input into a trained target super-resolution reconstruction model comprising a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected. An initial feature map which corresponds to a target test image in the target test set and meets the first tensor requirement is determined through the first convolution layer. A basic texture feature map and a target texture feature map corresponding to the initial feature map are determined through the densely connected symmetrical layer, and fused through the feature fusion layer to obtain a fusion feature map. A basic feature map which corresponds to the fusion feature map and meets the second tensor requirement is determined through the second convolution layer. A first summation processing is then performed on the basic feature map and the initial feature map to obtain a first summation feature map, and a target feature map which corresponds to the first summation feature map and meets the third tensor requirement is determined through the third convolution layer. Finally, the target feature map is upsampled through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image. Reconstruction of super-resolution images is thereby realized, and the generation rate and accuracy of super-resolution images are improved.
Fig. 6 is a schematic structural diagram of a target super-resolution reconstruction model according to an embodiment of the present invention. Specifically, after the target test set is input to the trained target super-resolution reconstruction model, an initial feature map meeting the first tensor requirement is generated through the first convolution layer, and the initial feature map is then input into two branches. The first branch enters the first self-calibration convolution module 11 of the first self-calibration convolution module group; after the operations of module 11, the result is input, under residual connection, to the middle self-calibration convolution module 12, the target attention mechanism module, and the feature fusion layer. After the operations of the middle module 12, the result is input, under residual connection, to the last self-calibration convolution module 13, the target attention mechanism module, and the feature fusion layer. After the operations of the last module 13, the result is input, under residual connection, to the target attention mechanism module and the feature fusion layer. The second branch enters the first self-calibration convolution module 21 of the second self-calibration convolution module group; after its operations, the result is input, under residual connection, to the middle self-calibration convolution module 22 and the feature fusion layer. After the operations of the middle module 22, the result is input, under residual connection, to the last self-calibration convolution module 23 and the feature fusion layer. After the operations of the last module 23, the result is input to the feature fusion layer. Further, the basic feature map output by the second convolution layer is connected with the initial feature map output by the first convolution layer, a first summation processing is performed to obtain a first summation feature map, and a target feature map which corresponds to the first summation feature map and meets the third tensor requirement is determined through the third convolution layer. Finally, the target feature map is upsampled through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
Example two
Fig. 7 is a flowchart of a resolution reconstruction method based on a convolutional neural network according to a second embodiment of the present invention. On the basis of the above embodiment, this embodiment specifically adds operations performed before the acquisition of the target test set, which may include: acquiring an initial image data set containing low-resolution images, and dividing the initial image data set according to a first set proportion to obtain a basic training set; performing image preprocessing on the basic training images in the basic training set to obtain a data-expanded target training set; inputting the target training images in the target training set into a basic super-resolution reconstruction model to generate a predicted image and a prediction score corresponding to the predicted image; and training the basic super-resolution reconstruction model based on the association relation between the prediction score and the real score to obtain a trained target super-resolution reconstruction model. As shown in fig. 7, the method includes:
s210, acquiring an initial image data set containing a low-resolution image, and dividing and processing the initial image data set according to a first set proportion to obtain a basic training set.
Wherein the initial image dataset may refer to a dataset of pre-collected industrial X-ray images. Notably, the image data contained in the initial image dataset is a low resolution radiographic image.
The first setting proportion may be a preset value for normalizing the data amount in the basic training set. The first set proportion may be, for example, 80%. The basic training set may refer to a data set composed of image data satisfying a first set proportion among the initial image data sets.
Specifically, after the initial image data set is collected, the image data of the first set proportion in the initial image data set can be divided and extracted to form a basic training set, so that an effective basis is provided for subsequent training operation.
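As an illustration only, the split can be performed with torch.utils.data.random_split; the dataset object below is a hypothetical stand-in, and the 80/20 proportions follow the examples given in this embodiment.

```python
import torch
from torch.utils.data import random_split, TensorDataset

# Stand-in for the collected low-resolution X-ray images (hypothetical data).
dataset = TensorDataset(torch.randn(1000, 1, 256, 256))

n_train = int(0.8 * len(dataset))              # first set proportion: 80%
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
print(len(train_set), len(test_set))           # 800 200
```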
S220, performing image preprocessing on the basic training images in the basic training set to obtain a data-expanded target training set.
The base training image may refer to each low resolution image contained in the base training set, among other things. Image preprocessing may refer to an operation of preprocessing the size of the base training image. For example, the image preprocessing may include a size scaling operation, may include a normalization operation, and the like. The target training set may refer to a data set generated after image preprocessing of the base training image.
In an alternative embodiment, the image preprocessing the basic training image in the basic training set to obtain a target training set expanded by data, including: size modification processing is carried out on each basic training image to obtain a standard size image meeting the set size requirement; randomly cutting and processing each standard size image to obtain an incremental training image with data expansion; and carrying out standard normalization processing on each increment training image to obtain normalized increment images, and carrying out combination processing on the normalized increment images corresponding to each basic training image to obtain a target training set expanded by data.
The size modification process may refer to an operation of modifying and adjusting the size of the basic training image according to a preset size requirement. The set size requirement may refer to a predetermined size of the base training image. Illustratively, the sizing requirement may be 256×256. The standard size image may refer to a training image obtained by performing a size modification process on the basic training image.
Specifically, take a basic training image size of 1536×1536 and a set size requirement of 256×256 as an example. The modification of the basic training image size from 1536×1536 to 256×256 may be accomplished by calling the torchvision.transforms.Resize function.
The random cropping process may refer to an operation of randomly cropping an image of a set size out of the standard-size image. The incremental training image may refer to a training image obtained by randomly cropping a standard-size image. For example, standard-size images may be randomly cropped to produce incremental training images of size 224×224. Specifically, the torchvision.transforms.RandomCrop function can be used for this.
The standard normalization processing may refer to an operation of normalizing and normalizing each incremental training image. The normalized incremental image may refer to a training image obtained by performing standard normalization processing on the incremental training image.
The normalization and standardization may specifically be performed as $X_1 = \frac{X - \min}{\max - \min}$ and $X_0 = \frac{X_1 - \mu}{\sigma}$, where $X$ denotes the picture tensor of the incremental training image, $\max$ and $\min$ denote the maximum and minimum values of the picture tensor, $\mu$ denotes the mean value of the incremental training images, $\sigma$ denotes the standard deviation, $X_1$ denotes the normalized picture tensor, and $X_0$ denotes the standardized picture tensor.
It is noted that, in the embodiment of the present invention, after each standard-size image is randomly cropped to obtain a data-expanded incremental training image, the incremental training image may be randomly horizontally flipped with a probability of 0.5 and converted into tensor format, and each incremental training image in tensor format is then subjected to the standard normalization processing to obtain a normalized incremental image. Specifically, the torchvision.transforms.RandomHorizontalFlip, torchvision.transforms.ToTensor and torchvision.transforms.Normalize functions can be used for these steps.
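A hedged end-to-end sketch of this preprocessing pipeline with torchvision follows; the normalization statistics are placeholders, since the patent computes them from the incremental training images rather than fixing values.

```python
from torchvision import transforms

# Illustrative mean/std; in the patent these come from the incremental training images.
MEAN, STD = [0.5], [0.5]

preprocess = transforms.Compose([
    transforms.Resize((256, 256)),            # size modification to the set size
    transforms.RandomCrop(224),               # random cropping for data expansion
    transforms.RandomHorizontalFlip(p=0.5),   # random flip with probability 0.5
    transforms.ToTensor(),                    # convert to tensor format (scales to [0, 1])
    transforms.Normalize(MEAN, STD),          # standard normalization
])
```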
S230, inputting the target training images in the target training set into a basic super-resolution reconstruction model, and generating a prediction image and a prediction score corresponding to the prediction image.
Wherein, the target training image may refer to image data in the target training set. The basic super-resolution reconstruction model may refer to a pre-constructed super-resolution reconstruction model. Notably, the basic super-resolution reconstruction model is consistent with the model structure of the trained target super-resolution reconstruction model.
The predicted image may refer to a reconstruction result generated by inputting the target training image into the basic super-resolution reconstruction model. The prediction score may refer to a numerical value that evaluates the accuracy of the predicted image. Illustratively, the predictive score may be 0.9.
S240, training and processing the basic super-resolution reconstruction model based on the association relation between the prediction scores and the real scores to obtain a target super-resolution reconstruction model after training.
Wherein the true score may refer to an expected score of the output result of the base super-resolution reconstruction model. Illustratively, the true score may be 1.
In an optional implementation manner, the training of the basic super-resolution reconstruction model based on the association relation between the prediction score and the true score to obtain a trained target super-resolution reconstruction model includes: training the basic super-resolution reconstruction model according to a mean absolute error loss function based on the association relation between the prediction score and the true score, to obtain the trained target super-resolution reconstruction model.
Wherein the mean absolute error (Mean Absolute Error, MAE) loss function can be calculated by the formula $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$, where $i$ denotes the sequential label of a target training image in the target training set, $n$ denotes the total number of target training images in the target training set, $\hat{y}_i$ denotes the prediction score output by the basic super-resolution reconstruction model, and $y_i$ denotes the corresponding true score.
Specifically, after the prediction scores corresponding to the predicted images output by the basic super-resolution reconstruction model are generated, the basic super-resolution reconstruction model can be trained and processed by using the MAE function, so that a trained target super-resolution reconstruction model is obtained, and an accurate model foundation is provided for subsequent practical application.
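A minimal training-step sketch under the MAE loss (PyTorch's nn.L1Loss) is given below; the model, optimizer, and batch contents are assumptions for illustration, and the loss is shown between predicted and reference tensors, standing in for the patent's comparison of prediction scores with true scores.

```python
import torch
import torch.nn as nn

def train_step(model, batch_lr, batch_hr, optimizer, criterion=nn.L1Loss()):
    """One training step of the basic super-resolution reconstruction model
    under the mean absolute error loss."""
    optimizer.zero_grad()
    prediction = model(batch_lr)            # predicted reconstruction
    loss = criterion(prediction, batch_hr)  # MAE between prediction and reference
    loss.backward()
    optimizer.step()
    return loss.item()
```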
S250, acquiring an initial image data set containing the low-resolution image, and dividing and processing the initial image data set according to a second set proportion to obtain a target test set.
The second setting proportion may be a preset value for normalizing the data amount in the target test set. The second set proportion may be, for example, 20%. It is noted that the sum of the first set proportion and the second set proportion is 100%.
Specifically, after the initial image data set is collected, the image data of the second set proportion in the initial image data set can be divided and extracted to form a target test set, so that an effective basis is provided for the subsequent actual test.
S260, inputting the target test set into a trained target super-resolution reconstruction model; the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a characteristic fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected.
S270, determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer.
S280, determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and fusing the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fused feature map.
S290, determining a basic feature map which corresponds to the fusion feature map and meets the requirement of a second tensor through the second convolution layer.
And S2100, performing first summation processing on the basic feature map and the initial feature map to obtain a first summation feature map, and determining a target feature map which corresponds to the first summation feature map and meets the requirement of a third tensor through the third convolution layer.
S2110, processing the target feature map through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
S2120, analyzing and processing the resolution reconstruction map, determining defect problems contained in the resolution reconstruction map, and generating a target solution corresponding to a target test image according to the defect problems.
The defect problem may refer to a defect existing in the target test image itself; for example, there may be minute weld defects or wire breakage. The target solution may refer to a solution matched according to the defect problem; for example, if the defect problem is wire breakage, the target solution may be to reconnect the wires on the target object. Detection and analysis of smaller and more complexly structured objects can thus be achieved.
According to the technical scheme, an initial image data set is divided according to a first set proportion to obtain a basic training set, and the basic training images in the basic training set are image-preprocessed to obtain a data-expanded target training set. Further, the target training images in the target training set are input into a basic super-resolution reconstruction model to generate a predicted image and a prediction score corresponding to the predicted image, and the basic super-resolution reconstruction model is trained based on the association relation between the prediction score and the true score to obtain a trained target super-resolution reconstruction model. An initial image data set containing low-resolution images is further obtained and divided according to a second set proportion to obtain a target test set, and the target test set is input into the trained target super-resolution reconstruction model comprising a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer. An initial feature map corresponding to the target test image in the target test set is determined through the first convolution layer; a basic texture feature map and a target texture feature map corresponding to the initial feature map are determined through the densely connected symmetrical layer and fused through the feature fusion layer to obtain a fusion feature map; a basic feature map corresponding to the fusion feature map is determined through the second convolution layer; a first summation processing is performed on the basic feature map and the initial feature map to obtain a first summation feature map, and a target feature map corresponding to the first summation feature map is determined through the third convolution layer; and the target feature map is upsampled through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image. Finally, the resolution reconstruction map is analyzed to determine the defect problems it contains, and a target solution corresponding to the target test image is generated according to the defect problems. The problem of the low generation rate and accuracy of super-resolution images is thereby solved, reconstruction of super-resolution images is realized, and the generation rate and accuracy of super-resolution images are improved.
Example III
Fig. 8 is a schematic structural diagram of a resolution reconstruction device based on a convolutional neural network according to a third embodiment of the present invention. As shown in fig. 8, the apparatus includes: a data input module 310, a first convolution processing module 320, a fusion processing module 330, a second convolution processing module 340, a third convolution processing module 350, and a reconstructed graph generation module 360;
the data input module 310 is configured to obtain a target test set, and input the target test set to a trained target super-resolution reconstruction model; the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a characteristic fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
a first convolution processing module 320, configured to determine, by using the first convolution layer, an initial feature map corresponding to the target test image in the target test set and meeting a first tensor requirement;
the fusion processing module 330 is configured to determine a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and perform fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fused feature map;
A second convolution processing module 340, configured to determine, by using the second convolution layer, a base feature map corresponding to the fused feature map and meeting a second tensor requirement;
a third convolution processing module 350, configured to perform first addition processing on the basic feature map and the initial feature map to obtain a first addition feature map, and determine, by using the third convolution layer, a target feature map corresponding to the first addition feature map and meeting a third tensor requirement;
the reconstructed map generating module 360 is configured to up-sample the target feature map through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
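As a hedged illustration of the sub-pixel up-sampling performed by the reconstructed map generating module 360, the sketch below (the channel count, feature size and scale factor are assumptions) shows how a convolution expands the channel dimension by the square of the scale and nn.PixelShuffle rearranges those channels into spatial positions:

    import torch
    import torch.nn as nn

    scale = 2
    upsample = nn.Sequential(
        nn.Conv2d(64, scale ** 2, kernel_size=3, padding=1),  # 64 -> 4 channels
        nn.PixelShuffle(scale))                               # 4 channels -> 2x2 pixel block

    feat = torch.randn(1, 64, 48, 48)   # target feature map (sizes assumed)
    sr_map = upsample(feat)             # shape (1, 1, 96, 96): resolution reconstruction map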
According to the technical scheme of this embodiment, the target test set is input into a trained target super-resolution reconstruction model comprising a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer connected in sequence. An initial feature map corresponding to the target test image in the target test set and meeting the first tensor requirement is determined through the first convolution layer; a basic texture feature map and a target texture feature map corresponding to the initial feature map are determined through the densely connected symmetrical layer, and the two are fused through the feature fusion layer to obtain a fused feature map; a basic feature map corresponding to the fused feature map and meeting the second tensor requirement is determined through the second convolution layer; the basic feature map and the initial feature map are subjected to first addition processing to obtain a first addition feature map, and a target feature map corresponding to the first addition feature map and meeting the third tensor requirement is determined through the third convolution layer; finally, the target feature map is up-sampled through the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image. In this way, super-resolution image reconstruction is realized, and the generation rate and accuracy of super-resolution images are improved.
Optionally, the resolution reconstruction device based on the convolutional neural network may further include: the model training module specifically comprises a training set construction unit, an image preprocessing unit, a scoring generation unit and a model training unit;
the training set constructing unit is used for acquiring an initial image data set containing a low-resolution image before the target test set is acquired, and dividing and processing the initial image data set according to a first set proportion to obtain a basic training set;
the image preprocessing unit is used for preprocessing the basic training images in the basic training set to obtain a target training set expanded by data;
the score generating unit is used for inputting the target training images in the target training set into a basic super-resolution reconstruction model to generate a predicted image and a prediction score corresponding to the predicted image;
and the model training unit is used for training and processing the basic super-resolution reconstruction model based on the association relation between the prediction scores and the real scores to obtain a target super-resolution reconstruction model after training.
Optionally, the data input module 310 may specifically be configured to: acquiring an initial image data set containing a low-resolution image, and dividing and processing the initial image data set according to a second set proportion to obtain a target test set;
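By way of example, such proportional division could be sketched with torch.utils.data.random_split; the 0.8 ratio below is an assumption introduced for illustration, since the embodiment does not fix the set proportions:

    from torch.utils.data import random_split

    def split_by_proportion(dataset, ratio=0.8):
        # divide the initial image data set according to a set proportion
        n_first = int(len(dataset) * ratio)
        return random_split(dataset, [n_first, len(dataset) - n_first])

    # e.g. basic_training_set, target_test_set = split_by_proportion(initial_image_dataset)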
The resolution reconstruction device based on the convolutional neural network can further comprise a post-processing module, configured to, after the target feature map is up-sampled through the sub-pixel convolution layer to obtain the resolution reconstruction map corresponding to the target test image, analyze the resolution reconstruction map, determine the defect problems contained therein, and generate a target solution corresponding to the target test image according to the defect problems.
Optionally, the image preprocessing unit may specifically be configured to:
size modification processing is carried out on each basic training image to obtain a standard size image meeting the set size requirement;
randomly cutting and processing each standard size image to obtain an incremental training image with data expansion;
and carrying out standard normalization processing on each increment training image to obtain normalized increment images, and carrying out combination processing on the normalized increment images corresponding to each basic training image to obtain a target training set expanded by data.
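A hedged torchvision sketch of these three preprocessing steps is given below; the standard size of 256x256, the crop size of 192 and the normalization statistics are assumptions introduced for illustration and are not specified by the embodiment:

    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((256, 256)),                # size modification: standard size image
        transforms.RandomCrop(192),                   # random cropping: incremental training image
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5], std=[0.5]),  # standard normalization (grayscale assumed)
    ])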
Optionally, the model training unit may specifically be configured to: train and process the basic super-resolution reconstruction model according to an average absolute error loss function, based on the association relation between the prediction score and the real score, to obtain a trained target super-resolution reconstruction model.
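Since the average absolute error is the L1 loss, a minimal training-step sketch could look as follows; the model, optimizer and low-resolution/high-resolution image pair are assumptions and are not specified by the embodiment:

    import torch
    import torch.nn as nn

    def train_step(model, optimizer, lr_img, hr_img):
        criterion = nn.L1Loss()                # average absolute error loss function
        optimizer.zero_grad()
        pred = model(lr_img)                   # predicted image
        loss = criterion(pred, hr_img)         # compares the prediction with the real target
        loss.backward()
        optimizer.step()
        return loss.item()

    # e.g. optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # model assumed built as above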
Optionally, the fusion processing module 330 specifically may include: a first feature map generation unit, a second feature map generation unit, and a third feature map generation unit;
the first feature map generating unit is used for carrying out calibration processing on the initial feature map through a first self-calibration convolution module in the densely connected symmetrical layer to generate a basic texture feature map corresponding to the initial feature map;
the second feature map generating unit is used for refining the basic texture feature map through a target attention mechanism module in the dense connection symmetrical layer to obtain a refined feature map;
and the third feature map generating unit is used for carrying out second addition processing on the refined feature map and the initial feature map to obtain a second addition feature map, and carrying out calibration processing on the second addition feature map through a second self-calibration convolution module in the densely connected symmetrical layer to obtain a target texture feature map corresponding to the initial feature map.
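A hedged sketch of this three-step structure is given below. SelfCalibratedConv is a simplified stand-in for the self-calibration convolution modules (a small gating branch computed at reduced resolution), and the attention argument accepts a module such as the TargetAttention sketched after the next passage; all names and shapes here are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SelfCalibratedConv(nn.Module):
        # Simplified self-calibration: a branch computed on a down-sampled
        # copy of the input gates the output of the main convolution.
        def __init__(self, c):
            super().__init__()
            self.main = nn.Conv2d(c, c, 3, padding=1)
            self.gate = nn.Conv2d(c, c, 3, padding=1)

        def forward(self, x):
            g = F.interpolate(self.gate(F.avg_pool2d(x, 2)),
                              size=x.shape[-2:], mode='bilinear', align_corners=False)
            return self.main(x) * torch.sigmoid(x + g)

    class DenseSymmetricLayer(nn.Module):
        def __init__(self, c, attention=None):
            super().__init__()
            self.calib1 = SelfCalibratedConv(c)                  # first self-calibration module
            self.attn = attention if attention is not None else nn.Identity()
            self.calib2 = SelfCalibratedConv(c)                  # second self-calibration module

        def forward(self, initial):
            basic = self.calib1(initial)       # basic texture feature map
            refined = self.attn(basic)         # refined feature map
            second_sum = refined + initial     # second addition processing
            target = self.calib2(second_sum)   # target texture feature map
            return basic, target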
Optionally, the second feature map generating unit may specifically be configured to:
carrying out first pooling treatment on the basic texture feature map through a channel attention module in the target attention mechanism module to obtain a channel attention feature map corresponding to the basic texture feature map;
Performing first weighting processing on the channel attention feature map and the basic texture feature map to obtain a weighted feature map;
performing second pooling processing on the channel attention feature map through a space attention module in the target attention mechanism module to obtain a space attention feature map corresponding to the channel attention feature map;
and carrying out second weighting processing on the spatial attention characteristic diagram and the weighted characteristic diagram to obtain a refined characteristic diagram.
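These two pooling-and-weighting stages resemble a CBAM-style design, and a hedged sketch is given below; the reduction ratio of 16 and the 7x7 spatial kernel are assumptions, and the second pooling is applied here to the channel-weighted map:

    import torch
    import torch.nn as nn

    class TargetAttention(nn.Module):
        def __init__(self, c, reduction=16):
            super().__init__()
            self.channel_mlp = nn.Sequential(                    # channel attention module
                nn.Conv2d(c, c // reduction, 1), nn.ReLU(),
                nn.Conv2d(c // reduction, c, 1))
            self.spatial_conv = nn.Conv2d(2, 1, 7, padding=3)    # spatial attention module

        def forward(self, basic):
            # first pooling: global average over space -> channel attention feature map
            ca = torch.sigmoid(self.channel_mlp(basic.mean(dim=(2, 3), keepdim=True)))
            weighted = ca * basic                                # first weighting processing
            # second pooling: mean and max over channels -> spatial attention feature map
            pooled = torch.cat([weighted.mean(1, keepdim=True),
                                weighted.amax(1, keepdim=True)], dim=1)
            sa = torch.sigmoid(self.spatial_conv(pooled))
            return sa * weighted                                 # second weighting -> refined map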
The resolution reconstruction device based on the convolutional neural network provided by the embodiment of the invention can execute the resolution reconstruction method based on the convolutional neural network provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 9 shows a schematic diagram of an electronic device 410 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 9, the electronic device 410 includes at least one processor 420 and a memory communicatively coupled to the at least one processor 420, such as a read-only memory (ROM) 430 and a random access memory (RAM) 440. The memory stores a computer program executable by the at least one processor, and the processor 420 may perform various suitable actions and processes according to the computer program stored in the ROM 430 or the computer program loaded from the storage unit 490 into the RAM 440. The RAM 440 may also store various programs and data required for the operation of the electronic device 410. The processor 420, the ROM 430, and the RAM 440 are connected to each other by a bus 450. An input/output (I/O) interface 460 is also connected to the bus 450.
Various components in the electronic device 410 are connected to the I/O interface 460, including: an input unit 470 such as a keyboard, a mouse, etc.; an output unit 480 such as various types of displays, speakers, and the like; a storage unit 490, such as a magnetic disk, an optical disk, or the like; and a communication unit 4100, such as a network card, modem, wireless communication transceiver, etc. The communication unit 4100 allows the electronic device 410 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunications networks.
Processor 420 may be a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of processor 420 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 420 performs the various methods and processes described above, such as a resolution reconstruction method based on a convolutional neural network.
The method comprises the following steps:
acquiring a target test set, and inputting the target test set into a trained target super-resolution reconstruction model;
the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer;
determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and carrying out fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fusion feature map;
Determining a basic feature map which corresponds to the fusion feature map and meets the requirement of a second tensor through the second convolution layer;
performing first addition processing on the basic feature map and the initial feature map to obtain a first addition feature map, and determining a target feature map which corresponds to the first addition feature map and meets the requirement of a third tensor through the third convolution layer;
and processing the target feature map through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to the target test image.
In some embodiments, the convolutional neural network-based resolution reconstruction method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 490. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 410 via the ROM 430 and/or the communication unit 4100. When the computer program is loaded into RAM 440 and executed by processor 420, one or more steps of the convolutional neural network-based resolution reconstruction method described above may be performed. Alternatively, in other embodiments, the processor 420 may be configured to perform a convolutional neural network-based resolution reconstruction method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor capable of receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability in traditional physical hosts and virtual private server (VPS) services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (9)

1. A resolution reconstruction method based on a convolutional neural network, characterized by comprising the following steps:
acquiring a target test set, and inputting the target test set into a trained target super-resolution reconstruction model;
the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
Determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer;
determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetry layer, and carrying out fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fusion feature map;
determining a basic feature map which corresponds to the fusion feature map and meets the requirement of a second tensor through the second convolution layer;
performing first addition processing on the basic feature map and the initial feature map to obtain a first addition feature map, and determining a target feature map which corresponds to the first addition feature map and meets the requirement of a third tensor through the third convolution layer;
processing the target feature map through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to a target test image;
the determining, by the dense connection symmetry layer, a basic texture feature map and a target texture feature map corresponding to the initial feature map includes:
calibrating and processing the initial feature map through a first self-calibration convolution module in the densely connected symmetrical layer, and generating a basic texture feature map corresponding to the initial feature map;
Refining the basic texture feature map through a target attention mechanism module in the dense connection symmetrical layer to obtain a refined feature map;
and performing second addition processing on the refined feature map and the initial feature map to obtain a second addition feature map, and performing calibration processing on the second addition feature map through a second self-calibration convolution module in the densely connected symmetrical layer to obtain a target texture feature map corresponding to the initial feature map.
2. The method of claim 1, further comprising, prior to the acquiring the target test set:
acquiring an initial image data set containing a low-resolution image, and dividing and processing the initial image data set according to a first set proportion to obtain a basic training set;
performing image preprocessing on the basic training images in the basic training set to obtain a data-expanded target training set;
inputting the target training images in the target training set into a basic super-resolution reconstruction model to generate a predicted image and a prediction score corresponding to the predicted image;
and training and processing the basic super-resolution reconstruction model based on the association relation between the predicted score and the real score to obtain a target super-resolution reconstruction model after training.
3. The method of claim 2, wherein the obtaining the target test set comprises:
acquiring an initial image data set containing a low-resolution image, and dividing and processing the initial image data set according to a second set proportion to obtain a target test set;
after the target feature map is processed through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction map corresponding to a target test image, the method further comprises the following steps:
analyzing and processing the resolution reconstruction map, determining defect problems contained in the resolution reconstruction map, and generating a target solution corresponding to a target test image according to the defect problems.
4. The method of claim 2, wherein the performing image preprocessing on the basic training images in the basic training set to obtain a data-expanded target training set comprises:
size modification processing is carried out on each basic training image to obtain a standard size image meeting the set size requirement;
randomly cutting and processing each standard size image to obtain an incremental training image with data expansion;
and carrying out standard normalization processing on each increment training image to obtain normalized increment images, and carrying out combination processing on the normalized increment images corresponding to each basic training image to obtain a target training set expanded by data.
5. The method according to claim 2, wherein the training and processing the basic super-resolution reconstruction model based on the association relation between the prediction score and the real score to obtain a trained target super-resolution reconstruction model comprises:
and training and processing the basic super-resolution reconstruction model according to an average absolute error loss function based on the association relation between the prediction score and the real score to obtain a trained target super-resolution reconstruction model.
6. The method according to claim 1, wherein the refining the basic texture feature map by the target attention mechanism module in the dense connection symmetry layer to obtain a refined feature map comprises:
carrying out first pooling treatment on the basic texture feature map through a channel attention module in the target attention mechanism module to obtain a channel attention feature map corresponding to the basic texture feature map;
performing first weighting processing on the channel attention feature map and the basic texture feature map to obtain a weighted feature map;
performing second pooling processing on the channel attention feature map through a space attention module in the target attention mechanism module to obtain a space attention feature map corresponding to the channel attention feature map;
And carrying out second weighting processing on the spatial attention characteristic diagram and the weighted characteristic diagram to obtain a refined characteristic diagram.
7. A convolutional neural network-based resolution reconstruction device, comprising:
the data input module is used for acquiring a target test set and inputting the target test set into the trained target super-resolution reconstruction model;
the target super-resolution reconstruction model comprises a first convolution layer, a densely connected symmetrical layer, a feature fusion layer, a second convolution layer, a third convolution layer and a sub-pixel convolution layer which are sequentially connected;
the first convolution processing module is used for determining an initial feature map which corresponds to the target test image in the target test set and meets the first tensor requirement through the first convolution layer;
the fusion processing module is used for determining a basic texture feature map and a target texture feature map corresponding to the initial feature map through the dense connection symmetrical layer, and carrying out fusion processing on the basic texture feature map and the target texture feature map through the feature fusion layer to obtain a fusion feature map;
the second convolution processing module is used for determining a basic feature map which corresponds to the fusion feature map and meets the second tensor requirement through the second convolution layer;
the third convolution processing module is used for carrying out first addition processing on the basic feature map and the initial feature map to obtain a first addition feature map, and determining a target feature map which corresponds to the first addition feature map and meets the third tensor requirement through the third convolution layer;
the reconstruction image generation module is used for processing the target feature image through up-sampling of the sub-pixel convolution layer to obtain a resolution reconstruction image corresponding to the target test image;
the fusion processing module is specifically configured to: calibrating and processing the initial feature map through a first self-calibration convolution module in the densely connected symmetrical layer, and generating a basic texture feature map corresponding to the initial feature map;
refining the basic texture feature map through a target attention mechanism module in the dense connection symmetrical layer to obtain a refined feature map;
and performing second addition processing on the refined feature map and the initial feature map to obtain a second addition feature map, and performing calibration processing on the second addition feature map through a second self-calibration convolution module in the densely connected symmetrical layer to obtain a target texture feature map corresponding to the initial feature map.
8. An electronic device, the electronic device comprising:
At least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the convolutional neural network-based resolution reconstruction method of any one of claims 1-6.
9. A computer readable storage medium storing computer instructions for causing a processor to implement the convolutional neural network-based resolution reconstruction method of any one of claims 1-6 when executed.
CN202311639919.0A 2023-12-04 2023-12-04 Resolution reconstruction method, device, equipment and medium based on convolutional neural network Active CN117372261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311639919.0A CN117372261B (en) 2023-12-04 2023-12-04 Resolution reconstruction method, device, equipment and medium based on convolutional neural network


Publications (2)

Publication Number Publication Date
CN117372261A true CN117372261A (en) 2024-01-09
CN117372261B CN117372261B (en) 2024-02-27


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110415170A (en) * 2019-06-24 2019-11-05 武汉大学 A kind of image super-resolution method based on multiple dimensioned attention convolutional neural networks
CN113674149A (en) * 2021-07-20 2021-11-19 南京航空航天大学 Novel super-resolution reconstruction method based on convolutional neural network
WO2022027216A1 (en) * 2020-08-04 2022-02-10 深圳高性能医疗器械国家研究院有限公司 Image denoising method and application thereof
CN114818774A (en) * 2022-03-15 2022-07-29 南京航空航天大学 Intelligent gearbox fault diagnosis method based on multi-channel self-calibration convolutional neural network
CN114881856A (en) * 2022-04-21 2022-08-09 华南理工大学 Human body image super-resolution reconstruction method, system, device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Y. GUO et al.: Super-Resolution Image Reconstruction Based on Self-Calibrated Convolutional GAN, arXiv.org
ZHOU Huaping et al.: Image super-resolution reconstruction network based on self-calibrated dual attention, Journal of Anhui University of Science and Technology (Natural Science Edition)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant