CN118333859A - Method for realizing super resolution applied to automation equipment - Google Patents
Method for realizing super resolution applied to automation equipment
- Publication number: CN118333859A (application CN202410733970.6A)
- Authority: CN (China)
- Prior art keywords: layer, image, channels, convolution, resolution
- Prior art date: 2024-06-07
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a method for realizing super-resolution in automation equipment. Its aim is to reconstruct high-quality high-resolution images without increasing hardware cost, thereby improving the positioning accuracy of the vision module and raising either the sorting success rate of ultra-small-element sorting equipment or the success rate of detecting small-size defects on AOI equipment. The invention comprises the following steps: step S1: create a sample dataset S; step S2: construct a generator model; step S3: construct a discriminator model; step S4: create a VGG model; step S5: train the models; step S6: deploy the trained model between the camera module and the visual positioning module for precise sorting of ultra-small elements or for visual detection of small-size defects. The invention is applied in the technical field of automation equipment that requires super resolution.
Description
Technical Field
The invention relates to a method for realizing super-resolution, in particular to a method for realizing super-resolution applied to automation equipment.
Background
Background 1: Visual sorting equipment sorts elements of different categories or different quality grades into different containers. The sorting equipment collects images of its working area through a camera module; the vision module locates each element in the image, converts that position into a position in the working area, and guides the sorting device to complete sorting. The workflow of the sorting device is shown in Fig. 1. For ultra-small elements, the accuracy with which the vision module locates the elements is an important bottleneck of the system: an ultra-small element occupies too few pixels in the image, its features are not fully represented, and positioning therefore becomes difficult.
Background 2: With the development of automation technology, automated optical inspection (AOI) has gradually replaced manual defect detection. Compared with manual inspection, AOI is not influenced by subjective factors and applies a quantifiable, unified standard, effectively guaranteeing product quality. AOI requires an industrial camera combined with an industrial light source to capture images, and uses digital image processing algorithms to complete defect inspection. Digital image processing methods include conventional image processing and deep-learning-based image processing; both identify and localize defects by detecting the distinctive features that defects present in the image. The workflow is shown in Fig. 2. Product surface defects are generally varied in form and differ greatly in size. Because the field of view of the image acquisition system must accommodate products of many sizes, a small-size defect offers too few identifiable features in the image, which easily prevents the AOI equipment from detecting it accurately.
To solve the problems faced in both backgrounds above, a higher-resolution input image must be provided to the vision module. At present there are two main ways to obtain high-resolution images: first, improve the precision of the camera module and capture high-resolution images directly; second, reconstruct a Low-Resolution (LR) image into a High-Resolution (HR) image with a Super-Resolution (SR) method. A high-precision camera module significantly increases the production cost of the equipment and reduces its market competitiveness. Reconstructing the high-resolution image with a super-resolution method is therefore the more economical and practical design.
Super-resolution methods can be divided into interpolation-based methods and learning-based methods. Interpolation-based methods interpolate the gray level or color of image pixels according to prior knowledge; candidates include bilinear interpolation, bicubic interpolation and edge-preserving interpolation. Such methods are generally fast, but the texture regions of the reconstructed high-resolution image are blurred. Early learning-based methods were mostly dictionary methods: the image is partitioned into small patches and grouped, a dictionary of correspondences between low-resolution and high-resolution patches is built, and newly acquired low-resolution images are reconstructed with this dictionary. If an acquired low-resolution image is not similar to the images in the dictionary, the reconstructed high-resolution image is distorted. With the development of deep learning, deep-learning-based super-resolution has taken an important place among learning-based methods. These methods learn the relation between low-resolution and high-resolution images through model training and obtain a model that can be used to reconstruct high-resolution images. Deep-learning-based super-resolution directly optimizes the model output during training, without manually supplied priors or hand-designed features, and therefore has better robustness.
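For reference, interpolation-based upscaling amounts to a single resize call. The minimal sketch below uses OpenCV bicubic interpolation; the ×4 factor and the file names are illustrative assumptions, not values taken from the patent.

```python
import cv2

# Bicubic interpolation baseline: fast, but reconstructed textures tend to be blurred.
lr = cv2.imread("component_lr.png")                      # hypothetical low-resolution capture
hr_bicubic = cv2.resize(lr, None, fx=4, fy=4,            # 4x upscaling in both dimensions
                        interpolation=cv2.INTER_CUBIC)
cv2.imwrite("component_bicubic_x4.png", hr_bicubic)
```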
Although many deep-learning-based super-resolution methods have been proposed, they still have the following problems: 1. during training, some methods focus only on the similarity between the reconstruction of the current image and its corresponding high-resolution image, which easily leads to overfitting, poor generalization, and failure on newly acquired images; 2. during training, other methods focus only on the similarity between the reconstruction of the current image and the dataset images as a whole, so the reconstructed high-resolution image cannot accurately recover the details of the image.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for realizing super resolution in automation equipment. The goal is to reconstruct high-quality high-resolution images without increasing hardware cost, thereby improving the positioning accuracy of the vision module and raising either the sorting success rate of ultra-small-element sorting equipment or the success rate of detecting small-size defects on AOI equipment. In addition, the invention is plug-and-play: simply placing the model between the camera module and the visual positioning module improves the positioning accuracy of the visual detection module or its AOI detection precision.
The technical scheme adopted by the invention is as follows: the method for realizing super resolution applied in the automation equipment comprises the following steps:
Step S1: creating a sample dataset S, collecting low-resolution images and corresponding high-resolution images of the device workspace in pairs, S = {(LR_i, HR_i)}, where LR_i and HR_i represent the low-resolution image and the corresponding high-resolution image of the i-th ultra-small element in the dataset S, respectively;
Step S2: constructing a generator model, wherein a generator G consists of a plurality of convolution modules, a nonlinear module and an up-sampling module and is responsible for reconstructing an input low-resolution image into a corresponding high-resolution image;
Step S3: constructing a discriminator model, wherein a discriminator D consists of a plurality of convolution modules, a nonlinear module and a downsampling module and is responsible for discriminating whether an input image is a reconstructed image or a real high-resolution image in the training process;
step S4: creating a VGG model, wherein the VGG model extracts image characteristics and is used for calculating model training loss;
Step S5: training a model, namely setting optimizers of a generator G and a discriminator D as Adam and SGD respectively, setting a learning rate scheduling scheme as cosine simulated annealing, and alternately training the generator G and the discriminator D;
step S6: the trained model is deployed between a camera module and a visual positioning module for precise sorting of ultra-small elements or for visual detection of small-size defects.
Further, the apparatus in step S1 is an ultra-small component sorting apparatus, or an AOI optical inspection apparatus for small-size defect inspection.
Further, the generator G constructed in step S2 comprises a first convolution layer, a first residual block, a second residual block, a second convolution layer, a first BN layer, a first residual layer, a first upsampling block, a second upsampling block and a third convolution layer; after the low-resolution image is input to the first convolution layer, it passes sequentially through the first residual block, the second residual block, the second convolution layer, the first BN layer, the first residual layer, the first upsampling block, the second upsampling block and the third convolution layer, and the high-resolution image is finally output from the third convolution layer; the output of the first convolution layer is split into two paths, one entering the first residual block and the other entering the first residual layer.
Further, each module of the generator G is:
First convolution layer: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 9×9, padding is set to 4, and a PReLU activation function is adopted; this layer converts the 3-channel color input image into 64 feature channels;
First residual block: the number of input channels is 64;
Second residual block: the number of input channels is 64;
Second convolution layer: the number of input channels and the number of output channels are 64, the convolution kernel size is 3 multiplied by 3, the padding is set to be 1, and PReLU activation functions are adopted;
first BN layer: normalizing the convolved data, wherein the feature quantity is set to be 64;
first residual layer: performing pixel level addition with an input channel number of 64;
First upsampling layer: the number of input channels is 64;
Second upsampling layer: the number of input channels is 64;
third convolution layer: the number of input channels is 64, the number of output channels is 3, the convolution kernel size is 9×9, padding is set to 4, and PReLU activation functions are adopted; this layer converts the feature map into 3 channels to obtain the final high resolution image.
Further, the first residual block and the second residual block comprise a first convolution layer, a first BN layer, a second convolution layer, a second BN layer and a first residual layer in sequence, wherein,
Convolution layer one: the convolution kernel size is 3×3, padding is set to 1, the number of input and output channels is channels, and a PReLU activation function is adopted;
BN layer one: normalizing the convolved data, wherein the characteristic quantity is channels;
Convolution layer two: the convolution kernel size is 3 multiplied by 3, padding is set to be 1, the number of input and output channels is channels, and PReLU activation functions are adopted;
BN layer two: normalizing the convolved data, wherein the characteristic quantity is channels;
residual layer one: residual learning is achieved by adding the input to the convolved and batch normalized output.
Further, the first upsampling layer and the second upsampling layer each comprise a convolutional layer a and Pixel Shuffler layers, wherein,
Convolution layer a: the convolution kernel size is 3×3, padding is set to 1, the number of input channels is in_channels, the number of output channels is in_channels × up_scale², and a PReLU activation function is adopted; this expands the input feature map to in_channels × up_scale² channels and prepares the data for the subsequent Pixel Shuffler layer;
Pixel Shuffler layer: performs the up-sampling operation, rearranging the pixels in the feature map to increase resolution.
Further, the discriminator D in step S3 sequentially comprises a convolution layer I, a convolution layer II, a BN layer I, a convolution layer III, a BN layer II and a fully connected layer, wherein,
Convolution layer I: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 3×3, padding is set to 1, and a Leaky ReLU activation function with slope 0.2 is adopted;
Convolution layer II: the number of input channels is 64, the number of output channels is 64, the convolution kernel size is 3×3, stride is set to 2, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer I: the feature quantity is set to 64;
Convolution layer III: the number of input channels is 64, the number of output channels is 128, the convolution kernel size is 3×3, stride is set to 1, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer II: the feature quantity is set to 128;
Fully connected layer: implemented with Conv2d; the number of input channels is 512, the number of output channels is 1024, the convolution kernel is 1×1, a sigmoid activation function is adopted, and the judgment result is output.
Further, in step S5, for each training period the learning rate of the optimizers is set first, and one round of alternating training of the generator and the discriminator is then performed; the specific content of one round of training is as follows:
① Training the discriminator: the real high-resolution image HR_i is first fed into the discriminator D to obtain D_real_loss; the real low-resolution image LR_i is then fed into the generator to obtain a generated image, which is fed into the discriminator to obtain D_fake_loss; finally D_train_loss is obtained as:
D_train_loss = D_real_loss + D_fake_loss;
② Training the generator: the real low-resolution image is fed into the generator, and the image error image_loss is obtained as the mean square error between the generator output and the real high-resolution image; the generated image is then fed into the discriminator and adversarial_loss is computed, i.e. a loss based on the probability with which the discriminator judges the generated image to be a real image; finally the generated image and the real image are fed into the VGG19 model and the mean square error of their features is computed to obtain perception_loss. The final G_train_loss of the generator is:
G_train_loss = image_loss + 10⁻³ × adversarial_loss + 2×10⁻⁶ × perception_loss;
where image_loss and perception_loss reflect, respectively, the pixel-level and texture differences between the reconstructed image and the real high-resolution image, and adversarial_loss reflects the difference between the distribution of reconstructed images and that of real high-resolution images over the whole dataset.
Drawings
Fig. 1 is a flow chart of a visual sorting apparatus;
FIG. 2 is an AOI device workflow diagram;
FIG. 3 is a schematic diagram of the model structure of generator G;
FIG. 4 is a schematic diagram of a residual block structure;
FIG. 5 is a schematic diagram of an upsampling block structure;
FIG. 6 is a schematic diagram of a structure of a discriminator D;
FIG. 7 is a schematic diagram of the operation of a super-resolution model for sorting ultra-small elements;
FIG. 8 is a schematic diagram of the operation of a super-resolution model for defect feature enhancement.
Detailed Description
Example 1
In this example, a high-quality high-resolution image can be reconstructed without increasing hardware cost, improving the positioning accuracy of the vision module and raising the sorting success rate of the ultra-small-element sorting equipment; moreover, simply placing the model between the camera module and the visual positioning module improves the positioning accuracy of the visual detection module.
In this embodiment, the invention is a method for realizing super resolution in an ultra-small-element sorting apparatus, comprising the following steps:
Step S1: a sample dataset S is created. Low-resolution images and corresponding high-resolution images of the ultra-small elements in the device working area are collected in pairs, s= { (LRi, HRi) }, where LRi and HRi represent the low-resolution image and corresponding high-resolution image of the ith ultra-small element in the dataset S, respectively.
Step S2: a generator model G is constructed. The generator G is composed of a plurality of convolution modules, a nonlinear module and an upsampling module, and is responsible for reconstructing an input low resolution image into a corresponding high resolution image.
Step S3: a discriminator model D is constructed. The discriminator D is composed of a plurality of convolution modules, nonlinear modules and downsampling modules, and is responsible for discriminating during training whether an input image is a generator-reconstructed image or a real high-resolution image.
Step S4: a VGG model is created. The VGG model extracts image features for calculation of model training loss.
Step S5: the models are trained. The optimizers of the generator and the discriminator are set to Adam and SGD respectively, the learning-rate schedule is set to cosine annealing, and the generator G and the discriminator D are trained alternately.
Step S6: the trained model is deployed between the camera module and the vision positioning module for accurate sorting of the ultra-small elements.
In step S2, the specific implementation of the generator G is shown in Fig. 3, where the dashed arrows point to the residual layers of the respective modules. The specific definition of each module is as follows; a code sketch of the complete generator, covering the residual blocks and upsampling blocks described below, follows the upsampling-block description.
First convolution layer: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 9×9, padding is set to 4, and a PReLU activation function is adopted; this layer converts the 3-channel color input image into 64 feature channels;
First residual block: the number of input channels is 64;
Second residual block: the number of input channels is 64;
Second convolution layer: the number of input channels and the number of output channels are 64, the convolution kernel size is 3 multiplied by 3, the padding is set to be 1, and PReLU activation functions are adopted;
first BN layer: normalizing the convolved data, wherein the feature quantity is set to be 64;
first residual layer: performing pixel level addition with an input channel number of 64;
First upsampling layer: the number of input channels is 64;
Second upsampling layer: the number of input channels is 64;
third convolution layer: the number of input channels is 64, the number of output channels is 3, the convolution kernel size is 9×9, padding is set to 4, and PReLU activation functions are adopted; this layer converts the feature map into 3 channels to obtain the final high resolution image.
The structure of the residual block is shown in fig. 4, and the specific implementation of each layer is as follows:
Convolution layer one: the convolution kernel size is 3×3, padding is set to 1, the number of input and output channels is channels, and a PReLU activation function is adopted;
BN layer one: normalizing the convolved data, wherein the characteristic quantity is channels;
Convolution layer two: the convolution kernel size is 3 multiplied by 3, padding is set to be 1, the number of input and output channels is channels, and PReLU activation functions are adopted;
BN layer two: normalizing the convolved data, wherein the characteristic quantity is channels;
residual layer one: residual learning is achieved by adding the input to the convolved and batch normalized output.
The specific structure of the up-sampling block is shown in fig. 5, and the specific implementation of each layer is as follows:
Convolution layer a: the convolution kernel size is 3×3, padding is set to 1, the number of input channels is in_channels, the number of output channels is in_channels × up_scale², and a PReLU activation function is adopted; this expands the input feature map to in_channels × up_scale² channels and prepares the data for the subsequent Pixel Shuffler layer;
Pixel Shuffler layer: performs the up-sampling operation, rearranging the pixels in the feature map to increase resolution.
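Putting the module lists above together, a minimal PyTorch sketch of generator G is given below. It is a sketch only: the ×2 factor per upsampling block (and therefore ×4 overall magnification) is an assumption, since the patent does not state the magnification explicitly.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual block: conv - BN - PReLU - conv - BN plus an identity shortcut."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.PReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)                    # residual layer one: pixel-level addition

class UpsampleBlock(nn.Module):
    """Conv expands channels by up_scale**2, PixelShuffle rearranges them into spatial resolution."""
    def __init__(self, in_channels=64, up_scale=2):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels * up_scale ** 2, 3, padding=1)
        self.shuffle = nn.PixelShuffle(up_scale)
        self.act = nn.PReLU()

    def forward(self, x):
        return self.act(self.shuffle(self.conv(x)))

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, 64, 9, padding=4), nn.PReLU())   # first convolution layer
        self.res_blocks = nn.Sequential(ResidualBlock(64), ResidualBlock(64))   # first and second residual blocks
        self.mid = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.PReLU(),   # second convolution layer
                                 nn.BatchNorm2d(64))                             # first BN layer
        self.upsample = nn.Sequential(UpsampleBlock(64, 2), UpsampleBlock(64, 2))
        self.tail = nn.Sequential(nn.Conv2d(64, 3, 9, padding=4), nn.PReLU())   # third convolution layer

    def forward(self, lr):
        feat = self.head(lr)                       # output of the first convolution layer splits into two paths
        out = self.mid(self.res_blocks(feat))
        out = out + feat                           # first residual layer: add the head output back
        return self.tail(self.upsample(out))       # two upsampling blocks, then final 3-channel output
```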
In step S3, the specific implementation of the discriminator D is shown in Fig. 6; each module is defined as follows, and a code sketch follows the list:
Convolution layer I: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 3×3, padding is set to 1, and a Leaky ReLU activation function with slope 0.2 is adopted;
Convolution layer II: the number of input channels is 64, the number of output channels is 64, the convolution kernel size is 3×3, stride is set to 2, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer I: the feature quantity is set to 64;
Convolution layer III: the number of input channels is 64, the number of output channels is 128, the convolution kernel size is 3×3, stride is set to 1, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer II: the feature quantity is set to 128;
Fully connected layer: implemented with Conv2d; the number of input channels is 512, the number of output channels is 1024, the convolution kernel is 1×1, a sigmoid activation function is adopted, and the judgment result is output.
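A minimal PyTorch sketch of discriminator D following the enumerated layers is given below. The patent's final 1×1 convolution expects 512 input channels and produces 1024, which implies intermediate stages not enumerated here; the sketch therefore connects the listed layers directly and reduces the output to a single probability, which is an assumption rather than the exact head described above.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Sketch of discriminator D built from the enumerated layers only."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.LeakyReLU(0.2),               # convolution layer I
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),    # convolution layer II
            nn.BatchNorm2d(64),                                              # BN layer I
            nn.Conv2d(64, 128, 3, stride=1, padding=1), nn.LeakyReLU(0.2),   # convolution layer III
            nn.BatchNorm2d(128),                                             # BN layer II
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),      # assumed pooling so the head is input-size independent
            nn.Conv2d(128, 1, 1),         # 1x1 convolution standing in for the 512 -> 1024 head
            nn.Sigmoid(),                 # probability that the input is a real HR image
        )

    def forward(self, x):
        return self.head(self.features(x)).flatten(1)
```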
In this embodiment, in step S5, for each training period the learning rate of the optimizers is set first, and one round of alternating training of the generator and the discriminator is then performed; the specific content of one round of training is as follows (a code sketch follows the description):
① Training the discriminator: the real high-resolution image HR_i is first fed into the discriminator D to obtain D_real_loss; the real low-resolution image LR_i is then fed into the generator to obtain a generated image, which is fed into the discriminator to obtain D_fake_loss; finally D_train_loss is obtained as:
D_train_loss = D_real_loss + D_fake_loss;
② Training the generator: the real low-resolution image is fed into the generator, and the image error image_loss is obtained as the mean square error between the generator output and the real high-resolution image; the generated image is then fed into the discriminator and adversarial_loss is computed, i.e. a loss based on the probability with which the discriminator judges the generated image to be a real image; finally the generated image and the real image are fed into the VGG19 model and the mean square error of their features is computed to obtain perception_loss. The final G_train_loss of the generator is:
G_train_loss = image_loss + 10⁻³ × adversarial_loss + 2×10⁻⁶ × perception_loss;
where image_loss and perception_loss reflect, respectively, the pixel-level and texture differences between the reconstructed image and the real high-resolution image, and adversarial_loss reflects the difference between the distribution of reconstructed images and that of real high-resolution images over the whole dataset.
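A sketch of one alternating training iteration with the losses above is given below. The learning rates, the use of binary cross-entropy for the real/fake and adversarial terms, and the scheduler horizon are assumptions beyond what the text specifies; Generator, Discriminator and VGGFeatureExtractor refer to the sketches earlier in this description.

```python
import torch
import torch.nn.functional as F

netG, netD, vgg = Generator(), Discriminator(), VGGFeatureExtractor()       # classes sketched above

optim_G = torch.optim.Adam(netG.parameters(), lr=1e-4)                      # illustrative learning rates
optim_D = torch.optim.SGD(netD.parameters(), lr=1e-4, momentum=0.9)
sched_G = torch.optim.lr_scheduler.CosineAnnealingLR(optim_G, T_max=200)    # cosine-annealed learning rate,
sched_D = torch.optim.lr_scheduler.CosineAnnealingLR(optim_D, T_max=200)    # stepped once per training period

def train_step(lr_img, hr_img):
    # 1) Train the discriminator: real HR images should score 1, generated images 0.
    sr_img = netG(lr_img).detach()
    d_real, d_fake = netD(hr_img), netD(sr_img)
    d_real_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    d_fake_loss = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    d_train_loss = d_real_loss + d_fake_loss
    optim_D.zero_grad(); d_train_loss.backward(); optim_D.step()

    # 2) Train the generator with the weighted image, adversarial and perceptual losses.
    sr_img = netG(lr_img)
    pred = netD(sr_img)
    image_loss = F.mse_loss(sr_img, hr_img)                                  # pixel difference
    adversarial_loss = F.binary_cross_entropy(pred, torch.ones_like(pred))   # push D to judge SR as real
    perception_loss = F.mse_loss(vgg(sr_img), vgg(hr_img))                   # texture difference in VGG space
    g_train_loss = image_loss + 1e-3 * adversarial_loss + 2e-6 * perception_loss
    optim_G.zero_grad(); g_train_loss.backward(); optim_G.step()
    return d_train_loss.item(), g_train_loss.item()
```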
In this embodiment, the invention realizes plug-and-play super resolution for precise sorting of ultra-small elements; the schematic diagram is shown in Fig. 7. The super-resolution model trains the generator G and the discriminator D alternately on the dataset S. After the model converges, the generator G is deployed downstream of the camera module of the ultra-small-element sorting equipment, and the low-resolution image output by the camera module is reconstructed into a high-resolution image, improving the precision of the visual positioning module. The visual positioning module transmits the element positions identified on the high-resolution image to the sorting device controller to complete sorting of the ultra-small elements.
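A sketch of the plug-and-play deployment of the trained generator between the camera module and the vision positioning module is given below; the checkpoint name, the enhance_frame helper and the camera/vision interfaces are illustrative assumptions.

```python
import torch
from PIL import Image
from torchvision import transforms

netG = Generator()                                                       # generator class from the sketch above
netG.load_state_dict(torch.load("generator.pth", map_location="cpu"))   # assumed checkpoint file name
netG.eval()

to_tensor, to_image = transforms.ToTensor(), transforms.ToPILImage()

def enhance_frame(lr_frame: Image.Image) -> Image.Image:
    """Reconstruct an HR image from a camera frame before it reaches the vision module."""
    with torch.no_grad():
        sr = netG(to_tensor(lr_frame).unsqueeze(0)).clamp(0, 1)
    return to_image(sr.squeeze(0))

# lr_frame = camera.grab()                          # low-resolution capture (hypothetical camera API)
# vision_module.locate(enhance_frame(lr_frame))     # positioning runs on the reconstructed HR image
```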
Example 2
This example differs from Example 1 in that feature enhancement is performed for small-size defects; the schematic diagram is shown in Fig. 8. The super-resolution model trains the generator G and the discriminator D alternately on the dataset S. After the model converges, the generator G is deployed downstream of the camera module of the AOI equipment, and the low-resolution image output by the camera module is reconstructed into a high-resolution image, improving the precision of visual defect detection. The visual detection module identifies defect features on the high-resolution image and outputs the result of visual defect detection.
The method comprehensively considers the similarity between the reconstructed high-resolution image and the current image as well as its similarity to the dataset images. The resulting model can reconstruct high-resolution images in the ultra-small-element sorting scenario, or reconstruct high-resolution defect images in the AOI scenario, thereby realizing defect feature enhancement. By reconstructing the collected low-resolution image into a high-resolution image, the invention obtains high-quality high-resolution images without increasing hardware cost, improving the positioning accuracy of the vision module and raising either the sorting success rate of the ultra-small-element sorting equipment or the success rate of detecting small-size defects on the AOI equipment. In addition, the invention is plug-and-play: simply placing the model between the camera module and the visual positioning module improves the positioning accuracy of the visual detection module or its AOI detection precision.
While embodiments of the invention have been described above, they are not to be construed as limiting the invention; modifications of the embodiments and combinations with other aspects will be apparent to those skilled in the art from this description.
Claims (8)
1. A method for implementing super resolution for use in an automation device, characterized by: the method comprises the following steps:
Step S1: creating a sample dataset S, collecting low-resolution images and corresponding high-resolution images of the device workspace in pairs, S = {(LR_i, HR_i)}, where LR_i and HR_i represent the low-resolution image and the corresponding high-resolution image of the i-th ultra-small element in the dataset S, respectively;
Step S2: constructing a generator model, wherein a generator G consists of a plurality of convolution modules, a nonlinear module and an up-sampling module and is responsible for reconstructing an input low-resolution image into a corresponding high-resolution image;
Step S3: constructing a discriminator model, wherein a discriminator D consists of a plurality of convolution modules, a nonlinear module and a downsampling module and is responsible for discriminating whether an input image is a reconstructed image or a real high-resolution image in the training process;
step S4: creating a VGG model, wherein the VGG model extracts image characteristics and is used for calculating model training loss;
Step S5: training a model, namely setting optimizers of a generator G and a discriminator D as Adam and SGD respectively, setting a learning rate scheduling scheme as cosine simulated annealing, and alternately training the generator G and the discriminator D;
step S6: the trained model is deployed between a camera module and a visual positioning module for precise sorting of ultra-small elements or for visual detection of small-size defects.
2. A method for realizing super resolution for use in an automation device according to claim 1, wherein: the apparatus in step S1 is an ultra-small component sorting apparatus, or an AOI optical inspection apparatus for small-size defect inspection.
3. A method for realizing super resolution for use in an automation device according to claim 1, wherein: the generator G constructed in step S2 comprises a first convolution layer, a first residual block, a second residual block, a second convolution layer, a first BN layer, a first residual layer, a first upsampling block, a second upsampling block and a third convolution layer; after the low-resolution image is input to the first convolution layer, it passes sequentially through the first residual block, the second residual block, the second convolution layer, the first BN layer, the first residual layer, the first upsampling block, the second upsampling block and the third convolution layer, and the high-resolution image is finally output from the third convolution layer; the output of the first convolution layer is split into two paths, one entering the first residual block and the other entering the first residual layer.
4. A method for realizing super resolution for use in an automation device according to claim 3, wherein: the generator G comprises the following modules:
First convolution layer: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 9×9, padding is set to 4, and a PReLU activation function is adopted; this layer converts the 3-channel color input image into 64 feature channels;
First residual block: the number of input channels is 64;
Second residual block: the number of input channels is 64;
Second convolution layer: the number of input channels and the number of output channels are 64, the convolution kernel size is 3 multiplied by 3, the padding is set to be 1, and PReLU activation functions are adopted;
first BN layer: normalizing the convolved data, wherein the feature quantity is set to be 64;
first residual layer: performing pixel level addition with an input channel number of 64;
First upsampling layer: the number of input channels is 64;
Second upsampling layer: the number of input channels is 64;
third convolution layer: the number of input channels is 64, the number of output channels is 3, the convolution kernel size is 9×9, padding is set to 4, and PReLU activation functions are adopted; this layer converts the feature map into 3 channels to obtain the final high resolution image.
5. A method for realizing super resolution for use in an automation device according to claim 3 or 4, wherein: the first residual block and the second residual block sequentially comprise a first convolution layer, a first BN layer, a second convolution layer, a second BN layer and a first residual layer, wherein,
Convolution layer one: the convolution kernel size is 3×3, padding is set to 1, the number of input and output channels is channels, and a PReLU activation function is adopted;
BN layer one: normalizing the convolved data, wherein the characteristic quantity is channels;
Convolution layer two: the convolution kernel size is 3 multiplied by 3, padding is set to be 1, the number of input and output channels is channels, and PReLU activation functions are adopted;
BN layer two: normalizing the convolved data, wherein the characteristic quantity is channels;
residual layer one: residual learning is achieved by adding the input to the convolved and batch normalized output.
6. A method for realizing super resolution for use in an automation device according to claim 3 or 4, wherein: the first upsampling layer and the second upsampling layer each comprise a convolutional layer a and Pixel Shuffler layer, wherein,
Convolution layer a: the convolution kernel size is 3×3, padding is set to 1, the number of input channels is in_channels, the number of output channels is in_channels × up_scale², and a PReLU activation function is adopted; this expands the input feature map to in_channels × up_scale² channels and prepares the data for the subsequent Pixel Shuffler layer;
Pixel Shuffler layer: performs the up-sampling operation, rearranging the pixels in the feature map to increase resolution.
7. A method for realizing super resolution for use in an automation device according to claim 1, wherein: the discriminator D in step S3 sequentially comprises a convolution layer I, a convolution layer II, a BN layer I, a convolution layer III, a BN layer II and a fully connected layer, wherein,
Convolution layer I: the number of input channels is 3, the number of output channels is 64, the convolution kernel size is 3×3, padding is set to 1, and a Leaky ReLU activation function with slope 0.2 is adopted;
Convolution layer II: the number of input channels is 64, the number of output channels is 64, the convolution kernel size is 3×3, stride is set to 2, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer I: the feature quantity is set to 64;
Convolution layer III: the number of input channels is 64, the number of output channels is 128, the convolution kernel size is 3×3, stride is set to 1, and padding is set to 1; a Leaky ReLU activation function with slope 0.2 is used;
BN layer II: the feature quantity is set to 128;
Fully connected layer: implemented with Conv2d; the number of input channels is 512, the number of output channels is 1024, the convolution kernel is 1×1, a sigmoid activation function is adopted, and the judgment result is output.
8. A method for realizing super resolution for use in an automation device according to claim 1, wherein:
in step S5, for each training period the learning rate of the optimizers is set first, and one round of alternating training of the generator and the discriminator is then performed; the specific content of one round of training is as follows:
① Training the discriminator: the real high-resolution image HR_i is first fed into the discriminator D to obtain D_real_loss; the real low-resolution image LR_i is then fed into the generator to obtain a generated image, which is fed into the discriminator to obtain D_fake_loss; finally D_train_loss is obtained as:
D_train_loss = D_real_loss + D_fake_loss;
② Training the generator: the real low-resolution image is fed into the generator, and the image error image_loss is obtained as the mean square error between the generator output and the real high-resolution image; the generated image is then fed into the discriminator and adversarial_loss is computed, i.e. a loss based on the probability with which the discriminator judges the generated image to be a real image; finally the generated image and the real image are fed into the VGG19 model and the mean square error of their features is computed to obtain perception_loss; the final G_train_loss of the generator is:
G_train_loss = image_loss + 10⁻³ × adversarial_loss + 2×10⁻⁶ × perception_loss;
where image_loss and perception_loss reflect, respectively, the pixel-level and texture differences between the reconstructed image and the real high-resolution image, and adversarial_loss reflects the difference between the distribution of reconstructed images and that of real high-resolution images over the whole dataset.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410733970.6A | 2024-06-07 | 2024-06-07 | Method for realizing super resolution applied to automation equipment |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN118333859A | 2024-07-12 |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |