WO2022242029A1 - Generation method, system and apparatus capable of visual resolution enhancement, and storage medium - Google Patents

Generation method, system and apparatus capable of visual resolution enhancement, and storage medium Download PDF

Info

Publication number
WO2022242029A1
Authority
WO
WIPO (PCT)
Prior art keywords
resolution
image
single image
samples
scale
Prior art date
Application number
PCT/CN2021/126019
Other languages
French (fr)
Chinese (zh)
Inventor
金龙存
卢盛林
Original Assignee
广东奥普特科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东奥普特科技股份有限公司 filed Critical 广东奥普特科技股份有限公司
Publication of WO2022242029A1 publication Critical patent/WO2022242029A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/047 - Probabilistic or stochastic networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Definitions

  • the invention belongs to the technical field of image processing, and in particular relates to a generation method, system, apparatus and storage medium for visual resolution enhancement.
  • Image super-resolution aims to restore low-resolution images so that they contain more detailed information and become clearer.
  • This technology has important practical significance. For example, in the field of security monitoring, surveillance video acquisition equipment captures video frames that lack effective information due to cost constraints, while security monitoring relies heavily on high-resolution images with clear information. Image super-resolution technology can enrich the details of video frames, and this supplementary information can provide effective evidence for fighting crime.
  • Using image super-resolution as a pre-processing technology can also effectively improve the accuracy of tasks such as target detection, face recognition and abnormality warning in the security field.
  • Interpolation-based image super-resolution was the first class of algorithms applied in the super-resolution field. These algorithms rely on a fixed polynomial calculation mode, in which the pixel value at the interpolation position is computed from existing pixel values; examples include bilinear interpolation, bicubic interpolation and Lanczos resampling.
  • Reconstruction-based methods use strict prior knowledge as a constraint to find a suitable reconstruction function in the constrained space, thereby reconstructing high-resolution images with detailed information. These algorithms usually suffer from over-smoothed results and cannot recover the texture details of the image well.
  • the convolutional neural network uses external data sets to learn the mapping model between low-resolution images and high-resolution images, and uses the learned mapping model to reconstruct high-resolution images from low-resolution images.
  • when the input low-resolution image lacks effective information, it is difficult for the neural network to learn the mapping relationship comprehensively.
  • using such an incompletely learned mapping model leads to serious blurring of the reconstructed image, making it difficult to recover the content information in the image.
  • the present invention provides a more accurate and efficient method, system, device and storage medium for visual resolution enhancement, which can be combined with image description information to build a deeper network, so that a single image with higher definition can be acquired a priori based on specific image description information.
  • the present invention adopts following technical scheme:
  • a method for generating visual resolution enhancement including:
  • the training method of the single image super-resolution model includes:
  • training samples include high-resolution single image samples, low-resolution single image samples and corresponding image description information samples;
  • a single image super-resolution model is established based on the preset loss function and high-resolution single image samples.
  • the collecting training samples, the training samples include high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples, including:
  • the built-in bicubic downsampling function of Matlab is used to degrade the high-resolution single image sample into a low-resolution single image sample at the given scaling factor; optionally, functions of other software, or other functions, can also be used to degrade the high-resolution single image sample into a low-resolution single image sample at the given scaling factor;
  • the corresponding image description information sample is obtained. For image data sets of other targets, English sentence information describing at least one of the color, body features, motion posture and environmental performance of the target in the image data set can likewise be used to obtain the corresponding image description information sample.
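The degradation step above can be sketched in Python as follows. This is a minimal stand-in that assumes a 4x scaling factor and uses simple block averaging; the patent itself uses Matlab's true bicubic `imresize` kernel, which weights a 4x4 pixel neighbourhood instead.

```python
import numpy as np

def degrade(hr: np.ndarray, scale: int = 4) -> np.ndarray:
    """Degrade an HR image of shape (H, W, C) to LR by block averaging.

    Stand-in for Matlab's bicubic downsampling; a true bicubic kernel
    would interpolate with negative-lobe weights rather than averaging.
    """
    h, w, c = hr.shape
    h, w = h - h % scale, w - w % scale          # crop to a multiple of scale
    blocks = hr[:h, :w].reshape(h // scale, scale, w // scale, scale, c)
    return blocks.mean(axis=(1, 3))

hr = np.random.rand(128, 128, 3)                 # toy HR sample
lr = degrade(hr)                                 # 4x smaller LR sample
```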
  • the establishment of a single image super-resolution model based on a preset loss function and a high-resolution single image sample according to the collected training samples includes:
  • the low-resolution single image is extracted with shallow features, and the input low-resolution single image is converted from RGB color space to feature space;
  • based on the preset loss function, the reconstructed high-resolution single image and the backed-up high-resolution single image sample, together with positive samples combined with matching description information and negative samples combined with mismatched description information, are converged through back-propagation to establish the single image super-resolution model.
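As a rough illustration of such a loss, the Python sketch below combines an L1 reconstruction term with a hinge-style positive/negative matching term. The margin, the 0.1 weighting and the scalar matching scores are illustrative assumptions, not the patent's actual loss formulation.

```python
import numpy as np

def l1_loss(pred, target):
    # Pixel-wise reconstruction term between SR output and HR sample.
    return np.abs(pred - target).mean()

def matching_loss(score_pos, score_neg, margin=1.0):
    # Encourage an image paired with its matching description (positive
    # sample) to score higher than the same image paired with a
    # mismatched description (negative sample).
    return max(0.0, margin - score_pos + score_neg)

# Toy values standing in for network outputs.
sr, hr = np.random.rand(8, 8), np.random.rand(8, 8)
total = l1_loss(sr, hr) + 0.1 * matching_loss(0.9, 0.2)
```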
  • the encoding processing of the image description information by using an adaptive adjustment block to obtain description variables with the same dimensions as image features includes:
  • the adaptive adjustment block consists of two branches: one branch consists of a fully connected layer and outputs a description encoding vector, and the other branch consists of a fully connected layer and a sigmoid activation function and outputs a weight vector;
  • the vectors output by the two branches are multiplied element-wise at corresponding positions and transformed into description variables with the same dimensions as the image features; the description information is adjusted through the weight vector, and the description encoding features are adaptively scaled, so that redundant information in the image description is eliminated and information effective for image reconstruction is obtained.
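The two-branch gating described above can be sketched as follows. The 256-dimensional description embedding and 64-dimensional output are assumed sizes (64 matches the 64-channel image features mentioned later in the embodiment); the weight matrices stand in for the two fully connected layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_adjust(desc, W_enc, W_gate):
    enc = desc @ W_enc                 # branch 1: fully connected layer -> encoding vector
    gate = sigmoid(desc @ W_gate)      # branch 2: fully connected + sigmoid -> weight vector
    return enc * gate                  # element-wise product -> description variable

desc = rng.standard_normal(256)                  # sentence embedding (assumed size)
W_enc = rng.standard_normal((256, 64)) * 0.01
W_gate = rng.standard_normal((256, 64)) * 0.01
var = adaptive_adjust(desc, W_enc, W_gate)       # same dims as image features
```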
  • the multi-scale sub-network is used to perform deep feature extraction on shallow features, including:
  • the shallow features are down-sampled into small-scale feature maps through bilinear interpolation, and their scale is reduced to half of the original;
  • the outputs of the different sub-networks in the previous stage are scaled up through nearest neighbor interpolation and then fused into the input of the large-scale sub-network; each sub-network is composed of a certain number of attention residual densely connected blocks connected in series at each stage, and the number of attention residual densely connected blocks used by the sub-networks of different scales from top to bottom is 5, 7 and 3 respectively;
  • the adaptive fusion module based on the channel attention mechanism is used to fuse the information of different frequencies extracted by sub-networks at different scales.
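The scale changes in the steps above can be sketched as follows: shallow features are halved by a bilinear-style downsample for the first sub-network, and sub-network outputs are enlarged by nearest-neighbour interpolation before fusion at the next stage. The 2x2 mean used here is the aligned special case of bilinear 1/2 downsampling; the real network of course interleaves convolutions between these steps.

```python
import numpy as np

def down_half(f):
    """Downsample (H, W, C) features to half scale via 2x2 averaging."""
    h, w, c = f.shape
    return f[: h // 2 * 2, : w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def up_nearest(f, k=2):
    """Nearest-neighbour upsampling by factor k along both spatial axes."""
    return f.repeat(k, axis=0).repeat(k, axis=1)

shallow = np.random.rand(30, 30, 64)
small = down_half(shallow)             # input to the first-stage sub-network
large_in = up_nearest(small)           # scaled up and fused at the next stage
```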
  • the use of the upsampling module to scale up the deep features includes:
  • the feature scale is enlarged using the nearest neighbor interpolation algorithm.
  • the attention residual densely connected block is composed of three spatial attention residual densely connected units and a local skip connection that connects the input of the attention residual densely connected block with the output of the last spatial attention residual densely connected unit.
  • the spatial attention residual densely connected unit includes a densely connected group of five convolutional layers and a spatial attention convolution group, and the input of the spatial attention residual densely connected unit is connected to the output of the spatial attention convolution group by a skip connection.
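A minimal sketch of such a unit is given below. To stay short, 1x1 convolutions (per-pixel matmuls) stand in for the real 3x3 convolutions, and the spatial attention map is simply a sigmoid over the channel mean; both are illustrative assumptions about the convolution group, not the patent's exact layers.

```python
import numpy as np

def conv1x1(x, W):
    # 1x1 convolution with ReLU-style activation (stand-in for 3x3 conv).
    return np.maximum(x @ W, 0.0)

def dense_group(x, weights):
    # Five densely connected conv layers: each layer sees the channel-wise
    # concatenation of the input and all previous layer outputs.
    feats = [x]
    for W in weights:
        feats.append(conv1x1(np.concatenate(feats, axis=-1), W))
    return feats[-1]

rng = np.random.default_rng(1)
c = 8
ws = [rng.standard_normal((c * (i + 1), c)) * 0.01 for i in range(5)]
x = rng.standard_normal((4, 4, c))
out = dense_group(x, ws)
attn = 1.0 / (1.0 + np.exp(-out.mean(-1, keepdims=True)))  # spatial attention map (sketch)
y = x + out * attn                     # skip connection around the whole unit
```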
  • the self-adaptive fusion module based on the channel attention mechanism fuses information of different frequencies extracted by sub-networks at different scales, including:
  • the interpolated feature maps are passed to the global average pooling layer, channel compression convolution layer and channel expansion convolution layer respectively;
  • the obtained three feature maps are weighted and summed to obtain the fused output.
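The fusion steps above can be sketched as follows: each interpolated feature map is global-average-pooled, squeezed and re-expanded along channels, the three resulting vectors are normalised with a softmax across scales per channel, and the maps are summed with those weights. The channel squeeze ratio of 4 and the linear squeeze/expand layers are assumed details.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse(feats, W_sq, W_ex):
    """Channel-attention fusion of three same-size (H, W, C) feature maps."""
    # Global average pooling, then channel squeeze and expand, per scale.
    vecs = [np.maximum(f.mean(axis=(0, 1)) @ W_sq, 0) @ W_ex for f in feats]
    w = softmax(np.stack(vecs), axis=0)   # softmax over the 3 scales, per channel
    return sum(wi * f for wi, f in zip(w, feats))

rng = np.random.default_rng(2)
c, r = 16, 4                              # channels, squeeze ratio (assumed)
feats = [rng.standard_normal((8, 8, c)) for _ in range(3)]
W_sq, W_ex = rng.standard_normal((c, c // r)), rng.standard_normal((c // r, c))
out = fuse(feats, W_sq, W_ex)
```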
  • processing the low-resolution single image and its corresponding image description information through the single-image super-resolution model to output a high-resolution single image includes:
  • a generation system for visual resolution enhancement including:
  • An acquisition module configured to acquire a low-resolution single image to be processed and its corresponding image description information
  • An output module configured to process the low-resolution single image and its corresponding image description information through a single-image super-resolution model, and output a high-resolution single image
  • Training module for training described single image super-resolution model, described training module comprises:
  • the sampling sub-module is used to collect training samples, and the training samples include high-resolution single image samples, low-resolution single image samples and corresponding image description information samples;
  • the model establishment sub-module is used to establish a single image super-resolution model based on a preset loss function and a high-resolution single image sample according to the collected training samples.
  • sampling submodule includes:
  • the first sampling unit is used to use the public large-scale CUB bird image data set to obtain a high-resolution single image sample and back it up;
  • the second sampling unit is used to degrade the high-resolution single image sample into a low-resolution single image sample at the given scaling factor by adopting the built-in bicubic down-sampling function of Matlab;
  • the third sampling unit is used to obtain the corresponding image description information sample by using the English sentence information used to describe the feather color, body characteristics, motion posture and environmental performance of the bird in the image in the large-scale CUB bird image data set .
  • model building submodule includes:
  • an acquisition unit configured to acquire a low-resolution single image and corresponding image description information
  • the extraction unit is used to extract shallow features from the low-resolution single image based on single-layer convolution, and convert the input low-resolution single image from RGB color space to feature space;
  • An encoding processing unit configured to use an adaptive adjustment block to encode the image description information to obtain description variables having the same dimensions as image features
  • the compression unit is used to concatenate the description variables and image features, and use a layer of convolution to perform channel compression on the concatenated features;
  • a deep feature extraction unit for performing deep feature extraction on shallow features using a multi-scale sub-network
  • the upsampling unit is used to scale up the deep features by using the upsampling module;
  • the reconstruction unit is used to reconstruct and output a high-resolution single image of RGB channel by adopting two layers of convolution;
  • the model building unit is used to converge, based on a preset loss function, the reconstructed high-resolution single image and the backed-up high-resolution single image samples, together with positive samples combined with matching description information and negative samples combined with mismatched description information, through back-propagation, so as to establish the single image super-resolution model.
  • the encoding processing unit includes:
  • the encoding subunit consists of a fully connected layer for outputting description encoding vectors;
  • the first weight subunit consists of a fully connected layer and a sigmoid activation function for outputting a weight vector
  • the transformation subunit is used to multiply the value of the corresponding position element of the vector output by the encoding subunit and the weight subunit, and transform it into a description variable having the same dimension as the image feature.
  • the multi-scale sub-network includes:
  • the scaling unit is used to downsample the shallow features into a small-scale feature map through bilinear interpolation, and its scale is reduced to half of the original;
  • the input unit is used to scale up the outputs of different sub-networks in the previous stage through nearest neighbor interpolation and fuse them into the input of the large-scale sub-network; each sub-network is composed of a certain number of attention residual densely connected blocks connected in series at each stage, and the number of attention residual densely connected blocks used by the sub-networks of different scales from top to bottom is 5, 7 and 3 respectively;
  • the fusion unit is used to fuse information of different frequencies extracted by sub-networks at different scales by using an adaptive fusion module based on the channel attention mechanism.
  • the upsampling unit includes:
  • the enlargement subunit is used to enlarge the feature scale using the nearest neighbor interpolation algorithm.
  • the attention residual dense connection block includes:
  • the first component unit is composed of three spatial attention residual densely connected units and a local skip connection connecting the input of the attention residual densely connected block with the output of the last spatial attention residual densely connected unit.
  • the fusion unit includes:
  • the mapping subunit is used to interpolate the small-scale feature map to generate a feature map with the same size as the large-scale feature map;
  • the transfer subunit is used to transfer the interpolated feature map to the global average pooling layer, channel compression convolution layer and channel expansion convolution layer respectively;
  • the second weight subunit is used to concatenate the obtained vectors of the three scales, and use the softmax layer on the same channel for processing to generate a corresponding weight matrix;
  • the multiplication subunit is used to divide the weight matrix into three weight components corresponding to the three sub-networks, and to multiply the feature map after each scale interpolation by its corresponding weight component;
  • the output subunit performs a weighted summation operation on the obtained three feature maps to obtain a fused output.
  • the spatial attention residual dense connection unit includes:
  • the second component unit is composed of a densely connected group of five convolutional layers, a spatial attention convolution group, and a skip connection connecting the input of the spatial attention residual densely connected unit with the output of the spatial attention convolution group.
  • the output module includes:
  • the extraction sub-module is used to input a low-resolution single image into the shallow feature extraction module to obtain shallow image features
  • the output sub-module inputs the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenates the description variable with the image features, feeds them into the subsequent stages of the single image super-resolution model, and outputs a high-resolution single image.
  • a device including:
  • memory for storing at least one program
  • a processor configured to execute the at least one program to implement the method as described above.
  • a storage medium which stores an executable program, and when the executable program is executed by a processor, the above method is implemented.
  • the low-resolution single image to be processed is handled by a single image super-resolution model established using training samples, which include high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples, together with a preset loss function.
  • the resolution processing of each image can thus accurately and efficiently restore a low-resolution single image to a high-resolution single image, and a single image with higher definition can be obtained a priori based on specific image description information.
  • FIG. 1 is a schematic flow chart of steps in a method for generating visual resolution enhancement provided by an embodiment of the present invention
  • FIG. 2 is a structural block diagram of a generation system for visual resolution enhancement provided by an embodiment of the present invention
  • Fig. 3 is a schematic flow chart of a single image super-resolution model in an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a multi-scale sub-network in an embodiment of the present invention.
  • Fig. 5 is a schematic structural diagram of an attention residual densely connected block in an embodiment of the present invention.
  • Fig. 6 is a schematic diagram of the operation details of the adaptive fusion module in the embodiment of the present invention.
  • this embodiment provides a method for generating visual resolution enhancement, including the following steps:
  • the training process of the single image super-resolution model comprises the following steps:
  • training samples include high-resolution single image samples, low-resolution single image samples, and corresponding image description information samples;
  • a single image super-resolution model is established based on a preset loss function and high-resolution single image samples.
  • step S3 includes:
  • through steps S31-S33, high-resolution single image samples, low-resolution single image samples and corresponding image description information samples can be obtained, so as to establish the training samples.
  • step S4 includes:
  • step S43 includes:
  • the adaptive adjustment block is composed of two branches, one of which is composed of a fully connected layer, and outputs a description encoding vector, and the other branch is composed of a fully connected layer and a sigmoid activation function, and outputs a weight vector;
  • step S45 includes:
  • the input of the large-scale sub-network is obtained by fusing the outputs of the different sub-networks in the previous stage after they are scaled up through nearest neighbor interpolation; each sub-network is composed of a certain number of attention residual densely connected blocks connected in series at each stage, and the number of attention residual densely connected blocks used by the sub-networks of different scales from top to bottom is 5, 7 and 3 respectively;
  • step S454 includes:
  • the interpolated feature map is passed to the global average pooling layer, the channel compression convolution layer and the channel expansion convolution layer respectively;
  • this embodiment provides a generation system for visual resolution enhancement, which includes:
  • An acquisition module configured to acquire low-resolution single images and image description information to be processed
  • An output module configured to process the low-resolution single image and the image description information variables through a single image super-resolution model and output a reconstructed high-resolution single image, wherein the single image super-resolution model is established based on high-resolution single image samples and low-resolution single image samples.
  • Training modules include:
  • the sampling sub-module is used to collect training samples, and the training samples include high-resolution single image samples, low-resolution single image samples and corresponding image description information samples;
  • the model establishment sub-module is used to establish a single image super-resolution model based on a preset loss function and a high-resolution single image sample according to the collected training samples.
  • sampling submodule includes:
  • the first sampling unit is used to collect high-resolution single image samples, using the public large-scale CUB bird image data set to obtain high-resolution single image samples and back them up;
  • the second sampling unit is used to degrade the high-resolution single image sample into a low-resolution single image sample at the given scaling factor by adopting the built-in bicubic down-sampling function of Matlab;
  • the third sampling unit is used to obtain the corresponding image description information sample by using the English sentence information describing the feather color, body characteristics, motion posture and environmental performance of the bird in each image of the above data set.
  • the above-mentioned sampling unit can acquire high-resolution single image samples, low-resolution single image samples and corresponding image description information samples, so as to establish training samples.
  • model building submodule includes:
  • An acquisition unit configured to acquire a low-resolution single image sample and corresponding image description information
  • the extraction unit is used to extract shallow features from the low-resolution single image based on single-layer convolution, and converts the input low-resolution single image from RGB color space to feature space;
  • An encoding processing unit configured to encode the description information of the image by using an adaptive adjustment block to obtain a description variable having the same dimension as the image feature;
  • the compression unit is used to concatenate the description variables and image features, and use a layer of convolution to perform channel compression on the concatenated features;
  • a deep feature extraction unit for performing deep feature extraction on shallow features using a multi-scale sub-network
  • the upsampling unit is used to scale up the deep features by using the upsampling module;
  • the reconstruction unit is used to reconstruct and output a high-resolution single image of RGB channel by adopting two layers of convolution;
  • the model building unit is used to converge, based on a preset loss function, the reconstructed high-resolution single image and the backed-up high-resolution single image samples, together with positive samples combined with matching description information and negative samples combined with mismatched description information, through back-propagation, so as to establish the single image super-resolution model.
  • the encoding processing unit includes:
  • the encoding subunit consists of a fully connected layer and outputs a description encoding vector
  • the first weight subunit consists of a fully connected layer and a sigmoid activation function, and outputs a weight vector
  • the transformation subunit multiplies the vectors output by the encoding subunit and the weight subunit element-wise at corresponding positions, and transforms the result into a description variable with the same dimensions as the image features.
  • the multi-scale sub-network includes:
  • the scaling unit is used to downsample the shallow features into a small-scale feature map through bilinear interpolation, and its scale is reduced to half of the original;
  • the input unit is used to scale up the outputs of different sub-networks in the previous stage through nearest neighbor interpolation and fuse them into the input of the large-scale sub-network; that is to say, the input of the large-scale sub-network is obtained by fusing the outputs of the different sub-networks in the previous stage after they are scaled up through nearest neighbor interpolation; each sub-network is composed of a certain number of attention residual densely connected blocks connected in series at each stage, and the number of attention residual densely connected blocks used by the sub-networks of different scales from top to bottom is 5, 7 and 3 respectively;
  • the fusion unit is used to fuse information of different frequencies extracted by sub-networks at different scales by using an adaptive fusion module based on a channel attention mechanism.
  • the upsampling unit includes:
  • the enlargement subunit is used to enlarge the feature scale using the nearest neighbor interpolation algorithm.
  • the attention residual dense connection block includes:
  • the first component unit is composed of three spatial attention residual densely connected units and a local skip connection connecting the input of the attention residual densely connected block with the output of the last spatial attention residual densely connected unit.
  • the fusion unit includes:
  • the mapping subunit is used to interpolate the small-scale feature map to generate a feature map with the same size as the large-scale feature map;
  • the transfer subunit is used to transfer the interpolated feature map to the global average pooling layer, channel compression convolution layer and channel expansion convolution layer respectively;
  • the second weight subunit is used to concatenate the obtained vectors of the three scales, and use the softmax layer on the same channel for processing to generate a corresponding weight matrix;
  • the multiplication subunit is used to divide the weight matrix into three weight components corresponding to the three sub-networks, and to multiply the feature map after each scale interpolation by its corresponding weight component;
  • the output subunit performs a weighted summation operation on the obtained three feature maps to obtain a fused output.
  • the spatial attention residual dense connection unit includes:
  • the second component unit is composed of a densely connected group of five convolutional layers, a spatial attention convolution group, and a skip connection connecting the input of the spatial attention residual densely connected unit with the output of the spatial attention convolution group.
  • the output module includes:
  • the extraction sub-module is used to input a low-resolution single image into the shallow feature extraction module to obtain shallow image features
  • the output sub-module inputs the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenates the description variable with the image features, feeds them into the subsequent stages of the single image super-resolution model, and outputs a high-resolution single image.
  • the present embodiment provides a kind of device, and this device comprises:
  • At least one memory for storing at least one program
  • At least one processor; when the at least one program is executed by the at least one processor, the at least one processor is made to implement the steps of the method for generating visual resolution enhancement as in Embodiment 1 above.
  • This embodiment provides a storage medium which stores a processor-executable program, and the executable program, when executed by the processor, is used to execute the steps of the method for generating visual resolution enhancement as described in Embodiment 1.
  • this embodiment provides a flow chart of a method for generating visual resolution enhancement, which can serve as a specific implementation of Embodiment 1; Embodiment 2 can also implement the method of this embodiment. The method includes the following steps:
  • training samples include high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples;
  • step A is:
  • A1 Obtain the public large-scale CUB bird data set as the training data set.
  • the data set can be divided into 200 categories, with a total of 11,788 images. Each image has ten English sentences to describe the feather color, body characteristics, movement posture and environmental performance of the bird in the image.
  • the training set and test set are divided according to the ratio of 8855:2913. The proportion of training data set and test data set in each category is balanced, and there will be no problem of unbalanced distribution of training set and test set samples.
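The balanced split described above can be sketched as a per-category (stratified) partition; the sample list and category labels below are toy stand-ins for the CUB images.

```python
import random
from collections import defaultdict

def stratified_split(samples, train_frac=8855 / 11788, seed=0):
    """Split per category so train/test proportions stay balanced."""
    rng = random.Random(seed)
    by_cat = defaultdict(list)
    for img, cat in samples:
        by_cat[cat].append(img)
    train, test = [], []
    for cat, imgs in by_cat.items():
        rng.shuffle(imgs)
        k = round(len(imgs) * train_frac)
        train += [(i, cat) for i in imgs[:k]]
        test += [(i, cat) for i in imgs[k:]]
    return train, test

samples = [(f"img_{i}", i % 200) for i in range(11788)]  # 200 toy categories
train, test = stratified_split(samples)
```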
  • the corresponding image description information adopts the CNN-RNN encoding method to encode the description prior information.
  • A2 Use the "imresize" function of MATLAB to perform 4-fold bicubic downsampling on each high-resolution single image to obtain the corresponding low-resolution single image, which constitutes a matched triplet data set {I_HR, I_LR, c}.
  • the negative sample description information used in the positive-negative sample matching loss is obtained by randomly selecting, via random numbers, one description from those of the other images as a mismatched description, yielding a triplet negative sample data set {I_LR, I_HR, neg_c}; horizontal or vertical flipping, 90° rotation, and random cropping of image blocks are used as data enhancement methods.
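The negative-sample selection and the flip/rotation augmentations above can be sketched as follows; the images here are toy lists-of-lists, and the description strings are placeholders.

```python
import random

def pick_mismatch(idx, descriptions, rng):
    """Pick a description from a *different* image to serve as neg_c."""
    j = rng.randrange(len(descriptions) - 1)
    return descriptions[j + 1 if j >= idx else j]   # skip the matching index

def augment(img_rows, rng):
    """Toy augmentations: horizontal/vertical flip and 90-degree rotation."""
    if rng.random() < 0.5:
        img_rows = [row[::-1] for row in img_rows]          # horizontal flip
    if rng.random() < 0.5:
        img_rows = img_rows[::-1]                           # vertical flip
    if rng.random() < 0.5:
        img_rows = [list(r) for r in zip(*img_rows[::-1])]  # rotate 90 degrees
    return img_rows

rng = random.Random(1)
descs = [f"desc_{i}" for i in range(10)]
neg_c = pick_mismatch(3, descs, rng)
img = augment([[1, 2], [3, 4]], rng)
```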
  • The specific implementation scheme of step B is:
  • a low-resolution image block with a size of 30 ⁇ 30 randomly cut from a low-resolution single image is used as an input, which is denoted as I LR .
  • the single-layer convolutional layer converts the input low-resolution image from the RGB color space to the feature space.
  • the obtained feature contains 64 channels, and the size is the same as the input image.
  • this layer consists of a 3 × 3 convolution and an activation function.
  • the adaptive adjustment block encodes the image description to obtain a description variable with the same dimensions as the image features.
  • the adaptive adjustment block is composed of two branches: one branch consists of a fully connected layer and outputs a description encoding vector; the other branch consists of a fully connected layer and a sigmoid activation function and outputs a weight vector. The vectors output by the two branches are multiplied element-wise at corresponding positions to obtain the description variable.
  • the description variable and the image features are concatenated, and channel compression is performed through one 3 × 3 convolutional layer to obtain the shallow feature F_S.
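The adaptive adjustment and shallow-feature steps above can be sketched numerically. This is a simplified NumPy illustration with hypothetical weights and toy dimensions; the final 3×3 convolution is approximated by a per-pixel linear projection, so only the two-branch gating and the concatenate-then-compress flow follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_adjust(desc_code, W_enc, W_gate):
    """Two-branch adaptive adjustment block (illustrative weights).

    Branch 1: fully connected layer -> description encoding vector.
    Branch 2: fully connected layer + sigmoid -> weight vector.
    Output: element-wise product of the two vectors.
    """
    enc = desc_code @ W_enc             # description encoding vector
    gate = sigmoid(desc_code @ W_gate)  # weight vector in (0, 1)
    return enc * gate                   # element-wise gating

# Toy dimensions: 1024-dim description code -> 64-channel description variable,
# broadcast to the 30x30 feature map and concatenated with the image features.
d, C, H, W = 1024, 64, 30, 30
desc_code = rng.standard_normal(d)
W_enc = rng.standard_normal((d, C)) * 0.01
W_gate = rng.standard_normal((d, C)) * 0.01

desc_var = adaptive_adjust(desc_code, W_enc, W_gate)    # (64,)
desc_map = np.broadcast_to(desc_var, (H, W, C))         # (30, 30, 64)
features = rng.standard_normal((H, W, C))               # image features
concat = np.concatenate([features, desc_map], axis=-1)  # (30, 30, 128)
# the 3x3 convolution would compress 128 channels back to 64; stand-in: projection
W_proj = rng.standard_normal((2 * C, C)) * 0.01
shallow = concat @ W_proj                               # shallow feature F_S
```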
  • This process can be expressed as:
  • after obtaining the shallow feature F_S, it is input into the deep feature extraction module composed of multi-scale sub-networks, and an effective deep feature F_d is generated through multiple parallel sub-networks. The final deep feature satisfies F_d ∈ R^(2W×2H×C), showing that the deep feature is at twice the scale of the shallow feature.
  • the shallow features are first downsampled into small-scale feature maps by bilinear interpolation, reducing the scale to half of the original. The deep feature extraction module takes this scale as the input of the first-layer sub-network, and larger-scale sub-networks are added gradually in stages.
  • the input of each larger-scale sub-network is formed by fusing the outputs of the different sub-networks of the previous stage, after scaling them up via nearest-neighbor interpolation.
  • at each stage, each sub-network consists of a number of attention residual densely connected blocks in series; sub-networks at different scales use different numbers of these blocks. From top to bottom, the sub-networks at different scales use 5, 7 and 3 attention residual densely connected blocks, respectively.
  • the subsequent adaptive fusion module based on the channel attention mechanism effectively fuses information of different frequencies extracted by sub-networks at different scales. This module can be expressed as:
  • F_up is the feature after upsampling;
  • Inter(·) denotes the nearest-neighbor interpolation function;
  • s denotes the amplification factor.
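The nearest-neighbor interpolation F_up = Inter(F, s) referred to above can be sketched directly: each pixel is repeated s times along each spatial axis. A minimal NumPy sketch (names illustrative):

```python
import numpy as np

def nn_upsample(feat, s):
    """Nearest-neighbor interpolation Inter(F, s): repeat each pixel s times per axis."""
    return feat.repeat(s, axis=0).repeat(s, axis=1)

f = np.arange(4, dtype=float).reshape(2, 2)
f_up = nn_upsample(f, 2)
# f_up:
# [[0. 0. 1. 1.]
#  [0. 0. 1. 1.]
#  [2. 2. 3. 3.]
#  [2. 2. 3. 3.]]
```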
  • the discriminator uses a VGG-style network composed of strided convolutions.
  • the inputs to the discriminator are a generated image and a real image.
  • the input image features undergo successive dimensional changes, progressively reducing the feature map.
  • the output feature map is concatenated with the image description encoding vector c, and a true/false logical value is obtained through a binary classifier. This process can be expressed as:
  • the loss function of the generator consists of three parts: the reconstruction loss L_rec, the perceptual loss L_VGG and the adversarial loss L_adv:
  • λ_1, λ_2 and λ_3 correspond to the weights of these three losses, respectively.
  • the reconstruction loss uses the L1 loss function:
  • W, H, and C represent the width, height and number of channels of a high-resolution single image, respectively.
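The L1 reconstruction loss, averaged over the width, height and channels defined above, can be sketched as follows (illustrative names; the toy values assume a uniform pixel error of 0.25):

```python
import numpy as np

def l1_reconstruction_loss(sr, hr):
    """L1 loss: (1 / (W*H*C)) * sum |I_SR - I_HR| over the high-resolution image."""
    W, H, C = hr.shape
    return np.abs(sr - hr).sum() / (W * H * C)

hr = np.ones((120, 120, 3))           # ground-truth high-resolution image
sr = np.full((120, 120, 3), 0.75)     # reconstructed image, off by 0.25 everywhere
loss = l1_reconstruction_loss(sr, hr)
```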
  • the features extracted from the reconstructed image by a fixed VGG classification network should be similar to those of the real image, and the perceptual loss is used to enforce this constraint.
  • the perceptual loss is defined as follows:
  • var represents attribute information of a single frame.
  • the goal of the adversarial loss of the discriminator is to distinguish the reconstructed image from the real image as much as possible in the image distribution.
  • this embodiment adds positive and negative sample adversarial loss constraints.
  • the positive sample refers to an image combined with its matched description information c.
  • the negative sample refers to an image combined with the mismatched description information neg_c in the discriminator loss.
  • the adversarial loss for the discriminator is defined as follows:
  • the batch size is set to 16
  • the initial learning rate is set to 10^-4
  • the description code is 1024 hidden variables.
  • a low-resolution image block of size 30 × 30 is randomly cropped from the low-resolution image.
  • it is paired with the corresponding 120 × 120 high-resolution image block.
  • the learning rate is decayed by half.
  • step C is specifically:
  • step D is specifically:
  • the adaptive adjustment block encodes the description corresponding to the image to obtain a description variable with the same dimensions as the image features; the input low-resolution image is converted from the RGB color space to the feature space through a single convolutional layer, and the resulting image features are concatenated with the description variable. Finally, channel compression is performed through one convolutional layer to obtain shallow features, which are then processed by the subsequent network to output a high-resolution single image.
  • by performing resolution processing on the acquired low-resolution single image to be processed with a single-image training model established from a preset loss function and training samples containing high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples, a low-resolution single image can be accurately and efficiently restored to a high-resolution single image, and a single image with higher definition can be obtained based on a specific image description information prior.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

A generation method, system and apparatus capable of visual resolution enhancement, and a storage medium. An acquired low-resolution single image to be processed is subjected to resolution processing by means of a single-image training model established using a preset loss function and training samples that contain a high-resolution single image sample, a low-resolution single image sample and their corresponding image description information samples, such that the low-resolution single image can be accurately and efficiently restored to a high-resolution single image. A single image with higher definition can be acquired on the basis of a specific image description information prior, and the high-frequency information of the single image can be restored, so that the output high-resolution single image contains more texture and structure details, thereby improving the definition of the single image. The present invention can be widely applied in the technical field of image processing.

Description

Generation method, system, device and storage medium for visual resolution enhancement
This application claims priority to the Chinese patent application No. 202110541939.9, filed with the China Patent Office on May 18, 2021 and entitled "Generation method, system, device and storage medium for visual resolution enhancement", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the technical field of image processing, and in particular relates to a generation method, system, device and storage medium for visual resolution enhancement.
Background Art
In recent years, due to limitations on the volume, weight and cost of digital image acquisition equipment, the resolution of collected images is often low, which greatly reduces image clarity. At the same time, the demand for high-definition images keeps increasing, and how to improve image and video quality has become an increasingly important problem. Image super-resolution aims to restore low-resolution images so that they contain more detailed information and become clearer. This technology has important practical significance. For example, in the field of security monitoring, surveillance video acquisition equipment, constrained by cost, captures video frames that lack effective information, while security monitoring relies heavily on high-resolution images with clear information. Image super-resolution technology can add detail to video frames, and this supplementary information can provide effective evidence for fighting crime. At present, using image super-resolution as a preprocessing technique can effectively improve the accuracy of tasks such as target detection, face recognition and abnormality warning in the security field.
Earlier image super-resolution methods were based on interpolation or reconstruction. Interpolation-based image super-resolution was the first type of algorithm applied in the super-resolution field. Such algorithms rely on a fixed polynomial computation pattern and infer the pixel values at interpolation positions from existing pixel values; examples include bilinear interpolation, bicubic interpolation and Lanczos resampling. Reconstruction-based methods use strict prior knowledge as a constraint and search the constrained space for a suitable reconstruction function, thereby reconstructing a high-resolution image with detailed information. These algorithms usually suffer from over-smoothed images and cannot recover texture details well.
In recent years, with the development of deep learning and convolutional neural networks, image super-resolution technology has made great breakthroughs. A convolutional neural network learns a mapping model between low-resolution and high-resolution images from external data sets, and uses the learned mapping model to reconstruct a high-resolution image from a low-resolution image. However, when the input low-resolution image lacks effective information, it is difficult for the neural network to learn this mapping comprehensively. Using such an incompletely learned mapping model leads to severely blurred reconstructed images, from which the content information is hard to obtain.
Summary of the Invention
In order to solve the problem that low-resolution images lack effective information in image super-resolution tasks, the present invention provides a more accurate and efficient generation method, system, device and storage medium for visual resolution enhancement, which can combine image description information to build a deeper network, so that a single image with high definition can be acquired based on a specific image description information prior.
The present invention adopts the following technical solutions:
In a first aspect, a method for generating visual resolution enhancement is provided, including:
obtaining a low-resolution single image to be processed and its corresponding image description information;
processing the low-resolution single image and its corresponding image description information through a single-image super-resolution model, and outputting a high-resolution single image;
the training method of the single-image super-resolution model includes:
collecting training samples, the training samples containing high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples;
establishing, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single image samples.
Optionally, collecting training samples containing high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples includes:
using the publicly available large-scale CUB bird image data set to obtain high-resolution single image samples and backing them up; optionally, image data sets of other targets (for example, non-bird targets) may also be used to obtain high-resolution single image samples and back them up;
using MATLAB's built-in bicubic downsampling function to degrade the high-resolution single image samples into low-resolution single image samples at the given scaling factor; optionally, functions of other software or other functions may also be used to degrade the high-resolution single image samples into low-resolution single image samples at the given scaling factor;
using the English sentence information in the large-scale CUB bird image data set that describes at least one of the feather color, body characteristics, movement posture and environmental performance of the bird in an image, to obtain the corresponding image description information samples. Accordingly, for image data sets of other targets, English sentence information describing at least one of the color, body characteristics, movement posture and environmental performance of the target may likewise be used to obtain the corresponding image description information samples.
Optionally, establishing, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single image samples includes:
obtaining low-resolution single image samples and the corresponding image description information;
extracting shallow features from the low-resolution single image based on a single convolutional layer, converting the input low-resolution single image from the RGB color space to the feature space;
encoding the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features;
concatenating the description variable and the image features, and performing channel compression on the concatenated features with one convolutional layer;
performing deep feature extraction on the shallow features with multi-scale sub-networks;
scaling up the deep features with an upsampling module;
reconstructing and outputting a high-resolution single image in RGB channels with two convolutional layers;
converging, based on the preset loss function, the reconstructed high-resolution single image against the backed-up high-resolution single image samples, together with positive samples combined with matched description information and negative samples combined with mismatched description information, to establish the single-image super-resolution model.
Optionally, encoding the image description information with the adaptive adjustment block to obtain a description variable with the same dimensions as the image features includes:
the adaptive adjustment block consists of two branches, one of which consists of a fully connected layer and outputs a description encoding vector, while the other consists of a fully connected layer and a sigmoid activation function and outputs a weight vector;
the corresponding position element values of the vectors output by the two branches are multiplied and transformed into a description variable with the same dimensions as the image features; the description information is adjusted through the weight vector and the description encoding features are adaptively scaled, so that redundant information in the image description is eliminated and information effective for image reconstruction is obtained.
Optionally, performing deep feature extraction on the shallow features with the multi-scale sub-networks includes:
downsampling the shallow features into small-scale feature maps through bilinear interpolation, with the scale reduced to half of the original;
taking this scale as the input of the first-layer sub-network, and gradually adding larger-scale sub-networks in stages;
scaling up the outputs of the different sub-networks of the previous stage through nearest-neighbor interpolation and fusing them into the input of the larger-scale sub-network; at each stage, each sub-network consists of a number of attention residual densely connected blocks in series, and from top to bottom the sub-networks at different scales use 5, 7 and 3 attention residual densely connected blocks, respectively;
fusing the information of different frequencies extracted by the sub-networks at different scales with an adaptive fusion module based on the channel attention mechanism.
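The staged multi-scale extraction above can be sketched as a skeleton. All names are illustrative: the attention residual densely connected blocks are replaced by identity placeholders, bilinear downsampling is approximated by 2×2 average pooling, and the cross-scale fusion of several sub-network outputs is reduced to a single path; only the scale bookkeeping (half scale in, double scale out, with 5, 7 and 3 blocks per stage) follows the text.

```python
import numpy as np

def bilinear_half(feat):
    """Stand-in for bilinear downsampling to half scale: 2x2 average pooling."""
    h, w, c = feat.shape
    return feat.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def nn_double(feat):
    """Nearest-neighbor upsampling by 2, used when adding a larger-scale stage."""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def arddb(feat):
    """Placeholder for an attention residual densely connected block."""
    return feat

N_BLOCKS = (5, 7, 3)  # blocks per sub-network, top to bottom

def deep_extract(shallow):
    """Staged multi-scale extraction skeleton: F_S (H, W, C) -> F_d (2H, 2W, C)."""
    x = bilinear_half(shallow)       # stage 1: half-scale sub-network
    for _ in range(N_BLOCKS[0]):
        x = arddb(x)
    x = nn_double(x)                 # stage 2: add the original-scale sub-network
    for _ in range(N_BLOCKS[1]):
        x = arddb(x)
    x = nn_double(x)                 # stage 3: add the double-scale sub-network
    for _ in range(N_BLOCKS[2]):
        x = arddb(x)
    return x                         # deep feature at twice the shallow scale
```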
Optionally, scaling up the deep features with the upsampling module includes:
enlarging the feature scale using the nearest-neighbor interpolation algorithm.
Optionally, the attention residual densely connected block consists of three spatial attention residual densely connected units and a local skip connection that connects the input of the attention residual densely connected block with the output of the last spatial attention residual densely connected unit.
Optionally, the spatial attention residual densely connected unit comprises a densely connected group of five convolutional layers, a spatial attention convolution group, and a skip connection that connects the input of the spatial attention residual densely connected unit with the output of the spatial attention convolution group.
Optionally, fusing the information of different frequencies extracted by the sub-networks at different scales with the adaptive fusion module based on the channel attention mechanism includes:
interpolating the small-scale feature maps to generate feature maps of the same size as the large-scale feature map;
passing the interpolated feature maps to a global average pooling layer, a channel compression convolutional layer and a channel expansion convolutional layer, respectively;
concatenating the obtained vectors of the three scales and processing them with a softmax layer on the same channel to generate the corresponding weight matrix;
dividing the weight matrix into three weight components corresponding to the three sub-networks, and multiplying the feature map interpolated at each scale with the corresponding weight component;
performing a weighted summation on the three resulting feature maps to obtain the fused output.
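The channel-attention fusion steps above can be sketched as follows. This is a simplified NumPy illustration: the channel compression and expansion convolutional layers are omitted, and a softmax across the three pooled vectors per channel stands in for the full weight-matrix computation; all names and dimensions are illustrative.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(maps):
    """Channel-attention fusion of three same-size feature maps of shape (H, W, C).

    Global average pooling yields one vector per map; a softmax across the
    three maps per channel yields the weight components; the weighted sum of
    the maps is the fused output.
    """
    stacked = np.stack(maps)                  # (3, H, W, C)
    gap = stacked.mean(axis=(1, 2))           # (3, C) global average pooling
    weights = softmax(gap, axis=0)            # (3, C), sums to 1 per channel
    return (stacked * weights[:, None, None, :]).sum(axis=0)

rng = np.random.default_rng(2)
maps = [rng.standard_normal((8, 8, 4)) for _ in range(3)]  # interpolated to same size
fused = adaptive_fuse(maps)
```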
Optionally, processing the low-resolution single image and its corresponding image description information through the single-image super-resolution model and outputting a high-resolution single image includes:
inputting the low-resolution single image into the shallow feature extraction module to obtain shallow image features;
inputting the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenating the description variable and the image features, feeding them into the subsequent single-image super-resolution model, and outputting a high-resolution single image.
In a second aspect, a generation system for visual resolution enhancement is provided, including:
an acquisition module, configured to acquire a low-resolution single image to be processed and its corresponding image description information;
an output module, configured to process the low-resolution single image and its corresponding image description information through a single-image super-resolution model, and output a high-resolution single image;
a training module, configured to train the single-image super-resolution model, the training module including:
a sampling sub-module, configured to collect training samples, the training samples containing high-resolution single image samples, low-resolution single image samples and their corresponding image description information samples;
a model establishment sub-module, configured to establish, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single image samples.
Optionally, the sampling sub-module includes:
a first sampling unit, configured to use the publicly available large-scale CUB bird image data set to obtain high-resolution single image samples and back them up;
a second sampling unit, configured to use MATLAB's built-in bicubic downsampling function to degrade the high-resolution single image samples into low-resolution single image samples at the given scaling factor;
a third sampling unit, configured to use the English sentence information in the large-scale CUB bird image data set that describes the feather color, body characteristics, movement posture and environmental performance of the bird in an image, to obtain the corresponding image description information samples.
Optionally, the model establishment sub-module includes:
an acquisition unit, configured to acquire a low-resolution single image and the corresponding image description information;
an extraction unit, configured to extract shallow features from the low-resolution single image based on a single convolutional layer, converting the input low-resolution single image from the RGB color space to the feature space;
an encoding processing unit, configured to encode the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features;
a compression unit, configured to concatenate the description variable and the image features, and perform channel compression on the concatenated features with one convolutional layer;
a deep feature extraction unit, configured to perform deep feature extraction on the shallow features with multi-scale sub-networks;
an upsampling unit, configured to scale up the deep features with an upsampling module;
a reconstruction unit, configured to reconstruct and output a high-resolution single image in RGB channels with two convolutional layers;
a model establishment unit, configured to converge, based on the preset loss function, the reconstructed high-resolution single image against the backed-up high-resolution single image samples, together with positive samples combined with matched description information and negative samples combined with mismatched description information, to establish the single-image super-resolution model.
Optionally, the encoding processing unit includes:
an encoding sub-unit, composed of a fully connected layer and configured to output a description encoding vector;
a first weight sub-unit, composed of a fully connected layer and a sigmoid activation function and configured to output a weight vector;
a transformation sub-unit, configured to multiply the corresponding position element values of the vectors output by the encoding sub-unit and the weight sub-unit, and transform the result into a description variable with the same dimensions as the image features.
Optionally, the multi-scale sub-network includes:
a scaling unit, configured to downsample the shallow features into small-scale feature maps through bilinear interpolation, with the scale reduced to half of the original;
an adding unit, configured to take this scale as the input of the first-layer sub-network, and gradually add larger-scale sub-networks in stages;
an input unit, configured to scale up the outputs of the different sub-networks of the previous stage through nearest-neighbor interpolation and fuse them into the input of the larger-scale sub-network; at each stage, each sub-network consists of a number of attention residual densely connected blocks in series, and from top to bottom the sub-networks at different scales use 5, 7 and 3 attention residual densely connected blocks, respectively;
a fusion unit, configured to fuse the information of different frequencies extracted by the sub-networks at different scales with an adaptive fusion module based on the channel attention mechanism.
Optionally, the upsampling unit includes:
an enlargement sub-unit, configured to enlarge the feature scale using the nearest-neighbor interpolation algorithm.
Optionally, the attention residual densely connected block includes:
a first composition unit, configured to compose three spatial attention residual densely connected units and a local skip connection that connects the input of the attention residual densely connected block with the output of the last spatial attention residual densely connected unit.
Optionally, the fusion unit includes:
a mapping sub-unit, configured to interpolate the small-scale feature maps to generate feature maps of the same size as the large-scale feature map;
a transfer sub-unit, configured to pass the interpolated feature maps to a global average pooling layer, a channel compression convolutional layer and a channel expansion convolutional layer, respectively;
a second weight sub-unit, configured to concatenate the obtained vectors of the three scales and process them with a softmax layer on the same channel to generate the corresponding weight matrix;
a multiplication sub-unit, configured to divide the weight matrix into three weight components corresponding to the three sub-networks, and multiply the feature map interpolated at each scale with the corresponding weight component;
an output sub-unit, configured to perform a weighted summation on the three resulting feature maps to obtain the fused output.
Optionally, the spatial attention residual densely connected unit includes:
a second composition unit, configured to compose a densely connected group of five convolutional layers, a spatial attention convolution group, and a skip connection that connects the input of the spatial attention residual densely connected unit with the output of the spatial attention convolution group.
Optionally, the output module includes:
an extraction sub-module, configured to input the low-resolution single image into the shallow feature extraction module to obtain shallow image features;
an output sub-module, configured to input the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenate the description variable and the image features, feed them into the subsequent single-image super-resolution model, and output a high-resolution single image.
In a third aspect, a device is provided, including:
a memory, configured to store at least one program;
a processor, configured to execute the at least one program to implement the method described above.
In a fourth aspect, a storage medium is provided, storing an executable program which, when executed by a processor, implements the method described above.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects:
A single-image super-resolution model, built from training samples comprising high-resolution single-image samples, low-resolution single-image samples, and their corresponding image description information samples, together with a preset loss function, processes the acquired low-resolution single image to be handled. This accurately and efficiently restores the low-resolution single image to a high-resolution single image, and yields a sharper single image by exploiting the specific image description information as a prior.
Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings in the following description show merely some embodiments of the present invention, and persons of ordinary skill in the art may derive other drawings from them without creative effort.
The structures, proportions, and sizes depicted in this specification are provided only to accompany the disclosed content, for the understanding of those familiar with this technology, and are not intended to limit the conditions under which the present invention may be practiced; they therefore carry no substantive technical meaning. Any modification of structure, change of proportional relationship, or adjustment of size that does not affect the effects achievable by and the objectives attainable by the present invention shall still fall within the scope covered by the technical content disclosed herein.
FIG. 1 is a schematic flowchart of the steps of a generation method for visual resolution enhancement provided by an embodiment of the present invention;
FIG. 2 is a structural block diagram of a generation system for visual resolution enhancement provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of the single-image super-resolution model in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the multi-scale sub-network in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the attention residual dense connection block in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the operational details of the adaptive fusion module in an embodiment of the present invention.
Detailed Description
To make the objectives, features, and advantages of the present invention more apparent and understandable, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are merely some, rather than all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiment 1
As shown in FIG. 1, this embodiment provides a generation method for visual resolution enhancement, including the following steps:
S1. Acquire a low-resolution single image to be processed and its corresponding image description information.
S2. Process the low-resolution single image and its corresponding image description information through a single-image super-resolution model, and output a high-resolution single image.
The training process of the single-image super-resolution model includes the following steps:
S3. Collect training samples, the training samples including high-resolution single-image samples, low-resolution single-image samples, and their corresponding image description information samples.
S4. According to the collected training samples, establish the single-image super-resolution model based on a preset loss function and the high-resolution single-image samples.
Optionally, step S3 includes:
S31. Collect high-resolution single-image samples: obtain high-resolution single-image samples from the public large-scale CUB bird image dataset and back them up.
S32. Degrade the high-resolution single-image samples into low-resolution single-image samples at a ×4 scaling factor using MATLAB's built-in bicubic downsampling function.
S33. Use the English sentence information in the above dataset describing the feather color, body characteristics, motion posture, and environmental context of the bird in each image.
Thus, through steps S31 to S33, high-resolution single-image samples, low-resolution single-image samples, and their corresponding image description information samples can be obtained, establishing the training samples.
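The sample-collection steps S31 to S33 can be sketched as follows. This is a minimal, dependency-free illustration: `degrade_x4` uses 4×4 box averaging as a hypothetical stand-in for MATLAB's bicubic downsampling, and the caption string is illustrative, not taken from the CUB dataset.

```python
import numpy as np

def degrade_x4(hr: np.ndarray) -> np.ndarray:
    """Degrade an HR image of shape (H, W, C) to LR at a x4 scaling factor.

    Stand-in for MATLAB's bicubic `imresize`: a simple 4x4 box average.
    The embodiment itself uses bicubic downsampling; box averaging is
    used here only to keep the sketch dependency-free.
    """
    h, w, c = hr.shape
    assert h % 4 == 0 and w % 4 == 0, "HR size must be divisible by 4"
    return hr.reshape(h // 4, 4, w // 4, 4, c).mean(axis=(1, 3))

# Build one training sample: the backed-up HR image, its x4-degraded LR
# counterpart, and the associated description sentence (dummy text here).
hr_sample = np.random.rand(120, 120, 3)
lr_sample = degrade_x4(hr_sample)
caption = "a bird with red feathers perched on a branch"  # illustrative only
training_sample = (hr_sample, lr_sample, caption)
print(lr_sample.shape)  # (30, 30, 3)
```

Because the box average preserves the global mean, the degraded sample stays radiometrically consistent with its HR source, which is convenient for sanity checks during data preparation.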
Optionally, step S4 includes:
S41. Acquire a low-resolution single image and the corresponding image description information.
S42. Extract shallow features from the low-resolution single image with a single convolutional layer, converting the input low-resolution single image from the RGB color space into the feature space.
S43. Encode the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features.
S44. Concatenate the description variable with the image features, and compress the channels of the concatenated features with one convolutional layer.
S45. Extract deep features from the shallow features using a multi-scale sub-network.
S46. Upscale the deep features using an upsampling module.
S47. Reconstruct and output a high-resolution single image with RGB channels using two convolutional layers.
S48. Based on the preset loss function, converge the reconstructed high-resolution single image against the backed-up high-resolution single-image samples, together with positive samples combined with matching description information and negative samples combined with mismatched description information, through backpropagation, to establish the single-image super-resolution model.
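The data flow of steps S41 to S47 can be sketched as the following shape-level pipeline. Every stage is a stub standing in for the convolutional blocks described above (the actual layers are learned); only the tensor shapes and the order of operations follow the text, and all sizes (64 channels, 30×30 patch, ×4 factor) are those mentioned in the embodiments.

```python
import numpy as np

rng = np.random.default_rng(2)

def shallow_features(lr):            # S42: RGB -> 64-channel feature space
    return rng.normal(size=(64, *lr.shape[1:]))

def describe(feat):                  # S43: description variable, same dims as features
    return rng.normal(size=feat.shape)

def compress(x):                     # S44: one-conv channel compression back to 64
    return x[:64]

def deep_features(x):                # S45: multi-scale sub-network (identity stub)
    return x

def upsample(x, s=4):                # S46: nearest-neighbor upscaling by factor s
    return x.repeat(s, axis=1).repeat(s, axis=2)

def reconstruct(x):                  # S47: two convs -> 3-channel RGB output
    return x[:3]

lr = rng.normal(size=(3, 30, 30))    # S41: low-resolution input patch
f = shallow_features(lr)
f = compress(np.concatenate([f, describe(f)], axis=0))  # concat desc + features
sr = reconstruct(upsample(deep_features(f)))
print(sr.shape)  # (3, 120, 120)
```

The sketch makes the dimension bookkeeping explicit: concatenating the description variable doubles the channel count to 128, the compression step restores 64 channels, and the ×4 upsampling turns a 30×30 patch into the 120×120 output.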
Optionally, step S43 includes:
S431. The adaptive adjustment block consists of two branches: one branch consists of a fully connected layer and outputs a description encoding vector, and the other branch consists of a fully connected layer followed by a sigmoid activation function and outputs a weight vector.
S432. The vectors output by the two branches are multiplied element-wise and transformed into a description variable with the same dimensions as the image features. Specifically, the weight vector adjusts the description information by adaptively scaling the description encoding features, thereby discarding the redundant information in the image description and retaining the information useful for image reconstruction.
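The two-branch gating of S431 and S432 can be sketched as follows. The dimensions (a 256-d description encoding gated down to a 64-d feature dimension) and the random weight matrices `W_enc` / `W_gate` are assumptions standing in for the two learned fully connected layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_text, d_feat = 256, 64                      # assumed sizes
W_enc = rng.normal(size=(d_text, d_feat))     # branch 1: FC -> description encoding
W_gate = rng.normal(size=(d_text, d_feat))    # branch 2: FC -> sigmoid -> weights

def adaptive_adjust(desc: np.ndarray) -> np.ndarray:
    encoding = desc @ W_enc        # description encoding vector
    gate = sigmoid(desc @ W_gate)  # weight vector, each element in [0, 1]
    return encoding * gate         # element-wise product (S432)

desc = rng.normal(size=d_text)
c_tilde = adaptive_adjust(desc)
print(c_tilde.shape)  # (64,)
```

The sigmoid gate lets the block suppress description dimensions that carry no reconstruction-relevant information, which is exactly the "adaptive scaling" role described in S432.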
Optionally, step S45 includes:
S451. Downsample the shallow features into a small-scale feature map via bilinear interpolation, reducing the scale to half the original.
S452. Use this scale as the input of the first-layer sub-network, and progressively add larger-scale sub-networks stage by stage.
S453. The input of each larger-scale sub-network is formed by upscaling the outputs of the different sub-networks of the previous stage via nearest-neighbor interpolation and fusing them. At each stage, a sub-network consists of a number of attention residual dense connection blocks in series; from top to bottom, the sub-networks at the different scales use 5, 7, and 3 attention residual dense connection blocks, respectively.
S454. Fuse the information of different frequencies extracted by the sub-networks at different scales using an adaptive fusion module based on a channel attention mechanism.
Optionally, step S454 includes:
S4541. Interpolate the small-scale feature maps to generate feature maps of the same size as the large-scale feature map.
S4542. Pass the interpolated feature maps through a global average pooling layer, a channel-compression convolutional layer, and a channel-expansion convolutional layer, respectively.
S4543. Concatenate the resulting vectors of the three scales and process them with a softmax layer along the same channel to generate the corresponding weight matrix.
S4544. Split the weight matrix into three weight components corresponding to the three sub-networks, and multiply the interpolated feature map of each scale by its corresponding weight component.
S4545. Perform a weighted summation of the three resulting feature maps to obtain the fused output.
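The fusion steps S4541 to S4545 can be sketched as follows. This simplified version replaces the channel-compression and channel-expansion convolutions of S4542 with plain global average pooling, so the per-channel softmax weighting across the three branches is the only mechanism shown; all tensor sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Three branch feature maps already interpolated to the common large size
# (S4541): C channels, H x W spatial extent (random stand-in values).
C, H, W = 8, 4, 4
feats = [rng.normal(size=(C, H, W)) for _ in range(3)]

# S4542 (simplified): global average pooling gives one vector per branch;
# the compression/expansion convolutions are omitted in this sketch.
vecs = np.stack([f.mean(axis=(1, 2)) for f in feats])   # shape (3, C)

# S4543: softmax across the three branches, computed per channel.
weights = softmax(vecs, axis=0)                          # (3, C), columns sum to 1

# S4544-S4545: scale each branch by its weight component and sum.
fused = sum(w[:, None, None] * f for w, f in zip(weights, feats))
print(fused.shape)  # (8, 4, 4)
```

Because the softmax is taken over the branch axis, each channel of the fused output is a convex combination of the three sub-network responses, which is what lets the module trade off low- and high-frequency information per channel.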
Embodiment 2
As shown in FIG. 2, this embodiment provides a generation system for visual resolution enhancement, the system including:
an acquisition module, configured to acquire the low-resolution single image to be processed and its image description information;
an output module, configured to process the low-resolution single image and the image description information variable through a single-image super-resolution model and output a reconstructed high-resolution single image, wherein the single-image super-resolution model is based on high-resolution single-image samples and low-resolution single-image samples.
The training module includes:
a sampling submodule, configured to collect training samples, the training samples including high-resolution single-image samples, low-resolution single-image samples, and corresponding image description information samples;
a model establishment submodule, configured to establish the single-image super-resolution model based on a preset loss function and the high-resolution single-image samples according to the collected training samples.
Optionally, the sampling submodule includes:
a first sampling unit, configured to collect high-resolution single-image samples by obtaining them from the public large-scale CUB bird image dataset and backing them up;
a second sampling unit, configured to degrade the high-resolution single-image samples into low-resolution single-image samples at the ×4 scaling factor using MATLAB's built-in bicubic downsampling function;
a third sampling unit, configured to use the English sentence information in the above dataset describing the feather color, body characteristics, motion posture, and environmental context of the bird in each image.
Thus, the above sampling units can obtain high-resolution single-image samples, low-resolution single-image samples, and their corresponding image description information samples, establishing the training samples.
Optionally, the model establishment submodule includes:
an acquisition unit, configured to acquire low-resolution single-image samples and the corresponding image description information;
an extraction unit, configured to extract shallow features from the low-resolution single image with a single convolutional layer, converting the input low-resolution single image from the RGB color space into the feature space;
an encoding processing unit, configured to encode the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features;
a compression unit, configured to concatenate the description variable with the image features and compress the channels of the concatenated features with one convolutional layer;
a deep feature extraction unit, configured to extract deep features from the shallow features using a multi-scale sub-network;
an upsampling unit, configured to upscale the deep features using an upsampling module;
a reconstruction unit, configured to reconstruct and output a high-resolution single image with RGB channels using two convolutional layers;
a model establishment unit, configured to converge, based on the preset loss function, the reconstructed high-resolution single image against the backed-up high-resolution single-image samples, together with positive samples combined with matching description information and negative samples combined with mismatched description information, through backpropagation, to establish the single-image super-resolution model.
Optionally, the encoding processing unit includes:
an encoding subunit, consisting of a fully connected layer, which outputs a description encoding vector;
a first weighting subunit, consisting of a fully connected layer followed by a sigmoid activation function, which outputs a weight vector;
a transformation subunit, which multiplies the vectors output by the encoding subunit and the weighting subunit element-wise and transforms the result into a description variable with the same dimensions as the image features.
Optionally, the multi-scale sub-network includes:
a scaling unit, configured to downsample the shallow features into a small-scale feature map via bilinear interpolation, reducing the scale to half the original;
an addition unit, configured to use this scale as the input of the first-layer sub-network and progressively add larger-scale sub-networks stage by stage;
an input unit, configured to upscale the outputs of the different sub-networks of the previous stage via nearest-neighbor interpolation and fuse them into the input of the larger-scale sub-network; at each stage, a sub-network consists of a number of attention residual dense connection blocks in series, and from top to bottom the sub-networks at the different scales use 5, 7, and 3 attention residual dense connection blocks, respectively;
a fusion unit, configured to fuse the information of different frequencies extracted by the sub-networks at different scales using an adaptive fusion module based on a channel attention mechanism.
Optionally, the upsampling unit includes:
an enlargement subunit, configured to enlarge the feature scale using a nearest-neighbor interpolation algorithm.
Optionally, the attention residual dense connection block includes:
a first composition unit, configured to compose three spatial attention residual dense connection units and a local skip connection linking the input of the attention residual dense connection block to the output of the last spatial attention residual dense connection unit.
Optionally, the fusion unit includes:
a mapping subunit, configured to interpolate the small-scale feature maps to generate feature maps of the same size as the large-scale feature map;
a transfer subunit, configured to pass the interpolated feature maps through a global average pooling layer, a channel-compression convolutional layer, and a channel-expansion convolutional layer, respectively;
a second weighting subunit, configured to concatenate the obtained vectors of the three scales and process them with a softmax layer along the same channel to generate the corresponding weight matrix;
a multiplication subunit, configured to split the weight matrix into three weight components corresponding to the three sub-networks and multiply the interpolated feature map of each scale by its corresponding weight component;
an output subunit, configured to perform a weighted summation of the three resulting feature maps to obtain the fused output.
Optionally, the spatial attention residual dense connection unit includes:
a second composition unit, configured to compose a dense connection group of five convolutional layers, a spatial attention convolution group, and a skip connection linking the input of the spatial attention residual dense connection unit to the output of the spatial attention convolution group.
Optionally, the output module includes:
an extraction submodule, configured to input the low-resolution single image into the shallow feature extraction module to obtain shallow image features;
an output submodule, configured to input the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenate the description variable with the image features, feed the result into the subsequent single-image super-resolution model, and output a high-resolution single image.
Embodiment 3
This embodiment provides an apparatus, the apparatus including:
at least one processor;
at least one memory, configured to store at least one program;
wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the steps of the generation method for visual resolution enhancement of Embodiment 1 above.
Embodiment 4
This embodiment provides a storage medium storing a processor-executable program; when executed by a processor, the executable program performs the steps of the generation method for visual resolution enhancement described in Embodiment 1.
Embodiment 5
Referring to FIG. 3 to FIG. 6, this embodiment provides a flow of a generation method for visual resolution enhancement, which may serve as a specific implementation of Embodiment 1; Embodiment 2 may also carry out the method of this embodiment. It specifically includes the following steps:
A. Collect training samples, the training samples including high-resolution single-image samples, low-resolution single-image samples, and their corresponding image description information samples.
B. Establish a single-image super-resolution model according to the collected training samples.
C. Acquire the low-resolution single image to be processed and its corresponding image description information.
D. Process the low-resolution single image to be processed and its corresponding image description information through the single-image super-resolution model, and output a high-resolution single image.
The specific implementation of step A is as follows:
A1. Obtain the public large-scale CUB bird dataset as the training dataset. The dataset is divided into 200 categories with 11,788 images in total; each image is accompanied by ten English sentences describing the feather color, body characteristics, motion posture, and environmental context of the bird in the image. The training set and test set are split at a ratio of 8855:2913, and the proportions of training and test data within every category are balanced, so no imbalance arises between the training-set and test-set sample distributions. The corresponding image description information is encoded as a description prior using the CNN-RNN encoding scheme.
A2. Use MATLAB's "imresize" function to perform ×4 bicubic downsampling on each high-resolution single image, obtaining the corresponding low-resolution single image and constituting a triplet-matched dataset {I_HR, I_LR, c}. The negative-sample description information used in the positive-negative sample matching loss is obtained by randomly selecting one description from the descriptions of the other images as the mismatched description, yielding a triplet negative-sample dataset {I_LR, I_HR, neg_c}. Horizontal or vertical flipping, 90° rotation, and random cropping of image patches are adopted for data augmentation.
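The mismatched-description sampling in A2 can be sketched index-wise as follows. The function is a hypothetical helper: it draws, for image `i`, a description index `neg_c` belonging to any image other than `i`, which guarantees the negative sample never coincides with the matching description.

```python
import random

random.seed(0)

def sample_mismatch(i: int, n_images: int) -> int:
    """Return a random image index j != i from which to take neg_c."""
    j = random.randrange(n_images - 1)
    return j if j < i else j + 1   # shift past i so i itself is never drawn

n = 11788  # total number of CUB images per the text
examples = [(i, sample_mismatch(i, n)) for i in (0, 5000, n - 1)]
for i, j in examples:
    assert j != i
print("mismatched indices never collide with the matched one")
```

The shift trick (`j + 1` when `j >= i`) keeps the draw uniform over the remaining `n - 1` images without a rejection loop.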
The specific implementation of step B is as follows:
B1. Randomly crop low-resolution image patches of size 30×30 from the low-resolution single images as input, denoted I_LR.
B2. In the shallow feature extraction module, a single convolutional layer converts the input low-resolution image from the RGB color space into the feature space; the resulting features contain 64 channels and have the same size as the input image. This layer consists of one 3×3 convolution and an activation function. Meanwhile, the adaptive adjustment block encodes the image description into a description variable with the same dimensions as the image features. The adaptive adjustment block consists of two branches: one branch consists of a fully connected layer and outputs a description encoding vector, while the other consists of a fully connected layer and a sigmoid activation function and outputs a weight vector; the vectors output by the two branches are multiplied element-wise to obtain the description variable, denoted c̃. The description variable and the image features are then concatenated and channel-compressed by one 3×3 convolutional layer, yielding the shallow features F_S. This process can be expressed as:

F_S = Conv(Concat(Conv(I_LR), c̃))
B3. After obtaining the shallow features F_S, input them into the deep feature extraction module composed of multi-scale sub-networks, and generate effective deep features F_d through multiple parallel sub-networks. The final deep features satisfy F_d ∈ R^(2W×2H×C); that is, the deep features double the scale of the shallow features. Within the multi-scale sub-network, feature information at different scales is obtained mainly through upsampling and downsampling. To construct feature maps of different scales, the shallow features are first downsampled into a small-scale feature map via bilinear interpolation, with the scale reduced to half the original:

F_S↓ = Inter(F_S, 1/2)↓

The deep feature extraction module takes this scale as the input of the first-layer sub-network and progressively adds larger-scale sub-networks stage by stage. The input of each larger-scale sub-network is formed by upscaling the outputs of the different sub-networks of the previous stage via nearest-neighbor interpolation and fusing them. At each stage, a sub-network consists of a number of attention residual dense connection blocks in series; sub-networks at different scales use different numbers of these blocks, namely 5, 7, and 3 from top to bottom. A subsequent adaptive fusion module based on the channel attention mechanism effectively fuses the information of different frequencies extracted by the sub-networks at the different scales. This module can be expressed as:

F_d = MARDN(F_S)
B4. After obtaining the deep features F_d, input them into the upsampling module, which enlarges the feature scale using a nearest-neighbor interpolation algorithm. This module can be expressed as:

F_up = Inter(F_d, s)↑

where F_up denotes the upsampled features, Inter(·) denotes the nearest-neighbor interpolation function, and s denotes the scaling factor.
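The nearest-neighbor upscaling Inter(F_d, s) used in B4 can be sketched directly with array repetition; each spatial position is copied s times along both axes, which is exactly nearest-neighbor interpolation for integer factors.

```python
import numpy as np

def nn_upsample(feat: np.ndarray, s: int) -> np.ndarray:
    """Nearest-neighbor upscaling of a (C, H, W) feature map by integer
    factor s: every spatial entry is repeated s times along both axes."""
    return feat.repeat(s, axis=1).repeat(s, axis=2)

f = np.arange(4, dtype=float).reshape(1, 2, 2)  # one-channel 2x2 toy map
up = nn_upsample(f, 2)
print(up.shape)     # (1, 4, 4)
print(up[0, 0])     # [0. 0. 1. 1.]
```

For integer scaling factors this repetition is equivalent to nearest-neighbor interpolation while avoiding any coordinate arithmetic, which is why it is a common implementation shortcut.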
B5. Finally, reconstruct and output the super-resolution image I_SR with RGB channels through two 3×3 convolutional layers. This process can be expressed as:

I_SR = Conv(Conv(F_up))
B6. The discriminator adopts a VGG network composed of strided convolutions. The inputs are the generated image and the real image; several strided convolutional layers change the dimensions of the input image features and shrink the feature maps. The final output feature map is concatenated with the image description encoding vector c, and a binary classifier yields the true/false judgment. This process can be expressed as:

Var = Net_D({I_SR, I_HR}, c)
B7. Using the loss function, converge the reconstructed high-resolution single image I_SR against the backed-up high-resolution single-image samples through backpropagation to establish the single-image super-resolution model. During training, the loss function of the generator consists of three parts, the reconstruction loss L_rec, the perceptual loss L_VGG, and the adversarial loss L_adv:

L_G = λ_1·L_rec + λ_2·L_VGG + λ_3·L_adv

where λ_1, λ_2, and λ_3 are the weights of the three losses. To ensure that the reconstructed image is as similar as possible to the real image in content, a pixel-wise constraint is imposed in image space through the reconstruction loss, which here uses the L_1 loss function:

L_rec = (1/N)·||I_HR − I_SR||_1

where N = H×W×C is the total number of pixels of the image, and W, H, and C denote the width, height, and number of channels of the high-resolution single image, respectively.
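The L_1 reconstruction loss defined above is a one-liner; the sketch below evaluates it on toy arrays so the normalization by N = H×W×C is explicit.

```python
import numpy as np

def l1_reconstruction_loss(i_hr: np.ndarray, i_sr: np.ndarray) -> float:
    """L_rec = (1/N) * ||I_HR - I_SR||_1 with N = H*W*C total pixels."""
    n = i_hr.size
    return float(np.abs(i_hr - i_sr).sum() / n)

i_hr = np.ones((4, 4, 3))            # toy "real" image
i_sr = np.full((4, 4, 3), 0.5)       # toy "reconstructed" image
print(l1_reconstruction_loss(i_hr, i_sr))  # 0.5
```

With every pixel off by exactly 0.5, the normalized sum of absolute differences is 0.5 regardless of image size, illustrating that the loss is a mean absolute error over all H×W×C values.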
Meanwhile, to enrich the texture information of the image, the feature information extracted from the reconstructed image by a fixed classification network (VGG) should remain similar to that extracted from the real image. The perceptual loss is used to impose this constraint, and is defined as:

L_VGG = (1/M)·||φ(I_HR) − φ(I_SR)||_1

where φ(·) denotes the fixed VGG feature extractor and M = H×W×C is the size of the designated feature map.
此外,为了保证生成器与判别器的相互博弈,需要使用对抗损失函数来训练生成器和判别器。生成器的对抗损失目的是让重建图像和真实图像在分布上尽可能地趋近,其定义如下:In addition, in order to ensure the mutual game between the generator and the discriminator, it is necessary to use an adversarial loss function to train the generator and the discriminator. The purpose of the generator's adversarial loss is to make the distribution of the reconstructed image and the real image as close as possible, which is defined as follows:
L_adv = log(1 − Net_D(Net_G(I_LR, c)))
where c is the attribute (description) information of the single image.
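Combining the three terms with the weights λ_1–λ_3 gives the generator objective; a numerical sketch follows (the weight values are placeholders, since the text does not disclose them, and `d_fake_score` stands in for the discriminator's output on the reconstructed image):

```python
import math

def generator_loss(l_rec, l_vgg, d_fake_score, lam=(1.0, 1.0, 1e-3)):
    """L_G = l1*L_rec + l2*L_VGG + l3*L_adv, with
    L_adv = log(1 - Net_D(Net_G(I_LR, c))); d_fake_score in (0, 1)."""
    l_adv = math.log(1.0 - d_fake_score)
    l1, l2, l3 = lam
    return l1 * l_rec + l2 * l_vgg + l3 * l_adv
```

As the discriminator's score on the fake image approaches 1, L_adv diverges toward −∞, which is what pushes the generator toward more realistic outputs.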
Unlike the generator's adversarial loss, the discriminator's adversarial loss aims to separate the reconstructed image from the real image in image distribution as far as possible. Whereas SRGAN and ESRGAN compute only the adversarial loss between images, this embodiment adds positive- and negative-sample adversarial loss constraints: a positive sample pairs an image with its matching description information c, while a negative sample pairs it with mismatched description information neg_c. The discriminator's adversarial loss is defined as follows:
L_adv^D = L_img + L_pos + L_neg
where L_img is the adversarial loss between the generated image and the real image; L_pos is the joint judgment with the matching description encoding, whose purpose is to let the discriminator assess the realism of the generated image while also discerning whether it corresponds to the description; and L_neg is the joint judgment with the mismatched description encoding. Notably, whether a real image or a generated one is fed into the discriminator together with mismatched description information, the verdict is "fake". The three losses are defined as follows:
L_img = −log Net_D(I_HR) − log(1 − Net_D(I_SR))
L_pos = −log Net_D(I_HR, c) − log(1 − Net_D(I_SR, c))
L_neg = −log(1 − Net_D(I_HR, neg_c)) − log(1 − Net_D(I_SR, neg_c))
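The three-term discriminator objective can be sketched numerically as follows (scalar probabilities stand in for the discriminator outputs; the exact log form is an assumption consistent with the generator loss log(1 − Net_D(·)) given earlier):

```python
import math

def _real(p):
    """Penalty when an input that should be judged real scores p."""
    return -math.log(p)

def _fake(p):
    """Penalty when an input that should be judged fake scores p."""
    return -math.log(1.0 - p)

def discriminator_loss(d_hr, d_sr, d_hr_c, d_sr_c, d_hr_neg, d_sr_neg):
    """Unconditional term + positive samples (matched description c)
    + negative samples (mismatched description neg_c, always 'fake')."""
    l_img = _real(d_hr) + _fake(d_sr)
    l_pos = _real(d_hr_c) + _fake(d_sr_c)
    l_neg = _fake(d_hr_neg) + _fake(d_sr_neg)  # both judged fake
    return l_img + l_pos + l_neg
```

The negative-sample term is what forces the discriminator to attend to the description: even a perfectly realistic image paired with the wrong text must be rejected.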
Set the learning rate, back-propagate gradients by minimizing the loss-function error, update the network parameters, and iterate until the network converges.
During backward-convergence training, the batch size is set to 16, the initial learning rate to 10^-4, and the description encoding to a 1024-dimensional latent variable. To construct batches, 30×30 low-resolution patches are randomly cropped from the low-resolution images and paired with 120×120 high-resolution patches. During iterative training, the learning rate is halved whenever the total number of training iterations reaches one of {5×10^4, 1×10^5, 2×10^5, 3×10^5}, according to the network's convergence. The generator is first trained with the reconstruction loss L_rec alone, to avoid the vanishing-gradient problem that arises when the discriminator can trivially tell generated images from real ones. This embodiment uses the ADAM optimizer for backward gradient propagation, with parameters β_1 = 0.9, β_2 = 0.999 and ε = 10^-8. The L1 loss keeps the reconstructed image as close as possible to the real image in content, the perceptual loss keeps its texture information close to that of the real image, and the adversarial loss pulls the distributions of reconstructed and real images together while discerning whether the image corresponds to the description. With the coefficients of these losses set, the network parameters are updated by back-propagation to minimize the sum of these errors, iterating until the network converges.
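The step-decay schedule described above (initial rate 10^-4, halved each time the iteration count passes one of the listed milestones) can be sketched as:

```python
def learning_rate(iteration, base_lr=1e-4,
                  milestones=(5 * 10**4, 10**5, 2 * 10**5, 3 * 10**5)):
    """Halve the learning rate once for each milestone already reached."""
    lr = base_lr
    for m in milestones:
        if iteration >= m:
            lr *= 0.5
    return lr

print(learning_rate(0))        # 1e-4
print(learning_rate(120_000))  # past two milestones -> 2.5e-5
```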
Step C specifically comprises:
Obtain the pre-divided CUB test dataset, which contains diverse low-resolution single images and their corresponding image description information variables.
Step D specifically comprises:
Input the low-resolution single images of the CUB test dataset to be restored into the trained single-image super-resolution model, which applies the scheme of step B to each input image: the adaptive adjustment block encodes the description corresponding to the image into a description variable with the same dimensions as the image features; the description variable is then concatenated with the image features obtained by converting the input low-resolution image from the RGB color space to the feature space through a single convolutional layer; finally, one convolutional layer performs channel compression to yield shallow features, which, after processing by the subsequent network, produce the high-resolution single image output.
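A sketch of this conditioning path at inference time. The two-branch gating (fully connected layer for the encoding, fully connected layer plus sigmoid for the weights) follows the description; the weight matrices, the 64-channel feature width, and the broadcast step are illustrative assumptions:

```python
import numpy as np

def adaptive_adjust(desc, w_enc, w_gate, h, w):
    """Two branches: FC -> encoding vector, FC + sigmoid -> weight vector;
    element-wise product, then broadcast to the H x W x C feature shape."""
    enc = desc @ w_enc                               # description encoding
    gate = 1.0 / (1.0 + np.exp(-(desc @ w_gate)))    # sigmoid weight vector
    v = enc * gate                                   # element-wise product
    return np.broadcast_to(v, (h, w, v.size))

rng = np.random.default_rng(0)
desc = rng.normal(size=1024)                  # 1024-d description latent
feat = rng.normal(size=(30, 30, 64))          # shallow image features
cond = adaptive_adjust(desc, rng.normal(size=(1024, 64)),
                       rng.normal(size=(1024, 64)), 30, 30)
fused = np.concatenate([feat, cond], axis=-1) # concat, then 1-conv compress
print(fused.shape)  # (30, 30, 128)
```

The concatenated tensor would then be channel-compressed by the single convolutional layer before entering the multi-scale sub-network.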
In summary, applying a single-image super-resolution model, built from training samples containing high-resolution single-image samples, low-resolution single-image samples and their corresponding image description information samples together with a preset loss function, to the acquired low-resolution single image to be processed accurately and efficiently restores it to a high-resolution single image, and yields a sharper single image by exploiting specific image description information as a prior.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (13)

  1. A method for generating visual resolution enhancement, comprising:
    acquiring a low-resolution single image to be processed and its corresponding image description information;
    processing the low-resolution single image and its corresponding image description information through a single-image super-resolution model, and outputting a high-resolution single image;
    wherein the training method of the single-image super-resolution model comprises:
    collecting training samples, the training samples containing high-resolution single-image samples, low-resolution single-image samples and their corresponding image description information samples;
    establishing, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single-image samples.
  2. The method for generating visual resolution enhancement according to claim 1, wherein collecting the training samples containing high-resolution single-image samples, low-resolution single-image samples and their corresponding image description information samples comprises:
    using an image dataset of a preset target to obtain high-resolution single-image samples and backing them up;
    degrading the high-resolution single-image samples into low-resolution single-image samples at a scaling factor;
    using the English sentence information in the image dataset that describes at least one of the color, body-shape features, motion posture and environmental appearance of the preset target in an image, to obtain the corresponding image description information samples.
  3. The method for generating visual resolution enhancement according to claim 1, wherein establishing, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single-image samples comprises:
    acquiring low-resolution single-image samples and corresponding image description information;
    extracting shallow features from the low-resolution single image with a single convolutional layer, converting the input low-resolution single image from the RGB color space to the feature space;
    encoding the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features;
    concatenating the description variable and the image features, and compressing the channels of the concatenated features with one convolutional layer;
    extracting deep features from the shallow features with a multi-scale sub-network;
    up-scaling the deep features with an upsampling module;
    reconstructing and outputting an RGB-channel high-resolution single image with two convolutional layers;
    reversely converging, based on the preset loss function, the reconstructed high-resolution single image against the backed-up high-resolution single-image samples, together with positive samples combined with matching description information and negative samples combined with mismatched description information, to establish the single-image super-resolution model.
  4. The method for generating visual resolution enhancement according to claim 3, wherein encoding the image description information with an adaptive adjustment block to obtain a description variable with the same dimensions as the image features comprises:
    the adaptive adjustment block consisting of two branches, one branch consisting of a fully connected layer that outputs a description encoding vector, and the other branch consisting of a fully connected layer and a sigmoid activation function that outputs a weight vector;
    multiplying the vectors output by the two branches element-wise at corresponding positions, and transforming the result into a description variable with the same dimensions as the image features.
  5. The method for generating visual resolution enhancement according to claim 3, wherein extracting deep features from the shallow features with a multi-scale sub-network comprises:
    down-sampling the shallow features into a small-scale feature map by bilinear interpolation, its scale reduced to half the original;
    taking this scale as the input of the first-layer sub-network, and progressively adding larger-scale sub-networks stage by stage;
    up-scaling the outputs of the different sub-networks of the previous stage by nearest-neighbor interpolation and fusing them as the input of the larger-scale sub-network; wherein the sub-network at each stage consists of a number of attention residual densely connected blocks in series, the numbers of attention residual densely connected blocks used by the sub-networks of different scales from top to bottom being 5, 7 and 3 respectively;
    fusing the information of different frequencies extracted by the sub-networks at different scales with an adaptive fusion module based on a channel attention mechanism.
  6. The method for generating visual resolution enhancement according to claim 3, wherein up-scaling the deep features with an upsampling module comprises:
    enlarging the feature scale using a nearest-neighbor interpolation algorithm.
  7. The method for generating visual resolution enhancement according to claim 5, wherein the attention residual densely connected block consists of three spatial attention residual densely connected units and a local skip connection connecting the input of the block to the output of the last spatial attention residual densely connected unit.
  8. The method for generating visual resolution enhancement according to claim 5, wherein fusing the information of different frequencies extracted by the sub-networks at different scales with an adaptive fusion module based on a channel attention mechanism comprises:
    interpolating the small-scale feature maps to generate feature maps of the same size as the large-scale feature map;
    passing the interpolated feature maps to a global average pooling layer, a channel-compression convolutional layer and a channel-expansion convolutional layer respectively;
    concatenating the obtained vectors of the three scales and processing them with a softmax layer on the same channel to generate the corresponding weight matrix;
    dividing the weight matrix into three weight components corresponding to the three sub-networks, and multiplying the interpolated feature map of each scale by the corresponding weight component;
    performing a weighted summation of the three resulting feature maps to obtain the fused output.
  9. The method for generating visual resolution enhancement according to claim 7, wherein the spatial attention residual densely connected unit comprises a densely connected group of five convolutional layers, a spatial attention convolution group, and a skip connection connecting the input of the unit to the output of the spatial attention convolution group.
  10. The method for generating visual resolution enhancement according to any one of claims 1 to 9, wherein processing the low-resolution single image and its corresponding image description information through a single-image super-resolution model and outputting a high-resolution single image comprises:
    inputting the low-resolution single image into a shallow feature extraction module to obtain shallow image features;
    inputting the corresponding image description information into the adaptive adjustment block to obtain a description variable with the same dimensions as the image features, concatenating the description variable and the image features, feeding them into the subsequent single-image super-resolution model, and outputting a high-resolution single image.
  11. A system for generating visual resolution enhancement, comprising:
    an acquisition module, configured to acquire a low-resolution single image to be processed and its corresponding image description information;
    an output module, configured to process the low-resolution single image and its corresponding image description information through a single-image super-resolution model and output a high-resolution single image;
    a training module, configured to train the single-image super-resolution model, the training module comprising:
    a sampling sub-module, configured to collect training samples, the training samples containing high-resolution single-image samples, low-resolution single-image samples and their corresponding image description information samples;
    a model establishment sub-module, configured to establish, according to the collected training samples, a single-image super-resolution model based on a preset loss function and the high-resolution single-image samples.
  12. An apparatus, comprising:
    a memory, configured to store at least one program;
    a processor, configured to execute the at least one program to implement the method of any one of claims 1 to 10.
  13. A storage medium storing an executable program which, when executed by a processor, implements the method of any one of claims 1 to 10.
PCT/CN2021/126019 2021-05-18 2021-10-25 Generation method, system and apparatus capable of visual resolution enhancement, and storage medium WO2022242029A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110541939.9 2021-05-18
CN202110541939.9A CN113139907B (en) 2021-05-18 2021-05-18 Generation method, system, device and storage medium for visual resolution enhancement

Publications (1)

Publication Number Publication Date
WO2022242029A1 true WO2022242029A1 (en) 2022-11-24

Family

ID=76817554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126019 WO2022242029A1 (en) 2021-05-18 2021-10-25 Generation method, system and apparatus capable of visual resolution enhancement, and storage medium

Country Status (2)

Country Link
CN (1) CN113139907B (en)
WO (1) WO2022242029A1 (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546274A (en) * 2022-11-29 2022-12-30 城云科技(中国)有限公司 Image depth judgment model, and construction method, device and application thereof
CN115936990A (en) * 2022-12-07 2023-04-07 中国科学技术大学 Synchronous processing method and system for multi-scale super-resolution and denoising of seismic data
CN116029907A (en) * 2023-02-14 2023-04-28 江汉大学 Processing method, device and processing equipment for image resolution reduction model
CN116071243A (en) * 2023-03-27 2023-05-05 江西师范大学 Infrared image super-resolution reconstruction method based on edge enhancement
CN116128727A (en) * 2023-02-02 2023-05-16 中国人民解放军国防科技大学 Super-resolution method, system, equipment and medium for polarized radar image
CN116156144A (en) * 2023-04-18 2023-05-23 北京邮电大学 Integrated system and method for hyperspectral information acquisition and transmission
CN116168352A (en) * 2023-04-26 2023-05-26 成都睿瞳科技有限责任公司 Power grid obstacle recognition processing method and system based on image processing
CN116402692A (en) * 2023-06-07 2023-07-07 江西财经大学 Depth map super-resolution reconstruction method and system based on asymmetric cross attention
CN116503260A (en) * 2023-06-29 2023-07-28 北京建筑大学 Image super-resolution reconstruction method, device and equipment
CN116523759A (en) * 2023-07-04 2023-08-01 江西财经大学 Image super-resolution reconstruction method and system based on frequency decomposition and restarting mechanism
CN116523740A (en) * 2023-03-13 2023-08-01 武汉大学 Infrared image super-resolution method based on light field
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116681980A (en) * 2023-07-31 2023-09-01 北京建筑大学 Deep learning-based large-deletion-rate image restoration method, device and storage medium
CN116823602A (en) * 2023-05-26 2023-09-29 天津大学 Parallax-guided spatial super-resolution reconstruction method for light field image
CN116934618A (en) * 2023-07-13 2023-10-24 江南大学 Image halftone method, system and medium based on improved residual error network
CN117274316A (en) * 2023-10-31 2023-12-22 广东省水利水电科学研究院 River surface flow velocity estimation method, device, equipment and storage medium
CN117437131A (en) * 2023-12-21 2024-01-23 珠海视新医用科技有限公司 Electronic staining method and device for endoscope image, equipment and storage medium
CN117495679A (en) * 2023-11-03 2024-02-02 北京科技大学 Image super-resolution method and device based on non-local sparse attention
CN117495681A (en) * 2024-01-03 2024-02-02 国网山东省电力公司济南供电公司 Infrared image super-resolution reconstruction system and method
CN117809310A (en) * 2024-03-03 2024-04-02 宁波港信息通信有限公司 Port container number identification method and system based on machine learning
CN117952830A (en) * 2024-01-24 2024-04-30 天津大学 Three-dimensional image super-resolution reconstruction method based on iterative interaction guidance
CN118097770A (en) * 2023-12-25 2024-05-28 浙江金融职业学院 Personnel behavior state detection and analysis method applied to bank
CN118096534A (en) * 2024-04-26 2024-05-28 江西师范大学 Infrared image super-resolution reconstruction method based on complementary reference
CN118169752A (en) * 2024-03-13 2024-06-11 北京石油化工学院 Seismic phase pickup method and system based on multi-feature fusion
CN118262258A (en) * 2024-05-31 2024-06-28 西南科技大学 Ground environment image aberration detection method and system
CN118333860A (en) * 2024-06-12 2024-07-12 济南大学 Residual enhancement type frequency space mutual learning face super-resolution method
CN118411291A (en) * 2024-07-04 2024-07-30 临沂大学 Transformer-based coal-rock image super-resolution reconstruction method and device
CN118469820A (en) * 2024-07-10 2024-08-09 江苏金寓信息科技有限公司 Super-resolution image reconstruction method, device, medium and equipment
CN118521482A (en) * 2024-07-23 2024-08-20 华东交通大学 Depth image guided super-resolution reconstruction network model

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139907B (en) * 2021-05-18 2023-02-14 广东奥普特科技股份有限公司 Generation method, system, device and storage medium for visual resolution enhancement
CN114170089B (en) * 2021-09-30 2023-07-07 成都市第二人民医院 Method for classifying diabetic retinopathy and electronic equipment
WO2023122927A1 (en) * 2021-12-28 2023-07-06 Boe Technology Group Co., Ltd. Computer-implemented method, apparatus, and computer-program product
CN116681627B (en) * 2023-08-03 2023-11-24 佛山科学技术学院 Cross-scale fusion self-adaptive underwater image generation countermeasure enhancement method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615582A (en) * 2018-11-30 2019-04-12 北京工业大学 A kind of face image super-resolution reconstruction method generating confrontation network based on attribute description
US20190304063A1 (en) * 2018-03-29 2019-10-03 Mitsubishi Electric Research Laboratories, Inc. System and Method for Learning-Based Image Super-Resolution
CN111340708A (en) * 2020-03-02 2020-06-26 北京理工大学 Method for rapidly generating high-resolution complete face image according to prior information
CN111583112A (en) * 2020-04-29 2020-08-25 华南理工大学 Method, system, device and storage medium for video super-resolution
CN112699844A (en) * 2020-04-23 2021-04-23 华南理工大学 Image super-resolution method based on multi-scale residual error level dense connection network
CN113139907A (en) * 2021-05-18 2021-07-20 广东奥普特科技股份有限公司 Generation method, system, device and storage medium for visual resolution enhancement

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014174087A1 (en) * 2013-04-25 2014-10-30 Thomson Licensing Method and device for performing super-resolution on an input image
US9123140B1 (en) * 2013-09-25 2015-09-01 Pixelworks, Inc. Recovering details in single frame super resolution images
CN111182254B (en) * 2020-01-03 2022-06-24 北京百度网讯科技有限公司 Video processing method, device, equipment and storage medium
CN112734646B (en) * 2021-01-19 2024-02-02 青岛大学 Image super-resolution reconstruction method based on feature channel division


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIAYU QIN: "Research on Multi-scale Sub-networks and Combined Prior Information Single Image Super-resolution", MASTER THESIS, no. 2, 27 June 2020 (2020-06-27), CN, pages 1 - 82, XP093009576, DOI: 10.27151/d.cnki.ghnlu *

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546274A (en) * 2022-11-29 2022-12-30 城云科技(中国)有限公司 Image depth judgment model, and construction method, device and application thereof
CN115546274B (en) * 2022-11-29 2023-02-17 城云科技(中国)有限公司 Image depth judgment model and construction method, device and application thereof
CN115936990A (en) * 2022-12-07 2023-04-07 中国科学技术大学 Synchronous processing method and system for multi-scale super-resolution and denoising of seismic data
CN115936990B (en) * 2022-12-07 2023-11-17 中国科学技术大学 Seismic data multi-scale super-resolution and denoising synchronous processing method and system
CN116128727A (en) * 2023-02-02 2023-05-16 中国人民解放军国防科技大学 Super-resolution method, system, equipment and medium for polarized radar image
CN116029907A (en) * 2023-02-14 2023-04-28 江汉大学 Processing method, device and processing equipment for image resolution reduction model
CN116029907B (en) * 2023-02-14 2023-08-08 江汉大学 Processing method, device and processing equipment for image resolution reduction model
CN116523740A (en) * 2023-03-13 2023-08-01 武汉大学 Infrared image super-resolution method based on light field
CN116523740B (en) * 2023-03-13 2023-09-15 武汉大学 Infrared image super-resolution method based on light field
CN116071243A (en) * 2023-03-27 2023-05-05 江西师范大学 Infrared image super-resolution reconstruction method based on edge enhancement
CN116071243B (en) * 2023-03-27 2023-06-16 江西师范大学 Infrared image super-resolution reconstruction method based on edge enhancement
CN116156144A (en) * 2023-04-18 2023-05-23 北京邮电大学 Integrated system and method for hyperspectral information acquisition and transmission
CN116156144B (en) * 2023-04-18 2023-08-01 北京邮电大学 Integrated system and method for hyperspectral information acquisition and transmission
CN116168352B (en) * 2023-04-26 2023-06-27 成都睿瞳科技有限责任公司 Power grid obstacle recognition processing method and system based on image processing
CN116168352A (en) * 2023-04-26 2023-05-26 成都睿瞳科技有限责任公司 Power grid obstacle recognition processing method and system based on image processing
CN116823602B (en) * 2023-05-26 2023-12-15 天津大学 Parallax-guided spatial super-resolution reconstruction method for light field image
CN116823602A (en) * 2023-05-26 2023-09-29 天津大学 Parallax-guided spatial super-resolution reconstruction method for light field image
CN116402692B (en) * 2023-06-07 2023-08-18 江西财经大学 Depth map super-resolution reconstruction method and system based on asymmetric cross attention
CN116402692A (en) * 2023-06-07 2023-07-07 江西财经大学 Depth map super-resolution reconstruction method and system based on asymmetric cross attention
CN116503260B (en) * 2023-06-29 2023-09-19 北京建筑大学 Image super-resolution reconstruction method, device and equipment
CN116503260A (en) * 2023-06-29 2023-07-28 北京建筑大学 Image super-resolution reconstruction method, device and equipment
CN116523759B (en) * 2023-07-04 2023-09-05 江西财经大学 Image super-resolution reconstruction method and system based on frequency decomposition and restarting mechanism
CN116523759A (en) * 2023-07-04 2023-08-01 江西财经大学 Image super-resolution reconstruction method and system based on frequency decomposition and restarting mechanism
CN116934618B (en) * 2023-07-13 2024-06-11 江南大学 Image halftone method, system and medium based on improved residual error network
CN116934618A (en) * 2023-07-13 2023-10-24 江南大学 Image halftone method, system and medium based on improved residual error network
CN116594061B (en) * 2023-07-18 2023-09-22 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116681980B (en) * 2023-07-31 2023-10-20 北京建筑大学 Deep learning-based large-deletion-rate image restoration method, device and storage medium
CN116681980A (en) * 2023-07-31 2023-09-01 北京建筑大学 Deep learning-based large-deletion-rate image restoration method, device and storage medium
CN117274316B (en) * 2023-10-31 2024-05-03 广东省水利水电科学研究院 River surface flow velocity estimation method, device, equipment and storage medium
CN117274316A (en) * 2023-10-31 2023-12-22 广东省水利水电科学研究院 River surface flow velocity estimation method, device, equipment and storage medium
CN117495679A (en) * 2023-11-03 2024-02-02 北京科技大学 Image super-resolution method and device based on non-local sparse attention
CN117437131B (en) * 2023-12-21 2024-03-26 珠海视新医用科技有限公司 Electronic staining method and device for endoscope image, equipment and storage medium
CN117437131A (en) * 2023-12-21 2024-01-23 珠海视新医用科技有限公司 Electronic staining method and device for endoscope image, equipment and storage medium
CN118097770A (en) * 2023-12-25 2024-05-28 浙江金融职业学院 Personnel behavior state detection and analysis method applied to banks
CN117495681B (en) * 2024-01-03 2024-05-24 国网山东省电力公司济南供电公司 Infrared image super-resolution reconstruction system and method
CN117495681A (en) * 2024-01-03 2024-02-02 国网山东省电力公司济南供电公司 Infrared image super-resolution reconstruction system and method
CN117952830A (en) * 2024-01-24 2024-04-30 天津大学 Three-dimensional image super-resolution reconstruction method based on iterative interaction guidance
CN117809310B (en) * 2024-03-03 2024-04-30 宁波港信息通信有限公司 Port container number identification method and system based on machine learning
CN117809310A (en) * 2024-03-03 2024-04-02 宁波港信息通信有限公司 Port container number identification method and system based on machine learning
CN118169752A (en) * 2024-03-13 2024-06-11 北京石油化工学院 Seismic phase pickup method and system based on multi-feature fusion
CN118096534A (en) * 2024-04-26 2024-05-28 江西师范大学 Infrared image super-resolution reconstruction method based on complementary reference
CN118262258A (en) * 2024-05-31 2024-06-28 西南科技大学 Ground environment image aberration detection method and system
CN118333860A (en) * 2024-06-12 2024-07-12 济南大学 Residual enhancement type frequency space mutual learning face super-resolution method
CN118411291A (en) * 2024-07-04 2024-07-30 临沂大学 Transformer-based coal-rock image super-resolution reconstruction method and device
CN118469820A (en) * 2024-07-10 2024-08-09 江苏金寓信息科技有限公司 Super-resolution image reconstruction method, device, medium and equipment
CN118521482A (en) * 2024-07-23 2024-08-20 华东交通大学 Depth image guided super-resolution reconstruction network model

Also Published As

Publication number Publication date
CN113139907B (en) 2023-02-14
CN113139907A (en) 2021-07-20

Similar Documents

Publication Publication Date Title
WO2022242029A1 (en) Generation method, system and apparatus capable of visual resolution enhancement, and storage medium
WO2022241995A1 (en) Visual image enhancement generation method and system, device, and storage medium
CN109903228B (en) Image super-resolution reconstruction method based on convolutional neural network
CN112750082B (en) Human face super-resolution method and system based on fusion attention mechanism
Qin et al. Multi-scale feature fusion residual network for single image super-resolution
WO2022110638A1 (en) Human image restoration method and apparatus, electronic device, storage medium and program product
CN109544448B (en) Group network super-resolution image reconstruction method of Laplacian pyramid structure
CN111028150B (en) Rapid space-time residual attention video super-resolution reconstruction method
CN113096017B (en) Image super-resolution reconstruction method based on depth coordinate attention network model
CN111353940B (en) Image super-resolution reconstruction method based on deep learning iterative up-down sampling
CN111311490A (en) Video super-resolution reconstruction method based on multi-frame fusion optical flow
CN111815516B (en) Super-resolution reconstruction method for weak supervision infrared remote sensing image
CN112801877B (en) Super-resolution reconstruction method of video frame
CN111861961A (en) Multi-scale residual error fusion model for single image super-resolution and restoration method thereof
CN112734646A (en) Image super-resolution reconstruction method based on characteristic channel division
CN113298716B (en) Image super-resolution reconstruction method based on convolutional neural network
CN113554058A (en) Method, system, device and storage medium for enhancing resolution of visual target image
CN111402128A (en) Image super-resolution reconstruction method based on multi-scale pyramid network
CN109949217B (en) Video super-resolution reconstruction method based on residual learning and implicit motion compensation
CN112419150A (en) Random multiple image super-resolution reconstruction method based on bilateral up-sampling network
Hui et al. Two-stage convolutional network for image super-resolution
CN112907448A (en) Method, system, equipment and storage medium for super-resolution of any-ratio image
CN112396554A (en) Image super-resolution algorithm based on generation countermeasure network
CN113674154A (en) Single image super-resolution reconstruction method and system based on generation countermeasure network
CN116188272B (en) Two-stage depth network image super-resolution reconstruction method suitable for multiple fuzzy cores

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21940475

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21940475

Country of ref document: EP

Kind code of ref document: A1