WO2021227877A1 - Image super-resolution processing method, apparatus, device and storage medium - Google Patents

Image super-resolution processing method, apparatus, device and storage medium

Info

Publication number
WO2021227877A1
WO2021227877A1 (PCT/CN2021/090646, CN2021090646W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
sample
information content
neural network
target
Prior art date
Application number
PCT/CN2021/090646
Other languages
English (en)
French (fr)
Inventor
孔德辉
刘衡祁
徐科
杨维
宋剑军
朱方
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Priority to EP21804003.8A priority Critical patent/EP4152244A4/en
Publication of WO2021227877A1 publication Critical patent/WO2021227877A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks

Definitions

  • the present disclosure relates to, but is not limited to, the field of computer technology.
  • super-resolution reconstruction technology based on deep learning shows clear advantages over traditional methods in reconstruction quality.
  • mainstream super-resolution reconstruction methods have therefore increasingly shifted toward deep learning.
  • current super-resolution methods based on deep neural networks are divided into two categories: one based on Generative Adversarial Networks (GAN), and the other based on supervised fully convolutional neural networks.
  • the former uses the perceptron cost function in the GAN framework to make the output lie in the high-level space of the ground truth (GT); the goal is that the discriminator cannot distinguish the generated image from the real image.
  • the supervised method generates low-/high-resolution pairs (LHP) through a certain degradation model and fits the LHP mapping with the help of complex network structure modeling.
  • an embodiment of the present disclosure provides an image super-resolution processing method, including: acquiring an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same.
  • an embodiment of the present disclosure provides an image super-resolution processing apparatus, including: an acquisition module configured to acquire an image to be processed; and a processing module configured to perform super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering the original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same.
  • embodiments of the present disclosure provide a device including: one or more processors; and a storage device configured to store one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement any method in the embodiments of the present disclosure.
  • an embodiment of the present disclosure provides a storage medium that stores a computer program, and when the computer program is executed by a processor, any one of the methods in the embodiments of the present disclosure is implemented.
  • FIG. 1 is a schematic flowchart of an image super-resolution processing method provided by the present disclosure
  • Fig. 1a is an architecture diagram of a deep neural network provided by the present disclosure
  • Figure 1b is a training flowchart of the neural network model provided by the present disclosure
  • FIG. 2 is a schematic structural diagram of an image super-resolution processing device provided by the present disclosure
  • Fig. 3 is a schematic structural diagram of a device provided by the present disclosure.
  • current super-resolution methods based on deep neural networks are divided into two categories: one based on Generative Adversarial Networks (GAN), and the other based on supervised fully convolutional neural networks.
  • the former uses the perceptron cost function in the GAN framework to make the output lie in the high-level space of the ground truth (GT); the goal is that the discriminator cannot distinguish the generated image from the real image.
  • the supervised method generates low-/high-resolution pairs (LHP) through a certain degradation model and fits the LHP mapping with the help of complex network structure modeling.
  • the two types of methods generally randomly crop each data set into fixed-size blocks, and then combine multiple blocks to form a sample set.
  • the sample set is used as input for forward and backward propagation in the network to update network parameters.
  • this approach suffers a significant imbalance problem on natural images, mainly because smooth areas account for a statistically higher percentage than the various textures. In other words, unconstrained random sampling feeds the network more smooth-area samples, causing the network to favor the LHPs of smooth areas.
  • the present disclosure therefore provides an image super-resolution processing method, apparatus, device, and storage medium, which substantially avoid one or more of the problems caused by the limitations and shortcomings of the prior art. According to the embodiments of the present disclosure, the problem of unbalanced sample ratios in prior-art sampling can be solved.
  • FIG. 1 is a schematic flowchart of an image super-resolution processing method provided by the present disclosure. This method can be applied to the case of super-resolution processing of low-resolution to-be-processed images.
  • the method may be executed by the image super-resolution processing apparatus provided in the present disclosure, and the image super-resolution processing apparatus may be implemented by software and/or hardware and integrated on the device.
  • an image super-resolution processing method provided by the present disclosure includes steps S110-S120.
  • step S110 an image to be processed is acquired.
  • the image to be processed is a low-resolution image.
  • the image to be processed may be acquired through a camera, or may be obtained by cropping it from an existing image or video.
  • the embodiment of the present disclosure does not limit the method of acquiring the image to be processed.
  • step S120 super-resolution processing is performed on the image to be processed by a target neural network model; the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same.
  • the pyramid neural network model is built around a basic feature extraction module and obtains features from shallow to high-level through a multi-layer cascade.
  • on the one hand, the pyramid network structure retains bottom-, middle-, and high-level features for the upsampling process; on the other hand, the multi-layer skip connections improve the stability of the network.
  • through the multi-layer cascade of the basic feature extraction module φ, features from shallow to high-level are obtained.
  • the basic feature extraction module can be a residual network or a U-Net structure model. The features extracted by each module serve as the input of the next level of feature extraction, and finally the features of every layer serve as input to the upsampling module, as shown in the following formula:
  • f_k = φ_k(f_{k-1}), k = 1, …, n; Y = Upsample(f_0, f_1, …, f_n)
  • where φ_k is the basic feature extraction module.
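  • The cascade above can be sketched in code. This is a minimal illustration, not the patented network: the basic feature extraction module is stood in by a hypothetical fixed random 3x3 convolution with a ReLU, the upsampling stage is omitted, and only the pattern of feeding each module's output to the next while retaining every level's features is shown.

```python
import numpy as np

def make_module(seed):
    """Stand-in basic feature extraction module (hypothetical):
    a fixed random 3x3 convolution followed by ReLU."""
    rng = np.random.default_rng(seed)
    kernel = rng.standard_normal((3, 3)) / 9.0

    def module(feat):
        h, w = feat.shape
        padded = np.pad(feat, 1, mode="edge")
        out = np.zeros_like(feat)
        for y in range(h):
            for x in range(w):
                out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * kernel)
        return np.maximum(out, 0.0)  # ReLU keeps features non-negative

    return module

def pyramid_features(image, n_levels=3):
    """Cascade of modules phi_0..phi_n: each module's output feeds the
    next, and every level's features are collected so that shallow,
    middle, and high-level information can all reach the upsampler."""
    modules = [make_module(seed=k) for k in range(n_levels)]
    feats, cur = [], image
    for phi in modules:
        cur = phi(cur)      # f_k = phi_k(f_{k-1})
        feats.append(cur)   # keep this level's features
    return feats

img = np.random.default_rng(0).random((8, 8))
feats = pyramid_features(img)
print(len(feats), feats[0].shape)
```

A real implementation would replace `make_module` with a residual or U-Net block and concatenate `feats` as the upsampling input.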
  • FIG. 1a it is an architecture diagram of a deep neural network in an embodiment of the present disclosure.
  • a low-resolution image is input to a neural network model to obtain a high-resolution image.
  • shallow feature extraction is performed first.
  • the basic feature extraction modules include: module φ_0, module φ_1, …, module φ_n; the features extracted by each module serve as the input of the next level of feature extraction, and finally the features of every layer serve as input to the upsampling module.
  • unlike a deep neural network aimed at classification and recognition, the output of super-resolution reconstruction is not limited to a finite number of classification targets.
  • super-resolution is a many-to-many mapping process, in contrast to classification, as shown in the following formulas: C(X) = q and K(X) = Y
  • C represents a classification and recognition network
  • X represents an input image, which includes two-dimensional (grayscale) or three-dimensional (for example, color information) or even four-dimensional (for example, video that includes color information, etc.).
  • q represents a scalar.
  • q may be a Boolean variable.
  • q may be a scalar of limited size.
  • K represents the super-resolution mapping relationship, and Y represents the output high-resolution result. In most cases Y should keep the same number of dimensions as X, with the extent in each dimension no smaller than that of X, so the mapping can be regarded as an information expansion process. To improve information dimension and quality, K needs to attend to all of the input information.
  • traditional super-resolution input sample extraction randomly selects a batch of samples {a_0, a_1, …, a_n} from the sample set, randomly crops each sample a_i, performs data augmentation as needed, and then feeds the result into the network for training.
  • this approach is simply a migration of the classification and recognition pipeline and does not take into account the similarity of X in the underlying information; that is, most images contain large smooth areas (a phenomenon more pronounced in high-resolution images), and smooth regions have high mutual similarity.
  • during network training, this similarity manifests as a statistically higher proportion of smooth input than other texture regions, and this imbalance causes the super-resolution neural network to favor smooth outputs. Such results violate the ground-truth image information, losing some originally clear area details.
  • the target image samples are obtained by filtering the original image samples according to sample information content intervals; that is, filtering is used to equalize the sample distribution.
  • for each image sample, its Sobel gradient information is computed and summed, and the sum is used as the sample information content. Alternatively, the variance may be used, since variance also characterizes the degree of sample variation: the variance of each randomly obtained sample is computed, and the sample is placed into the corresponding interval according to its variance, until every interval is filled. Since each sample is obtained randomly, a shuffle operation is applied between training rounds to preserve the randomness of the samples.
  • the entire sample information content space corresponding to the target image sample set is divided into at least one sample information content interval; each interval has a starting point and an ending point of sample information content, and a sample quota is set for each interval.
  • the sample information content of an original image sample is obtained, and the corresponding sample information content interval is determined from it. After the interval is found, the current number of samples in it is checked: if the quota has not been reached, the sample is selected into the interval; if the quota has been reached, the sample is not selected.
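  • The filtering procedure above can be sketched as follows. A Sobel-based information content is computed per patch, and each patch is admitted only while its interval is under quota; the interval edges (20, 60, 120) and the quota of 5 are hypothetical values chosen for illustration, not values from the disclosure.

```python
import numpy as np

def sobel_info_content(patch):
    """Sum of absolute Sobel gradient responses: a measure of
    texture richness used as the sample information content."""
    gx_k = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    gy_k = gx_k.T
    p = np.pad(patch, 1, mode="edge")
    h, w = patch.shape
    total = 0.0
    for y in range(h):
        for x in range(w):
            win = p[y:y + 3, x:x + 3]
            total += abs((win * gx_k).sum()) + abs((win * gy_k).sum())
    return total

def filter_samples(patches, edges, quota):
    """Keep a patch only if the interval its information content
    falls into has not yet reached its quota, so every interval ends
    up with the same number of samples."""
    counts = [0] * (len(edges) + 1)
    selected = []
    for patch in patches:
        pic = sobel_info_content(patch)
        idx = sum(pic > e for e in edges)  # index of the interval
        if counts[idx] < quota:
            counts[idx] += 1
            selected.append(patch)
    return selected, counts

rng = np.random.default_rng(1)
patches = [rng.random((8, 8)) * rng.random() for _ in range(40)]
sel, counts = filter_samples(patches, edges=[20.0, 60.0, 120.0], quota=5)
print(len(sel) <= 20)  # 4 intervals x quota 5 bounds the selection
```

Between training rounds, the selected set would be shuffled and refilled, as described above.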
  • the target image sample is obtained after filtering the original image sample
  • the pyramid neural network model is iteratively trained through the target image sample set to obtain the target neural network model
  • super-resolution processing is performed on the image to be processed by the trained target neural network model.
  • FIG. 1b is the training flowchart of the neural network model. First, initial samples are obtained from the training set and filtered; the network model is then trained on the selected samples, the loss is calculated, and the neural network model is updated; these steps are iterated until the final model is obtained.
  • An image super-resolution processing method acquires an image to be processed and performs super-resolution processing on it through a target neural network model; the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering the original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same.
  • iteratively training the pyramid neural network model with the target image sample set includes: establishing a pyramid neural network model; inputting a low-resolution image in the target sample set into the pyramid neural network to obtain a predicted image; and training the parameters of the pyramid neural network according to an objective function formed by the predicted image and the high-resolution image corresponding to the low-resolution image, where the objective function includes one or more of the function L2+αL1, the function L1+βLP, and the function L2+γL1+δLP
  • the L2 is the L2 norm
  • the L1 is the L1 norm
  • the LP is the LP norm
  • the α, β, γ, and δ are the weights of the regularization factors
  • the L1 norm is Σ_i |x_i|, the L2 norm is Σ_i x_i², and the LP norm is Σ_i |x_i|^p, where x_i is the difference between the predicted image and the high-resolution image corresponding to the low-resolution image.
  • when p is greater than 0 and less than 1, a small x_i produces a relatively large response in the LP norm. Therefore, using the LP norm in the objective function makes smaller differences output a larger loss, improving the sensitivity of the network to weak differences.
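  • A small numeric check of this property (a sketch: the choice p = 0.5 and the residual magnitude 0.01 are illustrative, not values from the disclosure):

```python
import numpy as np

def l1(x):
    return np.sum(np.abs(x))          # L1 norm

def l2(x):
    return np.sum(x ** 2)             # L2 norm as used in the loss

def lp(x, p):
    return np.sum(np.abs(x) ** p)     # LP norm, 0 < p < 1 of interest

# A weak residual: ten pixel differences of 0.01 each.
small = np.full(10, 0.01)

# With p = 0.5, each |0.01|^0.5 = 0.1, so the LP term is ten times
# larger than the L1 term, amplifying faint texture errors.
print(round(float(l1(small)), 3), round(float(lp(small, 0.5)), 3))
```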
  • training the parameters of the pyramid neural network according to the objective function formed by the predicted image and the corresponding high-resolution image may proceed as follows: the objective function is L2+αL1, where L2 is the L2 norm Σ_i x_i², L1 is the L1 norm Σ_i |x_i|, and x_i is the difference between the predicted image and the high-resolution image corresponding to the low-resolution image; or, the initial objective function is L2+αL1 and, after the objective value no longer decreases significantly, the objective function is switched to L1+βLP, where LP is the LP norm Σ_i |x_i|^p and α and β are the weights of the regularization factors; or, the objective function is L2+γL1+δLP, where γ and δ are the weights of the regularization factors and can be tuned by adjusting the weights.
  • training the parameters of the pyramid neural network based on the objective function formed by the predicted image and the high-resolution image corresponding to the low-resolution image includes: training the parameters based on the objective function L2+αL1 with the learning rate set to a first value; and, after the decrease in the output value of the objective function becomes smaller than a predetermined value (i.e., a predefined value), continuing to train the parameters based on the objective function L1+βLP with the learning rate set to a second value, where the second value is smaller than the first value, and α and β increase as the learning rate decreases.
  • the decreasing value of the output value of the objective function refers to the difference between the current output value of the objective function and the previous output value of the objective function.
  • the first value may be a set larger learning rate.
  • the second value may be a set relatively small learning rate.
  • the function L2+αL1 is used as the objective function, and training uses a larger learning rate.
  • then L1+βLP is used as the objective function, and training continues at a smaller learning rate.
  • training the parameters of the pyramid neural network according to the objective function formed by the predicted image and the high-resolution image corresponding to the low-resolution image includes: training the parameters according to the objective function L2+γL1+δLP formed by the predicted image and the corresponding high-resolution image; and, within each round of training or between two rounds of training, adjusting γ and δ to approach the global optimum.
  • the L2+γL1+δLP norm constraint is selected globally as the objective function, and the approximation to the global optimum is achieved by dynamically adjusting the weights of the regularization factors within each round of training or between two rounds of training.
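  • The two-stage schedule can be sketched on a toy problem. This is an illustration only: the "network" is two scalar parameters fitted by numerical gradient descent, and the learning rates, the switch threshold, and the rule for growing the regularization weight are hypothetical stand-ins for values the disclosure leaves open.

```python
import numpy as np

def loss_phase1(diff, alpha=0.1):
    """Stage-1 objective: L2 + alpha * L1."""
    return np.sum(diff ** 2) + alpha * np.sum(np.abs(diff))

def loss_phase2(diff, beta, p=0.5, eps=1e-8):
    """Stage-2 objective: L1 + beta * LP with 0 < p < 1."""
    return np.sum(np.abs(diff)) + beta * np.sum((np.abs(diff) + eps) ** p)

def train(target, steps=300):
    """Start with L2 + alpha*L1 at a larger learning rate; once the
    per-step drop in the objective falls below a threshold, switch to
    L1 + beta*LP at a smaller rate, growing beta as the rate shrinks."""
    w = np.zeros_like(target)
    lr, beta, phase, prev = 0.1, 0.01, 1, None
    for _ in range(steps):
        diff = w - target
        loss = loss_phase1(diff) if phase == 1 else loss_phase2(diff, beta)
        if phase == 1 and prev is not None and prev - loss < 1e-4:
            phase = 2
            lr, beta = 0.01, beta * 10  # smaller lr, larger reg. weight
        f = (lambda v: loss_phase1(v)) if phase == 1 \
            else (lambda v: loss_phase2(v, beta))
        # numerical gradient keeps the sketch framework-free
        grad = np.zeros_like(w)
        for i in range(w.size):
            d = np.zeros_like(w)
            d[i] = 1e-5
            grad[i] = (f(diff + d) - f(diff - d)) / 2e-5
        w = w - lr * grad
        prev = loss
    return w, phase

w, phase = train(np.array([0.5, -0.3]))
print(phase)
```

In a real setting the parameter update would be backpropagation through the pyramid network, but the control flow of the schedule is the same.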
  • the embodiments of the present disclosure relate to the setting of the objective function in the training process and the parameter adjustment of the training process.
  • a multi-norm regularization method is used to improve the contribution to the insignificant texture part.
  • the dynamic adjustment improves the convergence efficiency of the network on the one hand and prevents the network from falling into a local minimum on the other, updating the network hyperparameters more effectively under multi-norm regularization.
  • analysis methods in the prior art have low stability, occasionally producing components or textures not contained in the original image, and their objective indicators, such as PSNR (Peak Signal to Noise Ratio) and SSIM (Structural SIMilarity), are lower.
  • when the L2 norm or L1 norm is used as the objective function, network training results under this constraint usually have high objective quality. However, pursuing quantitative indicators usually leads to a substantial increase in network complexity, and the added demand for computing power brings only limited improvement, which is unfavorable for deployment. Moreover, the preference for increasing depth ignores the influence of network structure and data.
  • the embodiments of the present disclosure jointly incorporate the LP norm and adopt a dynamic adjustment method during training to improve training convergence efficiency and to focus on small targets.
  • the sample information content includes: gradient information of the image sample, or variance within the image sample.
  • the gradient information represents the richness of the texture of the image sample.
  • the Sobel gradient information is calculated for each sample and summed, and the sum is used as a representation of the sample information content.
  • the variance within the image sample represents the degree of change of the image sample.
  • the variance in each sample obtained at random is calculated and used as an indication of the information content of the sample.
  • the sample information content is:
  • PIC(a_i) = Σ_{x=1..w} Σ_{y=1..h} abs(Sobel(a_i)(x, y)) / max(abs(Sobel(a_i)))
  • where x and y are the abscissa and ordinate, w and h represent the width and height of the selected image sample, a_i is the image sample, Sobel represents the Sobel operator, max represents the maximum value, and abs represents the absolute value.
  • the abscissa and the ordinate may be coordinates in a Cartesian rectangular coordinate system.
  • the sample information content is:
  • PIC(a_i) = Var(a_i) / max(abs(a_i))
  • where the variance is taken over the abscissa and ordinate, w and h represent the width and height of the selected image sample, a_i is the image sample, Var represents the variance, max represents the maximum value, and abs represents the absolute value.
  • the abscissa and the ordinate may be coordinates in a Cartesian rectangular coordinate system.
  • obtaining the target image samples by filtering the original image samples according to sample information content intervals, with the same number of image samples in each interval, includes: dividing the sample information content space corresponding to the target image sample set into at least two sample information content intervals, each corresponding to a different information content range and each holding a target number of image samples; obtaining the information content of an image sample; determining the corresponding sample information content interval according to that information content; and, if the number of samples in that interval is less than the target number, determining the image sample as a target image sample.
  • the sample information content space corresponding to the target image sample set is divided to obtain at least two sample information content intervals.
  • the sample information content space corresponding to the target image sample set can be divided into 4 sample information content intervals, as shown in the following formula:
  • [0, pic_0] represents the starting and ending points of PIC in the first interval
  • [pic_0, pic_1] represents the starting and ending points of PIC in the second interval
  • [pic_1, pic_2] represents the starting and ending points of PIC in the third interval
  • [pic_2, ∞] represents the starting and ending points of PIC in the fourth interval.
  • the number of samples corresponding to the target image sample set is set to N and divided evenly across the four intervals, so that the number of image samples in each interval is N/4, as shown in the following formula:
  • Interval_j = N/4, j = 1, 2, 3, 4
  • where N is the number of samples corresponding to the target image sample set
  • and Interval_j represents the number of samples allocated to each interval after the overall PIC space is divided.
  • the sample information content space corresponding to the target image sample set is divided into 4 effective intervals, the number of samples corresponding to the target image sample set is N, and the number of samples in each interval is selected as N/4.
  • the number allocated in the first interval is N/4.
  • FIG. 2 is a schematic structural diagram of an image super-resolution processing device provided in the present disclosure.
  • an image super-resolution processing apparatus in an embodiment of the present disclosure may be integrated on a device.
  • the device includes: an acquisition module 21 configured to acquire an image to be processed; and a processing module 22 configured to perform super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering the original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same.
  • the device provided in this embodiment is used to implement the method of the embodiment shown in FIG. 1.
  • the implementation principle and technical effect of the device provided in this embodiment are similar to those of the method of the embodiment shown in FIG. 1, and will not be repeated here.
  • the processing module 22 may be configured to: establish a pyramid neural network model; input a low-resolution image in the target sample set into the pyramid neural network to obtain a predicted image; train the parameters of the pyramid neural network according to an objective function formed by the predicted image and the high-resolution image corresponding to the low-resolution image, where the objective function includes one or more of the function L2+αL1, the function L1+βLP, and the function L2+γL1+δLP, L2 being the L2 norm, L1 the L1 norm, LP the LP norm, and α, β, γ, δ the weights of the regularization factors; and return to the operation of inputting low-resolution images in the target sample set into the pyramid neural network to obtain predicted images, until the target neural network model is obtained.
  • the processing module 22 may be configured to train the parameters of the pyramid neural network based on the objective function L2+αL1 with the learning rate set to a first value and, after the decrease in the output value of the objective function is less than a predetermined value, continue training based on the objective function L1+βLP with the learning rate set to a second value, where the second value is less than the first value, and α and β increase as the learning rate decreases.
  • the decreasing value of the output value of the objective function refers to the difference between the current output value of the objective function and the previous output value of the objective function.
  • the processing module 22 may be configured to train the parameters of the pyramid neural network according to the objective function L2+γL1+δLP formed by the predicted image and the high-resolution image corresponding to the low-resolution image and, within each round of training or between two rounds of training, adjust γ and δ to approach the global optimum.
  • the sample information content includes: gradient information of the image sample, or variance within the image sample.
  • the sample information content is:
  • PIC(a_i) = Σ_{x=1..w} Σ_{y=1..h} abs(Sobel(a_i)(x, y)) / max(abs(Sobel(a_i)))
  • where x and y are the abscissa and ordinate, w and h represent the width and height of the selected image sample, a_i is the image sample, Sobel represents the Sobel operator, max represents the maximum value, and abs represents the absolute value.
  • the abscissa and the ordinate may be coordinates in a Cartesian rectangular coordinate system.
  • the sample information content is:
  • PIC(a_i) = Var(a_i) / max(abs(a_i))
  • where the variance is taken over the abscissa and ordinate, w and h represent the width and height of the selected image sample, a_i is the image sample, Var represents the variance, max represents the maximum value, and abs represents the absolute value.
  • the abscissa and the ordinate may be coordinates in a Cartesian rectangular coordinate system.
  • the processing module 22 may be configured to: divide the sample information content space corresponding to the target image sample set into at least two sample information content intervals, each corresponding to a different information content range and each holding a target number of image samples; obtain the information content of an image sample; determine the corresponding sample information content interval according to that information content; and, if the number of samples in that interval is less than the target number, determine the image sample as a target image sample.
  • An image super-resolution processing device provided by the present disclosure acquires an image to be processed and performs super-resolution processing on it through a target neural network model; the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set.
  • the target image samples are obtained by filtering the original image samples according to sample information content intervals, and the number of image samples in each sample information content interval is the same. The problem of unbalanced sample ratios in prior-art sampling is thereby solved, and the bias of the network is effectively alleviated.
  • the embodiments of the present disclosure filter the samples based on their sample information content, improving the statistical balance of the samples entering network training.
  • Embodiments of the present disclosure adopt a pyramid structure as the main structure of the network.
  • Low-level, mid-level, and high-level semantic information is introduced into the up-sampling module simultaneously.
  • As convolution deepens, semantic information shifts toward higher levels, and discarding the low-level and mid-level information would reduce the attention paid to the "weak" details of the image in the reconstruction results. Therefore, on the one hand, the pyramid structure condenses the extracted information and reduces the computational cost.
  • On the other hand, multiple kinds of information are fused to improve the reconstruction quality. The disclosure also covers the objective function setting and hyperparameter adjustment during training: a multi-norm regularization method is used during training to increase the contribution of non-salient texture parts.
  • Embodiments of the present disclosure address the problem of the imbalanced proportion of sampled output samples in existing super-resolution methods.
  • Because of the similarity of smooth regions, the proportion of smooth regions after data cropping is statistically significantly higher than that of the textured parts.
  • The deepening of the network structure leads to a lack of shallow information, causing an imbalance between shallow, mid-level, and high-level information in the reconstruction process.
  • The solution of the present disclosure can effectively mitigate the bias of the network, alleviate the loss of detail in super-resolution reconstruction, and correct the defect of disordered orientation in reconstructed texture.
  • Embodiments of the present disclosure make full use of the various feature information in the samples; the effectiveness of supervised deep learning largely depends on the richness of the samples.
  • Although the available sample data has grown greatly in absolute quantity, the relative numbers across the data space are not balanced, especially for high-resolution samples, which often contain large areas of sky, glass, or other slowly varying regions.
  • Embodiments of the present disclosure reduce this data imbalance by analyzing the information entropy of the samples, providing better information input for the neural network model.
  • The multi-scale network architecture feeds multi-level information into the up-sampling module using the pyramid method, reducing the tendency, as depth increases, for subsequent features to be biased toward high-level features at the expense of the proportion of shallow and mid-level features.
  • All the multi-layer outputs of the pyramid are fed into the up-sampling module to improve its data richness.
  • As for multi-norm regularization: compared with the L2 or L1 norm commonly used as the objective function in current supervised methods, the present disclosure proposes jointly incorporating the LP norm, adjusted dynamically during training, to improve training convergence efficiency and the attention paid to small targets.
  • FIG. 3 is a schematic structural diagram of a device provided by the present disclosure.
  • The device provided by the present disclosure includes one or more processors 31 and a storage device 32; there may be one or more processors 31 in the device.
  • In FIG. 3, one processor 31 is taken as an example. The storage device 32 is configured to store one or more programs; the one or more programs are executed by the one or more processors 31, so that the one or more processors 31 implement the method described with reference to FIG. 1 in the embodiments of the present disclosure.
  • The device further includes a communication device 33, an input device 34, and an output device 35.
  • the processor 31, the storage device 32, the communication device 33, the input device 34, and the output device 35 in the device may be connected by a bus or other methods.
  • the connection by a bus is taken as an example.
  • the input device 34 may be configured to receive input digital or character information, and generate key signal input related to user settings and function control of the device.
  • the output device 35 may include a display device such as a display screen.
  • the communication device 33 may include a receiver and a transmitter.
  • the communication device 33 is configured to transmit and receive information under the control of the processor 31.
  • the information includes but is not limited to uplink authorization information.
  • The storage device 32, as a computer-readable storage medium, can be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the image super-resolution processing method described with reference to FIG. 1 of the embodiments of the present disclosure (for example, the acquisition module 21 and the processing module 22 in the image super-resolution processing device).
  • the storage device 32 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of the device, and the like.
  • the storage device 32 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the storage device 32 may further include a memory provided remotely with respect to the processor 31, and these remote memories may be connected to the device through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • The embodiments of the present disclosure further provide a storage medium storing a computer program which, when executed by a processor, implements the image super-resolution processing method according to the embodiments of the present disclosure, the method including: obtaining an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
  • the computer storage media of the embodiments of the present disclosure may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or a combination of any of the above.
  • More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to: electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium; such a computer-readable medium may send, propagate, or transmit the program for use by, or in combination with, the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • the computer program code for performing the operations of the present disclosure can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • user equipment encompasses any suitable type of wireless user equipment, such as a mobile phone, a portable data processing device, a portable web browser, or a vehicle-mounted mobile station.
  • the various embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
  • some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device, although the present disclosure is not limited thereto.
  • Computer program instructions can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or written in any combination of one or more programming languages Source code or object code.
  • the block diagram of any logic flow in the drawings of the present disclosure may represent program steps, or may represent interconnected logic circuits, modules, and functions, or may represent a combination of program steps and logic circuits, modules, and functions.
  • the computer program can be stored on the memory.
  • The memory can be of any type suitable for the local technical environment and can be implemented using any suitable data storage technology, such as, but not limited to, read-only memory (ROM), random access memory (RAM), and optical memory devices and systems (Digital Video Disc (DVD) or Compact Disk (CD)).
  • Computer-readable media may include non-transitory storage media.
  • The data processor can be of any type suitable for the local technical environment, such as, but not limited to, general-purpose computers, special-purpose computers, microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC), programmable logic devices (FPGA), and processors based on multi-core processor architectures.


Abstract

The present disclosure provides an image super-resolution processing method, apparatus, device, and storage medium. The image super-resolution processing method includes: obtaining an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.

Description

Image Super-Resolution Processing Method, Apparatus, Device, and Storage Medium
Technical Field
The present disclosure relates to, but is not limited to, the field of computer technology.
Background
In comparisons of reconstruction quality, super-resolution reconstruction techniques based on deep learning have shown a very clear advantage over traditional methods. The mainstream of current super-resolution reconstruction methods is moving toward deep learning.
Current super-resolution methods based on deep neural networks fall into two broad categories: one is based on Generative Adversarial Networks (GAN), and the other is based on supervised fully convolutional neural networks. The former, using a perceptron cost function within the GAN framework, makes the output lie in the high-dimensional space occupied by the Ground Truth (GT), with the goal that the discriminator cannot distinguish generated images from real images. Supervised methods generate low-resolution and high-resolution pairs (LHP) through a known degradation model and rely on an elaborate network structure to model and recognize this LHP.
Summary
In a first aspect, embodiments of the present disclosure provide an image super-resolution processing method, including: obtaining an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
In a second aspect, embodiments of the present disclosure provide an image super-resolution processing apparatus, including: an acquisition module configured to obtain an image to be processed; and a processing module configured to perform super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
In a third aspect, embodiments of the present disclosure provide a device, including: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any one of the methods in the embodiments of the present disclosure.
In a fourth aspect, embodiments of the present disclosure provide a storage medium storing a computer program which, when executed by a processor, implements any one of the methods in the embodiments of the present disclosure.
Further description of the above embodiments and other aspects of the present disclosure, and of ways of implementing them, is provided in the brief description of the drawings, the detailed description, and the claims.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an image super-resolution processing method provided by the present disclosure;
FIG. 1a is an architecture diagram of a deep neural network provided by the present disclosure;
FIG. 1b is a training flowchart of the neural network model provided by the present disclosure;
FIG. 2 is a schematic structural diagram of an image super-resolution processing apparatus provided by the present disclosure;
FIG. 3 is a schematic structural diagram of a device provided by the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the present disclosure clearer, embodiments of the present disclosure are described in detail below with reference to the drawings. It should be noted that, where there is no conflict, the embodiments of the present disclosure and the features within them may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
As stated above, current super-resolution methods based on deep neural networks fall into two broad categories: one is based on Generative Adversarial Networks (GAN), and the other is based on supervised fully convolutional neural networks. The former, using a perceptron cost function within the GAN framework, makes the output lie in the high-dimensional space occupied by the Ground Truth (GT), with the goal that the discriminator cannot distinguish generated images from real images. Supervised methods generate low-resolution and high-resolution pairs (LHP) through a known degradation model and rely on an elaborate network structure to model and recognize this LHP.
Both categories of methods generally crop each dataset randomly into fixed-size patches, assemble multiple patches into a sample set, and feed the sample set into the network for forward and backward propagation to update the network parameters. In natural images this approach suffers from a significant imbalance problem, mainly because smooth regions statistically account for a higher proportion than the various textures. In other words, unconstrained random sampling feeds the network more smooth regions, biasing the network toward adopting the LHPs of smooth regions.
Therefore, the present disclosure provides an image super-resolution processing method, apparatus, device, and storage medium that substantially obviate one or more of the problems due to the limitations and disadvantages of the prior art. According to an embodiment of the present disclosure, the problem of the imbalanced proportion of sampled output samples in the prior art can be solved.
In an exemplary embodiment, FIG. 1 is a schematic flowchart of an image super-resolution processing method provided by the present disclosure. The method is applicable to performing super-resolution processing on a low-resolution image to be processed. The method may be executed by the image super-resolution processing apparatus provided by the present disclosure, which may be implemented in software and/or hardware and integrated on a device.
As shown in FIG. 1, the image super-resolution processing method provided by the present disclosure includes steps S110-S120.
In step S110, an image to be processed is obtained.
The image to be processed is a low-resolution image.
Exemplarily, the image to be processed may be captured by a camera or obtained by screen capture; the embodiments of the present disclosure do not limit the manner of obtaining the image to be processed.
In step S120, super-resolution processing is performed on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
The pyramid neural network model cascades multiple basic feature extraction modules to obtain features ranging from shallow to high-level. It should be noted that this pyramid network structure, on the one hand, preserves low-level, mid-level, and high-level features so that all of them can participate in the up-sampling process, and on the other hand improves the stability of the network thanks to multi-layer skip connections. For the basic feature extraction module φ, features from shallow to high-level are obtained through multi-layer cascading. The basic feature extraction module may be a residual network or a U-Net structural model. The features extracted by each module serve as the input of the next level of feature extraction, and finally the features of every layer all serve as inputs of the up-sampling module, as shown below:
Input: [φ_0, φ_1, ..., φ_n],
where φ is a basic feature extraction module.
Exemplarily, FIG. 1a shows the architecture of the deep neural network in an embodiment of the present disclosure: a low-resolution image is input into the neural network model to obtain a high-resolution image. In the neural network model, shallow feature extraction is performed first. The basic feature extraction modules include module φ_0, module φ_1, ..., module φ_n; the features extracted by each module serve as the input of the next level of feature extraction, and finally the features of every layer all serve as inputs of the up-sampling module.
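The cascade just described, where each level φ consumes the previous level's output and every level's output is handed to the up-sampling module, can be sketched in a few lines. This is an illustrative sketch only: the toy `make_level` functions and the averaging `upsample` are hypothetical stand-ins for the residual/U-Net feature extraction modules and the real up-sampling module, which the disclosure does not specify in code.

```python
# Sketch of the pyramid cascade: each level feeds the next, and ALL
# level outputs [phi_0, phi_1, ..., phi_n] go into the up-sampling module,
# not just the last one.

def make_level(i):
    # Hypothetical stand-in for a feature extraction module phi_i
    # (a residual block or U-Net in the actual model).
    def extract(features):
        return [f + i + 1 for f in features]  # toy transformation
    return extract

def upsample(level_outputs):
    # Hypothetical up-sampling module: here it simply fuses all levels,
    # standing in for reconstruction from multi-level features.
    return [sum(vals) / len(vals) for vals in zip(*level_outputs)]

def pyramid_forward(x, n_levels=3):
    levels = [make_level(i) for i in range(n_levels)]
    outputs = []
    feat = x
    for phi in levels:
        feat = phi(feat)      # next level consumes the previous output
        outputs.append(feat)  # every level is kept for the upsampler
    return upsample(outputs)
```

The key structural point is that `outputs` collects every intermediate feature, so shallow and mid-level information is not discarded as depth grows.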
In the embodiments of the present disclosure, the output of super-resolution reconstruction is not limited to a finite number of classification targets, unlike deep neural networks whose purpose is classification and recognition. In terms of input and output dimensionality, super-resolution is a many-to-many mapping, as shown below:
Γ: X → q
K: X → Y
where Γ denotes a classification/recognition network and X denotes the input image, which may be two-dimensional (a grayscale image), three-dimensional (e.g., containing color information), or even four-dimensional (e.g., video containing color information). q denotes a scalar: for target recognition, q may be a Boolean variable; for multi-class problems, q may be a scalar of finite size. K denotes the super-resolution mapping, where Y denotes the output high-resolution result. In most cases Y should keep the same dimensionality as X, with a magnitude in each dimension no smaller than that of X, so it can be regarded as an information-expansion process. In terms of information dimensionality and quality improvement, K needs to attend more broadly to all input information.
Traditional super-resolution input-sample extraction randomly selects a batch of samples {a_0, a_1, ..., a_n} from a sample set A, randomly crops each sample a_i, performs data augmentation as needed, and then feeds the result into the network for training. This approach simply migrates the practice of classification/recognition networks and does not account for the similarity of X in low-level information: most images contain large smooth areas (a phenomenon more pronounced in high-resolution images), and smooth regions have high mutual similarity. During network training, this similarity manifests as the input statistically containing a higher proportion of smooth regions than of other textured regions, and this imbalance makes the super-resolution neural network tend to output smooth results. Such results contradict the ground-truth image information and cause the loss of detail in regions that were originally sharp.
In the embodiments of the present disclosure, the target image samples are obtained by filtering original image samples according to sample information content intervals, i.e., a filtering method is used to balance the samples. Exemplary implementations are as follows. One option is to use gradient information to characterize the texture richness of a sample: compute the Sobel gradient information of each image sample, sum the gradient information, and take the sum as the sample information content. Another option is to use variance, which likewise characterizes the degree of variation of a sample: compute the variance within each randomly obtained sample and then assign the sample to the corresponding interval according to the variance value, until the samples in every interval are sufficient. Since samples are obtained randomly every time and a shuffle operation is used between training rounds, the randomness of the samples is guaranteed at the same time.
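The two information-content measures described above (summed Sobel gradient magnitude, and patch variance) can be sketched as follows. This is a pure-Python illustration; the function names `pic_sobel` and `pic_variance` are our own, and the patch values are toy data, not anything from the disclosure.

```python
# Sketch of two sample-information-content (PIC) measures: summed Sobel
# gradient magnitude (texture richness) and patch variance. A smooth
# patch scores low on both; a textured/varying patch scores higher.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def pic_sobel(patch):
    """Sum of absolute Sobel responses over the patch interior."""
    h, w = len(patch), len(patch[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * patch[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * patch[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            total += abs(gx) + abs(gy)
    return total

def pic_variance(patch):
    """Variance of pixel values as an alternative information measure."""
    vals = [v for row in patch for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

flat = [[5.0] * 4 for _ in range(4)]                 # smooth region
ramp = [[x * 3.0 for x in range(4)] for _ in range(4)]  # textured region
```

Filtering by either score keeps smooth patches from dominating the batches, which is the balance property the disclosure is after.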
Exemplarily, the whole sample information content space corresponding to the target image sample set is divided to obtain at least one sample information content interval; each interval has a start point and an end point of sample information content, and a sample count is set for each interval. The sample information content of an original image sample is obtained, and the sample information content interval corresponding to it is determined. After the corresponding interval is found, the number of samples already in that interval is obtained: if it has not reached the set count, the sample is selected into the interval; if it has reached the set count, the sample is not selected.
Exemplarily, the target image samples are obtained after filtering the original image samples, the target neural network model is obtained by iteratively training the pyramid neural network model with the target image sample set, and super-resolution processing is performed on the image to be processed through the trained target neural network model.
FIG. 1b shows the training flow of the neural network model: first obtain initial samples from the training set, select samples, train the network model with the selected samples, compute the loss, update the neural network model, and iterate the above steps until the final model is obtained.
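The Fig. 1b loop (fetch samples, select, train, compute the loss, update, iterate) can be sketched with a toy one-parameter model. `select_samples` is a hypothetical placeholder for the information-content-balanced selection; the real model and loss are of course a pyramid network and a multi-norm objective, not the scalar predictor used here.

```python
# Sketch of the Fig. 1b training flow: fetch samples, select a subset,
# compute the loss, update the model, iterate. The model is a toy scalar
# predictor y = w * x trained with mean squared error.
import random

def select_samples(pool, k):
    # Placeholder for PIC-balanced selection; here: plain random choice.
    return random.sample(pool, k)

def train(pool, rounds=200, lr=0.05):
    w = 0.0                              # toy model parameter
    for _ in range(rounds):
        random.shuffle(pool)             # shuffle between training rounds
        batch = select_samples(pool, k=4)
        # Gradient of the mean squared error between w*x and target y:
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad                   # update the model
    return w

random.seed(0)
pool = [(x, 2.0 * x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
w = train(pool)
```

The loop recovers the underlying relation (w close to 2.0), showing the iterate-until-converged structure of the flowchart.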
The image super-resolution processing method provided by the present disclosure obtains an image to be processed and performs super-resolution processing on it through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples. This solves the problem of the imbalanced proportion of sampled output samples in the prior art and effectively mitigates the bias of the network.
On the basis of the above embodiments, variant embodiments are proposed. For brevity, only the differences from the above embodiments are described in the variant embodiments.
In one embodiment, iteratively training the pyramid neural network model with the target image sample set includes: establishing a pyramid neural network model; inputting the low-resolution images in the target sample set into the pyramid neural network to obtain a predicted image; training the parameters of the pyramid neural network with an objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image, where the objective function includes one or more of the functions L2+αL1, L1+θLP, and L2+μL1+βLP, where L2 is the L2 norm, L1 is the L1 norm, LP is the LP norm, and α, θ, μ, β are weights of the regularization factors; and returning to the operation of inputting the low-resolution images in the target sample set into the pyramid neural network to obtain a predicted image, until the target neural network model is obtained.
The L1 norm is ||x||_1 = Σ_i |x_i|, the L2 norm is ||x||_2 = (Σ_i x_i^2)^(1/2), and the LP norm is ||x||_p = (Σ_i |x_i|^p)^(1/p), where x_i is the difference between the predicted image and the high-resolution image corresponding to the low-resolution image.
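The three norms can be written down directly; in this sketch, `residual` is the flattened per-pixel difference between the predicted and ground-truth images (toy values, not data from the disclosure):

```python
# L1, L2 and LP norms of the residual x = prediction - ground truth.
# With 0 < p < 1 the LP norm responds more strongly to small residuals,
# which is the motivation given in the text for joining it in the loss.

def l1(x):
    return sum(abs(v) for v in x)

def l2(x):
    return sum(v * v for v in x) ** 0.5

def lp(x, p):
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

residual = [0.01, -0.02, 0.005]   # small "weak texture" differences
```

For these small residuals, lp(residual, 0.5) exceeds l1(residual), which in turn exceeds l2(residual), so the p < 1 term amplifies exactly the weak differences the L2 term underweights.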
Specifically, when p is greater than 0 and less than 1, x_i produces a larger response under the LP norm, so using the LP norm as the objective function makes smaller differences yield a larger loss, improving the network's sensitivity to weak, small differences.
Specifically, training the parameters of the pyramid neural network with an objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image may proceed as follows. The objective function may be L2+αL1, where L2 is the L2 norm ||x||_2 = (Σ_i x_i^2)^(1/2), L1 is the L1 norm ||x||_1 = Σ_i |x_i|, and x_i is the difference between the predicted image and the high-resolution image corresponding to the low-resolution image. Alternatively, the objective function may initially be L2+αL1 and, after the objective function no longer decreases noticeably, become L1+θLP, where LP is the LP norm ||x||_p = (Σ_i |x_i|^p)^(1/p) and α and θ are weights of the regularization factors. The objective function may also be L2+μL1+βLP, where μ and β are weights of the regularization factors, and the approximation of the global optimum is achieved by adjusting the weights.
In one embodiment, training the parameters of the pyramid neural network with the objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image includes: training the parameters of the pyramid neural network based on the objective function L2+αL1 and a learning rate set to a first value; and, after the decrease in the output value of the objective function becomes smaller than a predetermined (i.e., predefined) value, continuing to train the parameters of the pyramid neural network based on the objective function L1+θLP and a learning rate set to a second value, where the second value is smaller than the first value, and α and θ increase as the learning rate decreases. Here, the decrease in the output value of the objective function refers to the difference between the current output value of the objective function and its previous output value.
The first value may be a relatively large preset learning rate, and the second value may be a relatively small preset learning rate.
Specifically, in the initial stage of training, L2+αL1 is used as the objective function with a larger learning rate. After the objective function no longer decreases noticeably, L1+θLP is used as the objective function and training continues at a relatively small learning rate.
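The staged training just described (L2+αL1 at a larger learning rate, then L1+θLP at a smaller one once the objective stops dropping noticeably) can be sketched as a schedule. The threshold, learning rates, and simulated loss curve below are illustrative values, not values from the disclosure.

```python
# Sketch of the two-stage objective schedule: switch from "L2 + a*L1" at
# a large learning rate to "L1 + t*LP" at a small one as soon as the
# per-step drop of the objective falls below a threshold.

def staged_schedule(losses, threshold=0.01, lr_large=1e-3, lr_small=1e-4):
    """Return one (objective, lr) pair per step for a sequence of losses."""
    plan, stage, prev = [], 1, None
    for loss in losses:
        if stage == 1 and prev is not None and prev - loss < threshold:
            stage = 2                     # objective stopped improving much
        plan.append(("L2+aL1", lr_large) if stage == 1
                    else ("L1+tLP", lr_small))
        prev = loss
    return plan

# Simulated, steadily flattening loss curve:
plan = staged_schedule([1.0, 0.8, 0.65, 0.645, 0.644, 0.643])
```

On this curve the switch happens at the fourth step, where the drop (0.005) first falls below the 0.01 threshold.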
In one embodiment, training the parameters of the pyramid neural network with the objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image includes: training the parameters of the pyramid neural network with the objective function L2+μL1+βLP formed from the predicted image and the high-resolution image corresponding to the low-resolution image; and adjusting μ and β in each training round, or between two training rounds, thereby achieving an approximation of the global optimum.
Specifically, the L2+μL1+βLP norm constraint is chosen globally as the objective function, and the global optimum is approximated by dynamically adjusting the weights of the regularization factors in each training round or between two training rounds.
The embodiments of the present disclosure concern the setting of the objective function during training and the adjustment of parameters during the training process. During training, a multi-norm regularization method is used to increase the contribution of non-salient texture parts. Meanwhile, the weights of the regularization factors are adjusted during training according to the convergence of the network; this dynamic adjustment improves the convergence efficiency of the network on the one hand, and on the other hand prevents the network from getting stuck in a local minimum and updates the network hyperparameters more effectively. Existing analytic methods have low stability, occasionally producing components or textures not contained in the original image, and their objective metrics, such as PSNR (Peak Signal to Noise Ratio) and SSIM (Structural SIMilarity), are low. The L2 or L1 norm is usually taken as the objective function, and network training results under this constraint usually have high objective quality. In the pursuit of quantitative metrics, however, network complexity usually increases substantially, and the added computational demand brings only limited quality gains, which is unfavorable for deployment. Moreover, the bias toward increasing depth ignores the influence of network structure and data. Compared with the commonly used L2 or L1 norm as the objective function, the embodiments of the present disclosure jointly incorporate the LP norm and adjust it dynamically during training, improving training convergence efficiency and the attention paid to small targets.
In one embodiment, the sample information content includes: gradient information of the image sample, or the variance within the image sample.
The gradient information characterizes the texture richness of the image sample.
Specifically, the Sobel operator gradient information of each sample is computed and summed, and the summed result is taken as the representation of the sample information content.
The variance within the image sample characterizes the degree of variation of the image sample.
Specifically, the variance within each randomly obtained sample is computed and taken as the representation of the sample information content.
In one embodiment, the sample information content is:
PIC = ||max(abs(Sobel(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, a_i is the image sample, Sobel denotes the Sobel operator, max denotes taking the maximum value, and abs denotes taking the absolute value. Here, the abscissa and ordinate may be coordinates in a Cartesian rectangular coordinate system.
In one embodiment, the sample information content is:
PIC = ||max(abs(Var(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, a_i is the image sample, Var denotes the variance, max denotes taking the maximum value, and abs denotes taking the absolute value. Here, the abscissa and ordinate may be coordinates in a Cartesian rectangular coordinate system.
In one embodiment, obtaining the target image samples by filtering the original image samples according to sample information content intervals, with each sample information content interval containing the same number of image samples, includes: dividing the sample information content space corresponding to the target image sample set to obtain at least two sample information content intervals, each corresponding to a different information content range, where each sample information content interval contains a target number of image samples; obtaining the information content of an image sample; determining the corresponding sample information content interval according to the information content of the image sample; and, if the number of samples in that sample information content interval is smaller than the target number, determining the image sample as a target image sample.
Dividing the sample information content space corresponding to the target image sample set to obtain at least two sample information content intervals may, for example, be dividing it to obtain four sample information content intervals, as shown below:
PIC = [[0, pic_0], [pic_0, pic_1], [pic_1, pic_2], [pic_2, ∞]];
where [0, pic_0] denotes the PIC start and end points of the first interval, [pic_0, pic_1] those of the second interval, [pic_1, pic_2] those of the third interval, and [pic_2, ∞] those of the fourth interval.
The number of samples corresponding to the target image sample set is set to N and divided evenly among the four intervals, so the number of image samples in each interval is N/4, as shown below:
Interval = [N/4, N/4, N/4, N/4];
where N is the number of samples corresponding to the target image sample set, and Interval denotes the number of samples allocated to each interval after the overall PIC space has been divided.
Specifically, the sample information content space corresponding to the target image sample set is divided into four effective intervals; the number of samples corresponding to the target image sample set is N, and the number of samples in each interval is chosen as N/4. For example, the first interval is allocated N/4 samples: when the PIC (sample information content) of a sampled patch lies within [0, pic_0] and the number of samples in that interval has not reached N/4, the sample is selected into the pending sample set. Once N/4 is reached, the interval is full, and sampled patches whose PIC lies in that interval are no longer selected. At this point, if other intervals are still short, random sampling is performed again and the information content of each sampled patch is measured, until the data in all intervals reach the full state.
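The four-interval filling procedure above can be sketched as follows. The PIC boundaries and the random PIC source are illustrative stand-ins for the calibrated statistics of a real dataset; a candidate is accepted only while its interval still has room, and sampling repeats until every interval holds N/4 samples.

```python
# Sketch of balanced selection over four PIC intervals: accept a sampled
# patch only if its interval is not yet full; keep drawing until all
# four intervals hold quota = N/4 samples each.
import random

def fill_intervals(draw_sample, boundaries, total_n, max_draws=100000):
    """draw_sample() -> (sample, pic); boundaries = [pic0, pic1, pic2]."""
    quota = total_n // 4
    buckets = [[] for _ in range(4)]
    for _ in range(max_draws):
        if all(len(b) >= quota for b in buckets):
            break                            # every interval is full
        sample, pic = draw_sample()
        # Interval index for [0,pic0], (pic0,pic1], (pic1,pic2], (pic2,inf)
        idx = sum(pic > b for b in boundaries)
        if len(buckets[idx]) < quota:        # interval not yet full
            buckets[idx].append(sample)
    return buckets

def draw():
    # Hypothetical PIC source: a random patch with a uniform PIC score.
    return (object(), random.uniform(0.0, 4.0))

random.seed(1)
buckets = fill_intervals(draw, boundaries=[1.0, 2.0, 3.0], total_n=40)
```

Rejected draws simply trigger another random sample, mirroring the "re-sample until all intervals are satisfied" step of the text.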
The present disclosure provides an image super-resolution processing apparatus. FIG. 2 is a schematic structural diagram of an image super-resolution processing apparatus provided by the present disclosure. As shown in FIG. 2, the image super-resolution processing apparatus in the embodiments of the present disclosure may be integrated on a device. The apparatus includes: an acquisition module 21 configured to obtain an image to be processed; and a processing module 22 configured to perform super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
The apparatus provided by this embodiment is used to implement the method of the embodiment shown in FIG. 1; its implementation principles and technical effects are similar to those of the method shown in FIG. 1 and are not repeated here.
On the basis of the above embodiments, variant embodiments are proposed. For brevity, only the differences from the above embodiments are described in the variant embodiments.
In one embodiment, the processing module 22 may be used to: establish a pyramid neural network model; input the low-resolution images in the target samples into the pyramid neural network to obtain a predicted image; train the parameters of the pyramid neural network with an objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image, where the objective function includes one or more of the functions L2+αL1, L1+θLP, and L2+μL1+βLP, where L2 is the L2 norm, L1 is the L1 norm, LP is the LP norm, and α, θ, μ, β are weights of the regularization factors; and return to the operation of inputting the low-resolution images in the target sample set into the pyramid neural network to obtain a predicted image, until the target neural network model is obtained.
In one embodiment, the processing module 22 may be used to: train the parameters of the pyramid neural network based on the objective function L2+αL1 and a learning rate set to a first value; and, after the decrease in the output value of the objective function becomes smaller than a predetermined value, continue to train the parameters of the pyramid neural network based on the objective function L1+θLP and a learning rate set to a second value, where the second value is smaller than the first value, and α and θ increase as the learning rate decreases. Here, the decrease in the output value of the objective function refers to the difference between the current output value of the objective function and its previous output value.
In one embodiment, the processing module 22 may be used to: train the parameters of the pyramid neural network with the objective function L2+μL1+βLP formed from the predicted image and the high-resolution image corresponding to the low-resolution image; and adjust μ and β in each training round, or between two training rounds, thereby achieving an approximation of the global optimum.
In one embodiment, the sample information content includes: gradient information of the image sample, or the variance within the image sample.
In one embodiment, the sample information content is:
PIC = ||max(abs(Sobel(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, a_i is the image sample, Sobel denotes the Sobel operator, max denotes taking the maximum value, and abs denotes taking the absolute value. Here, the abscissa and ordinate may be coordinates in a Cartesian rectangular coordinate system.
In one embodiment, the sample information content is:
PIC = ||max(abs(Var(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, a_i is the image sample, Var denotes the variance, max denotes taking the maximum value, and abs denotes taking the absolute value. Here, the abscissa and ordinate may be coordinates in a Cartesian rectangular coordinate system.
In one embodiment, the processing module 22 may be used to: divide the sample information content space corresponding to the target image sample set to obtain at least two sample information content intervals, each corresponding to a different information content range, where each sample information content interval contains a target number of image samples; obtain the information content of an image sample; determine the corresponding sample information content interval according to the information content of the image sample; and, if the number of samples in that sample information content interval is smaller than the target number, determine the image sample as a target image sample.
The image super-resolution processing apparatus provided by the present disclosure is used to obtain an image to be processed and perform super-resolution processing on it through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples. This solves the problem of the imbalanced proportion of sampled output samples in the prior art and effectively mitigates the bias of the network.
Specifically, for the training-sample collection process, the embodiments of the present disclosure filter the samples by attaching a sample information content to each sample, improving the statistical balance of the samples entering network training. First, a statistical metric for evaluating the information content of a sample is established; under this metric, the samples obtained by randomly cropping the dataset are calibrated for information content, and the metric is partitioned into segments to determine the number of samples required in each segment. Then a second selection is performed on the randomly cropped samples, controlling the number of samples in the finally selected target image sample set so that it stays balanced throughout training. In terms of network architecture design, the embodiments of the present disclosure adopt a pyramid structure as the main structure of the network and introduce low-level, mid-level, and high-level semantic information into the up-sampling module simultaneously. Considering that semantic information shifts toward higher levels as convolution deepens, and that discarding low-level and mid-level information would reduce the attention paid to the "weak" details of the image in the reconstruction results, the pyramid structure on the one hand condenses the extracted information and reduces the computational cost, and on the other hand fuses multiple kinds of information to improve the reconstruction quality. The objective function setting during training and the adjustment of hyperparameters during training are also covered: a multi-norm regularization method is used during training to increase the contribution of non-salient texture parts, and the weights of the regularization factors are adjusted during training according to the convergence of the network; this dynamic adjustment improves the convergence efficiency of the network on the one hand, and on the other hand prevents the network from getting stuck in a local minimum and updates the network hyperparameters more effectively.
The embodiments of the present disclosure address the imbalanced proportion of sampled output samples in existing super-resolution methods: because of the similarity of smooth regions, the proportion of smooth regions after data cropping is statistically significantly higher than that of the other textured parts. The deepening of the network structure leads to a lack of shallow information, causing an imbalance between shallow, mid-level, and high-level information in the reconstruction process. The solution of the present disclosure can effectively mitigate the bias of the network, alleviate the loss of detail in super-resolution reconstruction, and correct the defect of disordered orientation in reconstructed texture.
The embodiments of the present disclosure make full use of the various feature information in the samples. The effectiveness of supervised deep learning largely depends on the richness of the samples; although the available sample data has grown greatly in absolute quantity, the relative numbers across the data are not balanced, especially for high-resolution samples, which often contain large areas of sky, glass, or other slowly varying regions. The embodiments of the present disclosure reduce this data imbalance by analyzing the information entropy of the samples, providing better information input for the neural network model. The multi-scale network architecture feeds multi-level information into the up-sampling module using the pyramid method, reducing the tendency, as depth increases, for subsequent features to be biased toward high-level features at the expense of the proportion of shallow and mid-level features. The present disclosure feeds all the multi-layer outputs of the pyramid into the up-sampling module to improve its data richness. Multi-norm regularization is adopted: compared with the L2 or L1 norm commonly used as the objective function in current supervised methods, the present disclosure proposes jointly incorporating the LP norm, adjusted dynamically during training, to improve training convergence efficiency and the attention paid to small targets.
The present disclosure provides a device. FIG. 3 is a schematic structural diagram of a device provided by the present disclosure. As shown in FIG. 3, the device provided by the present disclosure includes one or more processors 31 and a storage device 32; there may be one or more processors 31 in the device, and one processor 31 is taken as an example in FIG. 3. The storage device 32 is configured to store one or more programs; the one or more programs are executed by the one or more processors 31, so that the one or more processors 31 implement the method described with reference to FIG. 1 in the embodiments of the present disclosure.
The device further includes a communication device 33, an input device 34, and an output device 35.
The processor 31, storage device 32, communication device 33, input device 34, and output device 35 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 3.
The input device 34 may be configured to receive input digital or character information and to generate key-signal input related to user settings and function control of the device. The output device 35 may include a display device such as a display screen.
The communication device 33 may include a receiver and a transmitter. The communication device 33 is configured to transmit and receive information under the control of the processor 31. The information includes, but is not limited to, uplink grant information.
As a computer-readable storage medium, the storage device 32 may be configured to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the image super-resolution processing method described with reference to FIG. 1 of the embodiments of the present disclosure (for example, the acquisition module 21 and the processing module 22 in the image super-resolution processing apparatus). The storage device 32 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the device, and the like. In addition, the storage device 32 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 32 may further include memories provided remotely with respect to the processor 31, and these remote memories may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments of the present disclosure further provide a storage medium storing a computer program which, when executed by a processor, implements the image super-resolution processing method described in the embodiments of the present disclosure, the method including: obtaining an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
The computer storage medium of the embodiments of the present disclosure may adopt any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. A computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by, or in combination with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a computer-readable medium may send, propagate, or transmit the program for use by, or in combination with, the instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The above are only exemplary embodiments of the present disclosure and are not intended to limit its scope of protection.
Those skilled in the art should understand that the term user equipment covers any suitable type of wireless user equipment, such as a mobile phone, a portable data processing apparatus, a portable web browser, or a vehicle-mounted mobile station.
In general, the various embodiments of the present disclosure may be implemented in hardware or special-purpose circuits, software, logic, or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software executable by a controller, microprocessor, or other computing apparatus, although the present disclosure is not limited thereto.
The embodiments of the present disclosure may be implemented by a data processor of a mobile apparatus executing computer program instructions, for example in a processor entity, or by hardware, or by a combination of software and hardware. The computer program instructions may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages.
Block diagrams of any logic flow in the drawings of the present disclosure may represent program steps, or interconnected logic circuits, modules, and functions, or a combination of program steps and logic circuits, modules, and functions. The computer program may be stored on a memory. The memory may be of any type suitable for the local technical environment and may be implemented using any suitable data storage technology, such as, but not limited to, read-only memory (ROM), random access memory (RAM), and optical memory devices and systems (Digital Video Disc (DVD) or Compact Disk (CD)). The computer-readable medium may include a non-transitory storage medium. The data processor may be of any type suitable for the local technical environment, such as, but not limited to, general-purpose computers, special-purpose computers, microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC), programmable logic devices (FPGA), and processors based on multi-core processor architectures.
A detailed description of exemplary embodiments of the present disclosure has been provided above by way of exemplary and non-limiting examples. Considered together with the drawings and the claims, various modifications and adaptations of the above embodiments will be apparent to those skilled in the art without departing from the scope of the present disclosure. Accordingly, the proper scope of the present disclosure is to be determined according to the claims.

Claims (11)

  1. An image super-resolution processing method, comprising:
    obtaining an image to be processed;
    performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
  2. The method according to claim 1, wherein iteratively training the pyramid neural network model with the target image sample set comprises:
    establishing a pyramid neural network model;
    inputting the low-resolution images in the target sample set into the pyramid neural network to obtain a predicted image;
    training the parameters of the pyramid neural network with an objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image, wherein the objective function comprises one or more of the functions L2+αL1, L1+θLP, and L2+μL1+βLP, where L2 is the L2 norm, L1 is the L1 norm, LP is the LP norm, and α, θ, μ, β are weights of the regularization factors;
    returning to the operation of inputting the low-resolution images in the target sample set into the pyramid neural network to obtain a predicted image, until the target neural network model is obtained.
  3. The method according to claim 2, wherein training the parameters of the pyramid neural network with the objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image comprises:
    training the parameters of the pyramid neural network based on the objective function L2+αL1 and a learning rate set to a first value;
    after the decrease in the output value of the objective function becomes smaller than a predetermined value, continuing to train the parameters of the pyramid neural network based on the objective function L1+θLP and a learning rate set to a second value, wherein the second value is smaller than the first value, and α and θ increase as the learning rate decreases.
  4. The method according to claim 2, wherein training the parameters of the pyramid neural network with the objective function formed from the predicted image and the high-resolution image corresponding to the low-resolution image comprises:
    training the parameters of the pyramid neural network with the objective function L2+μL1+βLP formed from the predicted image and the high-resolution image corresponding to the low-resolution image;
    adjusting μ and β in each training round, or between two training rounds, thereby achieving an approximation of the global optimum.
  5. The method according to claim 1, wherein the sample information content comprises: gradient information of the image sample, or the variance within the image sample.
  6. The method according to claim 1, wherein the sample information content is:
    PIC = ||max(abs(Sobel(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, and a_i is the image sample.
  7. The method according to claim 1, wherein the sample information content is:
    PIC = ||max(abs(Var(a_i[x, y, w, h])))||_1, where PIC denotes the sample information content, x and y respectively denote the abscissa and ordinate of the starting point of the selected image sample, w and h denote the width and height of the selected image sample, and a_i is the image sample.
  8. The method according to claim 1, wherein obtaining the target image samples by filtering the original image samples according to sample information content intervals, with each sample information content interval containing the same number of image samples, comprises:
    dividing the sample information content space corresponding to the target image sample set to obtain at least two sample information content intervals, each corresponding to a different information content range, wherein each sample information content interval contains a target number of image samples;
    obtaining the information content of an image sample;
    determining the corresponding sample information content interval according to the information content of the image sample;
    if the number of samples in the sample information content interval is smaller than the target number, determining the image sample as a target image sample.
  9. An image super-resolution processing apparatus, comprising:
    an acquisition module configured to obtain an image to be processed;
    a processing module configured to perform super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a pyramid neural network model with a target image sample set, the target image samples are obtained by filtering original image samples according to sample information content intervals, and each sample information content interval contains the same number of image samples.
  10. A device, comprising:
    one or more processors;
    a storage device configured to store one or more programs;
    when the one or more programs are executed by the one or more processors, the one or more processors implement the image super-resolution processing method according to any one of claims 1-8.
  11. A storage medium storing a computer program which, when executed by a processor, implements the image super-resolution processing method according to any one of claims 1-8.
PCT/CN2021/090646 2020-05-13 2021-04-28 图像超分辨率处理方法、装置、设备及存储介质 WO2021227877A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21804003.8A EP4152244A4 (en) 2020-05-13 2021-04-28 METHOD, DEVICE AND APPARATUS FOR HIGH RESOLUTION IMAGE PROCESSING AND STORAGE MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010403330.0A CN113674143A (zh) 2020-05-13 2020-05-13 图像超分辨率处理方法、装置、设备及存储介质
CN202010403330.0 2020-05-13

Publications (1)

Publication Number Publication Date
WO2021227877A1 true WO2021227877A1 (zh) 2021-11-18

Family

ID=78526408

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090646 WO2021227877A1 (zh) 2020-05-13 2021-04-28 图像超分辨率处理方法、装置、设备及存储介质

Country Status (3)

Country Link
EP (1) EP4152244A4 (zh)
CN (1) CN113674143A (zh)
WO (1) WO2021227877A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471398A (zh) * 2022-08-31 2022-12-13 北京科技大学 图像超分辨率方法、系统、终端设备及存储介质
CN115578553A (zh) * 2022-11-22 2023-01-06 河南知微生物工程有限公司 一种基于时序图像序列的甲醛的快速检测方法
CN115880157A (zh) * 2023-01-06 2023-03-31 中国海洋大学 一种k空间金字塔特征融合的立体图像超分辨率重建方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007190413A (ja) * 2007-04-12 2007-08-02 Toshiba Corp X線ct装置
CN102208103A (zh) * 2011-04-08 2011-10-05 东南大学 一种用于影像快速融合与评价的方法
CN109544448A (zh) * 2018-11-09 2019-03-29 浙江工业大学 一种拉普拉斯金字塔结构的团网络超分辨率图像重建方法

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007190413A (ja) * 2007-04-12 2007-08-02 Toshiba Corp X線ct装置
CN102208103A (zh) * 2011-04-08 2011-10-05 东南大学 一种用于影像快速融合与评价的方法
CN109544448A (zh) * 2018-11-09 2019-03-29 浙江工业大学 一种拉普拉斯金字塔结构的团网络超分辨率图像重建方法

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Financial time series analysis experiment tutorial", 31 August 2012, ISBN: 978-7-307-09741-4, article HU, LIQIN: "Passage; Financial time series analysis experiment tutorial", pages: 288 - 290, XP009531862 *
LAI WEI-SHENG; HUANG JIA-BIN; AHUJA NARENDRA; YANG MING-HSUAN: "Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution", 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE COMPUTER SOCIETY, US, 21 July 2017 (2017-07-21), US , pages 5835 - 5843, XP033249944, ISSN: 1063-6919, DOI: 10.1109/CVPR.2017.618 *
MIAO JUN, JUN CHU, ZHANG GUI-MEI, LU WANG: "Dense Multi-planar Scene Reconstruction from Sparse Point Cloud", ACTA AUTOMATICA SINICA, KEXUE CHUBANSHE, BEIJING, CN, vol. 41, no. 4, 30 April 2015 (2015-04-30), CN , pages 813 - 822, XP055867439, ISSN: 0254-4156, DOI: 10.16383/j.aas.2015.c140279 *
See also references of EP4152244A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115471398A (zh) * 2022-08-31 2022-12-13 北京科技大学 图像超分辨率方法、系统、终端设备及存储介质
CN115471398B (zh) * 2022-08-31 2023-08-15 北京科技大学 图像超分辨率方法、系统、终端设备及存储介质
CN115578553A (zh) * 2022-11-22 2023-01-06 河南知微生物工程有限公司 一种基于时序图像序列的甲醛的快速检测方法
CN115880157A (zh) * 2023-01-06 2023-03-31 中国海洋大学 一种k空间金字塔特征融合的立体图像超分辨率重建方法

Also Published As

Publication number Publication date
EP4152244A4 (en) 2024-07-03
EP4152244A1 (en) 2023-03-22
CN113674143A (zh) 2021-11-19


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21804003

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021804003

Country of ref document: EP

Effective date: 20221213