CN115526792A - Point spread function prior-based coding imaging reconstruction method - Google Patents


Info

Publication number
CN115526792A
Authority
CN
China
Prior art keywords
layer
convolution
output
point spread
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211077821.6A
Other languages
Chinese (zh)
Inventor
Zhang Wenwen (张闻文)
Zhang Ying (张颖)
He Weiji (何伟基)
Chen Qian (陈钱)
Gu Guohua (顾国华)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202211077821.6A
Publication of CN115526792A
Legal status: Pending


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 — Image enhancement or restoration
    • G06T 5/73 — Deblurring; Sharpening
    • G06T 5/10 — Image enhancement or restoration using non-spatial domain filtering
    • G06T 5/20 — Image enhancement or restoration using local operators
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 — Special algorithmic details
    • G06T 2207/20048 — Transform domain processing
    • G06T 2207/20064 — Wavelet transform [DWT]
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G06N 3/084 — Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a point spread function prior-based coded imaging reconstruction method comprising two stages: (1) a trainable inversion stage, in which the measured blurred coded image is mapped, after wiener filtering, into an intermediate reconstruction space by learning the prior parameters of the coding system's forward model, completing preliminary decoding; (2) an artifact correction stage, in which an improved U-Net structure introduces wavelet transforms to perform multi-level frequency-domain filtering, eliminating the residual artifacts of the intermediate reconstructed image and improving visual perceptual quality. Aiming at the long reconstruction times and low image clarity of lensless coded imaging, the invention builds a lightweight, physical-model-based deep convolutional neural network that effectively reduces network training and image reconstruction time, with a smaller memory requirement and faster convergence, while also improving the reconstruction quality of coded images.

Description

Point spread function prior-based coding imaging reconstruction method
Technical Field
The invention belongs to the field of lensless coded imaging, and particularly relates to a point spread function prior-based coding imaging reconstruction method.
Background
Lensless coded imaging technology adopts a single coding mask, such as a diffractive optical element or a coded aperture, in place of the complex optical components of a traditional camera to encode scene light, and completes the inversion of the optical imaging process through computational imaging so as to reconstruct an image of the target scene. This shifts the main imaging burden from the front-end optics to back-end computational reconstruction, avoids the alignment, integration and manufacturing problems of complex lens groups in traditional imaging systems, and markedly reduces the thickness, weight and cost of the system. It thus offers a reasonable and feasible path toward light, thin imaging systems such as miniature cameras, for which there is great demand in security, wearable devices, implantable devices, Internet-of-Things sensor networks and other fields.
The core idea of image reconstruction based on coded-mask imaging is to regulate the light field with a designed mask and, in combination with an image reconstruction algorithm in the computing system, recover a clear target scene from a blurred, unfocused pattern. At present, most research on coded-mask imaging focuses on the structure of the coding mask, the imaging model and the application scenarios, while the refocused images produced by the back-end scene reconstruction still suffer from artifacts, loss of detail and similar problems. It is therefore necessary to develop research that exploits the potential for high-quality back-end image reconstruction in coded-mask imaging systems and improves the overall imaging performance of the system.
Currently, existing back-end reconstruction algorithms can be classified into two categories: traditional iterative optimization algorithms and deep learning algorithms based on neural networks. Asif et al. proposed the FlatCam system, an amplitude-modulated separable coded-mask imaging system in which SVD-, BM3D- and TV-based algorithms achieve 512 × 512 visible-light image reconstruction, but the reconstructed image quality is poor and the reconstruction targets are simple (1. Asif, M.S., Ayremlou, A., Sankaranarayanan, A., Veeraraghavan, A., & Baraniuk, R.G. (2017). FlatCam: Thin, Lensless Cameras Using Coded Aperture and Computation. IEEE Transactions on Computational Imaging, 3(3), 384-397.). Jiachen Wu et al. used a Fresnel zone aperture to encode incoherent light into wavefront form and applied a compressive sensing algorithm that exploits the sparsity of natural scenes to effectively eliminate twin-image artifacts. The method significantly improves the signal-to-noise ratio of single-shot lensless images and promotes the development of flat, structurally reliable camera architectures that need no strict calibration, but the application of such traditional algorithms is still limited by their slower reconstruction speed (2. Wu, J., Zhang, H., Zhang, W., et al. Single-shot lensless imaging with Fresnel zone aperture and incoherent illumination. Light: Science & Applications, 9, 53 (2020).). Image reconstruction methods based on deep learning are increasingly popular due to their excellent reconstruction results. However, compared with traditional iterative methods, deep learning-based methods are difficult to interpret, and there is no structured way to integrate knowledge of the imaging system. Unrolled optimization occupies a middle ground between classical and deep approaches: a fixed number of iterations of a classical algorithm is interpreted as a deep network, with each iteration forming one layer. Kristina Monakhova et al. developed an alternating direction method of multipliers (ADMM) study for lensless imaging, proposing several network variants along the spectrum between classical and deep methods by varying the number of trainable parameters, including Le-ADMM, Le-ADMM*, and Le-ADMM-U. These networks trade data fidelity against perceived image quality to produce more visually appealing images at the cost of reduced data fidelity, but the method's constraints are complex and image detail can be overwhelmed by artifacts (3. Kristina Monakhova, Joshua Yurtsever, Grace Kuo, Nick Antipa, Kyrolos Yanny, and Laura Waller, "Learned reconstructions for practical mask-based lensless imaging," Opt. Express 27, 28075-28090 (2019).).
Disclosure of Invention
The invention aims to provide a point spread function prior-based coded imaging reconstruction method that addresses the slow reconstruction of traditional methods and the low reconstruction accuracy and residual artifacts of deep learning methods in lensless coded imaging, effectively reducing network training and image reconstruction time with a smaller memory requirement and faster convergence, while improving the reconstruction quality of coded images.
The technical scheme for realizing the purpose of the invention is as follows: a point spread function prior-based coding imaging reconstruction method specifically comprises the following steps:
Step 1: simulate or collect a set of lossless target data sets as reference images;
Step 2: simulate or collect a set of coded image data sets based on a lensless coded imaging system, generate training data pairs of a specified size, and calculate the point spread function of the coding mask at the corresponding size;
Step 3: construct a reconstruction network, the reconstruction network adopting a convolutional neural network based on the point spread function prior and consisting of two parts: a wiener filtering inversion part based on the point spread function prior and an artifact correction part based on a wavelet convolutional neural network, wherein the point spread function of the specified size is input into the filter kernel of the wiener filtering inversion part as learnable prior information;
Step 4: construct the loss function of the reconstruction network: the error between the network output and the target image is calculated using the negative Pearson correlation coefficient, with the loss function defined as the negative quotient of the covariance of the two variables and the product of their standard deviations;
Step 5: optimize the wavelet convolutional neural network with an Adam optimizer; set the initial learning rate of the optimization algorithm, multiply the learning rate by a decay factor at the end of each training epoch, and set the exponential decay rates of the first- and second-moment estimates and the number of iterations per epoch;
Step 6: train the network for b epochs according to the set hyper-parameters, completed in two phases: in the first b/2 epochs the wiener filter kernel is fixed, i.e., the wiener filtering inversion part does not participate in backpropagation and only the wavelet convolutional neural network module is trained; after the first b/2 epochs the network reaches a preliminary convergence state, and in the following b/2 epochs the wiener filtering module is brought into the backpropagation process, i.e., the parameters of both modules are trained simultaneously, where b is an even number;
Step 7: input the coded images of the test set into the network for prediction and output the reconstructed decoded images.
Compared with the prior art, the invention has the following notable advantages: (1) a fast wiener-filtering reconstruction method, based on the least-squares problem under Tikhonov regularization, realizes the preliminary decoding of the coded image; prior physical information is fully utilized to establish an inverse physical model, which is incorporated into the deep learning framework so that the reconstruction process is interpretable; (2) for the intermediate reconstructed image from the preliminary decoding, wavelet transforms are introduced to replace each pooling operation, expanding the receptive field without losing information; (3) the network contains only 32 convolutional layers, greatly reducing the number of network parameters; it removes image artifacts more effectively while improving running speed, preserving image detail and improving reconstruction quality.
Drawings
FIG. 1 is a schematic diagram of a system model for validating the present invention.
FIG. 2 is a schematic diagram of the modulation process of the light by the encoding mask in the present invention.
FIG. 3 is a schematic diagram of the structure of the wavelet convolutional neural network in the present invention.
FIG. 4 is a schematic diagram of the overall structure of the method of the present invention.
FIG. 5 is a comparison graph of image effects reconstructed by the point spread function prior-based encoding imaging reconstruction method of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
A point spread function prior-based coding imaging reconstruction method specifically comprises the following steps:
Step 1: collect a set of lossless target data sets as reference images; publicly available high-definition data sets may be used, or the data may be collected independently.
Step 2: a lensless coded imaging system was built. FIG. 1 is a schematic diagram of a system model for verifying the present invention. The target scene is a lossless image displayed on a display screen, the distance between the display screen and the coding mask is about 30cm, the distance between the coding mask and the sensor is about 3mm, the coding mask replaces a traditional lens group and a sensor module to assemble a set of lens-free camera system, the verification system adopts a Fresnel zone plate as a reference coding mask, the diameter of the zone plate is about 4.5mm, and the Fresnel constant is 0.325mm.
A set of coded image data sets is simulated or collected based on the lensless coded imaging system, training data pairs of a specified size are generated, and the point spread function of the coding mask at the corresponding size is measured. For a directly acquired data set, the lossless images on the display screen are photographed with the lensless coded imaging system under the above parameter conditions; for a simulated data set, a strict forward propagation model must be established to simulate the encoded image captured on the sensor. The specific simulation process is as follows.
The principle of lensless coded imaging is as follows: in Fourier optics, the formation of an incoherent image can be viewed as the superposition of a collection of point sources. Each point source produces a shifted copy of the point spread function, and since the sources are mutually incoherent, the shifted point spread functions add linearly in intensity at the sensor, so the detected image can be represented as the convolution of the target image with the system point spread function. In the present invention, the modulation effect of the coding mask on incident light is expressed in the form of a point spread function, as shown in FIG. 2, where the intensity pattern of the point spread function varies with the diffraction distance.
For sensor imaging, the generation of the simulated encoded image dataset is:
$$Y = C\big(\mathrm{PSF}_z * X + N\big)$$

where Y is the simulated image-plane encoded image, C is the crop operator, $\mathrm{PSF}_z$ is the point spread function of the coding mask captured on the outgoing light field at distance z from the target, X is the input lossless target image, N is additive noise, and * denotes the convolution operator. For a broadband light source, the encoded image can be calculated by integrating the diffracted intensity over multiple wavelengths. Because the image sensor has different sensitivities to light of different wavelengths, the imaging model also accounts for the specific spectral response curve of the sensor, so the integration is weighted by the spectral response $Q_c(\lambda)$:

$$Y_c = C\left(\int_{\lambda_{\min}}^{\lambda_{\max}} Q_c(\lambda)\,\big[\mathrm{PSF}_z(\lambda) * X(\lambda)\big]\,\mathrm{d}\lambda + \eta\right)$$

where $[\lambda_{\min}, \lambda_{\max}]$ is the spectral range, $\mathrm{PSF}_z(\lambda)$ is the point spread function for monochromatic light of wavelength $\lambda$, $X(\lambda)$ is the intensity of the light of wavelength $\lambda$ emitted by the screen, $Q_c$ is the spectral response curve of the sensor, $\eta$ is the readout noise of the sensor, generally taken as Gaussian noise $\eta \sim N(0, \sigma^2)$, and $Y_c$ is the encoded image captured by the simulated sensor.
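To make the simulation concrete, the following is a minimal monochromatic sketch in Python (function and parameter names are illustrative assumptions, not taken from the patent); a broadband data set would repeat the convolution per wavelength and sum the results weighted by $Q_c(\lambda)$ before adding the readout noise.

```python
import numpy as np
from numpy.fft import fft2, ifft2, ifftshift

def simulate_coded_image(x, psf, sigma=0.01, crop=None, rng=None):
    """Simulate Y = C(PSF_z * X + N) for a single wavelength.

    x:    lossless target image (2-D array)
    psf:  coding-mask point spread function, centered in its array
    crop: ((r0, r1), (c0, c1)) region kept by the crop operator C
    """
    rng = rng or np.random.default_rng()
    # circular convolution of the scene with the PSF via the FFT;
    # ifftshift moves the centered PSF origin to the (0, 0) corner
    y = np.real(ifft2(fft2(x) * fft2(ifftshift(psf))))
    y = y + rng.normal(0.0, sigma, size=y.shape)   # additive Gaussian noise N
    if crop is not None:                           # crop operator C
        (r0, r1), (c0, c1) = crop
        y = y[r0:r1, c0:c1]
    return y
```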
Step 3: construct the reconstruction network. The reconstruction network adopts a convolutional neural network based on the point spread function prior and consists of two parts: a wiener filtering inversion part based on the point spread function prior, and an artifact correction part based on a wavelet convolutional neural network.
(1) The specific method of the wiener filtering inversion part is as follows:
the imaging model of a mask-based lensless imaging system is typically characterized by a convolution of the scene with the mask shadow, i.e., a point spread function (point spread function):
y=p*x+e
where p is the point spread function, x is the scene irradiance, y is the image formed on the sensor, and e is the measurement noise. To reconstruct x from y using the known p, a common image recovery approach for lensless imaging is to minimize an objective function that typically consists of a data fidelity term and a regularization term:

$$\hat{x} = \arg\min_{x} \frac{1}{2}\left\| y - p * x \right\|_2^2 + \gamma R(x)$$

where the data fidelity term $\frac{1}{2}\| y - p * x \|_2^2$ quantifies agreement with the measurement, the regularization term $R(x)$ introduces prior knowledge to alleviate the ill-posedness of the inverse problem, and the regularization parameter $\gamma$ controls the relative weight of the two terms. Under Tikhonov regularization, $R(x) = \| x \|_2^2$, and the least-squares problem has a closed-form solution given by wiener deconvolution:

$$\hat{x} = F^{-1}\!\left(\frac{\overline{F(p)} \odot F(y)}{|F(p)|^2 + \gamma}\right)$$
the form of the trainable inversion stage in the convolution case behaves as a learned inversion of the Hadamard product in the fourier domain:
X interm =F -1 (F(W)⊙F(Y))
wherein, X interm Is the output of this stage, Y is the measurement, F (-), and F -1 (. Is a Fourier transform and inverse Fourier transform operation, W is a filter learned by the neural network, and as such, indicates a Hadamard product. For a measurement where one dimension is nxm, the dimension of W is also nxm. W completes initialization using fourier transform of the calibrated point spread function, i.e.:
Figure BDA0003832367900000061
where K is a regularization parameter and the initial value is set to 10 4 H = F (p), p being the point spread function prior of the input, * representing the conjugate operator.
In the reconstruction network, a point spread function of a specified size is input as learnable a priori information into a filter kernel of a wiener filter module.
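As an illustration, a minimal PyTorch sketch of this trainable inversion stage follows (class and variable names are illustrative assumptions; the patent does not prescribe an implementation). The filter W is kept in the spatial domain and initialized by wiener deconvolution of the PSF prior with the regularization constant K:

```python
import torch
import torch.nn as nn

class TrainableWienerInversion(nn.Module):
    """Learned wiener-filter inversion: X_interm = F^-1(F(W) ⊙ F(Y))."""

    def __init__(self, psf: torch.Tensor, k: float = 1e4):
        super().__init__()
        H = torch.fft.fft2(psf)                                   # H = F(p)
        w0 = torch.fft.ifft2(H.conj() / (H.abs() ** 2 + k)).real  # wiener init
        self.W = nn.Parameter(w0)                                 # learnable filter

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # Hadamard product of the filter and measurement spectra, then inverse FFT
        X = torch.fft.fft2(self.W) * torch.fft.fft2(y)
        return torch.fft.ifft2(X).real
```

During the first training phase the kernel can be frozen with `self.W.requires_grad_(False)`, matching the two-phase schedule described in step 6.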
(2) The wavelet convolutional neural network adopts a U-Net architecture and consists of 4 wavelet transform layers, 32 convolution blocks and 4 inverse wavelet transform layers, where each convolution block includes an optional batch normalization (BN) step and a ReLU activation step. Since the wavelet transform is invertible, this downsampling scheme ensures that no information is lost. In addition, the wavelet transform captures both the frequency and the location information of the feature maps, giving better time-frequency localization and detail preservation. The wavelet transform strengthens the network's learning of both high-frequency and low-frequency information, which helps realize artifact correction. The wavelet convolutional neural network can embed the wavelet transform into any convolutional neural network with pooling, and has a stronger ability to model spatial context and inter-subband dependency.
Specifically, the wavelet convolutional neural network consists of an encoder subnetwork and a decoder subnetwork, the two parts forming a symmetric U-shaped structure. In the encoder subnetwork, feature-map downsampling is completed by wavelet transform layers; 4 convolution blocks are inserted between any 2 wavelet transform layers, and the output subband feature map of each wavelet transform is taken as the input of the subsequent convolution blocks. Similarly, in the decoder subnetwork, the feature map is upsampled by inverse wavelet transforms; 4 convolution blocks are inserted between any 2 inverse wavelet transform layers, and the output subband feature map of each inverse wavelet transform is taken as the input of the subsequent convolution layers. Each convolution block consists of a convolution with 3 × 3 filters, batch normalization and a rectified linear unit (ReLU) activation. For the last convolution block, the residual image is predicted by convolution without batch normalization or ReLU activation. During upsampling, the feature maps of the encoder subnetwork and decoder subnetwork are fused by element-wise summation.
The wavelet transform is realized as follows.
Taking the Haar wavelet as an example, in the two-dimensional Haar wavelet the low-pass filter is defined as:

$$f_{LL} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$

It can be seen that $f_{LL}$ actually implements a sum-pooling operation. When only the low-frequency subband is considered, the wavelet transform and the inverse wavelet transform play the roles of pooling and up-convolution in the network, respectively. When all subbands are considered, the network avoids the information loss caused by conventional subsampling, which benefits the recovery result. The reconstruction network also uses three subband filters $f_{LH}$, $f_{HL}$, $f_{HH}$, defined as:

$$f_{LH} = \begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix}, \quad f_{HL} = \begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix}, \quad f_{HH} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$

Given an image x of size m × n, the (i, j)-th values $x_k(i, j)$ (k = 1, 2, 3, 4) of the 4 subband maps obtained by the 2-D Haar (wavelet) transform are written as:

$$\begin{aligned}
x_1(i,j) &= x(2i-1,2j-1) + x(2i-1,2j) + x(2i,2j-1) + x(2i,2j) \\
x_2(i,j) &= -x(2i-1,2j-1) - x(2i-1,2j) + x(2i,2j-1) + x(2i,2j) \\
x_3(i,j) &= -x(2i-1,2j-1) + x(2i-1,2j) - x(2i,2j-1) + x(2i,2j) \\
x_4(i,j) &= x(2i-1,2j-1) - x(2i-1,2j) - x(2i,2j-1) + x(2i,2j)
\end{aligned}$$

Correspondingly, the inverse wavelet transform is obtained as:

$$\begin{aligned}
x(2i-1,2j-1) &= \big(x_1(i,j) - x_2(i,j) - x_3(i,j) + x_4(i,j)\big)/4 \\
x(2i-1,2j) &= \big(x_1(i,j) - x_2(i,j) + x_3(i,j) - x_4(i,j)\big)/4 \\
x(2i,2j-1) &= \big(x_1(i,j) + x_2(i,j) - x_3(i,j) - x_4(i,j)\big)/4 \\
x(2i,2j) &= \big(x_1(i,j) + x_2(i,j) + x_3(i,j) + x_4(i,j)\big)/4
\end{aligned}$$
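The transform pair above reduces to simple array slicing, as the following minimal PyTorch sketch shows (function names are illustrative). A (B, C, H, W) tensor becomes a (B, 4C, H/2, W/2) tensor of stacked LL/LH/HL/HH subbands, and the inverse restores it exactly:

```python
import torch

def haar_dwt(x: torch.Tensor) -> torch.Tensor:
    """2-D Haar wavelet transform used as lossless downsampling."""
    a = x[:, :, 0::2, 0::2]   # x(2i-1, 2j-1)
    b = x[:, :, 0::2, 1::2]   # x(2i-1, 2j)
    c = x[:, :, 1::2, 0::2]   # x(2i,   2j-1)
    d = x[:, :, 1::2, 1::2]   # x(2i,   2j)
    # LL, LH, HL, HH subbands per the formulas above
    return torch.cat([a + b + c + d,
                      -a - b + c + d,
                      -a + b - c + d,
                      a - b - c + d], dim=1)

def haar_iwt(z: torch.Tensor) -> torch.Tensor:
    """Inverse 2-D Haar transform: (B, 4C, H, W) -> (B, C, 2H, 2W)."""
    n = z.shape[1] // 4
    x1, x2, x3, x4 = z[:, :n], z[:, n:2*n], z[:, 2*n:3*n], z[:, 3*n:]
    out = z.new_zeros(z.shape[0], n, z.shape[2] * 2, z.shape[3] * 2)
    out[:, :, 0::2, 0::2] = (x1 - x2 - x3 + x4) / 4
    out[:, :, 0::2, 1::2] = (x1 - x2 + x3 - x4) / 4
    out[:, :, 1::2, 0::2] = (x1 + x2 - x3 - x4) / 4
    out[:, :, 1::2, 1::2] = (x1 + x2 + x3 + x4) / 4
    return out
```

Round-tripping `haar_iwt(haar_dwt(x))` reproduces x exactly, which is the lossless-downsampling property the network exploits.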
the image size used for training is NxNx1, the output inversion graph size of the input image after passing through the wiener filtering module is still NxNx1, the wavelet convolution neural network carries out 4-layer down-sampling characteristic learning operation and 4-layer up-sampling characteristic learning operation on the intermediate reconstructed image, wherein the specific structure of the encoder sub-network is as follows:
A first feature layer: the input feature map is downsampled by the first wavelet transform layer, giving 4 output channels, and feature learning is then completed by 4 convolution blocks. Each convolution block comprises: (1) a convolution layer with a 3 × 3 kernel, 40 output channels, stride 1 and padding 1; (2) a batch normalization layer with 40 output channels; (3) an activation layer using the ReLU function. The first convolution layer has 4 input channels, and the second to fourth convolution layers have 40 input channels.
A second feature layer: the output of the first feature layer is downsampled by the second wavelet transform layer, giving 160 output channels, and feature learning is then completed by 4 convolution blocks. Each convolution block comprises: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function. The first convolution layer has 160 input channels, and the second to fourth convolution layers have 64 input channels.
A third feature layer: the output of the second feature layer is downsampled by the third wavelet transform layer, giving 256 output channels, and feature learning is then completed by 4 convolution blocks. Each convolution block comprises: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function. The first convolution layer has 256 input channels, and the second to fourth convolution layers have 64 input channels.
A fourth feature layer: the output of the third feature layer is downsampled by the fourth wavelet transform layer, giving 256 output channels, and feature learning is then completed by 4 convolution blocks. Each convolution block comprises: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function. The first convolution layer has 256 input channels, and the second to fourth convolution layers have 64 input channels.
The specific structure of the decoder subnetwork is:
A fourth feature layer: the output of the fourth feature layer of the encoder is taken as input, and feature learning is first completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function. The first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 256 output channels. Upsampling is then completed by the fourth inverse wavelet transform layer, giving 64 output channels.
A third feature layer: the output of the third feature layer of the encoder is added to the output of the fourth feature layer of the decoder and taken as the input of the third feature layer of the decoder; feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 input channels, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function. The first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 256 output channels. Upsampling is then completed by the third inverse wavelet transform layer, giving 64 output channels.
A second feature layer: the output of the second feature layer of the encoder is added to the output of the third feature layer of the decoder and taken as the input of the second feature layer of the decoder; feature learning is first completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 input channels, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function. The first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 160 output channels. Upsampling is then completed by the second inverse wavelet transform layer, giving 40 output channels.
A first feature layer: the output of the first feature layer of the encoder is added to the output of the second feature layer of the decoder and taken as the input of the first feature layer of the decoder; feature learning is first completed by 3 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 40 input channels, 40 output channels, stride 1 and padding 1; (2) a batch normalization layer with 40 output channels; (3) an activation layer using the ReLU function. A fourth convolution layer with a 3 × 3 kernel, 40 input channels, 4 output channels, stride 1 and padding 1 then follows; finally, upsampling is completed by the first inverse wavelet transform layer, giving 1 output channel.
A zeroth feature layer: the input of the encoder is added to the output of the first feature layer of the decoder to obtain the final output.
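As a sketch of the repeated unit these layers are built from (a hypothetical helper; the patent fixes only the 3 × 3 kernel, stride 1, padding 1, the BN/ReLU pattern and the channel counts listed above):

```python
import torch.nn as nn

def conv_block(c_in: int, c_out: int, bn: bool = True, act: bool = True) -> nn.Sequential:
    """One convolution block: 3x3 conv (stride 1, padding 1), optional BN and ReLU."""
    layers = [nn.Conv2d(c_in, c_out, kernel_size=3, stride=1, padding=1)]
    if bn:
        layers.append(nn.BatchNorm2d(c_out))
    if act:
        layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# First encoder feature layer: the DWT turns 1 input channel into 4 subband
# channels, then four blocks learn features (4 -> 40 -> 40 -> 40 -> 40)
encoder_layer1 = nn.Sequential(conv_block(4, 40), conv_block(40, 40),
                               conv_block(40, 40), conv_block(40, 40))

# Last decoder block: plain convolution predicting the residual, no BN/ReLU
decoder_last = conv_block(40, 4, bn=False, act=False)
```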
Step 4: establish the loss function of the reconstruction network. Denote the input coded picture of the network as X, the corresponding lossless target image as Y, and the reconstructed output image of the network as T. The error between the network output and the target image is calculated using the negative Pearson correlation coefficient, so the loss function is defined as the negative quotient of the covariance of the two variables and the product of their standard deviations:
$$\mathrm{Loss}(T, Y) = -\frac{\sum_{i=1}^{n}\left(T_i - \bar{T}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(T_i - \bar{T}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^{2}}}$$

where n is the number of pixels of the input picture and i indexes the gray value of each pixel. The value range of the loss function is [−1, 0]; the closer it is to −1, the higher the correlation and the better the image reconstruction.
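A minimal PyTorch sketch of this negative Pearson correlation loss (names are illustrative; the correlation is computed over all pixels of each image and averaged over the batch):

```python
import torch

def negative_pearson_loss(t: torch.Tensor, y: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative Pearson correlation between reconstruction T and target Y."""
    t = t.reshape(t.shape[0], -1)
    y = y.reshape(y.shape[0], -1)
    tc = t - t.mean(dim=1, keepdim=True)          # T_i - mean(T)
    yc = y - y.mean(dim=1, keepdim=True)          # Y_i - mean(Y)
    corr = (tc * yc).sum(dim=1) / (tc.norm(dim=1) * yc.norm(dim=1) + eps)
    return -corr.mean()
```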
Step 5: the whole network is optimized with an Adam (Adaptive Moment Estimation) optimizer. The initial learning rate lr of the optimization algorithm is set to 0.0005 and is multiplied by a decay factor of 0.8 at the end of each training epoch; the exponential decay rate of the first-moment estimate is 0.9 and that of the second-moment estimate is 0.999. 3200 sample pairs are created, the batch size is 4, and each epoch therefore completes 800 iterations.
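These hyper-parameters map directly onto a standard Adam setup, sketched below (the helper name is illustrative; PyTorch's ExponentialLR multiplies the learning rate by the 0.8 factor each time scheduler.step() is called, i.e., once per epoch):

```python
import torch

def configure_training(model: torch.nn.Module):
    # lr = 0.0005; betas are the exponential decay rates of the
    # first- and second-moment estimates (0.9, 0.999)
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, betas=(0.9, 0.999))
    # multiply the learning rate by 0.8 after each epoch
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.8)
    return optimizer, scheduler

# 3200 sample pairs / batch size 4 = 800 iterations per epoch
```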
Step 6: FIG. 4 is a schematic diagram of the overall structure of the method of the present invention. With reference to it, the network is trained according to the set hyper-parameters for 40 epochs, completed in two phases: in the first 20 epochs the wiener filter kernel is fixed, i.e., the wiener filtering module does not participate in backpropagation and only the wavelet convolutional neural network module is trained; after these 20 epochs the network reaches a preliminary convergence state, and in the following 20 epochs the wiener filtering module is brought into the backpropagation process, i.e., the parameters of both modules are trained simultaneously.
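A sketch of this two-phase schedule, reusing the helpers from the previous sketches (`inversion.W` denotes the learnable wiener kernel from the inversion-module sketch; all names are illustrative):

```python
import torch

def train_two_phase(inversion, wavelet_cnn, loader, epochs=40):
    both = torch.nn.ModuleList([inversion, wavelet_cnn])
    optimizer, scheduler = configure_training(both)
    for epoch in range(epochs):
        # phase 1 (epochs 0-19): wiener kernel frozen, only the wavelet CNN
        # trains; phase 2 (epochs 20-39): both modules are trained jointly
        inversion.W.requires_grad_(epoch >= epochs // 2)
        for coded, target in loader:              # coded image, lossless target
            recon = wavelet_cnn(inversion(coded))
            loss = negative_pearson_loss(recon, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()                          # 0.8 learning-rate decay
```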
Step 7: the finally trained network model is saved, the coded images of the test set are input into the network for prediction, and the reconstructed decoded images are output.
FIG. 5 compares images reconstructed by the point spread function prior-based coding imaging reconstruction method of the present invention. On the test set, the method reconstructs a 1024 × 1024 image in an average of 0.033 s with an average reconstruction peak signal-to-noise ratio of 22.15 dB.

Claims (10)

1. A point spread function prior-based coding imaging reconstruction method, characterized by comprising the following specific steps:
Step 1: simulate or collect a set of lossless target data sets as reference images;
Step 2: simulate or collect a set of coded image data sets based on a lensless coded imaging system, generate training data pairs of a specified size, and calculate the point spread function of the coding mask at the corresponding size;
Step 3: construct a reconstruction network, the reconstruction network adopting a convolutional neural network based on the point spread function prior and consisting of two parts: a wiener filtering inversion part based on the point spread function prior and an artifact correction part based on a wavelet convolutional neural network, wherein the point spread function of the specified size is input into the filter kernel of the wiener filtering inversion part as learnable prior information;
Step 4: construct the loss function of the reconstruction network: the error between the network output and the target image is calculated using the negative Pearson correlation coefficient, with the loss function defined as the negative quotient of the covariance of the two variables and the product of their standard deviations;
Step 5: optimize the wavelet convolutional neural network with an Adam optimizer; set the initial learning rate of the optimization algorithm, multiply the learning rate by a decay factor after each training epoch, and set the exponential decay rate of the first-moment estimate, the exponential decay rate of the second-moment estimate and the number of iterations per epoch;
Step 6: train the network for b epochs according to the set hyper-parameters, completed in two phases: in the first b/2 epochs the wiener filter kernel is fixed, i.e., the wiener filtering inversion part does not participate in backpropagation and only the wavelet convolutional neural network module is trained; after the first b/2 epochs the network reaches a preliminary convergence state, and in the following b/2 epochs the wiener filtering module is brought into the backpropagation process, i.e., the parameters of both modules are trained simultaneously, where b is an even number;
Step 7: input the coded images of the test set into the network for prediction and output the reconstructed decoded images.
2. The point spread function prior-based coded imaging reconstruction method of claim 1, wherein: the lens-free coding imaging system comprises a display screen, a coding mask and an image acquisition device which are arranged on the same horizontal light path, wherein a lossless target scene is displayed on the display screen, the distance between the display screen and the coding mask is about 30cm, and the distance between the coding mask and the image acquisition device is 3mm.
3. The point spread function prior-based coded imaging reconstruction method of claim 1, wherein: the generation process of the simulated encoded image data set in step 2 is:
$$Y = C\big(\mathrm{PSF}_z * X + N\big)$$

where Y is the encoded image on the simulated image plane, C is the crop operator, $\mathrm{PSF}_z$ is the point spread function of the coding mask captured on the outgoing light field at distance z from the target, X is the input lossless target image, N is additive noise, and * denotes the convolution operator.
4. The point spread function prior-based coded imaging reconstruction method according to claim 1, wherein the specific process of the point spread function prior-based wiener filter inversion part in step 3 is as follows:
$$X_{\mathrm{interm}} = F^{-1}\big(F(W) \odot F(Y)\big)$$

where $X_{\mathrm{interm}}$ is the output of the wiener filtering inversion part, Y is the measurement, $F(\cdot)$ and $F^{-1}(\cdot)$ are the Fourier transform and inverse Fourier transform operations respectively, W is the filter learned by the neural network, and $\odot$ denotes the Hadamard product; for a measurement of dimension N × M, the dimension of W is also N × M; W is initialized using the Fourier transform of the calibrated point spread function, i.e.:

$$W = F^{-1}\!\left(\frac{H^{*}}{|H|^{2} + K}\right)$$

where K is a regularization parameter, $H = F(p)$, p is the input point spread function prior, and $(\cdot)^{*}$ denotes the complex conjugate.
5. The point spread function prior-based coded imaging reconstruction method of claim 1, wherein the wavelet convolutional neural network portion in step 3 is composed of an encoder subnetwork and a decoder subnetwork, and the two portions have a symmetrical U-shaped structure.
6. The point spread function prior-based coded imaging reconstruction method of claim 5, wherein the encoder subnetwork includes 4 wavelet transform layers, each wavelet transform layer being followed by 4 convolution blocks; feature-map downsampling is performed by the wavelet transform layers, and each convolution block consists of a convolution with 3 × 3 filters, batch normalization and a rectified linear unit (ReLU).
7. The point spread function prior-based encoded imaging reconstruction method of claim 6, wherein the encoder subnetwork comprises:
A first feature layer: the input feature map is downsampled by the first wavelet transform layer, giving 4 output channels, and feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 40 output channels, stride 1 and padding 1; (2) a batch normalization layer with 40 output channels; (3) an activation layer using the ReLU function; the first convolution layer has 4 input channels, and the second to fourth convolution layers have 40 input channels;
A second feature layer: the output of the first feature layer is downsampled by the second wavelet transform layer, giving 160 output channels, and feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function; the first convolution layer has 160 input channels, and the second to fourth convolution layers have 64 input channels;
A third feature layer: the output of the second feature layer is downsampled by the third wavelet transform layer, giving 256 output channels, and feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function; the first convolution layer has 256 input channels, and the second to fourth convolution layers have 64 input channels;
A fourth feature layer: the output of the third feature layer is downsampled by the fourth wavelet transform layer, giving 256 output channels, and feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 output channels, stride 1 and padding 1; (2) a batch normalization layer with 64 output channels; (3) an activation layer using the ReLU function; the first convolution layer has 256 input channels, and the second to fourth convolution layers have 64 input channels.
8. The point spread function prior-based coded imaging reconstruction method of claim 6, wherein the decoder subnetwork includes 4 inverse wavelet transform layers, each inverse wavelet transform layer being followed by 4 convolution blocks; for the last convolution block, the residual image is predicted by convolution without batch normalization or a ReLU activation function; the inverse wavelet transform layers complete feature-map upsampling, and during upsampling the feature maps of the encoder subnetwork and decoder subnetwork are fused by element-wise summation.
9. The point spread function prior-based coded imaging reconstruction method of claim 6, wherein the decoder subnetwork comprises:
A fourth feature layer: the output of the fourth feature layer of the encoder is taken as input, and feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function; the first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 256 output channels; upsampling is then completed by the fourth inverse wavelet transform layer, giving 64 output channels;
A third feature layer: the output of the third feature layer of the encoder is added to the output of the fourth feature layer of the decoder and taken as the input of the third feature layer of the decoder; feature learning is completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 input channels, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function; the first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 256 output channels; upsampling is then completed by the third inverse wavelet transform layer, giving 64 output channels;
A second feature layer: the output of the second feature layer of the encoder is added to the output of the third feature layer of the decoder and taken as the input of the second feature layer of the decoder; feature learning is first completed by 4 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 64 input channels, stride 1 and padding 1; (2) a batch normalization layer; (3) an activation layer using the ReLU function; the first to third convolution layers and batch normalization layers have 64 output channels, and the fourth convolution layer and batch normalization layer have 160 output channels; upsampling is then completed by the second inverse wavelet transform layer, giving 40 output channels;
A first feature layer: the output of the first feature layer of the encoder is added to the output of the second feature layer of the decoder and taken as the input of the first feature layer of the decoder; feature learning is first completed by 3 convolution blocks, each comprising: (1) a convolution layer with a 3 × 3 kernel, 40 input channels, 40 output channels, stride 1 and padding 1; (2) a batch normalization layer with 40 output channels; (3) an activation layer using the ReLU function; a fourth convolution layer with a 3 × 3 kernel, 40 input channels, 4 output channels, stride 1 and padding 1 then follows; finally, upsampling is completed by the first inverse wavelet transform layer, giving 1 output channel;
A zeroth feature layer: the input of the encoder is added to the output of the first feature layer of the decoder to obtain the final output.
10. The point spread function prior-based coding imaging reconstruction method according to claim 1, wherein the loss function constructed in step 4 is specifically:
$$\mathrm{Loss}(T, Y) = -\frac{\sum_{i=1}^{n}\left(T_i - \bar{T}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(T_i - \bar{T}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^{2}}}$$

where n is the number of pixels of the input picture, i indexes the pixels, $T_i$ is the gray value of pixel i in the reconstructed image, $\bar{T}$ is the mean of all pixel gray values of the reconstructed image, $Y_i$ is the gray value of pixel i in the lossless target image, and $\bar{Y}$ is the mean of all pixel gray values of the lossless target image.
CN202211077821.6A 2022-09-05 2022-09-05 Point spread function prior-based coding imaging reconstruction method Pending CN115526792A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211077821.6A CN115526792A (en) 2022-09-05 2022-09-05 Point spread function prior-based coding imaging reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211077821.6A CN115526792A (en) 2022-09-05 2022-09-05 Point spread function prior-based coding imaging reconstruction method

Publications (1)

Publication Number Publication Date
CN115526792A (en) 2022-12-27

Family

ID=84698027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211077821.6A Pending CN115526792A (en) 2022-09-05 2022-09-05 Point spread function prior-based coding imaging reconstruction method

Country Status (1)

Country Link
CN (1) CN115526792A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703728A (en) * 2023-08-07 2023-09-05 北京理工大学 Super-resolution method and system for optimizing system parameters
CN116703728B (en) * 2023-08-07 2023-10-13 北京理工大学 Super-resolution method and system for optimizing system parameters

Similar Documents

Publication Publication Date Title
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN111968044A (en) Low-illumination image enhancement method based on Retinex and deep learning
US20220301114A1 (en) Noise Reconstruction For Image Denoising
CN109447891A (en) A kind of high quality imaging method of the spectrum imaging system based on convolutional neural networks
CN110650340B (en) Space-time multiplexing compressed video imaging method
CN109741407A (en) A kind of high quality reconstructing method of the spectrum imaging system based on convolutional neural networks
CN111563562B (en) Color target reconstruction method of single-frame scattering image based on convolutional neural network
CN115880225A (en) Dynamic illumination human face image quality enhancement method based on multi-scale attention mechanism
CN112270646B (en) Super-resolution enhancement method based on residual dense jump network
CN115526792A (en) Point spread function prior-based coding imaging reconstruction method
CN116012243A (en) Real scene-oriented dim light image enhancement denoising method, system and storage medium
CN115170915A (en) Infrared and visible light image fusion method based on end-to-end attention network
CN112163998A (en) Single-image super-resolution analysis method matched with natural degradation conditions
Chen et al. Image denoising via deep network based on edge enhancement
Lou et al. Irregularly sampled seismic data interpolation via wavelet-based convolutional block attention deep learning
Lu et al. Underwater image enhancement method based on denoising diffusion probabilistic model
Wen et al. The power of complementary regularizers: Image recovery via transform learning and low-rank modeling
Zhang et al. Efficient content reconstruction for high dynamic range imaging
CN114549361B (en) Image motion blur removing method based on improved U-Net model
CN117111000A (en) SAR comb spectrum interference suppression method based on dual-channel attention residual network
CN114353946B (en) Diffraction snapshot spectrum imaging method
CN115861749A (en) Remote sensing image fusion method based on window cross attention
CN115311149A (en) Image denoising method, model, computer-readable storage medium and terminal device
Lian et al. An Image Deblurring Method Using Improved U-Net Model
Kim Learning Computational Hyperspectral Imaging

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination