CN116645432A - High-quality hologram generating method based on improved ViT network - Google Patents

High-quality hologram generating method based on improved ViT network

Info

Publication number
CN116645432A
Authority
CN
China
Prior art keywords
network
hologram
image
vit
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310665894.5A
Other languages
Chinese (zh)
Inventor
李燕
凌玉烨
徐超
董振兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/10 Image enhancement or restoration using non-spatial domain filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20048 Transform domain processing
    • G06T2207/20056 Discrete and fast Fourier transform, [DFT, FFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Holography (AREA)

Abstract

A high-quality hologram generating method based on an improved ViT network. An encoding-decoding architecture is constructed in which the improved Vision Transformer network serves as the encoding part and encodes the target image into its corresponding hologram; the decoding part simulates the free-space propagation of light through an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and the encoding part of the encoding-decoding architecture is iteratively trained by calculating a loss function between the reconstructed image and the target image. In the online stage, the phase-only hologram generated by the trained encoding-decoding architecture is used by a holographic display system to reconstruct a high-quality holographic display image. By improving the Vision Transformer network to focus on the global information of the target image, the invention generates higher-quality holograms and realizes holographic display.

Description

High-quality hologram generating method based on improved ViT network
Technical Field
The invention relates to a technology in the field of image processing, in particular to a high-quality hologram generating method based on an improved ViT (Vision Transformer) network.
Background
Existing computer-generated holography (CGH) algorithms based on deep neural networks (DNNs) compute holograms by training one or more convolutional neural networks (CNNs) and apply them to holographic display systems. They shorten the time needed to compute high-quality holograms, but their display quality still falls short of that of conventional, time-consuming iterative algorithms. One important reason is that the diffraction of light waves is a cross-domain process from the spatial domain to the frequency domain and has global characteristics, whereas CNNs typically use local convolution operations with limited receptive fields, which makes it difficult to learn the cross-domain mapping from a target image (spatial domain) to a hologram (frequency domain).
Disclosure of Invention
To address the relatively low display quality of holograms generated by existing CNN-based computer-generated holography, the invention provides a high-quality hologram generating method based on an improved ViT network. By attending to the global information of the target image, the improved Vision Transformer network generates higher-quality holograms and realizes high-quality holographic display, which overcomes the limited receptive field of existing CNN-based CGH algorithms and improves the displayed image quality in holographic display.
The invention is realized by the following technical scheme:
The invention relates to a high-quality hologram generating method based on an improved ViT network. An encoding-decoding architecture is constructed, the Vision Transformer network is improved for the CGH task, and the improved ViT serves as the encoding part that encodes a target image into its corresponding phase-only hologram; the decoding part simulates the free-space propagation of light through an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and the encoding part of the encoding-decoding architecture is iteratively trained by calculating a loss function between the reconstructed image and the target image; in the online stage, the trained improved Vision Transformer network generates the phase-only hologram, and a high-quality holographic display image is reconstructed by a holographic display system.
Technical effects
The invention uses a pre-trained improved Vision Transformer network to calculate the phase-only hologram of the target display image and to realize holographic display. Compared with existing methods that calculate the phase-only hologram of the target display image with a CNN, the method exploits the ability of the improved Vision Transformer network to capture global features, which improves the quality of the network-computed hologram and markedly improves the quality of the reconstructed image in holographic display.
Drawings
FIG. 1 is a diagram of a network training framework of the present invention;
FIG. 2 is a schematic diagram of an embodiment;
FIG. 3 is a schematic diagram of an optical display system;
FIG. 4 shows the results obtained with the embodiment.
Detailed Description
As shown in FIG. 1 (a), the training framework of the improved Vision Transformer network in the high-quality hologram generating method of this embodiment includes an encoding part and a decoding part. The encoding part is the improved Vision Transformer network, a U-shaped architecture consisting of four downsampling modules and the corresponding upsampling modules, wherein:
each downsampling module and its corresponding upsampling module comprise two global filtering blocks.
As shown in FIG. 1 (b), the global filtering block includes: two layer normalization units, a global filtering layer and a locally-enhanced feed-forward network (LeFF), wherein: the global filtering layer first converts the input spatial features into the frequency domain through a two-dimensional fast Fourier transform (2D FFT), filters the frequency-domain features with a learnable global filter, and then converts the frequency-domain feature map back into spatial features through a two-dimensional inverse fast Fourier transform (2D IFFT). The global filtering layer effectively enlarges the receptive field of the network and increases its operation speed, so that the quality of the holographic display image obtained from holograms computed by the trained network is markedly improved.
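The FFT, filter, inverse-FFT sequence of the global filtering layer can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function name, the channel-last layout and the use of the real-input transforms `rfft2`/`irfft2` (which keep the output real) are choices made here and are not specified in the patent; in a trained network the `weight` array would be a learned parameter.

```python
import numpy as np

def global_filter_layer(x, weight):
    """Filter a feature map in the frequency domain (illustrative sketch).

    x:      real array of shape (H, W, C), spatial features
    weight: complex array of shape (H, W // 2 + 1, C), the global filter
            (hypothetical layout; learnable in a real network)
    """
    h, w, _ = x.shape
    # 2D FFT over the two spatial axes (real-input variant)
    spectrum = np.fft.rfft2(x, axes=(0, 1))
    # Element-wise multiplication: every frequency bin gets its own weight
    spectrum = spectrum * weight
    # 2D inverse FFT back to the spatial domain
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1))

rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 4))

# An all-ones filter leaves the features unchanged.
identity = np.ones((8, 5, 4), dtype=complex)
unchanged = global_filter_layer(features, identity)

# A filter that keeps only the zero-frequency bin returns the per-channel
# mean at every pixel, so each output value depends on the whole input.
dc_only = np.zeros((8, 5, 4), dtype=complex)
dc_only[0, 0, :] = 1.0
averaged = global_filter_layer(features, dc_only)
```

The second case illustrates why such a layer has a global receptive field: a single multiplication in the frequency domain mixes information from every spatial position.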
As shown in FIG. 1 (c), in the high-image-quality holographic display method based on the ViT network, the decoding part of the network training process is an angular spectrum propagation model, which simulates the free-space propagation of light through an angular spectrum propagation algorithm to obtain a simulated reconstructed image of the hologram. The angular spectrum propagation method is:
$$u(x,y) = \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ e^{i\phi(x,y)} \right\} H(f_x,f_y) \right\}, \qquad H(f_x,f_y) = \begin{cases} e^{\,i\frac{2\pi}{\lambda} z \sqrt{1-(\lambda f_x)^2-(\lambda f_y)^2}}, & \sqrt{f_x^2+f_y^2} < \frac{1}{\lambda} \\ 0, & \text{otherwise} \end{cases}$$
wherein: $e^{i\phi(x,y)}$ is the complex amplitude distribution of the diffraction plane, $u(x,y)$ is the complex amplitude distribution of the image plane, $f_x, f_y$ are the spatial frequencies, $\lambda$ is the wavelength, and $z$ is the propagation distance. In this example, the wavelength was set to 543 nm and the propagation distance was set to 7 cm.
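A minimal NumPy sketch of angular spectrum propagation for a phase-only hologram is given below. It assumes a square pixel pitch of 3.74 µm (the value reported in the experiments), sets evanescent components to zero, and omits the band-limiting that is sometimes added for long propagation distances. The function and argument names are illustrative and do not come from the patent.

```python
import numpy as np

def angular_spectrum_propagate(phase, wavelength, z, pitch):
    """Propagate the field exp(i * phase) over a distance z (all units: metres)."""
    field = np.exp(1j * phase)              # complex amplitude in the hologram plane
    ny, nx = phase.shape
    fy = np.fft.fftfreq(ny, d=pitch)        # spatial frequencies along y
    fx = np.fft.fftfreq(nx, d=pitch)        # spatial frequencies along x
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    propagating = arg > 0                   # False for evanescent components
    kz = 2 * np.pi / wavelength * np.sqrt(np.where(propagating, arg, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * z), 0.0)
    # FFT, multiply by the transfer function, inverse FFT
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

wavelength, distance, pitch = 543e-9, 0.07, 3.74e-6

# A flat phase is a plane wave: its amplitude stays 1 after propagation.
plane = angular_spectrum_propagate(np.zeros((64, 64)), wavelength, distance, pitch)

# For a random phase pattern the total energy is preserved, because at this
# pixel pitch every sampled frequency is a propagating one.
rng = np.random.default_rng(0)
random_phase = rng.uniform(0.0, 2.0 * np.pi, (64, 64))
recon = angular_spectrum_propagate(random_phase, wavelength, distance, pitch)
recon_amplitude = np.abs(recon)             # amplitude that a loss would compare to the target
```

Because every step is a differentiable array operation, the same computation written in an automatic-differentiation framework can pass gradients from the reconstructed amplitude back to the network that produced the phase.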
The loss function employed in the training framework of the high-quality hologram generating method of this embodiment includes: mean square error (MSE), a perceptual loss function and a total variation (TV) regularization term, specifically:
$$\mathcal{L} = \mathcal{L}_{\mathrm{MSE}}\left(\hat{a}, a_{gt}\right) + \alpha\, \mathcal{L}_{\mathrm{perc}}\left(\psi(\hat{a}), \psi(a_{gt})\right) + \beta\, \mathrm{TV}(\phi)$$
wherein: $\hat{a}$ is the amplitude of the reconstructed image, $a_{gt}$ is the amplitude of the target image, $\psi(\cdot)$ is the output of each layer of a pretrained VGG network, $\mathrm{TV}(\cdot)$ represents the operation of calculating the total variation, and $\phi$ is the calculated phase hologram. $\alpha$ is the weight of the perceptual loss function and $\beta$ is the weight of the total variation regularization term. The weight of the perceptual loss function in this embodiment is set to 0.025 and the weight of the total variation regularization term is set to 0.001.
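The way the three terms combine can be sketched as follows. Several details are assumptions made here rather than taken from the patent: the perceptual term is computed as a mean squared difference between feature maps, the total variation is the anisotropic sum of absolute differences, and `toy_features` is a hypothetical stand-in for the layer outputs of a pretrained VGG network (which would normally come from a deep-learning library).

```python
import numpy as np

def total_variation(phi):
    """Anisotropic total variation: sum of absolute neighbour differences."""
    dy = np.abs(np.diff(phi, axis=0)).sum()
    dx = np.abs(np.diff(phi, axis=1)).sum()
    return dx + dy

def toy_features(a):
    """Hypothetical stand-in for VGG layer outputs: two image scales."""
    return [a, a[::2, ::2]]

def combined_loss(a_rec, a_gt, phi, features=toy_features,
                  alpha=0.025, beta=0.001):
    """MSE + alpha * perceptual term + beta * TV of the phase hologram."""
    mse = np.mean((a_rec - a_gt) ** 2)
    perceptual = sum(np.mean((fr - fg) ** 2)
                     for fr, fg in zip(features(a_rec), features(a_gt)))
    return mse + alpha * perceptual + beta * total_variation(phi)

target = np.ones((4, 4))
flat_phase = np.zeros((4, 4))

# Perfect reconstruction with a constant phase gives zero loss.
loss_perfect = combined_loss(target, target, flat_phase)

# An all-zero reconstruction: MSE = 1 and both toy feature maps differ by 1.
loss_black = combined_loss(np.zeros((4, 4)), target, flat_phase)

# A phase step between two columns contributes only through the TV term.
step_phase = np.array([[0.0, 1.0], [0.0, 1.0]])
tv_step = total_variation(step_phase)
```

With the weights of this embodiment, the MSE term dominates, while the perceptual and TV terms act as comparatively small corrections.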
Fig. 2 is a schematic diagram of the principle of the present embodiment. The target image is input into a trained modified Vision Transformer network, and the network outputs a phase-only hologram corresponding to the target image. The hologram is loaded onto the SLM of a holographic display system and a reconstructed holographic display image can be captured with an industrial camera.
As shown in FIG. 3, in the holographic display system the phase-only hologram is loaded onto a phase-only spatial light modulator (SLM); a planar light wave is modulated by the hologram and then propagates 7 cm to form the diffraction pattern, i.e. the reconstructed image; higher-order diffracted light is filtered out by a 4f system, and the reconstructed image is captured at the back focal plane of the 4f system by an industrial camera.
FIG. 4 shows the target display image, the phase-only hologram, and the holographic display image of the present embodiment.
In a practical experiment, the network was built with Python 3.8.0 and PyTorch 1.8.0 as the base environment. The DIV2K dataset, augmented by horizontal flipping and rotation (3200 images in total), was used as the training set. The image resolution is 1024 × 1024, the pixel size is set to 3.74 µm × 3.74 µm, the wavelength of the laser source is set to 543 nm, and the propagation distance is set to 7 cm. The batch size used for training is 1, the initial learning rate is 0.001, the network is trained with an AdamW optimizer with momentum parameters (0.9, 0.999) for 50 epochs, and a cosine decay strategy is used to reduce the learning rate. The training device was an NVIDIA GeForce RTX 3090 GPU.
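The cosine decay of the learning rate can be illustrated with the commonly used closed form below. The per-epoch update and the final learning rate of zero are assumptions made for this sketch; the patent states only the initial learning rate, the number of training periods and the use of cosine decay.

```python
import math

def cosine_lr(epoch, total_epochs=50, base_lr=1e-3, min_lr=0.0):
    """Learning rate at a given epoch under a cosine decay schedule."""
    progress = epoch / total_epochs          # 0.0 at the start, 1.0 at the end
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Learning rate at the start, the midpoint and the end of training
schedule = [cosine_lr(e) for e in (0, 25, 50)]
```

The rate starts at the initial value, falls to half of it at the midpoint, and approaches the minimum at the end, with the slowest change near both ends of training.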
TABLE 1
Method      GS      SGD     DPAC    U-Net   The invention
PSNR (dB)   22.53   32.12   26.32   22.76   32.41
SSIM        0.653   0.945   0.895   0.739   0.946
Time (s)    1.127   12.96   0.001   0.006   0.132
Table 1 compares the method with the prior art on fifty randomly selected test target images, using peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as the indexes for measuring display image quality. The PSNR and SSIM of the simulated display images are 32.41 dB and 0.946 respectively, which are 9.65 dB and 0.207 higher than those of the CNN-based U-Net method. Compared with the state-of-the-art (SOTA) iterative method SGD run for 500 iterations, the method achieves comparable PSNR and SSIM while generating a hologram about 98 times faster (0.132 s versus 12.96 s).
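The figures quoted from Table 1 can be checked with a few lines of arithmetic. The dictionary layout and the helper `psnr_from_mse` are illustrative; the helper uses the standard PSNR definition and assumes amplitudes normalized to a peak value of 1.

```python
import math

# (PSNR in dB, SSIM, time in seconds) copied from Table 1
table = {
    "GS": (22.53, 0.653, 1.127),
    "SGD": (32.12, 0.945, 12.96),
    "DPAC": (26.32, 0.895, 0.001),
    "U-Net": (22.76, 0.739, 0.006),
    "invention": (32.41, 0.946, 0.132),
}

psnr_gain = table["invention"][0] - table["U-Net"][0]   # gain over U-Net in dB
ssim_gain = table["invention"][1] - table["U-Net"][1]   # gain over U-Net
speedup = table["SGD"][2] / table["invention"][2]        # time ratio versus SGD

def psnr_from_mse(mse, peak=1.0):
    """Standard PSNR definition: 10 * log10(peak^2 / MSE)."""
    return 10.0 * math.log10(peak ** 2 / mse)

print(round(psnr_gain, 2), round(ssim_gain, 3), round(speedup, 1))
```

The time ratio works out to roughly 98, which is the factor quoted for the comparison with SGD.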
The foregoing embodiments may be partially modified in numerous ways by those skilled in the art without departing from the principles and spirit of the invention, the scope of which is defined in the claims and not by the foregoing embodiments, and all such implementations are within the scope of the invention.

Claims (6)

1. A high quality hologram generating method based on a modified ViT network, characterized in that a target image is encoded into its corresponding hologram by constructing an encoding-decoding architecture and using a modified ViT as an encoding part; simulating free space propagation of light in the decoding part through an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and performing iterative training on the encoding part of the encoding-decoding architecture by calculating a loss function between the reconstructed image and the target image; and reconstructing a high-quality holographic display image by using the pure phase hologram generated by the trained encoding-decoding architecture through a holographic display system in an online stage.
2. The improved ViT network-based high quality hologram generating method of claim 1 wherein said improved Vision Transformer network comprises: the U-shaped framework consists of four downsampling modules and corresponding upsampling modules, wherein: each downsampling module and the corresponding upsampling module comprise two global filtering blocks.
3. The method for generating a high quality hologram based on a modified ViT network as claimed in claim 2, wherein said global filtering block comprises: two layer normalization units, a global filtering layer and a local enhanced feed forward network (LeFF), wherein: the global filtering layer firstly converts the input spatial features into frequency domains through two-dimensional fast Fourier transform (2D FFT), filters the frequency domain features through a learnable global filter, and then converts the frequency domain feature map back into the spatial features through two-dimensional inverse Fourier transform (2D IFFT).
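4. The improved ViT network-based high-image-quality holographic display method of claim 1, wherein said angular spectrum propagation method is: $$u(x,y) = \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ e^{i\phi(x,y)} \right\} H(f_x,f_y) \right\}, \qquad H(f_x,f_y) = \begin{cases} e^{\,i\frac{2\pi}{\lambda} z \sqrt{1-(\lambda f_x)^2-(\lambda f_y)^2}}, & \sqrt{f_x^2+f_y^2} < \frac{1}{\lambda} \\ 0, & \text{otherwise} \end{cases}$$ wherein: $e^{i\phi(x,y)}$ is the complex amplitude distribution of the diffraction plane, $u(x,y)$ is the complex amplitude distribution of the image plane, $f_x, f_y$ are the spatial frequencies, $\lambda$ is the wavelength, and $z$ is the propagation distance.
5. The method for generating a high quality hologram based on an improved ViT network as claimed in claim 1, wherein said loss function comprises: mean square error (MSE), a perceptual loss function and a total variation (TV) regularization term, specifically: $$\mathcal{L} = \mathcal{L}_{\mathrm{MSE}}\left(\hat{a}, a_{gt}\right) + \alpha\, \mathcal{L}_{\mathrm{perc}}\left(\psi(\hat{a}), \psi(a_{gt})\right) + \beta\, \mathrm{TV}(\phi)$$ wherein: $\hat{a}$ is the amplitude of the reconstructed image, $a_{gt}$ is the amplitude of the target image, $\psi(\cdot)$ is the output of each layer of a pretrained VGG network, $\mathrm{TV}(\cdot)$ represents the operation of computing the total variation, $\phi$ is the computed phase hologram, $\alpha$ is the weight of the perceptual loss function, and $\beta$ is the weight of the total variation regularization term.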
6. The method for holographic display of high image quality based on ViT network of claim 1, wherein said optical display system comprises, arranged in sequence: a laser source, a beam expander, a linear polarizer, a beam splitter (half-transmitting, half-reflecting mirror) configured with the SLM carrying the phase-only hologram, a first lens, an imaging aperture, a second lens and an industrial camera.
CN202310665894.5A 2023-06-07 2023-06-07 High-quality hologram generating method based on improved ViT network Pending CN116645432A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310665894.5A CN116645432A (en) 2023-06-07 2023-06-07 High-quality hologram generating method based on improved ViT network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310665894.5A CN116645432A (en) 2023-06-07 2023-06-07 High-quality hologram generating method based on improved ViT network

Publications (1)

Publication Number Publication Date
CN116645432A (en) 2023-08-25

Family

ID=87639781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310665894.5A Pending CN116645432A (en) 2023-06-07 2023-06-07 High-quality hologram generating method based on improved ViT network

Country Status (1)

Country Link
CN (1) CN116645432A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118071866A (en) * 2024-04-18 2024-05-24 南昌大学 Sparse digital holographic image reconstruction method

Similar Documents

Publication Publication Date Title
Shi et al. Towards real-time photorealistic 3D holography with deep neural networks
Tricoles Computer generated holograms: an historical review
Bernardo et al. Holographic representation: Hologram plane vs. object plane
Liu et al. 4K-DMDNet: diffraction model-driven network for 4K computer-generated holography
Blinder et al. The state-of-the-art in computer generated holography for 3D display
Chen et al. Computer generated hologram with geometric occlusion using GPU-accelerated depth buffer rasterization for three-dimensional display
CN116645432A (en) High-quality hologram generating method based on improved ViT network
US20230205133A1 (en) Real-time Photorealistic 3D Holography With Deep Neural Networks
CN114387395A (en) Phase-double resolution ratio network-based quick hologram generation method
Ishii et al. Optimization of phase-only holograms calculated with scaled diffraction calculation through deep neural networks
CN117876591A (en) Real fuzzy three-dimensional hologram reconstruction method for combined training of multiple neural networks
TW202236210A (en) Totagraphy: coherent diffractive/digital information reconstruction by iterative phase recovery using special masks
Li et al. Speckle noise suppression algorithm of holographic display based on spatial light modulator
Shiomi et al. Fast hologram calculation method using wavelet transform: WASABI-2
Liao et al. Scattering imaging as a noise removal in digital holography by using deep learning
CN113658330B (en) Holographic encoding method based on neural network
CN115797231A (en) Real-time hologram generation method based on neural network of Fourier inspiration
CN115690252A (en) Hologram reconstruction method and system based on convolutional neural network
Manisha et al. Randomness assisted in-line holography with deep learning
Yan et al. Generating Multi‐Depth 3D Holograms Using a Fully Convolutional Neural Network
Dong et al. Vision transformer-based, high-fidelity, computer-generated holography
Agour et al. Speckle reduction in holographic projection using temporal-multiplexing of spatial frequencies
Bo Deep learning approach for computer-generated holography
CN114764220B (en) Method for improving speckle autocorrelation reconstruction effect based on off-axis digital holography
Chen et al. Real-time hologram generation using a non-iterative modified Gerchberg-Saxton algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination