CN116645432A - High-quality hologram generating method based on improved ViT network - Google Patents
- Publication number
- CN116645432A
- Authority
- CN
- China
- Prior art keywords
- network
- hologram
- image
- vit
- encoding
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/10—Image enhancement or restoration using non-spatial domain filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20056—Discrete and fast Fourier transform, [DFT, FFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Multimedia (AREA)
- Holography (AREA)
Abstract
A high-quality hologram generating method based on an improved ViT network. An encoding-decoding architecture is constructed in which the improved Vision Transformer network serves as the encoding part and encodes the target image into its corresponding hologram. In the decoding part, free-space propagation of light is simulated with an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and the encoding part of the encoding-decoding architecture is trained iteratively by calculating a loss function between the reconstructed image and the target image. In the online stage, the phase-only hologram generated by the trained encoding-decoding architecture is used to reconstruct a high-quality holographic display image through a holographic display system. By improving the Vision Transformer network so that it attends to the global information of the target image, the invention generates higher-quality holograms and achieves holographic display.
Description
Technical Field
The invention relates to a technology in the field of image processing, in particular to a high-quality hologram generating method based on an improved ViT (Vision Transformer) network.
Background
Existing deep neural network (DNN)-based computer-generated hologram (CGH) algorithms calculate holograms by training one or more convolutional neural networks (CNNs) and are applied to holographic display systems. They shorten the time needed to calculate high-quality holograms, but their display quality is lower than that of conventional, time-consuming iterative algorithms. One important reason is that the diffraction of light waves is a cross-domain process from the spatial domain to the frequency domain and has global characteristics, whereas CNNs typically use local convolution operations with limited receptive fields, so it is difficult for them to learn the cross-domain mapping from a target image (spatial domain) to a hologram (frequency domain).
Disclosure of Invention
To address the relatively low display quality of holograms generated by conventional CNN-based computer-generated holography, the invention provides a high-quality hologram generating method based on an improved ViT network. The improved Vision Transformer network attends to the global information of the target image, generates a higher-quality hologram and realizes high-quality holographic display. This overcomes the limited receptive field of conventional CNN-based CGH algorithms and improves the display image quality in holographic display.
The invention is realized by the following technical scheme:
The invention relates to a high-quality hologram generating method based on an improved ViT network. An encoding-decoding architecture is constructed, the Vision Transformer network is improved for the CGH task, and the improved ViT serves as the encoding part that encodes a target image into the corresponding phase-only hologram. In the decoding part, free-space propagation of light is simulated with an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and the encoding part of the encoding-decoding architecture is trained iteratively by calculating a loss function between the reconstructed image and the target image. In the online stage, the trained improved Vision Transformer network generates the phase-only hologram, and a high-quality holographic display image is reconstructed through a holographic display system.
Technical effects
The invention uses a pre-trained improved Vision Transformer network to calculate the phase-only hologram of the target display image and to achieve holographic display. Compared with existing methods that calculate the phase-only hologram of the target display image with a CNN, the method exploits the ability of the improved Vision Transformer network to capture global features, which improves the quality of the hologram calculated by the network and markedly improves the quality of the reconstructed image in holographic display.
Drawings
FIG. 1 is a diagram of a network training framework of the present invention;
FIG. 2 is a schematic diagram of an embodiment;
FIG. 3 is a schematic diagram of an optical display system;
FIG. 4 is a diagram of the results of the embodiment.
Detailed Description
As shown in FIG. 1 (a), in the present embodiment the network training framework of the high-quality hologram generating method based on the improved ViT network includes an encoding part and a decoding part. The encoding part is the improved Vision Transformer network, a U-shaped architecture consisting of four downsampling modules and the corresponding upsampling modules, wherein:
each downsampling module and the corresponding upsampling module comprise two global filtering blocks.
As shown in FIG. 1 (b), the global filtering block includes: two layer normalization units, a global filtering layer and a locally-enhanced feed-forward network (LeFF), wherein: the global filtering layer first converts the input spatial features into the frequency domain through a two-dimensional fast Fourier transform (2D FFT), filters the frequency-domain features with a learnable global filter, and then converts the frequency-domain feature map back into spatial features through a two-dimensional inverse fast Fourier transform (2D IFFT). The global filtering layer enlarges the receptive field of the network and speeds up computation, and the quality of the holographic display image obtained from the hologram produced by the trained network is markedly improved.
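The FFT, filter, IFFT sequence of the global filtering layer can be illustrated with a minimal NumPy sketch. It is not the network's actual implementation: the function name `global_filter`, the channel-last layout, and the use of the real-input FFT pair (`rfft2`/`irfft2`, which keeps the output real) are assumptions made for illustration, and the filter here is a fixed array rather than a trained parameter.

```python
import numpy as np

def global_filter(x, weight):
    """Filter a spatial feature map in the frequency domain.

    x:      real array of shape (H, W, C), spatial features (assumed layout)
    weight: complex array of shape (H, W // 2 + 1, C), standing in for the
            learnable global filter (a trained parameter in a real network)
    """
    h, w, _ = x.shape
    spectrum = np.fft.rfft2(x, axes=(0, 1))        # 2D FFT: spatial -> frequency
    spectrum = spectrum * weight                   # element-wise global filtering
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1))  # 2D IFFT: back to spatial

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 4))
all_pass = np.ones((8, 8 // 2 + 1, 4), dtype=complex)  # filter that changes nothing
out = global_filter(feat, all_pass)
```

Because each frequency coefficient depends on every spatial position, a single multiplication in the frequency domain mixes information across the whole feature map, which is the sense in which the receptive field is global.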
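As shown in FIG. 1 (c), in the high-quality holographic display method based on the ViT network, the decoding part of the network training process is an angular spectrum propagation model: free-space propagation of light is simulated with the angular spectrum propagation algorithm to obtain the reconstructed image of the simulated hologram. The angular spectrum propagation method, in its standard form, is:

$$u(x,y) = \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ e^{i\phi(x,y)} \right\} \cdot H(f_x, f_y) \right\}, \qquad H(f_x, f_y) = \begin{cases} \exp\left( i\,\dfrac{2\pi}{\lambda}\, z \sqrt{1 - (\lambda f_x)^2 - (\lambda f_y)^2} \right), & \sqrt{f_x^2 + f_y^2} < \dfrac{1}{\lambda} \\ 0, & \text{otherwise} \end{cases}$$

wherein: $e^{i\phi(x,y)}$ is the complex amplitude distribution of the diffraction plane, $u(x,y)$ is the complex amplitude distribution of the image plane, $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the two-dimensional Fourier transform and its inverse, $f_x, f_y$ are the spatial frequencies, $\lambda$ is the wavelength, and $z$ is the propagation distance. In this example, the wavelength is set to 543 nm and the propagation distance is set to 7 cm.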
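As an illustration of the decoding part, the following NumPy sketch propagates a phase-only field with the standard band-limited angular spectrum transfer function. It is a sketch under stated assumptions rather than the embodiment's code: the function name `angular_spectrum_propagate` and the 64 × 64 test size are hypothetical, while the wavelength (543 nm), distance (7 cm) and pixel size (3.74 μm) are the values given in the embodiment.

```python
import numpy as np

def angular_spectrum_propagate(phase, wavelength, z, pitch):
    """Propagate the field exp(i*phase) over distance z (lengths in metres)."""
    ny, nx = phase.shape
    fx = np.fft.fftfreq(nx, d=pitch)               # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pitch)               # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    propagating = arg > 0                          # evanescent components are dropped
    kz = 2 * np.pi / wavelength * np.sqrt(np.where(propagating, arg, 0.0))
    H = np.where(propagating, np.exp(1j * kz * z), 0.0)   # transfer function
    field = np.exp(1j * phase)                     # phase-only hologram plane
    return np.fft.ifft2(np.fft.fft2(field) * H)    # image-plane complex amplitude

# A flat phase is a plane wave, so the reconstructed amplitude stays uniform.
u = angular_spectrum_propagate(np.zeros((64, 64)), 543e-9, 0.07, 3.74e-6)
amplitude = np.abs(u)
```

Every step is differentiable, which is what allows a loss computed on the reconstructed amplitude to be back-propagated to the encoding network during training.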
The loss function employed in the training framework of the high-quality hologram generating method based on the improved ViT network in the present embodiment includes a mean square error (MSE) term, a perceptual loss function and a total variation (TV) regularization term, specifically:

$$\mathcal{L} = \mathcal{L}_{\mathrm{MSE}}\left(\hat{a}, a_{gt}\right) + \alpha\, \mathcal{L}_{\mathrm{perc}}\left(P(\hat{a}), P(a_{gt})\right) + \beta\, \mathrm{TV}(\phi)$$

wherein: $\hat{a}$ is the amplitude of the reconstructed image, $a_{gt}$ is the amplitude of the target image, $P(\cdot)$ is the output of each layer of a pretrained VGG network, $\mathrm{TV}(\cdot)$ represents the operation of calculating the total variation, and $\phi$ is the calculated phase hologram. $\alpha$ is the weight of the perceptual loss function and $\beta$ is the weight of the total variation regularization term. In this embodiment, the weight of the perceptual loss function is set to 0.025 and the weight of the total variation regularization term is set to 0.001.
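The way the three terms combine can be sketched in NumPy as follows. Several parts are assumptions made only for illustration: `features` is a hypothetical stand-in that returns image gradients instead of real pretrained VGG activations, the perceptual distance is taken as a mean squared difference, and the total variation is taken as the mean absolute difference between neighbouring pixels. Only the weights 0.025 and 0.001 come from the embodiment.

```python
import numpy as np

def total_variation(phi):
    """Assumed TV form: mean absolute difference between neighbouring pixels."""
    return np.abs(np.diff(phi, axis=1)).mean() + np.abs(np.diff(phi, axis=0)).mean()

def features(a):
    """Hypothetical stand-in for pretrained VGG layer outputs (NOT a real VGG)."""
    return [np.diff(a, axis=1), np.diff(a, axis=0)]

def hologram_loss(a_rec, a_gt, phi, alpha=0.025, beta=0.001):
    """MSE + alpha * perceptual term + beta * total variation of the phase."""
    mse = np.mean((a_rec - a_gt) ** 2)
    perceptual = sum(np.mean((f - g) ** 2)
                     for f, g in zip(features(a_rec), features(a_gt)))
    return mse + alpha * perceptual + beta * total_variation(phi)

target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
flat_phase = np.zeros((4, 4))
perfect = hologram_loss(target, target, flat_phase)        # all three terms vanish
offset = hologram_loss(target + 0.1, target, flat_phase)   # only the MSE term remains
```

In an actual training loop the same combination would be written with an automatic-differentiation framework so that the gradient reaches the encoder weights.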
Fig. 2 is a schematic diagram of the principle of the present embodiment. The target image is input into a trained modified Vision Transformer network, and the network outputs a phase-only hologram corresponding to the target image. The hologram is loaded onto the SLM of a holographic display system and a reconstructed holographic display image can be captured with an industrial camera.
As shown in FIG. 3, in the holographic display system the phase-only hologram is loaded onto a phase spatial light modulator (SLM). A plane light wave is modulated by the hologram and, after propagating 7 cm, forms the diffraction pattern, that is, the reconstructed image. High-order diffracted light is filtered out by a 4f system, and the reconstructed image is captured at the back focal plane of the 4f system by an industrial camera.
FIG. 4 shows the target display image, the phase-only hologram and the holographic display image of the present embodiment.
In specific practical experiments, the network was built with Python 3.8.0 and PyTorch 1.8.0 as the base environment. The DIV2K data set, augmented by horizontal flipping and rotation (3200 images in total), was used as the training set. The image resolution is 1024 × 1024, the pixel size is 3.74 μm × 3.74 μm, the wavelength of the laser source is 543 nm, and the propagation distance is 7 cm. The network was trained with a batch size of 1, an initial learning rate of 0.001 and an AdamW optimizer with momentum parameters (0.9, 0.999) for 50 epochs, with a cosine decay strategy used to reduce the learning rate. Training was performed on an NVIDIA GeForce RTX 3090 GPU.
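The cosine decay of the learning rate can be sketched in plain Python as below. The initial rate (0.001) and the 50 training epochs are taken from the embodiment; the function name `cosine_lr`, the per-epoch update and the final rate of zero are assumptions, since the text does not state them.

```python
import math

def cosine_lr(epoch, total_epochs=50, lr0=1e-3, lr_min=0.0):
    """Cosine decay from lr0 at epoch 0 down to lr_min at total_epochs."""
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr0 - lr_min) * cosine

# Learning rate at the start of each epoch, plus the end point.
schedule = [cosine_lr(e) for e in range(51)]
```

The rate falls slowly at first, fastest around the midpoint (where it is half the initial value) and slowly again near the end.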
TABLE 1
Method | GS | SGD | DPAC | U-Net | Proposed method |
---|---|---|---|---|---|
PSNR (dB) | 22.53 | 32.12 | 26.32 | 22.76 | 32.41 |
SSIM | 0.653 | 0.945 | 0.895 | 0.739 | 0.946 |
Time (s) | 1.127 | 12.96 | 0.001 | 0.006 | 0.132 |
As shown in Table 1, the method is compared with the prior art on fifty randomly selected test target images, with peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as the measures of display image quality. The PSNR and SSIM of the simulated display images are 32.41 dB and 0.946 respectively, improvements of 9.65 dB and 0.207 over the CNN-based U-Net method. These values are close to those of the state-of-the-art iterative method SGD run for 500 iterations (32.12 dB and 0.945), while the hologram generation time is about 98 times shorter (0.132 s versus 12.96 s).
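The comparison figures quoted from Table 1 can be checked with a short Python sketch, which also shows how PSNR is conventionally computed. The `psnr` helper, its flat-list inputs and the peak value of 1.0 are assumptions for illustration; the numbers in `table` are copied from Table 1.

```python
import math

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

# (PSNR in dB, SSIM, time in s) copied from Table 1
table = {
    "U-Net": (22.76, 0.739, 0.006),
    "SGD": (32.12, 0.945, 12.96),
    "proposed": (32.41, 0.946, 0.132),
}
psnr_gain = table["proposed"][0] - table["U-Net"][0]   # gain over U-Net in dB
ssim_gain = table["proposed"][1] - table["U-Net"][1]   # gain over U-Net in SSIM
speedup = table["SGD"][2] / table["proposed"][2]       # SGD time / proposed time
```

The ratio of generation times, 12.96 s / 0.132 s, is roughly 98, which is the factor referred to in the comparison with SGD.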
The foregoing embodiments may be partially modified in numerous ways by those skilled in the art without departing from the principles and spirit of the invention, the scope of which is defined in the claims and not by the foregoing embodiments, and all such implementations are within the scope of the invention.
Claims (6)
1. A high quality hologram generating method based on a modified ViT network, characterized in that a target image is encoded into its corresponding hologram by constructing an encoding-decoding architecture and using a modified ViT as an encoding part; simulating free space propagation of light in the decoding part through an angular spectrum propagation algorithm to obtain a reconstructed image of the hologram, and performing iterative training on the encoding part of the encoding-decoding architecture by calculating a loss function between the reconstructed image and the target image; and reconstructing a high-quality holographic display image by using the pure phase hologram generated by the trained encoding-decoding architecture through a holographic display system in an online stage.
2. The improved ViT network-based high quality hologram generating method of claim 1 wherein said improved Vision Transformer network comprises: the U-shaped framework consists of four downsampling modules and corresponding upsampling modules, wherein: each downsampling module and the corresponding upsampling module comprise two global filtering blocks.
3. The method for generating a high quality hologram based on a modified ViT network as claimed in claim 2, wherein said global filtering block comprises: two layer normalization units, a global filtering layer and a locally-enhanced feed-forward network (LeFF), wherein: the global filtering layer first converts the input spatial features into the frequency domain through a two-dimensional fast Fourier transform (2D FFT), filters the frequency-domain features with a learnable global filter, and then converts the frequency-domain feature map back into spatial features through a two-dimensional inverse fast Fourier transform (2D IFFT).
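4. The improved ViT network-based high-image-quality holographic display method of claim 1, wherein said angular spectrum propagation method, in its standard form, is:

$$u(x,y) = \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ e^{i\phi(x,y)} \right\} \cdot H(f_x, f_y) \right\}, \qquad H(f_x, f_y) = \begin{cases} \exp\left( i\,\dfrac{2\pi}{\lambda}\, z \sqrt{1 - (\lambda f_x)^2 - (\lambda f_y)^2} \right), & \sqrt{f_x^2 + f_y^2} < \dfrac{1}{\lambda} \\ 0, & \text{otherwise} \end{cases}$$

wherein: $e^{i\phi(x,y)}$ is the complex amplitude distribution of the diffraction plane, $u(x,y)$ is the complex amplitude distribution of the image plane, $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the two-dimensional Fourier transform and its inverse, $f_x, f_y$ are the spatial frequencies, $\lambda$ is the wavelength, and $z$ is the propagation distance.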
5. The method for generating a high quality hologram based on an improved ViT network as claimed in claim 1, wherein said loss function comprises: a mean square error (MSE) term, a perceptual loss function and a total variation (TV) regularization term, specifically:

$$\mathcal{L} = \mathcal{L}_{\mathrm{MSE}}\left(\hat{a}, a_{gt}\right) + \alpha\, \mathcal{L}_{\mathrm{perc}}\left(P(\hat{a}), P(a_{gt})\right) + \beta\, \mathrm{TV}(\phi)$$

wherein: $\hat{a}$ is the amplitude of the reconstructed image, $a_{gt}$ is the amplitude of the target image, $P(\cdot)$ is the output of each layer of a pretrained VGG network, $\mathrm{TV}(\cdot)$ represents the operation of computing the total variation, $\phi$ is the computed phase hologram, $\alpha$ is the weight of the perceptual loss function, and $\beta$ is the weight of the total variation regularization term.
6. The method for holographic display of high image quality based on ViT network of claim 1, wherein said optical display system comprises, arranged in sequence: a laser source, a beam expander, a linear polarizer, a beam splitter configured with the SLM carrying the phase-only hologram, a first lens, an aperture, a second lens and an industrial camera.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310665894.5A CN116645432A (en) | 2023-06-07 | 2023-06-07 | High-quality hologram generating method based on improved ViT network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310665894.5A CN116645432A (en) | 2023-06-07 | 2023-06-07 | High-quality hologram generating method based on improved ViT network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116645432A (en) | 2023-08-25 |
Family
ID=87639781
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310665894.5A Pending CN116645432A (en) | 2023-06-07 | 2023-06-07 | High-quality hologram generating method based on improved ViT network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116645432A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118071866A (en) * | 2024-04-18 | 2024-05-24 | 南昌大学 | Sparse digital holographic image reconstruction method |
-
2023
- 2023-06-07 CN CN202310665894.5A patent/CN116645432A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||