CN118135397A - Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding - Google Patents

Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding

Info

Publication number
CN118135397A
CN118135397A (application CN202410237404.6A)
Authority
CN
China
Prior art keywords
visible light
residual block
hyperspectral
sample
diffraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410237404.6A
Other languages
Chinese (zh)
Inventor
于振明
马靖越
程黎明
狄珈羽
林亮
徐坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202410237404.6A priority Critical patent/CN118135397A/en
Publication of CN118135397A publication Critical patent/CN118135397A/en
Pending legal-status Critical Current


Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/194: Terrestrial scenes using hyperspectral data, i.e. more or other wavelengths than RGB

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Computational Linguistics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention provides a diffraction-coding-based visible light hyperspectral compression acquisition chip parameter training and acquisition method and system. In the training process, the deviation is calculated based on the reconstructed visible light hyperspectrum and the original incident sample visible light hyperspectrum, and the height map distribution of the diffraction optical element and the parameters of the U-Net neural network are updated and iterated simultaneously by minimizing the deviation, so that the detection efficiency and the detection precision of the visible light hyperspectrum are improved.

Description

Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding
Technical Field
The invention relates to the technical field of optoelectronics, in particular to a method and a system for training parameters of, and acquiring with, a visible light hyperspectral compression acquisition chip based on diffraction coding.
Background
Hyperspectral imaging techniques provide rich spectral information about the acquired object, describing the fundamental information composition of a scene along the wavelength dimension of light. Spectral information has various applications in fields such as remote sensing, agriculture and medical imaging. However, owing to the limitations of hardware system devices, conventional hyperspectral imaging requires scanning to obtain accurate measurement results, and non-scanning infrared hyperspectral imaging systems generally suffer from high noise and high cost.
Driven by the rapid development of image sensors and reconstruction algorithms, snapshot hyperspectral imaging has advanced quickly, making it possible to obtain high-resolution, high-precision spectral information from a single exposure. The present invention aims to provide such a hyperspectral imaging scheme for visible light.
Disclosure of Invention
In view of this, the embodiments of the invention provide a method and a system for training parameters of, and acquiring with, a visible light hyperspectral compression acquisition chip based on diffraction coding, so as to eliminate or mitigate one or more defects in the prior art, in particular the prior-art problems of repeated scanning, low precision and high noise.
The invention provides a diffraction-coding-based visible light hyperspectral compression acquisition chip and acquisition model parameter training method, which comprises the following steps:
Acquiring a training sample set, wherein the training sample set comprises a plurality of sample data, and the sample data is a sample visible light hyperspectrum;
Taking a diffractive optical element (DOE) as the visible light hyperspectral compression acquisition chip, constructing a height map distribution of the DOE, and calculating, based on Fresnel diffraction theory, the sample image data obtained through the spectral response on an image sensor after the sample visible spectrum is modulated by the diffractive optical element according to the height map distribution;
Acquiring an initial neural network model, wherein the initial neural network model is of a U-Net model structure, the initial neural network model takes the sample image data as input, a residual block is used for extracting feature maps in an encoding stage, a maximum pooling layer is used for downsampling, an upsampling layer and convolution are used for upsampling the encoded feature maps in a decoding stage, the feature maps in the decoding stage are cascaded with the feature maps in the encoding stage, and the initial neural network model outputs a reconstructed visible light hyperspectrum of the sample data;
And training the height map distribution and the initial neural network by adopting the training sample set, taking the deviation of the sample visible light hyperspectral and the reconstructed visible light hyperspectral as loss, and carrying out iterative updating on the height map distribution and the initial neural network by minimizing the loss to obtain a target height map distribution and a target hyperspectral acquisition model.
In some embodiments, the method further comprises:
Setting polar coordinates at the central position of the diffraction optical element, and uniformly dividing the diffraction optical element into a first set number of lobe areas within the angle range of 2 pi;
A second set number of distance intervals are divided from the central position to the periphery in each lobe area, and the heights of the diffractive optical elements in each distance interval are consistent.
In some embodiments, the lobe regions are divided into 8 along the angles π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 and 2π.
In some embodiments, constructing the height map distribution of the diffractive optical element and calculating, based on Fresnel diffraction theory, the sample image data obtained through the spectral response on the image sensor after the sample visible spectrum is modulated by the diffractive optical element according to the height map distribution includes:
Based on Fresnel diffraction theory, modeling the sample visible light hyperspectrum as:

U₀(x, y, λ) = exp( i · (2π/λ) · √(x² + y² + z²) )

Wherein (x, y) represents a position coordinate on the diffractive optical element, λ represents the wavelength of the sample visible light, i represents the imaginary unit, and z represents the depth of the sample visible light hyperspectral incident position from the plane of the diffractive optical element;
After the sample visible spectrum is modulated by the diffractive optical element, it is expressed as:

U₁(x, y, λ) = U₀(x, y, λ) · exp(i ΔΦₕ),  ΔΦₕ = 2π · Δη_λ · Δh(r) / λ

Wherein U₁(x, y, λ) represents the sample visible spectrum modulated by the diffractive optical element, ΔΦₕ represents the phase retardation, Δη_λ represents the difference between the refractive index of the material and that of air, Δh(r) represents the height map distribution, and λ represents the wavelength of the sample visible light;
Light reaching the image sensor is modeled as:

U₂(x, y, λ) = F⁻¹{ F{U₁(x, y, λ)} · exp[ i · (2πd/λ) · √(1 − (λ·f_x)² − (λ·f_y)²) ] }

Wherein U₂(x, y, λ) represents the light reaching the image sensor, F represents the Fourier transform and F⁻¹ its inverse, d represents the distance from the diffractive optical element to the image sensor plane, λ represents the wavelength of the sample visible light, f_x represents the frequency variable of x, and f_y represents the frequency variable of y;
The point spread function formed on the image sensor is expressed as:

P(x, y, λ) ∝ |U₂(x, y, λ)|²
The sample image data on the image sensor is expressed as:

I′(x, y, λ) = P(x, y, λ) ⊛ I(x, y, λ)

I(x, y) = ∫_{λ₀}^{λ₁} R(λ) · I′(x, y, λ) dλ + η

Wherein I′(x, y, λ) represents the sample hyperspectral data modulated via the point spread function; P(x, y, λ) represents the point spread function; I(x, y, λ) represents the sample hyperspectral data; ⊛ represents convolution; I(x, y) represents the sample image data; λ₀ represents the lower limit of the hyperspectral wavelength to be collected; λ₁ represents the upper limit of the hyperspectral wavelength to be collected; R(λ) represents the response curve of the image sensor to different wavelengths; dλ represents the differential of the sample visible light wavelength; and η represents noise.
In some embodiments, in the U-Net model structure of the initial neural network model, the encoding stage includes a first residual block, a second residual block, a third residual block, a fourth residual block, a fifth residual block, a sixth residual block, and a seventh residual block connected in sequence, and the decoding stage includes an eighth residual block, a ninth residual block, a tenth residual block, an eleventh residual block, a twelfth residual block, a thirteenth residual block, a first convolution layer, and a first activation function layer connected in sequence;
The feature map output by the first residual block is connected to the feature map input by the thirteenth residual block in a jumping manner, the feature map output by the second residual block is connected to the feature map input by the twelfth residual block in a jumping manner, the feature map output by the third residual block is connected to the feature map input by the eleventh residual block in a jumping manner, the feature map output by the fourth residual block is connected to the feature map input by the tenth residual block in a jumping manner, the feature map output by the fifth residual block is connected to the feature map input by the ninth residual block in a jumping manner, and the feature map output by the sixth residual block is connected to the feature map input by the eighth residual block in a jumping manner.
In some embodiments, the first activation function layer is a sigmoid activation function layer;
The first residual block, the second residual block, the third residual block, the fourth residual block, the fifth residual block, the sixth residual block, the seventh residual block, the eighth residual block, the ninth residual block, the tenth residual block, the eleventh residual block, the twelfth residual block and the thirteenth residual block each have a structure of a second convolution layer, a third convolution layer, a first batch normalization layer, a second activation function layer, a fourth convolution layer, a second batch normalization layer and a third activation function layer which are sequentially connected, and the output of the second convolution layer and the output of the second batch normalization layer are added together and input into the third activation function layer;
Wherein the second activation function layer adopts an exponential linear unit.
In some embodiments, taking the deviation of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss comprises:
calculating the mean absolute error of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss.
On the other hand, the invention also provides a visible light hyperspectral acquisition method, which comprises the following steps:
obtaining visible light to be detected;
modulating the visible light to be detected through a visible light hyperspectral compression acquisition chip, and then obtaining the image data of the visible light to be detected through spectral response on a preset image sensor, wherein the visible light hyperspectral compression acquisition chip adopts the target height map distribution in the visible light hyperspectral compression acquisition chip based on diffraction coding and the acquisition model parameter training method;
And processing the image data through the diffraction encoding-based visible light hyperspectral compression acquisition chip and a target hyperspectral acquisition model in the acquisition model parameter training method to obtain the visible light hyperspectrum of the visible light to be detected.
In another aspect, the present invention further provides a visible light hyperspectral collection system, including:
The visible light hyperspectral compression acquisition chip adopts the target height map distribution in the visible light hyperspectral compression acquisition chip based on diffraction codes and the acquisition model parameter training method; the visible light hyperspectral compression acquisition chip is used for modulating visible light;
An image sensor for generating image data in response to the modulated visible light;
And the processor is loaded with the visible light hyperspectral compression acquisition chip based on the diffraction codes and a target hyperspectral acquisition model in the acquisition model parameter training method and is used for generating the hyperspectrum of the visible light according to the image data.
In another aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program/instruction which when executed by a processor performs the steps of the above method.
The invention has the advantages that:
According to the diffraction coding-based visible light hyperspectral compression acquisition chip parameter training and acquisition method and system, incident visible light is modulated and imaged based on a diffraction optical element, space and spectrum information of a natural scene are captured at the same time through single exposure, and a U-Net neural network is introduced to process an imaged image and reconstruct the imaged image to obtain the visible light hyperspectrum. In the training process, the deviation is calculated based on the reconstructed visible light hyperspectrum and the original incident sample visible light hyperspectrum, and the height map distribution of the diffraction optical element and the parameters of the U-Net neural network are updated and iterated simultaneously by minimizing the deviation, so that the detection efficiency and the detection precision of the visible light hyperspectrum are improved.
Furthermore, by improving the structure of the U-Net neural network, a multi-stage residual block consisting of a 2D convolution kernel, an exponential linear unit and a batch normalization layer is introduced to encode and decode, and a feature map obtained by downsampling in the encoding stage is cascaded to a feature map in the decoding stage to perform upsampling, so that the detection precision of the visible light hyperspectrum is improved.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present invention will be more clearly understood from the following detailed description.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate and together with the description serve to explain the application. In the drawings:
Fig. 1 is a flow chart of a visible light hyperspectral collection model and a corresponding diffraction optical element height map distribution training method according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a connection structure between a diffraction optical element and an image sensor in a visible light hyperspectral collection system according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a diffraction optical element structure used in a visible light hyperspectral collection model and a corresponding diffraction optical element height map distribution training method according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a structure of a U-Net model used in a method for training the height map distribution of a visible hyperspectral collection model and a corresponding diffractive optical element according to an embodiment of the present invention.
Fig. 5 is a diagram showing a comparison between a true value and a simulated value in a simulation test of visible light by the visible light hyperspectral collection system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments and the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. The exemplary embodiments of the present invention and the descriptions thereof are used herein to explain the present invention, but are not intended to limit the invention.
It should be noted here that, in order to avoid obscuring the present invention due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, while other details not greatly related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising" when used herein is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
It is also noted herein that the term "coupled" may refer to not only a direct connection, but also an indirect connection in which an intermediate is present, unless otherwise specified.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In the drawings, the same reference numerals represent the same or similar components, or the same or similar steps.
The snapshot hyperspectral imaging method based on diffractive optical elements (Diffractive Optical Elements, DOE) is an effective snapshot hyperspectral imaging approach that can simultaneously capture the spatial and spectral information of a natural scene with a single exposure. The fundamental principle of DOE-based snapshot hyperspectral imaging is to capture compressed measurement data containing spectral and spatial information modulated by a point spread function (Point Spread Function, PSF) and to reconstruct a three-dimensional hyperspectral cube with a reconstruction algorithm. The invention builds a miniaturized high-speed hyperspectral imaging system based on this method. Based on the point spread function engineering principle, an efficient snapshot compressive hyperspectral acquisition process is provided: compression of hyperspectral data is achieved by designing different diffractive optical element height map distributions to generate different point spread functions, no complex scanning is needed, and high-precision spectral information can be obtained with only a single exposure.
In the implementation process, the incident light is modulated by the diffractive optical element, and the modulation mainly depends on the parameter setting of the height map distribution of the diffractive optical element, so that a single exposure captures the spatial and spectral information of a natural scene simultaneously; after imaging on the image sensor, feature extraction and reconstruction of the visible light hyperspectrum are carried out by a neural network encoding-decoding algorithm model. Therefore, to obtain a more accurate detection effect, the height map distribution of the diffractive optical element and the parameters of the neural network codec algorithm model need to be jointly designed.
Specifically, the invention provides a method and a system for training and collecting parameters of a visible light hyperspectral compression collection chip based on diffraction coding, as shown in fig. 1, wherein the method comprises the following steps of S101 to S104:
step S101: and acquiring a training sample set, wherein the training sample set comprises a plurality of sample data, and the sample data is a sample visible light hyperspectral.
Step S102: taking the diffractive optical element as the visible light hyperspectral compression acquisition chip, constructing the height map distribution of the diffractive optical element, and calculating, based on Fresnel diffraction theory, the sample image data obtained through the spectral response on the image sensor after the sample visible spectrum is modulated by the diffractive optical element according to the height map distribution.
Step S103: obtaining an initial neural network model, wherein the initial neural network model is of a U-Net model structure, the initial neural network model takes the sample image data as input, a residual block is used for extracting feature maps in the encoding stage, a maximum pooling layer is used for downsampling, an upsampling layer and convolution are used for upsampling the encoded feature maps in the decoding stage, the feature maps in the decoding stage are cascaded with the feature maps in the encoding stage, and the initial neural network model outputs a reconstructed visible light hyperspectrum of the sample data.
Step S104: training the height map distribution and the initial neural network by using a training sample set, taking the deviation of the sample visible light hyperspectral and the reconstructed visible light hyperspectral as loss, and carrying out iterative updating on the height map distribution and the initial neural network by minimizing the loss to obtain a target height map distribution and a target hyperspectral acquisition model.
Steps S101 to S104 are joint training performed by combining the height map distribution of the diffractive optical element and the U-Net model, and aim to obtain the height map distribution of the diffractive optical element with the optimal effect and the corresponding U-Net model parameters with the optimal reconstruction capability.
In step S101, each sample data in the training sample set contains the modeling parameters of the visible light, which are set based on Fresnel diffraction theory.
In step S102, as shown in fig. 2, the incident light is modulated by the diffractive optical element and imaged on the image sensor, and the imaged image data is used for subsequent processing. The height map distribution of the diffractive optical element (DOE) may be set according to practical application requirements. In some embodiments, as shown in fig. 3, polar coordinates are set at the center position of the diffractive optical element, and the diffractive optical element is uniformly divided into a first set number of lobe regions within the angle range of 2π; a second set number of distance intervals are divided from the center position to the periphery in each lobe region, and the height of the diffractive optical element is consistent within each distance interval.
In some embodiments, the lobe regions are divided into 8 along the angles π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 and 2π.
Further, 512 distance intervals are divided from the center position to the periphery in each lobe region, so the diffractive optical element has 512 distance intervals in each of the 8 lobe regions, for a total of 512 × 8 parameters.
In some embodiments, a height map distribution of the diffractive optical element is constructed, sample image data obtained by spectral response on the image sensor after the sample visible spectrum is modulated by the diffractive optical element is calculated according to the height map distribution based on the fresnel theorem, and the steps include steps S201 to S205:
Step S201: based on Fresnel diffraction theory, the sample visible light hyperspectrum is modeled as:

U₀(x, y, λ) = exp( i · (2π/λ) · √(x² + y² + z²) )

wherein (x, y) represents a position coordinate on the diffractive optical element, λ represents the wavelength of the sample visible light, i represents the imaginary unit, and z represents the depth of the sample visible light hyperspectral incident position from the plane of the diffractive optical element;
Step S202: after the sample visible spectrum is modulated by the diffractive optical element, it is expressed as:

U₁(x, y, λ) = U₀(x, y, λ) · exp(i ΔΦₕ),  ΔΦₕ = 2π · Δη_λ · Δh(r) / λ

wherein U₁(x, y, λ) represents the sample visible spectrum modulated by the diffractive optical element, ΔΦₕ represents the phase retardation, Δη_λ represents the difference between the refractive index of the material and that of air, Δh(r) represents the height map distribution, and λ represents the wavelength of the sample visible light;
Step S203: the light reaching the image sensor is modeled as:

U₂(x, y, λ) = F⁻¹{ F{U₁(x, y, λ)} · exp[ i · (2πd/λ) · √(1 − (λ·f_x)² − (λ·f_y)²) ] }

where U₂(x, y, λ) represents the light reaching the image sensor, F represents the Fourier transform and F⁻¹ its inverse, d represents the distance from the diffractive optical element to the image sensor plane, λ represents the wavelength of the sample visible light, f_x represents the frequency variable of x, and f_y represents the frequency variable of y;
Step S204: the point spread function formed on the image sensor is expressed as:

P(x, y, λ) ∝ |U₂(x, y, λ)|²
Step S205: the sample image data on the image sensor is expressed as:

I′(x, y, λ) = P(x, y, λ) ⊛ I(x, y, λ)

I(x, y) = ∫_{λ₀}^{λ₁} R(λ) · I′(x, y, λ) dλ + η

where I′(x, y, λ) represents the sample hyperspectral data modulated via the point spread function; P(x, y, λ) represents the point spread function; I(x, y, λ) represents the sample hyperspectral data; ⊛ represents convolution; I(x, y) represents the sample image data; λ₀ represents the lower limit of the hyperspectral wavelength to be collected; λ₁ represents the upper limit of the hyperspectral wavelength to be collected; R(λ) represents the response curve of the image sensor to different wavelengths; dλ represents the differential of the sample visible light wavelength; and η represents noise.
In step S103, an improved U-Net model is introduced to perform feature extraction and reconstruction on the image data captured by the image sensor, so as to obtain the hyperspectrum of the incident visible light.
In some embodiments, as shown in fig. 4, in the U-Net model structure of the initial neural network model, the encoding stage includes a first residual block, a second residual block, a third residual block, a fourth residual block, a fifth residual block, a sixth residual block, and a seventh residual block that are sequentially connected, and the decoding stage includes an eighth residual block, a ninth residual block, a tenth residual block, an eleventh residual block, a twelfth residual block, a thirteenth residual block, a first convolution layer, and a first activation function layer that are sequentially connected.
The feature map output by the first residual block is connected to the feature map input by the thirteenth residual block in a jumping manner, the feature map output by the second residual block is connected to the feature map input by the twelfth residual block in a jumping manner, the feature map output by the third residual block is connected to the feature map input by the eleventh residual block in a jumping manner, the feature map output by the fourth residual block is connected to the feature map input by the tenth residual block in a jumping manner, the feature map output by the fifth residual block is connected to the feature map input by the ninth residual block in a jumping manner, and the feature map output by the sixth residual block is connected to the feature map input by the eighth residual block in a jumping manner.
In some embodiments, the first activation function layer is a sigmoid activation function layer. The first residual block, the second residual block, the third residual block, the fourth residual block, the fifth residual block, the sixth residual block, the seventh residual block, the eighth residual block, the ninth residual block, the tenth residual block, the eleventh residual block, the twelfth residual block and the thirteenth residual block each have a structure of a second convolution layer, a third convolution layer, a first batch normalization layer, a second activation function layer, a fourth convolution layer, a second batch normalization layer and a third activation function layer which are sequentially connected, and the output of the second convolution layer and the output of the second batch normalization layer are added together and input into the third activation function layer; wherein the second activation function layer adopts an exponential linear unit.
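As a concrete reading of this residual block, the following PyTorch sketch wires the entry convolution, the conv-BN-ELU-conv-BN branch and the pre-activation addition in the order listed above; the 3 × 3 kernel size and the choice of ELU for the third activation layer are assumptions the text does not fix.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Entry convolution whose output both feeds a conv-BN-ELU-conv-BN branch
    and skips ahead to an elementwise addition before the final activation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.entry = nn.Conv2d(in_ch, out_ch, 3, padding=1)    # "second conv layer"
        self.branch = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, 3, padding=1),           # "third conv layer"
            nn.BatchNorm2d(out_ch),                            # first batch norm layer
            nn.ELU(),                                          # second activation (ELU)
            nn.Conv2d(out_ch, out_ch, 3, padding=1),           # "fourth conv layer"
            nn.BatchNorm2d(out_ch),                            # second batch norm layer
        )
        self.out_act = nn.ELU()                                # third activation (assumed ELU)

    def forward(self, x):
        y = self.entry(x)
        return self.out_act(y + self.branch(y))               # add, then activate
```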
The improved U-Net neural network employed in the present invention is hereinafter referred to as Res-U-Net.
In step S104, the height map distribution and the initial neural network are trained and synchronously updated with the training sample set. In some embodiments, taking the deviation of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss comprises: calculating the mean absolute error (MAE loss) of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss.
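A minimal sketch of this joint update follows. The ResidualBlock from the previous sketch stands in for the full Res-U-Net, `optical_forward` is a toy differentiable surrogate for the Fresnel imaging model, and `train_loader` is a dummy loader; all three are hypothetical stand-ins. The point is only that the 512 × 8 DOE height parameters and the network weights share one optimizer and are updated together by minimizing the MAE loss.

```python
import torch
import torch.nn as nn

train_loader = [torch.rand(2, 31, 64, 64) for _ in range(4)]  # dummy sample hyperspectra
doe_params = nn.Parameter(torch.rand(512, 8) * 3.5e-6)        # DOE heights, capped near 3.5 um
net = ResidualBlock(3, 31)          # stand-in for the full Res-U-Net decoder

def optical_forward(cube, params):
    # Toy surrogate for DOE modulation plus sensor response; a real setup
    # would use a differentiable version of the Fresnel simulation above.
    meas = cube.mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)  # fake RGB measurement
    return meas * torch.sigmoid(params.mean())                # keeps params in the graph

optimizer = torch.optim.Adam([{"params": [doe_params]},
                              {"params": net.parameters()}], lr=1e-4)
loss_fn = nn.L1Loss()               # mean absolute error (MAE) loss

for cube in train_loader:
    recon = net(optical_forward(cube, doe_params))  # reconstructed hyperspectrum
    loss = loss_fn(recon, cube)     # deviation between sample and reconstruction
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                # height map and network updated in one step
```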
On the other hand, the invention also provides a visible light hyperspectral collection method, which comprises the following steps S301 to S303:
Step S301: and obtaining visible light to be detected.
Step S302: after the visible light to be detected is modulated by the visible light hyperspectral compression acquisition chip, spectrum response is carried out on a preset image sensor to obtain image data of the visible light to be detected, and the visible light hyperspectral compression acquisition chip adopts the diffraction encoding-based visible light hyperspectral compression acquisition chip and the target height map distribution in the acquisition model parameter training method in the steps S101-S104.
Step S303: and (3) processing the image data through the visible light hyperspectral compression acquisition chip based on the diffraction codes and the target hyperspectral acquisition model in the acquisition model parameter training method in the steps S101-S104 to obtain the visible light hyperspectrum of the visible light to be detected.
In another aspect, the present invention further provides a visible light hyperspectral collection system, including:
the visible light hyperspectral compression acquisition chip adopts the visible light hyperspectral compression acquisition chip based on diffraction codes and the target height map distribution in the acquisition model parameter training method in the steps S101-S104; the visible light hyperspectral compression acquisition chip is used for modulating visible light.
And an image sensor for generating image data in response to the modulated visible light.
And the processor is used for loading the visible light hyperspectral compression acquisition chip based on the diffraction codes and the target hyperspectral acquisition model in the acquisition model parameter training method in the steps S101-S104 and generating the hyperspectrum of the visible light according to the image data.
In another aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program/instruction which when executed by a processor performs the steps of the above method.
The invention is described below in connection with a specific embodiment:
The embodiment provides a diffractive optical element design scheme based on point spread function engineering and a miniaturized visible light hyperspectral acquisition system. The design scheme proposed in this embodiment and the hyperspectral acquisition system hardware are shown in fig. 2. The system includes a first part, the optics, comprising a diffractive optical element (DOE) and a CMOS camera, and a second part, the codec device. The codec device loads a neural network decoding algorithm for reconstructing the detected visible light hyperspectrum.
The DOE realizes phase modulation of the incident light through different height maps; different phase modulations form different PSFs on the CMOS sensor, and image compression is realized through convolution of the PSF with the spectral information. Specifically, light incident from depth z can be modeled as follows according to Fresnel diffraction theory:

U₀(x, y, λ) = exp( i · (2π/λ) · √(x² + y² + z²) )

wherein (x, y) represents the position coordinates on the diffractive optical element, λ represents the wavelength of the sample visible light, i represents the imaginary unit, and z represents the depth of the sample visible light hyperspectral incident position from the plane of the diffractive optical element.
The incident light after passing through the DOE is:

U₁(x, y, λ) = U₀(x, y, λ) · exp(i ΔΦₕ),  ΔΦₕ = 2π · Δη_λ · Δh(r) / λ

wherein U₁(x, y, λ) represents the sample visible spectrum modulated by the diffractive optical element, ΔΦₕ represents the phase retardation, Δη_λ represents the difference between the refractive index of the material and that of air, Δh(r) represents the height map distribution, and λ represents the wavelength of the sample visible light;
Light passing through the DOE aperture reaches the image sensor plane. Assuming the image sensor plane is at a distance d from the aperture, the light reaching the sensor can be modeled, according to Fresnel diffraction theory, as:

U₂(x, y, λ) = F⁻¹{ F{U₁(x, y, λ)} · exp[ i · (2πd/λ) · √(1 − (λ·f_x)² − (λ·f_y)²) ] }

wherein U₂(x, y, λ) represents the light reaching the image sensor, F represents the Fourier transform and F⁻¹ its inverse, λ represents the wavelength of the sample visible light, f_x represents the frequency variable of x, and f_y represents the frequency variable of y;
The point spread function (PSF) formed on the image sensor is expressed as:

P(x, y, λ) ∝ |U₂(x, y, λ)|²
The incident visible light hyperspectral data I(x, y, λ) is expressed, after modulation by the PSF, as:

I′(x, y, λ) = P(x, y, λ) ⊛ I(x, y, λ)

The image data on the image sensor is expressed as:

I(x, y) = ∫_{λ₀}^{λ₁} R(λ) · I′(x, y, λ) dλ + η

where I′(x, y, λ) represents the sample hyperspectral data modulated via the point spread function; P(x, y, λ) represents the point spread function; I(x, y, λ) represents the sample hyperspectral data; ⊛ represents convolution; I(x, y) represents the sample image data; λ₀ represents the lower limit of the hyperspectral wavelength to be collected; λ₁ represents the upper limit of the hyperspectral wavelength to be collected; R(λ) represents the response curve of the image sensor to different wavelengths; dλ represents the differential of the sample visible light wavelength; and η represents noise.
It follows that different PSFs can be achieved by designing different DOE height map distributions to obtain different hyperspectral encoded RGB information on the image sensor.
The embodiment realizes diffractive optical imaging by training and learning the specific height map distribution of the DOE through the neural network. Specific parameters of the DOE and camera are first set; these can be chosen in various ways, and an exemplary parameter setting scheme is given in Table 1.
Table 1: experimental system hardware parameters of diffractive optical element
FIG. 3 shows a 1024 × 1024 DOE height map matrix with polar coordinates set at its center. The 2π angle range of the diffractive optical element is divided into 8 lobes along the angles π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 and 2π. Meanwhile, n distance intervals are set according to the distance from each point to the center of the DOE height map matrix, the radial range being half of the side length of the matrix; in this example the DOE height map matrix size is 1024 × 1024 and n = 512. The DOE height map matrix is thus divided into 8 angular lobes, each with 512 distance intervals, for a total of 512 × 8 parameters, so a 1024 × 1024 DOE can be generated from a 512 × 8 matrix. This 512 × 8 matrix can therefore be placed into the neural network as a learnable parameter for training and combined with the back-end neural network for encoding and decoding.
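The expansion from the learnable 512 × 8 matrix to the full 1024 × 1024 height map can be sketched as below; the nearest-bin lookup and the clipping of corner pixels that lie beyond the half-side radius are assumptions about details the text leaves open.

```python
import numpy as np

def expand_height_map(params, size=1024):
    """Expand a (radial bin, lobe) parameter matrix into a size x size DOE
    height map: polar coordinates at the center, 8 equal angular lobes over
    2*pi, and size // 2 radial bins of constant height."""
    n_bins, n_lobes = params.shape                        # (512, 8)
    c = (size - 1) / 2.0
    yy, xx = np.mgrid[0:size, 0:size]
    r = np.sqrt((xx - c)**2 + (yy - c)**2)
    theta = np.mod(np.arctan2(yy - c, xx - c), 2 * np.pi)
    lobe = np.minimum((theta / (2 * np.pi) * n_lobes).astype(int), n_lobes - 1)
    rbin = np.minimum((r / (size / 2) * n_bins).astype(int), n_bins - 1)  # corners clipped
    return params[rbin, lobe]

height_map = expand_height_map(np.random.uniform(0.0, 3.5e-6, (512, 8)))
print(height_map.shape)  # (1024, 1024)
```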
The neural network decoding algorithm proposed in this embodiment is hereinafter referred to as Res-U-Net. The input of Res-U-Net is a compression-encoded RGB image, and its output is a hyperspectral image of the incident visible light. The specific structure is shown in fig. 4. Res-U-Net builds an encoder-decoder structure on the basis of U-Net and uses residual blocks in the network, the inside of which consists of 2D convolution kernels, ELU activation functions and batch normalization (BN) layers. In the encoding stage, Res-U-Net uses residual blocks to extract features and MaxPool layers to realize downsampling; in the decoding stage, Res-U-Net upsamples the encoded feature maps using Upsample layers and convolutions and concatenates them with the feature maps of the encoding stage.
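Putting the pieces together, a compact sketch of the Res-U-Net wiring of fig. 4, reusing the ResidualBlock above, might look like the following. The channel widths and the bilinear upsampling mode are assumptions; the block counts, max-pool downsampling, skip concatenations and the final convolution plus sigmoid follow the description.

```python
import torch
import torch.nn as nn

class ResUNet(nn.Module):
    """Encoder of 7 residual blocks with max pooling between them, decoder of
    6 residual blocks fed by upsampled features concatenated with the
    matching encoder skips, then a final 1x1 convolution and sigmoid."""
    def __init__(self, in_ch=3, out_ch=31, base=16):
        super().__init__()
        w = [base * 2**i for i in range(7)]               # assumed channel widths
        self.enc = nn.ModuleList(
            [ResidualBlock(in_ch, w[0])] +
            [ResidualBlock(w[i - 1], w[i]) for i in range(1, 7)])
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec = nn.ModuleList(
            [ResidualBlock(w[6 - i] + w[5 - i], w[5 - i]) for i in range(6)])
        self.head = nn.Sequential(nn.Conv2d(w[0], out_ch, 1), nn.Sigmoid())

    def forward(self, x):                  # input side length must be divisible by 64
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < 6:                      # blocks 1-6 feed a skip and the pool
                skips.append(x)
                x = self.pool(x)
        for i, block in enumerate(self.dec):   # blocks 8-13, skips from 6 down to 1
            x = block(torch.cat([self.up(x), skips[5 - i]], dim=1))
        return self.head(x)
```

For instance, `ResUNet()(torch.rand(2, 3, 128, 128))` returns a `(2, 31, 128, 128)` hyperspectral cube.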
Experimental simulation and results: a hyperspectral acquisition simulation system was built according to fig. 2. In the simulation, the diffractive optical element material was chosen to be SKi300, with a single pixel size of 5 μm and a size of 1024 × 1024 pixels; the system focal length is 50 mm; the CMOS camera resolution is 4504 × 4504 with a pixel size of 2.74 μm. The DOE height defined during training should not exceed 3.5 μm; the hyperspectral band range is 400-700 nm with a wavelength resolution of 10 nm. The Res-U-Net network is trained for 60 rounds using the CAVE dataset. Fig. 5 compares the simulated values and the true values for the method used in this example; testing with the KAIST dataset, the average PSNR (Peak Signal-to-Noise Ratio) is 34.41 dB and the SSIM (Structural Similarity) is 0.947.
Accordingly, the present invention also provides an apparatus/system comprising a computer device including a processor and a memory, the memory storing computer instructions and the processor being configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the apparatus/system implements the steps of the method described above.
The embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described above. The computer readable storage medium may be a tangible storage medium such as random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a floppy disk, a hard disk, a removable memory disk, a CD-ROM, or any other form of storage medium known in the art.
In summary, according to the visible light hyperspectral collection model, the DOE height map distribution training and collection method and system, incident visible light is modulated and imaged based on the diffraction optical element, space and spectrum information of a natural scene are captured at the same time through single exposure, and a U-Net neural network is introduced to process and reconstruct an imaged image to obtain the visible light hyperspectrum. In the training process, the deviation is calculated based on the reconstructed visible light hyperspectrum and the original incident sample visible light hyperspectrum, and the height map distribution of the diffraction optical element and the parameters of the U-Net neural network are updated and iterated simultaneously by minimizing the deviation, so that the detection efficiency and the detection precision of the visible light hyperspectrum are improved.
Furthermore, by improving the structure of the U-Net neural network, a multi-stage residual block consisting of a 2D convolution kernel, an exponential linear unit and a batch normalization layer is introduced to encode and decode, and a feature map obtained by downsampling in the encoding stage is cascaded to a feature map in the decoding stage to perform upsampling, so that the detection precision of the visible light hyperspectrum is improved.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of both. The particular implementation is hardware or software dependent on the specific application of the solution and the design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave.
It should be understood that the invention is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the present invention are not limited to the specific steps described and shown, but various changes, modifications and additions, or the order between steps may be made by those skilled in the art after appreciating the spirit of the present invention.
In this disclosure, features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A diffraction-coding-based visible light hyperspectral compression acquisition chip and acquisition model parameter training method, characterized by comprising the following steps:
Acquiring a training sample set, wherein the training sample set comprises a plurality of sample data, and the sample data is a sample visible light hyperspectrum;
Taking a diffractive optical element as the visible light hyperspectral compression acquisition chip, constructing the height map distribution of the diffractive optical element, and calculating, based on Fresnel diffraction theory, the sample image data obtained through the spectral response on an image sensor after the sample visible spectrum is modulated by the diffractive optical element according to the height map distribution;
Acquiring an initial neural network model, wherein the initial neural network model is of a U-Net model structure, the initial neural network model takes the sample image data as input, a residual block is used for extracting feature maps in an encoding stage, a maximum pooling layer is used for downsampling, an upsampling layer and convolution are used for upsampling the encoded feature maps in a decoding stage, the feature maps in the decoding stage are cascaded with the feature maps in the encoding stage, and the initial neural network model outputs a reconstructed visible light hyperspectrum of the sample data;
And training the height map distribution and the initial neural network by adopting the training sample set, taking the deviation of the sample visible light hyperspectral and the reconstructed visible light hyperspectral as loss, and carrying out iterative updating on the height map distribution and the initial neural network by minimizing the loss to obtain a target height map distribution and a target hyperspectral acquisition model.
2. The diffraction-encoded visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 1, further comprising:
Setting polar coordinates at the central position of the diffraction optical element, and uniformly dividing the diffraction optical element into a first set number of lobe areas within the angle range of 2 pi;
A second set number of distance intervals are divided from the central position to the periphery in each lobe area, and the heights of the diffractive optical elements in each distance interval are consistent.
3. The diffraction-encoded visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 2, wherein the lobe regions are divided into 8 along the angles π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4 and 2π.
4. The diffraction-encoded visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 1, wherein constructing the height map distribution of the diffractive optical element and calculating, based on Fresnel diffraction theory, the sample image data obtained through the spectral response on the image sensor after the sample visible spectrum is modulated by the diffractive optical element according to the height map distribution comprises:
Based on Fresnel diffraction theory, modeling the sample visible light hyperspectrum as:

U₀(x, y, λ) = exp( i · (2π/λ) · √(x² + y² + z²) )

wherein (x, y) represents a position coordinate on the diffractive optical element, λ represents the wavelength of the sample visible light, i represents the imaginary unit, and z represents the depth of the sample visible light hyperspectral incident position from the plane of the diffractive optical element;
After the sample visible spectrum is modulated by the diffractive optical element, it is expressed as:

U₁(x, y, λ) = U₀(x, y, λ) · exp(i ΔΦₕ),  ΔΦₕ = 2π · Δη_λ · Δh(r) / λ

wherein U₁(x, y, λ) represents the sample visible spectrum modulated by the diffractive optical element, ΔΦₕ represents the phase retardation, Δη_λ represents the difference between the refractive index of the material and that of air, Δh(r) represents the height map distribution, and λ represents the wavelength of the sample visible light;
Light reaching the image sensor is modeled as:

U₂(x, y, λ) = F⁻¹{ F{U₁(x, y, λ)} · exp[ i · (2πd/λ) · √(1 − (λ·f_x)² − (λ·f_y)²) ] }

wherein U₂(x, y, λ) represents the light reaching the image sensor, F represents the Fourier transform and F⁻¹ its inverse, d represents the distance from the diffractive optical element to the image sensor plane, λ represents the wavelength of the sample visible light, f_x represents the frequency variable of x, and f_y represents the frequency variable of y;
The point spread function formed on the image sensor is expressed as:

P(x, y, λ) ∝ |U₂(x, y, λ)|²
The sample image data on the image sensor is expressed as:

I′(x, y, λ) = P(x, y, λ) ⊛ I(x, y, λ)

I(x, y) = ∫_{λ₀}^{λ₁} R(λ) · I′(x, y, λ) dλ + η

wherein I′(x, y, λ) represents the sample hyperspectral data modulated via the point spread function; P(x, y, λ) represents the point spread function; I(x, y, λ) represents the sample hyperspectral data; ⊛ represents convolution; I(x, y) represents the sample image data; λ₀ represents the lower limit of the hyperspectral wavelength to be collected; λ₁ represents the upper limit of the hyperspectral wavelength to be collected; R(λ) represents the response curve of the image sensor to different wavelengths; dλ represents the differential of the sample visible light wavelength; and η represents noise.
5. The diffraction encoding-based visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 1, wherein in the U-Net model structure of the initial neural network model, the encoding stage includes a first residual block, a second residual block, a third residual block, a fourth residual block, a fifth residual block, a sixth residual block and a seventh residual block which are sequentially connected, and the decoding stage includes an eighth residual block, a ninth residual block, a tenth residual block, an eleventh residual block, a twelfth residual block, a thirteenth residual block, a first convolution layer and a first activation function layer which are sequentially connected;
The feature map output by the first residual block is connected to the feature map input by the thirteenth residual block in a jumping manner, the feature map output by the second residual block is connected to the feature map input by the twelfth residual block in a jumping manner, the feature map output by the third residual block is connected to the feature map input by the eleventh residual block in a jumping manner, the feature map output by the fourth residual block is connected to the feature map input by the tenth residual block in a jumping manner, the feature map output by the fifth residual block is connected to the feature map input by the ninth residual block in a jumping manner, and the feature map output by the sixth residual block is connected to the feature map input by the eighth residual block in a jumping manner.
6. The diffraction-encoded visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 5, wherein the first activation function layer is a sigmoid activation function layer;
The first residual block, the second residual block, the third residual block, the fourth residual block, the fifth residual block, the sixth residual block, the seventh residual block, the eighth residual block, the ninth residual block, the tenth residual block, the eleventh residual block, the twelfth residual block and the thirteenth residual block each have a structure of a second convolution layer, a third convolution layer, a first batch normalization layer, a second activation function layer, a fourth convolution layer, a second batch normalization layer and a third activation function layer which are sequentially connected, and the output of the second convolution layer and the output of the second batch normalization layer are added together and input into the third activation function layer;
Wherein the second activation function layer adopts an exponential linear unit.
7. The diffraction-encoded visible light hyperspectral compression acquisition chip and acquisition model parameter training method as claimed in claim 1, wherein taking the deviation of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss comprises:
calculating the mean absolute error of the sample visible light hyperspectrum and the reconstructed visible light hyperspectrum as the loss.
8. The visible light hyperspectral collection method is characterized by comprising the following steps of:
obtaining visible light to be detected;
Modulating the visible light to be detected through a visible light hyperspectral compression acquisition chip, and then obtaining image data of the visible light to be detected through the spectral response on a preset image sensor, wherein the visible light hyperspectral compression acquisition chip adopts the target height map distribution in the diffraction-coding-based visible light hyperspectral compression acquisition chip and acquisition model parameter training method according to any one of claims 1 to 7;
Processing the image data through the target hyperspectral acquisition model in the diffraction-coding-based visible light hyperspectral compression acquisition chip and acquisition model parameter training method according to any one of claims 1 to 7 to obtain the visible light hyperspectrum of the visible light to be detected.
9. A visible light hyperspectral collection system, comprising:
A visible light hyperspectral compression acquisition chip, wherein the visible light hyperspectral compression acquisition chip adopts the diffraction encoding-based visible light hyperspectral compression acquisition chip and the target height map distribution in the acquisition model parameter training method according to any one of claims 1 to 7; the visible light hyperspectral compression acquisition chip is used for modulating visible light;
An image sensor for generating image data in response to the modulated visible light;
A processor loaded with the diffraction encoding-based visible light hyperspectral compression acquisition chip and the target hyperspectral acquisition model in the acquisition model parameter training method as claimed in any one of claims 1 to 7, for generating the hyperspectrum of visible light from the image data.
10. A computer readable storage medium having stored thereon a computer program/instruction which when executed by a processor performs the steps of the method according to any of claims 1 to 8.
CN202410237404.6A 2024-03-01 2024-03-01 Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding Pending CN118135397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410237404.6A CN118135397A (en) 2024-03-01 2024-03-01 Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410237404.6A CN118135397A (en) 2024-03-01 2024-03-01 Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding

Publications (1)

Publication Number Publication Date
CN118135397A true CN118135397A (en) 2024-06-04

Family

ID=91243045

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410237404.6A Pending CN118135397A (en) 2024-03-01 2024-03-01 Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding

Country Status (1)

Country Link
CN (1) CN118135397A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114353946A (en) * 2021-12-29 2022-04-15 Beijing Institute of Technology Diffraction snapshot spectral imaging method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LINGEN LI et al.: "Quantization-aware Deep Optics for Diffractive Snapshot Hyperspectral Imaging", 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 27 September 2022 (2022-09-27), pages 1-4 *
NAN XU et al.: "Snapshot hyperspectral imaging based on equalization designed DOE", Optics Express, vol. 31, no. 5, 1 June 2023 (2023-06-01), page 20489 *

Similar Documents

Publication Publication Date Title
Cai et al. Mst++: Multi-stage spectral-wise transformer for efficient spectral reconstruction
Zavorin et al. Use of multiresolution wavelet feature pyramids for automatic registration of multisensor imagery
CN107525588B (en) Rapid reconstruction method of dual-camera spectral imaging system based on GPU
CN106997581A (en) A kind of method that utilization deep learning rebuilds high spectrum image
Mandanici et al. A multi-image super-resolution algorithm applied to thermal imagery
CN108955882B (en) Three-dimensional data reconstruction method based on liquid crystal hyperspectral calculation imaging system
Peng et al. Residual pixel attention network for spectral reconstruction from RGB images
CN111797744B (en) Multimode remote sensing image matching method based on co-occurrence filtering algorithm
Li et al. Drcr net: Dense residual channel re-calibration network with non-local purification for spectral super resolution
CN111598962B (en) Single-pixel imaging method and device based on matrix sketch analysis
CN114419392A (en) Hyperspectral snapshot image recovery method, device, equipment and medium
CN114219890A (en) Three-dimensional reconstruction method, device and equipment and computer storage medium
CN115063466A (en) Single-frame three-dimensional measurement method based on structured light and deep learning
Wang et al. A frequency-separated 3D-CNN for hyperspectral image super-resolution
CN114266957A (en) Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation
CN116797676A (en) Method suitable for coded aperture compression spectrum polarization imaging reconstruction
CN117975284A (en) Cloud layer detection method integrating Swin transformer and CNN network
Mantripragada et al. The effects of spectral dimensionality reduction on hyperspectral pixel classification: A case study
CN116091640B (en) Remote sensing hyperspectral reconstruction method and system based on spectrum self-attention mechanism
Mullah et al. Fast multi‐spectral image super‐resolution via sparse representation
CN116579959B (en) Fusion imaging method and device for hyperspectral image
CN111968073B (en) No-reference image quality evaluation method based on texture information statistics
Martínez et al. Efficient transfer learning for spectral image reconstruction from RGB images
CN118135397A (en) Visible light hyperspectral compression acquisition chip parameter training and acquisition method and system based on diffraction coding
CN117114983A (en) Hyperspectral image super-resolution method and device based on spatial spectrum joint matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination