WO2024004121A1

WO2024004121A1 - Imaging device, imaging method, and program

Info

Publication number: WO2024004121A1
Application number: PCT/JP2022/026172
Authority: WO
Inventors: 稜白川; 陽光曽我部; 暁経三反崎; 正樹北原
Original assignee: 日本電信電話株式会社
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2024-01-04

Abstract

One aspect of the present invention relates to an imaging device for measuring a hyperspectral image using compressive sensing, the imaging device comprising: a CASSI observation system comprising an encoding unit that encodes an input signal by applying a coded-aperture mask to the input signal and outputs the encoded signal, a dispersing unit that disperses the input signal encoded by the encoding unit into wavelength components and outputs the dispersed wavelength components, and a measuring unit that captures an image of the input signal subjected to the wavelength dispersion by the dispersing unit; an attention weight setting unit that sets a weight, which indicates the degree of attention, for each pixel used for imaging by the CASSI observation system; and a mask generating unit that generates the coded-aperture mask on the basis of the attention weight set for each pixel by the attention weight setting unit.

Description

Imaging device, imaging method, and program

The present invention relates to technology for an imaging device, an imaging method, and a program.

Conventionally, there is a hyperspectral image measurement technology based on compressed sensing theory, called compressive spectral imaging. One implementation is CASSI (Coded Aperture Snapshot Spectral Imaging), which combines a coded aperture mask with a dispersive optical element (see, for example, Non-Patent Document 1).

Regarding such CASSI, a method for designing a coded aperture mask in a CASSI optical system has been proposed as a method for improving the measurement accuracy of hyperspectral images (see, for example, Non-Patent Document 2).

In addition, techniques have been proposed that improve measurement accuracy by limiting the wavelength range to be acquired, through the design of the coded aperture mask in CASSI and the formulation of the optimization problem in the reconstruction from the compressed signal to the original signal (see, for example, Non-Patent Document 3).

In general, the above-mentioned CASSI is formulated as shown in equation (1).

$$ g = \Phi f \qquad (1) $$

In equation (1), g represents a compressed signal, f represents an original signal, and Φ is an observation matrix (also referred to as a compression matrix) representing a CASSI observation process. In this way, the observation process (observation matrix) for obtaining a compressed signal in compressed sensing is closely related to the performance of the subsequent original signal estimation.

However, the only degree of freedom in design in the CASSI observation process is the design of the coded aperture mask. Therefore, it is difficult to design an ideal observation matrix for original signal reconstruction. The observation process (Φ) of CASSI is a process of encoding→dispersion→integration, but the dispersion and integration processes are unique to the optical element, and there is no degree of freedom in design.
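The structure of Φ can be made explicit with a little notation. The following is only a sketch under assumed notation that does not appear in the original text: f_λ is the vectorized image of the λ-th of Λ wavelength bands, m is the vectorized coded aperture mask, and S_λ is a fixed matrix that places band λ at its dispersed position on the detector. The sum over λ corresponds to the integration at the sensor.

```latex
g = \Phi f
  = \sum_{\lambda=1}^{\Lambda} S_\lambda \,\mathrm{diag}(m)\, f_\lambda ,
\qquad
\Phi = \begin{bmatrix} S_1\,\mathrm{diag}(m) & \cdots & S_\Lambda\,\mathrm{diag}(m) \end{bmatrix},
\qquad
f = \begin{bmatrix} f_1 \\ \vdots \\ f_\Lambda \end{bmatrix}
```

Under these assumptions, every block of Φ shares the same factor diag(m), and the matrices S_λ are fixed by the optics, so the mask m is the only part of Φ that can be designed.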

Furthermore, in the conventional technology that limits the wavelength range to be acquired, the choice for each wavelength range is binary, that is, to acquire it or not, so the relative importance of each wavelength range cannot be taken into account. In addition, because the wavelength range is limited uniformly over the whole input signal, information on different wavelength ranges cannot be acquired for spatially different regions.

In addition, in order to limit the wavelength range to be acquired, the conventional technology solves the reconstruction problem using only those pixels of the compressed signal that contain information solely from the limited wavelength range. This reduces the number of usable compressed-signal elements, so the reconstruction accuracy is expected to decrease. The conventional approach has therefore been to compensate for the lost compressed-signal information by capturing multiple images, which has the disadvantage of increasing the measurement cost.

In view of the above circumstances, the present invention aims to provide a technique that can improve the measurement accuracy of hyperspectral images by CASSI.

One aspect of the present invention is an imaging device that measures a hyperspectral image by compressed sensing, the imaging device comprising: a CASSI observation system including an encoding unit that encodes an input signal by applying a coded aperture mask to the input signal and outputs the encoded input signal, a dispersion unit that wavelength-disperses the input signal encoded by the encoding unit and outputs the result, and a measurement unit that images the input signal wavelength-dispersed by the dispersion unit; an attention weight setting unit that sets a weight indicating the degree of attention for each pixel imaged by the CASSI observation system; and a mask generation unit that generates the coded aperture mask based on the weight set for each pixel by the attention weight setting unit.

One aspect of the present invention is an imaging method for measuring a hyperspectral image by compressed sensing, the imaging method comprising: an imaging step of imaging an input signal with a CASSI observation system including an encoding unit that encodes the input signal by applying a coded aperture mask to the input signal and outputs the encoded input signal, a dispersion unit that wavelength-disperses the input signal encoded by the encoding unit and outputs the result, and a measurement unit that images the input signal wavelength-dispersed by the dispersion unit; and a modulation step of modulating the output signal of the CASSI observation system using a multiplication mask.

One aspect of the present invention is a program for causing a computer to function as the above-described imaging device.

According to the present invention, it is possible to improve the measurement accuracy of hyperspectral images by CASSI.

FIG. 1 is a block diagram showing the configuration of an imaging device 1A according to the first embodiment.
FIG. 2 is an image diagram showing the flow of processing by the imaging device 1A of the first embodiment.
FIG. 3 is a diagram showing an example of the effect of the imaging device 1A of the first embodiment.
FIG. 4 is a block diagram showing the configuration of an imaging device 1B according to the second embodiment.
FIG. 5 is an image diagram showing the flow of processing by the imaging device 1B of the second embodiment.
FIG. 6 is an image diagram showing the flow of processing when the imaging device 1B of the second embodiment generates a coded aperture mask and a multiplication mask.
FIG. 7 is an image diagram showing a flow in which the mask generation unit 160 generates a multiplication mask and the reconstruction processing unit 140 learns a reconstruction model in the imaging device 1B of the second embodiment.

Embodiments of the present invention will be described in detail below with reference to the drawings.
<First embodiment>
FIG. 1 is a block diagram showing the configuration of an imaging device 1A according to the first embodiment. The imaging device 1A includes an optical system 110, a CASSI observation system 120, a modulation unit 130, and a reconstruction processing unit 140.

The imaging device 1A is configured using a processor such as a CPU (Central Processing Unit) and a memory. When the processor executes a program, the imaging device 1A functions as a device that receives light as an input signal and outputs an estimated signal of a hyperspectral image. Among the units included in the imaging device 1A, a part of the CASSI observation system 120, the modulation unit 130, and the reconstruction processing unit 140 are realized by the processor executing the program. Note that some or all of the functions of the imaging device 1A that perform electrical signal processing may be realized using hardware such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The above program may be recorded on a computer-readable recording medium. The computer-readable recording medium is, for example, a portable medium such as a flexible disk, a magneto-optical disk, a ROM, a CD-ROM, or a semiconductor storage device (for example, an SSD: Solid State Drive), or a storage device such as a hard disk or a semiconductor storage device built into a computer system. The above program may be transmitted via a telecommunications line.

The optical system 110 is composed of a lens that forms an image. Optical system 110 may include multiple lenses.

The CASSI observation system 120 has a function of encoding and measuring an input signal using CASSI. More specifically, the CASSI observation system 120 includes, for example, an encoding unit 121, a dispersion unit 122, and a measurement unit 123. The configuration of the CASSI observation system 120 is described below; the CASSI observation system 120 of the first and second embodiments may be similar to a conventional CASSI observation system.

The encoding unit 121 outputs an encoded input signal by applying the coded aperture mask to the input signal. The coded aperture mask is implemented by, for example, an LCOS (Liquid Crystal On Silicon) device or a DMD (Digital Micromirror Device). The coded aperture mask may be composed of the discrete values 0 and 1 (hereinafter referred to as {0,1}) or of continuous values from 0 to 1 (hereinafter referred to as [0,1]). For example, the coded aperture mask can be generated by a data-driven approach in which the coded aperture is modeled as a variable parameter and determined by end-to-end input/output optimization that includes the reconstruction model. Alternatively, in contrast to such a data-driven approach, the coded aperture mask may be generated by a theoretical approach based on compressed sensing theory that exploits incoherence with the basis used for the sparse transform of images.

The dispersion unit 122 is, for example, a prism. The dispersion unit 122 inputs the output signal of the encoding unit 121, performs wavelength dispersion on the input signal, and outputs the resultant signal.

The measurement unit 123 is, for example, an FPA (Focal Plane Array) sensor. In the measurement unit 123, the sensor for each pixel receives the output signal of the dispersion unit 122, integrates the signal in the wavelength direction, and outputs the integrated signal. The measurement unit 123 outputs a compressed signal (image) as a measurement result.

The modulation unit 130 receives the output signal of the CASSI observation system 120, modulates it using a multiplication mask, and outputs the result. For example, the multiplication mask can be generated by a data-driven approach in which it is modeled as a variable parameter, in the same way as the coded aperture mask described above, and determined by end-to-end input/output optimization that includes the coded aperture mask and the reconstruction model.

The reconstruction processing unit 140 inputs the output signal of the modulation unit 130, performs reconstruction processing on it, and outputs an estimation result (estimated signal) of the hyperspectral image.

FIG. 2 is an image diagram showing the flow of processing by the imaging device 1A. FIG. 2 shows how the signal transformation of each step is parameterized (modeled). In the signal conversion shown in FIG. 2, the solid line represents conversion based on unique parameter values (fixed) of the optical element (prism, sensor, etc.), and the broken line represents conversion based on variable parameters that can be designed.

First, the imaging device 1A inputs wavelength data (Spectral Data Cube) D0 of light in the imaging target space to the CASSI observation system 120 via the optical system 110 (step S1).

Next, in the CASSI observation system 120, the encoding unit 121 applies the coded aperture mask M1 to the input wavelength data (input signal) and outputs the encoded wavelength data D1 to the dispersion unit 122 (step S2). This step corresponds to covering each pixel of the 3D (spatial and wavelength) input signal with the coded aperture mask.

Subsequently, in the CASSI observation system 120, the dispersion unit 122 performs wavelength dispersion on the encoded wavelength data D1 and outputs it to the measurement unit 123 (step S3). This step means tilting the wavelength axis using a dispersive optical element (prism).

Next, in the CASSI observation system 120, the measurement unit 123 receives the wavelength-dispersed wavelength data D2, and each pixel integrates the wavelength data D2 in the wavelength direction. The measurement unit 123 thereby generates compressed signal data D3 as a measurement result of the target space and outputs it to the modulation unit 130 (step S4). This step corresponds to integrating the tilted wavelength signals pixel by pixel. Through the steps up to this point, a 2D compressed signal is obtained.

Subsequently, the modulation unit 130 generates modulated compressed signal data D4 by modulating the 2D compressed signal data D3 input from the CASSI observation system 120 with a multiplication mask (step S5). The modulation unit 130 outputs the generated modulated compressed signal data D4 to the reconstruction processing unit 140.

Subsequently, the reconstruction processing unit 140 generates and outputs a hyperspectral image IMG as an estimation result by performing reconstruction processing on the modulated compressed signal data D4 input from the modulation unit 130 (step S6). For example, the reconstruction processing unit 140 performs the reconstruction processing by inputting the modulated compressed signal data D4 to the reconstruction model.
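Steps S2 to S5 above can be sketched numerically as follows. This is a minimal illustration, not the embodiment's implementation: it assumes a discrete data cube of shape (H, W, L), a dispersion of exactly one detector column per wavelength band (the `step` parameter), and random masks; the function names `cassi_forward` and `modulate` are hypothetical.

```python
import numpy as np

def cassi_forward(cube, coded_mask, step=1):
    """Encode, disperse and integrate a spectral cube (steps S2 to S4).

    cube       : spectral data cube D0, shape (H, W, L)
    coded_mask : coded aperture mask M1, shape (H, W), values in [0, 1]
    step       : assumed dispersion shift in detector columns per band
    """
    H, W, L = cube.shape
    # Step S2: the same mask value multiplies every wavelength of a pixel.
    coded = cube * coded_mask[:, :, None]
    # Steps S3 and S4: shift band l by l*step columns, then sum over bands.
    compressed = np.zeros((H, W + step * (L - 1)))
    for l in range(L):
        compressed[:, l * step : l * step + W] += coded[:, :, l]
    return compressed

def modulate(compressed, mult_mask):
    """Step S5: element-wise product with the multiplication mask."""
    if compressed.shape != mult_mask.shape:
        raise ValueError("mask and compressed signal must have the same shape")
    return compressed * mult_mask

rng = np.random.default_rng(0)
d0 = rng.random((4, 5, 3))                           # input cube D0
m1 = rng.integers(0, 2, size=(4, 5)).astype(float)   # {0,1} coded aperture mask
d3 = cassi_forward(d0, m1)                           # 2D compressed signal D3
m2 = rng.random(d3.shape)                            # [0,1] multiplication mask
d4 = modulate(d3, m2)                                # modulated signal D4
print(d3.shape, d4.shape)  # (4, 7) (4, 7)
```

The detector is wider than the input by `step * (L - 1)` columns because each band lands at a different horizontal offset before the bands are summed.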

Here, the reconstruction model is generated, for example, by learning in advance the correlation (that is, the observation matrix) between the desired signal data and the modulated compressed signal data using a machine learning method such as a neural network. For example, the reconstruction model is constructed as a DUN (Deep Unrolled Network), which implements an iterative optimization algorithm such as ADMM (Alternating Direction Method of Multipliers) or ISTA (Iterative Shrinkage-Thresholding Algorithm) with deep learning. The model can be trained by optimizing the variable parameters end to end, including those of the encoding unit 121 and the modulation unit 130.
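For orientation, the plain ISTA iteration that such an unrolled network is built on can be sketched as below. This is a simplified illustration under stated assumptions, not the reconstruction model of the embodiment: the signal is assumed to be sparse directly in its own domain, the observation matrix `phi` is a random stand-in, and nothing is learned. In a DUN, each loop iteration would become a network layer with trainable parameters.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(phi, g, lam=0.01, num_iters=200):
    """Minimise 0.5 * ||g - phi f||^2 + lam * ||f||_1 with ISTA."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # (the squared spectral norm of phi).
    step = 1.0 / np.linalg.norm(phi, 2) ** 2
    f = np.zeros(phi.shape[1])
    for _ in range(num_iters):
        grad = phi.T @ (phi @ f - g)       # gradient of the data term
        f = soft_threshold(f - step * grad, step * lam)
    return f

# Toy problem: 20 measurements of a 40-element signal with 3 non-zeros.
rng = np.random.default_rng(0)
phi = rng.standard_normal((20, 40))
f_true = np.zeros(40)
f_true[[3, 17, 31]] = [1.0, -2.0, 1.5]
g = phi @ f_true
f_hat = ista(phi, g, lam=0.01, num_iters=500)
print(np.linalg.norm(phi @ f_hat - g))     # residual after reconstruction
```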

FIG. 3 is a diagram showing an example of the effect of the imaging device 1A of the first embodiment. FIG. 3 compares the image estimation accuracy of a conventional configuration that uses only a coded aperture mask, without a multiplication mask, with that of the configuration of this embodiment, which combines a coded aperture mask with the multiplication mask described above. The vertical axis of the graph in FIG. 3 is the PSNR (Peak Signal-to-Noise Ratio). As FIG. 3 shows, the estimation result of this embodiment has a higher PSNR, that is, better image quality, than the conventional configuration.

According to the imaging device 1A of the first embodiment configured in this way, in a CASSI observation system that measures hyperspectral images, the measurement accuracy of hyperspectral images can be improved by modulating the measured signal with a multiplication mask.

More specifically, each pixel of the coded aperture mask used in the CASSI observation system blocks the information of the corresponding pixel of the input signal (or attenuates it in the case of [0,1]). In a configuration with only the coded aperture mask, the signal conversion applied to the wavelength signals within the same pixel of the input signal is therefore limited to a conversion common to that pixel (conversion by {0,1} or scalar multiplication by a value in [0,1]). In contrast, the imaging device 1A of the first embodiment includes, in addition to the coded aperture mask that acts directly on the input signal before dispersion, the modulation unit 130 that applies the multiplication mask to the measurement signal obtained after dispersion and integration. This makes it possible to apply a different signal conversion to each wavelength signal within the same pixel of the input signal.

For this reason, the observation matrix for hyperspectral image measurement can be designed with a higher degree of freedom, and by appropriately designing the two masks, the coded aperture mask and the multiplication mask, the accuracy of the finally estimated original signal can be improved.

Additionally, the CASSI observation matrix is designed using various optical elements and is implemented in hardware. In contrast, the modulation unit 130 of this embodiment performs signal processing after signal compression and can be implemented in software. It therefore has advantages such as fewer physical restrictions on implementation and easier handling of conversion speed and continuous-valued representation.
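PSNR, the metric on the vertical axis, can be computed as in the following minimal sketch. The function name `psnr` and the default peak value of 1.0 (images normalized to [0, 1]) are assumptions made for illustration.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.zeros((8, 8, 4))
estimate = reference + 0.1          # uniform error of 0.1, so MSE = 0.01
print(round(psnr(reference, estimate), 6))  # 20.0
```

A higher PSNR means a smaller mean squared error relative to the peak signal value, which is why it is read as better image quality.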
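The point made above, that the two masks together act differently on each wavelength of the same input pixel, can be checked with a small numerical sketch. It assumes, for illustration only, a dispersion of one detector column per band, so that the input element at (row, column x, band l) lands on detector column x + l; the function name `effective_weights` is hypothetical.

```python
import numpy as np

def effective_weights(coded_mask, mult_mask, num_bands, step=1):
    """Net factor applied to each input element (row, col, band).

    coded_mask : shape (H, W), acts on the input before dispersion
    mult_mask  : shape (H, W + step * (num_bands - 1)), acts on the detector
    """
    H, W = coded_mask.shape
    eff = np.empty((H, W, num_bands))
    for l in range(num_bands):
        # Band l of input column x is measured at detector column x + l*step.
        eff[:, :, l] = coded_mask * mult_mask[:, l * step : l * step + W]
    return eff

coded_mask = np.ones((1, 2))                     # 1 x 2 input, fully open
mult_mask = np.array([[1.0, 0.5, 0.2, 0.8]])     # detector width 2 + 3 - 1 = 4
eff = effective_weights(coded_mask, mult_mask, num_bands=3)
print(eff[0, 0])  # factors for the three bands of input pixel (0, 0)
print(eff[0, 1])  # factors for the three bands of input pixel (0, 1)
```

With the coded aperture mask alone, the three bands of a pixel would share one factor. With the multiplication mask, pixel (0, 0) receives 1.0, 0.5 and 0.2 across its bands. The factors are not fully independent, however: input elements that land on the same detector pixel share the same multiplier.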

Furthermore, in the imaging device 1A of the first embodiment, the modulation unit 130 neither adds to nor changes the conventional imaging process using CASSI. Therefore, according to the imaging device 1A of this embodiment, the estimation accuracy of the original signal can be improved easily without complicating the device.

<Second embodiment>
FIG. 4 is a block diagram showing the configuration of an imaging device 1B according to the second embodiment. In FIG. 4, the same components as in FIG. 1 are given the same reference numerals as in FIG. 1, and their explanation is omitted here. The imaging device 1B differs from the imaging device 1A of the first embodiment in that it does not include the modulation unit 130 and includes an attention weight setting unit 150 and a mask generation unit 160.

The attention weight setting unit 150 sets, for each pixel imaged by the CASSI observation system 120, a weight (hereinafter referred to as "attention weight") indicating the degree to which the pixel should be attended to as observation data. For example, the attention weight setting unit 150 receives an input for setting the attention weight from an operation input unit (not shown), generates attention weight information indicating the attention weight for each pixel based on the input information, and outputs it to the mask generation unit 160.

More specifically, the attention weight is a value within the range of continuous values [0, 1], and the attention weight setting unit 150 sets a 3D (spatial and wavelength) attention weight for each pixel.

Note that the attention weight may be set by any method other than the method described above. For example, attention weight information may be received from another device via communication, or attention weight information stored in advance from a storage unit (not shown) may be read.

The mask generation unit 160 generates a coded aperture mask based on the attention weight information input from the attention weight setting unit 150. The mask generation unit 160 sets the generated coded aperture mask in the encoding unit 121 of the CASSI observation system 120 at the subsequent stage.

FIG. 5 is an image diagram showing the flow of processing by the imaging device 1B. First, in the imaging device 1B, the attention weight setting unit 150 sets an attention weight for each pixel (step S201) and outputs attention weight information indicating the settings to the mask generation unit 160. For example, the attention weight setting unit 150 may accept an operation to select a rectangular region of interest in an image, or an operation to input an annotation map that indicates, for each region in an image, whether it is a region of interest.

Next, the mask generation unit 160 receives the attention weight information from the attention weight setting unit 150 and generates a coded aperture mask based on the attention weight information (step S202). For example, the mask generation unit 160 can generate a coded aperture mask using an image processing model based on a DNN (Deep Neural Network) such as a CNN (Convolutional Neural Network) or U-Net. The mask generation unit 160 sets the generated coded aperture mask in the encoding unit 121 of the subsequent CASSI observation system 120.

The subsequent processing is similar to the flow of processing in the first embodiment when the multiplication mask is not applied. Specifically, the CASSI observation system 120 acquires (images) observation data using the coded aperture mask set in step S202, and the reconstruction processing unit 140 performs reconstruction processing on the acquired observation data. An image that is the estimation result of the original signal is thereby output. In FIG. 5, the same reference numerals as in FIG. 2 indicate the same processes as those executed by the imaging device 1A of the first embodiment.

Note that FIG. 5 has been described for the case in which the mask generation unit 160 generates only the coded aperture mask in response to the attention weight setting operation, but the mask generation unit 160 may also generate the multiplication mask of the first embodiment in addition to the coded aperture mask. In this case, the function of the mask generation unit 160 to generate a multiplication mask may be realized by machine learning using a deep learning model.
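As an illustration of the rectangular region-of-interest operation, the following minimal sketch turns a selected rectangle and wavelength band range into a 3D attention weight array with values in [0, 1]. The function name `roi_attention_weights`, the half-open index convention, and the inside and outside values of 1.0 and 0.1 are all assumptions made for this example.

```python
import numpy as np

def roi_attention_weights(shape, roi, bands, inside=1.0, outside=0.1):
    """Build a 3D (row, column, band) attention weight array.

    shape : (H, W, L) of the spectral cube
    roi   : (top, left, bottom, right), bottom and right exclusive
    bands : (first, last) band indices of interest, last exclusive
    """
    if not (0.0 <= outside <= 1.0 and 0.0 <= inside <= 1.0):
        raise ValueError("attention weights must lie in [0, 1]")
    weights = np.full(shape, outside, dtype=float)
    top, left, bottom, right = roi
    first, last = bands
    # Full attention only inside the rectangle and the selected bands.
    weights[top:bottom, left:right, first:last] = inside
    return weights

w = roi_attention_weights((4, 4, 3), roi=(1, 1, 3, 3), bands=(0, 2))
print(w[1, 1])  # weights over the three bands at a pixel inside the rectangle
print(w[0, 0])  # weights over the three bands at a pixel outside it
```

Because the array is indexed by wavelength as well as position, different regions can be given different bands of interest by calling the function once per region and combining the results, for example with an element-wise maximum.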

FIG. 6 is an image diagram showing the flow of processing in this case. In this case, the imaging device 1B may further include the modulation unit 130 of the first embodiment, and may be configured to modulate the observation data of the CASSI observation system 120 and then perform the reconstruction process.

FIG. 7 is an image diagram showing a flow in which the mask generation unit 160 generates a multiplication mask and the reconstruction processing unit 140 learns a reconstruction model. As described above, the function of the mask generation unit 160 to generate a multiplication mask and the reconstruction processing unit 140 can each be realized by a deep learning model. For example, by training with an MSE (Mean Squared Error) loss weighted by the attention weights set for each pixel, a multiplication mask generation model and a reconstruction model specialized for reconstructing the region of interest can be constructed for the mask generation unit 160 and the reconstruction processing unit 140, respectively. In other words, by expressing the attention weight as a real value in [0, 1] and designing the loss using that value, hyperspectral images can be measured in a way that takes the degree of importance into account. For example, the loss is defined as in equation (2) below.
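As an illustration of an attention-weighted MSE of this kind, the following minimal sketch assumes one particular form, in which the squared error of each element is multiplied by its attention weight and the sum is normalized by the sum of the weights. The function name, the normalization, and the variable names are assumptions and may differ from the loss used in the embodiment.

```python
import numpy as np

def attention_weighted_mse(target, estimate, weights):
    """Weighted mean of squared errors, using attention weights in [0, 1]."""
    target = np.asarray(target, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total == 0:
        raise ValueError("at least one attention weight must be non-zero")
    # Errors in high-attention elements contribute more to the loss.
    return float(np.sum(weights * (target - estimate) ** 2) / total)

target = np.array([0.0, 0.0])
estimate = np.array([1.0, 2.0])
print(attention_weighted_mse(target, estimate, [1.0, 1.0]))  # 2.5
print(attention_weighted_mse(target, estimate, [1.0, 0.0]))  # 1.0
```

With equal weights the loss reduces to the ordinary MSE. Setting the second weight to 0 removes that element's error from the loss entirely, and intermediate values in [0, 1] scale its influence continuously.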

As described above, for a CASSI observation system that measures hyperspectral images, the imaging device 1B of the second embodiment sets an attention weight indicating the degree of attention for each pixel in the captured image and generates a coded aperture mask based on the set attention weight information, so that a reconstruction model specialized for the region of interest can be constructed. Therefore, according to the imaging device 1B of the second embodiment, it is possible to improve the measurement accuracy of hyperspectral images by CASSI.

Furthermore, the imaging device 1B of the second embodiment may include the modulation unit 130 of the first embodiment. In this case, the mask generation unit 160 generates, based on the attention weight information set by the attention weight setting unit 150, a multiplication mask in addition to the coded aperture mask, using a generation model (DNN) trained by end-to-end optimization, and the modulation unit 130 modulates the observation data using the generated multiplication mask. The measurement accuracy of hyperspectral images by CASSI can thereby be further improved.

More specifically, the imaging device 1B of the second embodiment assigns, for each pixel, weights to the wavelength ranges of interest (sets the attention weights), and generates the coded aperture mask used in the CASSI observation system 120 with a DNN (Deep Neural Network) that takes the attention weights for each pixel and wavelength range as input. In other words, by designing pixel-wise weights and training the coded aperture mask generation model and the reconstruction model with these weights taken into account, measurement images can be acquired in which the wavelength range of interest varies spatially.

As a result, the importance (weight) of each wavelength range can be set as a value in the continuous range [0, 1] rather than a discrete value in {0, 1}, and imaging can be performed with a different wavelength-range importance set for each pixel. Furthermore, according to the imaging device 1B of the second embodiment, unlike the conventional method, all the pixel information of the acquired compressed signal can be used. Sufficient reconstruction performance can therefore be expected from a single image capture, which reduces the measurement cost.

Although the embodiments of the present invention have been described above in detail with reference to the drawings, the specific configuration is not limited to these embodiments, and includes designs within the scope of the gist of the present invention.

The present invention is applicable to an imaging device that measures hyperspectral images using a CASSI observation system.

1A, 1B...Imaging device, 110...Optical system, 120...CASSI observation system, 121...Encoding unit, 122...Dispersion unit, 123...Measurement unit, 130...Modulation unit, 140...Reconstruction processing unit, 150...Attention weight setting unit, 160...Mask generation unit

Claims

1. An imaging device that measures a hyperspectral image by compressed sensing, the imaging device comprising:
a CASSI observation system comprising an encoding unit that encodes an input signal by applying a coded aperture mask to the input signal and outputs the encoded input signal, a dispersion unit that wavelength-disperses the input signal encoded by the encoding unit and outputs the result, and a measurement unit that images the input signal wavelength-dispersed by the dispersion unit;
an attention weight setting unit that sets a weight indicating the degree of attention for each pixel imaged by the CASSI observation system; and
a mask generation unit that generates the coded aperture mask based on the weight indicating the degree of attention set for each pixel by the attention weight setting unit.
2. The imaging device according to claim 1, further comprising a modulation unit that modulates the output signal of the CASSI observation system using a multiplication mask.
3. The imaging device according to claim 2, wherein the mask generation unit generates the multiplication mask in addition to the coded aperture mask based on the weight indicating the degree of attention, and sets the generated multiplication mask in the modulation unit.
4. The imaging device according to claim 1, wherein the attention weight setting unit is capable of setting an arbitrary value between 0 and 1 as the weight indicating the degree of attention for each pixel.
5. The imaging device according to claim 2, further comprising a reconstruction processing unit that receives the output signal of the modulation unit, performs reconstruction processing on it, and outputs an estimation result of a hyperspectral image.
6. The imaging device according to claim 5, wherein the reconstruction processing unit has learned in advance, as a reconstruction model, a neural network constructed to optimize the input and output of the imaging device including the coded aperture mask and the multiplication mask.
7. An imaging method for measuring a hyperspectral image by compressed sensing, the imaging method comprising:
an imaging step of imaging an input signal with a CASSI observation system comprising an encoding unit that encodes the input signal by applying a coded aperture mask to the input signal and outputs the encoded input signal, a dispersion unit that wavelength-disperses the input signal encoded by the encoding unit and outputs the result, and a measurement unit that images the input signal wavelength-dispersed by the dispersion unit; and
a modulation step of modulating the output signal of the CASSI observation system using a multiplication mask.
8. A program for causing a computer to function as the imaging device according to any one of claims 1 to 6.