CN115993611A - Non-visual field imaging method and device based on transient signal super-resolution network - Google Patents


Info

Publication number
CN115993611A
CN115993611A (application CN202310282255.0A)
Authority
CN
China
Prior art keywords
convolution layer
dimensional convolution
sparse
detection
visual field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310282255.0A
Other languages
Chinese (zh)
Other versions
CN115993611B (en)
Inventor
邱凌云
王健羽
付星
史作强
刘新桐
柳强
肖乐平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202310282255.0A priority Critical patent/CN115993611B/en
Publication of CN115993611A publication Critical patent/CN115993611A/en
Application granted granted Critical
Publication of CN115993611B publication Critical patent/CN115993611B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a non-visual field imaging method and device based on a transient signal super-resolution network, belonging to the technical field of optical non-visual field imaging. The method comprises the following steps: acquiring, through sparse detection points preset on an intermediate wall surface, a sparse detection signal of a target to be imaged for non-visual field imaging; recovering the sparse detection signal into a dense detection signal using a preset transient signal super-resolution network; and obtaining a non-visual field imaging result of the target from the dense detection signal. The invention makes full use of the information contained in the sparse detection signal to recover it into a dense detection signal and thereby image the non-visual field target; it offers fast detection, high imaging accuracy and a wide range of applications, overcoming the shortcomings of existing non-visual field imaging technology.

Description

Non-visual field imaging method and device based on transient signal super-resolution network
Technical Field
The invention belongs to the technical field of optical non-visual field (i.e., non-line-of-sight) imaging, and in particular provides a non-visual field imaging method and device based on a transient signal super-resolution network.
Background
Non-visual field imaging is an optical detection technology that reconstructs information about targets outside the field of view by receiving laser signals that have undergone multiple reflections at an intermediate wall surface. Signal acquisition time is one of the key factors limiting the practical value of non-visual field imaging; shortening it would accelerate the adoption of the technology in fields such as autonomous driving, disaster relief, and security and counter-terrorism. Several techniques currently address this problem, and they can be categorized into reducing the single-point acquisition time and reducing the number of detection points. Reducing the single-point acquisition time means shortening the detection time at each detection point while keeping the number of detection points unchanged, for example the first-photon detection method. Reducing the number of detection points means using fewer detection points while keeping the detection time at each point unchanged; depending on the distribution of the points, such methods can be divided into sparse detection, circumferential detection, random detection, and so on. These methods exploit the information in the signal to a greater extent, avoid the drop in signal-to-noise ratio caused by shortening the single-point detection time, and are widely applied in non-visual field imaging. Among them, the sparse detection method has received much attention in recent years because of the large number of scenarios to which it applies.
In sparse detection, an observer emits laser light toward the intermediate wall surface; the point it illuminates on the wall is called the irradiation point. The intermediate wall surface is a wall visible both to the observer and to the target located in the non-visual field region. Photons enter the non-visual field region through diffuse reflection, are reflected at the surface of the non-visual field target, and return to the intermediate wall. A photon-receiving device then records time-resolved photon echo intensity at a certain point on the intermediate wall, called the receiving point. An irradiation point and a receiving point together form a detection point pair, and the number of point pairs used in sparse detection is relatively small. When the irradiation point and the receiving point coincide at every detection point, the detection is called confocal; otherwise it is non-confocal. Both the sparse detection method and the traditional detection method use rectangular lattices of detection points, but the sparse detection lattice contains fewer point pairs, which reduces the acquisition time.
However, few reconstruction algorithms exploit sparse detection signals. One approach combines compressed sensing with a traditional iterative algorithm: the detection points are distributed on the intermediate wall as a sparse square lattice, and a sparse sampling operator is introduced into a regularized least-squares problem so as to minimize the error between the synthesized signal and the sparse detection signal actually measured on the lattice. Such methods suffer from long computation times and low imaging quality, and no satisfactory solution to these problems exists. Another approach interpolates the sparse detection signal with a traditional interpolation method, such as nearest-neighbor or bicubic interpolation, to obtain a dense detection signal, and then applies an existing imaging algorithm. Such methods, however, require the spacing between adjacent points of the sparse detection lattice to be small; when the spacing is large, the recovered dense detection signal cannot yield a clear imaging result. These methods therefore bring little improvement and apply to few scenarios, and no satisfactory solution to these problems exists either.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a non-visual field imaging method and device based on a transient signal super-resolution network. The invention makes full use of the information contained in the sparse detection signal to recover it into a dense detection signal and thereby image the non-visual field target; it offers fast detection, high imaging accuracy and a wide range of applications, overcoming the shortcomings of existing non-visual field imaging technology.
An embodiment of a first aspect of the present invention provides a non-visual field imaging method based on a transient signal super-resolution network, including:
acquiring, through sparse detection points preset on an intermediate wall surface, a sparse detection signal of a target to be imaged for non-visual field imaging;
recovering the sparse detection signal into a dense detection signal by using a preset transient signal super-resolution network;
and obtaining a non-visual field imaging result of the target according to the dense detection signal.
In a specific embodiment of the present invention, the transient signal super-resolution network consists of an interpolation branch and a transient signal super-resolution branch; the dense detection signal corresponding to the sparse detection signal is obtained by summing the output results produced when the sparse detection signal is input into the two branches separately.
The interpolation branch comprises a trilinear interpolation module that performs trilinear interpolation on the input sparse detection signal.
The transient signal super-resolution branch extracts features from the input sparse detection signal and expands the feature dimensions along the set directions.
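The two-branch recovery described above can be sketched numerically. The following is a minimal NumPy sketch, not the patent's implementation: the learned super-resolution branch is replaced by a zero placeholder, and the trilinear interpolation is realized as successive 1-D linear passes over the two probe-point axes (the endpoint-aligned grid is an assumption of this sketch).

```python
import numpy as np

def upsample_linear(x, factor, axis):
    """Linear interpolation along one axis (endpoints aligned)."""
    n = x.shape[axis]
    old = np.arange(n)
    new = np.linspace(0, n - 1, n * factor)
    return np.apply_along_axis(lambda v: np.interp(new, old, v), axis, x)

def interpolation_branch(tau_sparse, factor=4):
    # trilinear interpolation separates into successive 1-D linear passes;
    # only the two probe-point axes are enlarged, the time axis is kept
    out = upsample_linear(tau_sparse, factor, axis=0)
    out = upsample_linear(out, factor, axis=1)
    return out

rng = np.random.default_rng(0)
tau_sparse = rng.random((8, 8, 512))    # 8x8 probe points, 512 time bins
residual = np.zeros((32, 32, 512))      # stand-in for the learned branch output
tau_dense = interpolation_branch(tau_sparse) + residual
```

With the endpoint-aligned grid, the interpolated signal reproduces the original sparse signal exactly at the corner probe points, consistent with the requirement that the sparse detection points be a subset of the dense ones.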
In a specific embodiment of the present invention, obtaining the non-visual field imaging result of the target according to the dense detection signal includes:
the propagation of light in space is described by a wave model represented by the following formula:

$$\frac{1}{c^{2}}\frac{\partial^{2} u}{\partial t^{2}} - \Delta u = 0$$

where $\Delta$ is the Laplace operator; $c$ is the speed of light; and $u(x,t)$ is a complex-valued scalar wave field characterizing the light field at time $t$ and position $x$;

at any detection point $x'$ on the intermediate wall surface and time $t$, the relation between the wave field $u(x',t)$ and the detection signal $\tau(x',t)$ is:

$$u(x',t) = \tau(x',t), \qquad t \in [0, T];$$

for the dense detection signal $\tau$, this relation holds at every dense detection point; then, according to

$$u(x,0) = f(x), \qquad x \in \Omega,$$

$f$ is solved for, yielding the non-visual field imaging result of the target; where $T$ is the time over which the signal is acquired at each detection point and $\Omega$ is the region to be solved.
In a specific embodiment of the present invention, the transient signal super-resolution branch comprises a first three-dimensional convolution layer, 16 identical attention modules, a first upsampling module, a second upsampling module and a second three-dimensional convolution layer, connected in sequence.
The first three-dimensional convolution layer comprises 40 convolution filters of size 3×3×3; its output features pass through the 16 attention modules in sequence to yield the output features of the attention modules.
The output features of the first three-dimensional convolution layer and the output features of the attention modules are summed and then passed through the first upsampling module, the second upsampling module and the second three-dimensional convolution layer in sequence to obtain the output result of the transient signal super-resolution branch; the second three-dimensional convolution layer comprises 24 convolution filters of size 3×3×3.
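As a sanity check on the layer dimensions, the shape flow of this branch can be traced in plain Python. The (batch, channels, height, width, time) layout and the doubling behavior of the upsampling modules are assumptions consistent with the tensor dimensions given later in the description.

```python
# Shape bookkeeping for the transient signal super-resolution branch,
# using (batch, channels, height, width, time) tensors.
def conv3d(shape, cout):
    """A 3x3x3 convolution with stride 1 and padding 1 keeps spatial dims."""
    n, c, h, w, t = shape
    return (n, cout, h, w, t)

def upsample(shape, cout):
    """Each upsampling module doubles the two probe-point dimensions."""
    n, c, h, w, t = shape
    return (n, cout, 2 * h, 2 * w, t)

s = conv3d((1, 1, 8, 8, 512), 40)   # first conv layer -> feature 1
for _ in range(16):                  # attention modules preserve the shape
    s = conv3d(s, 40)
# feature 1 and feature 2 are summed (shape unchanged), then upsampled twice
s = upsample(s, 24)                  # -> (1, 24, 16, 16, 512)
s = upsample(s, 24)                  # -> (1, 24, 32, 32, 512)
out = conv3d(s, 1)                   # second conv layer -> (1, 1, 32, 32, 512)
```

The trace ends at the 1×1×32×32×512 output, matching the dense detection signal produced by the network.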
In a specific embodiment of the present invention, before the sparse detection signal is recovered into the dense detection signal using the preset transient signal super-resolution network, the method further includes:
training the transient signal super-resolution network;
wherein training the transient signal super-resolution network comprises the following steps:
constructing a simulation model of the intermediate wall surface containing the sparse detection points, and setting dense detection points in the simulation model, wherein the set formed by the sparse detection points is a subset of the set formed by the dense detection points;
through simulation, acquiring at the sparse detection points and the dense detection points, respectively, a sparse detection signal and a dense detection signal of a preset virtual non-visual field target for non-visual field imaging, which together form a training sample;
forming a training set from the training samples corresponding to different virtual non-visual field targets;
constructing the transient signal super-resolution network;
and training the transient signal super-resolution network with the training set to obtain the trained transient signal super-resolution network.
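The subset relation between the sparse and dense detection points can be illustrated with a short NumPy sketch. The 4× grid ratio matches the specific embodiment described later; the random dense signal is an illustrative stand-in for a simulated one.

```python
import numpy as np

STRIDE = 4  # the dense grid is 4x denser in each probe-point direction

def make_training_sample(dense):
    """Build a (sparse, dense) pair; the sparse detection points are a
    strided subset of the dense grid, as the training setup requires."""
    sparse = dense[::STRIDE, ::STRIDE].copy()
    return sparse, dense

dense_signal = np.random.rand(32, 32, 512)   # simulated dense detection signal
sparse_signal, target = make_training_sample(dense_signal)
```

Each pair then serves as one training sample: the sparse signal is the network input and the dense signal is the supervision target.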
In a specific embodiment of the present invention, the attention module includes: a third three-dimensional convolution layer; three sub-branches each connected to the output of the third three-dimensional convolution layer; a first leaky rectified linear unit (LeakyReLU) connected to the combined output of the three sub-branches; and a ninth three-dimensional convolution layer connected to the output of the first LeakyReLU;
wherein the three sub-branches comprise: a weight-learning sub-branch, a non-attention sub-branch and an attention sub-branch;
the third three-dimensional convolution layer comprises 40 1×1×1 convolution filters and a second LeakyReLU, connected in sequence;
the weight-learning sub-branch comprises a three-dimensional average pooling layer, a first fully connected layer, a rectified linear unit, a second fully connected layer and a softmax layer, connected in sequence; it outputs the weights used when the other two sub-branches of the current attention module are weighted and summed;
the non-attention sub-branch is a fourth three-dimensional convolution layer comprising 40 3×3×3 convolution filters, and extracts non-attention features of the input data;
the attention sub-branch includes: a fifth three-dimensional convolution layer, a sixth three-dimensional convolution layer, a sigmoid unit, a seventh three-dimensional convolution layer and an eighth three-dimensional convolution layer. The fifth three-dimensional convolution layer comprises 40 3×3×3 convolution filters and a third LeakyReLU, connected in sequence, and takes the input of the sub-branch as input; the sixth three-dimensional convolution layer comprises 40 1×1×1 convolution filters and takes the output of the fifth layer as input; the sigmoid unit takes the output of the sixth layer as input; the seventh three-dimensional convolution layer comprises 40 3×3×3 convolution filters and takes the input of the sub-branch as input; the eighth three-dimensional convolution layer comprises 40 3×3×3 convolution filters, takes as input the element-wise product of the output of the sigmoid unit and the output of the seventh layer, and its output is the output result of the attention sub-branch, which extracts attention features of the input data;
the output results of the non-attention sub-branch and the attention sub-branch are weighted and summed according to the output of the weight-learning sub-branch, and the sum is input to the first LeakyReLU; the output of the first LeakyReLU passes through the ninth three-dimensional convolution layer and is then summed with the input of the attention module to obtain the output of the attention module, wherein the ninth three-dimensional convolution layer comprises 40 1×1×1 convolution filters.
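A minimal NumPy sketch of the module's final data flow follows. The convolution layers are abstracted away (the two sub-branch feature maps are passed in directly), and the LeakyReLU slope of 0.1 and the two-logit softmax are illustrative assumptions, not values from the patent.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def leaky_relu(x, slope=0.1):            # slope value is an assumption
    return np.where(x > 0, x, slope * x)

def attention_module(x, plain_feat, attn_feat, weight_logits):
    """Weighted sum of the two sub-branch outputs, LeakyReLU, then a
    residual connection back to the module input (the 1x1x1 output
    convolution is omitted in this sketch)."""
    w = softmax(weight_logits)           # from the weight-learning sub-branch
    mixed = w[0] * plain_feat + w[1] * attn_feat
    return leaky_relu(mixed) + x

x = np.random.rand(40, 8, 8, 512)        # one module's input feature map
y = attention_module(x, x + 1.0, x - 1.0, np.array([0.0, 0.0]))
```

With equal logits the two sub-branches are mixed with weights 0.5/0.5; the residual connection guarantees that the module's output shape matches its input shape, which is why 16 modules can be stacked.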
In a specific embodiment of the present invention, the sparse detection signal and the dense detection signal of the preset virtual non-visual field target for non-visual field imaging are obtained using the following expression:
$$\tau(x',t) = \int_{S} \frac{\big\langle n_{x'},\, \omega_{x'\to s}\big\rangle \big\langle n_{s},\, \omega_{s\to x'}\big\rangle}{d(s,x')^{4}}\; f(s)\,\delta\!\big(2\,d(s,x') - c\,t\,\Delta t\big)\,\mathrm{d}s$$

where $\tau(x',t)$ is the photon intensity accumulated at any detection point $x'$ over the time interval $[t\Delta t,(t+1)\Delta t]$; $S$ is the surface of the non-visual field target facing the intermediate wall; $s$ denotes any point on the surface $S$; $\langle\cdot,\cdot\rangle$ denotes the inner product in three-dimensional Euclidean space; $\omega_{x'\to s}$ and $\omega_{s\to x'}$ denote the unit vectors pointing from $x'$ to $s$ and from $s$ to $x'$, respectively; $n_{s}$ and $n_{x'}$ denote the unit normals of the surface $S$ at $s$ and of the intermediate wall at $x'$, both pointing into the non-visual field region; $d(s,x')$ is the distance from the surface point $s$ to the wall point $x'$; $\delta$ is the Dirac function; $c$ is the speed of light; $\Delta t$ is the time resolution of the detector; $f(s)$ is the reflectivity at $s$; and $\mathrm{d}s$ is the area measure at the point $s$ on the surface $S$.
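The transmission model above can be discretized directly. The following NumPy sketch histograms photon returns for a single confocal detection point over a set of surface facets; the confocal round-trip factor of 2 in the delay and the simple bin quantization are assumptions of this sketch, and the 32 ps bin width is the detector resolution quoted in the embodiments.

```python
import numpy as np

C = 3e8            # speed of light (m/s)
DT = 32e-12        # detector time resolution (s), as in the embodiments

def render_confocal_transient(points, albedo, normals, xp, n_wall, n_bins=512):
    """Accumulate photon intensity into time bins for one confocal
    detection point xp on the intermediate wall."""
    tau = np.zeros(n_bins)
    for s, f, ns in zip(points, albedo, normals):
        d = np.linalg.norm(s - xp)
        w_sp = (xp - s) / d                            # unit vector s -> x'
        cos_s = max(float(np.dot(ns, w_sp)), 0.0)      # <n_s, w_{s->x'}>
        cos_w = max(float(np.dot(n_wall, -w_sp)), 0.0) # <n_{x'}, w_{x'->s}>
        b = int(2.0 * d / (C * DT))                    # round-trip delay -> bin
        if b < n_bins:
            tau[b] += f * cos_s * cos_w / d**4
    return tau

# one unit-albedo facet 1 m in front of the wall, facing it
tau = render_confocal_transient(
    points=np.array([[0.0, 0.0, 1.0]]), albedo=np.array([1.0]),
    normals=np.array([[0.0, 0.0, -1.0]]),
    xp=np.array([0.0, 0.0, 0.0]), n_wall=np.array([0.0, 0.0, 1.0]))
```

Running this over every detection point of the sparse and dense grids yields the synthetic signal pairs used for training.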
In a specific embodiment of the present invention, each virtual non-visual field target is a two-dimensional image or a three-dimensional model; during simulation the targets are placed in a virtual cuboid space of fixed size so that their positions are fixed, making the distance between the center of each virtual non-visual field target and the intermediate wall surface the same in every simulation.
In a specific embodiment of the present invention, the transient signal super-resolution network is trained on the training set with the following loss function:
$$L(\theta) = \frac{1}{M}\sum_{i=1}^{M} \ell_{1}\!\left(F_{\theta}\big(\tau_{\mathrm{s}}^{(i)}\big),\; \tau_{\mathrm{d}}^{(i)}\right)$$

where $M$ is the total number of training samples; $\ell_{1}$ denotes the pointwise $L_{1}$ loss function; $F_{\theta}$ denotes the transient signal super-resolution network with trainable parameters $\theta$; and $\tau_{\mathrm{s}}^{(i)}$ and $\tau_{\mathrm{d}}^{(i)}$ denote the sparse and dense detection signals in the $i$-th training sample, respectively.
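The loss reduces to a mean absolute error averaged over training pairs, which can be written out in a few lines of NumPy. The zero-output "network" below is a placeholder used only to exercise the formula, not part of the patent's method.

```python
import numpy as np

def l1_loss(pred, target):
    """Pointwise L1 loss, averaged over all signal entries."""
    return float(np.abs(pred - target).mean())

def training_loss(net, samples):
    """Mean L1 loss of the network output over M (sparse, dense) pairs."""
    return sum(l1_loss(net(s), d) for s, d in samples) / len(samples)

# a trivial stand-in "network" that returns an all-zero dense signal
zero_net = lambda s: np.zeros((32, 32, 512))
loss = training_loss(zero_net,
                     [(np.zeros((8, 8, 512)), np.ones((32, 32, 512)))])
```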
An embodiment of a second aspect of the present invention provides a non-field-of-view imaging device based on a transient signal super-resolution network, including:
the sparse detection signal acquisition module is used for acquiring sparse detection signals of the target to be imaged for non-visual field imaging through sparse detection points preset on the intermediate wall surface;
the dense detection signal recovery module is used for recovering the sparse detection signal into a dense detection signal by utilizing a preset transient signal super-resolution network;
and the imaging module is used for obtaining a non-visual field imaging result of the target according to the dense detection signal.
An embodiment of a third aspect of the present invention provides an electronic device, including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executable to perform the non-visual field imaging method based on a transient signal super-resolution network described above.
An embodiment of a fourth aspect of the present invention proposes a computer-readable storage medium storing computer instructions for causing the computer to perform the above-described non-visual field imaging method based on a transient signal super-resolution network.
The invention has the characteristics and beneficial effects that:
the invention can more accurately and efficiently detect and image the non-visual field target. According to the invention, laser is emitted to sparse detection points of an intermediate wall surface, then echo data received by a receiver is used for obtaining sparse detection data, the sparse detection data is restored to dense detection data by using a neural network, and finally, the imaging of a non-visual field target is completed by combining a rapid algorithm aiming at the dense detection data.
The invention fully utilizes all information of the sparse detection signals in the signal domain to reconstruct the position and reflectivity information of the non-visual field target, overcomes the defect that the existing sparse detection reconstruction algorithm only utilizes information in the image domain, cannot use a rapid algorithm, and can accurately and efficiently image the non-visual field target.
Compared with the existing dense detection mode, the signal acquisition time is shortened by about 95%; compared with the existing sparse detection imaging algorithm, the imaging time of the method is shortened by about 99%. The method has stronger robustness to noise, greatly reduces the detection time of a single point, increases the noise level, and can clearly give the imaging result of a non-visual field target.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
fig. 1 is a general flow chart of a non-field-of-view imaging method based on a transient signal super-resolution network in an embodiment of the invention.
FIG. 2 is a schematic diagram of a scenario in which training samples are collected by simulation in one embodiment of the present invention.
Fig. 3 is a schematic diagram of a distribution of detection points when detecting a signal according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a transient signal super-resolution network structure according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the structure of the attention module in an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a non-visual field imaging method and device based on a transient signal super-resolution network, and the method and device are further described in detail below with reference to drawings and specific embodiments.
An embodiment of a first aspect of the present invention provides a non-visual field imaging method based on a transient signal super-resolution network, including:
acquiring, through sparse detection points preset on an intermediate wall surface, a sparse detection signal of a target to be imaged for non-visual field imaging;
recovering the sparse detection signal into a dense detection signal by using a preset transient signal super-resolution network;
and obtaining a non-visual field imaging result of the target according to the dense detection signal.
In a specific embodiment of the present invention, the overall flow of the non-visual field imaging method based on the transient signal super-resolution network is shown in fig. 1 and includes the following steps:
1) Training;
1-1) acquiring a training set;
in the embodiment of the invention, the virtual non-visual field target set used for generating the composite signal is obtained, the virtual non-visual field target set can comprise a plurality of two-dimensional images and three-dimensional models (only one or two of the two-dimensional images and the three-dimensional models can be provided), and the invention has no special requirements on the composition of the virtual non-visual field target set and the distance between the images or the three-dimensional models in the set and the intermediate wall surface.
In a specific embodiment of the invention, 5000 two-dimensional images (the image content has no special requirement) are selected from Fashion-MNIST, 5000 three-dimensional automobile models are selected from automobile subclasses of Shape-Net to form a virtual non-visual field target set together, and a single image or a single three-dimensional model in the virtual non-visual field target set is used as a virtual non-visual field target to be arranged in a set 1m×1m× 0.2m virtual box for fixing the target, wherein 0.2m refers to the length of the box in the vertical direction of an intermediary wall (representing the depth of the virtual non-visual field target in simulation), and the center of the box (representing the center of the target in simulation) is 1m away from the intermediary wall. In this embodiment, a schematic view of a scene in which a training sample is collected is shown in fig. 2, and an area surrounded by a dashed line in the figure is a box containing a virtual non-visual field target, where the virtual non-visual field target is in a non-visual field of an observer, and in this embodiment, the intermediate wall surface is a cement wall surface. The intermediate wall surface set during simulation is consistent with the real intermediate wall surface for actual imaging, and the sparse detection points and the dense detection points on the simulation intermediate wall surface are consistent with the actual imaging.
After the virtual non-visual field target is placed, this embodiment generates synthetic signals with the aid of the three-point transmission model commonly used in non-visual field imaging. The model has several simplified forms, and this embodiment places no special requirement on the specific form used. In one embodiment of the invention, the reflection process in space of the detection laser emitted by the laser source can be described by the following three-point transmission model:
$$\tau(x',t) = \int_{S} \frac{\big\langle n_{x'},\, \omega_{x'\to s}\big\rangle \big\langle n_{s},\, \omega_{s\to x'}\big\rangle}{d(s,x')^{4}}\; f(s)\,\delta\!\big(2\,d(s,x') - c\,t\,\Delta t\big)\,\mathrm{d}s \qquad (1)$$

where $\tau(x',t)$ is the photon intensity accumulated at the detection point $x'$ over the time interval $[t\Delta t,(t+1)\Delta t]$; the time index $t$ is counted from 0, corresponding to an initial time of 0. $S$ is the surface of the non-visual field target facing the intermediate wall, and $s$ denotes any point on $S$. $\langle\cdot,\cdot\rangle$ denotes the inner product in three-dimensional Euclidean space; $\omega_{x'\to s}$ and $\omega_{s\to x'}$ denote the unit vectors pointing from $x'$ to $s$ and from $s$ to $x'$, respectively; $n_{s}$ and $n_{x'}$ denote the unit normals of the surface $S$ at $s$ and of the intermediate wall at $x'$, both pointing into the non-visual field region. $d(s,x')$ is the distance from the surface point $s$ to the wall point $x'$. $\delta$ is the Dirac function; $c$ is the speed of light, set in a specific embodiment of the invention to the vacuum light speed of $3\times 10^{8}$ m/s. $\Delta t$ is the time resolution of the detector, set to 32 ps in one embodiment of the invention. $f(s)$ is the reflectivity at the surface point $s$, and $\mathrm{d}s$ is the area measure at $s$ on the surface $S$.
Fig. 3 is a schematic diagram of the distribution of detection points when detecting a signal in an embodiment of the present invention. As shown in fig. 3, the larger circular points represent the detection points used in sparse detection; the square points represent points that are not detected during sparse detection, whose signals are recovered by the transient signal super-resolution network; the smaller circular dots merely indicate points omitted from the figure. The signals at all circular and square points together constitute the dense detection signal. Through the transmission model shown in formula (1), the invention can generate, for each virtual non-visual field target, the corresponding synthetic dense detection signal $\tau_{\mathrm{d}}$ and synthetic sparse detection signal $\tau_{\mathrm{s}}$ as a training sample; the training samples corresponding to the virtual non-visual field targets in the set form the training set.
The specific generation method of each training sample is as follows: each virtual non-visual field target corresponds to a reflectivity function $f(x)$; by changing the positions and the number of the detection points, the sparse and dense detection signals corresponding to the virtual non-visual field target can be obtained with formula (1) for later training.
Completing this process for all targets in the virtual non-visual field target set yields the training set.
The invention places no special requirement on the ratio between the numbers of detection points of the sparse signal $\tau_{\mathrm{s}}$ and the dense signal $\tau_{\mathrm{d}}$, provided that the set of sparse detection points is a subset of the set of dense detection points. In a specific embodiment of the invention, the ratio is set to 4: when $\tau_{\mathrm{s}}$ has N×N detection points, $\tau_{\mathrm{d}}$ has 4N×4N detection points. The invention places no special requirement on N. In one embodiment, the number of detection points in sparse detection is set to 8×8, uniformly distributed on a 2 m × 2 m intermediate wall surface, and the detection points of the dense detection signal are set to 32×32, uniformly distributed on the same 2 m × 2 m intermediate wall surface. The time resolution of the detector is likewise not restricted; in one embodiment it is set to 32 ps. Taking a virtual non-visual field target in this embodiment as an example, detection signals are generated by the transmission model of formula (1) at the 8×8 uniformly distributed points, giving a sparse detection signal of dimension 8×8×512; detection signals are likewise generated at the 32×32 uniformly distributed points, giving a dense detection signal of dimension 32×32×512.
1-2) constructing a transient signal super-resolution network.
Fig. 4 is a schematic structural diagram of a transient signal super-resolution network according to an embodiment of the present invention. The network consists of an interpolation branch and a transient signal super-resolution branch, and dense detection signals corresponding to the sparse detection signals are obtained by summing output results obtained after the sparse detection signals are respectively input into the two branches. In one embodiment of the present invention, the input of the transient super-resolution network is a sparse detection signal with dimensions of 1×1×8×8×512, and the output is a dense detection signal with dimensions of 1×1×32×32×512.
Specifically, the interpolation branch comprises a tri-linear interpolation module, which is used for performing tri-linear interpolation calculation on the input sparse detection signal to obtain an output result of the interpolation branch, and the result is summed with the output of the transient signal super-resolution branch to obtain a dense detection signal recovered by the transient signal super-resolution network. In the present embodiment, the dimension of the branch input is 1×1×8×8×512, and the dimension of the output is 1×1×32×32×512.
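The interpolation branch can be sketched with PyTorch's built-in trilinear interpolation. This is our minimal approximation of the tri-linear interpolation module, not the patent's exact implementation; a shorter time axis (T = 64 instead of 512) is used only to keep the demo light:

```python
import torch
import torch.nn.functional as F

# Hedged sketch of the interpolation branch: trilinear interpolation lifts a
# sparse signal (1, 1, 8, 8, T) to the dense grid (1, 1, 32, 32, T).
# The time dimension T (512 in the patent's embodiment) is left unchanged.
def interpolation_branch(sparse: torch.Tensor) -> torch.Tensor:
    b, c, h, w, t = sparse.shape
    return F.interpolate(sparse, size=(4 * h, 4 * w, t),
                         mode="trilinear", align_corners=False)

x = torch.randn(1, 1, 8, 8, 64)          # reduced T = 64 for illustration
print(interpolation_branch(x).shape)     # torch.Size([1, 1, 32, 32, 64])
```

In the full network this output is summed with the super-resolution branch's output, so the interpolation acts as a data-independent baseline that the learned branch only needs to correct.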
In the embodiment of the invention, the transient signal super-resolution branch comprises a first three-dimensional convolution layer, 16 identical attention modules, 2 up-sampling modules and a second three-dimensional convolution layer which are connected in sequence. The first three-dimensional convolution layer comprises 40 3×3×3 convolution filters; its input is a sparse detection signal of dimension 1×1×8×8×512 and its output is a tensor of dimension 1×40×8×8×512, called feature 1. The input of the 16 attention modules is feature 1 and their output is a tensor of dimension 1×40×8×8×512, called feature 2; the input of each attention module is a tensor of dimension 1×40×8×8×512 and the output of each attention module is a tensor of dimension 1×40×8×8×512. Feature 1 and feature 2 are summed and input to the first up-sampling module, whose input in this embodiment is a tensor of dimension 1×40×8×8×512 and whose output is a tensor of dimension 1×24×16×16×512. The input of the second up-sampling module is the output of the first up-sampling module, a tensor of dimension 1×24×16×16×512, and its output is a tensor of dimension 1×24×32×32×512. The second three-dimensional convolution layer comprises 24 3×3×3 convolution filters; its input is a tensor of dimension 1×24×32×32×512 and its output is a tensor of dimension 1×1×32×32×512, namely the output result of the transient signal super-resolution branch.
In the transient signal super-resolution branch, the first three-dimensional convolution layer raises the dimension of the data in one direction; the attention modules extract and combine features of the high-dimensional data; the up-sampling modules raise the dimension of the extracted features in two specific directions and reduce the dimension in one specific direction; the second three-dimensional convolution layer further reduces the dimension in one specific direction. In this embodiment, the processing of the transient signal super-resolution network takes the data dimension in two directions from 8×8 to 32×32, completing recovery of the dense detection signal.
Further, in an embodiment of the present invention, the attention module is configured as shown in fig. 5. The attention module comprises: a third three-dimensional convolution layer; three sub-branches each connected to the output of the third three-dimensional convolution layer; a first leaky linear rectifying unit (leaky ReLU) connected to the combined output of the three sub-branches; and a ninth three-dimensional convolution layer connected to the output of the first leaky linear rectifying unit.
The input to the attention module is a tensor of dimension 1×40×8×8×512. The input tensor first passes through the third three-dimensional convolution layer, which comprises 40 1×1×1 convolution filters and a second leaky linear rectifying unit connected in sequence, producing an output tensor of dimension 1×40×8×8×512; this tensor then enters the three sub-branches of the attention module, namely the weight-learning sub-branch, the no-attention sub-branch and the attention sub-branch. The weight-learning sub-branch comprises, connected in sequence, a three-dimensional average pooling layer (3D adaptive average pooling), two fully connected layers with a linear rectifying unit (ReLU) between them, and a softmax layer. The input of the three-dimensional average pooling layer is a tensor of dimension 1×40×8×8×512 and its output is a tensor of dimension 1×40×1×1×1; the input of the first fully connected layer is a 1×40 tensor and its output is a 1×10 vector; the input and output of the linear rectifying unit are 1×10 vectors; the input of the second fully connected layer is a 1×10 vector and its output is a 1×2 vector; the input and output of the softmax layer are 1×2 vectors. This output vector serves as the weights when the two following sub-branches are summed; the function of this sub-branch is to learn the weights for the weighted summation of the other two sub-branches in the current module. The no-attention sub-branch is a fourth three-dimensional convolution layer comprising 40 3×3×3 convolution filters, whose input and output are both tensors of dimension 1×40×8×8×512; this sub-branch directly processes the input tensor to optimize the target features expressed in it.
The attention sub-branch comprises: a fifth three-dimensional convolution layer, a sixth three-dimensional convolution layer, a sigmoid unit, a seventh three-dimensional convolution layer, and an eighth three-dimensional convolution layer. The input of the attention sub-branch is a tensor of dimension 1×40×8×8×512, and its output is a tensor of the same dimension. The fifth three-dimensional convolution layer comprises 40 3×3×3 convolution filters and a third leaky linear rectifying unit connected in sequence; its input is the input of the sub-branch, a tensor of dimension 1×40×8×8×512, and its output is a tensor of dimension 1×40×8×8×512. The sixth three-dimensional convolution layer comprises 40 1×1×1 convolution filters; its input is the output of the fifth three-dimensional convolution layer and its output is a tensor of dimension 1×40×8×8×512. The input of the sigmoid unit is the output of the sixth three-dimensional convolution layer, and its output is a tensor of dimension 1×40×8×8×512. The seventh three-dimensional convolution layer comprises 40 3×3×3 convolution filters; its input is the input of the sub-branch and its output is a tensor of dimension 1×40×8×8×512. The eighth three-dimensional convolution layer comprises 40 3×3×3 convolution filters; its input is the element-wise product of the output of the sigmoid unit and the output of the seventh three-dimensional convolution layer, a tensor of dimension 1×40×8×8×512, and its output, which is the output of the attention sub-branch, is a tensor of dimension 1×40×8×8×512, consistent with the input. The effect of this sub-branch is to apply an attention mechanism to the input, i.e. to give more weight to certain features that deserve greater attention in the subsequent processing. After the three sub-branches have run, the results of the no-attention sub-branch and the attention sub-branch are weighted and summed according to the output of the weight-learning sub-branch, and the sum is input to the first leaky linear rectifying unit, whose input and output dimensions are both 1×40×8×8×512. The result is then passed through the ninth three-dimensional convolution layer, comprising 40 1×1×1 convolution filters, and summed with the input of the attention module to obtain the output of the attention module, a tensor of dimension 1×40×8×8×512.
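A hedged PyTorch sketch of the attention module described above (class and attribute names are ours; kernel sizes and the fully-connected widths follow the embodiment, while the demo call uses a reduced channel count and small spatial/time dimensions to stay light):

```python
import torch
import torch.nn as nn

# Hedged sketch: 1x1x1 conv front-end, three sub-branches (weight learning,
# no-attention, attention), a leaky ReLU, a 1x1x1 conv, and a residual
# connection back to the module input, as in the text above.
class AttentionModule(nn.Module):
    def __init__(self, ch: int = 40):
        super().__init__()
        self.front = nn.Sequential(nn.Conv3d(ch, ch, 1), nn.LeakyReLU(0.2))
        # weight-learning sub-branch: pool -> FC(ch->10) -> ReLU -> FC(10->2) -> softmax
        self.weights = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(ch, 10), nn.ReLU(), nn.Linear(10, 2), nn.Softmax(dim=1))
        self.no_att = nn.Conv3d(ch, ch, 3, padding=1)        # 3x3x3 conv
        # attention sub-branch gate: sigmoid(conv1x1x1(lrelu(conv3x3x3(x))))
        self.gate = nn.Sequential(
            nn.Conv3d(ch, ch, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(ch, ch, 1), nn.Sigmoid())
        self.feat = nn.Conv3d(ch, ch, 3, padding=1)          # 7th conv layer
        self.post = nn.Conv3d(ch, ch, 3, padding=1)          # 8th conv layer
        self.out = nn.Sequential(nn.LeakyReLU(0.2), nn.Conv3d(ch, ch, 1))

    def forward(self, x):
        y = self.front(x)
        w = self.weights(y)                       # (batch, 2) branch weights
        att = self.post(self.gate(y) * self.feat(y))
        mix = w[:, 0, None, None, None, None] * self.no_att(y) \
            + w[:, 1, None, None, None, None] * att
        return self.out(mix) + x                  # residual to module input

m = AttentionModule(ch=8)                         # reduced channels for demo
x = torch.randn(1, 8, 4, 4, 16)
print(m(x).shape)                                 # torch.Size([1, 8, 4, 4, 16])
```

The output dimension equals the input dimension, which is what allows 16 such modules to be chained.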
Each up-sampling module has the same structure, comprising a tri-linear interpolation sub-module and four three-dimensional convolution layers connected in sequence. The input of the first up-sampling module is a tensor of dimension 1×40×8×8×512, which the tri-linear interpolation sub-module of the module turns into a tensor of dimension 1×40×16×16×512. This is input to the first three-dimensional convolution layer of the first up-sampling module, which comprises 24 3×3×3 convolution filters and outputs a tensor of dimension 1×24×16×16×512. The second three-dimensional convolution layer of the first up-sampling module comprises 24 3×3×3 convolution filters followed by a sigmoid unit; its input is the output of the first three-dimensional convolution layer of the module, of dimension 1×24×16×16×512, and its output has dimension 1×24×16×16×512. The third three-dimensional convolution layer of the module comprises 24 3×3×3 convolution filters and the module's second leaky linear rectifying unit; its input is the element-wise product of the outputs of the module's first and second three-dimensional convolution layers, passed through the module's first leaky linear rectifying unit, of dimension 1×24×16×16×512, and its output is a tensor of dimension 1×24×16×16×512.
The input of the second up-sampling module is a tensor of dimension 1×24×16×16×512, which the tri-linear interpolation sub-module of the module turns into a tensor of dimension 1×24×32×32×512. This is input to the first three-dimensional convolution layer of the second up-sampling module, which comprises 24 3×3×3 convolution filters. The second three-dimensional convolution layer of the second up-sampling module comprises 24 3×3×3 convolution filters followed by a sigmoid unit; its input is the output of the first three-dimensional convolution layer of the module, of dimension 1×24×32×32×512, and its output has dimension 1×24×32×32×512. The third three-dimensional convolution layer of the module comprises 24 3×3×3 convolution filters and the module's second leaky linear rectifying unit; its input is the element-wise product of the outputs of the module's first and second three-dimensional convolution layers, passed through the module's first leaky linear rectifying unit, of dimension 1×24×32×32×512, and its output is a tensor of dimension 1×24×32×32×512.
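The up-sampling module can be sketched in PyTorch as follows (a hedged approximation with names of our choosing; the demo uses reduced channel and time dimensions, while the text's embodiment uses 40→24 channels and T = 512):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch of one up-sampling module: trilinear interpolation doubles
# the two spatial grid dimensions, a gated pair of 3x3x3 convolutions (the
# second followed by a sigmoid) is multiplied elementwise, passed through a
# leaky ReLU, and refined by a final 3x3x3 convolution with leaky ReLU.
class UpsamplingModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int = 24):
        super().__init__()
        self.conv1 = nn.Conv3d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Sequential(nn.Conv3d(out_ch, out_ch, 3, padding=1),
                                   nn.Sigmoid())
        self.lrelu = nn.LeakyReLU(0.2)
        self.conv3 = nn.Sequential(nn.Conv3d(out_ch, out_ch, 3, padding=1),
                                   nn.LeakyReLU(0.2))

    def forward(self, x):
        b, c, h, w, t = x.shape
        x = F.interpolate(x, size=(2 * h, 2 * w, t), mode="trilinear",
                          align_corners=False)        # time axis unchanged
        a = self.conv1(x)
        return self.conv3(self.lrelu(a * self.conv2(a)))

up = UpsamplingModule(in_ch=8, out_ch=6)              # reduced sizes for demo
y = up(torch.randn(1, 8, 4, 4, 16))
print(y.shape)                                        # torch.Size([1, 6, 8, 8, 16])
```

Two such modules in sequence take the 8×8 grid of the embodiment to 16×16 and then 32×32.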
1-3) training the transient signal super-resolution network established in the step 1-2) by using the training set obtained in the step 1-1) to obtain a trained transient signal super-resolution network.
In one embodiment of the invention, the transient signal super-resolution network uses the Adam optimizer to update the network weights; the model is initialized with orthogonal initialization; the learning rate follows a cosine annealing schedule with restarts, whose initial learning rate decays to 0 within 24 epochs. The loss function L(θ) used in training the network is expressed as follows:

L(θ) = (1/M) · Σ_{i=1}^{M} ℓ₁( N_θ(τ_s⁽ⁱ⁾), τ_d⁽ⁱ⁾ )    (2)

where M is the total number of training samples, ℓ₁ denotes the pointwise L1 loss function, N_θ denotes the transient signal super-resolution network with trainable parameters θ, and τ_s⁽ⁱ⁾ and τ_d⁽ⁱ⁾ denote, respectively, the sparse and dense detection signals in the i-th training sample.
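A minimal sketch of this loss: the pointwise L1 distance averaged over a batch of samples. Here `net` is only a stand-in for the transient signal super-resolution network (an identity mapping, which is why the demo tensors are equal-sized):

```python
import torch

# Hedged sketch of the training loss of Eq. (2): mean pointwise L1 distance
# between the network's recovered dense signal and the ground truth,
# averaged over the samples in the batch.
def training_loss(net, sparse_batch, dense_batch):
    return torch.nn.functional.l1_loss(net(sparse_batch), dense_batch)

net = torch.nn.Identity()              # placeholder for the real network,
s = torch.zeros(2, 1, 8, 8, 16)        # so input/target shapes match here
d = torch.ones(2, 1, 8, 8, 16)
print(training_loss(net, s, d).item())  # 1.0 for these constant tensors
```

With the real network, `sparse_batch` would have an 8×8 grid and `dense_batch` a 32×32 grid, and `training_loss` would be minimized by Adam as described above.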
And after training is finished, obtaining the trained transient signal super-resolution network.
2) Testing stage.
In this embodiment, in the test stage, a detection laser is emitted to an intermediate wall surface (the intermediate wall surface is completely consistent with an intermediate wall surface simulated during training), a signal at a preset sparse detection point is detected to obtain a sparse detection signal, then the sparse detection signal is recovered to a dense detection signal by using the transient signal super-resolution network after training in the step 1), and finally the dense detection signal is used for imaging by using a method based on fast fourier transform. The method comprises the following specific steps:
2-1) Acquire the sparse detection signal τ_s of the object to be imaged.
The embodiment of the invention has no special requirements on the laser emitter and the detector used in the test stage. In one embodiment of the present invention, an NKT Photonics OneFive KATANA 05 HP was used as the laser emitter, with a wavelength of 532nm, a pulse width of 35ps and a repetition rate of 10MHz. The detector consists of a single-photon avalanche photodiode (Micro Photon Devices PDM series SPAD), a time-correlated single photon counter (PicoQuant PicoHarp) and a two-dimensional scanning galvanometer. The single-photon avalanche photodiode serves as the detector; its response band is the visible band and it has a gating function. The time-correlated single photon counter serves as the counting unit, with a time resolution of 32ps; together with the detector it forms the detection module.
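As an illustration of the counting unit's role (synthetic data, not a measurement), a 32 ps time-correlated counter maps photon arrival times into the 512 time bins that form the time axis of each detection signal:

```python
import numpy as np

# Hedged illustration (values follow the embodiment, photons are synthetic):
# a time-correlated single photon counter with 32 ps resolution turns photon
# arrival times into a histogram of 512 time bins.
bin_ps, n_bins = 32.0, 512
rng = np.random.default_rng(0)
arrivals_ps = rng.uniform(0, bin_ps * n_bins, size=10_000)  # synthetic photons

hist, _ = np.histogram(arrivals_ps, bins=n_bins,
                       range=(0.0, bin_ps * n_bins))
print(hist.shape)        # (512,)
print(hist.sum())        # 10000: every photon lands in exactly one bin
```

Repeating this at each of the 8×8 sparse detection points yields the 8×8×512 sparse detection signal described above.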
The embodiment of the invention has no special requirements on the surface material or the shape of the object to be imaged; in one specific embodiment of the invention, the non-visual field object is a statue made of gypsum. In the non-visual field imaging problem, the distance between the non-visual field object and the intermediate wall surface is typically 0.5m-2m; in one embodiment of the present invention, this distance is set to 1m, with the object facing the intermediate wall surface so that the emitted light can be received. It should be noted that, in the test stage, it is suggested that the distance between the target and the intermediate wall surface be kept consistent with the distance used in the training stage, and that the target be of a size that would fit into the virtual box of the training stage; both suggestions lead to a better imaging effect.
In this embodiment, the detection point settings of the test stage remain consistent with those of the training stage. In one embodiment of the present invention, the sparse detection points are uniformly distributed on the 2m×2m intermediate wall surface with a spacing of 0.25m, corresponding to an 8×8 lattice in which the leftmost column of detection points lies on the edge of the intermediate wall surface, as shown by the larger black dots in fig. 3.
2-2) Input the sparse detection signal obtained in step 2-1) into the trained transient signal super-resolution network obtained in step 1) to obtain the dense detection signal τ_d.

In this embodiment, the measured sparse detection signal τ_s is input into the trained transient signal super-resolution network of step 1), yielding the dense detection data:

τ_d = N_θ(τ_s)    (3)

where N_θ denotes the trained transient signal super-resolution network.
2-3) Perform fast imaging using the dense detection signal τ_d obtained in step 2-2).
In one embodiment of the present invention, the propagation of light in space can be described by a wave model:

∂²u(x,t)/∂t² = c²Δu(x,t)    (4)

where Δ is the Laplace operator; c is the speed of light, taken in one embodiment of the invention as the vacuum speed of light c = 3×10⁸ m/s; t denotes time; and u(x,t) is a complex-valued scalar wave field characterizing the light field at time t and position x. At any detection point x_d on the intermediate wall surface and any time t, the relation between the wave field u and the detection signal τ is u(x_d,t) = τ(x_d,t); that is, for the recovered dense detection signal τ_d there is u(x_d,t) = τ_d(x_d,t). According to this model, the non-visual field imaging problem can be translated into: given u(x_d,t) for t ∈ [0,T], solve for u(x,0) for x ∈ Ω, and thereby obtain the imaging result of the non-visual field target; where T is the time over which the signal is acquired at any detection point in step 2-1), and Ω is the region to be solved.
The imaging problem may be solved using any existing efficient algorithm. In one embodiment of the invention, the classical frequency-wavenumber (f-k) method is used for fast solving.
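A heavily hedged NumPy sketch in the spirit of the frequency-wavenumber method named above; the patent does not specify its solver's grids or scaling, so the Stolt-style resampling, the round-trip factor c/2, and all names here are our assumptions, and this is an illustration of the idea rather than the invention's solver:

```python
import numpy as np

# Hedged, simplified frequency-wavenumber (Stolt-style) migration sketch.
def fk_migration(tau, dx, dt, c=3e8):
    """tau: (nx, ny, nt) dense detection signal on the wall at z = 0;
    dx: detection-point spacing [m]; dt: time-bin width [s].
    Returns an (nx, ny, nt) magnitude volume, depth along the last axis."""
    nx, ny, nt = tau.shape
    spec = np.fft.fftn(tau)                               # Phi(kx, ky, f)
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    f_sorted = np.fft.fftshift(np.fft.fftfreq(nt, d=dt))  # increasing freqs
    kz = 2 * np.pi * np.fft.fftfreq(nt, d=c * dt / 2)     # round trip: c/2

    out = np.zeros_like(spec)
    for i in range(nx):
        for j in range(ny):
            # Stolt mapping: sample the measured spectrum at the temporal
            # frequency where (4*pi*f/c)^2 = kx^2 + ky^2 + kz^2 holds
            w = (c / 2) / (2 * np.pi) * np.sign(kz) * np.sqrt(
                kx[i] ** 2 + ky[j] ** 2 + kz ** 2)
            col = np.fft.fftshift(spec[i, j])
            out[i, j] = (np.interp(w, f_sorted, col.real)
                         + 1j * np.interp(w, f_sorted, col.imag))
    return np.abs(np.fft.ifftn(out))

rng = np.random.default_rng(1)
vol = fk_migration(rng.standard_normal((8, 8, 16)), dx=0.0625, dt=32e-12)
print(vol.shape)  # (8, 8, 16)
```

The appeal of this family of solvers is that the cost is dominated by FFTs, which is what makes the "fast imaging" of step 2-3) possible on the 32×32×512 dense signal.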
To achieve the above embodiments, a second aspect of the present invention provides a non-field of view imaging device based on a transient signal super resolution network, including:
the sparse detection signal acquisition module is used for acquiring sparse detection signals of the target to be imaged for non-visual field imaging through sparse detection points preset on the intermediate wall surface;
the dense detection signal recovery module is used for recovering the sparse detection signal into a dense detection signal by utilizing a preset transient signal super-resolution network;
and the imaging module is used for obtaining a non-visual field imaging result of the target according to the dense detection signal.
It should be noted that the foregoing explanation of the embodiment of the non-visual field imaging method based on a transient signal super-resolution network also applies to the non-visual field imaging device based on a transient signal super-resolution network of this embodiment, and is not repeated here. The non-visual field imaging device based on the transient signal super-resolution network provided by the embodiment of the invention acquires a sparse detection signal of an object to be imaged for non-visual field imaging through sparse detection points preset on an intermediate wall surface; recovers the sparse detection signal into a dense detection signal by using a preset transient signal super-resolution network; and obtains a non-visual field imaging result of the target according to the dense detection signal. The device can therefore fully utilize the information contained in the sparse detection signal to restore it to a dense detection signal for imaging the non-visual field target; it has the characteristics of high detection speed, high imaging precision and wide application range, and overcomes the shortcomings of the existing non-visual field imaging technology.
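The device's three modules compose into a simple pipeline. The sketch below (all names ours) only illustrates the data flow from sparse acquisition through recovery to imaging, with placeholder callables standing in for the real modules:

```python
# Hedged sketch of the device of the second aspect as three composed
# modules: acquisition -> recovery -> imaging.
class NLOSImagingDevice:
    def __init__(self, acquire, recover, image):
        self.acquire = acquire    # sparse detection signal acquisition module
        self.recover = recover    # dense detection signal recovery module
        self.image = image        # imaging module

    def run(self, scene):
        sparse = self.acquire(scene)
        dense = self.recover(sparse)
        return self.image(dense)

# Placeholder modules that merely tag the data as it flows through.
dev = NLOSImagingDevice(lambda s: s + ["sparse"],
                        lambda s: s + ["dense"],
                        lambda s: s + ["image"])
print(dev.run([]))  # ['sparse', 'dense', 'image']
```

In a real deployment, `recover` would wrap the trained transient signal super-resolution network and `image` the f-k solver of step 2-3).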
To achieve the above embodiments, an embodiment of a third aspect of the present invention provides an electronic device, including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions configured to perform a transient signal super resolution network-based non-field of view imaging method as described above.
To achieve the above embodiments, a fourth aspect of the present invention provides a computer-readable storage medium storing computer instructions for causing the computer to execute the above-described non-visual field imaging method based on a transient signal super resolution network.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform a non-visual field imaging method based on a transient signal super resolution network of the above embodiment.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium may even be paper or another suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field-Programmable Gate Arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented as software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (10)

1. A non-visual field imaging method based on a transient signal super-resolution network, characterized by comprising the following steps:
acquiring a sparse detection signal of an object to be imaged for non-visual field imaging through a sparse detection point preset on an intermediate wall surface;
recovering the sparse detection signal into a dense detection signal by using a preset transient signal super-resolution network;
and obtaining a non-visual field imaging result of the target according to the dense detection signal.
2. The method according to claim 1, wherein the transient signal super-resolution network is composed of interpolation branches and transient signal super-resolution branches, and dense detection signals corresponding to the sparse detection signals are obtained by summing output results obtained after the sparse detection signals are respectively input into the two branches;
The interpolation branch comprises a tri-linear interpolation module which is used for performing tri-linear interpolation calculation on the input sparse detection signal;
the transient signal super-resolution branch is used for extracting features of the input sparse detection signals and expanding feature dimensions in a set direction.
3. The method of claim 1, wherein the obtaining a non-field of view imaging of the object from the dense detection signal comprises:
the propagation of light in space is described by a wave model represented by the following formula:

∂²u(x,t)/∂t² = c²Δu(x,t)

where Δ is the Laplace operator, c is the speed of light, and u(x,t) is a complex-valued scalar wave field characterizing the light field at time t and position x;

at any detection point x_d on the intermediate wall surface and any time t, the relation between the wave field u and the detection signal τ is: u(x_d,t) = τ(x_d,t); for the dense detection signal τ_d there is u(x_d,t) = τ_d(x_d,t);

then solving for u(x,0), x ∈ Ω, from u(x_d,t), t ∈ [0,T], to obtain the non-visual field imaging result of the target; where T is the time over which the signal is acquired at any detection point, and Ω is the region to be solved.
4. The method of claim 2, wherein the transient signal super-resolution branch comprises a first three-dimensional convolution layer, 16 identical attention modules, a first upsampling module, a second upsampling module, and a second three-dimensional convolution layer connected in sequence;
the first three-dimensional convolution layer comprises 40 3×3×3 convolution filters, and the output characteristics of the first three-dimensional convolution layer sequentially pass through the 16 attention modules to obtain the output characteristics of the attention modules;
summing the output characteristics of the first three-dimensional convolution layer and the output characteristics of the attention modules, and sequentially passing through the first up-sampling module, the second up-sampling module and the second three-dimensional convolution layer to obtain the output result of the transient signal super-resolution branch; the second three-dimensional convolution layer comprises 24 3×3×3 convolution filters.
5. The method of claim 4, wherein, before recovering the sparse detection signal into a dense detection signal using the preset transient signal super-resolution network, the method further comprises:
training the transient signal super-resolution network;
the training the transient signal super-resolution network comprises the following steps:
constructing a simulation model of the intermediate wall surface containing the sparse detection points, and setting dense detection points in the simulation model of the intermediate wall surface, wherein a set formed by the sparse detection points is a subset of a set formed by the dense detection points;
through simulation, acquiring, respectively at the sparse detection points and the dense detection points, a sparse detection signal and a dense detection signal of a preset virtual non-visual field target for non-visual field imaging, so as to form a training sample;
training samples corresponding to different virtual non-visual field targets form a training set;
constructing the transient signal super-resolution network;
and training the transient signal super-resolution network by using the training set to obtain the trained transient signal super-resolution network.
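The detection-point layout required by the claim (the sparse set being a subset of the dense set) is naturally obtained by taking the sparse points as a strided subsample of the dense grid. The grid size and stride below are illustrative assumptions:

```python
# Sketch of the training-data detection grids: sparse points are a
# strided subset of the dense grid, so the subset property holds by
# construction. n_dense = 64 and stride = 4 are illustrative values.
def make_grids(n_dense=64, stride=4):
    dense = {(i, j) for i in range(n_dense) for j in range(n_dense)}
    sparse = {(i, j) for i in range(0, n_dense, stride)
                     for j in range(0, n_dense, stride)}
    return sparse, dense

sparse, dense = make_grids()
```

With these values the network would learn to recover 64×64 = 4096 dense traces from 16×16 = 256 measured ones.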
6. The method of claim 4, wherein the attention module comprises: a third three-dimensional convolution layer, three sub-branches each connected to the output of the third three-dimensional convolution layer, a first leaky linear rectifying unit connected to the combined output of the three sub-branches, and a ninth three-dimensional convolution layer connected to the output of the first leaky linear rectifying unit;
wherein the three sub-branches comprise: a weight-learning sub-branch, a non-attention sub-branch and an attention sub-branch;
the third three-dimensional convolution layer comprises 40 1×1 convolution filters and a second leaky linear rectifying unit connected in sequence;
the weight-learning sub-branch comprises a three-dimensional average pooling layer, a first fully connected layer, a linear rectifying unit, a second fully connected layer and a softmax layer connected in sequence; the weight-learning sub-branch is used for outputting the weights with which the other two sub-branches in the current attention module are weighted and summed;
the non-attention sub-branch is a fourth three-dimensional convolution layer comprising 40 3×3 convolution filters, and is used for extracting non-attention features of the input data;
the attention sub-branch comprises: a fifth three-dimensional convolution layer, a sixth three-dimensional convolution layer, a sigmoid unit, a seventh three-dimensional convolution layer and an eighth three-dimensional convolution layer; the fifth three-dimensional convolution layer comprises 40 3×3 convolution filters and a third leaky linear rectifying unit connected in sequence, and its input is the input of the sub-branch; the sixth three-dimensional convolution layer comprises 40 1×1 convolution filters, and its input is the output of the fifth three-dimensional convolution layer; the input of the sigmoid unit is the output of the sixth three-dimensional convolution layer; the seventh three-dimensional convolution layer comprises 40 3×3 convolution filters, and its input is the input of the sub-branch; the eighth three-dimensional convolution layer comprises 40 3×3 convolution filters, its input is the element-wise product of the output of the sigmoid unit and the output of the seventh three-dimensional convolution layer, and its output is the output result of the attention sub-branch; the attention sub-branch is used for extracting attention features of the input data;
the output results of the non-attention sub-branch and the attention sub-branch are weighted and summed according to the output result of the weight-learning sub-branch and then input to the first leaky linear rectifying unit; the output of the first leaky linear rectifying unit is passed through the ninth three-dimensional convolution layer and then summed with the input of the attention module to obtain the output of the attention module, wherein the ninth three-dimensional convolution layer comprises 40 1×1 convolution filters.
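How the three sub-branches combine can be shown with a minimal numeric sketch. For illustration only, it assumes scalar per-position features, a two-way softmax from the weight-learning sub-branch, and a leaky slope of 0.1; the 1×1 convolution before the residual sum is omitted. None of these values are specified by the claim:

```python
import math

def softmax(logits):
    # numerically stable softmax, the weight-learning sub-branch output
    m = max(logits)
    e = [math.exp(v - m) for v in logits]
    s = sum(e)
    return [v / s for v in e]

def leaky_relu(x, slope=0.1):
    return x if x >= 0 else slope * x

def attention_module_output(x, feat_plain, feat_attn, logits):
    # weighted sum of the non-attention and attention features, then the
    # leaky rectifier, then a residual connection with the module input
    w_plain, w_attn = softmax(logits)
    mixed = [leaky_relu(w_plain * p + w_attn * a)
             for p, a in zip(feat_plain, feat_attn)]
    return [m + xi for m, xi in zip(mixed, x)]

out = attention_module_output(
    x=[0.5, -0.2],                         # module input (residual path)
    feat_plain=[1.0, 2.0],                 # non-attention sub-branch output
    feat_attn=[3.0, -4.0],                 # attention sub-branch output
    logits=[0.0, 0.0],                     # equal logits: weights 0.5 / 0.5
)
```

With equal weights, the first position mixes to 2.0, passes the rectifier unchanged, and the residual adds 0.5; the second mixes to −1.0, is scaled by the leaky slope to −0.1, and the residual adds −0.2.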
7. The method of claim 5, wherein the sparse and dense detection signals of the preset virtual non-visual field target for non-visual field imaging are obtained using the following expression:

$\tau(x_p,k) = \displaystyle\int_{k\Delta t}^{(k+1)\Delta t}\!\!\int_{S} \frac{\langle v, n(x)\rangle\,\langle v', n_p\rangle}{d(x,x_p)^4}\,\rho(x)\,\delta\big(2\,d(x,x_p) - c\,t\big)\,\mathrm{d}A(x)\,\mathrm{d}t,$

wherein $\tau(x_p,k)$ is the photon intensity accumulated at any detection point $x_p$ over the interval $[k\Delta t,(k+1)\Delta t]$; $S$ is the surface of the non-visual field target facing the intermediate wall; $x$ represents any point on the surface $S$; $\langle\cdot,\cdot\rangle$ represents the inner product in three-dimensional Euclidean space; $v$ and $v'$ respectively represent the unit vectors pointing from the point $x$ to the point $x_p$ and from the point $x_p$ to the point $x$; $n(x)$ and $n_p$ respectively represent the unit normal of the surface $S$ at the point $x$ and the unit normal of the intermediate wall at the point $x_p$ pointing into the non-visual field region; $d(x,x_p)$ represents the distance from the point $x$ on the surface to the point $x_p$ on the intermediate wall; $\delta$ is the Dirac function and $c$ is the speed of light; $\Delta t$ is the time resolution of the detector; $\rho(x)$ is the reflectivity at $x$; $\mathrm{d}A(x)$ is the area measure at the point $x$ on the surface $S$.
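A pure-Python discretization of this transient formation model accumulates each surface patch's contribution into the time bin selected by the Dirac term, with the two cosine (inner-product) factors and the distance falloff. The confocal two-way path $2d$, the patch list, and the constants below are illustrative assumptions:

```python
import math

# Discretized transient rendering: each patch is (position, unit normal,
# reflectivity rho, area dA); its intensity lands in the time bin
# nearest to the two-way travel time 2*d / c.
def render_transient(patches, x_p, n_p, c=3e8, dt=1e-10, n_bins=512):
    tau = [0.0] * n_bins
    for x, n_x, rho, dA in patches:
        d = math.dist(x, x_p)
        v = [(a - b) / d for a, b in zip(x_p, x)]   # unit vector x -> x_p
        cos_x = max(0.0, sum(a * b for a, b in zip(v, n_x)))     # <v, n(x)>
        cos_p = max(0.0, sum(-a * b for a, b in zip(v, n_p)))    # <v', n_p>
        k = round(2 * d / (c * dt))                 # nearest time bin
        if 0 <= k < n_bins:
            tau[k] += rho * cos_x * cos_p * dA / d ** 4
    return tau

# single patch 1.5 m in front of a wall at z = 0, facing the wall
patches = [((0.0, 0.0, 1.5), (0.0, 0.0, -1.0), 1.0, 1.0)]
tau = render_transient(patches, x_p=(0.0, 0.0, 0.0), n_p=(0.0, 0.0, 1.0))
```

With a 100 ps bin width, the two-way path of 3 m corresponds to bin 100, and the recorded intensity is $1/1.5^4$ since both cosine factors equal one.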
8. The method of claim 5, wherein each virtual non-visual field target employs a two-dimensional image or a three-dimensional model, and the virtual non-visual field target is placed in a virtual cuboid space of fixed size during simulation to fix its position, such that the distance from the center point of each virtual non-visual field target to the intermediate wall is the same during simulation.
9. The method of claim 5, wherein training the transient signal super-resolution network using the training set employs the following loss function:

$L(\theta) = \dfrac{1}{N}\displaystyle\sum_{i=1}^{N} \ell_1\big(f_\theta(\tau_i^{\mathrm{s}}),\, \tau_i^{\mathrm{d}}\big),$

wherein $N$ is the total number of training samples; $\ell_1$ represents the pointwise $L_1$ loss function; $f_\theta$ represents the transient signal super-resolution network with parameters to be trained $\theta$; $\tau_i^{\mathrm{s}}$ and $\tau_i^{\mathrm{d}}$ respectively represent the sparse and dense detection signals in the $i$-th training sample.
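This loss is an average, over the training set, of the pointwise $L_1$ distance between the network output on the sparse signal and the ground-truth dense signal. A minimal sketch, in which `network` is a stand-in assumption for the parameterized super-resolution network:

```python
def l1_loss(pred, target):
    # pointwise L1 distance, averaged over signal entries
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def training_loss(network, samples):
    # samples: list of (sparse_signal, dense_signal) pairs
    return sum(l1_loss(network(s), d) for s, d in samples) / len(samples)

# toy check with an identity "network" and hand-made signals
samples = [([1.0, 2.0], [1.0, 3.0]),      # per-sample L1 = (0 + 1)/2 = 0.5
           ([0.0, 0.0], [1.0, 1.0])]      # per-sample L1 = (1 + 1)/2 = 1.0
loss = training_loss(lambda s: s, samples)
```

The $L_1$ choice is common for signal restoration because it penalizes large outlier bins less harshly than a squared loss and tends to preserve sharp transient peaks.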
10. A non-field of view imaging device based on a transient signal super-resolution network, comprising:
the sparse detection signal acquisition module is used for acquiring sparse detection signals of the target to be imaged for non-visual field imaging through sparse detection points preset on the intermediate wall surface;
the dense detection signal recovery module is used for recovering the sparse detection signal into a dense detection signal by utilizing a preset transient signal super-resolution network;
and the imaging module is used for obtaining a non-visual field imaging result of the target according to the dense detection signals.
CN202310282255.0A 2023-03-22 2023-03-22 Non-visual field imaging method and device based on transient signal super-resolution network Active CN115993611B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310282255.0A CN115993611B (en) 2023-03-22 2023-03-22 Non-visual field imaging method and device based on transient signal super-resolution network

Publications (2)

Publication Number Publication Date
CN115993611A true CN115993611A (en) 2023-04-21
CN115993611B CN115993611B (en) 2023-06-20

Family

ID=85993690


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103971125A (en) * 2014-05-05 2014-08-06 四川大学 Super-resolution algorithm based on vibration signal of laser echo
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
CN111127320A (en) * 2019-12-23 2020-05-08 哈尔滨工业大学(威海) Photoacoustic image super-resolution reconstruction method and device based on deep learning
CN112444821A (en) * 2020-11-11 2021-03-05 中国科学技术大学 Remote non-visual field imaging method, apparatus, device and medium
CN112882057A (en) * 2021-01-19 2021-06-01 中国科学院西安光学精密机械研究所 Photon counting non-visual field three-dimensional imaging super-resolution method based on interpolation
US20210349324A1 (en) * 2020-05-08 2021-11-11 The Regents Of The University Of California Multi-lens system for imaging in low light conditions and method
CN113919398A (en) * 2021-10-18 2022-01-11 中国科学院光电技术研究所 Non-vision field target signal identification method based on deep learning
US20220044359A1 (en) * 2018-06-20 2022-02-10 Metawave Corporation Super-resolution radar for autonomous vehicles


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHAO DONG et al.: "Learning a deep convolutional network for image super-resolution", 2014 European Conference on Computer Vision (ECCV), pages 184-199 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant