CN112308941B - Mutual information-based limited view photoacoustic image reconstruction method - Google Patents

Mutual information-based limited view photoacoustic image reconstruction method

Info

Publication number
CN112308941B
CN112308941B (application CN202011215182.6A)
Authority
CN
China
Prior art keywords
frequency domain
time domain
output
sharing module
information sharing
Prior art date
Legal status
Active
Application number
CN202011215182.6A
Other languages
Chinese (zh)
Other versions
CN112308941A (en)
Inventor
高飞
张佳冬
高峰
Current Assignee
ShanghaiTech University
Original Assignee
ShanghaiTech University
Priority date
Filing date
Publication date
Application filed by ShanghaiTech University filed Critical ShanghaiTech University
Priority to CN202011215182.6A
Publication of CN112308941A
Application granted
Publication of CN112308941B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06T 5/80
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10132 Ultrasound image

Abstract

The invention discloses a mutual information-based limited-view photoacoustic image reconstruction method. The invention proposes to obtain an image-domain image and a k-space image with time domain and frequency domain reconstruction algorithms and to use them as the inputs of a post-processing network. The time-frequency domain U-shaped network model designed by the invention combines the tissue structure information of the two inputs well and removes artifacts, while the mutual information constraint provides rich prior knowledge for the network. Experiments prove that, for the limited-view photoacoustic reconstruction problem, the method of the invention outperforms current mature and state-of-the-art reconstruction algorithms.

Description

Mutual information-based limited view photoacoustic image reconstruction method
Technical Field
The invention relates to a limited-view photoacoustic image reconstruction method that takes time domain and frequency domain inputs and compensates prior knowledge by means of a mutual information constraint.
Background
Photoacoustic imaging is a non-invasive medical imaging modality that combines the high contrast of optical imaging with the deep penetration of ultrasound, and it has recently attracted widespread attention from researchers at home and abroad. Based on the photoacoustic effect, biological tissue generates ultrasonic signals under optical excitation; after these signals are received by surrounding ultrasonic probes, a photoacoustic image of the tissue can be obtained through a reconstruction algorithm.
As an important photoacoustic imaging modality, photoacoustic computed tomography (PAT) is developing rapidly owing to its fast imaging speed, among other merits. At present, the photoacoustic images obtained by PAT have shown clear advantages in preclinical and clinical applications such as blood oxygen saturation quantification, small animal imaging, and the benign/malignant assessment of breast tumors. The accuracy of the diagnostic result depends largely on the quality of the reconstructed photoacoustic image, which means that the photoacoustic reconstruction algorithm determines the value of photoacoustic equipment in clinical applications.
At present, the quality of photoacoustic images obtained by traditional reconstruction algorithms is low. These reconstruction algorithms often introduce artifact information, which makes diagnosis difficult for physicians. In practical clinical applications in particular, the ultrasonic probe often covers only a portion of the patient because of space and detection environment limitations, resulting in a limited view. Using traditional reconstruction algorithms may then cause the biological tissue information to become mixed with, and indistinguishable from, the artifacts.
Disclosure of Invention
The invention aims to solve the following technical problem: under limited-view conditions, existing photoacoustic image reconstruction algorithms cause information loss and introduce artifact information.
In order to solve the above technical problem, the technical scheme of the invention provides a mutual information-based limited-view photoacoustic image reconstruction method, characterized by comprising the following steps:
a DAS image x_i is obtained from the ultrasonic signals received by the ultrasonic probe using a time domain reconstruction algorithm, and at the same time a k-space image x_k is obtained using a frequency domain reconstruction algorithm;
the DAS image x_i and the k-space image x_k are input into a time-frequency domain U-shaped network model and reconstructed to obtain a photoacoustic image, the time-frequency domain U-shaped network model adopting an encoder-decoder structure:
after entering the encoder, the DAS image x_i undergoes a fast Fourier transform and the real part of the complex result is taken; the real part is concatenated along the channel dimension with the k-space image x_k fed to the encoder and input to the first information sharing module; likewise, after entering the encoder, the k-space image x_k undergoes a fast Fourier transform and the real part of the complex result is taken; the real part is concatenated along the channel dimension with the DAS image x_i fed to the encoder and input to the first information sharing module;
the first time domain output and the first frequency domain output obtained from the first information sharing module pass through the first max pooling layer and serve as the time domain input and the frequency domain input of the second information sharing module; the second time domain output and the second frequency domain output obtained from the second information sharing module pass through the second max pooling layer and serve as the time domain input and the frequency domain input of the third information sharing module; the third time domain output and the third frequency domain output obtained from the third information sharing module pass through the third max pooling layer and serve as the time domain input and the frequency domain input of the fourth information sharing module; after the fourth time domain output and the fourth frequency domain output of the fourth information sharing module pass through the fourth max pooling layer, the fourth frequency domain output is inverse-Fourier-transformed and concatenated with the fourth time domain output along the channel dimension, and the intermediate latent representation z_2 is then obtained by convolution kernel calculation; the intermediate latent representation z_2 is the final output of the encoder;
the first, second, third and fourth information sharing modules perform the same processing on their time domain and frequency domain inputs; defining any one of them as an information sharing module, the processing comprises the following steps:
the information sharing module extracts shallow features of the time domain input and outputs them to the decoder;
the information sharing module interactively fuses the extracted high-level features of the time domain input, the extracted high-level features of the frequency domain input after transformation to the time domain, and the original features of the time domain input, to form the first, second, third or fourth time domain output;
the information sharing module interactively fuses the extracted high-level features of the frequency domain input, the extracted high-level features of the time domain input after transformation to the frequency domain, and the original features of the frequency domain input, to form the first, second, third or fourth frequency domain output;
the decoder restores the intermediate latent representation z_2 output by the encoder to a photoacoustic image.
Preferably, the first max pooling layer, the second max pooling layer, the third max pooling layer and the fourth max pooling layer are all 2×2 max pooling layers.
Preferably, the information sharing module outputs the time domain input directly to extract the shallow features of the time domain input, and this direct output is connected to the corresponding decoder part after two 3×3 convolution kernel calculations so as to preserve the shallow time domain information, with a batch normalization and a ReLU activation operation added after each convolution operation.
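Since the pattern "two 3×3 convolutions, each followed by batch normalization and ReLU" recurs throughout the network, a minimal PyTorch sketch of this building block is given below; the class name DoubleConv and the padding choice are our own assumptions, not taken from the patent.

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by batch normalization and a
    ReLU activation: the repeated building block described above.
    padding=1 (spatial size preserved) is an assumption."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```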
Preferably, within the information sharing module, the time domain input passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to extract the high-level features of the time domain input, with a batch normalization and a ReLU activation operation added after each convolution operation;
the time domain input is fast-Fourier-transformed within the information sharing module and then passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to obtain the high-level features of the time domain input after transformation to the frequency domain, with a batch normalization and a ReLU activation operation added after each convolution operation;
the time domain input also undergoes no computation within the information sharing module so as to preserve the original features of the time domain input.
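The "residual convolution layer" is not spelled out further in the text; one plausible reading, sketched below under that assumption, is a 3×3 convolution with batch normalization whose input is added back before the ReLU.

```python
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    """One possible 'residual convolution layer': a 3x3 convolution
    with batch normalization whose input is added back to its output
    before the ReLU. The exact residual design is an assumption."""
    def __init__(self, ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv(x))
```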
Preferably, the high-level features of the time domain input, the high-level features of the frequency domain input after transformation to the time domain, and the original features of the time domain input are concatenated along the channel dimension and then passed through two 3×3 convolution kernel calculations to obtain the first, second, third or fourth time domain output, with a batch normalization and a ReLU activation operation added after each convolution operation.
Preferably, within the information sharing module, the frequency domain input passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to extract the high-level features of the frequency domain input, with a batch normalization and a ReLU activation operation added after each convolution operation;
the frequency domain input is inverse-fast-Fourier-transformed within the information sharing module and then passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to obtain the high-level features of the frequency domain input after transformation to the time domain, with a batch normalization and a ReLU activation operation added after each convolution operation;
the frequency domain input also undergoes no computation within the information sharing module so as to preserve the original features of the frequency domain input.
Preferably, the high-level features of the frequency domain input, the high-level features of the time domain input after transformation to the frequency domain, and the original features of the frequency domain input are concatenated along the channel dimension and then passed through two 3×3 convolution kernel calculations to obtain the first, second, third or fourth frequency domain output, with a batch normalization and a ReLU activation operation added after each convolution operation.
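Putting the preceding paragraphs together, one information sharing module can be sketched as follows. The sketch reuses the DoubleConv and ResidualConvBlock helpers from the earlier sketches; the class name, the channel bookkeeping, and taking the real part after each FFT/IFFT follow our reading of the text and are assumptions rather than the patent's own code.

```python
import torch
import torch.nn as nn

class InformationSharingBlock(nn.Module):
    """Sketch of one ISB: a shallow skip path, deep time/frequency
    branches, FFT/IFFT cross branches, and two fusion convolutions.
    Assumes DoubleConv and ResidualConvBlock as sketched earlier."""
    def __init__(self, t_in: int, f_in: int, out_ch: int):
        super().__init__()
        def deep(in_ch):
            return nn.Sequential(DoubleConv(in_ch, out_ch),
                                 ResidualConvBlock(out_ch),
                                 ResidualConvBlock(out_ch))
        self.shallow = DoubleConv(t_in, out_ch)   # skip connection to the decoder
        self.t_deep, self.t2f = deep(t_in), deep(t_in)
        self.f_deep, self.f2t = deep(f_in), deep(f_in)
        self.t_fuse = DoubleConv(2 * out_ch + t_in, out_ch)
        self.f_fuse = DoubleConv(2 * out_ch + f_in, out_ch)

    def forward(self, t, f):
        skip = self.shallow(t)                         # shallow time domain features
        t_feat = self.t_deep(t)                        # high-level time domain features
        t_in_freq = self.t2f(torch.fft.fft2(t).real)   # time -> frequency branch
        f_feat = self.f_deep(f)                        # high-level frequency domain features
        f_in_time = self.f2t(torch.fft.ifft2(f).real)  # frequency -> time branch
        t_out = self.t_fuse(torch.cat([t_feat, f_in_time, t], dim=1))
        f_out = self.f_fuse(torch.cat([f_feat, t_in_freq, f], dim=1))
        return skip, t_out, f_out
```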
Preferably, the intermediate latent representation z_2 undergoes the following steps in the decoder (a generic decoder stage is sketched after step 4):
step 1: the intermediate latent representation z_2 first passes through an up-sampling convolution layer, is then concatenated along the channel dimension with the shallow features output by the fourth information sharing module, passes through two 3×3 convolution kernel calculations, and then through an up-sampling convolution layer operation, with a batch normalization and a ReLU activation operation added after each convolution operation;
step 2: the output of step 1 is concatenated along the channel dimension with the shallow features output by the third information sharing module, then passes through two 3×3 convolution kernel calculations and an up-sampling convolution layer operation, with a batch normalization and a ReLU activation operation added after each convolution operation;
step 3: the output of step 2 is concatenated along the channel dimension with the shallow features output by the second information sharing module, then passes through two 3×3 convolution kernel calculations and an up-sampling convolution layer operation, with a batch normalization and a ReLU activation operation added after each convolution operation;
step 4: the output of step 3 is concatenated along the channel dimension with the shallow features output by the first information sharing module, then passes through two 3×3 convolution kernel calculations, and a final convolution kernel calculation yields the final photoacoustic image, with a batch normalization and a ReLU activation operation added after each convolution operation.
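A generic decoder stage under the steps above can be sketched as follows; using a transposed convolution for the "up-sampling convolution layer" is an assumption, and the DoubleConv helper is the one sketched earlier.

```python
import torch
import torch.nn as nn

class DecoderStage(nn.Module):
    """One decoder step: 2x2 up-sampling convolution, channel
    concatenation with the matching skip features, then two 3x3
    conv-BN-ReLU layers. The final stage is followed by an extra
    1x1 convolution that maps to the output image."""
    def __init__(self, in_ch: int, up_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, up_ch, kernel_size=2, stride=2)
        self.conv = DoubleConv(up_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(x)
        return self.conv(torch.cat([x, skip], dim=1))
```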
Preferably, in the training phase of the time-frequency domain U-shaped network model, a mutual information constraint on the intermediate latent variables is used to compensate prior knowledge.
Preferably, using the mutual information constraint on the intermediate latent variables to compensate prior knowledge specifically comprises the following steps:
In the first step, the reconstruction objective function of the time-frequency domain U-shaped network model is given by the following formula (1):

$$\hat{\theta}=\arg\min_{\theta}\ \mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{1}$$

In formula (1), θ represents all trainable parameters in the time-frequency domain U-shaped network model; y represents the real photoacoustic image; 𝔼[·] represents the mathematical expectation; f_θ(·) represents the time-frequency domain U-shaped network model; ℒ(·,·) represents the reconstruction loss.

In the second step, an auxiliary network is pre-trained to provide the intermediate latent representation z_1 used in the mutual information constraint.

In the third step, after the mutual information constraint is added, the objective function of the time-frequency domain U-shaped network model in the training stage becomes the following formula (2):

$$\hat{\theta}=\arg\max_{\theta}\ \Big\{\lambda\, I(z_1,z_2)-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big]\Big\} \tag{2}$$

In formula (2), I(·,·) represents mutual information and λ represents the penalty coefficient of the mutual information constraint.
In the fourth step, the variational lower bound of the objective function of formula (2) is optimized: let p(z_1|z_2) denote the true conditional distribution and approximate it by a variational distribution q(z_1|z_2); the mutual information term I(z_1, z_2) is then represented by the following formula (3):

$$I(z_1,z_2)=H(z_1)-H(z_1\mid z_2)=H(z_1)+\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]+\mathbb{E}_{z_2}\big[D_{\mathrm{KL}}\big(p(z_1\mid z_2)\,\|\,q(z_1\mid z_2)\big)\big] \tag{3}$$

In formula (3), H(z_1) represents the information entropy of the intermediate latent representation z_1; H(z_1|z_2) represents the conditional entropy of z_1 given the intermediate latent representation z_2; 𝔼_{z_1,z_2}[·] represents the mathematical expectation over z_1 and z_2; D_KL(·‖·) represents the KL divergence. Since z_1 is provided by the fixed pre-trained network, H(z_1) is a constant, and the KL divergence is non-negative; dropping both terms in formula (3) yields the variational lower bound of the reconstruction objective function, given by the following formula (4):

$$\hat{\mathcal{L}}(\theta)=\lambda\,\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{4}$$

In the fifth step, a Gaussian distribution is selected as the variational distribution q(z_1|z_2), as shown in the following formula (5):

$$q(z_1\mid z_2)=\mathcal{N}\big(z_1\,\big|\,\mu(z_2),\,\sigma^{2}(z_2)\big) \tag{5}$$

In formula (5), μ(z_2) represents the mean of the Gaussian distribution, computed from the intermediate latent variable z_2 by a 1×1 convolution layer; σ²(z_2) represents the variance of the Gaussian distribution, computed per channel using the following formula (6):

$$\sigma_c^{2}=\log\big(1+e^{\alpha_c}\big)+\varepsilon,\qquad \alpha\in\mathbb{R}^{C},\ c=1,\dots,C \tag{6}$$

In formula (6), C, H and W represent the channel number, height and width of z_1 and z_2; ε is a positive constant; ℝ^C represents the real space of dimension C, in which the learnable parameter α lives.

In the sixth step, after substituting formula (6) into formula (5) and taking the logarithm, the variational distribution q(z_1|z_2) gives the following formula (7):

$$\log q(z_1\mid z_2)=-\sum_{c=1}^{C}\sum_{h=1}^{H}\sum_{w=1}^{W}\left(\log\sigma_c+\frac{\big(z_{1,c,h,w}-\mu_{c,h,w}(z_2)\big)^{2}}{2\sigma_c^{2}}\right)+\mathrm{const} \tag{7}$$

In formula (7), const is a constant. Formula (7) expresses the intractable mutual information term as a variational mutual information term that is easy to solve and implement; through formula (7), the intermediate latent variable of the time-frequency domain U-shaped network model can learn more supervision information from the pre-trained encoder, achieving the effect of prior knowledge compensation.
The invention proposes to obtain an image-domain image and a k-space image with time domain and frequency domain reconstruction algorithms and to use them as the inputs of a post-processing network. The time-frequency domain U-shaped network model designed by the invention combines the tissue structure information of the two inputs well and removes artifacts, while the mutual information constraint provides rich prior knowledge for the network. Experiments prove that, for the limited-view photoacoustic reconstruction problem, the method of the invention outperforms current mature and state-of-the-art reconstruction algorithms.
Drawings
FIG. 1 is an overall schematic diagram of DuDoUnet;
FIG. 2 is a schematic diagram of the structure of an ISB;
FIG. 3 is an overall schematic diagram of a mutual information constraint framework;
FIG. 4 shows the results of a simulation experiment (three regions with evident reconstruction differences are marked and enlarged below each image).
Detailed Description
The invention will be further illustrated below with reference to specific examples. It should be understood that these examples are only intended to illustrate the invention and not to limit its scope. Furthermore, it should be understood that, after reading the teachings of the invention, those skilled in the art may make various changes and modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended hereto.
Aiming at the limited-view photoacoustic reconstruction problem, the invention provides a mutual information-based limited-view photoacoustic image reconstruction method, which comprises the following steps:
A DAS image x_i is obtained from the ultrasonic signals received by the ultrasonic probe using a time domain reconstruction algorithm, and at the same time a k-space image x_k is obtained using a frequency domain reconstruction algorithm. The two reconstruction algorithms reconstruct the same biological tissue information but introduce different artifact information, which teaches the post-processing network, at the input level, which parts are tissue and which are artifacts.
In order to further distinguish the two kinds of information, the invention designs a variant of Unet, namely a time-frequency domain U-shaped network (hereinafter the DuDoUnet model); the overall structure is shown in FIG. 1.
The DuDoUnet model takes as inputs the DAS image x_i obtained by the time domain reconstruction algorithm and the k-space image x_k obtained by the frequency domain reconstruction algorithm. In FIG. 1, FFT and IFFT denote the fast Fourier transform and the inverse fast Fourier transform, respectively, and Concat denotes concatenation of inputs along the channel dimension. This allows the image-domain and k-space inputs to share information and distinguish artifacts from each other. The whole DuDoUnet model is an encoder-decoder structure, where the encoder architecture and operational flow are as follows:
The DAS image x_i has size 128×128×1 and the k-space image x_k has size 128×128×2, where the two channels are the real and imaginary parts of the k-space image x_k. The DAS image x_i is fast-Fourier-transformed and the real part of the complex result is taken, giving a size of 128×128×1; after channel concatenation with the k-space image x_k, the size becomes 128×128×3. Likewise, the k-space image x_k is fast-Fourier-transformed and the real part of the complex result is taken, giving a size of 128×128×1; after channel concatenation with the DAS image x_i, the size becomes 128×128×2.
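Under the sizes above, the input stage can be sketched as follows. Treating the two k-space channels as one complex image before its FFT, and the assignment of the two concatenations to the time domain and frequency domain ports of the first information sharing module, are our assumptions.

```python
import torch

def prepare_dual_domain_inputs(x_i: torch.Tensor, x_k: torch.Tensor):
    """x_i: DAS image (B, 1, 128, 128); x_k: k-space image (B, 2, 128, 128)
    with real and imaginary channels. Returns the 3-channel and
    2-channel concatenations described in the text."""
    x_i_fft = torch.fft.fft2(x_i).real                  # (B, 1, 128, 128)
    branch_a = torch.cat([x_i_fft, x_k], dim=1)         # (B, 3, 128, 128)

    x_k_complex = torch.complex(x_k[:, 0:1], x_k[:, 1:2])
    x_k_fft = torch.fft.fft2(x_k_complex).real          # (B, 1, 128, 128)
    branch_b = torch.cat([x_k_fft, x_i], dim=1)         # (B, 2, 128, 128)
    return branch_a, branch_b
```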
The time-frequency domain inputs then enter the first information sharing module (Information Sharing Block, hereinafter ISB). The specific structure of the ISB is shown in FIG. 2.
The ISB has two inputs: a time domain input and a frequency domain input. Within the ISB, the time domain input serves two different tasks:
Task 1: direct output. This part is used to extract shallow features in the time domain.
Task 2: split into three parts for time domain feature extraction and time-to-frequency domain feature conversion, further comprising:
task 2.1: two 3 x 3 convolution kernel calculations and two residual convolution layer calculations are performed. Batch normalization and ReLU activation operations are additionally added after each convolution operation. This part is used to extract high-level features of the time domain.
Task 2.2: and directly outputting. This section is used to preserve the original features of the temporal input layer.
Task 2.3: after fast fourier transformation, two 3×3 convolution kernel calculations and two residual convolution layer calculations are performed. This part is used to extract the high-level features after the time domain transformation to the frequency domain.
It is noted that although task 1 and task 2.2 are identical, the corresponding output operations are different and play a different role.
The frequency domain input of the ISB completes only one task, namely task 3:
Task 3: split into three parts for frequency domain feature extraction and frequency-to-time domain feature conversion, further comprising:
Task 3.1: two 3×3 convolution kernel calculations and two residual convolution layer calculations are performed. This part extracts the high-level features of the frequency domain input.
Task 3.2: direct output. This part preserves the original features of the frequency domain input.
Task 3.3: after an inverse fast Fourier transform, two 3×3 convolution kernel calculations and two residual convolution layer calculations are performed. This part extracts the high-level features of the frequency domain input after transformation to the time domain.
The ISB has three outputs: a connection output, a time domain output, and a frequency domain output:
Connection output: the output of task 1 is passed through two 3×3 convolution kernel calculations and then connected to the corresponding decoder part, preserving the shallow time domain information.
Time domain output: the outputs of tasks 2.1, 2.2 and 3.3 are concatenated along the channel dimension and output after two 3×3 convolution kernel calculations, ensuring interactive fusion of the time domain information.
Frequency domain output: the outputs of tasks 3.1, 3.2 and 2.3 are concatenated along the channel dimension and output after two 3×3 convolution kernel calculations, ensuring interactive fusion of the frequency domain information.
The ISB is a universal module within the DuDoUnet model: it adapts to different numbers of input channels and outputs features with the required number of channels. Its time-frequency domain transformations allow the photoacoustic image information of the image domain and k-space to be shared, while keeping the time domain and frequency domain tasks from interfering with each other.
After the first ISB, the first connection output, first time domain output and first frequency domain output all have size 128×128×64; after a 2×2 max pooling layer, the time domain and frequency domain outputs become 64×64×64. These two outputs serve as the time domain and frequency domain inputs of the second ISB; after the second ISB's computation, the second connection output, second time domain output and second frequency domain output all have size 64×64×128. The second time domain and frequency domain outputs pass through a 2×2 max pooling layer and become 32×32×128. They then serve as the time domain and frequency domain inputs of the third ISB; after the third ISB's computation, the third outputs have size 32×32×256. The third time domain and frequency domain outputs pass through a 2×2 max pooling layer and become 16×16×256. They then serve as the time domain and frequency domain inputs of the fourth ISB; after the fourth ISB's computation, the fourth outputs have size 16×16×512. The fourth time domain and frequency domain outputs pass through a 2×2 max pooling layer and become 8×8×512. The fourth frequency domain output is inverse-Fourier-transformed and concatenated with the fourth time domain output along the channel dimension, giving a size of 8×8×1024. After two 3×3 convolution kernel calculations, the intermediate latent representation z_2, i.e., the final output of the encoder, is obtained with size 8×8×512.
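The encoder tail just described can be sketched with the DoubleConv helper from earlier; the class name is our own.

```python
import torch
import torch.nn as nn

class EncoderBottleneck(nn.Module):
    """Sketch of the encoder tail: the pooled fourth frequency domain
    output is brought back to the time domain by an inverse FFT,
    concatenated with the pooled fourth time domain output, and reduced
    by two 3x3 convolutions to z2 (512 x 8 x 8 for 128 x 128 inputs)."""
    def __init__(self, ch: int = 512):
        super().__init__()
        self.conv = DoubleConv(2 * ch, ch)

    def forward(self, t4, f4):
        f_time = torch.fft.ifft2(f4).real
        return self.conv(torch.cat([t4, f_time], dim=1))
```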
The specific structure and operational flow of the decoder are as follows:
The intermediate latent representation z_2 passes through a 2×2 up-sampling convolution layer and becomes 16×16×512; it is concatenated along the channel dimension with the fourth connection output of the fourth ISB, giving an output of size 16×16×1024. This output passes through two 3×3 convolution kernel calculations and becomes 16×16×512, then through a 2×2 up-sampling convolution layer operation to become 32×32×256. It is concatenated along the channel dimension with the third connection output of the third ISB, giving an output of size 32×32×512. This output passes through two 3×3 convolution kernel calculations and becomes 32×32×256, then through a 2×2 up-sampling convolution layer operation to become 64×64×128. It is concatenated along the channel dimension with the second connection output of the second ISB, giving an output of size 64×64×256. This output passes through two 3×3 convolution kernel calculations and becomes 64×64×128, then through a 2×2 up-sampling convolution layer operation to become 128×128×64. It is concatenated along the channel dimension with the first connection output of the first ISB, giving an output of size 128×128×128. This output passes through two 3×3 convolution kernel calculations and becomes 128×128×64. Finally, the output passes through one 1×1 convolution kernel calculation, giving a final output of size 128×128×1, the same size as the ground-truth image (GT).
We use the real photoacoustic image y as the supervision label, and denote the DuDoUnet model by f_θ(·), where θ represents all trainable parameters of the DuDoUnet model. The overall reconstruction objective function of DuDoUnet can be expressed as:

$$\hat{\theta}=\arg\min_{\theta}\ \mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{1}$$

In the experiments, we selected the mean square error (MSE) as the reconstruction loss function ℒ.
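In code, equation (1) with the MSE loss amounts to the following sketch; `model` stands for f_θ and is a placeholder name.

```python
import torch.nn.functional as F

def reconstruction_loss(model, x_i, x_k, y):
    """Equation (1) with the mean square error loss used in the
    experiments; `model` is the DuDoUnet f_theta (placeholder name)."""
    y_hat = model(x_i, x_k)
    return F.mse_loss(y_hat, y)
```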
Considering that the limited-view photoacoustic inputs can only learn prior knowledge through the supervision labels, the gradient-vanishing problem easily occurs as the network deepens; meanwhile, if the decoder of the network is strong enough, the encoder cannot learn useful information, and the reconstruction model degenerates into a generative model. To address this, the invention proposes a mutual information constraint on the intermediate latent variables to compensate prior knowledge during model training. The overall framework is shown in FIG. 3.
In order for the encoder of the DuDoUnet model to learn rich prior knowledge, an auxiliary network (an auto-encoder in our experiments) is pre-trained to provide an intermediate latent representation z_1, i.e., the output of the auto-encoder's encoder; mutual information is then used to constrain z_1 and the intermediate latent representation z_2 extracted by the DuDoUnet model (the output of the DuDoUnet encoder). The mutual information constraint ensures that the DuDoUnet encoder not only learns prior knowledge from the supervision labels but also learns additional prior knowledge through the intermediate latent representation, achieving the effect of prior knowledge compensation.
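The architecture of the auxiliary auto-encoder is not specified in the text, so the minimal stand-in below is purely illustrative; its only relevant property is that z_1 matches the 512×8×8 shape of z_2.

```python
import torch.nn as nn

class AuxiliaryAutoencoder(nn.Module):
    """Illustrative stand-in for the pre-trained auxiliary auto-encoder
    that supplies z1; trained to reconstruct the ground-truth image y."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 512, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(512, 256, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 1, 2, stride=2),
        )

    def forward(self, y):
        z1 = self.encoder(y)            # (B, 512, 8, 8) for 128x128 input
        return self.decoder(z1), z1
```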
After the mutual information constraint is added, the overall objective function in the model training stage can be expressed as:

$$\hat{\theta}=\arg\max_{\theta}\ \Big\{\lambda\, I(z_1,z_2)-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big]\Big\} \tag{2}$$

In formula (2), I(·,·) denotes mutual information and λ denotes the penalty coefficient of the mutual information constraint.

All parameters of the model can be obtained by maximizing equation (2). However, directly solving equation (2) is very difficult, so we only optimize its variational lower bound. We use p(z_1|z_2) to denote the true conditional distribution; since p(z_1|z_2) is usually unknown, we use a variational distribution q(z_1|z_2) as an approximate substitute for it. The mutual information term can thus be expressed as:

$$I(z_1,z_2)=H(z_1)-H(z_1\mid z_2)=H(z_1)+\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]+\mathbb{E}_{z_2}\big[D_{\mathrm{KL}}\big(p(z_1\mid z_2)\,\|\,q(z_1\mid z_2)\big)\big] \tag{3}$$

In formula (3), H(z_1) denotes the information entropy of the intermediate latent representation z_1; H(z_1|z_2) denotes the conditional entropy of z_1 given the intermediate latent representation z_2; 𝔼_{z_1,z_2}[·] denotes the mathematical expectation over z_1 and z_2; D_KL(·‖·) denotes the KL divergence. Because z_1 is an intermediate latent representation provided by the pre-trained network, z_1 is fixed during training and H(z_1) is a constant; the KL divergence is non-negative. We can therefore drop both terms and obtain the variational lower bound of the objective function shown in equation (4):

$$\hat{\mathcal{L}}(\theta)=\lambda\,\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{4}$$

We choose a Gaussian distribution as the variational distribution q(z_1|z_2):

$$q(z_1\mid z_2)=\mathcal{N}\big(z_1\,\big|\,\mu(z_2),\,\sigma^{2}(z_2)\big) \tag{5}$$

where μ(z_2) and σ²(z_2) denote the mean and variance of the Gaussian distribution. The mean μ(z_2) is obtained from the intermediate latent variable z_2 by a 1×1 convolution layer. Computing the variance σ²(z_2) with the same method would cause instability in network training, so we instead compute the variance per channel with equation (6):

$$\sigma_c^{2}=\log\big(1+e^{\alpha_c}\big)+\varepsilon,\qquad \alpha\in\mathbb{R}^{C},\ c=1,\dots,C \tag{6}$$

In formula (6), C, H and W denote the channel number, height and width of z_1 and z_2; ε is a positive constant; ℝ^C denotes the real space of dimension C, in which the learnable parameter α lives.

Substituting formula (6) into formula (5) and taking the logarithm, the variational mutual information term can be expressed as:

$$\log q(z_1\mid z_2)=-\sum_{c=1}^{C}\sum_{h=1}^{H}\sum_{w=1}^{W}\left(\log\sigma_c+\frac{\big(z_{1,c,h,w}-\mu_{c,h,w}(z_2)\big)^{2}}{2\sigma_c^{2}}\right)+\mathrm{const} \tag{7}$$

In formula (7), const is a constant. Unlike MSE, formula (7) allows the network to impose penalty constraints of different strengths on different channels; when the variance σ²(z_2) is a constant equal to 1, formula (7) and MSE are equivalent up to a constant. Formula (7) expresses the intractable mutual information term as a variational term that is easy to solve and to implement in code. Through formula (7), the intermediate latent variable of the DuDoUnet model can learn more supervision information from the pre-trained encoder, achieving the effect of prior knowledge compensation.
To illustrate the effectiveness of the invention, we performed experimental verification on a public data set. We picked a total of 2000 samples, of which 1500 are used for training and the remaining 500 for testing. We use the k-Wave photoacoustic simulation toolbox to generate photoacoustic signals and obtain x_i and x_k through the time domain and frequency domain reconstruction algorithms. We then obtain the reconstructed photoacoustic image using the mutual information-constrained DuDoUnet model. In this experiment, the model parameters λ and ε were each set to 1.
To further compare the effectiveness of the proposed method and model, we choose the commonly used reconstruction models Unet, Ynet and DEUnet for comparison. Unet is divided into Unet#1 and Unet#2: Unet#1 takes only x_i as input, while Unet#2 takes both x_i and x_k but uses only one encoder. The Ynet and DEUnet models have dual-encoder structures, in which x_i and x_k are each processed by a dedicated encoder.
We evaluate the reconstruction results quantitatively using the structural similarity (SSIM) and the peak signal-to-noise ratio (PSNR). (The quantitative comparison table appears only as an image in the original document and is not reproduced here.)
Even without the mutual information constraint, the reconstruction results of the DuDoUnet model exceed those of the other comparison models; after the mutual information constraint is added, the model obtains even better reconstruction results.
We further compare the reconstruction results visually through the reconstructed photoacoustic images shown in FIG. 4. As can be seen from FIG. 4, under the same input conditions, DuDoUnet and DuDoUnet+MI restore the structural information of the blood vessels better.
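For reference, the PSNR/SSIM evaluation can be performed with scikit-image; the sketch below assumes 2-D float images with a meaningful data range.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(y_true: np.ndarray, y_pred: np.ndarray):
    """PSNR/SSIM as used in the quantitative comparison."""
    data_range = float(y_true.max() - y_true.min())
    psnr = peak_signal_noise_ratio(y_true, y_pred, data_range=data_range)
    ssim = structural_similarity(y_true, y_pred, data_range=data_range)
    return psnr, ssim
```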

Claims (10)

1. A mutual information-based limited-view photoacoustic image reconstruction method, characterized by comprising the following steps:
a DAS image x_i is obtained from the ultrasonic signals received by the ultrasonic probe using a time domain reconstruction algorithm, and at the same time a k-space image x_k is obtained using a frequency domain reconstruction algorithm;
the DAS image x_i and the k-space image x_k are input into a time-frequency domain U-shaped network model and reconstructed to obtain a photoacoustic image, the time-frequency domain U-shaped network model adopting an encoder-decoder structure:
after entering the encoder, the DAS image x_i undergoes a fast Fourier transform and the real part of the complex result is taken; the real part is concatenated along the channel dimension with the k-space image x_k fed to the encoder and input to the first information sharing module; after entering the encoder, the k-space image x_k undergoes a fast Fourier transform and the real part of the complex result is taken; the real part is concatenated along the channel dimension with the DAS image x_i fed to the encoder and input to the first information sharing module;
the first time domain output and the first frequency domain output obtained from the first information sharing module pass through the first max pooling layer and serve as the time domain input and the frequency domain input of the second information sharing module; the second time domain output and the second frequency domain output obtained from the second information sharing module pass through the second max pooling layer and serve as the time domain input and the frequency domain input of the third information sharing module; the third time domain output and the third frequency domain output obtained from the third information sharing module pass through the third max pooling layer and serve as the time domain input and the frequency domain input of the fourth information sharing module; after the fourth time domain output and the fourth frequency domain output of the fourth information sharing module pass through the fourth max pooling layer, the fourth frequency domain output is inverse-Fourier-transformed and concatenated with the fourth time domain output along the channel dimension, and the intermediate latent representation z_2 is then obtained by convolution kernel calculation, the intermediate latent representation z_2 being the final output of the encoder;
the first, second, third and fourth information sharing modules perform the same processing on their time domain and frequency domain inputs; defining any one of them as an information sharing module, the processing comprises the following steps:
the information sharing module extracts shallow features of the time domain input and outputs them to the decoder;
the information sharing module interactively fuses the extracted high-level features of the time domain input, the extracted high-level features of the frequency domain input after transformation to the time domain, and the original features of the time domain input, to form the first, second, third or fourth time domain output;
the information sharing module interactively fuses the extracted high-level features of the frequency domain input, the extracted high-level features of the time domain input after transformation to the frequency domain, and the original features of the frequency domain input, to form the first, second, third or fourth frequency domain output;
the decoder restores the intermediate latent representation z_2 output by the encoder to a photoacoustic image.
2. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein the first, second, third and fourth max pooling layers are all 2×2 max pooling layers.
3. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein the information sharing module outputs the time domain input directly to extract the shallow features of the time domain input, and the direct output is connected to the corresponding decoder part after two 3×3 convolution kernel calculations so as to preserve the shallow time domain information, a batch normalization and a ReLU activation operation being added after each convolution operation.
4. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein, within the information sharing module, the time domain input passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to extract the high-level features of the time domain input, a batch normalization and a ReLU activation operation being added after each convolution operation;
the time domain input is fast-Fourier-transformed within the information sharing module and then passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to obtain the high-level features of the time domain input after transformation to the frequency domain, a batch normalization and a ReLU activation operation being added after each convolution operation;
the time domain input also undergoes no computation within the information sharing module so as to preserve the original features of the time domain input.
5. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein the high-level features of the time domain input, the high-level features of the frequency domain input after transformation to the time domain, and the original features of the time domain input are concatenated along the channel dimension and then passed through two 3×3 convolution kernel calculations to obtain the first, second, third or fourth time domain output, a batch normalization and a ReLU activation operation being added after each convolution operation.
6. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein, within the information sharing module, the frequency domain input passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to extract the high-level features of the frequency domain input, a batch normalization and a ReLU activation operation being added after each convolution operation;
the frequency domain input is inverse-fast-Fourier-transformed within the information sharing module and then passes through two 3×3 convolution kernel calculations and two residual convolution layer calculations to obtain the high-level features of the frequency domain input after transformation to the time domain, a batch normalization and a ReLU activation operation being added after each convolution operation;
the frequency domain input also undergoes no computation within the information sharing module so as to preserve the original features of the frequency domain input.
7. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein the high-level features of the frequency domain input, the high-level features of the time domain input after transformation to the frequency domain, and the original features of the frequency domain input are concatenated along the channel dimension and then passed through two 3×3 convolution kernel calculations to obtain the first, second, third or fourth frequency domain output, a batch normalization and a ReLU activation operation being added after each convolution operation.
8. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein the intermediate latent representation z_2 undergoes the following steps in the decoder:
step 1: the intermediate latent representation z_2 first passes through an up-sampling convolution layer, is then concatenated along the channel dimension with the shallow features output by the fourth information sharing module, passes through two 3×3 convolution kernel calculations, and then through an up-sampling convolution layer operation, a batch normalization and a ReLU activation operation being added after each convolution operation;
step 2: the output of step 1 is concatenated along the channel dimension with the shallow features output by the third information sharing module, then passes through two 3×3 convolution kernel calculations and an up-sampling convolution layer operation, a batch normalization and a ReLU activation operation being added after each convolution operation;
step 3: the output of step 2 is concatenated along the channel dimension with the shallow features output by the second information sharing module, then passes through two 3×3 convolution kernel calculations and an up-sampling convolution layer operation, a batch normalization and a ReLU activation operation being added after each convolution operation;
step 4: the output of step 3 is concatenated along the channel dimension with the shallow features output by the first information sharing module, then passes through two 3×3 convolution kernel calculations, and a final convolution kernel calculation yields the final photoacoustic image, a batch normalization and a ReLU activation operation being added after each convolution operation.
9. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein, in the training phase of the time-frequency domain U-shaped network model, a mutual information constraint on the intermediate latent variables is used to compensate prior knowledge.
10. The mutual information-based limited-view photoacoustic image reconstruction method according to claim 1, wherein using the mutual information constraint on the intermediate latent variables to compensate prior knowledge specifically comprises the following steps:
in the first step, the reconstruction objective function of the time-frequency domain U-shaped network model is given by the following formula (1):

$$\hat{\theta}=\arg\min_{\theta}\ \mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{1}$$

in formula (1), θ represents all trainable parameters in the time-frequency domain U-shaped network model; y represents the real photoacoustic image; 𝔼[·] represents the mathematical expectation; f_θ(·) represents the time-frequency domain U-shaped network model; ℒ(·,·) represents the reconstruction loss;

in the second step, an auxiliary network is pre-trained to provide the intermediate latent representation z_1 used in the mutual information constraint;

in the third step, after the mutual information constraint is added, the objective function of the time-frequency domain U-shaped network model in the training stage is represented by the following formula (2):

$$\hat{\theta}=\arg\max_{\theta}\ \Big\{\lambda\, I(z_1,z_2)-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big]\Big\} \tag{2}$$

in formula (2), I(·,·) represents mutual information and λ represents the penalty coefficient of the mutual information constraint;
in the fourth step, the variational lower bound of the objective function of formula (2) is optimized: p(z_1|z_2) denotes the true conditional distribution and is approximated by the variational distribution q(z_1|z_2); the mutual information term I(z_1, z_2) is then represented by the following formula (3):

$$I(z_1,z_2)=H(z_1)-H(z_1\mid z_2)=H(z_1)+\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]+\mathbb{E}_{z_2}\big[D_{\mathrm{KL}}\big(p(z_1\mid z_2)\,\|\,q(z_1\mid z_2)\big)\big] \tag{3}$$

in formula (3), H(z_1) represents the information entropy of the intermediate latent representation z_1; H(z_1|z_2) represents the conditional entropy of z_1 given the intermediate latent representation z_2; 𝔼_{z_1,z_2}[·] represents the mathematical expectation over z_1 and z_2; D_KL(·‖·) represents the KL divergence; since z_1 is provided by the fixed pre-trained network, H(z_1) is a constant, and the KL divergence is non-negative, so both terms are dropped and the variational lower bound of the reconstruction objective function is obtained as the following formula (4):

$$\hat{\mathcal{L}}(\theta)=\lambda\,\mathbb{E}_{z_1,z_2}\big[\log q(z_1\mid z_2)\big]-\mathbb{E}\big[\mathcal{L}\big(f_{\theta}(x_i,x_k),\,y\big)\big] \tag{4}$$

in the fifth step, a Gaussian distribution is selected as the variational distribution q(z_1|z_2), as shown in the following formula (5):

$$q(z_1\mid z_2)=\mathcal{N}\big(z_1\,\big|\,\mu(z_2),\,\sigma^{2}(z_2)\big) \tag{5}$$

in formula (5), μ(z_2) represents the mean of the Gaussian distribution, computed from the intermediate latent variable z_2 by a 1×1 convolution layer; σ²(z_2) represents the variance of the Gaussian distribution, computed per channel using the following formula (6):

$$\sigma_c^{2}=\log\big(1+e^{\alpha_c}\big)+\varepsilon,\qquad \alpha\in\mathbb{R}^{C},\ c=1,\dots,C \tag{6}$$

in formula (6), C, H and W represent the channel number, height and width of z_1 and z_2; ε is a positive constant; ℝ^C represents the real space of dimension C, in which the learnable parameter α lives;

in the sixth step, after substituting formula (6) into formula (5) and taking the logarithm, the variational distribution q(z_1|z_2) gives the following formula (7):

$$\log q(z_1\mid z_2)=-\sum_{c=1}^{C}\sum_{h=1}^{H}\sum_{w=1}^{W}\left(\log\sigma_c+\frac{\big(z_{1,c,h,w}-\mu_{c,h,w}(z_2)\big)^{2}}{2\sigma_c^{2}}\right)+\mathrm{const} \tag{7}$$

in formula (7), const is a constant; formula (7) expresses the intractable mutual information term as a variational mutual information term that is easy to solve and implement, and through formula (7) the intermediate latent variable of the time-frequency domain U-shaped network model can learn more supervision information from the pre-trained encoder, achieving the effect of prior knowledge compensation.
CN202011215182.6A 2020-11-04 2020-11-04 Mutual information-based limited view photoacoustic image reconstruction method Active CN112308941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011215182.6A CN112308941B (en) 2020-11-04 2020-11-04 Mutual information-based limited view photoacoustic image reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011215182.6A CN112308941B (en) 2020-11-04 2020-11-04 Mutual information-based limited view photoacoustic image reconstruction method

Publications (2)

Publication Number Publication Date
CN112308941A CN112308941A (en) 2021-02-02
CN112308941B true CN112308941B (en) 2023-06-20

Family

ID=74325545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011215182.6A Active CN112308941B (en) 2020-11-04 2020-11-04 Mutual information-based limited view photoacoustic image reconstruction method

Country Status (1)

Country Link
CN (1) CN112308941B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2014115603A (en) * 2014-04-18 2015-10-27 Федеральное государственное бюджетное учреждение науки Научно-технологический центр уникального приборостроения РАН ACOUSTOPTIC DEVICE FOR RECEIVING SPECTRAL STEREO IMAGES WITH SPECTRAL TRANSFORMATION
CN111507884A (en) * 2020-04-19 2020-08-07 衡阳师范学院 Self-adaptive image steganalysis method and system based on deep convolutional neural network
CN111640069A (en) * 2020-04-17 2020-09-08 上海交通大学 Compressive imaging method, system and device based on light sensing network and phase compensation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2754388B1 (en) * 2013-01-15 2020-09-09 Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt GmbH System and method for quality-enhanced high-rate optoacoustic imaging of an object
US20200342304A1 (en) * 2019-04-25 2020-10-29 International Business Machines Corporation Feature importance identification in deep learning models
CN111915691B (en) * 2019-05-07 2023-08-22 上海科技大学 Image processing system, method, terminal and medium based on neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2014115603A (en) * 2014-04-18 2015-10-27 Федеральное государственное бюджетное учреждение науки Научно-технологический центр уникального приборостроения РАН ACOUSTOPTIC DEVICE FOR RECEIVING SPECTRAL STEREO IMAGES WITH SPECTRAL TRANSFORMATION
CN111640069A (en) * 2020-04-17 2020-09-08 上海交通大学 Compressive imaging method, system and device based on light sensing network and phase compensation
CN111507884A (en) * 2020-04-19 2020-08-07 衡阳师范学院 Self-adaptive image steganalysis method and system based on deep convolutional neural network

Also Published As

Publication number Publication date
CN112308941A (en) 2021-02-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant