CN111260741B

CN111260741B - Three-dimensional ultrasonic simulation method and device by utilizing generated countermeasure network

Info

Publication number: CN111260741B
Application number: CN202010082738.2A
Authority: CN
Inventors: 杨健; 范敬凡; 艾丹妮; 董嘉慧; 王涌天
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2020-02-07
Filing date: 2020-02-07
Publication date: 2022-05-10
Anticipated expiration: 2040-02-07
Also published as: CN111260741A

Abstract

A three-dimensional ultrasonic simulation method and a device for generating a countermeasure network are used for generating a simulated three-dimensional ultrasonic image and a real ultrasonic image to be highly simulated. The method comprises the following steps: (1) simultaneously inputting the magnetic resonance MR image and the ultrasonic US image into an encoder in a generator in a least squares generative countermeasure network based on spectral regularization; (2) embedding a three-dimensional self-adaptive example normalization layer between an encoder and a decoder of a network, and keeping the mean value and the variance of the MR image characteristics consistent with the mean value and the variance of the US image characteristics; (3) in the design of the network architecture of the generator, designing a Res-U-Net network architecture, wherein a residual block with a bottleneck is designed to replace blocks in an encoder and a decoder in the original U-Net network architecture; (4) combining content loss including modal perception with feature matching loss and antagonistic loss of spectral regularization to construct a new loss function for ultrasonic simulation; (5) a least squares generation dyadic network of spectral regularization is constructed.

Description

Three-dimensional ultrasonic simulation method and device by utilizing generated countermeasure network

Technical Field

The invention relates to the technical field of ultrasonic image simulation, in particular to a three-dimensional ultrasonic simulation method by utilizing a generated countermeasure network and a three-dimensional ultrasonic simulation device by utilizing the generated countermeasure network.

Background

Ultrasound (US) images and Magnetic Resonance (MR) images are important in a variety of medical cases and are used in a wide variety of applications, such as: head defect and tumor location, liver acute and chronic change detection, navigation of other organ operations and the like. Due to the characteristics of real-time performance, noninvasive imaging and the like, the ultrasonic scanner is widely applied to clinical operations, but the quality of an ultrasonic image is lower than that of an MR image. Magnetic resonance images are a safe imaging protocol that, when used in clinical diagnostics, can provide more anatomical detail than ultrasound images, but unfortunately are not imaged in real time.

In order to enable image-guided surgery, it is often necessary to align the preoperative MR and intraoperative US images. One method of registration of MR and US images is to synthesize the MR images into a US image and then register with the real US image. In addition, since it is practically difficult to simultaneously obtain an ultrasound image and a magnetic resonance image, synthesizing the magnetic resonance image into an ultrasound image is a way of acquiring data and expanding the data. Image composition is defined as a possible non-linear generation intensity transformation process that is applied to a given set of input modality images to generate a new modality image. Image synthesis can be applied in many clinical trials, for example, in the ultrasound image navigation process, it is important to perform registration fusion of an ultrasound image and a high-quality magnetic resonance image before an operation. However, it is technically difficult to directly register the magnetic resonance image and the ultrasound image because the appearance between the magnetic resonance image and the ultrasound image may change drastically. One method is to simulate or synthesize an ultrasound image from a magnetic resonance image, and then register the synthesized ultrasound image with a corresponding ultrasound image to complete the final image fusion. However, these two modality images are very different, and the mr image contains more abundant texture information than the ultrasound image, and so on, which make the synthesis task challenging.

There are many reports related to the synthesis of one modality image from another modality image, such as: a CT image is synthesized from the MR image, and a PET image is synthesized from the CT image. These methods are mainly classified into the following three categories. The first type of method is based on segmentation, which is performed by Zaidi et al by classifying tissues of significantly different densities and compositions into different tissue classes and then manually refining the segmentation. However, these methods rely heavily on the accuracy of the segmentation and often require manual operations to obtain the final accurate results. The second category of methods is atlas registration based methods, which first register all atlases to a new object image, pack the corresponding object images of the atlases into a new space using the resulting deformation field, and then synthesize the object images by fusing the subjective images of the atlases. Fuerst et al first simulated the CT or MR images to generate US images, and then proposed an algorithm for automatic registration of the MR or CT images with the US images using a linearly combined linear correlation similarity metric. The high computational burden of this method makes it inapplicable to real-time surgical navigation. Meanwhile, the lack of anatomical correspondence of the tumor resection area may lead to instability of registration before and after resection by using a direct free deformation registration method. However, the performance of the above method depends to a large extent on the accuracy of the registration and may consume a large amount of registration time. The third category is based on learning methods that learn complex mappings of one modality to another modality of the same part. To solve the problem of large calculation amount in the coefficient learning method, Tri et al adopt a small CT block directly predicted by a structured random forest as structured output, and simultaneously adopt a new self-context model to ensure the robustness of prediction. Vemulapalli et al propose an unsupervised method to simultaneously maximize the global mutual information cost function and the local spatial consistency cost function. Since the synthesis from MR to US is not a one-to-one mapping task, using only intensity values does not effectively distinguish structural details, the performance of the above mentioned method depends to a large extent on the quality of the proposed features and to what extent these features represent the natural properties of the target modality image.

In recent years, methods based on deep learning have been widely used and have achieved good results in the field of cross-modality medical image synthesis. In these works, different Convolutional Neural Network (CNN) structures were used to learn end-to-end non-linear mappings from one modality to another. Nie et al propose to synthesize CT images from MR images using a countertraining strategy and embedding image gradient loss functions in the context model. Avi et al propose to combine a Full Convolution Network (FCN) with a conditional generation countermeasure network (cGAN) to generate a composite PET image from a given input CT image. Bi et al propose to synthesize PET data from CT data based on a multi-channel generative confrontation network (MGAN) artificially labeled lung tumors. While these methods can produce good composite images, they rely on a large number of pairs of images, which are difficult to obtain in practice. To relax the requirement for paired data, wolternk et al proposed a cycleGAN model to synthesize unpaired MR to CT images. In the model training process, cycle consistency is explicitly specified in the process of MR to CT and CT to MR synthesis. However, cycleGAN cannot guarantee structural consistency between the two images due to the lack of direct constraints between the synthesized image and the input image. The above mentioned methods are all using GANs to complete the synthesis, but they only add some traditional methods to the GAN network and do not solve the problem of GAN training difficulty. The stability of GAN is a first consideration. In addition, at present, no one has done the problem of converting nuclear magnetic images into ultrasonic images by using deep learning.

Disclosure of Invention

To overcome the defects of the prior art, the technical problem to be solved by the present invention is to provide a three-dimensional ultrasound simulation method using a generative countermeasure network, which learns the nonlinear mapping of MR to US based on the generative countermeasure network of a three-dimensional adaptive instance normalization layer, and applies the technique to ultrasound image simulation to generate a simulated three-dimensional ultrasound image highly simulating a real ultrasound image.

The technical scheme of the invention is as follows: the three-dimensional ultrasonic simulation method utilizing the generation countermeasure network comprises the following steps:

(1) simultaneously inputting the magnetic resonance MR image and the ultrasonic US image into an encoder in a generator in a least squares generation countermeasure network based on spectral regularization;

(2) embedding a three-dimensional self-adaptive example normalization layer between an encoder and a decoder of the network, and keeping the mean value and the variance of the features of the MR image consistent with the mean value and the variance of the features of the US image;

(3) in the design of the network architecture of the generator, a Res-U-Net network architecture is designed, wherein a residual block with a bottleneck is designed to replace blocks in an encoder and a decoder in the original U-Net network architecture;

(4) combining content loss including modal perception with feature matching loss and antagonistic loss of spectral regularization to construct a new loss function for ultrasonic simulation;

(5) a least squares generation dyadic network of spectral regularization is constructed.

The invention simultaneously inputs a magnetic resonance MR image and an ultrasonic US image into an encoder in a generator in a least square generating countermeasure network based on spectral regularization, a three-dimensional adaptive example normalization layer is embedded between the encoder and the decoder of the network, the mean value and the variance of the MR image characteristics are kept consistent with the mean value and the variance of the US image characteristics, a Res-U-Net network architecture is designed in the design of the network architecture of the generator, a residual block with a bottleneck is designed to replace the blocks in the encoder and the decoder in the original U-Net network architecture, the content loss and the characteristic matching loss which comprise modal perception are combined with the countermeasure loss of the spectral regularization, a new loss function aiming at the ultrasonic simulation is constructed, the spectral regularized least square generating countermeasure network is constructed, and the nonlinear mapping of the MR to the US can be learned based on the generating countermeasure network of the three-dimensional adaptive example normalization layer, the technology is applied to ultrasound image simulation to generate a simulated three-dimensional ultrasound image and a real ultrasound image which are highly simulated.

There is also provided a three-dimensional ultrasound simulation apparatus using a generative countermeasure network, comprising:

an image input module configured to simultaneously input the MR image and the US image into an encoder in a generator in a least squares generative countermeasure network based on spectral regularization;

an embedding module configured to embed a three-dimensional adaptive instance normalization layer between the encoder and decoder of the network, keeping the mean and variance of the MR image features consistent with the mean and variance of the US image features;

a network architecture design module configured to design a Res-U-Net network architecture in which a residual block with a bottleneck is designed to replace blocks in an encoder and a decoder in an original U-Net network architecture, in a network architecture design of a generator;

a loss function construction module configured to combine the content loss including modality perception and the countervailing loss including feature matching loss and spectral regularization to construct a new loss function for ultrasound simulation;

a network construction module that registers to construct a spectrally regularized least squares generating dyadic network.

Drawings

FIG. 1 is a flow chart of a three-dimensional ultrasound simulation method utilizing a generated countermeasure network in accordance with the present invention.

FIG. 2 is a schematic flow diagram of one embodiment of a three-dimensional ultrasound simulation method utilizing a generated countermeasure network in accordance with the present invention.

Fig. 3 is a specific network architecture of an encoder and decoder in a generator implemented according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In order to make the description of the present disclosure more complete and complete, the following description is given for illustrative purposes with respect to the embodiments and examples of the present invention; it is not intended to be the only form in which the embodiments of the invention may be practiced or utilized. The embodiments are intended to cover the features of the various embodiments as well as the method steps and sequences for constructing and operating the embodiments. However, other embodiments may be utilized to achieve the same or equivalent functions and step sequences.

As shown in fig. 1, the three-dimensional ultrasonic simulation method using a generation countermeasure network includes the following steps:

The invention simultaneously inputs a magnetic resonance MR image and an ultrasonic US image into an encoder in a generator in a least square generation type countermeasure network based on spectral regularization, a three-dimensional adaptive example normalization layer is embedded between the encoder and the decoder of the network, the mean value and the variance of the characteristics of the MR image are kept consistent with the mean value and the variance of the characteristics of the US image, a Res-U-Net network architecture is designed in the network architecture design of the generator, a residual block with a bottleneck is designed to replace the blocks in the encoder and the decoder in the original U-Net network architecture, the content loss and the characteristic matching loss including modal perception are combined with the countermeasure loss including spectral regularization, a new loss function aiming at ultrasonic simulation is constructed, a spectral least square generation countermeasure network is constructed, and the MR to US nonlinear mapping can be learned based on the generation countermeasure network of the three-dimensional adaptive example normalization layer, the technology is applied to ultrasound image simulation to generate a simulated three-dimensional ultrasound image and a real ultrasound image high simulation.

Preferably, in the step (2), the three-dimensional adaptive instance normalization layer is defined as formula (1):

wherein, mu (I)_MR) And σ (I)_MR) Mean and standard deviation, respectively, of the magnetic resonance MR image by sigma (I)_MR) To input US image I_USSimple scale normalization was performed with μ (I)_MR) It is translated and these statistics are computed across spatial locations.

Preferably, in the step (2), the three-dimensional adaptive instance normalization layer realizes modal transfer of the feature space by transferring channel mean and variance feature statistics; the modal features are converted to an output image space by a feed-forward decoder, and the variance of the feature channel encodes more modal information and transmits the information to the three-dimensional adaptive instance normalization layer output and the finally generated image.

Preferably, in the step (3), the modality of the Res-U-Net network architecture transfers the network to an MR image I_MRAnd a modality US image I_USAs input, an image G (I) is synthesized_MR) An image combining the former content information and the latter modality information; the two input images are passed to a feature encoding module where residual blocks with bottlenecks are used to replace blocks in the original U-Net encoder.

Preferably, in the step (3), the residual block connects two 3 × 3 × 3 convolutional layers and one additional 1 × 1 × 1 convolutional layer; after the two images are coded into a feature space, inputting the two feature maps into a 3D AdaIN layer, and aligning the mean value and the variance of the input feature maps with the mean value and the variance of the modal feature maps by the layer to generate a target feature map t; inputting t into a characteristic decoding module, and finally mapping the t back to an image space; the decoder is the mirror image of the encoder, adding two Dropout layers in the decoding block to prevent over-fitting during the training process.

Preferably, in said step (3), the jump connection from equal resolution provides the generator G with a method of bypassing the bottleneck of information, and different learning rates are set for the generator and the arbiter in order to balance the training speeds of both the generator and the arbiter during the training of the network.

Preferably, in the step (4),

output image G (I) of the generator_MR) Again through the encoder (E) and with the input MR image I passing through the encoder_MRComparing in the potential space; by measuring the input image I in the underlying space_MRAnd generating an image G (I)_MR) Similarity in content, only focus on the detailed information related to the modality. Since the training of the encoder is combined with the training of the decoder, which produces a modality-specific image, the input image I_MRThe resulting latent vector z is considered to be a representation of its modality-dependent content, which latent spatial representation changes continuously during the training process, thereby ultimately adapting to the modality image I_USIs represented by (a); the definition of modal-aware content loss is formula (2):

where d denotes the dimension of the underlying space, the modal-aware content loss computation I_MRAnd G (I)_MR) Normalized squared euclidean distance therebetween; in the above equation, E and G are coupled together for training, E is adjusted according to G, E is an adaptive content encoder;

the feature matching loss comparison is the activation mapping of the intermediate layer of the discriminators, and the feature matching loss penalizes the generation of the image G (I) by defining a distance metric between the activation mappings of the discriminators_MR) With the real image I_USDissimilarity, the feature matching penalty is defined as equation (3):

wherein L is the last convolution layer of the discriminator, N_iIs the number of elements of the i-th layer active layer,

is the mapping of the arbiter i layer active layer;

the confrontational loss of spectral regularization is formula (4):

wherein D is_SNDiscriminator after spectrum normalization, G_SRRepresenting the spectrally regularized generator, G being the input image I_MRA modal synthesis network of (a);

the overall loss function is formula (5):

L＝λ_cL_c+λ_FL_F+λ_advL_adv (5)

wherein λ is_c,λ_FAnd λ_advRespectively, weight parameters.

Preferably, in the step (4), the spectrum regularization is applied to the reactive network by using least square generation,

the objective function based on the LSGAN model is defined as formula (6):

wherein, I_MRIs an input MR image, I_USIs an input modality US image, G (I)_MR,I_US) Is the generated US image.

It will be understood by those skilled in the art that all or part of the steps in the method of the above embodiments may be implemented by hardware instructions related to a program, the program may be stored in a computer-readable storage medium, and when executed, the program includes the steps of the method of the above embodiments, and the storage medium may be: ROM/RAM, magnetic disks, optical disks, memory cards, and the like. Therefore, corresponding to the method of the invention, the invention also includes a three-dimensional ultrasound simulation device for generating a countermeasure network, which is generally represented in the form of functional modules corresponding to the steps of the method. The device includes:

an embedding module configured to embed a three-dimensional adaptive instance normalization layer between the encoder and decoder of the network, keeping the mean and variance of the features of the MR image consistent with the mean and variance of the features of the US image;

The present invention is described in more detail below.

The invention adopts an encoding-decoding network architecture, and utilizes a designed encoder to extract the characteristics of an input MR image and a modal US image. Modality transfer in a feature space is performed using a three-dimensional adaptive instance normalization layer. A symmetric encoder network is then learned to generate the final output image by converting the output of the three-dimensional adaptive instance normalization layer back into image space. The generator G produces an output modality US image from the input MR image. US image G (I) generated using a standard pair of antagonism discriminators D_MR) And real US images are distinguished. In addition, a feature matching penalty is calculated with the discriminator and a modality-aware content penalty is calculated with the encoder.

The three-dimensional adaptive instance normalization layer is defined as follows:

wherein, mu (I)_MR) And σ (I)_MR) Mean and standard deviation of the input MR images, respectively. We use σ (I)_MR) To input modality US image I_USSimple scale normalization with μ (I)_MR) It is translated. Like IN, these statistics are computed across spatial locations.

From the above formula, the three-dimensional adaptive instance normalization layer realizes the modal transfer of the feature space by transferring the feature statistics of the mean and variance of the channel. The output generated by the three-dimensional self-adaptive example normalization layer has higher average activation degree on the characteristics generated by mode image coding, and meanwhile, the structure information of the input image can be reserved. The modal characteristics are then translated to the output image space using a feed forward decoder. The variance of the feature channel can encode more modal information and transmit it to the three-dimensional adaptive instance normalization layer output and the final generated image. Furthermore, the three-dimensional adaptive instance normalization layer is as simple as the normalization layer and does not increase computational cost.

The proposed network architecture is that of 3D Res-U-Net, the modality transfer network uses an MR image I_MRAnd a modality US image I_USAs input, an image G (I) is synthesized_MR) The image combines the former content information and the latter modality information. The two input images are passed to a feature encoding module where residual blocks with bottlenecks are used to replace blocks in the original U-Net encoder. The residual block connects two 3 x 3 convolutional layers and one additional 1 x 1 convolutional layer. After the two images are encoded into the feature space, both feature maps are input into a 3D AdaIN layer that aligns the mean and variance of the input feature maps with the mean and variance of the modal feature maps to generate a target feature map t. T is then input to the feature decoding module, which maps it back to image space. The decoder is mainly a mirror image of the encoder, adding two Dropout layers in the decoding block to prevent over-fitting during training. Furthermore, the hopping connection from equal resolution provides the generator G with a way to bypass the information bottleneck. In training the network, different learning rates may be set for the generator and the arbiter in order to balance the training speeds of both the generator and the arbiter.

In order to improve the reconstruction quality, an optimization method for modal perception content loss is provided, and meanwhile, a composite image is learned through a network. Output image G (I) of the generator_MR) Again through the encoder (E) and with the input image I passing through the encoder_MRA comparison is made. Thus, by measuring the input image I in the underlying space_MRAnd generating an image G (I)_MR) The similarity in the content is such that,only the detail information related to the modality is of interest. Since the training of the encoder is combined with the training of the decoder, which produces a modality-specific image, the input image I_MRThe resulting latent vector z can be viewed as a representation of its modality-dependent content. The potential spatial representation is constantly changing during the training process, thereby finally adapting to the modal image I_USIs shown. Thus, modality-aware content loss is defined as follows:

where d denotes the dimension of the underlying space, the modal-aware content loss computation I_MRAnd G (I)_MR) Normalized squared euclidean distance between. Different modes input different degrees of content retention, so measuring content differences using fixed networks violates this empirical rule. In the above equation, E and G are coupled together for training, and E is adjusted according to G, so E is an adaptive content encoder.

Next, the final penalty is the feature matching penalty L_FMSimilar to the perceptual loss of perceptual differences between measured images. The loss-aware activation map is computed using a pre-trained VGG19 network. The feature matching loss comparison is the activation mapping of the intermediate layer of the discriminators, and the feature matching loss penalizes the generation of the image G (I) by defining a distance metric between the activation mappings of the discriminators_MR) With the real image I_USDissimilarity. The feature matching penalty is defined as:

is the mapping of the arbiter ith layer active layer. From the aboveIt can be seen that the feature matching penalty processes the edge information by comparing the activation maps of each intermediate layer of the arbiter, thereby forcing the generator to generate images closer to the real image, stabilizing the training process.

Additional constraints are added on the generator, and the problem of spectrum normalization gradient explosion is solved definitely.

The spectral regularization weight, W, may be defined as:

W＝W-max(0,σ₀-σ_clamp)v₀u₀

wherein σ_clampSet to a fixed value sigma_reg，σ₀Is the maximum singular value of the weight matrix, v₀And u₀Is the singular vector corresponding to W. It can be observed that adding additional constraints to G improves the stability of the training and improves the quality of the generated image better than weight normalization and gradient penalties. Thus, the penalty of spectral regularization can be designed as:

wherein D is_SNDiscriminator after spectrum normalization, G_SRRepresenting the spectrally regularized generator. G is the input image I_MRThe network is synthesized. In this context, we chose to apply spectral regularization on the least squares generation dyadic network proposed by MAO et al, since it was found that LSGAN generates images of better quality and faster network convergence than other forms of GANs. The objective function based on the LSGAN model is defined as follows:

wherein, I_MRIs an input MR image, I_USIs an input modalityUS image, G (I)_MR,I_US) Is the generated US image.

Plus a spectrally regularized penalty function, the overall penalty function is:

L＝λ_cL_c+λ_FL_F+λ_advL_adv

wherein λ is_c,λ_FAnd λ_advRespectively, weight parameters.

It can be seen that the training process for generating the antagonistic network is: obtaining a modal perception loss function and a characteristic matching loss function of the real ultrasonic image according to the output result of the generator and the corresponding real ultrasonic image; generating a least square loss function of a generator and a discriminator in the countermeasure network based on the spectrum regularization, obtaining a total loss function of the generator according to an output result of the discriminator and a content loss function of the real ultrasonic image, and obtaining a total loss function of the discriminator according to an output result of the discriminator; and respectively updating parameters in the network structures of the arbiter and the generator according to the total loss function of the arbiter and the total loss function of the generator until the generation of the confrontation network convergence.

The method for deep network learning generates an impedance network, simultaneously inputs MR and US images into a generator of the trained impedance network, and then generates a simulated three-dimensional ultrasonic image which is a highly simulated ultrasonic image with a real ultrasonic image.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.

Claims

1. A three-dimensional ultrasonic simulation method by using a generative countermeasure network is characterized in that: which comprises the following steps:

(2) embedding a three-dimensional self-adaptive example normalization layer between an encoder and a decoder of the network, and keeping the mean value and the variance of the MR image characteristics consistent with the mean value and the variance of the US image characteristics;

(5) constructing a least square generation countermeasure network of spectrum regularization;

in the step (2), the three-dimensional adaptive instance normalization layer is defined as formula (1):

wherein, mu (I)_MR) And σ (I)_MR) Mean and standard deviation, respectively, of the magnetic resonance MR image by sigma (I)_MR) To input ultrasound US image I_USSimple scale normalization with μ (I)_MR) It is translated and these statistics are computed across spatial locations.

2. The three-dimensional ultrasonic simulation method using a generative countermeasure network according to claim 1, wherein: in the step (2), the three-dimensional self-adaptive example normalization layer realizes modal transfer of the feature space by transferring the feature statistics of the mean value and the variance of the channel; the modal features are converted into an output image space by a feed-forward decoder, and the variance of the feature channel encodes more modal information and transmits the information to the three-dimensional adaptive instance normalization layer output and the finally generated image.

3. The three-dimensional hyper-graph utilizing a generative countermeasure network of claim 2The acoustic simulation method is characterized in that: in the step (3), the mode transfer network of the Res-U-Net network architecture uses one MR image I_MRAnd a modality US image I_USAs input, an image G (I) is synthesized_MR) The image combines the former content information and the latter modality information; the two input images are passed to a feature encoding module where residual blocks with bottlenecks are used to replace blocks in the original U-Net encoder.

4. The three-dimensional ultrasonic simulation method using a generative countermeasure network according to claim 3, wherein: in the step (3), the residual block connects two 3 × 3 × 3 convolutional layers and one additional 1 × 1 × 1 convolutional layer; after the two images are coded into a feature space, inputting the two feature maps into a 3D AdaIN layer, and aligning the mean value and the variance of the input feature maps with the mean value and the variance of the modal feature maps by the layer to generate a target feature map t; then inputting t into a characteristic decoding module, and mapping the t back to an image space; the decoder is the mirror image of the encoder, adding two Dropout layers in the decoding block to prevent overfitting during training.

5. The three-dimensional ultrasonic simulation method using a generative countermeasure network according to claim 3, wherein: in step (3), the jump connection from equal resolution provides a method for the generator G to bypass the bottleneck of information, and in the process of training the network, different learning rates are set for the generator and the arbiter in order to balance the training speeds of the generator and the arbiter.

6. The three-dimensional ultrasonic simulation method using a generative countermeasure network according to claim 5, wherein: in the step (4), the step of (C),

output image G (I) of the generator_MR) Again through the encoder and with the input image I passing through the encoder_MRComparing in the potential space; by measuring the input image I in the underlying space_MRAnd generating an image G (I)_MR) Similarity in content, focusing only on modality-wiseDetails of the relation; the decoder can generate a mode-specific image by combining the training of the encoder and the training of the decoder, so that the input image I is_MRThe resulting latent vector z is considered to be a representation of its modality-dependent content, which latent spatial representation changes continuously during the training process, thereby ultimately adapting to the modality image I_USIs represented by (a); the definition of modal-aware content loss is formula (2):

where d denotes the dimension of the underlying space, the modal-aware content loss computation I_MRAnd G (I)_MR) Normalized squared euclidean distance therebetween; in the above equation, E and G are coupled together for training, E is adjusted according to G, E is an adaptive content encoder; p (I)_MR)

is the mapping of the arbiter i layer active layer;

the confrontational loss of spectral regularization is formula (4):

wherein D is_SNDiscriminator after spectrum normalization, G_SRRepresenting the spectrally regularized generator, G being the input image I_MRA modal synthesis network;

the overall loss function is formula (5):

L＝λ_cL_c+λ_FL_F+λ_advL_adv (5)

wherein λ is_c,λ_FAnd λ_advRespectively, weight parameters.

7. The three-dimensional ultrasonic simulation method using a generative countermeasure network according to claim 6, wherein: in the step (4), the least square generation is adopted to apply spectrum regularization on the reactive network,

the objective function based on the LSGAN model is defined as formula (6):

wherein, I_MRIs an input MR image, I_USIs an input modality US image, G (I)_MR,I_US) Is the simulated US image generated.

8. A three-dimensional ultrasound simulation apparatus using a generative countermeasure network, characterized by: it includes:

a network architecture design module configured to design a Res-U-Net network architecture in the network architecture design of the generator, wherein a residual block with a bottleneck is designed to replace blocks in an encoder and a decoder in the original U-Net network architecture;

a network construction module which is registered to construct a spectrum regularized least square generation countermeasure network;

in the embedded module, a three-dimensional adaptive instance normalization layer is defined as formula (1):