CN116433795B - Multi-modal image generation method and device based on a generative adversarial network - Google Patents
Multi-modal image generation method and device based on a generative adversarial network
- Publication number
- CN116433795B CN116433795B CN202310699766.2A CN202310699766A CN116433795B CN 116433795 B CN116433795 B CN 116433795B CN 202310699766 A CN202310699766 A CN 202310699766A CN 116433795 B CN116433795 B CN 116433795B
- Authority
- CN
- China
- Prior art keywords
- mode
- image
- mode image
- images
- discriminator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
- G06T11/008—Specific post-processing after tomographic reconstruction, e.g. voxelisation, metal artifact correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
- G06T7/337—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-modal image generation method and device based on a generative adversarial network (GAN), comprising the following steps: acquiring a first-modality image and a second-modality image of the same target, and augmenting the first-modality image to obtain two augmented images; constructing a GAN comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images; constructing a contrastive loss between the features of the two intermediate feature maps, optimizing the parameters of the GAN by combining the contrastive loss with the original GAN loss, and extracting the parameter-optimized generator for multi-modal image generation, thereby improving image accuracy.
Description
Technical Field
The invention belongs to the technical field of cross-modal generation of medical images, and in particular relates to a multi-modal image generation method and device based on a generative adversarial network.
Background
Medical imaging is a powerful diagnostic and research tool that creates visual representations of anatomical structures and is widely used for disease diagnosis and surgical planning. In current clinical practice, computed tomography (CT) and magnetic resonance imaging (MRI) are most commonly used. Since CT and the various MR imaging modalities provide complementary information, effective integration of these modalities can help physicians make more informed decisions. Because paired multi-modal images are difficult to obtain, there is a growing need in clinical practice to develop multi-modal image generation to aid clinical diagnosis and therapy.
Medical image generation methods divide into traditional machine learning methods and deep learning methods. Traditional machine learning methods, such as random forests and k-nearest-neighbor algorithms, rely on explicit feature representations that are optimized iteratively. More recently, convolutional neural networks have been widely applied to image generation tasks and have achieved state-of-the-art performance through generative adversarial networks.
Current mainstream GAN-based models improve the discriminator by regularizing it, implicitly or explicitly, with methods such as gradient penalty, spectral normalization, contrastive learning, and consistency regularization.
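As context for the spectral-normalization technique named above (this sketch is not part of the patent): the idea is to estimate a weight matrix's largest singular value by power iteration and divide the weights by it, bounding the discriminator's Lipschitz constant. A minimal NumPy version:

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Scale W so that its largest singular value is approximately 1."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u                       # power iteration on W^T W
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v                     # approximate largest singular value
    return W / sigma

W = np.diag([4.0, 2.0, 1.0])              # toy weight matrix, sigma_max = 4
W_sn = spectral_normalize(W)
print(round(float(np.linalg.svd(W_sn, compute_uv=False)[0]), 3))  # ~1.0
```

In practice libraries apply this per layer inside the discriminator; the toy diagonal matrix here just makes the effect easy to verify.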
For example, the patent application with publication number CN112465118A discloses a low-rank generative adversarial network construction method for medical image generation, comprising the following steps: 1) approximating the full-rank convolution operation in the GAN model using principal components, and constructing a low-rank convolution operation based on the calculation rules of tensor CP decomposition; 2) using the low-rank convolution operation from step 1) to construct low-rank dimension-wise and low-rank channel-wise convolution layers that replace the full-rank convolution layers, adding ReLU activation functions and batch normalization between the low-rank convolution layers to adjust their data distribution, and thereby designing a low-rank generator; 3) integrating the low-rank generator with a full-rank discriminator to construct a complete low-rank generative adversarial network for medical images.
As another example, patent application CN113205567A discloses a deep-learning method for synthesizing CT images from MRI images, comprising the following steps: S1, selecting an original MRI image and an original CT image as the floating (moving) image and the reference image respectively, then applying N4 bias-field correction and normalization to obtain preprocessed MRI and CT images; S2, training a generative adversarial network for CT synthesis using the preprocessed MRI and CT images; S3, feeding a preprocessed MRI image into the trained network so that it is converted into a synthetic CT image.
In prior-art solutions such as the two patent applications above, regularization typically acts on the high-level, task-related features output by the discriminator, while the shallow features of the intermediate layers, such as color and texture, are often ignored; the accuracy of image synthesis therefore still leaves room for improvement.
Disclosure of Invention
In view of the foregoing, an object of the present invention is to provide a multi-modal image generation method and apparatus based on a generative adversarial network that applies contrastive learning to the shallow features of the discriminator, thereby improving the discriminator's sensitivity to picture style information and, in turn, the accuracy of multi-modal image generation.
To achieve the above object, an embodiment of the present invention provides a multi-modal image generation method based on a generative adversarial network, comprising the following steps:
acquiring a first-modality image and a second-modality image of the same target, and augmenting the first-modality image to obtain two augmented images;
constructing a generative adversarial network comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images;
constructing a contrastive loss between features based on the two intermediate feature maps, and optimizing the parameters of the generative adversarial network by combining the contrastive loss with the network's original loss;
extracting the parameter-optimized generator for multi-modal image generation.
Preferably, the first-modality image and the second-modality image of the same target are preprocessed by:
filtering the original first-modality image and the original second-modality image;
performing rigid registration, based on the target region, of the filtered original first-modality and second-modality images;
applying pixel normalization to the rigidly registered original first-modality and second-modality images respectively;
and performing target selection on the pixel-normalized original first-modality and second-modality images to obtain the first-modality image and second-modality image of the same target.
Preferably, the generator adopts the generator structure of the pix2pix model.
Preferably, the discriminator is a Markovian (PatchGAN) discriminator.
Preferably, the method further comprises:
at least two MLP layers are added for each intermediate feature map output by the intermediate layers of the discriminator; the intermediate feature maps are updated by the MLPs, and the updated feature maps participate in the contrastive loss calculation.
Preferably, constructing a contrastive loss between features based on the two intermediate feature maps comprises:
taking, as positive sample pairs, the two intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images at the same intermediate layer of the discriminator;
taking, as negative sample pairs, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images at different intermediate layers of the discriminator;
and constructing the contrastive loss based on the positive and negative sample pairs.
Preferably, the original loss of the generative adversarial network includes an L1 loss constructed from the predicted second-modality image and the second-modality image corresponding to the first-modality image, as well as the adversarial loss of the generator and the discriminator.
To achieve the above object, an embodiment further provides a multi-modal image generation device based on a generative adversarial network, comprising an acquisition module, a network construction module, a parameter optimization module, and an image generation module, wherein:
the acquisition module is used to acquire a first-modality image and a second-modality image of the same target, and to augment the first-modality image to obtain two augmented images;
the network construction module is used to construct a generative adversarial network comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images;
the parameter optimization module is used to construct a contrastive loss between features based on the two intermediate feature maps, and to optimize the parameters of the generative adversarial network by combining the contrastive loss with the network's original loss;
the image generation module is used to extract the parameter-optimized generator for multi-modal image generation.
To achieve the above object, an embodiment further provides a computing device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above multi-modal image generation method based on a generative adversarial network.
To achieve the above object, an embodiment further provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the steps of the above multi-modal image generation method based on a generative adversarial network.
Compared with the prior art, the beneficial effects of the invention include at least the following:
On top of constructing the augmented images, the discriminator of the generative adversarial network computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images. A contrastive loss between the features of these two intermediate feature maps adds contrastive learning to the discriminator's intermediate layers, strengthening the discriminator's learning of shallow image features, improving its sensitivity to picture style information, and thereby improving the multi-modal generation accuracy of the generator. Moreover, the method can be applied straightforwardly to any other medical image generation algorithm, improving performance without changing the network structure of the original algorithm.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a multi-modal image generation method based on a generative adversarial network provided by an embodiment;
FIG. 2 is a flowchart of the modality image preprocessing provided by an embodiment;
FIG. 3 is a schematic diagram of the architecture of the generative adversarial network provided by an embodiment;
FIG. 4 is a schematic diagram of the structure of the generator provided by an embodiment;
FIG. 5 is a schematic diagram of a residual structure in the generator provided by an embodiment;
FIG. 6 is a schematic diagram of the structure of a multi-modal image generation device based on a generative adversarial network according to an embodiment;
FIG. 7 is a schematic diagram of the structure of an electronic device according to an embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.
Recent research has shown that classification models tend to learn picture style information expressed as texture; that is, if texture information suffices to achieve high classification accuracy, the model will not learn more complex representations. Because the discriminator in a GAN can likewise be regarded as a simple classifier, it also relies on picture texture information for its discrimination. The invention therefore provides a multi-modal image generation method and device based on a generative adversarial network, in which the designed network applies contrastive learning to the shallow features of the discriminator, improving the discriminator's sensitivity to picture style information and thereby improving the multi-modal generation results.
Fig. 1 is a flowchart of the multi-modal image generation method based on a generative adversarial network according to an embodiment. As shown in Fig. 1, the method includes the following steps:
s110, a first mode image and a second mode image of the same target are obtained, and the first mode image is enhanced to obtain two enhanced mode images.
In an embodiment, multi-modality image data are obtained from a hospital, including raw first-modality image data such as magnetic resonance (MR) images, raw second-modality image data such as computed tomography (CT) images, and a mask corresponding to a target region, such as a tumor region, in the image data. The CT images include the arterial phase (ART), portal venous phase (PV), non-contrast phase (NC), and delayed phase (DL); the MR images include the arterial phase (ART), delayed phase (DL), diffusion-weighted imaging (DWI), non-contrast phase (NC), portal venous phase (PV), and T2-weighted imaging (T2). The MR and CT data are in NIfTI (.nii) format, and the mask data are in NRRD format.
After the raw multi-modal image data are obtained, preprocessing is required to obtain the first-modality image and second-modality image of the same target; as shown in Fig. 2, this specifically includes:
s210, filtering the original first mode image and the original second mode image.
Specifically, for the CT image, the window is set to (-110, 190) according to the physician's prior knowledge, and filtering and denoising are performed using the np.clip function from the NumPy library. For the MR image, filtering and denoising are performed using the estimate_sigma and nlmeans functions from the dipy library.
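The CT windowing step above can be sketched as follows (a minimal illustration, not the patent's code; the dipy denoising calls for the MR image are omitted here to keep the sketch self-contained):

```python
import numpy as np

def window_ct(volume, low=-110.0, high=190.0):
    """Clip CT intensity values to the clinically chosen window (-110, 190)."""
    return np.clip(volume, low, high)

# toy 1-D "volume" of Hounsfield-like values
ct = np.array([-1000.0, -110.0, 35.0, 190.0, 3000.0])
print(window_ct(ct).tolist())  # [-110.0, -110.0, 35.0, 190.0, 190.0]
```

Values outside the window (air, metal artifacts) are saturated rather than removed, which suppresses extreme outliers before normalization.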
S220: perform rigid registration, based on the target region, of the filtered original first-modality and second-modality images.
Specifically, the original first-modality image of the same patient is taken as the moving image and the original second-modality image as the fixed image; the mask of the target region is used to compute the transformation between the target regions of the two modality images, and the resulting transformation is then applied to the whole moving image to obtain the registered first-modality image. Concretely, affine registration from dipy is used.
S230: apply pixel normalization to the rigidly registered original first-modality and second-modality images respectively.
Specifically, for the original first-modality image, the pixel values are normalized directly to [-1, 1] using linear normalization. For the original second-modality image, the pixel values are normalized to [-1, 1] using a standard-score (z-score) step followed by linear normalization.
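The two normalization variants described above can be sketched as (an illustrative NumPy version, not the patent's code):

```python
import numpy as np

def linear_normalize(x):
    """Linearly map intensities to [-1, 1]."""
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def zscore_then_linear(x):
    """Standard-score (z-score) the volume, then linearly map to [-1, 1]."""
    z = (x - x.mean()) / (x.std() + 1e-8)
    return linear_normalize(z)

vol = np.array([0.0, 50.0, 100.0])
print(linear_normalize(vol).tolist())    # [-1.0, 0.0, 1.0]
print(zscore_then_linear(vol).tolist())  # [-1.0, 0.0, 1.0]
```

The z-score step centers and rescales intensities before the final linear mapping, which makes volumes with very different intensity distributions comparable before they enter the network.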
S240: perform target selection on the pixel-normalized original first-modality and second-modality images to obtain the first-modality image and second-modality image of the same target.
Specifically, the slice index at which the target is largest is computed from the target-region mask data of the original first-modality and second-modality images respectively; based on this index, four slices above and four slices below it are selected, giving 9 slices in total as the first-modality and second-modality images of the same target.
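The slice-selection step can be sketched as follows (an illustrative version under the reading that "four slices on each side" of the largest-target slice gives 9 slices; function and variable names are hypothetical):

```python
import numpy as np

def select_slices(mask, span=4):
    """Pick the slice with the largest mask area, plus `span` slices on each side."""
    areas = mask.reshape(mask.shape[0], -1).sum(axis=1)  # per-slice target area
    center = int(np.argmax(areas))
    lo = max(center - span, 0)
    hi = min(center + span, mask.shape[0] - 1)
    return list(range(lo, hi + 1))

mask = np.zeros((20, 8, 8))
mask[10, 2:6, 2:6] = 1           # largest target lies on slice 10
print(select_slices(mask))        # [6, 7, 8, 9, 10, 11, 12, 13, 14]
```

This keeps the 9 slices most likely to contain the target while discarding slices with little anatomical overlap between the two modalities.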
After the first-modality image is obtained, it is augmented: two augmented images corresponding to the first-modality image are obtained by methods such as random cropping and random horizontal flipping. The first-modality image, its two augmented images, and the second-modality image belonging to the same target together form one sample.
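The augmentation step can be sketched as follows (an illustrative NumPy version; the 90% crop ratio and flip probability are assumptions, not values from the patent):

```python
import numpy as np

def augment(img, rng):
    """One random augmentation: random crop to 90% size plus random horizontal flip."""
    h, w = img.shape
    ch, cw = int(h * 0.9), int(w * 0.9)
    top = rng.integers(0, h - ch + 1)             # random crop position
    left = rng.integers(0, w - cw + 1)
    out = img[top:top + ch, left:left + cw]
    if rng.random() < 0.5:                        # random horizontal flip
        out = out[:, ::-1]
    return out

rng = np.random.default_rng(42)
img = np.arange(100.0).reshape(10, 10)
a1, a2 = augment(img, rng), augment(img, rng)     # the two augmented views
print(a1.shape, a2.shape)                         # (9, 9) (9, 9)
```

Calling the function twice on the same first-modality image yields the two augmented views x1 and x2 used later for the contrastive loss.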
S120: construct a generative adversarial network comprising a generator and a discriminator.
As shown in fig. 3, the constructed generative adversarial network comprises a generator and a discriminator. The generator generates three predicted second-modality images from the first-modality image and its two augmented versions. In an embodiment, the generator adopts the generator structure of the pix2pix model, as shown in fig. 4, and consists of three parts. The first part contains three downsampling modules, each comprising a convolution layer with kernel size 3, stride 2, and padding 1, an instance normalization layer, and a ReLU activation function. The second part contains nine residual modules; the network structure of each residual module is shown in fig. 5, which adds the idea of a recurrent neural network to the basic residual module. The choice of convolution layer, normalization layer, and activation function is consistent with the convolution layers of the first part's downsampling modules, and t=3 in fig. 5 denotes 3 recurrent iterations. The third part contains three upsampling modules, each comprising a transposed convolution layer with kernel size 3, stride 2, padding 1, and output_padding 1, an instance normalization layer, and an activation function; the activation function in the first two upsampling modules is ReLU, and in the last upsampling module it is Tanh.
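The down/upsampling parameters quoted above (kernel 3, stride 2, padding 1, output_padding 1) exactly halve and double the spatial size, so three of each restore the input resolution. A quick check with the standard size formulas (an illustration, assuming a 256-pixel input that the patent does not specify):

```python
def conv_out(n, k=3, s=2, p=1):
    """Output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k=3, s=2, p=1, op=1):
    """Output size of a transposed convolution: (n - 1)*s - 2p + k + op."""
    return (n - 1) * s - 2 * p + k + op

n = 256
for _ in range(3):      # three downsampling modules: 256 -> 128 -> 64 -> 32
    n = conv_out(n)
print(n)                 # 32
for _ in range(3):      # three upsampling modules: 32 -> 64 -> 128 -> 256
    n = deconv_out(n)
print(n)                 # 256
```

The output_padding of 1 is what makes the transposed convolution an exact inverse of the stride-2 convolution in terms of spatial size.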
The discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images. In one embodiment, the discriminator is a Markovian (PatchGAN) discriminator comprising five modules: the first module comprises a convolution layer with kernel size 3, stride 2, and padding 1, followed by a LeakyReLU activation function; the second, third, and fourth modules each comprise a convolution layer with kernel size 3, stride 2, and padding 1, an instance normalization layer, and a LeakyReLU activation function; and the last module is a fully connected layer.
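A Markovian discriminator judges local patches rather than the whole image; each output unit's "patch" is its receptive field. For the four stride-2, kernel-3 convolution modules listed above, the receptive field can be computed with the standard recurrence (an illustrative calculation, not code from the patent):

```python
def receptive_field(layers):
    """Receptive field of stacked convs: r grows by (k - 1) * jump; jump *= stride."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump
        jump *= s
    return r

# four conv modules, each kernel 3 / stride 2, as in the discriminator above
print(receptive_field([(3, 2)] * 4))   # 31
```

So each unit of the final feature map classifies roughly a 31x31 patch of the input as real or fake, which is the Markovian ("patch-level") property exploited here.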
In the embodiment, at least two MLP layers are added for each intermediate feature map output by the intermediate layers of the discriminator; the intermediate feature maps are updated by the MLPs, and the updated feature maps participate in the contrastive loss calculation.
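The two-layer MLP projection over an intermediate feature map can be sketched as follows (a minimal NumPy sketch; the flattening, ReLU nonlinearity, unit-normalization, and all dimensions are illustrative assumptions):

```python
import numpy as np

def projection_head(feat, W1, W2):
    """Two-layer MLP over a flattened intermediate feature map, ReLU in between."""
    x = feat.reshape(-1)                        # flatten (C, H, W) -> vector
    h = np.maximum(W1 @ x, 0.0)                 # layer 1 + ReLU
    z = W2 @ h                                  # layer 2 -> embedding
    return z / (np.linalg.norm(z) + 1e-12)      # unit-normalize for cosine similarity

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))           # toy feature map: C=8, 4x4 spatial
W1 = rng.standard_normal((64, 128))             # 128 = 8*4*4 flattened inputs
W2 = rng.standard_normal((32, 64))
z = projection_head(feat, W1, W2)
print(z.shape)                                  # (32,)
```

The unit-normalized embedding is what enters the similarity terms of the contrastive loss computed in S130.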
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network.
In an embodiment, constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs.
Specifically, the first mode image x and its two corresponding enhanced mode images x1 and x2 are input into the generator to generate the predicted second mode images G(x), G(x1), and G(x2). G(x1) and G(x2) are each input into the discriminator to obtain intermediate feature maps from its N intermediate layers, 2N feature maps in total. Assuming the intermediate feature maps obtained by inputting G(x1) into the discriminator are numbered 1 to N, and those obtained by inputting G(x2) are numbered N+1 to 2N, then intermediate feature maps from the same intermediate layer form positive sample pairs and intermediate feature maps from different intermediate layers form negative sample pairs. The contrast loss is computed as follows:
For the positive sample pair numbered i and i+N, the contrast loss is:

ℓ_i = −log [ exp(sim(f_i, f_{i+N})) / Σ_{j=1, j≠i}^{2N} exp(sim(f_i, f_j)) ]

where 1[j≠i] denotes the indicator function, which takes the value 1 if and only if j ≠ i and 0 otherwise; sim(f_i, f_{i+N}) denotes the similarity between the i-th intermediate feature map and the (i+N)-th intermediate feature map; sim(f_i, f_j) denotes the similarity between the i-th and j-th intermediate feature maps; i ranges over 1 to N and j over 1 to 2N. The contrast loss over all positive sample pairs, L_contrast, is then:

L_contrast = (1/N) Σ_{i=1}^{N} ℓ_i
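The pairing scheme above can be sketched as an InfoNCE-style loss over the 2N layer embeddings. The temperature tau and the cosine similarity (via L2 normalization) are common contrastive-learning choices assumed here, not stated in the text; indices are 0-based in code, so the positive partner of embedding i is i+N.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(feats1, feats2, tau=0.07):
    """InfoNCE-style loss over 2N embeddings: the i-th embedding from G(x1)
    and the i-th from G(x2) (index i+N) form the positive pair; all other
    j != i are negatives. tau is an assumed temperature."""
    n = len(feats1)
    z = F.normalize(torch.stack(feats1 + feats2), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / tau                                  # pairwise similarities
    loss = 0.0
    for i in range(n):
        pos = sim[i, i + n]                                # positive-pair similarity
        # denominator runs over all j != i (the indicator function in the text)
        denom = torch.logsumexp(torch.cat([sim[i, :i], sim[i, i + 1:]]), dim=0)
        loss = loss + (denom - pos)                        # -log softmax of the positive
    return loss / n

f1 = [torch.randn(8) for _ in range(4)]   # N = 4 layer embeddings from G(x1)
f2 = [torch.randn(8) for _ in range(4)]   # N = 4 layer embeddings from G(x2)
print(contrastive_loss(f1, f2).item() > 0)
```

Because the positive term is included in the denominator, each ℓ_i is the negative log of a softmax probability and is therefore strictly positive for random embeddings.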
in an embodiment, the contrast loss is combined with the original loss of the countermeasure generation network to perform parameter optimization on the countermeasure generation network, and an adaptive moment estimation (Adam) optimizer is used to update the weights. The original loss of the countermeasure generation network comprises L1 loss constructed based on the predicted second mode image G (x) and the second mode image y corresponding to the first mode image, and further comprises countermeasure loss of the generator and the discriminator.
The L1 loss is expressed as:

L_L1 = E_{x,y}[ ||y − G(x)||_1 ]

The adversarial loss of the generator and discriminator is expressed as:

L_GAN(G, D) = E_y[log D(y)] + E_x[log(1 − D(G(x)))]

where ||·||_1 denotes the L1 norm, E(·) denotes the expectation, and D(·) denotes the discrimination result.

In summary, the loss function of the adversarial generation network model is:

L = λ1·L_GAN + λ2·L_L1 + λ3·L_contrast

where λ1, λ2, and λ3 are coefficients controlling the relative weight of each loss term.
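A sketch of how the three loss terms might be combined for one generator update. The lambda weights are assumptions (lam_l1=100 follows the common pix2pix setting, not necessarily this patent), and the binary cross-entropy form of the adversarial term matches the log D formulation above.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake, y_hat, y, l_contrast,
                   lam_adv=1.0, lam_l1=100.0, lam_con=1.0):
    """Weighted sum of the adversarial, L1, and contrastive terms.
    d_fake: discriminator logits for G(x); y_hat: G(x); y: real second-mode image."""
    # generator wants D(G(x)) judged real -> target of ones
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    l1 = F.l1_loss(y_hat, y)                  # ||y - G(x)||_1
    return lam_adv * adv + lam_l1 * l1 + lam_con * l_contrast

d_fake = torch.zeros(2, 1)                    # illustrative discriminator logits
y_hat, y = torch.rand(2, 1, 8, 8), torch.rand(2, 1, 8, 8)
loss = generator_loss(d_fake, y_hat, y, l_contrast=torch.tensor(0.5))
print(loss.item() > 0)
```

In training, this scalar would be backpropagated and the parameters updated with the Adam optimizer mentioned above.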
In an embodiment, after optimizing the adversarial generation network, three evaluation indexes are adopted to evaluate it: mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). Specifically, CT images were generated from MR images using sample data of 60 patients, and the results under these three indexes are shown in Table 1:
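Table 1 itself is not reproduced in this text, but the three indexes it reports can be computed as follows. The SSIM here is a single-window simplification of the standard locally windowed metric, and the unit data range is an assumption (intensities normalized to [0, 1]).

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two images."""
    return np.mean(np.abs(a - b))

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

def ssim_global(a, b, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification; the
    standard metric averages SSIM over local Gaussian windows)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2) /
            ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))

rng = np.random.default_rng(0)
y = rng.random((64, 64))                     # stand-in ground-truth CT slice
print(mae(y, y), round(ssim_global(y, y), 6))   # -> 0.0 1.0
```

A uniform offset of 0.1 on a unit data range gives an MSE of 0.01 and hence a PSNR of exactly 20 dB, a quick sanity check for the formula.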
analysis Table 1 shows that the three evaluation indexes of MAE, PSNR and SSIM of the generator obtained by the method are all superior to those of the generator generated by pix2pix GAN.
S140, extracting the parameter-optimized generator for multi-mode image generation.
After the adversarial generation network parameters are optimized in S130, the parameter-optimized generator is extracted and used for multi-mode image generation. Specifically, the first mode image is input into the parameter-optimized generator, and the predicted second mode image is obtained by computation; this predicted second mode image has higher accuracy.
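Inference with the extracted generator might look like the following. The one-layer stand-in generator and the commented checkpoint file name are illustrative assumptions; in practice the optimized generator from S130 would be loaded instead.

```python
import torch
import torch.nn as nn

# Stand-in for the optimized generator from S130; a real run would restore
# trained weights, e.g. generator.load_state_dict(torch.load("generator_optimized.pt"))
# (the file name is a hypothetical example).
generator = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Tanh())
generator.eval()

with torch.no_grad():                        # no gradients needed at inference time
    mr_slice = torch.randn(1, 1, 64, 64)     # first-modality input (e.g. an MR slice)
    ct_pred = generator(mr_slice)            # predicted second-modality image (e.g. CT)
print(tuple(ct_pred.shape))                  # -> (1, 1, 64, 64)
```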
In the multi-mode image generation method based on an adversarial generation network described above, the network includes a discriminator that uses contrastive learning, which enhances the discriminator's learning of shallow image features, improves its ability to discriminate texture information when judging images, and thereby improves the quality of the generated images.
It should be noted that this embodiment is based on only one kind of adversarial generation network; using the contrastive-learning discriminator of the present invention in other adversarial generation networks also falls within the protection scope of this patent.
Based on the same inventive concept, as shown in fig. 6, the embodiment further provides a multi-modal image generating apparatus 600 based on an adversarial generation network, including an acquisition module 610, a network construction module 620, a parameter optimization module 630, and an image generation module 640.
The acquisition module 610 is configured to acquire a first mode image and a second mode image of the same target, and to enhance the first mode image to obtain two enhanced mode images. The network construction module 620 is configured to construct an adversarial generation network including a generator and a discriminator, where the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images. The parameter optimization module 630 is configured to construct a contrast loss between features based on the two intermediate feature maps and to optimize the network parameters by combining the contrast loss with the original loss of the adversarial generation network. The image generation module 640 is configured to extract the parameter-optimized generator for multi-modal image generation.
It should be noted that when the multi-mode image generating device provided in the foregoing embodiment performs multi-mode image generation, the division into the above functional modules is used only as an example; in practice, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the terminal or server may be divided into different functional modules to complete all or part of the functions described above. In addition, the multi-mode image generating device and the multi-mode image generating method provided by the embodiments belong to the same concept; the detailed implementation of the device embodiment is described in the method embodiment and is not repeated here.
The embodiment provides a multi-mode image generating device based on an adversarial generation network, in which the network includes a discriminator that uses contrastive learning; this enhances the discriminator's learning of shallow image features, improves its ability to discriminate texture information when judging images, and thereby improves the quality of the generated images.
Based on the same inventive concept, an embodiment further provides a computing device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the multi-modal image generation method based on an adversarial generation network, including the following steps:
S110, acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images;
S120, constructing an adversarial generation network comprising a generator and a discriminator;
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with its original loss;
S140, extracting the parameter-optimized generator for multi-mode image generation.
As shown in fig. 7, at the hardware level the computing device provided by the embodiment includes, in addition to the processor and the memory, hardware required by other services, such as an internal bus, a network interface, and storage. The memory is a non-volatile memory; the processor reads the corresponding computer program from the non-volatile memory into memory and runs it to implement the multi-mode image generation method based on an adversarial generation network described in S110-S140. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded; that is, the execution subject of the processing flows is not limited to logic units, but may also be hardware or logic devices.
Based on the same inventive concept, the embodiment further provides a computer-readable storage medium having a program stored thereon which, when executed by a processor, implements the above multi-modal image generation method based on an adversarial generation network, specifically including the following steps:
S110, acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images;
S120, constructing an adversarial generation network comprising a generator and a discriminator;
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with its original loss;
S140, extracting the parameter-optimized generator for multi-mode image generation.
In embodiments, computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
The foregoing detailed description of the preferred embodiments and their advantages is merely illustrative of the presently preferred embodiments of the invention; any changes, additions, substitutions, and equivalents made to those embodiments are intended to be included within the protection scope of the invention.
Claims (7)
1. A multi-mode image generation method based on an adversarial generation network, characterized by comprising the following steps:
acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images, wherein the first mode image and the second mode image of the same target are obtained by preprocessing in the following manner: filtering the original first mode image and the original second mode image; performing rigid registration based on a target area on the filtered original first mode image and the filtered original second mode image; respectively performing pixel normalization on the rigidly registered original first mode image and original second mode image; and performing target selection on the pixel-normalized original first mode image and original second mode image to obtain the first mode image and the second mode image of the same target;
constructing an adversarial generation network comprising a generator and a discriminator, wherein the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images;
adding at least 2 MLP layers for each intermediate feature map output by the intermediate layers of the discriminator, wherein the intermediate feature maps undergo feature updating through the MLPs, and the updated intermediate feature maps participate in the contrast loss calculation;
constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network, wherein constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs;
extracting the parameter-optimized generator for multi-mode image generation.
2. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the generator adopts the generator structure of the pix2pix (pixel-to-pixel) model.
3. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the discriminator is a Markov discriminator.
4. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the original loss of the adversarial generation network includes an L1 loss constructed based on the predicted second mode image and the second mode image corresponding to the first mode image, and further includes the adversarial loss of the generator and the discriminator.
5. A multi-mode image generating device based on an adversarial generation network, characterized by comprising an acquisition module, a network construction module, a parameter optimization module, and an image generation module, wherein:
the acquisition module is used for acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images, wherein the first mode image and the second mode image of the same target are obtained by preprocessing in the following manner: filtering the original first mode image and the original second mode image; performing rigid registration based on a target area on the filtered original first mode image and the filtered original second mode image; respectively performing pixel normalization on the rigidly registered original first mode image and original second mode image; and performing target selection on the pixel-normalized original first mode image and original second mode image to obtain the first mode image and the second mode image of the same target;
the network construction module is used for constructing an adversarial generation network comprising a generator and a discriminator, wherein the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images;
the parameter optimization module is configured to add at least 2 MLP layers for each intermediate feature map output by the intermediate layers of the discriminator, perform feature updating on the intermediate feature maps through the MLPs so that the updated intermediate feature maps participate in the contrast loss calculation, construct a contrast loss between features based on the two intermediate feature maps, and perform parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network, wherein constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs;
the image generation module is used for extracting the parameter-optimized generator for multi-mode image generation.
6. A computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the multi-modal image generation method based on an adversarial generation network of any one of claims 1-4.
7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the multi-modal image generation method based on an adversarial generation network of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310699766.2A CN116433795B (en) | 2023-06-14 | 2023-06-14 | Multi-mode image generation method and device based on countermeasure generation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116433795A CN116433795A (en) | 2023-07-14 |
CN116433795B true CN116433795B (en) | 2023-08-29 |
Family
ID=87081926
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310699766.2A Active CN116433795B (en) | 2023-06-14 | 2023-06-14 | Multi-mode image generation method and device based on countermeasure generation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116433795B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110580695A (en) * | 2019-08-07 | 2019-12-17 | 深圳先进技术研究院 | multi-mode three-dimensional medical image fusion method and system and electronic equipment |
CN113205472A (en) * | 2021-04-21 | 2021-08-03 | 复旦大学 | Cross-modal MR image mutual generation method based on cyclic generation countermeasure network cycleGAN model |
CN114170118A (en) * | 2021-10-21 | 2022-03-11 | 北京交通大学 | Semi-supervised multi-mode nuclear magnetic resonance image synthesis method based on coarse-to-fine learning |
WO2022120762A1 (en) * | 2020-12-10 | 2022-06-16 | 中国科学院深圳先进技术研究院 | Multi-modal medical image generation method and apparatus |
CN114926382A (en) * | 2022-05-18 | 2022-08-19 | 深圳大学 | Generation countermeasure network for fused images, image fusion method and terminal equipment |
CN115601352A (en) * | 2022-11-04 | 2023-01-13 | 河北工业大学(Cn) | Medical image segmentation method based on multi-mode self-supervision |
WO2023020198A1 (en) * | 2021-08-16 | 2023-02-23 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus for medical image, and device and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210012486A1 (en) * | 2019-07-09 | 2021-01-14 | Shenzhen Malong Technologies Co., Ltd. | Image synthesis with generative adversarial network |
CN113449135B (en) * | 2021-08-31 | 2021-11-19 | 阿里巴巴达摩院(杭州)科技有限公司 | Image generation system and method |
Non-Patent Citations (1)
Title |
---|
Research progress on generative adversarial networks; Wang Wanliang; Li Zhuorong; Journal on Communications (通信学报) (02); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Singh et al. | Medical image generation using generative adversarial networks: A review | |
CN109191476B (en) | Novel biomedical image automatic segmentation method based on U-net network structure | |
Nie et al. | 3-D fully convolutional networks for multimodal isointense infant brain image segmentation | |
CN109978037B (en) | Image processing method, model training method, device and storage medium | |
CN110506278B (en) | Target detection in hidden space | |
CN106682435B (en) | System and method for automatically detecting lesion in medical image through multi-model fusion | |
RU2677764C2 (en) | Registration of medical images | |
Zhang et al. | LU-NET: An improved U-Net for ventricular segmentation | |
Arafati et al. | Artificial intelligence in pediatric and adult congenital cardiac MRI: an unmet clinical need | |
CN110517198B (en) | High-frequency sensitive GAN network for denoising LDCT image | |
WO2022121100A1 (en) | Darts network-based multi-modal medical image fusion method | |
JP2023540910A (en) | Connected Machine Learning Model with Collaborative Training for Lesion Detection | |
CN115496771A (en) | Brain tumor segmentation method based on brain three-dimensional MRI image design | |
Lee et al. | Reducing the model variance of a rectal cancer segmentation network | |
Song et al. | Brain tissue segmentation via non-local fuzzy c-means clustering combined with Markov random field | |
CN113362360B (en) | Ultrasonic carotid plaque segmentation method based on fluid velocity field | |
Yang et al. | Hierarchical progressive network for multimodal medical image fusion in healthcare systems | |
CN116433795B (en) | Multi-mode image generation method and device based on countermeasure generation network | |
CN111311531A (en) | Image enhancement method and device, console equipment and medical imaging system | |
Arega et al. | Using polynomial loss and uncertainty information for robust left atrial and scar quantification and segmentation | |
CN112950654B (en) | Brain tumor image segmentation method based on multi-core learning and super-pixel nuclear low-rank representation | |
Liao et al. | A fast spatial constrained fuzzy kernel clustering algorithm for MRI brain image segmentation | |
CN112561918A (en) | Convolutional neural network training method and focus segmentation method | |
Hu et al. | Single image super resolution of 3D MRI using local regression and intermodality priors | |
CN110570417A (en) | Pulmonary nodule classification method and device and image processing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||