CN116433795B - Multi-modal image generation method and device based on a generative adversarial network - Google Patents
Multi-modal image generation method and device based on a generative adversarial network
- Publication number
- CN116433795B CN116433795B CN202310699766.2A CN202310699766A CN116433795B CN 116433795 B CN116433795 B CN 116433795B CN 202310699766 A CN202310699766 A CN 202310699766A CN 116433795 B CN116433795 B CN 116433795B
- Authority
- CN
- China
- Prior art keywords
- mode
- image
- mode image
- images
- discriminator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
- G06T11/008—Specific post-processing after tomographic reconstruction, e.g. voxelisation, metal artifact correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
- G06T7/337—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-modal image generation method and device based on a generative adversarial network (GAN), comprising the following steps: acquiring a first-modality image and a second-modality image of the same target, and augmenting the first-modality image to obtain two augmented images; constructing a GAN comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images; constructing a contrastive loss between the features of the two intermediate feature maps, optimizing the parameters of the GAN by combining the contrastive loss with the original GAN loss, and extracting the parameter-optimized generator for multi-modal image generation, thereby improving image accuracy.
Description
Technical Field
The invention belongs to the technical field of cross-modal generation of medical images, and in particular relates to a multi-modal image generation method and device based on a generative adversarial network.
Background
Medical imaging is a powerful diagnostic and research tool that creates visual representations of anatomical structures and is widely used for disease diagnosis and surgical planning. In current clinical practice, computed tomography (CT) and magnetic resonance imaging (MRI) are most commonly used. Since CT and the various MR imaging modalities provide complementary information, effective integration of these modalities can help physicians make more informed decisions. Because paired multi-modal images are difficult to obtain, there is a growing need in clinical practice to develop multi-modal image generation to aid clinical diagnosis and therapy.
Medical image generation methods divide into traditional machine learning methods and deep learning methods. Traditional machine learning methods, such as random forests and k-nearest-neighbor algorithms, rely on explicit feature representations that are optimized iteratively. More recently, convolutional neural networks have been widely applied to image generation tasks and have achieved state-of-the-art performance through generative adversarial networks.
Current mainstream GAN-based models improve the discriminator by regularizing it, implicitly or explicitly, with methods such as gradient penalty, spectral normalization, contrastive learning, and consistency regularization.
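As context for the spectral-normalization technique named above (this sketch is not part of the patent): the idea is to estimate a weight matrix's largest singular value by power iteration and divide the weights by it, bounding the discriminator's Lipschitz constant. A minimal NumPy version:

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Scale W so that its largest singular value is approximately 1."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u                       # power iteration on W^T W
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v                     # approximate largest singular value
    return W / sigma

W = np.diag([4.0, 2.0, 1.0])              # toy weight matrix, sigma_max = 4
W_sn = spectral_normalize(W)
print(round(float(np.linalg.svd(W_sn, compute_uv=False)[0]), 3))  # ~1.0
```

In practice libraries apply this per layer inside the discriminator; the toy diagonal matrix here just makes the effect easy to verify.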
For example, the patent application with publication number CN112465118A discloses a low-rank generative adversarial network construction method for medical image generation, comprising the following steps: 1) approximating the full-rank convolution operation in the GAN model using principal components, and constructing a low-rank convolution operation based on the calculation rules of tensor CP decomposition; 2) using the low-rank convolution operation from step 1) to construct low-rank dimension-wise and low-rank channel-wise convolution layers that replace the full-rank convolution layers, adding ReLU activation functions and batch normalization between the low-rank convolution layers to adjust their data distribution, and thereby designing a low-rank generator; 3) integrating the low-rank generator with a full-rank discriminator to construct a complete low-rank generative adversarial network for medical images.
As another example, patent application CN113205567A discloses a deep-learning method for synthesizing CT images from MRI images, comprising the following steps: S1, selecting an original MRI image and an original CT image as the floating (moving) image and the reference image respectively, then applying N4 bias-field correction and normalization to obtain preprocessed MRI and CT images; S2, training a generative adversarial network for CT synthesis using the preprocessed MRI and CT images; S3, feeding a preprocessed MRI image into the trained network so that it is converted into a synthetic CT image.
In prior-art solutions such as the two patent applications above, regularization typically acts on the high-level, task-related features output by the discriminator, while the shallow features of the intermediate layers, such as color and texture, are often ignored; the accuracy of image synthesis therefore still leaves room for improvement.
Disclosure of Invention
In view of the foregoing, an object of the present invention is to provide a multi-modal image generation method and apparatus based on a generative adversarial network that applies contrastive learning to the shallow features of the discriminator, thereby improving the discriminator's sensitivity to picture style information and, in turn, the accuracy of multi-modal image generation.
To achieve the above object, an embodiment of the present invention provides a multi-modal image generation method based on a generative adversarial network, comprising the following steps:
acquiring a first-modality image and a second-modality image of the same target, and augmenting the first-modality image to obtain two augmented images;
constructing a generative adversarial network comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images;
constructing a contrastive loss between features based on the two intermediate feature maps, and optimizing the parameters of the generative adversarial network by combining the contrastive loss with the network's original loss;
extracting the parameter-optimized generator for multi-modal image generation.
Preferably, the first-modality image and the second-modality image of the same target are preprocessed by:
filtering the original first-modality image and the original second-modality image;
performing rigid registration, based on the target region, of the filtered original first-modality and second-modality images;
applying pixel normalization to the rigidly registered original first-modality and second-modality images respectively;
and performing target selection on the pixel-normalized original first-modality and second-modality images to obtain the first-modality image and second-modality image of the same target.
Preferably, the generator adopts the generator structure of the pix2pix model.
Preferably, the discriminator is a Markovian (PatchGAN) discriminator.
Preferably, the method further comprises:
at least two MLP layers are added for each intermediate feature map output by the intermediate layers of the discriminator; the intermediate feature maps are updated by the MLPs, and the updated feature maps participate in the contrastive loss calculation.
Preferably, constructing a contrastive loss between features based on the two intermediate feature maps comprises:
taking, as positive sample pairs, the two intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images at the same intermediate layer of the discriminator;
taking, as negative sample pairs, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images at different intermediate layers of the discriminator;
and constructing the contrastive loss based on the positive and negative sample pairs.
Preferably, the original loss of the generative adversarial network includes an L1 loss constructed from the predicted second-modality image and the second-modality image corresponding to the first-modality image, as well as the adversarial loss of the generator and the discriminator.
To achieve the above object, an embodiment further provides a multi-modal image generation device based on a generative adversarial network, comprising an acquisition module, a network construction module, a parameter optimization module, and an image generation module, wherein:
the acquisition module is used to acquire a first-modality image and a second-modality image of the same target, and to augment the first-modality image to obtain two augmented images;
the network construction module is used to construct a generative adversarial network comprising a generator and a discriminator, wherein the generator generates three predicted second-modality images from the first-modality image and its two augmented versions, the discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and the discriminator also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images;
the parameter optimization module is used to construct a contrastive loss between features based on the two intermediate feature maps, and to optimize the parameters of the generative adversarial network by combining the contrastive loss with the network's original loss;
the image generation module is used to extract the parameter-optimized generator for multi-modal image generation.
To achieve the above object, an embodiment further provides a computing device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above multi-modal image generation method based on a generative adversarial network.
To achieve the above object, an embodiment further provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the steps of the above multi-modal image generation method based on a generative adversarial network.
Compared with the prior art, the beneficial effects of the invention include at least the following:
On top of constructing the augmented images, the discriminator of the generative adversarial network computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images. A contrastive loss between the features of these two intermediate feature maps adds contrastive learning to the discriminator's intermediate layers, strengthening the discriminator's learning of shallow image features, improving its sensitivity to picture style information, and thereby improving the multi-modal generation accuracy of the generator. Moreover, the method can be applied straightforwardly to any other medical image generation algorithm, improving performance without changing the network structure of the original algorithm.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a multi-modal image generation method based on a generative adversarial network provided by an embodiment;
FIG. 2 is a flowchart of the modality image preprocessing provided by an embodiment;
FIG. 3 is a schematic diagram of the architecture of the generative adversarial network provided by an embodiment;
FIG. 4 is a schematic diagram of the structure of the generator provided by an embodiment;
FIG. 5 is a schematic diagram of a residual structure in the generator provided by an embodiment;
FIG. 6 is a schematic diagram of the structure of a multi-modal image generation device based on a generative adversarial network according to an embodiment;
FIG. 7 is a schematic diagram of the structure of an electronic device according to an embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.
Recent research has shown that classification models tend to learn picture style information expressed as texture; that is, if texture information suffices to achieve high classification accuracy, the model will not learn more complex representations. Because the discriminator in a GAN can likewise be regarded as a simple classifier, it also relies on picture texture information for its discrimination. The invention therefore provides a multi-modal image generation method and device based on a generative adversarial network, in which the designed network applies contrastive learning to the shallow features of the discriminator, improving the discriminator's sensitivity to picture style information and thereby improving the multi-modal generation results.
Fig. 1 is a flowchart of the multi-modal image generation method based on a generative adversarial network according to an embodiment. As shown in Fig. 1, the method includes the following steps:
s110, a first mode image and a second mode image of the same target are obtained, and the first mode image is enhanced to obtain two enhanced mode images.
In an embodiment, multi-modality image data are obtained from a hospital, including raw first-modality image data such as magnetic resonance (MR) images, raw second-modality image data such as computed tomography (CT) images, and a mask corresponding to a target region, such as a tumor region, in the image data. The CT images include the arterial phase (ART), portal venous phase (PV), non-contrast phase (NC), and delayed phase (DL); the MR images include the arterial phase (ART), delayed phase (DL), diffusion-weighted imaging (DWI), non-contrast phase (NC), portal venous phase (PV), and T2-weighted imaging (T2). The MR and CT data are in NIfTI (.nii) format, and the mask data are in NRRD format.
After the raw multi-modal image data are obtained, preprocessing is required to obtain the first-modality image and second-modality image of the same target; as shown in Fig. 2, this specifically includes:
s210, filtering the original first mode image and the original second mode image.
Specifically, for the CT image, the window is set to (-110, 190) according to the physician's prior knowledge, and filtering and denoising are performed using the np.clip function from the NumPy library. For the MR image, filtering and denoising are performed using the estimate_sigma and nlmeans functions from the dipy library.
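The CT windowing step above can be sketched as follows (a minimal illustration, not the patent's code; the dipy denoising calls for the MR image are omitted here to keep the sketch self-contained):

```python
import numpy as np

def window_ct(volume, low=-110.0, high=190.0):
    """Clip CT intensity values to the clinically chosen window (-110, 190)."""
    return np.clip(volume, low, high)

# toy 1-D "volume" of Hounsfield-like values
ct = np.array([-1000.0, -110.0, 35.0, 190.0, 3000.0])
print(window_ct(ct).tolist())  # [-110.0, -110.0, 35.0, 190.0, 190.0]
```

Values outside the window (air, metal artifacts) are saturated rather than removed, which suppresses extreme outliers before normalization.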
S220: perform rigid registration, based on the target region, of the filtered original first-modality and second-modality images.
Specifically, the original first-modality image of the same patient is taken as the moving image and the original second-modality image as the fixed image; the mask of the target region is used to compute the transformation between the target regions of the two modality images, and the resulting transformation is then applied to the whole moving image to obtain the registered first-modality image. Concretely, affine registration from dipy is used.
S230: apply pixel normalization to the rigidly registered original first-modality and second-modality images respectively.
Specifically, for the original first-modality image, the pixel values are normalized directly to [-1, 1] using linear normalization. For the original second-modality image, the pixel values are normalized to [-1, 1] using a standard-score (z-score) step followed by linear normalization.
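The two normalization variants described above can be sketched as (an illustrative NumPy version, not the patent's code):

```python
import numpy as np

def linear_normalize(x):
    """Linearly map intensities to [-1, 1]."""
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def zscore_then_linear(x):
    """Standard-score (z-score) the volume, then linearly map to [-1, 1]."""
    z = (x - x.mean()) / (x.std() + 1e-8)
    return linear_normalize(z)

vol = np.array([0.0, 50.0, 100.0])
print(linear_normalize(vol).tolist())    # [-1.0, 0.0, 1.0]
print(zscore_then_linear(vol).tolist())  # [-1.0, 0.0, 1.0]
```

The z-score step centers and rescales intensities before the final linear mapping, which makes volumes with very different intensity distributions comparable before they enter the network.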
S240: perform target selection on the pixel-normalized original first-modality and second-modality images to obtain the first-modality image and second-modality image of the same target.
Specifically, the slice index at which the target is largest is computed from the target-region mask data of the original first-modality and second-modality images respectively; based on this index, four slices above and four slices below it are selected, giving 9 slices in total as the first-modality and second-modality images of the same target.
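The slice-selection step can be sketched as follows (an illustrative version under the reading that "four slices on each side" of the largest-target slice gives 9 slices; function and variable names are hypothetical):

```python
import numpy as np

def select_slices(mask, span=4):
    """Pick the slice with the largest mask area, plus `span` slices on each side."""
    areas = mask.reshape(mask.shape[0], -1).sum(axis=1)  # per-slice target area
    center = int(np.argmax(areas))
    lo = max(center - span, 0)
    hi = min(center + span, mask.shape[0] - 1)
    return list(range(lo, hi + 1))

mask = np.zeros((20, 8, 8))
mask[10, 2:6, 2:6] = 1           # largest target lies on slice 10
print(select_slices(mask))        # [6, 7, 8, 9, 10, 11, 12, 13, 14]
```

This keeps the 9 slices most likely to contain the target while discarding slices with little anatomical overlap between the two modalities.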
After the first-modality image is obtained, it is augmented: two augmented images corresponding to the first-modality image are obtained by methods such as random cropping and random horizontal flipping. The first-modality image, its two augmented images, and the second-modality image belonging to the same target together form one sample.
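The augmentation step can be sketched as follows (an illustrative NumPy version; the 90% crop ratio and flip probability are assumptions, not values from the patent):

```python
import numpy as np

def augment(img, rng):
    """One random augmentation: random crop to 90% size plus random horizontal flip."""
    h, w = img.shape
    ch, cw = int(h * 0.9), int(w * 0.9)
    top = rng.integers(0, h - ch + 1)             # random crop position
    left = rng.integers(0, w - cw + 1)
    out = img[top:top + ch, left:left + cw]
    if rng.random() < 0.5:                        # random horizontal flip
        out = out[:, ::-1]
    return out

rng = np.random.default_rng(42)
img = np.arange(100.0).reshape(10, 10)
a1, a2 = augment(img, rng), augment(img, rng)     # the two augmented views
print(a1.shape, a2.shape)                         # (9, 9) (9, 9)
```

Calling the function twice on the same first-modality image yields the two augmented views x1 and x2 used later for the contrastive loss.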
S120: construct a generative adversarial network comprising a generator and a discriminator.
As shown in fig. 3, the constructed generative adversarial network comprises a generator and a discriminator. The generator generates three predicted second-modality images from the first-modality image and its two augmented versions. In an embodiment, the generator adopts the generator structure of the pix2pix model, as shown in fig. 4, and consists of three parts. The first part contains three downsampling modules, each comprising a convolution layer with kernel size 3, stride 2, and padding 1, an instance normalization layer, and a ReLU activation function. The second part contains nine residual modules; the network structure of each residual module is shown in fig. 5, which adds the idea of a recurrent neural network to the basic residual module. The choice of convolution layer, normalization layer, and activation function is consistent with the convolution layers of the first part's downsampling modules, and t=3 in fig. 5 denotes 3 recurrent iterations. The third part contains three upsampling modules, each comprising a transposed convolution layer with kernel size 3, stride 2, padding 1, and output_padding 1, an instance normalization layer, and an activation function; the activation function in the first two upsampling modules is ReLU, and in the last upsampling module it is Tanh.
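The down/upsampling parameters quoted above (kernel 3, stride 2, padding 1, output_padding 1) exactly halve and double the spatial size, so three of each restore the input resolution. A quick check with the standard size formulas (an illustration, assuming a 256-pixel input that the patent does not specify):

```python
def conv_out(n, k=3, s=2, p=1):
    """Output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k=3, s=2, p=1, op=1):
    """Output size of a transposed convolution: (n - 1)*s - 2p + k + op."""
    return (n - 1) * s - 2 * p + k + op

n = 256
for _ in range(3):      # three downsampling modules: 256 -> 128 -> 64 -> 32
    n = conv_out(n)
print(n)                 # 32
for _ in range(3):      # three upsampling modules: 32 -> 64 -> 128 -> 256
    n = deconv_out(n)
print(n)                 # 256
```

The output_padding of 1 is what makes the transposed convolution an exact inverse of the stride-2 convolution in terms of spatial size.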
The discriminator performs real/fake discrimination between the second-modality image and the predicted second-modality image corresponding to the first-modality image, and also computes and outputs, at its intermediate layers, the intermediate feature maps of the two predicted second-modality images corresponding to the two augmented images. In one embodiment, the discriminator is a Markovian (PatchGAN) discriminator comprising five modules: the first module comprises a convolution layer with kernel size 3, stride 2, and padding 1, followed by a LeakyReLU activation function; the second, third, and fourth modules each comprise a convolution layer with kernel size 3, stride 2, and padding 1, an instance normalization layer, and a LeakyReLU activation function; and the last module is a fully connected layer.
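A Markovian discriminator judges local patches rather than the whole image; each output unit's "patch" is its receptive field. For the four stride-2, kernel-3 convolution modules listed above, the receptive field can be computed with the standard recurrence (an illustrative calculation, not code from the patent):

```python
def receptive_field(layers):
    """Receptive field of stacked convs: r grows by (k - 1) * jump; jump *= stride."""
    r, jump = 1, 1
    for k, s in layers:
        r += (k - 1) * jump
        jump *= s
    return r

# four conv modules, each kernel 3 / stride 2, as in the discriminator above
print(receptive_field([(3, 2)] * 4))   # 31
```

So each unit of the final feature map classifies roughly a 31x31 patch of the input as real or fake, which is the Markovian ("patch-level") property exploited here.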
In the embodiment, at least two MLP layers are added for each intermediate feature map output by the intermediate layers of the discriminator; the intermediate feature maps are updated by the MLPs, and the updated feature maps participate in the contrastive loss calculation.
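The two-layer MLP projection over an intermediate feature map can be sketched as follows (a minimal NumPy sketch; the flattening, ReLU nonlinearity, unit-normalization, and all dimensions are illustrative assumptions):

```python
import numpy as np

def projection_head(feat, W1, W2):
    """Two-layer MLP over a flattened intermediate feature map, ReLU in between."""
    x = feat.reshape(-1)                        # flatten (C, H, W) -> vector
    h = np.maximum(W1 @ x, 0.0)                 # layer 1 + ReLU
    z = W2 @ h                                  # layer 2 -> embedding
    return z / (np.linalg.norm(z) + 1e-12)      # unit-normalize for cosine similarity

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))           # toy feature map: C=8, 4x4 spatial
W1 = rng.standard_normal((64, 128))             # 128 = 8*4*4 flattened inputs
W2 = rng.standard_normal((32, 64))
z = projection_head(feat, W1, W2)
print(z.shape)                                  # (32,)
```

The unit-normalized embedding is what enters the similarity terms of the contrastive loss computed in S130.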
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network.
In an embodiment, constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs.
Specifically, the first mode image x and its two corresponding enhanced mode images x1 and x2 are input into the generator to generate the predicted second mode images G(x), G(x1), and G(x2). G(x1) and G(x2) are each input into the discriminator to obtain intermediate feature maps from its N intermediate layers, 2N feature maps in total. Assuming the intermediate feature maps obtained by inputting G(x1) into the discriminator are numbered 1 to N, and those obtained by inputting G(x2) are numbered N+1 to 2N, then intermediate feature maps from the same intermediate layer form positive sample pairs and intermediate feature maps from different intermediate layers form negative sample pairs. The contrast loss is computed as follows:
For the positive sample pair numbered i and i+N, the contrast loss is:

ℓ_i = −log [ exp(sim(f_i, f_{i+N})) / Σ_{j=1, j≠i}^{2N} exp(sim(f_i, f_j)) ]

where 1[j≠i] denotes the indicator function, which takes the value 1 if and only if j ≠ i and 0 otherwise; sim(f_i, f_{i+N}) denotes the similarity between the i-th intermediate feature map and the (i+N)-th intermediate feature map; sim(f_i, f_j) denotes the similarity between the i-th and j-th intermediate feature maps; i ranges over 1 to N and j over 1 to 2N. The contrast loss over all positive sample pairs, L_contrast, is then:

L_contrast = (1/N) Σ_{i=1}^{N} ℓ_i
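The pairing scheme above can be sketched as an InfoNCE-style loss over the 2N layer embeddings. The temperature tau and the cosine similarity (via L2 normalization) are common contrastive-learning choices assumed here, not stated in the text; indices are 0-based in code, so the positive partner of embedding i is i+N.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(feats1, feats2, tau=0.07):
    """InfoNCE-style loss over 2N embeddings: the i-th embedding from G(x1)
    and the i-th from G(x2) (index i+N) form the positive pair; all other
    j != i are negatives. tau is an assumed temperature."""
    n = len(feats1)
    z = F.normalize(torch.stack(feats1 + feats2), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / tau                                  # pairwise similarities
    loss = 0.0
    for i in range(n):
        pos = sim[i, i + n]                                # positive-pair similarity
        # denominator runs over all j != i (the indicator function in the text)
        denom = torch.logsumexp(torch.cat([sim[i, :i], sim[i, i + 1:]]), dim=0)
        loss = loss + (denom - pos)                        # -log softmax of the positive
    return loss / n

f1 = [torch.randn(8) for _ in range(4)]   # N = 4 layer embeddings from G(x1)
f2 = [torch.randn(8) for _ in range(4)]   # N = 4 layer embeddings from G(x2)
print(contrastive_loss(f1, f2).item() > 0)
```

Because the positive term is included in the denominator, each ℓ_i is the negative log of a softmax probability and is therefore strictly positive for random embeddings.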
in an embodiment, the contrast loss is combined with the original loss of the countermeasure generation network to perform parameter optimization on the countermeasure generation network, and an adaptive moment estimation (Adam) optimizer is used to update the weights. The original loss of the countermeasure generation network comprises L1 loss constructed based on the predicted second mode image G (x) and the second mode image y corresponding to the first mode image, and further comprises countermeasure loss of the generator and the discriminator.
The L1 loss is expressed as:

L_L1 = E_{x,y}[ ||y − G(x)||_1 ]

The adversarial loss of the generator and discriminator is expressed as:

L_GAN(G, D) = E_y[log D(y)] + E_x[log(1 − D(G(x)))]

where ||·||_1 denotes the L1 norm, E(·) denotes the expectation, and D(·) denotes the discrimination result.

In summary, the loss function of the adversarial generation network model is:

L = λ1·L_GAN + λ2·L_L1 + λ3·L_contrast

where λ1, λ2, and λ3 are coefficients controlling the relative weight of each loss term.
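A sketch of how the three loss terms might be combined for one generator update. The lambda weights are assumptions (lam_l1=100 follows the common pix2pix setting, not necessarily this patent), and the binary cross-entropy form of the adversarial term matches the log D formulation above.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake, y_hat, y, l_contrast,
                   lam_adv=1.0, lam_l1=100.0, lam_con=1.0):
    """Weighted sum of the adversarial, L1, and contrastive terms.
    d_fake: discriminator logits for G(x); y_hat: G(x); y: real second-mode image."""
    # generator wants D(G(x)) judged real -> target of ones
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    l1 = F.l1_loss(y_hat, y)                  # ||y - G(x)||_1
    return lam_adv * adv + lam_l1 * l1 + lam_con * l_contrast

d_fake = torch.zeros(2, 1)                    # illustrative discriminator logits
y_hat, y = torch.rand(2, 1, 8, 8), torch.rand(2, 1, 8, 8)
loss = generator_loss(d_fake, y_hat, y, l_contrast=torch.tensor(0.5))
print(loss.item() > 0)
```

In training, this scalar would be backpropagated and the parameters updated with the Adam optimizer mentioned above.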
In an embodiment, after optimizing the adversarial generation network, three evaluation indexes are adopted to evaluate it: mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). Specifically, CT images were generated from MR images using sample data of 60 patients, and the results under these three indexes are shown in Table 1:
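Table 1 itself is not reproduced in this text, but the three indexes it reports can be computed as follows. The SSIM here is a single-window simplification of the standard locally windowed metric, and the unit data range is an assumption (intensities normalized to [0, 1]).

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two images."""
    return np.mean(np.abs(a - b))

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

def ssim_global(a, b, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification; the
    standard metric averages SSIM over local Gaussian windows)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2) /
            ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))

rng = np.random.default_rng(0)
y = rng.random((64, 64))                     # stand-in ground-truth CT slice
print(mae(y, y), round(ssim_global(y, y), 6))   # -> 0.0 1.0
```

A uniform offset of 0.1 on a unit data range gives an MSE of 0.01 and hence a PSNR of exactly 20 dB, a quick sanity check for the formula.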
analysis Table 1 shows that the three evaluation indexes of MAE, PSNR and SSIM of the generator obtained by the method are all superior to those of the generator generated by pix2pix GAN.
S140, extracting the parameter-optimized generator for multi-mode image generation.
After the adversarial generation network parameters are optimized in S130, the parameter-optimized generator is extracted and used for multi-mode image generation. Specifically, the first mode image is input into the parameter-optimized generator, and the predicted second mode image is obtained by computation; this predicted second mode image has higher accuracy.
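Inference with the extracted generator might look like the following. The one-layer stand-in generator and the commented checkpoint file name are illustrative assumptions; in practice the optimized generator from S130 would be loaded instead.

```python
import torch
import torch.nn as nn

# Stand-in for the optimized generator from S130; a real run would restore
# trained weights, e.g. generator.load_state_dict(torch.load("generator_optimized.pt"))
# (the file name is a hypothetical example).
generator = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Tanh())
generator.eval()

with torch.no_grad():                        # no gradients needed at inference time
    mr_slice = torch.randn(1, 1, 64, 64)     # first-modality input (e.g. an MR slice)
    ct_pred = generator(mr_slice)            # predicted second-modality image (e.g. CT)
print(tuple(ct_pred.shape))                  # -> (1, 1, 64, 64)
```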
In the multi-mode image generation method based on an adversarial generation network described above, the network includes a discriminator that uses contrastive learning, which enhances the discriminator's learning of shallow image features, improves its ability to discriminate texture information when judging images, and thereby improves the quality of the generated images.
It should be noted that this embodiment is based on only one kind of adversarial generation network; using the contrastive-learning discriminator of the present invention in other adversarial generation networks also falls within the protection scope of this patent.
Based on the same inventive concept, as shown in fig. 6, the embodiment further provides a multi-modal image generating apparatus 600 based on an adversarial generation network, including an acquisition module 610, a network construction module 620, a parameter optimization module 630, and an image generation module 640.
The acquisition module 610 is configured to acquire a first mode image and a second mode image of the same target, and to enhance the first mode image to obtain two enhanced mode images. The network construction module 620 is configured to construct an adversarial generation network including a generator and a discriminator, where the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images. The parameter optimization module 630 is configured to construct a contrast loss between features based on the two intermediate feature maps and to optimize the network parameters by combining the contrast loss with the original loss of the adversarial generation network. The image generation module 640 is configured to extract the parameter-optimized generator for multi-modal image generation.
It should be noted that when the multi-mode image generating device provided in the foregoing embodiment performs multi-mode image generation, the division into the above functional modules is used only as an example; in practice, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the terminal or server may be divided into different functional modules to complete all or part of the functions described above. In addition, the multi-mode image generating device and the multi-mode image generating method provided by the embodiments belong to the same concept; the detailed implementation of the device embodiment is described in the method embodiment and is not repeated here.
The embodiment provides a multi-mode image generating device based on an adversarial generation network, in which the network includes a discriminator that uses contrastive learning; this enhances the discriminator's learning of shallow image features, improves its ability to discriminate texture information when judging images, and thereby improves the quality of the generated images.
Based on the same inventive concept, an embodiment further provides a computing device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the multi-modal image generation method based on an adversarial generation network, including the following steps:
S110, acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images;
S120, constructing an adversarial generation network comprising a generator and a discriminator;
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with its original loss;
S140, extracting the parameter-optimized generator for multi-mode image generation.
As shown in fig. 7, at the hardware level the computing device provided by the embodiment includes, in addition to the processor and the memory, hardware required by other services, such as an internal bus, a network interface, and storage. The memory is a non-volatile memory; the processor reads the corresponding computer program from the non-volatile memory into memory and runs it to implement the multi-mode image generation method based on an adversarial generation network described in S110-S140. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded; that is, the execution subject of the processing flows is not limited to logic units, but may also be hardware or logic devices.
Based on the same inventive concept, the embodiment further provides a computer-readable storage medium having a program stored thereon which, when executed by a processor, implements the above multi-modal image generation method based on an adversarial generation network, specifically including the following steps:
S110, acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images;
S120, constructing an adversarial generation network comprising a generator and a discriminator;
S130, constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with its original loss;
S140, extracting the parameter-optimized generator for multi-mode image generation.
In embodiments, computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
The foregoing detailed description of the preferred embodiments and their advantages is merely illustrative of the presently preferred embodiments of the invention; any changes, additions, substitutions, and equivalents made to those embodiments are intended to be included within the protection scope of the invention.
Claims (7)
1. A multi-mode image generation method based on an adversarial generation network, characterized by comprising the following steps:
acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images, wherein the first mode image and the second mode image of the same target are obtained by preprocessing in the following manner: filtering the original first mode image and the original second mode image; performing rigid registration based on a target area on the filtered original first mode image and the filtered original second mode image; respectively performing pixel normalization on the rigidly registered original first mode image and original second mode image; and performing target selection on the pixel-normalized original first mode image and original second mode image to obtain the first mode image and the second mode image of the same target;
constructing an adversarial generation network comprising a generator and a discriminator, wherein the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images;
adding at least 2 MLP layers for each intermediate feature map output by the intermediate layers of the discriminator, wherein the intermediate feature maps undergo feature updating through the MLPs, and the updated intermediate feature maps participate in the contrast loss calculation;
constructing a contrast loss between features based on the two intermediate feature maps, and performing parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network, wherein constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs;
extracting the parameter-optimized generator for multi-mode image generation.
2. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the generator adopts the generator structure of the pix2pix (pixel-to-pixel) model.
3. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the discriminator is a Markov discriminator.
4. The multi-mode image generation method based on an adversarial generation network according to claim 1, wherein the original loss of the adversarial generation network includes an L1 loss constructed based on the predicted second mode image and the second mode image corresponding to the first mode image, and further includes the adversarial loss of the generator and the discriminator.
5. A multi-mode image generating device based on an adversarial generation network, characterized by comprising an acquisition module, a network construction module, a parameter optimization module, and an image generation module, wherein:
the acquisition module is used for acquiring a first mode image and a second mode image of the same target, and enhancing the first mode image to obtain two enhanced mode images, wherein the first mode image and the second mode image of the same target are obtained by preprocessing in the following manner: filtering the original first mode image and the original second mode image; performing rigid registration based on a target area on the filtered original first mode image and the filtered original second mode image; respectively performing pixel normalization on the rigidly registered original first mode image and original second mode image; and performing target selection on the pixel-normalized original first mode image and original second mode image to obtain the first mode image and the second mode image of the same target;
the network construction module is used for constructing an adversarial generation network comprising a generator and a discriminator, wherein the generator generates three predicted second mode images based on the first mode image and its two enhanced mode images, the discriminator performs true/false discrimination on the second mode image and the predicted second mode image corresponding to the first mode image, and the discriminator also computes and outputs, at its intermediate layers, two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images;
the parameter optimization module is configured to add at least 2 MLP layers for each intermediate feature map output by the intermediate layers of the discriminator, perform feature updating on the intermediate feature maps through the MLPs so that the updated intermediate feature maps participate in the contrast loss calculation, construct a contrast loss between features based on the two intermediate feature maps, and perform parameter optimization on the adversarial generation network by combining the contrast loss with the original loss of the adversarial generation network, wherein constructing the contrast loss between features based on the two intermediate feature maps includes: taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at the same intermediate layer of the discriminator as a positive sample pair; taking the two intermediate feature maps of the two predicted second mode images corresponding to the two enhanced mode images at different intermediate layers of the discriminator as a negative sample pair; and constructing the contrast loss based on the positive and negative sample pairs;
the image generation module is used for extracting the parameter-optimized generator for multi-mode image generation.
6. A computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the multi-modal image generation method based on an adversarial generation network of any one of claims 1-4.
7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the multi-modal image generation method based on an adversarial generation network of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310699766.2A CN116433795B (en) | 2023-06-14 | 2023-06-14 | Multi-mode image generation method and device based on countermeasure generation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116433795A CN116433795A (en) | 2023-07-14 |
CN116433795B true CN116433795B (en) | 2023-08-29 |
Family
ID=87081926
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310699766.2A Active CN116433795B (en) | 2023-06-14 | 2023-06-14 | Multi-mode image generation method and device based on countermeasure generation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116433795B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110580695A (en) * | 2019-08-07 | 2019-12-17 | 深圳先进技术研究院 | multi-mode three-dimensional medical image fusion method and system and electronic equipment |
CN113205472A (en) * | 2021-04-21 | 2021-08-03 | 复旦大学 | Cross-modal MR image mutual generation method based on cyclic generation countermeasure network cycleGAN model |
CN114170118A (en) * | 2021-10-21 | 2022-03-11 | 北京交通大学 | Semi-supervised multi-mode nuclear magnetic resonance image synthesis method based on coarse-to-fine learning |
WO2022120762A1 (en) * | 2020-12-10 | 2022-06-16 | 中国科学院深圳先进技术研究院 | Multi-modal medical image generation method and apparatus |
CN114926382A (en) * | 2022-05-18 | 2022-08-19 | 深圳大学 | Generation countermeasure network for fused images, image fusion method and terminal equipment |
CN115601352A (en) * | 2022-11-04 | 2023-01-13 | 河北工业大学(Cn) | Medical image segmentation method based on multi-mode self-supervision |
WO2023020198A1 (en) * | 2021-08-16 | 2023-02-23 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus for medical image, and device and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210012486A1 (en) * | 2019-07-09 | 2021-01-14 | Shenzhen Malong Technologies Co., Ltd. | Image synthesis with generative adversarial network |
CN113449135B (en) * | 2021-08-31 | 2021-11-19 | 阿里巴巴达摩院(杭州)科技有限公司 | Image generation system and method |
Non-Patent Citations (1)
Title |
---|
Research progress on generative adversarial networks; Wang Wanliang; Li Zhuorong; Journal on Communications (通信学报) (02); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Singh et al. | Medical image generation using generative adversarial networks: A review | |
CN109191476B (en) | Novel biomedical image automatic segmentation method based on U-net network structure | |
Nie et al. | 3-D fully convolutional networks for multimodal isointense infant brain image segmentation | |
CN109978037B (en) | Image processing method, model training method, device and storage medium | |
CN110506278B (en) | Target detection in hidden space | |
CN106682435B (en) | System and method for automatically detecting lesion in medical image through multi-model fusion | |
RU2677764C2 (en) | Registration of medical images | |
Zhang et al. | LU-NET: An improved U-Net for ventricular segmentation | |
Arafati et al. | Artificial intelligence in pediatric and adult congenital cardiac MRI: an unmet clinical need | |
CN110517198B (en) | High-frequency sensitive GAN network for denoising LDCT image | |
WO2022121100A1 (en) | Darts network-based multi-modal medical image fusion method | |
JP2023540910A (en) | Connected Machine Learning Model with Collaborative Training for Lesion Detection | |
CN115496771A (en) | Brain tumor segmentation method based on brain three-dimensional MRI image design | |
Lee et al. | Reducing the model variance of a rectal cancer segmentation network | |
Song et al. | Brain tissue segmentation via non-local fuzzy c-means clustering combined with Markov random field | |
CN113362360B (en) | Ultrasonic carotid plaque segmentation method based on fluid velocity field | |
Yang et al. | Hierarchical progressive network for multimodal medical image fusion in healthcare systems | |
CN116433795B (en) | Multi-mode image generation method and device based on countermeasure generation network | |
CN111311531A (en) | Image enhancement method and device, console equipment and medical imaging system | |
Arega et al. | Using polynomial loss and uncertainty information for robust left atrial and scar quantification and segmentation | |
CN112950654B (en) | Brain tumor image segmentation method based on multi-core learning and super-pixel nuclear low-rank representation | |
Liao et al. | A fast spatial constrained fuzzy kernel clustering algorithm for MRI brain image segmentation | |
CN112561918A (en) | Convolutional neural network training method and focus segmentation method | |
Hu et al. | Single image super resolution of 3D MRI using local regression and intermodality priors | |
CN110570417A (en) | Pulmonary nodule classification method and device and image processing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||