CN116385330B - Multi-mode medical image generation method and device guided by graph knowledge - Google Patents


Info

Publication number: CN116385330B (granted); earlier publication CN116385330A; application number CN202310661539.0A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: image, generation model, generator, adversarial generation
Legal status: Active
Inventors: 张楚杰, 胡季宏, 王伟彬, 陈延伟, 童若锋, 林兰芬, 李劲松
Assignee (original and current): Zhejiang Lab
Application filed by Zhejiang Lab


Classifications

    • G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06N3/0475: Generative networks
    • G06N3/094: Adversarial learning
    • G06T2207/10081: Computed x-ray tomography [CT]
    • G06T2207/10088: Magnetic resonance imaging [MRI]
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • Y02A90/30: Assessment of water resources


Abstract

The invention discloses a multi-modal medical image generation method and device guided by graph knowledge. The method first acquires MR and CT images and preprocesses them to construct a data set; it then constructs a graph-knowledge-guided adversarial generation model based on a generative adversarial network and trains it with the training set; the trained generator is taken from the trained adversarial generation model, its performance is evaluated on the verification set with evaluation indices, and the generator's parameters are adjusted according to the evaluation results to obtain the optimal generator; finally, a source domain image, or a source domain image from the test set, is input into the optimal generator to obtain the generated target domain image. The method can capture cross-region and cross-image relationships as context and compensation information, constrain the adversarial direction, and thereby improve the multi-modal generation results and the quality of the generated images.

Description

Multi-mode medical image generation method and device guided by graph knowledge
Technical Field
The invention relates to the technical field of computer technology and cross-modal registration of medical images, and in particular to a multi-modal medical image generation method and device guided by graph knowledge.
Background
Medical imaging is a powerful diagnostic and research tool that creates visual representations of anatomical structures and has been widely used for disease diagnosis and surgical planning. In current clinical practice, computed tomography (CT) and magnetic resonance (MR) imaging are the most commonly used modalities. Since CT and the various MR imaging modalities provide complementary information, effectively integrating these different modalities can help the physician make more informed decisions. On the other hand, children and elderly people often cannot undergo CT imaging because of the radiation involved, and in clinical practice there is an increasing need for multi-modal image generation to assist clinical diagnosis and treatment.
Currently, mainstream adversarial generation networks generally adopt a Markovian discriminator which, when distinguishing real images from generated ones, considers only the relationship between the real image and the generated image at the same position. The relationships among anatomical regions within an image, and the relationships between images, are not considered. In fact, because of the prerequisites, timing, etc. of medical image acquisition, different modalities vary greatly, and disregarding the relationships between anatomical regions may lead to incorrect adversarial signals. Anatomical information of medical images can guide image generation, and studies have shown that taking into account the complementary information between anatomical regions and across multiple images leads to better generation results. To address these problems, the invention designs a graph-knowledge-guided discriminator for the adversarial generation network, aiming to let the discriminator capture cross-region and cross-image relationships as context and compensation information, constrain the adversarial direction, and further improve the multi-modal generation results.
Disclosure of Invention
The invention aims to provide a multi-modal medical image generation method and device guided by graph knowledge, addressing the defects of the prior art.
The aim of the invention is realized by the following technical scheme: the first aspect of the embodiments of the invention provides a multi-modal medical image generation method guided by graph knowledge, comprising the following steps:
(1) Acquiring a magnetic resonance image and a computed tomography image;
(2) Preprocessing the magnetic resonance image and the computed tomography image, constructing a data set from the preprocessed images, and dividing the data set into a training set, a verification set and a test set;
(3) Constructing an adversarial generation model guided by graph knowledge on the basis of a generative adversarial network, and training the adversarial generation model with the training set to obtain a trained adversarial generation model;
(4) Taking the trained generator from the trained adversarial generation model, evaluating its performance on the verification set with evaluation indices, and adjusting the generator's parameters according to the evaluation results to obtain the optimal generator;
(5) Inputting a source domain image, or a source domain image from the test set, into the optimal generator to obtain the generated target domain image.
Further, the preprocessing comprises the following steps:
(2.1) resampling: resampling the magnetic resonance image and the computed tomography image;
(2.2) adjusting the window width and the window level: adjusting window width and window levels of the magnetic resonance image and the computed tomography image to obtain a denoised magnetic resonance image and a denoised computed tomography image;
(2.3) normalization: normalizing pixel values of the magnetic resonance image and the computed tomography image;
(2.4) selecting data: the magnetic resonance image and the computed tomography image are selected as a set of data.
Further, the step (3) includes the following substeps:
(3.1) constructing an adversarial generation model guided by graph knowledge on the basis of a generative adversarial network, wherein the adversarial generation model comprises a generator, a discriminator and a graph-knowledge-guided regularization module;
(3.2) training the adversarial generation model using the training set, and updating the parameters of the adversarial generation model according to its loss to obtain a trained adversarial generation model.
Further, the generator comprises a downsampling module, a residual module and an upsampling module; the downsampling module comprises a convolution layer, an activation function and a normalization layer; the residual module comprises a convolution layer, a normalization layer and an activation function; the upsampling module comprises a deconvolution layer, a normalization layer and an activation function.
Further, the discriminator includes a first base module, a second base module, and a fully-connected layer; the first basic module comprises a convolution layer and an activation function; the second base module includes a convolution layer, a normalization layer, and an activation function.
Further, the step (3.2) specifically comprises: setting the number of iterations and the learning rate, using an optimizer to train and update the weight parameters of the adversarial generation model, setting the number of samples selected for each training step, inputting the source domain images and the real target domain images of the training set into the adversarial generation model for iterative training, and updating the weight parameters of the adversarial generation model according to its loss to obtain a trained adversarial generation model.
Further, the loss of the adversarial generation model includes a pixel-level loss between the generated image and the real image, the adversarial loss of the generator, the adversarial loss of the discriminator, and the regularization constraint loss of the discriminator, the latter comprising a regularization constraint loss between images and a regularization constraint loss within an image.
Further, the evaluation indices include the learned perceptual image patch similarity (LPIPS) of the image, the Fréchet Inception Distance (FID), and the peak signal-to-noise ratio (PSNR) of the image.
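Of the three evaluation indices, LPIPS and FID require pretrained networks, but PSNR can be computed directly. A minimal sketch (assuming images normalized to [-1, 1], so the data range is 2.0):

```python
import numpy as np

def psnr(generated, real, data_range=2.0):
    """Peak signal-to-noise ratio; data_range is 2.0 for images in [-1, 1]."""
    mse = np.mean((generated - real) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

For example, a uniform error of 0.2 over the whole image gives an MSE of 0.04 and hence a PSNR of 10·log10(4/0.04) = 20 dB.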
The second aspect of the embodiments of the invention provides a multi-modal medical image generation device guided by graph knowledge, comprising one or more processors, used for realizing the above multi-modal medical image generation method guided by graph knowledge.
The third aspect of the embodiments of the invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the above multi-modal medical image generation method guided by graph knowledge.
The invention has the beneficial effects that it designs a novel discriminator guided by graph knowledge, which avoids the sample-selection problem of contrastive positive and negative examples and requires no additional manual labeling cost; the discriminator can capture cross-region and cross-image relationships as context and compensation information, constrain the adversarial direction, improve the multi-modal generation results and thus the quality of the generated images; and the graph-knowledge-guided discriminator is applicable to all current generative adversarial networks and can improve the quality of their generated images.
Drawings
FIG. 1 is a flowchart of the multi-modal medical image generation method guided by graph knowledge in an embodiment of the invention;
FIG. 2 is a schematic diagram of the network structure of the adversarial generation model provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the network architecture of the generator according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the network architecture of a residual module in the generator network in an embodiment of the invention;
FIG. 5 is a schematic diagram of the network architecture of the discriminator according to an embodiment of the invention;
FIG. 6 is the MR and CT image preprocessing flow in an embodiment of the invention;
FIG. 7 is a schematic structural diagram of the multi-modal medical image generation device guided by graph knowledge according to the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms, which are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information as first information, without departing from the scope of the invention. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to a determination".
The present invention will be described in detail with reference to the accompanying drawings. The features of the examples and embodiments described below may be combined with each other without conflict.
Referring to fig. 1, the method for generating a multi-modal medical image guided by graph knowledge according to the present invention specifically includes the following steps:
(1) A magnetic resonance (MR) image and a computed tomography (CT) image are acquired, together with a mask of the tumor region in the corresponding images.
In this embodiment, private hospital data is used, containing magnetic resonance (MR) images and computed tomography (CT) images of 305 patients. The CT images include the arterial phase (ART), portal venous phase (PV), non-contrast phase (NC) and delay phase (DL); the MR images include the arterial phase (ART), delay phase (DL), diffusion-weighted imaging (DWI), non-contrast phase (NC), portal venous phase (PV) and T2-weighted imaging (T2). The MR and CT images are stored in the NIfTI (.nii) format.
(2) Preprocessing an MR image and a CT image, constructing a data set according to the preprocessed MR image and the preprocessed CT image, and dividing the data set into a training set, a verification set and a test set.
In this embodiment, the data set may be divided into a training set, a verification set, and a test set according to a certain proportion according to actual needs, for example, 6:2:2, etc.; the data may also be randomly extracted to construct training sets, validation sets, and test sets.
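The 6:2:2 division mentioned above can be sketched as follows; the function name and random seed are illustrative assumptions, not part of the original method:

```python
import random

def split_dataset(patient_ids, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly split patient IDs into training, verification and test sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)          # deterministic shuffle for reproducibility
    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```

For the 305 patients of this embodiment, a 6:2:2 split yields 183, 61 and 61 patients respectively.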
As shown in fig. 6, the specific flow of the pretreatment includes the following steps:
(2.1) resampling: resampling the MR image and the CT image.
In this embodiment, since the original MR and CT images have different layer thicknesses, resampling is required; bilinear interpolation may be used to resample the MR and CT images to the same layer thickness, for example to 1. It should be understood that bilinear interpolation is widely used in digital image and video processing; its core idea is to perform linear interpolation in each of the two directions in turn. Of course, other methods may also be used to resample the MR and CT images, such as the resample_img function in the nilearn library.
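The core idea of bilinear interpolation, linear interpolation performed in each of the two directions in turn, can be sketched in plain NumPy (a minimal 2D version; real pipelines would use a library routine such as nilearn's resampling):

```python
import numpy as np

def bilinear_resample(img, out_shape):
    """Resample a 2D image to out_shape using bilinear interpolation."""
    h, w = img.shape
    oh, ow = out_shape
    # Map each output pixel back to fractional input coordinates
    ys = np.linspace(0, h - 1, oh)
    xs = np.linspace(0, w - 1, ow)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    # Interpolate along x on the two bracketing rows, then along y
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

Resampling to the input's own shape is the identity, and a constant image stays constant, which are quick sanity checks for any interpolation code.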
(2.2) adjusting the window width and the window level: and adjusting window width levels of the MR image and the CT image to acquire the denoised MR image and CT image.
In this embodiment, for CT images the window width may be set according to the doctor's prior knowledge to (-110, 190), with the corresponding window level (40, 300); one value is generally selected for the setting. A truncation method may be used to obtain the denoised CT image, for example the np.clip function in the numpy library. For MR images, since a fixed value cannot be chosen to adjust the window width and level, an image denoising method may be used instead to obtain the denoised MR image, such as the estimate_sigma and nlmeans algorithms in the dipy library.
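The CT truncation step amounts to clipping Hounsfield values to the chosen window; a minimal sketch using the window bounds quoted above:

```python
import numpy as np

def window_ct(ct_slice, low=-110.0, high=190.0):
    """Truncate CT intensity values to a diagnostic window via np.clip."""
    return np.clip(ct_slice, low, high)
```

Values below the lower bound collapse to -110 and values above the upper bound to 190, removing extreme outliers (air, metal) before normalization.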
(2.3) normalization: pixel values of the MR image and the CT image are normalized.
Specifically, for CT images the pixel values are normalized directly by linear normalization, for example to [-1, 1], which is convenient for subsequent computation. For MR images, the pixel values are first normalized using the z-score and then further normalized linearly, likewise to [-1, 1]. It should be appreciated that z-score standardization is a common data-processing method by which data of different magnitudes can be converted to a uniform z-score measure for comparison; of course, other normalization methods may also be used to normalize the pixel values, such as zero-mean normalization, and the like.
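The two normalization paths described above can be sketched as follows (the CT window bounds reuse the values from the windowing step; the small epsilon guards against division by zero and is an implementation assumption):

```python
import numpy as np

def normalize_ct(img, low=-110.0, high=190.0):
    """Linear normalization of a windowed CT slice to [-1, 1]."""
    return 2.0 * (img - low) / (high - low) - 1.0

def normalize_mr(img, eps=1e-8):
    """z-score first, then linear rescaling of the result to [-1, 1]."""
    z = (img - img.mean()) / (img.std() + eps)
    return 2.0 * (z - z.min()) / (z.max() - z.min() + eps) - 1.0
```

After either path, the window bounds (for CT) or the extreme z-scores (for MR) map to -1 and 1, matching the Tanh output range of the generator.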
(2.4) selecting data: MR images and CT images are selected as a set of data.
In this embodiment, the MR image and the CT image of the same slice are selected as one set of data; data selected in this way is referred to as "same-slice" data. For example, if the 26th slice of the MR image is selected, then the 26th slice of the CT image should also be selected.
(3) Constructing an adversarial generation model guided by graph knowledge on the basis of a generative adversarial network (GAN), and training the adversarial generation model with the training set to obtain a trained adversarial generation model.
(3.1) Constructing the graph-knowledge-guided adversarial generation model based on the generative adversarial network, wherein the adversarial generation model comprises a generator, a discriminator and a graph-knowledge-guided regularization module.
In this embodiment, the generator adopts a network structure based on the generator in pix2pix GAN, as shown in fig. 3. The generator comprises three parts. The first part consists of three downsampling modules, each comprising a convolution layer (Conv), an activation function and a normalization layer (Instance Normalization), where the kernel size of the convolution layer is 3, the stride is 2 and the padding is 1. The second part consists of nine residual modules, each comprising a convolution layer (Conv), a normalization layer (InstanceNorm) and an activation function; their structure is shown in fig. 4. The third part consists of three upsampling modules, each comprising a deconvolution layer, a normalization layer (Instance Normalization) and an activation function, where the kernel size of the deconvolution layer is 3, the stride is 2 and the output_padding is 1.
Preferably, the activation functions in the three downsampling modules, the nine residual modules and the first two upsampling modules are ReLU activation functions; the activation function in the last up-sampling module is the Tanh activation function.
It should be understood that the Tanh activation function is computationally expensive: its derivative involves division, so backpropagating the error gradient is relatively costly, whereas the ReLU activation function greatly reduces the computation of the whole process. For a deep network, the Tanh function is also prone to vanishing gradients during backpropagation, which can prevent training from completing. ReLU sets the output of some neurons to 0, which makes the network sparse, reduces the interdependence of parameters and alleviates overfitting. Since the input image is normalized to [-1, 1], the Tanh function is used at the output layer so that the output remains between -1 and 1, while the activation functions of the other layers use ReLU. Instance Normalization normalizes each image in a batch individually.
Specifically, the three downsampling modules and nine residual modules of the generator serve as its encoding part, extracting the features of the input image; the three upsampling modules serve as its decoding part, generating the corresponding image from the feature map. A source domain image (i.e., an MR or CT image) is input into the generator; its features are progressively extracted by the downsampling and residual modules to yield a feature map of the source domain image, from which the generated target domain image is obtained through the three upsampling modules.
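With kernel size 3, stride 2 and padding 1, each downsampling convolution halves the spatial size (for even inputs), and the matching deconvolution with output_padding 1 doubles it back. A quick check of the arithmetic, assuming an illustrative 256×256 input (the input size is not specified in the text):

```python
def conv_out(n, k=3, s=2, p=1):
    # Convolution output size: floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

def deconv_out(n, k=3, s=2, p=1, output_padding=1):
    # Deconvolution output size: (n - 1) * s - 2p + k + output_padding
    return (n - 1) * s - 2 * p + k + output_padding

size = 256
for _ in range(3):           # three downsampling modules
    size = conv_out(size)    # 256 -> 128 -> 64 -> 32
for _ in range(3):           # three upsampling modules
    size = deconv_out(size)  # 32 -> 64 -> 128 -> 256
```

The encoder thus reduces a 256×256 slice to a 32×32 feature map, which the decoder restores to the original resolution.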
In this embodiment, the discriminator adopts a Markovian discriminator (PatchGAN) as its basic structure, as shown in fig. 5. The discriminator comprises a first basic module, three second basic modules and a fully-connected layer. The first basic module comprises a convolution layer and an activation function, where the kernel size of the convolution layer is 3, the stride is 2 and the padding is 1; each second basic module comprises a convolution layer, a normalization layer (Instance Normalization) and an activation function, with the same kernel size 3, stride 2 and padding 1.
Preferably, the activation function in the first base module and the second base module is a LeakyReLU activation function.
It should be appreciated that the LeakyReLU activation function is chosen for the discriminator because, for negative inputs, it gives the input a small slope; on the basis of solving the zero-gradient problem for negative inputs, this alleviates the problem of dead neurons.
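The behavior described above can be shown in a one-line sketch (the slope value 0.2 is an illustrative assumption; the text does not specify it):

```python
import numpy as np

def leaky_relu(x, negative_slope=0.2):
    """LeakyReLU: identity for x >= 0, a small slope for x < 0."""
    return np.where(x >= 0, x, negative_slope * x)
```

Unlike ReLU, a negative input still produces a small nonzero gradient, so the corresponding neuron keeps learning.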
In this embodiment, the graph-knowledge-guided regularization module is used to calculate the similarities between feature maps and within feature maps according to hash codes and the Hamming distance.
Specifically, the generated target domain image enters the discriminator; a feature map corresponding to the image's features is obtained through the first and second basic modules, the feature map passes through the fully-connected layer to yield a first prediction result, and the first prediction result and the corresponding first feature map are output. Similarly, the real target domain image is input into the discriminator to obtain a second prediction result and a corresponding second feature map. Whether the target domain image produced by the generator is good, i.e., how similar the generated image is to the real image, is judged from the first and second prediction results. The graph-knowledge-guided regularization module then calculates the similarities within the first feature map, within the second feature map, and between the first and second feature maps.
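The hash-and-Hamming-distance similarity at the heart of the regularization module can be sketched as follows. The exact hashing scheme is not specified in the text, so this sketch assumes a simple mean-threshold binarization of each channel; the function names are illustrative:

```python
import numpy as np

def channel_hash(feature_map):
    """One binary hash code per channel of a (C, H, W) feature map,
    thresholding each channel at its own mean activation (assumed scheme)."""
    flat = feature_map.reshape(feature_map.shape[0], -1)
    return (flat > flat.mean(axis=1, keepdims=True)).astype(np.uint8)

def hamming(code_a, code_b):
    """Hamming distance between two binary hash codes."""
    return int(np.count_nonzero(code_a != code_b))

def cross_image_similarity(feat_gen, feat_real, eps=1e-6):
    """Per-channel similarity between generated and real feature maps:
    larger when the channel hash codes agree (smaller Hamming distance)."""
    hg, hr = channel_hash(feat_gen), channel_hash(feat_real)
    return np.array([1.0 / (hamming(a, b) + eps) for a, b in zip(hg, hr)])
```

Binarizing the channels makes the comparison cheap (bitwise disagreement counts) and invariant to the absolute scale of the activations.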
(3.2) Training the adversarial generation model using the training set, and updating its parameters according to its loss to obtain a trained adversarial generation model.
When training the adversarial generation model with the training set, the number of iterations and the learning rate are set, the optimizer is used to train and update the model's weight parameters, and the number of samples selected per training step (batch size) is set, for example to 1; the source domain images and real target domain images of the training set are input into the adversarial generation model for iterative training, and its weight parameters are updated according to its loss to obtain the trained adversarial generation model.
Preferably, the number of iterations is set to 80 and the learning rate is set to 0.0001.
Further, the optimizers include the adaptive moment estimation (Adam) optimizer, the AdaGrad optimizer, the RMSProp optimizer, and the like. It should be understood that an appropriate optimizer can be selected according to actual requirements: the Adam optimizer dynamically adjusts the learning rate of each parameter using the first and second moment estimates of the gradient; the AdaGrad optimizer adjusts the learning rate of each model parameter independently, applying large updates to sparse parameters and small updates to frequent ones, making it suitable for sparse data; and the RMSProp optimizer uses an exponentially weighted moving average instead of the accumulated sum of squared gradients, addressing the problem of that sum growing without bound.
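The Adam update described above, first and second moment estimates of the gradient with bias correction, can be written out for a single step (lr matches the 0.0001 learning rate of this embodiment; the other hyperparameters are the standard Adam defaults, an assumption here):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameter w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

On the very first step the bias correction makes the effective step size approximately lr regardless of the gradient's magnitude, which is part of why Adam is robust to scale.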
As shown in fig. 2, the adversarial generation model as a whole adopts the network structure of pix2pix GAN. Specifically, when a CT image is generated from an MR image, the MR image in the training set is taken as the source domain image and the CT image corresponding to it as the real target domain image. The MR image is input into the adversarial model and first passes through the generator to obtain a generated CT image; the generated CT image and the corresponding real CT image are each concatenated with the input MR image to form the discriminator's input images. That is, the MR image and the generated CT image are input into the discriminator together to obtain the first prediction result and first feature map, the MR image and the real CT image are input together to obtain the second prediction result and second feature map, and the first and second feature maps are then input into the regularization module to calculate the regularization constraint loss.
Similarly, when an MR image is generated from a CT image, the CT image in the training set is taken as the source domain image and the MR image corresponding to it as the real target domain image. The CT image first passes through the generator to obtain a generated MR image; the generated MR image and the corresponding real MR image are each concatenated with the input CT image as the discriminator's input images. That is, the CT image and the generated MR image are input into the discriminator together to obtain the first prediction result and first feature map, the CT image and the real MR image are input together to obtain the second prediction result and second feature map, and the first and second feature maps are then input into the regularization module to calculate the regularization constraint loss.
It should be understood that a conditional GAN inputs the corresponding label (the real image) into the discriminator together with the input, which is well known in the art.
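The concatenation step above is a simple channel-wise stack of source and target images before they enter the discriminator; a minimal sketch (the 256×256 slice size is an illustrative assumption):

```python
import numpy as np

# Source slice and generated target slice, each with a single channel (C, H, W)
mr_slice = np.zeros((1, 256, 256))   # source domain image
fake_ct = np.ones((1, 256, 256))     # generator output for the same slice

# Channel-wise concatenation, as in conditional GANs: the discriminator
# sees the source and the (generated or real) target as one 2-channel input
disc_input = np.concatenate([mr_slice, fake_ct], axis=0)
```

The same concatenation is performed with the real CT image to build the second discriminator input.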
In this embodiment, the loss of the adversarial generation model includes the pixel-level loss between the generated image and the real image, the adversarial loss of the generator, the adversarial loss of the discriminator, and the regularization constraint loss of the discriminator, where the latter includes a regularization constraint loss between images and a regularization constraint loss within an image.
Further, the pixel-level loss between the generated image and the real image is expressed as:
$$\mathcal{L}_{pixel}=\mathbb{E}_{x,y}\left[\left\|y-G(x)\right\|_{1}\right]$$
wherein $\mathcal{L}_{pixel}$ represents the pixel-level loss between the generated image and the real image, $x$ represents the source domain image, $G(x)$ represents the generated target domain image, and $y$ represents the real target domain image.
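As a minimal sketch (assuming the common L1 form used by pix2pix-style models), the pixel-level loss can be computed as the mean absolute difference:

```python
import numpy as np

def pixel_l1_loss(generated, real):
    """Mean absolute (L1) pixel-level difference between generated and real images."""
    return float(np.mean(np.abs(generated - real)))
```

L1 is usually preferred over L2 here because it penalizes large residuals less aggressively and tends to produce less blurry images.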
Further, the expressions of the adversarial losses of the generator and the discriminator are respectively:

$$\mathcal{L}_{G} = \mathbb{E}_{x}\left[\log\left(1 - D\left(x, G(x)\right)\right)\right]$$

$$\mathcal{L}_{D} = -\mathbb{E}_{x,y}\left[\log D\left(x, y\right)\right] - \mathbb{E}_{x}\left[\log\left(1 - D\left(x, G(x)\right)\right)\right]$$

where $\mathcal{L}_{G}$ denotes the adversarial loss of the generator, $\mathcal{L}_{D}$ denotes the adversarial loss of the discriminator, $D$ denotes the discriminator, $x$ denotes the source domain image, and $y$ denotes the real target domain image.
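Assuming the standard conditional-GAN binary cross-entropy form (the discriminator outputs a probability that the input pair is real), the two adversarial losses can be sketched as:

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy; pred are discriminator probabilities in (0, 1)."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))

def generator_adv_loss(d_fake):
    """Generator wants the discriminator to label fake pairs as real (1)."""
    return bce(d_fake, np.ones_like(d_fake))

def discriminator_adv_loss(d_real, d_fake):
    """Discriminator wants real pairs labeled 1 and fake pairs labeled 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
```

Here `d_real` and `d_fake` are the discriminator outputs for the (source, real) and (source, generated) pairs described above.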
Further, the inter-image regularization constraint loss is:

$$\mathcal{L}_{inter} = d_{E}\left(f^{g}, f^{r}\right) + \frac{\epsilon}{N}\sum_{i=1}^{N} d_{H}\left(h_{i}^{g}, h_{i}^{r}\right)$$

where $\mathcal{L}_{inter}$ denotes the relationship between images, the right-hand side denotes the relationship between the nodes of the two images, $d_{E}$ denotes the Euclidean distance, $f^{g}$ denotes the feature map corresponding to the generated target domain image, $f^{r}$ denotes the feature map corresponding to the real target domain image, $\epsilon$ is a constant, $d_{H}$ denotes the Hamming distance, $h_{i}^{g}$ denotes the hash code of the $i$-th channel in the graph constructed from the feature map of the generated target domain image, $h_{i}^{r}$ denotes the hash code of the $i$-th channel in the graph constructed from the feature map of the real target domain image, and $N$ is the number of channels.
When calculating the inter-image relationship, each node N (node) of the graph is the feature map corresponding to the generated image or the real image, that is, there are two nodes; an edge E (edge) represents the similarity between nodes, and thus the graph (N, E) can be constructed.
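A hedged NumPy sketch of this inter-image constraint, assuming mean-thresholding as the (unspecified) channel hashing scheme and an additive combination of the two distance terms:

```python
import numpy as np

def channel_hash(fmap):
    """Binarize each channel against its own mean to get one hash code per channel."""
    flat = fmap.reshape(fmap.shape[0], -1)  # (N, D*W)
    return (flat > flat.mean(axis=1, keepdims=True)).astype(np.uint8)

def inter_image_loss(f_fake, f_real, eps=0.1):
    """Euclidean distance between the two node features plus an eps-weighted
    mean channel-wise Hamming distance between their hash codes."""
    euclid = np.linalg.norm(f_fake - f_real)
    hamming = np.mean(np.sum(channel_hash(f_fake) != channel_hash(f_real), axis=1))
    return euclid + eps * hamming
```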
Further, the intra-image regularization constraint loss is:

$$\mathcal{L}_{intra} = d_{E}\left(R^{g}, R^{r}\right), \qquad R_{ij}^{g} = \frac{d_{E}\left(f_{i}^{g}, f_{j}^{g}\right)}{\epsilon + d_{H}\left(h_{i}^{g}, h_{j}^{g}\right)}, \qquad R_{ij}^{r} = \frac{d_{E}\left(f_{i}^{r}, f_{j}^{r}\right)}{\epsilon + d_{H}\left(h_{i}^{r}, h_{j}^{r}\right)}, \qquad 1 \le i, j \le N$$

where $\mathcal{L}_{intra}$ denotes the relationship between the regions inside the image, $R^{g}$ denotes the relationship between the internal regions of the generated image, $R^{r}$ denotes the relationship between the internal regions of the real image, $d_{E}$ denotes the Euclidean distance, $\epsilon$ is a constant, $d_{H}$ denotes the Hamming distance, $f_{i}^{g}$ and $f_{j}^{g}$ denote the $i$-th and $j$-th channels of the feature map corresponding to the generated image, $f_{i}^{r}$ and $f_{j}^{r}$ denote the $i$-th and $j$-th channels of the feature map corresponding to the real image, $h_{i}^{g}$ and $h_{j}^{g}$ denote the hash codes of the $i$-th and $j$-th channels in the graph constructed from the feature map of the generated image, $h_{i}^{r}$ and $h_{j}^{r}$ denote the hash codes of the $i$-th and $j$-th channels in the graph constructed from the feature map of the real image, and $N$ is the number of channels.
It should be noted that, when calculating the intra-image relationship, the obtained feature map has shape (N, D, W), where N is the number of channels and each channel has size D×W. Each channel of the feature map constitutes a node, giving N nodes; an edge E (edge) represents the similarity between nodes, and thus the graph (N, E) can be constructed.
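Under the same assumptions (mean-threshold hashing; Euclidean edge weights damped by the Hamming distance between channel hash codes), the intra-image graph and its constraint can be sketched as:

```python
import numpy as np

def channel_hash(fmap):
    """Binarize each channel against its own mean to get one hash code per channel."""
    flat = fmap.reshape(fmap.shape[0], -1)
    return (flat > flat.mean(axis=1, keepdims=True)).astype(np.uint8)

def region_graph(fmap, eps=1.0):
    """N x N edge matrix over the N channel nodes: Euclidean distance between
    channels, damped by the Hamming distance between their hash codes."""
    flat = fmap.reshape(fmap.shape[0], -1)  # (N, D*W)
    euclid = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)
    h = channel_hash(fmap)
    hamming = np.sum(h[:, None, :] != h[None, :, :], axis=-1)
    return euclid / (eps + hamming)

def intra_image_loss(f_fake, f_real, eps=1.0):
    """Distance between the generated image's graph and the real image's graph."""
    return np.linalg.norm(region_graph(f_fake, eps) - region_graph(f_real, eps))
```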
In summary, the loss function of the adversarial generation model is:

$$\mathcal{L} = \lambda_{1}\mathcal{L}_{pix} + \lambda_{2}\mathcal{L}_{G} + \lambda_{3}\mathcal{L}_{D} + \lambda_{4}\mathcal{L}_{inter} + \lambda_{5}\mathcal{L}_{intra}$$

where $\mathcal{L}$ denotes the loss of the adversarial generation model, and $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$, $\lambda_{4}$ and $\lambda_{5}$ are constants for controlling the balance between the loss terms.
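The combined objective is then a weighted sum; the sketch below uses hypothetical lambda values, since the text does not disclose them:

```python
def total_loss(l_pix, l_g, l_d, l_inter, l_intra,
               lambdas=(100.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five loss terms; the default lambda values are
    placeholders for illustration, not values taken from the embodiment."""
    return sum(lam * l for lam, l in
               zip(lambdas, (l_pix, l_g, l_d, l_inter, l_intra)))
```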
(4) Obtain the trained generator from the trained adversarial generation model, evaluate the performance of the trained generator on the validation set via the evaluation indexes, and adjust the parameters of the generator according to the evaluation result to obtain the optimal generator.
In this embodiment, after the trained adversarial generation model is obtained in step (3), the discriminator is no longer needed; only the generator is needed to generate medical images. Therefore, the trained generator is first obtained, and the validation set is then used to evaluate its performance: a source domain image in the validation set is input into the trained generator to obtain a generated target domain image. The generated target domain image is compared with the real target domain image via the evaluation indexes to obtain an evaluation result, the real target domain image being the other image corresponding to the source domain image in the validation set. The evaluation result includes the convergence behaviour of the generator and the like, from which it can be determined whether the generator is overfitting; if it is, the parameters of the generator are further adjusted according to the validation set to obtain the optimal generator.
In this embodiment, the evaluation indexes include the learned perceptual image patch similarity (LPIPS) and the Fréchet Inception Distance (FID), by which the performance of the trained generator is evaluated. A smaller LPIPS value and a smaller FID value both indicate a better-performing generator.

It should be appreciated that LPIPS is a commonly used image similarity metric that measures the perceptual similarity between two images using deep network features. The smaller the LPIPS value, the more similar the two images; conversely, the larger the value, the greater the difference. FID measures the distance between the distributions of feature vectors of the real images and the generated images: the smaller the FID value, the greater the similarity between the real and generated images, and hence the higher the quality of the generated images.
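For reference, FID is the Fréchet distance between two Gaussians fitted to Inception features, $\|\mu_r-\mu_g\|^2 + \operatorname{Tr}(\Sigma_r+\Sigma_g-2(\Sigma_r\Sigma_g)^{1/2})$. A simplified NumPy sketch (the eigendecomposition shortcut for the matrix square root is exact only when the covariance product is symmetric positive semi-definite, e.g. when the covariances commute; production code typically uses a full matrix square root):

```python
import numpy as np

def fid_from_stats(mu_r, sigma_r, mu_g, sigma_g):
    """Frechet distance between two Gaussians fitted to Inception features.
    Simplification: matrix square root via eigendecomposition of the
    symmetrized covariance product."""
    diff = mu_r - mu_g
    prod = sigma_r @ sigma_g
    w, v = np.linalg.eigh((prod + prod.T) / 2.0)
    covmean = v @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ v.T
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))
```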
For example, in the present embodiment, sample data from 60 patients were used, CT images were generated from MR images, and the two evaluation indexes above were computed; the results are shown in Table 1.
Table 1: Model evaluation results

Method        LPIPS     FID
MedGAN        0.2619    36.7396
Our method    0.2155    30.4321
As can be seen from Table 1, the adversarial generation model constructed by the present invention outperforms the conventional MedGAN on both indexes.
It should be appreciated that other evaluation indexes may also be selected to evaluate the performance of the trained generator, such as the peak signal-to-noise ratio (PSNR, Peak Signal to Noise Ratio) of the image: a larger PSNR value represents less distortion, indicating a higher-quality generated target domain image and a better-performing trained generator. Another example is the structural similarity index (SSIM, Structural Similarity Index Measurement), which measures the similarity of two images and can evaluate the quality of the generated target domain image: the larger the SSIM value, the more similar the two images, indicating that the generated target domain image is closer to the real target domain image and that the generator performs better.
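PSNR follows directly from the mean squared error; a minimal NumPy sketch (the `data_range` of 1.0 assumes images normalized to [0, 1]):

```python
import numpy as np

def psnr(real, fake, data_range=1.0):
    """Peak signal-to-noise ratio in dB; larger means less distortion."""
    mse = np.mean((real - fake) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# e.g. a uniform error of 0.1 on [0, 1] images gives MSE = 0.01, i.e. 20 dB
```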
In summary, the generative adversarial network used in the invention includes a graph-knowledge-guided discriminator that can capture cross-region and cross-image relationships as context and compensation information, constrain the adversarial direction, and thereby improve the multi-modal generation results.
(5) Input the source domain image, or the source domain images in the test set, into the optimal generator to obtain the generated target domain image.

Specifically, by inputting the source domain image, or a source domain image in the test set, into the optimal generator, the generated target domain image can be obtained.
The invention designs a brand-new discriminator guided by graph knowledge, which avoids the problem of normal-control sampling and requires no additional manual labeling cost. The discriminator can capture cross-region and cross-image relationships as context and compensation information, constrain the adversarial direction, further improve the multi-modal generation results, and facilitate improving the quality of the generated images. It should be noted that the graph-knowledge-guided discriminator provided by the present invention is applicable to all current generative adversarial networks and can improve the quality of the generated images.
Corresponding to the foregoing embodiment of the graph-knowledge-guided multi-modal medical image generation method, the invention further provides an embodiment of a graph-knowledge-guided multi-modal medical image generation device.
Referring to fig. 7, a graph-knowledge-guided multi-modal medical image generation device according to an embodiment of the present invention includes one or more processors configured to implement the graph-knowledge-guided multi-modal medical image generation method of the foregoing embodiment.
The embodiment of the graph-knowledge-guided multi-modal medical image generation device can be applied to any device with data processing capability, such as a computer. The device embodiment may be implemented by software, or by hardware or a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the device with data processing capability reading the corresponding computer program instructions from a non-volatile memory into memory and running them. In terms of hardware, fig. 7 shows a hardware structure diagram of the device with data processing capability on which the graph-knowledge-guided multi-modal medical image generation device is located; in addition to the processor, memory, network interface, and non-volatile memory shown in fig. 7, the device with data processing capability in the embodiment generally includes other hardware according to its actual function, which is not described herein again.
The implementation process of the functions and roles of each unit in the above device is described in detail in the implementation process of the corresponding steps in the above method, and will not be repeated here.
For the device embodiments, since they substantially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art can understand and implement it without creative effort.
The embodiment of the invention also provides a computer readable storage medium, on which a program is stored, which when executed by a processor, implements the multi-modal medical image generation method guided by graph knowledge in the above embodiment.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any of the devices with data processing capability described in the previous embodiments. The computer-readable storage medium may also be an external storage device of the device, for example, a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card (Flash Card) provided on the device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of any device with data processing capability. The computer-readable storage medium is used for storing the computer program and other programs and data required by the device with data processing capability, and may also be used for temporarily storing data that has been output or is to be output.
The above embodiments are merely for illustrating the design concept and features of the present invention, and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, the scope of the present invention is not limited to the above embodiments. Therefore, all equivalent changes or modifications according to the principles and design ideas of the present invention are within the scope of the present invention.

Claims (8)

1. A method for generating a multi-modal medical image guided by graph knowledge, characterized by comprising the following steps:
(1) Acquiring a magnetic resonance image and a computed tomography image;
(2) Preprocessing the magnetic resonance image and the computed tomography image, constructing a data set from the preprocessed magnetic resonance image and computed tomography image, and dividing the data set into a training set, a validation set and a test set;
(3) Constructing a graph-knowledge-guided adversarial generation model based on a generative adversarial network, and training the adversarial generation model by using the training set to obtain a trained adversarial generation model;
said step (3) comprises the sub-steps of:
(3.1) constructing a graph-knowledge-guided adversarial generation model based on a generative adversarial network, wherein the adversarial generation model comprises a generator, a discriminator and a graph-knowledge-guided regularization module;
the discriminator comprises a first basic module, a second basic module and a full connection layer; the first basic module comprises a convolution layer and an activation function; the second basic module comprises a convolution layer, a normalization layer and an activation function;
the target domain image generated by the generator enters the discriminator; a feature map corresponding to the features of the image is obtained through the first basic module and the second basic module, a first prediction result is obtained from the feature map through the fully connected layer, and the first prediction result and the corresponding first feature map are output; the real target domain image is input into the discriminator to obtain a second prediction result and a corresponding second feature map; the similarities within and between the first feature map and the second feature map are calculated by the graph-knowledge-guided regularization module;
(3.2) training the adversarial generation model by using the training set, and updating the parameters of the adversarial generation model according to the loss of the adversarial generation model to obtain a trained adversarial generation model;
(4) Acquiring the trained generator from the trained adversarial generation model, evaluating the performance of the trained generator on the validation set via evaluation indexes, and adjusting the parameters of the generator according to the evaluation result to obtain an optimal generator;
(5) Inputting the source domain image, or the source domain images in the test set, into the optimal generator to obtain the generated target domain image.
2. The method for generating a multi-modal medical image guided by graph knowledge as claimed in claim 1, wherein the preprocessing includes the steps of:
(2.1) resampling: resampling the magnetic resonance image and the computed tomography image;
(2.2) adjusting the window width and the window level: adjusting window width and window levels of the magnetic resonance image and the computed tomography image to obtain a denoised magnetic resonance image and a denoised computed tomography image;
(2.3) normalization: normalizing pixel values of the magnetic resonance image and the computed tomography image;
(2.4) selecting data: the magnetic resonance image and the computed tomography image are selected as a set of data.
3. The method for generating a multi-modal medical image guided by graph knowledge as claimed in claim 1, wherein the generator includes a downsampling module, a residual module, and an upsampling module; the downsampling module comprises a convolution layer, an activation function and a normalization layer; the residual module comprises a convolution layer, a normalization layer and an activation function; the upsampling module comprises a deconvolution layer, a normalization layer and an activation function.
4. The method for generating a multi-modal medical image guided by graph knowledge as claimed in claim 1, wherein the step (3.2) specifically comprises: setting the number of iterations and the learning rate, using an optimizer to train and update the weight parameters of the adversarial generation model, setting the number of samples selected for each training batch, inputting the source domain images and the real target domain images in the training set into the adversarial generation model for iterative training, and updating the weight parameters of the adversarial generation model according to the loss of the adversarial generation model to obtain a trained adversarial generation model.
5. The method for generating a multi-modal medical image guided by graph knowledge as claimed in claim 1 or 4, wherein the loss of the adversarial generation model includes a pixel-level loss between the generated image and the real image, an adversarial loss of the generator, an adversarial loss of the discriminator, and a regularization constraint loss of the discriminator, the regularization constraint loss of the discriminator including an inter-image regularization constraint loss and an intra-image regularization constraint loss.
6. The method for generating a multi-modal medical image guided by graph knowledge as claimed in claim 1, wherein the evaluation indexes include the learned perceptual image patch similarity, the Fréchet Inception Distance, and the peak signal-to-noise ratio of the image.
7. A multi-modal medical image generation apparatus guided by graph knowledge, comprising one or more processors configured to implement the multi-modal medical image generation method guided by graph knowledge of any one of claims 1-6.
8. A computer-readable storage medium, having stored thereon a program which, when executed by a processor, is adapted to carry out the method for generating a multimodal medical image guided by graph knowledge according to any of claims 1-6.
CN202310661539.0A 2023-06-06 2023-06-06 Multi-mode medical image generation method and device guided by graph knowledge Active CN116385330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310661539.0A CN116385330B (en) 2023-06-06 2023-06-06 Multi-mode medical image generation method and device guided by graph knowledge


Publications (2)

Publication Number Publication Date
CN116385330A CN116385330A (en) 2023-07-04
CN116385330B true CN116385330B (en) 2023-09-15

Family

ID=86979143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310661539.0A Active CN116385330B (en) 2023-06-06 2023-06-06 Multi-mode medical image generation method and device guided by graph knowledge

Country Status (1)

Country Link
CN (1) CN116385330B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523601A (en) * 2020-04-26 2020-08-11 道和安邦(天津)安防科技有限公司 Latent emotion recognition method based on knowledge guidance and generation counterstudy
CN113792177A (en) * 2021-08-05 2021-12-14 杭州电子科技大学 Scene character visual question-answering method based on knowledge-guided deep attention network
CN113936217A (en) * 2021-10-25 2022-01-14 华中师范大学 Priori semantic knowledge guided high-resolution remote sensing image weakly supervised building change detection method
CN114359360A (en) * 2022-03-17 2022-04-15 成都信息工程大学 Two-way consistency constraint medical image registration algorithm based on countermeasure
CN114882220A (en) * 2022-05-20 2022-08-09 山东力聚机器人科技股份有限公司 Domain-adaptive priori knowledge-based GAN (generic object model) image generation method and system
CN115455167A (en) * 2022-09-26 2022-12-09 中山大学 Geographic examination question generation method and device based on knowledge guidance
CN116152554A (en) * 2023-01-16 2023-05-23 复旦大学 Knowledge-guided small sample image recognition system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230094389A1 (en) * 2020-02-14 2023-03-30 Cornell University Quantum computing based deep learning for detection, diagnosis and other applications
WO2021184195A1 (en) * 2020-03-17 2021-09-23 中国科学院深圳先进技术研究院 Medical image reconstruction method, and medical image reconstruction network training method and apparatus


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A text-to-image generative adversarial network based on a self-attention mechanism; Huang Hongyu; Gu Zifeng; Journal of Chongqing University (03); full text *
Design of a cross-modal recognition system for image category labels based on generative adversarial networks; Liu Shangzheng; Liu Bin; Modern Electronics Technique (08); full text *

Also Published As

Publication number Publication date
CN116385330A (en) 2023-07-04

Similar Documents

Publication Publication Date Title
US11756160B2 (en) ML-based methods for pseudo-CT and HR MR image estimation
Emami et al. Generating synthetic CTs from magnetic resonance images using generative adversarial networks
US11449759B2 (en) Medical imaging diffeomorphic registration based on machine learning
Chen et al. Unsupervised lesion detection via image restoration with a normative prior
US11049243B2 (en) Target detection in latent space
US11464491B2 (en) Shape-based generative adversarial network for segmentation in medical imaging
CN116402865B (en) Multi-mode image registration method, device and medium using diffusion model
Zhao et al. Deep learning of brain magnetic resonance images: A brief review
WO2022121100A1 (en) Darts network-based multi-modal medical image fusion method
US20220222781A1 (en) Deep generative modeling of smooth image manifolds for multidimensional imaging
CN110945564B (en) Medical image segmentation based on mixed context CNN model
CN113160138B (en) Brain nuclear magnetic resonance image segmentation method and system
Singh et al. Medical image generation using generative adversarial networks
Mohebbian et al. Classifying MRI motion severity using a stacked ensemble approach
Mastropietro et al. A supervised deep neural network approach with standardized targets for enhanced accuracy of IVIM parameter estimation from multi‐SNR images
Zuo et al. HACA3: A unified approach for multi-site MR image harmonization
Lim et al. Motion artifact correction in fetal MRI based on a Generative Adversarial network method
CN116385330B (en) Multi-mode medical image generation method and device guided by graph knowledge
Koçanaoğulları et al. Learning the regularization in dce-mr image reconstruction for functional imaging of kidneys
CN116385329B (en) Multilayer knowledge distillation medical image generation method and device based on feature fusion
Shyna et al. Deep-ASL enhancement technique in arterial spin labeling MRI–A novel approach for the error reduction of partial volume correction technique with linear regression algorithm
Adame-Gonzalez et al. FONDUE: Robust resolution-invariant denoising of MR Images using Nested UNets
Karani Tackling Distribution Shifts in Machine Learning-Based Medical Image Analysis
US20240104722A1 (en) Method for detection and characterization of lesions
Martínez Mora Automation of Kidney Perfusion Analysis from Dynamic Phase-Contrast MRI using Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant