WO2022120762A1 - Multi-modal medical image generation method and apparatus - Google Patents


Info

Publication number
WO2022120762A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
network
generator
sampling
training
Prior art date
Application number
PCT/CN2020/135439
Other languages
French (fr)
Chinese (zh)
Inventor
蒋昌辉
胡战利
梁栋
张其阳
洪序达
郑海荣
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Priority to PCT/CN2020/135439
Publication of WO2022120762A1

Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • the embodiments of the present application relate to the field of medical imaging, in particular to medical image modalities, and more particularly to a method and apparatus for generating multimodal medical images.
  • X-ray computed tomography is a technique that exploits the interaction between X-rays and matter: a slice of a certain thickness of the region to be examined is scanned, a detector receives the X-rays passing through the slice and converts them into electrical signals, an analog-to-digital converter turns these into digital projection data, and a computer reconstructs an image of the object's internal structure.
  • Magnetic resonance (MR) imaging is based on the principle of nuclear magnetic resonance: the energy released by nuclei attenuates differently in different structural environments within a material, so by applying gradient magnetic fields and detecting the emitted electromagnetic waves, the position and type of the nuclei can be determined and used to map the internal structure of the object.
  • Positron emission tomography (PET) imaging is a diagnostic tool for tumor diseases. Substances essential to biological metabolism, such as glucose, proteins, nucleic acids, and fatty acids, are labeled with short half-life radionuclides (such as 18F or 11C) and injected into the human body. Because tumor cells have a vigorous metabolism, these substances accumulate at tumor sites; by detecting and imaging the photons emitted by the radionuclide, the tumor can be localized and the lesion diagnosed and analyzed.
  • PET Positron emission tomography
  • PET images, as shown in Figure 1(a), can provide diagnostic information about a tumor, but they lack the anatomical structure of the tumor and its surrounding tissue; this additional anatomy must be provided by CT or MR images, as shown in Figure 1(b).
  • in current clinical examinations, each medical imaging modality requires its own device; the images are then gathered and passed to the physician, who makes a comprehensive diagnosis from the multiple modalities, which consumes considerable manpower and material resources.
  • obtaining PET, CT, and MR images requires three kinds of equipment, so the cost of purchasing equipment is very high for the operating institution; for patients, these examinations are also expensive and time-consuming.
  • the present invention overcomes these shortcomings by converting one modality of a medical image to generate another, realizing mutual conversion and generation among computed tomography (CT), magnetic resonance (MR), and positron emission tomography (PET) images. Only one of the three modalities needs to be scanned; the other two can be generated. This effectively reduces the procurement burden on equipment operators while saving patients examination time and expense.
  • CT computed tomography
  • MR nuclear magnetic resonance
  • PET positron emission tomography
  • the embodiments of the present application propose a method and apparatus for generating a multimodal medical image.
  • an embodiment of the present application provides a method for generating a multimodal medical image, including:
  • inputting a first modality image and a target second modality image into a training network for supervised learning, stopping when iterative training reaches the stopping condition, and obtaining a generator;
  • inputting a first modality image into the pre-trained generator and converting it into a second modality image, the modality of which differs from that of the first;
  • outputting the second modality image.
  • the generator network of the generator may adopt a convolutional neural network, a residual network, or a generative adversarial network.
  • the generator network adopts a generative adversarial network, and the network structure of the generator includes a generator and a discriminator;
  • the generator is used to generate the target modality image
  • the discriminator is used to judge whether the generated image meets the requirements; once the requirement for stopping training is met, training stops.
  • the network structure of the generator can use transfer learning technology to speed up network training by loading a pre-trained network.
  • the generator is used to implement multi-level down-sampling of the input image, and then corresponds to multi-level up-sampling, and multiple residual network connections are used between down-sampling and up-sampling.
  • the network combining the down-sampling and the up-sampling includes, but is not limited to: both down-sampling and up-sampling using conventional convolutional layers or residual convolutional layers, with relu, leaky_relu, tanh, or sigmoid as the activation function of the convolutional layers.
  • the part of the combination of down-sampling and up-sampling may be directly connected, may be connected by a fully connected layer, or may be connected by one or more residual networks.
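The multi-level down-sampling/up-sampling scheme with residual connections described above can be sketched in a minimal, framework-free form. This is an illustrative NumPy toy, not the patent's actual convolutional network: max pooling stands in for a down-sampling level, nearest-neighbour repetition for an up-sampling level, and element-wise addition for the residual path; all function names are our own.

```python
import numpy as np

def downsample(x):
    """One down-sampling level: 2x2 max pooling halves each spatial dimension."""
    m, n = x.shape
    return x[:m - m % 2, :n - n % 2].reshape(m // 2, 2, n // 2, 2).max(axis=(1, 3))

def upsample(x):
    """One up-sampling level: nearest-neighbour repetition doubles each dimension."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def generator_pass(img, levels=2):
    """Multi-level down-sampling, then mirrored multi-level up-sampling,
    with a residual (skip) addition joining each corresponding level."""
    skips, x = [], img
    for _ in range(levels):
        skips.append(x)          # remember the feature map entering this level
        x = downsample(x)
    for _ in range(levels):
        x = upsample(x)
        x = x + skips.pop()      # residual connection across the two paths
    return x

img = np.arange(64, dtype=float).reshape(8, 8)
out = generator_pass(img)
print(out.shape)  # (8, 8): output spatial size matches the input
```

Because each up-sampling level exactly undoes the size change of its paired down-sampling level, the skip additions line up and the output keeps the input's spatial size, which is what the A/B network pairing in the figures requires.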
  • the embodiments of the present application also provide a multimodal medical image generation device, including:
  • the training device is used for inputting the first modal image and the target second modal image into the training network to realize supervised learning, stop the training after the iterative training reaches the stopping condition, and obtain the generator;
  • an input device for inputting a first modality image into a pre-trained generator and converting it into a second modality image, the modality of the second image differing from that of the first;
  • an output device for outputting the second modal image.
  • the generator network of the generator may adopt a convolutional neural network, a residual network, or a generative adversarial network.
  • the network structure of the generator includes a generator and a discriminator
  • the generator is used to generate the target modality image
  • the discriminator is used to judge whether the generated image meets the requirements; once the requirement for stopping training is met, training stops.
  • the network structure of the generator can use transfer learning technology to speed up network training by loading a pre-trained network.
  • the generator is used to implement multi-level down-sampling of the input image, and then corresponds to multi-level up-sampling, and multiple residual network connections are used between down-sampling and up-sampling.
  • the network combining the down-sampling and the up-sampling includes, but is not limited to: both down-sampling and up-sampling using conventional convolutional layers or residual convolutional layers, with relu, leaky_relu, tanh, or sigmoid as the activation function of the convolutional layers.
  • the part of the combination of down-sampling and up-sampling may be directly connected, may be connected by a fully connected layer, or may be connected by one or more residual networks.
  • an electronic device, including:
  • one or more processors;
  • a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above multimodal medical image generation embodiments.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method of any of the foregoing multimodal medical image generation embodiments.
  • the invention overcomes the shortcomings of the prior art by converting one modality of a medical image to generate another, realizing mutual conversion and generation among computed tomography (CT), magnetic resonance (MR), and positron emission tomography (PET) images.
  • Fig. 3 is the overall flowchart of generating an MR image from a PET image in the present application.
  • Fig. 4 is a schematic diagram of the generator network structure of the present application.
  • Fig. 5 is a schematic diagram of the discriminator network structure of the present application.
  • Fig. 6 is a schematic diagram of the generator residual network structure of the present application.
  • Fig. 7 is a schematic diagram of the structure of convolutional network A in the generator of the present application.
  • Fig. 8 is a schematic diagram of the structure of convolutional network B in the generator of the present application.
  • Fig. 9 is a comparison of an MR image generated from a PET image according to an embodiment of the present application.
  • Fig. 10 is a schematic diagram of the structure of the multimodal medical image generation apparatus of the present application.
  • FIG. 2 shows a flowchart of one embodiment of a multimodal medical image generation method according to the present application.
  • the multimodal medical image generation method includes the following steps:
  • Step 201: Input the first modality image and the target second modality image into the training network for supervised learning; stop when iterative training reaches the stopping condition, obtaining a generator;
  • Step 202: Input the first modality image into the pre-trained generator and convert it into a second modality image, the modality of which differs from that of the first modality image;
  • Step 203: Output the second modality image.
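Steps 201 to 203 amount to: train until a stopping condition is met, then convert and output. As a hedged illustration of that loop only, the sketch below substitutes a single linear map for the generator network and mean-squared error for the training criterion; the actual method uses a deep convolutional/adversarial network, and all names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: "first modality" vectors and target "second modality"
# vectors related by an unknown linear map (a stand-in for real image pairs).
first_mod = rng.normal(size=(32, 4))
W_true = rng.normal(size=(4, 4))
second_mod = first_mod @ W_true

W = np.zeros((4, 4))                    # "generator" parameters to be learned
lr, stop_loss, max_iters = 0.05, 1e-4, 10000

for step in range(max_iters):           # Step 201: supervised iterative training
    pred = first_mod @ W                # generator output for the first modality
    err = pred - second_mod
    loss = (err ** 2).mean()
    if loss < stop_loss:                # stopping condition reached: end training
        break
    W -= lr * first_mod.T @ err / len(first_mod)   # gradient descent update

generated = first_mod[0] @ W            # Step 202: convert with trained generator
print(loss < stop_loss)                 # training reached the stopping condition
```

The structure, supervised pairs plus an explicit stopping condition, is the point here; swapping the linear map for the patent's generator network leaves the loop unchanged.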
  • the second modality image is one of PET, CT, MR or other medical modality images.
  • the pre-trained generator in step 202 is trained in the following manner:
  • the first modality image data is input into the neural network.
  • the first modal image and the target second modal image are input into the training network to realize supervised learning, and the training stops when the iterative training reaches the stopping condition.
  • a single modality image is input to obtain the target modality image. Because the deep convolutional neural network has been pre-trained, it can be applied directly when deployed, and the entire generation process is very fast. Taking a PET image as input and an MR image as output, as shown in Figure 3, conversion between other modalities follows the same method.
  • the generator network can be implemented by networks such as convolutional neural networks, residual networks, and generative adversarial networks.
  • This example uses a generative adversarial network.
  • Its network structure includes a generator and a discriminator.
  • the generator is used to generate the target modality image, and the discriminator is used to judge whether the generated image meets the requirements; once the requirement for stopping training is met, training stops.
  • the network structure of the generator can use transfer learning technology to speed up network training by loading a pre-trained network.
  • the generator is shown in FIG. 4 .
  • the generator implements multi-level downsampling of the input image, and then corresponds to multi-level upsampling. Multiple residual network connections are used between downsampling and upsampling.
  • the structure of convolutional network A in the generator is shown in Figure 7. It contains several convolutional layers and activation layers; the last layer is a pooling layer, which may use a max pooling or average pooling operation.
  • its main function is to extract high-level features of the image; there are n such networks in series, each short-circuited (skip-connected) to the corresponding convolutional network B;
  • the structure of convolutional network B in the generator is shown in Figure 8. There are n in total, each containing several convolutional layers and activation layers; the last layer is an up-sampling layer, which may use a deconvolution or un-pooling operation to realize image generation;
  • the residual network structure in the generator is shown in Figure 6.
  • the discriminator network is shown in Figure 5. It contains n convolutional layers and n activation layers, and its last three layers are fully connected layer + activation layer + fully connected layer.
  • the number of convolutional networks A is 5; in each, there are 3 convolutional layers and 3 activation layers, and the last layer is a max pooling layer
  • the number of corresponding convolutional networks B is 5; in each, there are 3 convolutional layers and 3 activation layers, and the last layer is a deconvolution (up-sampling) layer
  • in the discriminator, there are 8 convolutional layers and 8 activation layers, and the last three layers are fully connected layer + activation layer + fully connected layer.
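The discriminator's final "fully connected + activation + fully connected" head can be sketched on its own. This NumPy toy takes a pre-computed feature tensor in place of the 8 convolutional stages, and its weight shapes, the leaky_relu choice, and the 0.2 slope are all illustrative assumptions, not values from the patent.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def discriminator_head(features, w1, w2):
    """The last three discriminator layers named in the text:
    fully connected -> activation -> fully connected (one output score)."""
    x = features.reshape(-1)     # flatten the conv-stage feature maps
    x = leaky_relu(w1 @ x)       # fully connected layer + activation layer
    return float(w2 @ x)         # final fully connected layer, a scalar score

rng = np.random.default_rng(1)
features = rng.normal(size=(4, 4, 8))               # hypothetical conv output
w1 = 0.01 * rng.normal(size=(32, features.size))    # hypothetical weights
w2 = 0.01 * rng.normal(size=(32,))
score = discriminator_head(features, w1, w2)
print(np.isfinite(score))
```

The single scalar output is what lets the discriminator "judge whether the generated image meets the requirements": in adversarial training it is typically thresholded or fed into a loss against the real/fake label.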
  • the generator network structure adopts a generative adversarial network.
  • the downsampling process of the generator network is as follows: the input and output image sizes of convolutional network A1 at the start of the network equal the input image size of the network, denoted M × N, and the feature images generated by the other convolutional layers inside this module (multiple concatenated convolution units) are also M × N.
  • after the first downsampling, the feature image size in convolutional network A2 is (M/2) × (N/2).
  • the feature image size of the inner convolutional layers of convolutional network An is (M/n) × (N/n).
  • a residual path is used to directly connect the output of each downsampling convolutional network A to the output of the corresponding upsampling convolutional network B.
  • the output feature image of the downsampling convolutional network A1 has size M × N;
  • the output feature image of the upsampling convolutional network B2 likewise has size M × N;
  • the output of downsampling convolution module A1 is therefore connected to the output of upsampling convolutional network B2, and together they are fed into convolutional network B1.
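The A1-to-B2 join described above pairs two feature maps of identical spatial size M × N before convolutional network B1. The patent text does not fix how the two outputs are combined; the NumPy sketch below shows channel concatenation, one common realization in U-Net-style generators (element-wise addition is another), and the channel counts are illustrative assumptions.

```python
import numpy as np

# Hypothetical shapes: both feature maps are M x N spatially (M = N = 8 here)
# with 16 channels each; these channel counts are assumptions for illustration.
a1_out = np.ones((8, 8, 16))   # output of down-sampling network A1, size M x N
b2_out = np.ones((8, 8, 16))   # output of up-sampling network B2, also M x N

# Join the two outputs before feeding convolutional network B1. Channel
# concatenation keeps both feature sets intact and lets B1 mix them.
joint = np.concatenate([a1_out, b2_out], axis=-1)
print(joint.shape)  # (8, 8, 32): spatial size unchanged, channels combined
```

Either join only works because A1 and B2 produce the same M × N spatial size, which is exactly why the residual paths connect matching levels of the two pyramids.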
  • the size of the convolution kernel used by each convolution module in the network structure may be selected from 3 × 3, 5 × 5, 7 × 7, and the like.
  • the number of input and output feature images of each convolution module and of the convolutional layers inside the module may be selected from 8, 16, 32, 64, etc.
  • the activation function of each convolution module and of the convolutional layers inside the module may be selected from relu, leaky_relu, tanh, etc.
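The candidate activation functions listed above have simple closed forms. A NumPy sketch for reference (the alpha = 0.2 slope for leaky_relu is a common default, not a value specified in the text):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))        # negative inputs clipped to 0
print(leaky_relu(x))  # negative inputs scaled by alpha instead of clipped
print(np.tanh(x))     # squashed into (-1, 1)
print(sigmoid(x))     # squashed into (0, 1)
```

In practice relu/leaky_relu are the usual choices inside the convolution modules, while tanh or sigmoid are more often used at an output layer to bound pixel intensities.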
  • a PET image is used as input, and an MR image is generated by the network; the result is shown in Figure 9.
  • the effect is satisfactory, and the method can provide doctors with the anatomical structure information of both the PET image and the MR image for the localization and diagnosis of lesions.
  • the same method and process can be used for conversion between other modality images (such as PET to CT, CT to MR, CT to PET, MR to CT, and MR to PET).
  • the present application further provides a multimodal medical image generation apparatus, the apparatus embodiment corresponds to the method embodiment shown in FIG. 2 , and the apparatus can be specifically applied to various electronic devices.
  • the multimodal medical image generation apparatus 1000 in this embodiment includes: a training apparatus 1001, an input apparatus 1002, and an output apparatus 1003.
  • the training device 1001 is used to input the first modal image and the target second modal image into the training network to realize supervised learning, stop training after the iterative training reaches the stop condition, and obtain a generator;
  • an input device 1002 configured to input a first modality image into a pre-trained generator and convert it into a second modality image, the modality of the second image differing from that of the first;
  • an output device 1003 configured to output the second modality image;
  • the second modality image is one of PET, CT, MR, or other medical modality images.
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

A multi-modal medical image generation method and apparatus (1000), an electronic device, and a computer-readable storage medium. The method comprises: inputting a first modality image into a pre-trained generator; converting it into a second modality image, the modality of which differs from that of the first; and outputting the second modality image. By means of this method, one modality of a medical image is converted to generate another, realizing mutual conversion and generation among computed tomography (CT), nuclear magnetic resonance (MR), and positron emission tomography (PET) images. Only one of the three modalities CT, MR, and PET needs to be scanned, and the other two can be generated. The purchasing burden on equipment operators is effectively reduced, and examination time and expense are saved for patients.

Description

多模态医学图像生成方法和装置Multimodal medical image generation method and device 技术领域technical field
本申请实施例涉及医学图像领域,具体涉及医学图像模态领域,尤其涉及多模态医学图像生成方法和装置。The embodiments of the present application relate to the field of medical images, in particular to the field of medical image modalities, and in particular, to a method and apparatus for generating a multimodal medical image.
背景技术Background technique
X射线计算机断层成像技术(X射线CT)是一种利用X射线与物质相互作用的原理,对待检部位一定厚度的层面进行扫描,由探测器接收透过该层面的X射线,转变为电信号,再经模拟/数字转换器转为数字信号(投影数据),输入计算机,对物体内部信息进行成像的一种技术。X-ray computed tomography (X-ray CT) is a method that uses the principle of interaction between X-rays and matter to scan a layer with a certain thickness of the part to be inspected, and the detector receives the X-rays passing through the layer and converts them into electrical signals , and then converted into a digital signal (projection data) by an analog/digital converter, input into a computer, and a technology for imaging the internal information of an object.
核磁共振(MR)成像技术是利用核磁共振(nuclear magnetic resonance)原理,依据所释放的能量在物质内部不同结构环境中不同的衰减,通过外加梯度磁场检测所发射出的电磁波,即可得知构成这一物体原子核的位置和种类,据此可以绘制成物体内部的结构图像。Nuclear magnetic resonance (MR) imaging technology is based on the principle of nuclear magnetic resonance (nuclear magnetic resonance), according to the different attenuation of the released energy in different structural environments inside the material, and by applying a gradient magnetic field to detect the emitted electromagnetic waves, you can know the composition. The position and type of the nuclei of the object can be used to map the internal structure of the object.
正电子发射断层扫描(PET)成像技术是一种用于肿瘤疾病诊断工具,其方法是将某种物质,一般是生物生命代谢中必须的物质,如:葡萄糖、蛋白质、核酸、脂肪酸,标记上短半衰期放射性核素(如 18F, 11C等),注入人体,由于肿瘤细胞代谢旺盛,这些物质会在肿瘤细胞处聚集。通过探测放射性核素发射的光子并成像,来定位肿瘤情况,从而可对病变进行诊断和分析。 Positron emission tomography (PET) imaging technology is a diagnostic tool for tumor diseases. Its method is to label certain substances, generally necessary substances in the metabolism of biological life, such as glucose, protein, nucleic acid, and fatty acid. Short half-life radionuclides (such as 18 F, 11 C, etc.) are injected into the human body. Due to the vigorous metabolism of tumor cells, these substances will accumulate at the tumor cells. By detecting and imaging the photons emitted by the radionuclide to localize the tumor, the lesion can be diagnosed and analyzed.
PET图像,如图1(a)所示,能提供肿瘤的诊断信息,但图像缺失了肿瘤 及其周围组织的解剖结构信息,这些额外的解剖需要CT或MR图像提供,如图1(b)所示。PET images, as shown in Figure 1(a), can provide diagnostic information of the tumor, but the images lack the anatomical structure information of the tumor and its surrounding tissues. These additional anatomies need to be provided by CT or MR images, as shown in Figure 1(b) shown.
目前在临床医疗检查中,各个医学成像模态都需要单独的设备进行成像,然后再汇总到医生处,根据多个模态图像进行综合诊断,耗费大量的人力物力。At present, in clinical medical examination, each medical imaging modality requires a separate device for imaging, and then aggregates it to the doctor for comprehensive diagnosis based on multiple modality images, which consumes a lot of manpower and material resources.
目前在临床医疗检查中,各个医学成像模态都需要单独的设备进行成像,要得到PET、CT、MR图像就需要三种设备,对于使用单位来说采购设备的成本很高;对于患者来说这些检查也非常昂贵,耗费的时间也很长。At present, in clinical medical examinations, each medical imaging modality requires separate equipment for imaging. To obtain PET, CT, and MR images, three types of equipment are required. The cost of purchasing equipment is very high for the user; for patients These inspections are also very expensive and time-consuming.
本发明克服了上面的缺点,将医学图像的一种模态转换、生成另一种模态图像,实现了计算机断层成像(CT)图像、核磁共振(MR)图像以及正电子发射断层扫描(PET)图像之间的相互转换与生成。只需要扫描CT、MR、PET三种模态中的一种,就可生成其他另外两种。能有效减轻设备使用单位的采购负担,同时为患者节省检查时间和经济负担。The present invention overcomes the above shortcomings, converts one mode of a medical image to generate another mode image, and realizes a computed tomography (CT) image, a nuclear magnetic resonance (MR) image, and a positron emission tomography (PET) image. ) mutual conversion and generation between images. Only one of the three modalities of CT, MR, and PET needs to be scanned, and the other two can be generated. It can effectively reduce the procurement burden of equipment users, while saving examination time and economic burden for patients.
发明内容SUMMARY OF THE INVENTION
本申请实施例提出了一种多模态医学图像生成方法和装置。The embodiments of the present application propose a method and apparatus for generating a multimodal medical image.
第一方面,本申请实施例提供了一种多模态医学图像生成方法,包括:In a first aspect, an embodiment of the present application provides a method for generating a multimodal medical image, including:
将第一模态图像与目标第二模态图像输入训练网络,实现监督学习,迭代训练达到停止条件后停止训练,获得生成器;Input the first modal image and the target second modal image into the training network to realize supervised learning, stop the training after the iterative training reaches the stopping condition, and obtain the generator;
将第一模态图像输入预先训练好的生成器,将所述第一模态图像转换为第二模态图像,所述第二模态图像与所述第一模态图像的模态不同;inputting the first modal image into a pre-trained generator, and converting the first modal image into a second modal image, where the second modal image is different from the first modal image;
输出所述第二模态图像。The second modality image is output.
在一些实施例中,其中,所述生成器的生成器网络采用卷积神经网络,残差网络和生成对抗网络。In some embodiments, the generator network of the generator adopts a convolutional neural network, a residual network and a generative adversarial network.
在一些实施例中,其中,所述生成器网络采用生成对抗网络,生成器的网络结构包含生成器与判别器;In some embodiments, the generator network adopts a generative adversarial network, and the network structure of the generator includes a generator and a discriminator;
生成器用于生成目标模态图像;The generator is used to generate the target modality image;
判别器用于判断生成的图像是否达到了要求,达到了停止训练的要求即停止训练。The discriminator is used to judge whether the generated image meets the requirements, and stops training if it meets the requirement to stop training.
在一些实施例中,其中,所述生成器的网络结构可以运用迁移学习技术,通过加载预训练网络加快网络训练。In some embodiments, the network structure of the generator can use transfer learning technology to speed up network training by loading a pre-trained network.
在一些实施例中,其中,所述生成器用于实现输入图像的多级降采样,之后再对应多级升采样,降采样与升采样之间采用多个残差网络连接。In some embodiments, the generator is used to implement multi-level down-sampling of the input image, and then corresponds to multi-level up-sampling, and multiple residual network connections are used between down-sampling and up-sampling.
在一些实施例中,其中,所述降采样与所述升采样结合的网络,包括但不限于:降采样与升采样都采用常规卷积层或残差卷积层,卷积层激活函数采用relu、leaky_relu、tanh、sigmod。In some embodiments, the network combining the down-sampling and the up-sampling includes but is not limited to: both down-sampling and up-sampling use conventional convolution layers or residual convolution layers, and the activation function of the convolution layer uses relu, leaky_relu, tanh, sigmod.
在一些实施例中,其中,降采样与升采样结合的部分,可以采用直接相连,也可以采用全连接层相连接,也可以采用一个或多个残差网络相连接。In some embodiments, the part of the combination of down-sampling and up-sampling may be directly connected, may be connected by a fully connected layer, or may be connected by one or more residual networks.
第二方面,本申请实施例还提供了一种多模态医学图像生成装置,包括:In a second aspect, the embodiments of the present application also provide a multimodal medical image generation device, including:
训练装置,用于将第一模态图像与目标第二模态图像输入训练网络,实现监督学习,迭代训练达到停止条件后停止训练,获得生成器;The training device is used for inputting the first modal image and the target second modal image into the training network to realize supervised learning, stop the training after the iterative training reaches the stopping condition, and obtain the generator;
输入装置,用于将第一模态图像输入预先训练好的生成器,将所述第一模 态图像转换为第二模态图像,所述第二模态图像与所述第一模态图像的模态不同;an input device for inputting a first modal image into a pre-trained generator, converting the first modal image into a second modal image, the second modal image and the first modal image modalities are different;
输出装置,用于输出所述第二模态图像。an output device for outputting the second modal image.
在一些实施例中,其中,所述生成器的生成器网络采用卷积神经网络,残差网络和生成对抗网络。In some embodiments, the generator network of the generator adopts a convolutional neural network, a residual network and a generative adversarial network.
在一些实施例中,其中,所述生成器的网络结构包含生成器与判别器;In some embodiments, the network structure of the generator includes a generator and a discriminator;
生成器用于生成目标模态图像;The generator is used to generate the target modality image;
判别器用于判断生成的图像是否达到了要求,达到了停止训练的要求即停止训练。The discriminator is used to judge whether the generated image meets the requirements, and stops training if it meets the requirement to stop training.
在一些实施例中,其中,所述生成器的网络结构可以运用迁移学习技术,通过加载预训练网络加快网络训练。In some embodiments, the network structure of the generator can use transfer learning technology to speed up network training by loading a pre-trained network.
在一些实施例中,其中,所述生成器用于实现输入图像的多级降采样,之后再对应多级升采样,降采样与升采样之间采用多个残差网络连接。In some embodiments, the generator is used to implement multi-level down-sampling of the input image, and then corresponds to multi-level up-sampling, and multiple residual network connections are used between down-sampling and up-sampling.
在一些实施例中,其中,所述降采样与所述升采样结合的网络,包括但不限于:降采样与升采样都采用常规卷积层或残差卷积层,卷积层激活函数采用relu、leaky_relu、tanh、sigmod。In some embodiments, the network combining the down-sampling and the up-sampling includes but is not limited to: both down-sampling and up-sampling use conventional convolution layers or residual convolution layers, and the activation function of the convolution layer uses relu, leaky_relu, tanh, sigmod.
在一些实施例中,其中,降采样与升采样结合的部分,可以采用直接相连,也可以采用全连接层相连接,也可以采用一个或多个残差网络相连接。In some embodiments, the part of the combination of down-sampling and up-sampling may be directly connected, may be connected by a fully connected layer, or may be connected by one or more residual networks.
第三方面,本申请实施例提供了一种电子设备,包括:In a third aspect, an embodiment of the present application provides an electronic device, including:
一个或多个处理器;one or more processors;
存储装置,用于存储一个或多个程序,storage means for storing one or more programs,
当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现上述多模态医学图像生成方法中任一实施例的方法。When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method of any one of the above-described multimodal medical image generation methods.
第四方面,本申请实施例提供了一种计算机可读存储介质,其上存储有计算机程序,其中,该程序被处理器执行时实现上述多模态医学图像生成方法中任一实施例的方法。In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, the method of any of the foregoing multimodal medical image generation methods is implemented .
本发明克服了现有技术中的缺点,将医学图像的一种模态转换、生成另一种模态图像,实现计算机断层成像(CT)图像、核磁共振(MR)图像以及正电子发射断层扫描(PET)图像之间的相互转换与生成。以输入PET图像输出MR图像为例,本发明将患者检查得到的PET图像输入本发明神经网络,直接可生成MR图像,不需要另外专用的MR设备重新扫描。因深度卷积神经网络已经预先训练完成,在部署时,可直接应用,整个重建过程速度非常迅速。The invention overcomes the shortcomings in the prior art, converts one mode of medical images to generate another mode image, and realizes computed tomography (CT) images, nuclear magnetic resonance (MR) images and positron emission tomography scans Interconversion and generation between (PET) images. Taking inputting PET images and outputting MR images as an example, the present invention inputs the PET images obtained by patient inspection into the neural network of the present invention, and can directly generate MR images without re-scanning with additional dedicated MR equipment. Because the deep convolutional neural network has been pre-trained, it can be directly applied when deployed, and the entire reconstruction process is very fast.
Description of Drawings
Fig. 1 shows medical images of two modalities in the prior art;
Fig. 2 is a flowchart of an embodiment of the multimodal medical image generation method of the present application;
Fig. 3 is an overall flowchart of generating an MR image from a PET image according to the present application;
Fig. 4 is a schematic diagram of the generator network structure of the present application;
Fig. 5 is a schematic diagram of the discriminator network structure of the present application;
Fig. 6 is a schematic diagram of the residual network structure in the generator of the present application;
Fig. 7 is a schematic diagram of the structure of convolutional network A in the generator of the present application;
Fig. 8 is a schematic diagram of the structure of convolutional network B in the generator of the present application;
Fig. 9 is a comparison of an MR image generated from a PET image according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of the multimodal medical image generation apparatus of the present application.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, not to limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features of those embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 2 shows a flowchart of an embodiment of the multimodal medical image generation method according to the present application. The method includes the following steps:
Step 201: input a first modality image and a target second modality image into a training network to perform supervised learning, stop training once iterative training reaches a stopping condition, and obtain a generator;
Step 202: input a first modality image into the pre-trained generator and convert the first modality image into a second modality image, the second modality image being of a different modality from the first modality image;
Step 203: output the second modality image.
The second modality image is one of PET, CT, MR, or another medical imaging modality.
In this embodiment, the pre-trained generator of step 202 is trained as follows:
First, the first modality image data are input into the neural network. During training, the first modality image and the target second modality image are input into the training network to perform supervised learning, and training stops once iterative training reaches the stopping condition. At deployment, a single image of one modality is input to obtain the target modality image. Because the deep convolutional neural network is trained in advance, it can be applied directly at deployment, and the whole reconstruction process is very fast. The case of inputting a PET image and outputting an MR image is shown in Fig. 3; conversion between other modalities follows the same method.
The generator network may be implemented with a convolutional neural network, a residual network, a generative adversarial network, or similar architectures. This example uses a generative adversarial network, whose structure comprises a generator and a discriminator: the generator produces the target modality image, and the discriminator judges whether the generated image meets the requirements; once the stopping requirement is met, training stops. The generator's network structure may also use transfer learning, loading a pre-trained network to speed up training.
In this embodiment, the generator is shown in Fig. 4. It performs multi-level down-sampling of the input image followed by corresponding multi-level up-sampling, with several residual networks connecting the down-sampling and up-sampling stages. The structure of convolutional network A in the generator is shown in Fig. 7: it contains several convolutional and activation layers, with a pooling layer (max pooling or average pooling) as the last layer. Its main role is to extract high-level image features; there are n such networks in series, each with a skip connection to the corresponding convolutional network B. The structure of convolutional network B in the generator is shown in Fig. 8: there are likewise n of them, each containing several convolutional and activation layers, with an up-sampling layer (deconvolution or unpooling) as the last layer to realize image generation. The residual network structure in the generator is shown in Fig. 6. The discriminator network, shown in Fig. 5, has n convolutional layers and n activation layers, and its last three layers are a fully connected layer, an activation layer, and a fully connected layer.
In some optional implementations of this embodiment, in the generator the number of convolutional networks A is 5, each containing 3 convolutional layers and 3 activation layers with a max-pooling layer as the last layer; correspondingly, the number of convolutional networks B is 5, each containing 3 convolutional layers and 3 activation layers with a deconvolution layer as the last layer. In the discriminator, there are 8 convolutional layers and 8 activation layers, and the last three layers are a fully connected layer, an activation layer, and a fully connected layer.
In some optional implementations of this embodiment, the generator network structure adopts a generative adversarial network. In each training round, the discriminator network is first trained n times (here n = 5), and then the generator is trained once. The discriminator judges whether the image produced by the generator has reached the clarity of the reference image; if not, the discriminator is trained another 5 times and the generator once more, and this cycle repeats until the network training is completed after multiple iterations.
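The alternating schedule just described, n discriminator updates followed by one generator update and repeated until the stopping condition is met, can be sketched in outline. The stubs below are hypothetical stand-ins for the real gradient steps, and the stopping check is simplified to a fixed number of rounds:

```python
def train_gan(n_disc=5, rounds=2):
    """Run `rounds` training rounds, each consisting of n_disc
    discriminator updates followed by a single generator update."""
    schedule = []
    for _ in range(rounds):
        for _ in range(n_disc):
            schedule.append("D")  # one discriminator update on real/generated pairs
        schedule.append("G")      # one generator update against the updated discriminator
    return schedule

# Two rounds with n = 5 give the pattern: D D D D D G  D D D D D G
print(train_gan(n_disc=5, rounds=2))
```

In a real implementation the loop would terminate when the discriminator's judgment indicates the generated images match the reference clarity, rather than after a fixed `rounds` count.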
In some optional implementations of this embodiment, the down-sampling process of the generator network is as follows. At the start of the network, the input and output image sizes of convolutional network A1 equal the network's input image size; denoting the input size as M×N, the feature images generated by the other convolutional layers (multiple cascaded convolution units) inside this module are also M×N. The input layer (unit) of the next-level convolutional network A2 uses stride = 2 down-sampling convolution, giving feature images of size (M/2)×(N/2), and the other layers (units) of that module keep the size (M/2)×(N/2). By analogy, the feature image size of the convolutional layers inside network An is (M/n)×(N/n).
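Assuming the standard convolution output-size formula, out = ⌊(in + 2·padding − kernel) / stride⌋ + 1, the size bookkeeping for the down-sampling path can be checked numerically. The 3×3 kernel and padding of 1 are illustrative assumptions, not values fixed by the text:

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output size of a 2D convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

M = 256
# Inside network A1, a stride-1 layer preserves the M x N size:
assert conv_out(M, stride=1) == 256
# The stride-2 input layer of network A2 halves it to M/2:
assert conv_out(M, stride=2) == 128
```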
In some optional implementations of this embodiment, the up-sampling process of the generator network mirrors the down-sampling. The internal convolutional layers of network Bn produce feature images of size (M/n)×(N/n); the last layer of the module uses a stride = 2 "deconvolution" to up-sample the feature images to (M/(n-1))×(N/(n-1)), and the subsequent up-sampling modules proceed likewise until the final up-sampling network B1. The last layer of B1 no longer up-samples, and a reshaping layer ensures that the output image size matches the network's original input size.
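Conversely, a stride-2 transposed convolution ("deconvolution") doubles the spatial size. Assuming the standard formula out = (in − 1)·stride − 2·padding + kernel + output_padding (the kernel of 3, padding of 1, and output padding of 1 are illustrative assumptions), a sanity check:

```python
def deconv_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output size of a 2D transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# A stride-2 deconvolution restores the size halved by a stride-2 convolution:
assert deconv_out(128) == 256
# Two successive stages quadruple the size, mirroring two down-sampling stages:
assert deconv_out(deconv_out(64)) == 256
```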
In some optional implementations of this embodiment, when a down-sampling convolutional network and an up-sampling convolutional network output feature images of the same size, a residual path connects the output of down-sampling network A directly to the output of up-sampling network B. For example, if the output feature image of down-sampling network A1 is M×N and the output feature image of up-sampling network B2 is also M×N, the output of down-sampling module A1 is connected to the output of up-sampling network B2, and the two are jointly fed into network B1.
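This residual path is an encoder-decoder skip connection: when the spatial sizes match, the down-sampling output is combined with the up-sampling output before the next up-sampling stage. A minimal sketch tracking only shapes (the dict representation and channel counts are illustrative assumptions):

```python
def skip_merge(down_feat, up_feat):
    """Join a down-sampling (encoder) output with an up-sampling (decoder)
    output of the same spatial size, concatenating along channels."""
    if down_feat["size"] != up_feat["size"]:
        raise ValueError("skip path requires matching feature-image sizes")
    return {"size": down_feat["size"],
            "channels": down_feat["channels"] + up_feat["channels"]}

a1 = {"size": (256, 256), "channels": 64}  # output of down-sampling network A1
b2 = {"size": (256, 256), "channels": 64}  # output of up-sampling network B2
merged = skip_merge(a1, b2)                # jointly fed into network B1
```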
In some optional implementations of this embodiment, multiple cascaded residual networks connect the end of the down-sampling path to the beginning of the up-sampling path.
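Each residual block at this bottleneck adds its transform back onto its input, y = x + F(x). A toy numeric sketch, with a hypothetical stand-in transform in place of the block's real convolution and activation layers:

```python
def residual_block(x, transform):
    """y = x + F(x): the block adds its transform back to its input."""
    return [xi + fi for xi, fi in zip(x, transform(x))]

def residual_cascade(x, transforms):
    """Apply several residual blocks in series, as at the bottleneck."""
    for t in transforms:
        x = residual_block(x, t)
    return x

double = lambda v: [2.0 * xi for xi in v]  # stand-in for conv + activation layers
# Each block maps x -> x + 2x = 3x, so two cascaded blocks map x -> 9x:
out = residual_cascade([1.0, 2.0], [double, double])
```

The identity path means each block only has to learn a correction F(x), which is what makes deep cascades of such blocks trainable.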
In some optional implementations of this embodiment, the convolution kernel size of the "convolution modules" in the network structure may be chosen as 3×3, 5×5, 7×7, and so on. The number of input and output feature images of each convolution module, and of the convolutional layers inside each module, may be chosen as 8, 16, 32, 64, and so on. The activation function of each convolution module and its internal convolutional layers may be relu, leaky_relu, tanh, and so on.
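The listed activation functions are standard element-wise nonlinearities; scalar reference versions follow (the leaky_relu slope of 0.01 is an illustrative assumption, the text does not fix it):

```python
import math

def relu(x):
    """Zero out negative inputs, pass positives through."""
    return x if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    """Like relu, but with a small negative slope instead of a hard zero."""
    return x if x > 0 else alpha * x

def tanh(x):
    """Squash inputs into the open interval (-1, 1)."""
    return math.tanh(x)

assert relu(-3.0) == 0.0 and relu(2.5) == 2.5
assert leaky_relu(-2.0) == -0.02
assert -1.0 < tanh(-5.0) < tanh(5.0) < 1.0
```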
In one embodiment, a PET image is used as input and an MR image is generated through the network; the result is shown in Fig. 9. The effect is satisfactory, and it can provide doctors with the anatomical structure information of the MR image corresponding to the PET image, for use in locating and diagnosing lesions. The same method and process apply to conversions between other modality pairs (such as PET to CT, CT to MR, CT to PET, MR to CT, and MR to PET).
With further reference to Fig. 10, the present application also provides a multimodal medical image generation apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied in various electronic devices.
As shown in Fig. 10, the multimodal medical image generation apparatus 1000 of this embodiment includes a training device 1001, an input device 1002, and an output device 1003.
The training device 1001 is configured to input the first modality image and the target second modality image into the training network to perform supervised learning, stop training once iterative training reaches the stopping condition, and obtain the generator.
The input device 1002 is configured to input the first modality image into the pre-trained generator and convert the first modality image into a second modality image, the second modality image being of a different modality from the first modality image.
The output device 1003 is configured to output the second modality image.
The second modality image is one of PET, CT, MR, or another medical imaging modality.
The above description covers only preferred embodiments of this specification and is not intended to limit its scope of protection. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this specification shall fall within its scope of protection.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer, which may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device. As defined here, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprising", "including", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference may be made to the corresponding parts of the method embodiments.

Claims (16)

  1. A multimodal medical image generation method, comprising:
    inputting a first modality image and a target second modality image into a training network to perform supervised learning, stopping training once iterative training reaches a stopping condition, and obtaining a generator;
    inputting a first modality image into the pre-trained generator and converting the first modality image into a second modality image, the second modality image being of a different modality from the first modality image; and
    outputting the second modality image.
  2. The method of claim 1, wherein the generator network of the generator adopts a convolutional neural network, a residual network, and a generative adversarial network.
  3. The method of claim 2, wherein the generator network adopts a generative adversarial network, and the network structure of the generator comprises a generator and a discriminator;
    the generator is configured to generate a target modality image; and
    the discriminator is configured to judge whether the generated image meets the requirements, training being stopped once the requirement for stopping training is met.
  4. The method of claim 3, wherein the network structure of the generator uses transfer learning, loading a pre-trained network to speed up network training.
  5. The method of claim 4, wherein the generator performs multi-level down-sampling of the input image followed by corresponding multi-level up-sampling, with multiple residual networks connecting the down-sampling and up-sampling stages.
  6. The method of claim 5, wherein the network combining the down-sampling and the up-sampling includes, but is not limited to, one in which both down-sampling and up-sampling use conventional convolutional layers or residual convolutional layers, and the convolutional-layer activation function is relu, leaky_relu, tanh, or sigmoid.
  7. The method of claim 6, wherein the junction between the down-sampling and up-sampling stages is a direct connection, a fully connected layer, or one or more residual networks.
  8. A multimodal medical image generation apparatus, comprising:
    a training device configured to input a first modality image and a target second modality image into a training network to perform supervised learning, stop training once iterative training reaches a stopping condition, and obtain a generator;
    an input device configured to input a first modality image into the pre-trained generator and convert the first modality image into a second modality image, the second modality image being of a different modality from the first modality image; and
    an output device configured to output the second modality image.
  9. The apparatus of claim 8, wherein the generator network of the generator adopts a convolutional neural network, a residual network, and a generative adversarial network.
  10. The apparatus of claim 9, wherein the network structure of the generator comprises a generator and a discriminator;
    the generator is configured to generate a target modality image; and
    the discriminator is configured to judge whether the generated image meets the requirements, training being stopped once the requirement for stopping training is met.
  11. The apparatus of claim 10, wherein the network structure of the generator uses transfer learning, loading a pre-trained network to speed up network training.
  12. The apparatus of claim 11, wherein the generator performs multi-level down-sampling of the input image followed by corresponding multi-level up-sampling, with multiple residual networks connecting the down-sampling and up-sampling stages.
  13. The apparatus of claim 12, wherein the network combining the down-sampling and the up-sampling includes, but is not limited to, one in which both down-sampling and up-sampling use conventional convolutional layers or residual convolutional layers, and the convolutional-layer activation function is relu, leaky_relu, tanh, or sigmoid.
  14. The apparatus of claim 13, wherein the junction between the down-sampling and up-sampling stages is a direct connection, a fully connected layer, or one or more residual networks.
  15. An electronic device, comprising:
    one or more processors; and
    a storage device storing one or more programs,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
  16. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.
PCT/CN2020/135439 2020-12-10 2020-12-10 Multi-modal medical image generation method and apparatus WO2022120762A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/135439 WO2022120762A1 (en) 2020-12-10 2020-12-10 Multi-modal medical image generation method and apparatus

Publications (1)

Publication Number: WO2022120762A1, published 2022-06-16

Family ID: 81974107

Family Applications (1): PCT/CN2020/135439 — Multi-modal medical image generation method and apparatus (priority/filing date 2020-12-10)

Country Status (1): WO (1) WO2022120762A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160113630A1 (en) * 2014-10-23 2016-04-28 Samsung Electronics Co., Ltd. Ultrasound imaging apparatus and method of controlling the same
CN110111395A (en) * 2019-04-24 2019-08-09 上海理工大学 A method of PET-MRI image is synthesized based on MRI image
CN110234400A (en) * 2016-09-06 2019-09-13 医科达有限公司 For generating the neural network of synthesis medical image
CN110444277A (en) * 2019-07-19 2019-11-12 重庆邮电大学 It is a kind of based on generating multipair anti-multi-modal brain MRI image bi-directional conversion method more
CN110544239A (en) * 2019-08-19 2019-12-06 中山大学 Multi-modal MRI conversion method, system and medium for generating countermeasure network based on conditions
CN110800066A (en) * 2017-07-03 2020-02-14 通用电气公司 Physiological mapping from multi-parameter radiological data

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402865A (en) * 2023-06-06 2023-07-07 之江实验室 Multi-mode image registration method, device and medium using diffusion model
CN116402865B (en) * 2023-06-06 2023-09-15 之江实验室 Multi-mode image registration method, device and medium using diffusion model
CN116433795A (en) * 2023-06-14 2023-07-14 之江实验室 Multi-mode image generation method and device based on countermeasure generation network
CN116433795B (en) * 2023-06-14 2023-08-29 之江实验室 Multi-mode image generation method and device based on countermeasure generation network

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (ref document number 20964700; country of ref document: EP; kind code: A1)
NENP Non-entry into the national phase (ref country code: DE)
122 EP: PCT application non-entry in European phase (ref document number 20964700; country of ref document: EP; kind code: A1)