WO2023065070A1 - Multi-domain medical image segmentation method based on domain adaptation - Google Patents

Multi-domain medical image segmentation method based on domain adaptation

Info

Publication number
WO2023065070A1
Authority
WO
WIPO (PCT)
Prior art keywords
domain
image
style
segmentation
latent space
Prior art date
Application number
PCT/CN2021/124414
Other languages
French (fr)
Chinese (zh)
Inventor
乐美琰
秦文健
谢耀钦
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Priority to PCT/CN2021/124414
Publication of WO2023065070A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation

Definitions

  • the present invention relates to the technical field of medical image processing, and more specifically to a multi-domain medical image segmentation method based on domain adaptation.
  • volume calculation and shape evaluation of the segmented structures can assist clinicians in diagnosing the health status of relevant parts or diagnosing the type of disease.
  • analyzing the dose distribution information within the target area is an essential step in radiotherapy planning.
  • the calculation of features such as the gray distribution in the target area is also helpful for efficacy analysis and prognosis.
  • clinical image segmentation is usually done manually, which is time-consuming and labor-intensive.
  • Transfer learning trains on a domain with a large amount of data to obtain a pre-trained segmentation model, then fine-tunes that model with a small amount of labeled data from each of the other domains to obtain a segmentation model suited to each domain.
  • the disadvantage of this method is that a domain with a large amount of data is needed to pre-train the model, whereas the number of medical images collected in each domain is generally small; the pre-trained model therefore performs poorly, which also degrades model performance after transfer.
  • Style transfer converts the style of medical images using Generative Adversarial Networks (GAN). One domain with a large amount of data is taken as the source domain and the other domains as target domains; a segmentation network is trained on the source domain, a GAN converts images from the other domains into the source-domain style, and the source-domain segmentation network then performs the segmentation. This approach also requires a large number of medical image samples.
  • GAN Generative Adversarial Networks
  • to convert the image style of each target domain into the source-domain style, a GAN must be trained between each target domain and the source domain, so the number of parameters grows linearly with the number of domains and considerable resources are consumed when there are many domains.
  • mapping features to a common space shortens the distance between feature spaces by imposing an adversarial loss or distribution constraints on the features extracted from different domains, yielding a common feature space; the features in that space are then decoded in a unified way to obtain the segmentation result.
  • KL divergence is often used to constrain the distribution of the coding space of VAE. For example, constraining the distribution of the coding space for all domains to a standard Gaussian distribution.
  • VAE Variational Autoencoder
  • when all domains share one VAE, the mean absolute error loss in the VAE allows features in the common space to be decoded back into images that carry the domain style, which indicates that the feature distribution of each domain deviates from the standard Gaussian distribution and that the directions of deviation differ considerably between domains, so an inter-domain gap remains when performing image segmentation. Training a separate VAE for each domain, on the other hand, consumes a great deal of resources.
  • the precise segmentation of medical images can assist doctors to diagnose and treat related diseases more effectively.
  • current medical images are often scattered across multiple sources or modalities, with only a small amount of data in each domain.
  • if images from multiple domains are simply mixed together to train one segmentation model intended to work on all domains, the mapping that the model must learn becomes very complex, so training is very likely to underfit or to overfit to a few domains.
  • the purpose of the present invention is to overcome the above defects of the prior art by providing a multi-domain medical image segmentation method based on domain adaptation, comprising the following steps:
  • Step S1: train a variational autoencoder, comprising an encoder and a decoder, with a set loss function as the objective, so as to extract the latent space codes of different domains and the style vectors of the corresponding domains;
  • Step S2: for an image to be processed, use the variational autoencoder to infer its domain and extract the domain style vector, then subtract the style vector of that domain from the latent space code of the image to obtain a destylized latent space code;
  • Step S3: input the destylized latent space code into the decoder to reconstruct an image with a unified style;
  • Step S4: input the unified-style image into the trained segmentation network to obtain a segmentation result.
  • the advantage of the present invention is that destylizing the latent space code eliminates the domain gap between multi-domain data and yields images with a unified style for training the segmentation network. Compared with training directly on a single domain, this increases the amount of data available for training the segmentation network and can improve segmentation accuracy. In addition, the invention in theory places no limit on the number of domains, and adding domains does not significantly increase the number of network parameters.
  • Fig. 1 is a schematic diagram of a multi-domain medical image segmentation process based on domain adaptation according to an embodiment of the present invention
  • Fig. 2 is a flowchart of a multi-domain medical image segmentation method based on domain adaptation according to an embodiment of the present invention.
  • the present invention aims to extract latent space encoding features of different domains by training a shared VAE to reduce resource occupation.
  • the domain gap between multi-domain data is then eliminated by destylizing the latent space encoding.
  • when the VAE is trained, the latent space encoding features of images in different domains are guided into non-overlapping distributions, thereby avoiding the directional deviation that occurs when all domains are guided to a standard Gaussian distribution.
  • all domains will be mapped to a common feature space, and then a unified style image will be decoded to train the subsequent segmentation network. In this way, images from all domains are used to train the segmentation network, which is equivalent to enriching the amount of training data.
  • referring to Fig. 1, for the sake of clarity the method is described in terms of functional modules, which comprise a VAE module, a destylization module and a segmentation module.
  • M_i represents an image of the i-th domain
  • Y_i denotes the gold standard (reference segmentation) of the i-th domain image, alongside the network's segmentation result for that image.
  • the provided multi-domain medical image segmentation method based on domain adaptation includes the following steps.
  • Step S210: train a variational autoencoder to extract latent space coding features from multi-domain images.
  • the variational autoencoder includes an encoder and a decoder, where the encoder takes multi-domain images as input to extract latent space encoding features, and the decoder reconstructs images from those features.
  • the variational autoencoder encodes the image into a feature representation vector that contains the information of the original image (that is, the original image can be recovered by decoding).
  • the loss function for training a variational autoencoder is expressed as:
  • D_KL represents the loss formed by the KL divergence, which drives two distributions as close together as possible.
  • the calculation for this term is expressed as:
  • Σ_i represents the covariance matrix of the latent space encoding of the i-th domain image
  • x represents a sample drawn from the Gaussian distribution N(μ, Σ)
  • μ_i represents the mean of the latent space encoding of the i-th domain image.
  • W represents the width of the image
  • H represents the height of the image
  • D represents the discriminator
  • in essence, the VAE uses the encoder to learn the style of the image (the latent space code) while preserving structural information, and then decodes the structure together with a selected style to obtain reconstructed images with the same structure but different styles.
  • the style encoding vector is closer to the set one-hot encoding, so that the image style in the later segmentation is easier to control and eliminate.
  • Step S220: extract the image style of each domain, subtract the domain's style vector from the latent space code of the image to obtain a destylized latent space code, and reconstruct an image with a unified style.
  • the style of the image can be extracted and controlled.
  • it is first necessary to determine the domain of the image, and then to subtract the style vector of that domain from the latent space encoding μ_i of the image, so that the latent space codes of images in all domains approach (0,0,...,0); this is the destylized latent space code.
  • inputting the destylized latent space encoding vector into the decoder reconstructs a unified-style image, which is used for the subsequent training of the unified segmentation network.
  • the domain of each image is known only in the training phase, not in the testing phase. During testing (or when an image to be processed is actually segmented), an image of unknown domain is received. It must be input to the VAE encoder, its domain is inferred from the resulting latent space encoding (as shown in formula 5), and the code is finally de-centered using the style vector of the inferred domain.
  • Step S230: train a segmentation network using the unified-style images.
  • the image segmentation network uses the U-Net framework, and the update of its parameters is supervised by two loss terms, Dice and cross entropy.
  • the total segmentation loss L seg is expressed as:
  • Y represents the gold standard of segmentation tasks
  • N c represents the number of categories to be segmented
  • x and y represent the spatial coordinates
  • ⁇ 3 is the set weight parameter.
  • images of all domains can be used to train the segmentation network, making full use of the data and increasing the training sample size of the segmentation network.
  • the present invention designs a multi-domain medical image segmentation method based on domain adaptation, which maps samples from all domains to a common feature space, improves the utilization of labeled multi-domain medical images, and makes the trained unified segmentation network more robust.
  • a method for encoding domain style with a VAE is proposed, in which the style encoding vector is drawn toward a set encoding value, so that image style is easier to control in the later segmentation stage and easier to eliminate.
  • a style removal method is proposed, in which the defined domain style vector is subtracted from the latent space vector obtained by VAE encoding to obtain the destylized latent space vector.
  • the domain deviations obtained in this way do not show strong directional differences, which provides a more efficient way to map images of different domains to a common feature space.
  • the present invention can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present invention.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • RAM random access memory
  • ROM read-only memory
  • EPROM erasable programmable read-only memory
  • SRAM static random access memory
  • CD-ROM compact disc read only memory
  • DVD digital versatile disc
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
  • Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the “C” language or similar programming languages.
  • Computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • LAN local area network
  • WAN wide area network
  • an electronic circuit such as a programmable logic circuit, field programmable gate array (FPGA), or programmable logic array (PLA)
  • FPGA field programmable gate array
  • PLA programmable logic array
  • These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks in the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium, and these instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks in flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or may be implemented by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by means of hardware, implementation by means of software, and implementation by a combination of software and hardware are all equivalent.
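The segmentation loss described under Step S230 above combines a Dice term and a cross-entropy term through a set weight parameter, but its formula is not reproduced in this text. The following is a minimal sketch of one common way to combine the two, not the patent's own formula. The function names, the `weight` argument, the array layout `(N_c, H, W)` and the choice to put the weight on the cross-entropy term are all assumptions made for illustration.

```python
import numpy as np

def dice_loss(probs, target, eps=1e-6):
    """1 minus the mean soft Dice over classes.

    probs, target: arrays of shape (N_c, H, W); target is one-hot per pixel.
    """
    inter = np.sum(probs * target, axis=(1, 2))
    denom = np.sum(probs, axis=(1, 2)) + np.sum(target, axis=(1, 2))
    dice = (2.0 * inter + eps) / (denom + eps)
    return 1.0 - float(np.mean(dice))

def cross_entropy_loss(probs, target, eps=1e-12):
    """Mean pixel-wise cross entropy between one-hot target and probabilities."""
    per_pixel = -np.sum(target * np.log(probs + eps), axis=0)
    return float(np.mean(per_pixel))

def segmentation_loss(probs, target, weight=1.0):
    """Dice loss plus `weight` times cross entropy (assumed weighting form)."""
    return dice_loss(probs, target) + weight * cross_entropy_loss(probs, target)

# Toy example: 2 classes on a 2x2 image, left column class 0, right column class 1.
target = np.array([[[1.0, 0.0], [1.0, 0.0]],
                   [[0.0, 1.0], [0.0, 1.0]]])
uniform = np.full_like(target, 0.5)  # an uninformative prediction
print(segmentation_loss(target, target), segmentation_loss(uniform, target))
```

A perfect prediction gives a loss of approximately zero, while the uninformative prediction is penalised by both terms.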

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A multi-domain medical image segmentation method based on domain adaptation. The method comprises: training a variational autoencoder by using a set loss function as a target to extract latent space codes of different domains and style vectors of corresponding domains, wherein the variational autoencoder comprises an encoder and a decoder; for an image to be processed, using the variational autoencoder to infer domain information and extract domain style vectors, and subtracting style vectors of corresponding domains from latent space codes of the image to obtain de-stylized latent space codes; inputting the de-stylized latent space codes into the decoder to reconstruct an image having a unified style; and inputting the image having a unified style into a trained segmentation network to obtain a segmentation result. According to the method, domain gaps between multi-domain data are eliminated, the data volume for training the segmentation network is increased, and the image segmentation precision is improved.

Description

A Multi-Domain Medical Image Segmentation Method Based on Domain Adaptation
Technical Field
The present invention relates to the technical field of medical image processing, and more specifically to a multi-domain medical image segmentation method based on domain adaptation.
Background Art
Clinically, precise segmentation of medical images is of great significance. First, computing the volume and evaluating the shape of segmented structures can help clinicians assess the health of the relevant body part or identify the type of disease. Second, analyzing the dose distribution within the target region is an essential step in radiotherapy planning. In addition, features such as the gray-level distribution within the target region are useful for efficacy analysis and prognosis. At present, clinical image segmentation is usually done manually, which is time-consuming and labor-intensive.
In recent years, the development of artificial intelligence has brought new opportunities for medical image segmentation, but it requires collecting large amounts of medical image data to train machine learning models. Because of privacy protection, medical imaging data are difficult to collect. For a given body part, it is comparatively easy to collect a small amount of data from each of several hospitals. However, because instruments and image acquisition standards differ between hospitals, such multi-source data differ considerably even within the same modality. Furthermore, because doctors habitually view and compare several modalities at once and tend to delineate the target region on whichever modality shows it most clearly, labeled data differ significantly even when collected at the same hospital. As a result, medical images fall into many domains, each containing only a small amount of data, which poses a great challenge for training a single image segmentation network.
In the prior art, methods that use multi-domain medical images to segment a target region fall into three main categories: transfer learning, style transfer, and mapping features to a common space. Transfer learning trains on a domain with a large amount of data to obtain a pre-trained segmentation model, then fine-tunes that model separately with a small amount of labeled data from each of the other domains to obtain a segmentation model suited to each domain. The drawback is that a domain with a large amount of data is needed for pre-training, whereas the number of medical images collected in each domain is generally small; the pre-trained model therefore performs poorly, which in turn degrades the models obtained after transfer.
Style transfer converts the style of medical images using Generative Adversarial Networks (GAN). One domain with a large amount of data is taken as the source domain and all other domains as target domains. A segmentation network is trained on the source domain, a GAN converts images from the other domains into the source-domain style, and the source-domain segmentation network then segments the converted images. This approach also requires a large number of medical image samples. Moreover, a separate GAN must be trained between each target domain and the source domain, so the number of parameters grows linearly with the number of domains and considerable resources are consumed when there are many domains.
Mapping features to a common space, in turn, pulls the feature spaces closer together by imposing an adversarial loss or distribution constraints on the features extracted from different domains, yielding a common feature space; the features in that space are then decoded in a unified way to obtain the segmentation result.
Current schemes for mapping features to a common space usually use a VAE (Variational Autoencoder) to extract features. To map the images of every domain into the same feature space, KL divergence is often used to constrain the distribution of the VAE's latent space, for example by constraining the latent distribution of every domain to a standard Gaussian. When all domains share one VAE, the mean absolute error loss in the VAE allows features in the common space to be decoded back into images that carry the domain style. This indicates that the feature distribution of each domain deviates from the standard Gaussian, and that the directions of deviation differ considerably between domains, so an inter-domain gap remains when segmenting. Training a separate VAE for each domain, on the other hand, consumes a great deal of resources.
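The standard-Gaussian constraint mentioned in this background has a well-known closed form when the encoder outputs a diagonal Gaussian. The sketch below illustrates that generic VAE term only; it is not taken from the patent, and the names `kl_to_standard_normal`, `mu` and `log_var` are illustrative assumptions.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# A latent code that already matches the prior incurs no penalty,
# while a code whose mean is shifted (for example by a domain-specific
# offset) is penalised in proportion to the squared shift.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
print(kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))
```

Because the reconstruction loss pushes each domain's mean away from zero in its own direction, this penalty is never fully satisfied for all domains at once, which is the residual inter-domain gap the background describes.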
In summary, precise segmentation of medical images can help doctors diagnose and treat disease more effectively. However, current medical images are often scattered across multiple sources or modalities, with only a small amount of data in each domain. If images from multiple domains are simply mixed together to train one segmentation model intended to work on all domains, the mapping that the model must learn becomes very complex, so training is very likely to underfit or to overfit to a few domains.
Summary of the Invention
The object of the present invention is to overcome the above defects of the prior art by providing a multi-domain medical image segmentation method based on domain adaptation, comprising the following steps:
Step S1: train a variational autoencoder, comprising an encoder and a decoder, with a set loss function as the objective, so as to extract the latent space codes of different domains and the style vectors of the corresponding domains;
Step S2: for an image to be processed, use the variational autoencoder to infer its domain and extract the domain style vector, then subtract the style vector of that domain from the latent space code of the image to obtain a destylized latent space code;
Step S3: input the destylized latent space code into the decoder to reconstruct an image with a unified style;
Step S4: input the unified-style image into the trained segmentation network to obtain a segmentation result.
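Steps S2 to S4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the encoder, decoder and segmentation network are passed in as plain callables, the style vectors are taken to be one-hot vectors whose length equals the latent dimension, and the domain is inferred by picking the nearest style vector. The patent's own inference rule (its formula 5) is not reproduced in this text, so that rule and all function names here are hypothetical.

```python
import numpy as np

def infer_domain(latent, style_vectors):
    """Return the index of the style vector closest to the latent code (assumed rule)."""
    distances = np.linalg.norm(style_vectors - latent, axis=1)
    return int(np.argmin(distances))

def destylize(latent, style_vectors):
    """Step S2: subtract the inferred domain's style vector from the latent code."""
    domain = infer_domain(latent, style_vectors)
    return latent - style_vectors[domain], domain

def segment_unified(image, encode, decode, segment, style_vectors):
    """Steps S2 to S4 with caller-supplied encoder, decoder and segmentation network."""
    latent = encode(image)                              # latent space code
    neutral, domain = destylize(latent, style_vectors)  # destylized code
    unified = decode(neutral)                           # Step S3: unified-style image
    return segment(unified), domain                     # Step S4: segmentation result

# Toy check with three domains and one-hot style vectors.
style_vectors = np.eye(3)
latent = np.array([0.1, 0.9, -0.05])   # lies near the style vector of domain 1
neutral, domain = destylize(latent, style_vectors)
print(domain, neutral)
```

After subtraction the code sits near the origin regardless of which domain the image came from, which is what lets a single decoder produce images in one style.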
Compared with the prior art, the advantage of the present invention is that destylizing the latent space code eliminates the domain gap between multi-domain data and yields images with a unified style for training the segmentation network. Compared with training directly on a single domain, this increases the amount of data available for training the segmentation network and can improve segmentation accuracy. In addition, the invention in theory places no limit on the number of domains, and adding domains does not significantly increase the number of network parameters.
Other features of the present invention and their advantages will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic diagram of a multi-domain medical image segmentation process based on domain adaptation according to an embodiment of the present invention;
Fig. 2 is a flowchart of a multi-domain medical image segmentation method based on domain adaptation according to an embodiment of the present invention.
具体实施方式Detailed ways
现在将参照附图来详细描述本发明的各种示例性实施例。应注意到: 除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值不限制本发明的范围。Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that the relative arrangements of components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
以下对至少一个示例性实施例的描述实际上仅仅是说明性的,决不作为对本发明及其应用或使用的任何限制。The following description of at least one exemplary embodiment is merely illustrative in nature and in no way taken as limiting the invention, its application or uses.
对于相关领域普通技术人员已知的技术、方法和设备可能不作详细讨论,但在适当情况下,所述技术、方法和设备应当被视为说明书的一部分。Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be considered part of the description.
在这里示出和讨论的所有例子中,任何具体值应被解释为仅仅是示例性的,而不是作为限制。因此,示例性实施例的其它例子可以具有不同的值。In all examples shown and discussed herein, any specific values should be construed as exemplary only, and not as limitations. Therefore, other instances of the exemplary embodiment may have different values.
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步讨论。It should be noted that like numerals and letters denote like items in the following figures, therefore, once an item is defined in one figure, it does not require further discussion in subsequent figures.
The present invention trains a single shared variational autoencoder (VAE) to extract the latent space encodings of the different domains, which reduces resource usage. The domain gap between multi-domain data is then eliminated by destylizing the latent space encoding. When the VAE is trained, the latent space encodings of images from different domains are guided toward non-overlapping distributions, which avoids the directional bias that arises when all domains are guided toward a standard Gaussian distribution. After the latent space encoding has been destylized, all domains are mapped to a common feature space, and unified-style images are decoded from it to train the subsequent segmentation network. In this way, images from all domains are used to train the segmentation network, which effectively enlarges the training set.
Referring to FIG. 1, the method is described, for clarity, in terms of functional modules: a VAE module, a destylization module and a segmentation module. Suppose there are N domains in total, where $M_i$ denotes an image of the i-th domain; $\hat{M}_{i\to j}$ denotes the reconstructed image of the j-th domain obtained by transforming and decoding the latent space features extracted from an image of the i-th domain; $\tilde{M}_i$ denotes the unified-style image decoded after the feature vector of the i-th domain has been decentralized; and $\hat{Y}_i$ and $Y_i$ denote the segmentation result and the gold standard for the image of the i-th domain, respectively.
Specifically, with reference to FIG. 1 and FIG. 2, the provided multi-domain medical image segmentation method based on domain adaptation includes the following steps.
Step S210: train a variational autoencoder to extract latent space encoding features from multi-domain images.
The variational autoencoder comprises an encoder and a decoder. The encoder takes multi-domain images as input and extracts latent space encoding features; the decoder reconstructs images from those features. The variational autoencoder thus encodes an image into a feature representation vector that retains the information of the original image (that is, the original image can be recovered by decoding).
In one embodiment, the loss function for training the variational autoencoder is expressed as:
$$L_{VAE} = D_{KL} + \lambda_1 L_{rec} + \lambda_2 L_{adv} \tag{1}$$
where $D_{KL}$ denotes the loss formed by the KL divergence, which drives two distributions as close together as possible, and $\lambda_1$ and $\lambda_2$ are the weights of the corresponding terms. For example, the KL term is computed as:
$$D_{KL} = D_{KL}\big(\mathcal{N}(\mu_i, \Sigma_i)\,\|\,\mathcal{N}(\mathbf{s}_i, \mathbf{I})\big) = \mathbb{E}_{x \sim \mathcal{N}(\mu_i, \Sigma_i)}\left[\log \frac{\mathcal{N}(x; \mu_i, \Sigma_i)}{\mathcal{N}(x; \mathbf{s}_i, \mathbf{I})}\right] \tag{2}$$
where $\mathbf{s}_i$ denotes the style vector of the domain, a vector of length N whose i-th component is 5 and whose other components are all 0, and $\mathbf{I}$ denotes the N×N identity matrix; the intent is that the latent space encodings of samples from different domains approach different Gaussian distributions. $\Sigma_i$ denotes the encoded covariance matrix of the image latent space of the i-th domain, and the target covariance $\mathbf{I}$ is the same as the covariance matrix of the standard Gaussian distribution. x denotes a sample drawn from the Gaussian distribution $\mathcal{N}(\mu, \Sigma)$, and $\mu_i$ denotes the encoded mean of the image latent space of the i-th domain.
$L_{rec}$ is the absolute error between the reconstructed image and the original image, computed pixel by pixel; the mean may be used. This loss ensures that the latent space encoding can be decoded back to the original image, that is, that the extracted encoding retains the structure and other information of the original image. Besides decoding back to an image of the same domain, the latent space encoding can also be transformed, $\mu_i \to \mu_{i\to j}$, so that it decodes to an image of the j-th domain. In this way, images of the other N−1 domains can be obtained, and images of the corresponding domain can then be drawn from the real data to compute the adversarial loss $L_{adv}$ between the two, which makes the reconstructed images more realistic.
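The KL term toward a domain-specific prior can be sketched in a few lines. The sketch below assumes a diagonal covariance parameterised by log-variances, which is a common VAE convention but is not stated in the source; under that assumption the divergence has the closed form $\tfrac{1}{2}\sum_k\big(\sigma_k^2 + (\mu_k - s_k)^2 - 1 - \ln\sigma_k^2\big)$. The function names `style_vector` and `kl_to_domain_prior` are illustrative, not taken from the patent.

```python
import math

STYLE_MAGNITUDE = 5.0  # value of the non-zero style component, as stated in the text


def style_vector(domain_index, num_domains, magnitude=STYLE_MAGNITUDE):
    """Length-N style vector: `magnitude` at domain_index, 0 elsewhere."""
    return [magnitude if k == domain_index else 0.0 for k in range(num_domains)]


def kl_to_domain_prior(mu, log_var, domain_index):
    """KL( N(mu, diag(exp(log_var))) || N(s_i, I) ) in closed form.

    Assumption: diagonal covariance given as log-variances (not specified
    in the source). The latent length equals the number of domains N.
    """
    s = style_vector(domain_index, len(mu))
    total = 0.0
    for m, lv, s_k in zip(mu, log_var, s):
        # variance + squared mean offset - 1 - log-variance, per dimension
        total += math.exp(lv) + (m - s_k) ** 2 - 1.0 - lv
    return 0.5 * total
```

An encoding that already sits on its own domain prior incurs zero penalty, while the same encoding measured against another domain's prior is penalised for both displaced components; this separation is what keeps the per-domain distributions from overlapping.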
In one embodiment, $L_{rec}$ and $L_{adv}$ are set as:
$$L_{rec} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left| M_i(x, y) - \hat{M}_{i\to i}(x, y) \right| \tag{3}$$
$$L_{adv} = \mathbb{E}_{M \sim P_{M_j}}\big[\log D(M)\big] + \mathbb{E}_{\hat{M} \sim P_{\hat{M}_{i\to j}}}\big[\log\big(1 - D(\hat{M})\big)\big] \tag{4}$$
where W denotes the width of the image, H denotes the height of the image, $M_i(x, y)$ denotes the value of image $M_i$ at position (x, y), $\hat{M}_{i\to i}(x, y)$ denotes the value of the reconstructed image $\hat{M}_{i\to i}$ at position (x, y), D denotes the discriminator, $P_{M_i}$ denotes the data distribution of the i-th image domain (so $P_{M_j}$ is that of the j-th domain), and $P_{\hat{M}_{i\to j}}$ denotes the data distribution of images of the j-th domain generated from images of the i-th domain.
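As a minimal sketch of the two terms defined above, the following computes a mean absolute pixel error and an adversarial term from discriminator outputs. The adversarial term uses the standard GAN log-likelihood form, which is an assumption here: the source only states that an adversarial loss is computed between real and generated images of the same domain. Images are plain nested lists, and all function names are illustrative.

```python
import math


def l1_reconstruction_loss(original, reconstructed):
    """Mean absolute pixel error between two H x W images (lists of rows)."""
    height = len(original)
    width = len(original[0])
    total = 0.0
    for row_o, row_r in zip(original, reconstructed):
        for p_o, p_r in zip(row_o, row_r):
            total += abs(p_o - p_r)
    return total / (width * height)


def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN objective (assumed form) from discriminator outputs in (0, 1).

    d_real: discriminator outputs on real images of domain j
    d_fake: discriminator outputs on images translated from domain i to j
    The discriminator tries to increase this value; the generator (here the
    VAE decoder) tries to decrease the second term.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

In a real training loop these quantities would be computed on tensors by a deep learning framework; the list-based version is only meant to make the arithmetic explicit.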
In this step, the essence of the VAE is to use the encoder to learn the style of the image (the latent space encoding) while preserving structural information, and then to decode the structure together with a selected style, yielding reconstructed images that share the same structure but differ in style. During training, the style encoding vector is preferably pulled toward a preset one-hot encoding, so that the image style is easier to control and eliminate in the later segmentation stage.
Step S220: extract the image style of each domain, subtract the domain style vector from the latent space encoding of the image to obtain a destylized latent space encoding, and reconstruct a unified-style image.
After the VAE module has been trained, the style of an image can be extracted and controlled. To unify the image styles of the domains, the domain of the image is first determined, and the style vector $\mathbf{s}_i$ of that domain is then subtracted from the latent space encoding $\mu_i$ of the image, so that the latent space encodings of images in all domains approach (0, 0, …, 0); this is the destylized latent space encoding. Feeding the destylized latent space encoding vector into the decoder reconstructs a unified-style image $\tilde{M}_i$, which is used to train the subsequent unified segmentation network.
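The destylization step amounts to a vector subtraction, which the following sketch makes concrete. It assumes the latent mean has length N (one component per domain), as implied by the N×N identity matrix in the KL term; `style_vector` and `destylize` are hypothetical helper names, and the decoder that would consume the result is not shown.

```python
def style_vector(domain_index, num_domains, magnitude=5.0):
    """Length-N style vector: `magnitude` (5 in the text) at domain_index, 0 elsewhere."""
    return [magnitude if k == domain_index else 0.0 for k in range(num_domains)]


def destylize(mu, domain_index):
    """Subtract the style vector of the given domain from the latent mean.

    For a well-trained encoder the result should lie close to the origin,
    regardless of which domain the image came from.
    """
    s = style_vector(domain_index, len(mu))
    return [m - s_k for m, s_k in zip(mu, s)]
```

For instance, an encoding near (4.8, 0.1, −0.2) from domain 0 is moved to roughly (−0.2, 0.1, −0.2), so images from different domains end up in the same neighbourhood of the latent space before decoding.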
It should be noted that the domain of each image is known only in the training phase and is unknown in the testing phase. At test time (or when an image to be processed is actually segmented), the domain of the input image is unknown. The image is therefore fed to the encoder of the VAE, the domain is inferred from the resulting latent space encoding as shown in formula (5), and decentralization is finally performed with the style vector of the inferred domain.
$$\text{domain}^{*} = \arg\max\big(\text{softmax}(\mu_i)\big) \tag{5}$$
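Formula (5) can be illustrated with a short standard-library sketch. The names `softmax` and `infer_domain` are illustrative; the latent mean is assumed to be a plain list with one component per domain.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def infer_domain(mu):
    """Index of the most probable domain for a latent mean `mu`."""
    probs = softmax(mu)
    return max(range(len(probs)), key=probs.__getitem__)
```

Because softmax is monotonic, the argmax of the probabilities equals the argmax of the raw encoding; the softmax is useful mainly when a confidence value for the inferred domain is also wanted.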
Step S230: train the segmentation network with the unified-style images.
In one embodiment, the image segmentation network uses the U-Net framework, and the update of the network parameters is supervised by a weighted combination of a Dice loss and a cross-entropy loss. The total segmentation loss $L_{Seg}$ is expressed as:
$$L_{Seg} = L_{Dice} + \lambda_3 L_{CE} \tag{6}$$
$$L_{Dice} = 1 - \frac{2}{N_c} \sum_{c=1}^{N_c} \frac{\sum_{x,y} Y_c(x, y)\, \hat{Y}_c(x, y)}{\sum_{x,y} Y_c(x, y) + \sum_{x,y} \hat{Y}_c(x, y)} \tag{7}$$
$$L_{CE} = -\frac{1}{WH} \sum_{x,y} \sum_{c=1}^{N_c} Y_c(x, y) \log \hat{Y}_c(x, y) \tag{8}$$
where Y denotes the gold standard of the segmentation task, $\hat{Y}$ denotes the prediction of the segmentation network, $N_c$ denotes the number of segmentation classes, x and y denote spatial coordinates, $\hat{Y}_c(x, y)$ denotes the probability, predicted by the segmentation network, that position (x, y) belongs to the c-th class, and $\lambda_3$ is a preset weight parameter.
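The combined segmentation loss can be sketched as follows. The exact normalisation of the Dice and cross-entropy terms is an assumption (class-averaged soft Dice and pixel-averaged cross-entropy are used here), as are the smoothing constants and the default weight `lambda_3=1.0`. Inputs are nested lists indexed as `[class][y][x]`, with a one-hot gold standard and predicted class probabilities.

```python
import math


def dice_loss(gold, pred, eps=1e-7):
    """1 minus the class-averaged soft Dice coefficient."""
    dice_sum = 0.0
    for g_c, p_c in zip(gold, pred):
        inter = sum(g * p for g_row, p_row in zip(g_c, p_c)
                    for g, p in zip(g_row, p_row))
        denom = sum(map(sum, g_c)) + sum(map(sum, p_c))
        dice_sum += (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice_sum / len(gold)


def cross_entropy_loss(gold, pred, eps=1e-12):
    """Pixel-averaged multi-class cross-entropy."""
    height = len(gold[0])
    width = len(gold[0][0])
    total = 0.0
    for g_c, p_c in zip(gold, pred):
        for g_row, p_row in zip(g_c, p_c):
            for g, p in zip(g_row, p_row):
                total -= g * math.log(p + eps)
    return total / (height * width)


def segmentation_loss(gold, pred, lambda_3=1.0):
    """Dice loss plus weighted cross-entropy; lambda_3 is a placeholder value."""
    return dice_loss(gold, pred) + lambda_3 * cross_entropy_loss(gold, pred)
```

The Dice term is sensitive to region overlap and copes with class imbalance, while cross-entropy gives a smooth per-pixel gradient; weighting them with $\lambda_3$ lets the balance be tuned per task.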
In this step, images from all domains can be used to train the segmentation network, which makes full use of the data and increases the number of training samples available to the segmentation network.
It should be noted that those skilled in the art may make appropriate changes or modifications to the above embodiments without departing from the spirit and scope of the present invention. For example, other loss functions, such as a likelihood loss or an exponential loss, may be used to train the variational autoencoder or the image segmentation network.
In summary, the present invention provides a multi-domain medical image segmentation method based on domain adaptation that maps the samples of all domains to a common feature space, which improves the utilization of labeled multi-domain medical images and makes the trained unified segmentation network more robust. It also proposes a VAE-based way of encoding domain style, in which the style encoding vector is pulled toward a preset encoding value, so that the image style is easier to control in the later segmentation stage and easier to eliminate. In addition, a style removal method is proposed: the latent space vector obtained by VAE encoding is subtracted by the defined domain style vector to obtain a style-free latent space vector. Compared with directly pulling the distribution of the latent feature vectors toward the standard Gaussian distribution, the domain deviation obtained in this way does not exhibit strong directional differences, which provides a more effective way of mapping images of different domains to a common feature space.
The present invention may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
A computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, a field-programmable gate array (FPGA) or a programmable logic array (PLA), is personalized by using state information of the computer-readable program instructions, and this circuitry can execute the computer-readable program instructions to implement various aspects of the present invention.
Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of instructions that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims (10)

  1. A multi-domain medical image segmentation method based on domain adaptation, comprising the following steps:
    Step S1: training a variational autoencoder with a set loss function as the objective, to extract latent space encodings of different domains and style vectors of the corresponding domains, wherein the variational autoencoder comprises an encoder and a decoder;
    Step S2: for an image to be processed, using the variational autoencoder to infer domain information and extract the domain style vector, and subtracting the style vector of the corresponding domain from the latent space encoding of the image to obtain a destylized latent space encoding;
    Step S3: inputting the destylized latent space encoding into the decoder to reconstruct a unified-style image;
    Step S4: inputting the unified-style image into a trained segmentation network to obtain a segmentation result.
  2. The method according to claim 1, wherein the loss function for training the variational autoencoder is set as:
    $$L_{VAE} = D_{KL} + \lambda_1 L_{rec} + \lambda_2 L_{adv}$$
    where $D_{KL}$ denotes the loss formed by the KL divergence, expressed as:
    $$D_{KL} = D_{KL}\big(\mathcal{N}(\mu_i, \Sigma_i)\,\|\,\mathcal{N}(\mathbf{s}_i, \mathbf{I})\big) = \mathbb{E}_{x \sim \mathcal{N}(\mu_i, \Sigma_i)}\left[\log \frac{\mathcal{N}(x; \mu_i, \Sigma_i)}{\mathcal{N}(x; \mathbf{s}_i, \mathbf{I})}\right]$$
    where $\mathbf{s}_i$ denotes a vector of length N, $\mathbf{I}$ denotes the N×N identity matrix, $\Sigma_i$ denotes the encoded covariance matrix of the image latent space of the i-th domain, $\mu_i$ denotes the encoded mean of the image latent space of the i-th domain, $L_{rec}$ denotes the absolute error between the reconstructed image and the original image computed pixel by pixel, the latent space encoding is transformed, $\mu_i \to \mu_{i\to j}$, so that it decodes to an image of the j-th domain, $L_{adv}$ denotes the adversarial loss between images of the corresponding domain drawn from the real data and the reconstructed images, $\lambda_1$ and $\lambda_2$ denote the weights of the corresponding terms, and x denotes a sample drawn from the Gaussian distribution $\mathcal{N}(\mu, \Sigma)$.
  3. The method according to claim 2, wherein $L_{rec}$ and $L_{adv}$ are respectively set as:
    $$L_{rec} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left| M_i(x, y) - \hat{M}_{i\to i}(x, y) \right|$$
    $$L_{adv} = \mathbb{E}_{M \sim P_{M_j}}\big[\log D(M)\big] + \mathbb{E}_{\hat{M} \sim P_{\hat{M}_{i\to j}}}\big[\log\big(1 - D(\hat{M})\big)\big]$$
    where W denotes the width of the image, H denotes the height of the image, $M_i(x, y)$ denotes the value of image $M_i$ at position (x, y), $\hat{M}_{i\to i}(x, y)$ denotes the value of the reconstructed image $\hat{M}_{i\to i}$ at position (x, y), D denotes the discriminator, $P_{M_i}$ denotes the data distribution of the i-th image domain (so $P_{M_j}$ is that of the j-th domain), and $P_{\hat{M}_{i\to j}}$ denotes the data distribution of images of the j-th domain generated from images of the i-th domain.
  4. The method according to claim 1, wherein in step S2 the domain information is inferred according to the following formula:
    $$\text{domain}^{*} = \arg\max\big(\text{softmax}(\mu_i)\big)$$
    where $\mu_i$ denotes the latent space encoding.
  5. The method according to claim 1, wherein the loss function for training the segmentation network is set as:
    $$L_{Seg} = L_{Dice} + \lambda_3 L_{CE}$$
    where $L_{Seg}$ denotes the total segmentation loss, $L_{Dice}$ denotes the Dice loss, $L_{CE}$ denotes the cross-entropy loss, and $\lambda_3$ denotes the weight coefficient.
  6. The method according to claim 5, wherein the Dice loss and the cross-entropy loss are respectively set as:
    $$L_{Dice} = 1 - \frac{2}{N_c} \sum_{c=1}^{N_c} \frac{\sum_{x,y} Y_c(x, y)\, \hat{Y}_c(x, y)}{\sum_{x,y} Y_c(x, y) + \sum_{x,y} \hat{Y}_c(x, y)}$$
    $$L_{CE} = -\frac{1}{WH} \sum_{x,y} \sum_{c=1}^{N_c} Y_c(x, y) \log \hat{Y}_c(x, y)$$
    where Y denotes the gold standard of the segmentation task, $\hat{Y}$ denotes the prediction of the segmentation network, $N_c$ denotes the number of segmentation classes, x and y denote spatial coordinates, and $\hat{Y}_c(x, y)$ denotes the probability, predicted by the segmentation network, that position (x, y) belongs to the c-th class.
  7. The method according to claim 1, wherein, during training of the variational autoencoder, the style encoding vector is pulled toward a preset one-hot encoding.
  8. The method according to claim 1, wherein the segmentation network adopts the U-Net framework and is trained with images from multiple domains.
  9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
  10. A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2021/124414 2021-10-18 2021-10-18 Multi-domain medical image segmentation method based on domain adaptation WO2023065070A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/124414 WO2023065070A1 (en) 2021-10-18 2021-10-18 Multi-domain medical image segmentation method based on domain adaptation

Publications (1)

Publication Number Publication Date
WO2023065070A1 true WO2023065070A1 (en) 2023-04-27

Family

ID=86057828

Country Status (1)

Country Link
WO (1) WO2023065070A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110853604A (en) * 2019-10-30 2020-02-28 西安交通大学 Automatic generation method of Chinese folk songs with specific region style based on variational self-encoder
CN111161249A (en) * 2019-12-31 2020-05-15 复旦大学 Unsupervised medical image segmentation method based on domain adaptation
CN112785692A (en) * 2021-01-29 2021-05-11 东南大学 Single-view-angle multi-person human body reconstruction method based on depth UV prior
US20210166088A1 (en) * 2019-09-29 2021-06-03 Tencent Technology (Shenzhen) Company Limited Training method and apparatus for image fusion processing model, device, and storage medium

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20210166088A1 (en) * 2019-09-29 2021-06-03 Tencent Technology (Shenzhen) Company Limited Training method and apparatus for image fusion processing model, device, and storage medium
CN110853604A (en) * 2019-10-30 2020-02-28 西安交通大学 Automatic generation method of Chinese folk songs with specific region style based on variational self-encoder
CN111161249A (en) * 2019-12-31 2020-05-15 复旦大学 Unsupervised medical image segmentation method based on domain adaptation
CN112785692A (en) * 2021-01-29 2021-05-11 东南大学 Single-view-angle multi-person human body reconstruction method based on depth UV prior

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN116758286A (en) * 2023-06-25 2023-09-15 中国人民解放军总医院 Medical image segmentation method, system, device, storage medium and product
CN116758286B (en) * 2023-06-25 2024-02-06 中国人民解放军总医院 Medical image segmentation method, system, device, storage medium and product
CN117111696A (en) * 2023-09-07 2023-11-24 脉得智能科技(无锡)有限公司 Medical image segmentation method and training method of medical image segmentation model

Similar Documents

Publication Publication Date Title
Castiglioni et al. AI applications to medical images: From machine learning to deep learning
US10902588B2 (en) Anatomical segmentation identifying modes and viewpoints with deep learning across modalities
US11688518B2 (en) Deep neural network based identification of realistic synthetic images generated using a generative adversarial network
WO2023065070A1 (en) Multi-domain medical image segmentation method based on domain adaptation
You et al. Bootstrapping semi-supervised medical image segmentation with anatomical-aware contrastive distillation
JP7119865B2 (en) Information processing method and device, and information detection method and device
Xu et al. Robust contour tracking in ultrasound tongue image sequences
Wang et al. Uncertainty-guided efficient interactive refinement of fetal brain segmentation from stacks of MRI slices
CN113256592B (en) Training method, system and device of image feature extraction model
CN111667027B (en) Multi-modal image segmentation model training method, image processing method and device
Mokhtari et al. EchoGNN: explainable ejection fraction estimation with graph neural networks
Cui et al. Supervised machine learning for coronary artery lumen segmentation in intravascular ultrasound images
Amador et al. Predicting treatment-specific lesion outcomes in acute ischemic stroke from 4D CT perfusion imaging using spatio-temporal convolutional neural networks
Reynaud et al. Feature-conditioned cascaded video diffusion models for precise echocardiogram synthesis
Patel et al. Cross attention transformers for multi-modal unsupervised whole-body pet anomaly detection
Tiago et al. A domain translation framework with an adversarial denoising diffusion model to generate synthetic datasets of echocardiography images
Li et al. wUnet: A new network used for ultrasonic tongue contour extraction
CN115190999A (en) Classifying data outside of a distribution using contrast loss
Cai et al. 3D Medical Image Segmentation with Sparse Annotation via Cross-Teaching Between 3D and 2D Networks
CN116958693A (en) Image analysis method, apparatus, device, storage medium, and program product
WO2022262219A1 (en) Method for constructing semantic perturbation reconstruction network of self-supervised point cloud learning
CN113592972B (en) Magnetic resonance image reconstruction method and device based on multi-mode aggregation
Zhang et al. Multi-attention networks for temporal localization of video-level labels
CN112086174B (en) Three-dimensional knowledge diagnosis model construction method and system
Yang et al. Cervical nuclei segmentation in whole slide histopathology images using convolution neural network

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21960831

Country of ref document: EP

Kind code of ref document: A1