WO2023197612A1 - A medical image segmentation method based on automatic data augmentation - Google Patents

A medical image segmentation method based on automatic data augmentation

Info

Publication number
WO2023197612A1
Authority
WO
WIPO (PCT)
Prior art keywords
strategy
sub
data augmentation
iteration
training set
Prior art date
Application number
PCT/CN2022/134722
Other languages
English (en)
French (fr)
Inventor
刘敏
刘庆浩
张哲�
范文培
王耀南
Original Assignee
湖南大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 湖南大学
Publication of WO2023197612A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30056 Liver; Hepatic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Definitions

  • The invention belongs to the technical field of medical image segmentation and relates to a medical image segmentation method based on automatic data augmentation.
  • Data augmentation is an effective way to enlarge a data set. It is widely used in computer vision tasks with notable results, and it is often used to compensate for insufficient medical image data. Typical data augmentation methods in medical image segmentation include rotation, vertical flipping and random scaling. Because the number of samples differs greatly between the types of disease to be identified, the optimal data augmentation strategy differs from task to task. Experiments show that an unsuitable data augmentation strategy reduces the segmentation accuracy of the model. A new medical image segmentation task therefore needs its own reasonable and effective data augmentation strategy, and manually tuning the probability and magnitude of each augmentation operation takes professional experience and a lot of time.
  • The purpose of the present invention is to provide a medical image segmentation method based on automatic data augmentation that designs a dedicated search space for medical image segmentation tasks. The method is suitable for most medical image segmentation tasks and thereby improves the segmentation accuracy of deep learning medical image segmentation models.
  • The present invention provides a medical image segmentation method based on automatic data augmentation, which includes the following steps:
  • Step S1: Randomly divide the original training set into a training set and a validation set according to a set ratio.
  • Step S2: Construct a data augmentation search space and obtain a sampling sub-strategy based on the search space.
  • Step S3: Train with the sampling sub-strategy obtained in step S2 on the training set obtained in step S1. In each iteration, update the network weights through stochastic gradient descent, apply the updated network weights to the validation set obtained in step S1, calculate the validation set loss and use it for forward propagation, and update the strategy parameters through proximal iteration. When the validation set loss is minimal, the data augmentation sub-strategy is obtained.
  • Step S4: In the retraining stage, apply the data augmentation sub-strategy obtained in step S3 to the original training set (the training set described in step S1) for data augmentation, train on the augmented training set to obtain the optimal network weights, and use these network weights for inference to obtain the target region.
  • Step S1 is specifically:
  • The original training set without data augmentation is randomly divided into a training set Φ_train and a validation set Φ_val in a ratio of 1:1.
  • The training set and validation set comprise a medical image segmentation database with annotation information.
  • Step S2 includes the following steps:
  • S201: Design a search space S for the original data set. The search space S uses 17 operations: contrast enhancement, brightness adjustment, Gamma transformation, Gaussian noise, adaptive histogram equalization, horizontal flipping, vertical flipping, elastic transformation, optical distortion, grid distortion, rotation, random scaling, affine transformation, horizontal translation, vertical translation, horizontal shear and vertical shear.
  • S202: The sub-strategies in the search space S follow a categorical distribution, as shown in formulas (1) and (2).
  • The search space S contains 136 sub-strategies, and the sampling sub-strategy is obtained according to the categorical distribution.
  • In formula (1), x represents the image;
  • S is the data augmentation strategy search space;
  • s is a sub-strategy;
  • s(x) is the result of applying sub-strategy s to image x;
  • h_s is a one-hot vector.
  • In formula (2), H_s is the categorical distribution;
  • a_s represents the probability that the sub-strategy is applied.
  • Step S3 includes the following steps:
  • S301: Apply the sampling sub-strategy to the training set obtained in step S1, and then train the neural network. In each iteration, update the network weights through stochastic gradient descent, and use the updated network weights to calculate the validation set loss L_val.
  • The proximal iteration method is used to reduce the validation set loss, as shown in formulas (3) and (4). When the validation set loss L_val is minimal, N data augmentation sub-strategies are obtained.
  • In formulas (3) and (4), w* is the optimal network weight;
  • w is the network weight;
  • E represents the mathematical expectation of the loss function during network training;
  • φ is the strategy parameter, φ = {a, b, v};
  • L_train is the training set loss;
  • parameter a represents the probability that the sub-strategy is applied;
  • parameter b represents the probability that an augmentation method is applied;
  • parameter v represents the magnitude of each augmentation method in the sub-strategy.
  • The proximal iteration adopted is:
  • Q is the intersection of constraint Q_1 and constraint Q_2; ‖φ‖_0 denotes the zero norm of φ;
  • The proximal iteration is finally obtained as:
  • Compared with the prior art, the present invention has the following advantages:
  • The present invention provides a medical image segmentation method based on automatic data augmentation.
  • Experimental results on a publicly available liver tumor data set show that the proposed algorithm achieves advanced performance under a basic network architecture.
  • Compared with existing algorithms, the efficiency of the strategy search of this algorithm is improved by at least one order of magnitude.
  • Figure 1 is a flow chart of the medical image segmentation method based on automatic data augmentation of the present invention.
  • Figure 2 is the strategy search flow chart of the medical image segmentation method based on automatic data augmentation of the present invention.
  • Figure 3 compares the validation set loss during the search stage of the automatic data augmentation strategy.
  • The following experiments were performed on LiTs, a publicly available liver tumor data set that contains liver and tumor labels.
  • The training set and test set contain 130 and 70 CT scans, respectively.
  • The present invention provides a medical image segmentation method based on automatic data augmentation, which includes the following steps:
  • Step 1: Randomly divide the original LiTs training set, without data augmentation, into a training set and a validation set in a ratio of 1:1.
  • The training set and validation set comprise a medical image segmentation database with annotation information.
  • Step 2: Construct a data augmentation search space and obtain a sampling sub-strategy based on the search space.
  • S201: Design a search space for the LiTs data set. The search space uses 17 operations: contrast enhancement, brightness adjustment, Gamma transformation, Gaussian noise, adaptive histogram equalization, horizontal flipping, vertical flipping, elastic transformation, optical distortion, grid distortion, rotation, random scaling, affine transformation, horizontal translation, vertical translation, horizontal shear and vertical shear.
  • S202: The sub-strategies in the search space S follow a categorical distribution, as shown in formulas (1) and (2).
  • The search space S contains 136 sub-strategies, and the sampling sub-strategy is obtained according to the categorical distribution.
  • In formula (1), x represents the image;
  • S is the data augmentation strategy search space;
  • s is a sub-strategy;
  • s(x) is the result of applying sub-strategy s to image x;
  • h_s is a one-hot vector.
  • Two data augmentation methods are sampled according to a Bernoulli distribution, and the two sampled methods are applied in sequence to each batch of images.
  • Step 3: Train with the sampling sub-strategy obtained in Step 2 on the training set. In each iteration, update the network weights through stochastic gradient descent, use the updated network weights to calculate the validation set loss, and update the strategy parameters through proximal iteration. When the validation set loss is minimal, the data augmentation sub-strategy is obtained.
  • S301: Iterative search for the data augmentation strategy. Apply the sampling sub-strategy to the training set, and then train the neural network. In each iteration, update the network weights through stochastic gradient descent, apply the updated network weights to the validation set, calculate the validation set loss L_val, and reduce it through proximal iteration to optimize the strategy parameters, as shown in formulas (3) and (4). When the validation set loss L_val is minimal, the iterative search ends and N data augmentation sub-strategies are obtained, ordered by probability from largest to smallest.
  • In formulas (3) and (4), w* is the optimal network weight;
  • w is the network weight;
  • E represents the mathematical expectation of the loss function during network training;
  • φ is the strategy parameter, φ = {a, b, v};
  • L_train is the training set loss;
  • parameter a represents the probability that the sub-strategy is applied;
  • parameter b represents the probability that an augmentation method is applied;
  • parameter v represents the magnitude of each augmentation method in the sub-strategy.
  • Step 4: In the retraining stage, apply the N data augmentation sub-strategies obtained to the original LiTs training set for data augmentation, train on the augmented training set to obtain the optimal network weights, and use these network weights for inference to obtain the tumor region to be segmented in the LiTs data set.
  • The invention aims to search for a set of data augmentation sub-strategies suitable for multiple medical image segmentation tasks, and then to apply the sub-strategies obtained in the search stage to common medical segmentation networks such as UNet, UNet++, DenseUNet, MANet, nnUNet and FPN.
  • DenseNet achieves feature reuse by concatenating features along the channel dimension, reaching better performance with fewer parameters and lower computational cost.
  • Densenet161 was chosen as the encoder for UNet; this UNet variant is called DenseUNet.
  • Comparative experiments were conducted on the LiTs data set, and the comparison results for liver tumor segmentation are shown in Table 1.
  • The results in Table 1 show that the searched data augmentation strategy improves segmentation accuracy when applied to the networks, and that the best segmentation result obtained by the proposed method is better than that of nnUNet, which is regarded as the best segmentation framework in medical image segmentation.
  • The proposed algorithm takes about 5 hours to search for the data augmentation strategy, whereas existing data augmentation methods require more than 100 hours of search time. These results demonstrate the effectiveness and importance of the proposed algorithm for medical image segmentation.
  • In addition, the automatic data augmentation algorithm DADA is transferred from the natural image domain to medical image segmentation, and traditional data augmentation strategies and DADA are compared with the method of the present invention.
  • MANet, a relatively recent liver tumor segmentation network, was selected as the baseline on the liver tumor data set. In Table 2, * indicates the data augmentation strategy used when that method was originally proposed.
  • UNet was selected as the baseline on the LiTs data set. In Table 3, * indicates a combination of traditional data augmentation transformations: random brightness and contrast, random gamma, elastic transformation, grid distortion, optical distortion, and rotation and scaling.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

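The present invention discloses a medical image segmentation method based on automatic data augmentation, comprising: S1, randomly dividing the original training set into a training set and a validation set according to a set ratio; S2, constructing a data augmentation search space and obtaining a sampling sub-strategy; S3, training with the sampling sub-strategy on the training set, where in each iteration the network weights are updated through stochastic gradient descent, the updated network weights are used to calculate the validation set loss, and the strategy parameters are updated through proximal iteration, the data augmentation sub-strategy being obtained when the validation set loss is minimal; S4, in the retraining stage, applying the data augmentation sub-strategy to the original training set for data augmentation, training on the augmented training set to obtain the optimal network weights, and using these network weights for inference to obtain the target region. The algorithm adopted by the invention achieves advanced performance under a basic network architecture, and the efficiency of its strategy search is improved by at least one order of magnitude.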

Description

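A medical image segmentation method based on automatic data augmentation
Technical Field
The invention belongs to the technical field of medical image segmentation and relates to a medical image segmentation method based on automatic data augmentation.
Background
In recent years, deep neural networks have made great progress in medical image segmentation and have contributed to the rapid development of intelligent healthcare, benefiting disease diagnosis, pathological analysis and surgical planning for patients. Based on existing medical image segmentation data sets, researchers in China and abroad have proposed a variety of medical segmentation models. However, the performance of medical image segmentation models depends largely on large-scale labeled data, while medical image data sets are generally small. This is because growing awareness of patient privacy protection makes patient case data harder to obtain, and labeling medical images takes a great deal of time and effort from professional physicians. In addition, owing to the diversity of diseases and of detection methods, multi-modal disease data have a large span and low density. Many challenges and open problems therefore remain in medical image segmentation.
Data augmentation is an effective way to enlarge a data set. It is widely used in computer vision tasks with notable results, and it is often used to compensate for insufficient medical image data. Typical data augmentation methods in medical image segmentation include rotation, vertical flipping and random scaling. Because the number of samples differs greatly between the types of disease to be identified, the optimal data augmentation strategy differs from task to task. Experiments show that an unsuitable data augmentation strategy reduces the segmentation accuracy of the model. A new medical image segmentation task therefore needs its own reasonable and effective data augmentation strategy, and manually tuning the probability and magnitude of each augmentation operation takes professional experience and a lot of time.
For these reasons, it is necessary to develop a medical image segmentation method based on automatic data augmentation.
Summary of the Invention
In view of the problems of medical image segmentation data sets and of existing data augmentation methods, the purpose of the present invention is to provide a medical image segmentation method based on automatic data augmentation. The method designs a dedicated search space for medical image segmentation tasks, is suitable for most such tasks, and thereby improves the segmentation accuracy of deep learning medical image segmentation models.
To achieve the above purpose, the present invention provides the following technical solution:
The present invention provides a medical image segmentation method based on automatic data augmentation, comprising the following steps:
S1: Randomly divide the original training set into a training set and a validation set according to a set ratio;
S2: Construct a data augmentation search space and obtain a sampling sub-strategy based on the search space;
S3: Train with the sampling sub-strategy obtained in step S2 on the training set obtained in step S1. In each iteration, update the network weights through stochastic gradient descent, apply the updated network weights to the validation set obtained in step S1, calculate the validation set loss and use it for forward propagation, and update the strategy parameters through proximal iteration. When the validation set loss is minimal, the data augmentation sub-strategy is obtained;
S4: In the retraining stage, apply the data augmentation sub-strategy obtained in step S3 to the original training set (the training set described in step S1) for data augmentation, train on the augmented training set to obtain the optimal network weights, and use these network weights for inference to obtain the target region.
Preferably, step S1 is specifically:
The original training set without data augmentation is randomly divided in a ratio of 1:1 into a training set Φ_train and a validation set Φ_val; the training set and validation set comprise a medical image segmentation database with annotation information.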
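The 1:1 random split of step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_half`, the fixed seed, and the use of integer case identifiers are assumptions; the count of 130 matches the LiTs training-set size reported in Embodiment 1.

```python
import random

def split_half(case_ids, seed=0):
    """Randomly split case identifiers 1:1 into (train, val)."""
    ids = list(case_ids)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# Hypothetical identifiers standing in for 130 annotated training scans.
train_ids, val_ids = split_half(range(130))
print(len(train_ids), len(val_ids))
```

Splitting by case rather than by slice keeps all slices of one scan on the same side of the split; that choice is an assumption of this sketch, since the text does not say at which granularity the split is made.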
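Preferably, step S2 includes the following steps:
S201: Design a search space S for the original data set. The search space S uses 17 operations: contrast enhancement, brightness adjustment, Gamma transformation, Gaussian noise, adaptive histogram equalization, horizontal flipping, vertical flipping, elastic transformation, optical distortion, grid distortion, rotation, random scaling, affine transformation, horizontal translation, vertical translation, horizontal shear and vertical shear;
S202: The sub-strategies in the search space S follow a categorical distribution, as shown in formulas (1) and (2). The search space S contains 136 sub-strategies, and the sampling sub-strategy is obtained according to the categorical distribution;
Figure PCTCN2022134722-appb-000001
Figure PCTCN2022134722-appb-000002
In formula (1), x represents the image; S is the data augmentation strategy search space; s is a sub-strategy;
Figure PCTCN2022134722-appb-000003
is the sampled sub-strategy; s(x) is the result of applying sub-strategy s to image x; h_s is a one-hot vector;
In formula (2), H_s is the categorical distribution; a_s represents the probability that the sub-strategy is applied;
Two data augmentation methods are sampled according to a Bernoulli distribution, and the two sampled methods are applied in sequence to each batch of images;
Meanwhile, to increase the diversity of strategies, the magnitude of each data augmentation method is set within a continuous range and the bounds of that interval are determined; in addition to the magnitudes, the probability with which each operation is applied must also be searched.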
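How a sub-strategy of two operations, each with its own probability and magnitude, might be applied to a batch can be sketched as below. This is an illustration under stated assumptions: the two toy operations, the triple layout `(op, probability, magnitude)`, and the function name `apply_sub_strategy` are hypothetical and do not come from the patent. Each operation is gated by one Bernoulli draw per batch.

```python
import numpy as np

def adjust_brightness(img, magnitude):
    """Shift intensities by `magnitude`, keeping values in [0, 1]."""
    return np.clip(img + magnitude, 0.0, 1.0)

def horizontal_flip(img, magnitude):
    """Mirror the image left to right; the magnitude is unused."""
    return img[:, ::-1]

def apply_sub_strategy(batch, sub_strategy, rng):
    """Apply each (op, probability, magnitude) triple in order to a batch."""
    out = batch
    for op, prob, magnitude in sub_strategy:
        if rng.random() < prob:  # Bernoulli draw: is this operation applied?
            out = np.stack([op(img, magnitude) for img in out])
    return out

rng = np.random.default_rng(0)
batch = rng.random((4, 8, 8))  # four toy 8x8 images with values in [0, 1)
sub_strategy = [(adjust_brightness, 0.8, 0.1), (horizontal_flip, 0.5, 0.0)]
augmented = apply_sub_strategy(batch, sub_strategy, rng)
```

In a segmentation setting, geometric operations such as the flip would also have to be applied to the label mask so that image and annotation stay aligned; the sketch omits masks for brevity.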
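Preferably, step S3 includes the following steps:
S301: Apply the sampling sub-strategy to the training set obtained in step S1, and then train the neural network. In each iteration, update the network weights through stochastic gradient descent, use the updated network weights to calculate the validation set loss L_val, and reduce the validation set loss through proximal iteration, as shown in formulas (3) and (4). When the validation set loss L_val is minimal, N data augmentation sub-strategies are obtained;
$$\min\; L_{val}(w^{*}, \Phi_{val}) \qquad (3)$$
$$\text{s.t.}\quad w^{*} = \arg\min\; E\big(L_{train}(w, \varphi, \Phi_{train})\big) \qquad (4)$$
In formula (3), w* is the optimal network weight;
In formula (4), "s.t." introduces the constraint and w* is the optimal network weight; w is the network weight; E represents the mathematical expectation of the loss function during network training; φ is the strategy parameter, φ = {a, b, v}; L_train is the training set loss;
S302: In each iteration of network training, the strategy parameters φ = {a, b, v} are updated through proximal iteration according to whether the validation set loss L_val is minimal;
Here, parameter a represents the probability that the sub-strategy is applied, parameter b represents the probability that an augmentation method is applied, and parameter v represents the magnitude of each augmentation method in the sub-strategy.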
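The alternation in S301 and S302, one weight update on the training loss followed by one strategy-parameter update on the validation loss, can be illustrated on a toy scalar problem. Everything here is an assumption made for illustration: the quadratic losses, the learning rates, and the one-step unrolled gradient are stand-ins, not the networks or losses of the patent. The toy uses L_train(w, φ) = (w − φ)² and L_val(w) = (w − 1)², with φ constrained to [0, 1].

```python
def search(steps=200, lr_w=0.1, lr_phi=0.5):
    """Alternate weight and strategy updates on a toy bi-level problem."""
    w, phi = 0.0, 0.0
    for _ in range(steps):
        # Inner step: one SGD update of w on L_train(w, phi) = (w - phi)^2.
        w = w - lr_w * 2.0 * (w - phi)
        # Outer step: gradient of L_val(w) = (w - 1)^2 with respect to phi,
        # taken through the update above, where d(w)/d(phi) = 2 * lr_w.
        grad_phi = 2.0 * (w - 1.0) * 2.0 * lr_w
        # Gradient step followed by projection onto 0 <= phi <= 1.
        phi = min(max(phi - lr_phi * grad_phi, 0.0), 1.0)
    return w, phi

w_star, phi_star = search()
print(w_star, phi_star)
```

On this toy problem the strategy parameter moves to the value whose induced weights minimize the validation loss (φ = 1, w close to 1), which mirrors the role formulas (3) and (4) assign to φ and w*.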
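Further preferably, to solve the optimization problem of formulas (3) and (4), the proximal iteration adopted is as follows:
φ is optimized as a continuous variable subject to constraint Q_2, and a discrete variable
Figure PCTCN2022134722-appb-000004
is introduced; during the iterations,
Figure PCTCN2022134722-appb-000005
is subject to constraint Q_1, which is derived from φ;
where
$$Q = Q_1 \cap Q_2$$
$$Q_1 = \{\varphi \mid \|\varphi\|_0 = 1\}$$
$$Q_2 = \{\varphi \mid 0 \le \varphi \le 1\}$$
In these expressions, Q is the intersection of constraint Q_1 and constraint Q_2; ‖φ‖_0 denotes the zero norm of φ;
The proximal iteration is finally obtained as:
Figure PCTCN2022134722-appb-000006
where prox_Q(φ) is the proximal iteration; φ is the strategy parameter, φ = {a, b, v}.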
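The final proximal formula survives here only as an image reference, so the following is a sketch of one plausible reading of the constraints rather than the patent's formula: a projection onto Q = Q_1 ∩ Q_2 that keeps the single largest entry (zero norm equal to 1) and clips it to [0, 1]. The function name `prox_q` and the treatment of φ as a flat vector are assumptions.

```python
import numpy as np

def prox_q(phi):
    """Project phi onto Q = {phi : ||phi||_0 = 1, 0 <= phi <= 1}."""
    phi = np.asarray(phi, dtype=float)
    out = np.zeros_like(phi)
    k = int(np.argmax(phi))              # Q1: keep only the largest entry
    out[k] = np.clip(phi[k], 0.0, 1.0)   # Q2: clip the survivor to [0, 1]
    return out

print(prox_q([0.2, 1.3, 0.7]))  # [0. 1. 0.]
```

When no entry of φ is positive, the clipped survivor is 0 and the result has zero norm 0 rather than 1; a real implementation would need to decide how to handle that edge case.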
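Compared with the prior art, the present invention has the following advantages:
The present invention provides a medical image segmentation method based on automatic data augmentation. Experimental results on a publicly available liver tumor data set show that the proposed algorithm achieves advanced performance under a basic network architecture.
The present invention provides a medical image segmentation method based on automatic data augmentation. Compared with existing algorithms, the efficiency of the strategy search of this algorithm is improved by at least one order of magnitude.
Brief Description of the Drawings
Figure 1 is a flow chart of the medical image segmentation method based on automatic data augmentation of the present invention.
Figure 2 is the strategy search flow chart of the medical image segmentation method based on automatic data augmentation of the present invention.
Figure 3 compares the validation set loss during the search stage of the automatic data augmentation strategy.
Detailed Description
The technical solution of the present invention is described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.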
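Embodiment 1
Experiments were carried out on LiTs, a publicly available liver tumor data set that contains liver and tumor labels. The data were provided by the 2017 Liver Tumor Segmentation Challenge organized by MICCAI and ISBI. The training set and test set contain 130 and 70 CT scans, respectively.
As shown in Figures 1 and 2, the present invention provides a medical image segmentation method based on automatic data augmentation, comprising the following steps:
Step 1: Randomly divide the original LiTs training set, without data augmentation, into a training set and a validation set in a ratio of 1:1; the training set and validation set comprise a medical image segmentation database with annotation information;
Step 2: Construct a data augmentation search space and obtain a sampling sub-strategy based on the search space;
S201: Design a search space for the LiTs data set. The search space uses 17 operations: contrast enhancement, brightness adjustment, Gamma transformation, Gaussian noise, adaptive histogram equalization, horizontal flipping, vertical flipping, elastic transformation, optical distortion, grid distortion, rotation, random scaling, affine transformation, horizontal translation, vertical translation, horizontal shear and vertical shear;
S202: The sub-strategies in the search space S follow a categorical distribution, as shown in formulas (1) and (2). The search space S contains 136 sub-strategies, and the sampling sub-strategy is obtained according to the categorical distribution;
Figure PCTCN2022134722-appb-000007
Figure PCTCN2022134722-appb-000008
In formula (1), x represents the image; S is the data augmentation strategy search space; s is a sub-strategy;
Figure PCTCN2022134722-appb-000009
is the sampled sub-strategy; s(x) is the result of applying sub-strategy s to image x; h_s is a one-hot vector;
In formula (2), H_s is the categorical distribution; a_s represents the probability that the sub-strategy is applied; s is a sub-strategy;
Two data augmentation methods are sampled according to a Bernoulli distribution, and the two sampled methods are applied in sequence to each batch of images;
Meanwhile, to increase the diversity of strategies, the magnitude of each data augmentation method is set within a continuous range and the bounds of that interval are determined; in addition to the magnitudes, the probability with which each operation is applied must also be searched;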
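The stated size of the search space is consistent with treating a sub-strategy as an unordered pair of two distinct operations, since 17 × 16 / 2 = 136. The sketch below checks that count. The pairing rule is an assumption inferred from the numbers (17 operations, two methods per sub-strategy, 136 sub-strategies), and the snake_case operation names are illustrative labels for the operations listed in S201.

```python
from itertools import combinations

OPERATIONS = [
    "contrast_enhancement", "brightness_adjustment", "gamma_transform",
    "gaussian_noise", "adaptive_histogram_equalization", "horizontal_flip",
    "vertical_flip", "elastic_transform", "optical_distortion",
    "grid_distortion", "rotation", "random_scaling", "affine_transform",
    "horizontal_translation", "vertical_translation",
    "horizontal_shear", "vertical_shear",
]

# Assumption: a sub-strategy is an unordered pair of two distinct operations.
SUB_STRATEGIES = list(combinations(OPERATIONS, 2))
print(len(OPERATIONS), len(SUB_STRATEGIES))  # 17 136
```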
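Step 3: Train with the sampling sub-strategy obtained in Step 2 on the training set. In each iteration, update the network weights through stochastic gradient descent, use the updated network weights to calculate the validation set loss, and update the strategy parameters through proximal iteration. When the validation set loss is minimal, the data augmentation sub-strategy is obtained;
S301: Iterative search for the data augmentation strategy. Apply the sampling sub-strategy to the training set, and then train the neural network. In each iteration, update the network weights through stochastic gradient descent, apply the updated network weights to the validation set, calculate the validation set loss L_val, and reduce it through proximal iteration to optimize the strategy parameters, as shown in formulas (3) and (4). When the validation set loss L_val is minimal, the iterative search ends and N data augmentation sub-strategies are obtained, ordered by probability from largest to smallest;
$$\min\; L_{val}(w^{*}, \Phi_{val}) \qquad (3)$$
$$\text{s.t.}\quad w^{*} = \arg\min\; E\big(L_{train}(w, \varphi, \Phi_{train})\big) \qquad (4)$$
In formula (3), w* is the optimal network weight.
In formula (4), "s.t." introduces the constraint and w* is the optimal network weight; w is the network weight; E represents the mathematical expectation of the loss function during network training; φ is the strategy parameter, φ = {a, b, v}; L_train is the training set loss;
S302: In each iteration of network training, the strategy parameters φ = {a, b, v} are updated through proximal iteration according to whether the validation set loss L_val is minimal;
Here, parameter a represents the probability that the sub-strategy is applied, parameter b represents the probability that an augmentation method is applied, and parameter v represents the magnitude of each augmentation method in the sub-strategy;
Step 4: In the retraining stage, apply the N data augmentation sub-strategies obtained to the original LiTs training set for data augmentation, train on the augmented training set to obtain the optimal network weights, and use these network weights for inference to obtain the tumor region to be segmented in the LiTs data set.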
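Selecting the N sub-strategies carried into the retraining stage, ordered by searched probability from largest to smallest, could look like the sketch below. The probabilities, the sub-strategy names, and the helper `top_n_sub_strategies` are hypothetical values chosen for illustration; the patent does not report the searched probabilities or the value of N.

```python
def top_n_sub_strategies(probabilities, n):
    """Return the n sub-strategies with the highest searched probability."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]

# Hypothetical searched probabilities a_s for four sub-strategies.
searched = {
    ("gaussian_noise", "random_scaling"): 0.22,
    ("rotation", "gamma_transform"): 0.31,
    ("grid_distortion", "vertical_shear"): 0.20,
    ("horizontal_flip", "elastic_transform"): 0.27,
}
selected = top_n_sub_strategies(searched, n=2)
print(selected)
```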
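Comparison with existing methods: the invention aims to search for a set of data augmentation sub-strategies suitable for multiple medical image segmentation tasks, and then to apply the sub-strategies obtained in the search stage to common medical segmentation networks such as UNet, UNet++, DenseUNet, MANet, nnUNet and FPN. DenseNet achieves feature reuse by concatenating features along the channel dimension, reaching better performance with fewer parameters and lower computational cost. Densenet161 was chosen as the encoder for UNet; this UNet variant is called DenseUNet.
Comparative experiments were conducted on the LiTs data set, and the comparison results for liver tumor segmentation are shown in Table 1. The results in Table 1 show that the searched data augmentation strategy improves segmentation accuracy when applied to the networks, and that the best segmentation result obtained by the proposed method is better than that of nnUNet, which is regarded as the best segmentation framework in medical image segmentation. In addition, as shown in Table 4, the proposed algorithm takes about 5 hours to search for the data augmentation strategy, whereas existing data augmentation methods require more than 100 hours of search time. These results demonstrate the effectiveness and importance of the proposed algorithm for medical image segmentation.
Table 1. Comparison results for liver tumor segmentation on the LiTs data set
Figure PCTCN2022134722-appb-000010
Table 2. Comparison of traditional and automatic data augmentation algorithms on the LiTs data set
Figure PCTCN2022134722-appb-000011
Table 3. Comparison of traditional and automatic data augmentation algorithms on the LiTs data set
Figure PCTCN2022134722-appb-000012
Table 4. Comparison of GPU hours
Figure PCTCN2022134722-appb-000013
In addition, the automatic data augmentation algorithm DADA is transferred from the natural image domain to medical image segmentation, and traditional data augmentation strategies and DADA are compared with the method of the present invention. In the implementation, MANet, a relatively recent liver tumor segmentation network, was selected as the baseline on the liver tumor data set; in Table 2, * indicates the data augmentation strategy used when that method was originally proposed. UNet was also selected as a baseline on the LiTs data set; in Table 3, * indicates a combination of traditional data augmentation transformations: random brightness and contrast, random gamma, elastic transformation, grid distortion, optical distortion, and rotation and scaling.
As shown in Tables 2 and 3, the algorithm adopted by the invention outperforms both DADA and the traditional data augmentation algorithms, which confirms the efficiency of the proposed algorithm. The change in validation set loss during the search stage is plotted in Figure 3: the algorithm addresses the weak robustness of DADA and also converges faster than DADA. These experiments demonstrate the excellent robustness and good convergence of the method of the invention.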

Claims (5)

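  1. A medical image segmentation method based on automatic data augmentation, comprising the following steps:
    S1: randomly dividing the original training set into a training set and a validation set according to a set ratio;
    S2: constructing a data augmentation search space and obtaining a sampling sub-strategy based on the search space;
    S3: training with the sampling sub-strategy obtained in step S2 on the training set obtained in step S1, wherein in each iteration the network weights are updated through stochastic gradient descent, the updated network weights are used to calculate the validation set loss, and the strategy parameters are updated through proximal iteration, the data augmentation sub-strategy being obtained when the validation set loss is minimal;
    S4: in the retraining stage, applying the data augmentation sub-strategy obtained in step S3 to the original training set for data augmentation, training on the augmented training set to obtain the optimal network weights, and using these network weights for inference to obtain the target region.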
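  2. The medical image segmentation method based on automatic data augmentation according to claim 1, characterized in that step S1 is specifically:
    randomly dividing the original training set without data augmentation in a ratio of 1:1 into a training set Φ_train and a validation set Φ_val, the training set and validation set comprising a medical image segmentation database with annotation information.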
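  3. The medical image segmentation method based on automatic data augmentation according to claim 1, characterized in that step S2 includes the following steps:
    S201: designing a search space S for the original data set, the search space S using 17 operations: contrast enhancement, brightness adjustment, Gamma transformation, Gaussian noise, adaptive histogram equalization, horizontal flipping, vertical flipping, elastic transformation, optical distortion, grid distortion, rotation, random scaling, affine transformation, horizontal translation, vertical translation, horizontal shear and vertical shear;
    S202: the sub-strategies in the search space S follow a categorical distribution, as shown in formulas (1) and (2); the search space S contains 136 sub-strategies, and the sampling sub-strategy is obtained according to the categorical distribution;
    Figure PCTCN2022134722-appb-100001
    Figure PCTCN2022134722-appb-100002
    in formula (1), x represents the image; S is the data augmentation strategy search space; s is a sub-strategy;
    Figure PCTCN2022134722-appb-100003
    is the sampled sub-strategy; s(x) is the result of applying sub-strategy s to image x; h_s is a one-hot vector;
    in formula (2), H_s is the categorical distribution; a_s represents the probability that the sub-strategy is applied;
    two data augmentation methods are sampled according to a Bernoulli distribution, and the two sampled methods are applied in sequence to each batch of images;
    meanwhile, to increase the diversity of strategies, the magnitude of each data augmentation method is set within a continuous range and the bounds of that interval are determined; in addition to the magnitudes, the probability with which each operation is applied must also be searched.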
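  4. The medical image segmentation method based on automatic data augmentation according to claim 1, characterized in that step S3 includes the following steps:
    S301: applying the sampling sub-strategy to the training set obtained in step S1 and then training the neural network, wherein in each iteration the network weights are updated through stochastic gradient descent, the updated network weights are used to calculate the validation set loss L_val, and the validation set loss is reduced through proximal iteration, as shown in formulas (3) and (4); when the validation set loss L_val is minimal, N data augmentation sub-strategies are obtained;
    $$\min\; L_{val}(w^{*}, \Phi_{val}) \qquad (3)$$
    $$\text{s.t.}\quad w^{*} = \arg\min\; E\big(L_{train}(w, \varphi, \Phi_{train})\big) \qquad (4)$$
    in formula (3), w* is the optimal network weight;
    in formula (4), "s.t." introduces the constraint and w* is the optimal network weight; w is the network weight; E represents the mathematical expectation of the loss function during network training; φ is the strategy parameter, φ = {a, b, v}; L_train is the training set loss;
    S302: in each iteration of network training, updating the strategy parameters φ = {a, b, v} through proximal iteration according to whether the validation set loss L_val is minimal;
    wherein parameter a represents the probability that the sub-strategy is applied, parameter b represents the probability that an augmentation method is applied, and parameter v represents the magnitude of each augmentation method in the sub-strategy.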
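  5. The medical image segmentation method based on automatic data augmentation according to claim 4, characterized in that, to solve the optimization problem of formulas (3) and (4), the proximal iteration adopted is as follows:
    φ is optimized as a continuous variable subject to constraint Q_2, and a discrete variable
    Figure PCTCN2022134722-appb-100004
    is introduced; during the iterations,
    Figure PCTCN2022134722-appb-100005
    is subject to constraint Q_1, which is derived from φ;
    where
    $$Q = Q_1 \cap Q_2$$
    $$Q_1 = \{\varphi \mid \|\varphi\|_0 = 1\}$$
    $$Q_2 = \{\varphi \mid 0 \le \varphi \le 1\}$$
    in these expressions, Q is the intersection of constraint Q_1 and constraint Q_2; ‖φ‖_0 denotes the zero norm of φ;
    the proximal iteration is finally obtained as:
    Figure PCTCN2022134722-appb-100006
    where prox_Q(φ) is the proximal iteration; φ is the strategy parameter, φ = {a, b, v}.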
PCT/CN2022/134722 2022-04-15 2022-11-28 A medical image segmentation method based on automatic data augmentation WO2023197612A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210395449.7 2022-04-15
CN202210395449.7A CN114693935A (zh) 2022-04-15 2022-04-15 A medical image segmentation method based on automatic data augmentation

Publications (1)

Publication Number Publication Date
WO2023197612A1 (zh) 2023-10-19

Family

ID=82143613

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134722 WO2023197612A1 (zh) 2022-04-15 2022-11-28 A medical image segmentation method based on automatic data augmentation

Country Status (2)

Country Link
CN (1) CN114693935A (zh)
WO (1) WO2023197612A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
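CN117763356A (zh) * 2023-12-26 2024-03-26 中国地质科学院地质力学研究所 A rapid seismic facies identification method based on the LightGBM algorithm
CN117765532A (zh) * 2024-02-22 2024-03-26 中国科学院宁波材料技术与工程研究所 Method and device for corneal Langerhans cell segmentation based on confocal microscopy images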

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
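CN114693935A (zh) * 2022-04-15 2022-07-01 湖南大学 A medical image segmentation method based on automatic data augmentation
CN116416492B (zh) * 2023-03-20 2023-12-01 湖南大学 An automatic data augmentation method based on feature adaptation
CN117132978B (zh) * 2023-10-27 2024-02-20 深圳市敏视睿行智能科技有限公司 A microbial image recognition system and method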

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
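CN111882492A (zh) * 2020-06-18 2020-11-03 天津中科智能识别产业技术研究院有限公司 A method for automatic augmentation of image data
KR20210033235A (ko) * 2019-09-18 2021-03-26 주식회사카카오브레인 Data augmentation method and apparatus, and computer program
CN112651892A (zh) * 2020-12-22 2021-04-13 中国科学技术大学 An automatic data augmentation strategy selection method based on image samples
CN112686282A (zh) * 2020-12-11 2021-04-20 天津中科智能识别产业技术研究院有限公司 A target detection method based on self-learning data
CN114693935A (zh) * 2022-04-15 2022-07-01 湖南大学 A medical image segmentation method based on automatic data augmentation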

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210012012A1 (en) * 2019-07-12 2021-01-14 Palo Alto Research Center Incorporated System and method for constructing a graph-based model for optimizing the security posture of a composed internet of things system
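CN111275129A (zh) * 2020-02-17 2020-06-12 平安科技(深圳)有限公司 A method and system for selecting an augmentation strategy for image data
CN111523494A (zh) * 2020-04-27 2020-08-11 天津中科智能识别产业技术研究院有限公司 A human body image detection method
CN113569726B (zh) * 2021-07-27 2023-04-14 湖南大学 A pedestrian detection method combining automatic data augmentation and loss function search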

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
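KR20210033235A (ko) * 2019-09-18 2021-03-26 주식회사카카오브레인 Data augmentation method and apparatus, and computer program
CN111882492A (zh) * 2020-06-18 2020-11-03 天津中科智能识别产业技术研究院有限公司 A method for automatic augmentation of image data
CN112686282A (zh) * 2020-12-11 2021-04-20 天津中科智能识别产业技术研究院有限公司 A target detection method based on self-learning data
CN112651892A (zh) * 2020-12-22 2021-04-13 中国科学技术大学 An automatic data augmentation strategy selection method based on image samples
CN114693935A (zh) * 2022-04-15 2022-07-01 湖南大学 A medical image segmentation method based on automatic data augmentation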

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QUANMING YAO; JU XU; WEI-WEI TU; ZHANXING ZHU: "Efficient Neural Architecture Search via Proximal Iterations", ARXIV.ORG, 30 May 2019 (2019-05-30), XP081536513 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
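CN117763356A (zh) * 2023-12-26 2024-03-26 中国地质科学院地质力学研究所 A rapid seismic facies identification method based on the LightGBM algorithm
CN117765532A (zh) * 2024-02-22 2024-03-26 中国科学院宁波材料技术与工程研究所 Method and device for corneal Langerhans cell segmentation based on confocal microscopy images
CN117765532B (zh) * 2024-02-22 2024-05-31 中国科学院宁波材料技术与工程研究所 Method and device for corneal Langerhans cell segmentation based on confocal microscopy images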

Also Published As

Publication number Publication date
CN114693935A (zh) 2022-07-01

Similar Documents

Publication Publication Date Title
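WO2023197612A1 (zh) A medical image segmentation method based on automatic data augmentation
Wang et al. Blind2unblind: Self-supervised image denoising with visible blind spots
CN109191476B (zh) Novel method for automatic biomedical image segmentation based on the U-net network structure
CN108876769B (zh) A left atrial appendage CT image segmentation method
WO2021136368A1 (zh) Method and device for automatic detection of the pectoralis major region in mammography images
CN113112534B (zh) A three-dimensional biomedical image registration method based on iterative self-supervision
CN105976364B (zh) Statistical average model construction method based on a simplified weighted undirected graph
CN108053398A (zh) An automatic melanoma detection method using semi-supervised feature learning
CN106157249A (zh) Single-image super-resolution reconstruction algorithm based on the optical flow method and sparse neighborhood embedding
CN111402278B (zh) Segmentation model training method, image annotation method and related devices
CN112419344B (zh) An unsupervised image segmentation method based on the Chan-Vese model
He et al. Differentiable automatic data augmentation by proximal update for medical image segmentation
CN116051494A (zh) Few-shot medical image segmentation method based on Fourier meta-learning domain generalization
Mei et al. Dense contour-imbalance aware framework for colon gland instance segmentation
CN109272539A (zh) Method for decomposing image texture and structure based on a guided-image total variation model
CN113781461A (zh) An intelligent patient monitoring and ranking method
CN116503668A (zh) A medical image classification method based on few-shot meta-learning
CN111784713A (zh) A U-shaped heart segmentation method incorporating an attention mechanism
CN108305268A (zh) An image segmentation method and device
CN113450363B (zh) A meta-learning cell nucleus segmentation system and method based on label correction
Wang et al. SURVS: A Swin-Unet and game theory-based unsupervised segmentation method for retinal vessel
CN108765431B (zh) An image segmentation method and its application in the medical field
CN113744209A (zh) Heart segmentation method based on a multi-scale residual U-net network
CN116524292A (zh) A federated learning method for multi-source heterogeneous medical images
CN112785559B (zh) Bone age prediction method combining multiple heterogeneous deep learning models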

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22937244

Country of ref document: EP

Kind code of ref document: A1