CN116824146A - Small sample CT image segmentation method, system, terminal and storage medium - Google Patents


Info

Publication number
CN116824146A
CN116824146A (application CN202310831051.8A)
Authority
CN
China
Prior art keywords
training
data
model
image segmentation
preprocessing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310831051.8A
Other languages
Chinese (zh)
Inventor
黄炳顶
黄永志
朱金鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Technology University
Original Assignee
Shenzhen Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Technology University filed Critical Shenzhen Technology University
Priority to CN202310831051.8A priority Critical patent/CN116824146A/en
Publication of CN116824146A publication Critical patent/CN116824146A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/09 - Supervised learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/30 - Noise filtering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/765 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 - Road transport of goods or passengers
    • Y02T 10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a small sample CT image segmentation method, system, terminal and storage medium, wherein the method is based on a noise reduction probability diffusion model and uses unlabeled data to realize a generative self-supervised learning task. In addition, in order to transfer the pre-trained noise reduction probability diffusion model effectively to the image segmentation task, the method fine-tunes part of the parameters during the training of the downstream task, so that the pre-trained model is adapted to the image segmentation task and the accuracy of the segmentation result is improved. The method effectively addresses the problem of the prior art that supervised learning methods have poor robustness and insufficient data volume, making deep learning methods difficult to apply in practice.

Description

Small sample CT image segmentation method, system, terminal and storage medium
Technical Field
The invention relates to the field of image processing, in particular to a small sample CT image segmentation method, a system, a terminal and a storage medium.
Background
Because CT (computed tomography) images have low contrast, manually segmenting organ regions in them directly is prone to misjudgment and false recognition. At present, deep learning methods can alleviate these problems to a certain extent. In a data-driven manner, such methods automatically learn from labeled data how to complete the segmentation of organ images in CT images, and can locate the target organ image region within minutes.
At present, although organ image segmentation methods or systems based on supervised learning can approach or even exceed manual performance in certain practical scenarios, they require a large amount of labeled data during training. System development usually incurs a huge cost in data acquisition and maintenance, which generally involves collecting large-scale medical image data, cleaning and de-identifying the data, and employing experts to annotate the dataset. In addition, medical images generally involve private information and are not readily available, and their annotation requires expert experience, which further limits the construction of large datasets. When the data volume is insufficient, the generalization ability and robustness of a supervised learning model degrade greatly, and even some low-level errors may occur, making deep learning methods difficult to deploy in practical applications.
Accordingly, there is a need for improvement and development in the art.
Disclosure of Invention
The invention aims to solve the above technical problems of the prior art by providing a small sample CT image segmentation method, system, terminal and storage medium, so as to address the poor robustness of supervised learning methods and the difficulty of applying deep learning methods in practice due to insufficient data volume.
The technical scheme adopted by the invention for solving the problems is as follows:
in a first aspect, an embodiment of the present invention provides a small sample CT image segmentation method, where the method includes:
acquiring a target data set, and determining a training set and a test set from the target data set, wherein the target data set comprises unlabeled data and labeled data, and the unlabeled data outnumber the labeled data;
processing the voxel spacing and voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set;
inputting the preprocessing training set into a noise reduction probability diffusion model for training, and determining a pre-training model;
modifying the step length and the output layer network structure of the pre-training model, and inputting the pre-processing training set into the modified pre-training model for training to obtain an image segmentation model;
and inputting the preprocessing test set into the image segmentation model to obtain an image segmentation result.
In one implementation, the processing of the voxel spacing of the image data in the training set and the test set includes:
resampling the training set and the test set to a target voxel spacing using trilinear interpolation and nearest-neighbor interpolation;
and normalizing the voxel spacings of the image data in the training set and the test set based on the target voxel spacing.
In one implementation, the processing of voxel intensities of image data in the training set and the test set includes:
calculating a first quantile and a second quantile of voxel intensities of the unlabeled data in the training set and the test set;
clipping the voxel intensities of the data in the training set and the test set to lie between the first quantile and the second quantile, and normalizing the clipped voxel intensities.
In one implementation method, the image data in the preprocessing training set and the preprocessing test set are three-dimensional image data, and before the preprocessing training set is input to the noise reduction probability diffusion model for training, the method further comprises:
splitting the three-dimensional image data in the preprocessing training set and the preprocessing testing set along a transverse plane, and converting the three-dimensional image data into two-dimensional image data;
adjusting the resolution of the labeled data in the two-dimensional image data to a preset resolution using bilinear interpolation and nearest-neighbor interpolation;
and carrying out data flipping (horizontal flipping) on the unlabeled data in the two-dimensional image data.
In one implementation, the noise reduction probability diffusion model includes:
in the forward process of the noise reduction probability diffusion model, defining a diffusion process as a Markov chain;
in the backward process of the noise reduction probability diffusion model, the U-Net network is adopted to learn the semantic features of the image data in the preprocessing training set or the preprocessing testing set.
In one implementation, the modifying the step size of the pre-trained model includes:
and modifying the step length of the pre-training model to be 0.
In one implementation method, the inputting the pre-processing training set into the modified pre-training model to train to obtain an image segmentation model further includes:
dividing the parameters of the U-Net in the modified pre-training model into coding-layer parameters, decoding-layer parameters, bottleneck-layer parameters and classification-head parameters according to the function of the network layer in which each parameter is located;
and, based on the weights corresponding to the coding-layer parameters and the bottleneck-layer parameters, adjusting the weights of the decoding-layer parameters and the classification-head parameters in the modified pre-training model, and training the modified pre-training model with the weight-adjusted decoding-layer parameters and classification-head parameters to obtain the image segmentation model.
In a second aspect, an embodiment of the present invention further provides a small sample CT image segmentation system, where the small sample CT image segmentation system includes:
the data acquisition module is used for acquiring a target data set and determining a training set and a test set from the target data set, wherein the target data set comprises unlabeled data and labeled data, and the unlabeled data outnumber the labeled data;
the preprocessing module is used for processing the voxel spacing and voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set;
the pre-training module is used for inputting the pre-processing training set into the noise reduction probability diffusion model for training and determining a pre-training model;
the training module is used for modifying the step length and the output layer network structure of the pre-training model, and inputting the pre-processing training set into the modified pre-training model for training to obtain an image segmentation model;
and the image segmentation module is used for inputting the preprocessing test set into an image segmentation model to obtain an image segmentation result.
In a third aspect, an embodiment of the present invention further provides a terminal, where the terminal includes a memory and one or more processors; the memory stores one or more programs; the programs comprise instructions for performing any of the small sample CT image segmentation methods described above; and the processor is configured to execute the programs.
In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a plurality of instructions, wherein the instructions are adapted to be loaded and executed by a processor to implement any of the small sample CT image segmentation methods described above.
The invention has the beneficial effects that: the method realizes a generative self-supervised learning task using unlabeled data based on the noise reduction probability diffusion model, and by pre-training the noise reduction probability diffusion model, a pre-trained model carrying the semantic features of CT images can be obtained from unlabeled data, which to a certain extent removes the traditional supervised learning requirement for large-scale labeled data during training. In addition, in order to transfer the pre-trained noise reduction probability diffusion model effectively to the image segmentation task, the method fine-tunes part of the parameters during the training of the downstream task, so that the pre-trained model is adapted to the image segmentation task and the accuracy of the segmentation result is improved. The method effectively addresses the problem of the prior art that supervised learning methods have poor robustness and insufficient data volume, making deep learning methods difficult to apply in practice.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings required by the embodiments or by the description of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a small sample CT image segmentation method according to an embodiment of the present invention.
Fig. 2 is a diagram of a forward process and a backward process of a noise reduction probability diffusion model of a small sample CT image segmentation method according to an embodiment of the present invention.
Fig. 3 is a network structure diagram of a small sample CT image segmentation method according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of an internal module of a small sample CT image segmentation system according to an embodiment of the present invention.
Fig. 5 is a schematic block diagram of a terminal according to an embodiment of the present invention.
Detailed Description
The invention discloses a small sample CT image segmentation method, a system, a terminal and a storage medium, and in order to make the purposes, technical schemes and effects of the invention clearer and more definite, the invention is further described in detail below by referring to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Because CT (computed tomography) images have low contrast, manually segmenting organ regions in them directly is prone to misjudgment and false recognition. At present, deep learning methods can alleviate these problems to a certain extent. In a data-driven manner, such methods automatically learn from labeled data how to complete the segmentation of organ images in CT images, and can locate the target organ image region within minutes.
At present, although organ image segmentation methods or systems based on supervised learning can approach or even exceed manual performance in certain practical scenarios, they require a large amount of labeled data during training. System development usually incurs a huge cost in data acquisition and maintenance, which generally involves collecting large-scale medical image data, cleaning and de-identifying the data, and employing experts to annotate the dataset. In addition, medical images generally involve private information and are not readily available, and their annotation requires expert experience, which further limits the construction of large datasets. When the data volume is insufficient, the generalization ability and robustness of a supervised learning model degrade greatly, and even some low-level errors may occur, making deep learning methods difficult to deploy in practical applications.
Aiming at the defects of the prior art, the invention provides a small sample CT image segmentation method that realizes a generative self-supervised learning task using unlabeled data based on a noise reduction probability diffusion model; by pre-training the noise reduction probability diffusion model, a pre-trained model carrying the semantic features of CT images can be obtained from unlabeled data, which to a certain extent removes the traditional supervised learning requirement for large-scale labeled data during training. In addition, in order to transfer the pre-trained noise reduction probability diffusion model effectively to the image segmentation task, the method fine-tunes part of the parameters during the training of the downstream task, so that the pre-trained model is adapted to the image segmentation task and the accuracy of the segmentation result is improved. The method and the device effectively address the problem that deep learning methods are difficult to deploy in practical applications because of the insufficient data volume and poor robustness of supervised learning methods in the prior art.
Exemplary method
As shown in fig. 1, the method includes:
step S100, acquiring a target data set, and determining a training set and a testing set according to the target data set, wherein the target data set comprises label-free data and label data, and the label-free data is more than the label data.
Specifically, the target data set is a CT image data set; a suitable data set can be obtained according to requirements and used as the target data set, from which data are selected for the training set and the test set, with the unlabeled data outnumbering the labeled data. In this embodiment, the MICCAI FLARE 2022 dataset is used for training and testing. The dataset covers 13 abdominal organs and contains 2000 unlabeled samples and 50 labeled samples. The first 1000 unlabeled samples and the first 40 labeled samples of the FLARE22 dataset are used as the training set, and the remaining 10 labeled samples are used as the test set. The embodiment adopts more unlabeled data for training and fewer labeled data for testing, keeps the total amount of data in the training and test sets small, and reduces both the requirement for labeled data and the difficulty of acquiring the dataset, which facilitates the adoption of deep learning models in practical applications.
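The split described above can be sketched in a few lines (the sample identifiers below are hypothetical placeholders, not actual FLARE22 file names):

```python
# Sketch of the FLARE22-style split described above: 2000 unlabeled and
# 50 labeled samples; the first 1000 unlabeled and first 40 labeled
# samples form the training set, the remaining 10 labeled samples the
# test set. The IDs are hypothetical placeholders.
unlabeled_ids = [f"unlabeled_{i:04d}" for i in range(2000)]
labeled_ids = [f"labeled_{i:02d}" for i in range(50)]

train_unlabeled = unlabeled_ids[:1000]  # DDPM pre-training data
train_labeled = labeled_ids[:40]        # segmentation fine-tuning data
test_labeled = labeled_ids[40:]         # held-out evaluation data
```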
And step S200, processing the voxel spacing and the voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set.
Specifically, the voxel spacing of the image data in the training set and the test set is processed as follows: the training set and the test set are resampled to a target voxel spacing using trilinear interpolation and nearest-neighbor interpolation, and the voxel spacings of the image data in the training set and the test set are normalized based on the target voxel spacing. The target voxel spacing varies with the data set used; in this embodiment it is set to the median voxel spacing of all samples in the dataset. Because a deep neural network sees only the voxel grid during training and is unaware of the physical voxel spacing, normalizing the voxel spacing of the image data in the training set and the test set to the same value ensures that the images input to the noise reduction probability diffusion model reflect the same physical imaging scale.
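As a minimal sketch of this resampling step (assuming `scipy` is available; the description does not name a particular library), trilinear interpolation can be realized with `scipy.ndimage.zoom` at order 1 for images and nearest-neighbor at order 0 for label maps:

```python
import numpy as np
from scipy.ndimage import zoom

def resample_to_spacing(volume, spacing, target_spacing, is_label=False):
    """Resample a 3D volume from `spacing` to `target_spacing` (mm/voxel).

    Trilinear interpolation (order=1) for images, nearest-neighbor
    (order=0) for label maps, matching the preprocessing described above.
    """
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, factors, order=0 if is_label else 1)

# Toy volume: 8x8x8 at 2 mm spacing resampled to 1 mm -> 16x16x16 voxels.
vol = np.random.rand(8, 8, 8)
resampled = resample_to_spacing(vol, (2.0, 2.0, 2.0), (1.0, 1.0, 1.0))

# Binary label map resampled with nearest-neighbor, preserving label values.
seg = (np.random.rand(8, 8, 8) > 0.5).astype(np.int32)
seg_resampled = resample_to_spacing(seg, (2.0, 2.0, 2.0), (1.0, 1.0, 1.0),
                                    is_label=True)
```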
The voxel intensities of the image data in the training set and the test set are processed as follows: the first and second quantiles of the voxel intensities of the unlabeled data in the training set and the test set are calculated; the voxel intensities of the data in the training set and the test set are then clipped to lie between the first and second quantiles, and the clipped voxel intensities are normalized. In this embodiment, the 0.5% and 99.5% quantiles of the voxel intensities of the unlabeled training data are first calculated. The voxel intensities of the unlabeled and labeled data are then clipped to the range between the 0.5% and 99.5% quantiles, normalized to [0, 1] using min-max normalization, and further adjusted to [-1, 1] by a linear transformation.
For the processing of image voxel intensities, most existing methods adopt mean normalization, that is, the mean μ and standard deviation σ of the voxel intensities of each sample in the data set are calculated and normalization is performed as (x − μ)/σ. This makes the voxel intensities in each sample follow a normal distribution, but the voxel maxima and minima of each sample are not fixed. As a result, when the values are then linearly transformed from their current interval to the [-1, 1] interval, the upper and lower bounds of the voxel intensities are not unified across samples (i.e., the voxel intensity scales of the samples differ: a voxel intensity of -1 in two different samples may correspond to -1200 in the original CT image for one sample and -1500 for the other), which degrades the final pre-trained generative model. The voxel intensity processing method adopted here effectively avoids the differences in the voxel intensities of the input image data caused by different acquisition devices during training, and enhances the pre-training effect of the generative model.
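A minimal sketch of the clipping-and-normalization scheme above (for brevity the quantiles here are computed on the array passed in; per the description they would be computed once over the unlabeled training data and reused for all samples):

```python
import numpy as np

def normalize_intensity(volume, lo_pct=0.5, hi_pct=99.5):
    """Clip voxel intensities to the [0.5%, 99.5%] quantile range,
    min-max normalize to [0, 1], then linearly rescale to [-1, 1],
    as described above."""
    lo, hi = np.percentile(volume, [lo_pct, hi_pct])
    clipped = np.clip(volume, lo, hi)
    unit = (clipped - lo) / (hi - lo)  # min-max normalization to [0, 1]
    return unit * 2.0 - 1.0            # linear transformation to [-1, 1]

ct = np.random.uniform(-1200, 600, size=(4, 64, 64))  # synthetic HU values
norm = normalize_intensity(ct)
```

Because the clipping pins every sample's minimum and maximum to the same quantile bounds, the [-1, 1] endpoints mean the same thing across samples, which is the point of preferring this scheme over per-sample mean normalization.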
In one implementation, the image data in the preprocessing training set and the preprocessing test set are three-dimensional image data, and before the preprocessing training set is input to the noise reduction probability diffusion model for training, the method further includes:
splitting the three-dimensional image data in the preprocessing training set and the preprocessing testing set along a transverse plane, and converting the three-dimensional image data into two-dimensional image data;
adjusting the resolution of the labeled data in the two-dimensional image data to a preset resolution using bilinear interpolation and nearest-neighbor interpolation;
and carrying out data flipping (horizontal flipping) on the unlabeled data in the two-dimensional image data.
Specifically, so that the image data in the preprocessing training set and the preprocessing test set can be input into the subsequent model for learning, they require further processing. Because the image data in the preprocessing training set and the preprocessing test set are three-dimensional, each three-dimensional volume is first split along the transverse plane into a number of two-dimensional slices. In this embodiment, after splitting, the preprocessing training set and preprocessing test set comprise 207,029 two-dimensional slices for the pre-training task, and 3,879 and 915 slices for the subsequent organ segmentation training and testing, respectively. After converting the image data from three dimensions to two, data augmentation is performed on the two-dimensional image data. In this embodiment, the resolutions of the labeled images and their labels are adjusted to 256×256 using bilinear interpolation and nearest-neighbor interpolation, respectively, and horizontal flipping is adopted as the augmentation method for the unlabeled data. This embodiment converts the dimensionality of the image data and applies image augmentation to the converted two-dimensional data, so that the image data can be input into the model for training and testing, improving the model's semantic recognition accuracy on the image data.
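The slicing, resizing and flipping steps above can be sketched as follows (again using `scipy.ndimage.zoom` for the interpolation; the transverse plane is assumed here to be the first array axis):

```python
import numpy as np
from scipy.ndimage import zoom

def volume_to_slices(volume, axis=0):
    """Split a 3D volume into 2D slices along the transverse plane
    (assumed to be the first axis)."""
    return [np.take(volume, i, axis=axis) for i in range(volume.shape[axis])]

def resize_slice(slice_2d, out_hw=(256, 256), is_label=False):
    """Resize to the preset 256x256 resolution: bilinear (order=1) for
    images, nearest-neighbor (order=0) for label maps."""
    fy = out_hw[0] / slice_2d.shape[0]
    fx = out_hw[1] / slice_2d.shape[1]
    return zoom(slice_2d, (fy, fx), order=0 if is_label else 1)

def augment_unlabeled(slice_2d):
    """Horizontal flip, the augmentation used for unlabeled slices."""
    return slice_2d[:, ::-1]

vol = np.random.rand(5, 128, 128)          # toy 3D volume: 5 slices
slices = volume_to_slices(vol)
resized = resize_slice(slices[0])          # 128x128 -> 256x256
flipped = augment_unlabeled(resized)
```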
And step S300, inputting the preprocessing training set into a noise reduction probability diffusion model for training, and determining a pre-training model.
Specifically, the preprocessing training set is first input to a noise reduction probability diffusion model (Denoising Diffusion Probabilistic Model, DDPM) for training, thereby obtaining a pre-training model. The noise reduction probability diffusion model is a generative model, and in this embodiment it is pre-trained with the unlabeled data. The noise reduction probability diffusion model comprises the following: in the forward process of the noise reduction probability diffusion model, the diffusion process is defined as a Markov chain; in the backward process, a U-Net network is adopted to learn the semantic features of the image data in the preprocessing training set or the preprocessing test set. In this embodiment, in the forward process, Gaussian noise is continuously added to the original image data over a sequence of time steps to obtain a set of noisy data. Let $q(x_0)$ be the initial data distribution. Given a data sample $x_0 \sim q(x_0)$, Gaussian noise is added to the data at each time step $t$, and the noisy data are denoted $x_1, x_2, \ldots, x_T$. The forward diffusion process $q$ can be defined as follows:

$$q(x_1, \ldots, x_T \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}), \qquad q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\right)$$
wherein $T$ and $\beta_1, \ldots, \beta_T \in [0,1]$ represent the number of diffusion steps and the diffusion variance at the corresponding step, respectively, $\mathbf{I}$ is the identity matrix, and $\mathcal{N}(\mu, \sigma)$ denotes a normal distribution with mean $\mu$ and covariance $\sigma$. Through the reparameterization technique, $x_t$ can be sampled directly:

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1-\bar{\alpha}_t) \mathbf{I}\right)$$

wherein $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
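The reparameterized forward sampling above can be illustrated numerically; the linear variance schedule below is an assumption for illustration, since the description does not specify the values of β_1, …, β_T:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)  # assumed linear variance schedule
alphas = 1.0 - betas                # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)     # alpha_bar_t = prod_{s<=t} alpha_s

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in a single step via the
    reparameterization x_t = sqrt(ab_t)*x0 + sqrt(1 - ab_t)*noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((256, 256))  # stand-in for a preprocessed slice
x_mid = q_sample(x0, t=500, rng=rng)  # partially noised sample
```

As `t` grows, `alpha_bars[t]` decays toward zero, so `x_t` approaches pure Gaussian noise, which is what makes the reverse process a generative sampler.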
In the reverse process, if the distribution is from normal, the above properties are utilizedSampling a gaussian noise and following the inverse calculation method, one can then calculate from p (x 0 ) Generating a new sample:
wherein μ_θ(x_t, t) and Σ_θ(x_t, t) are the mean and covariance of the normal distribution to be estimated; both parameters require training a U-Net network to achieve the estimation of the noise distribution. The loss is calculated as follows:

L = E_{t, x_0, z} [ ‖z − z_θ(x_t, t)‖² ]
wherein E denotes the expectation, z_θ(x_t, t) is the network's prediction of the noise in x_t, t is sampled from the uniform distribution over {1, ..., T}, x_0 is the original input, and x_t is the sampled noisy input.
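One training step of the loss above can be sketched as follows. The zero-valued predictor is a placeholder for the U-Net's noise estimate z_θ, used here only so the sketch runs standalone:

```python
import numpy as np

def ddpm_loss_step(x0, predictor, T, alpha_bar, rng):
    """One step of the simplified DDPM objective: L = E ||z - z_theta(x_t, t)||^2."""
    t = rng.integers(1, T)                        # t ~ Uniform over {1, ..., T-1}
    z = rng.standard_normal(x0.shape)             # target noise
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * z
    z_pred = predictor(x_t, t)                    # network estimate z_theta(x_t, t)
    return np.mean((z - z_pred) ** 2)

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 64))               # stand-in for a CT slice

# A zero predictor scores roughly the variance of unit Gaussian noise (~1.0).
loss = ddpm_loss_step(x0, lambda x_t, t: np.zeros_like(x_t), T, alpha_bar, rng)
print(float(loss))
```

A trained U-Net would drive this quantity well below the zero-predictor baseline.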
In order to adapt the U-Net network to the generation task for the image data in the preprocessing training set, the input and output layers of the U-Net network are adjusted so that the network can accept two inputs, namely the noisy image x_t and an embedding vector of the step t, and can thereby be used to predict the noise of a grayscale single-channel image. The noise reduction probability diffusion model in the pre-training stage serves as a generative model, and the training data used in pre-training is label-free data.
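The embodiment does not specify how the step t is turned into an embedding vector; one common choice, shown here purely as an assumption, is a sinusoidal embedding in the style of Transformer positional encodings:

```python
import numpy as np

def timestep_embedding(t, dim):
    """Map a diffusion step t to a `dim`-dimensional sinusoidal vector
    that the U-Net can consume as its second input."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

emb0 = timestep_embedding(0, 128)
emb500 = timestep_embedding(500, 128)
# Distinct steps map to distinct, fixed vectors; no parameters are learned here.
print(emb0.shape, np.allclose(emb0, emb500))
```

Any injective embedding of t would serve the same purpose; the sinusoidal form is merely a widely used default.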
And step S400, modifying the step length and the output layer network structure of the pre-training model, and inputting the preprocessing training set into the modified pre-training model for training to obtain an image segmentation model.
Specifically, the network structure of the image segmentation model is the same as that of the pre-training model, i.e., a noise reduction probability diffusion model is adopted. The present embodiment proposes the following assumption: the model that performs best on the generation task also has the best image feature characterization capability, and its pre-training weights are the most beneficial to the downstream segmentation task. In this embodiment, the weights of the generation model that is optimal on each evaluation index (i.e., the pre-training model) are selected as the initial weights of the image segmentation model, and the step length and the output layer network structure of the pre-training model are then modified on this basis.
In this embodiment, the step length is manually selected and fixed: when the U-Net network is used as the generation model of the noise reduction probability diffusion model, the network needs to accept two inputs, namely the noisy image data and the step length; as a segmentation model, however, the network only needs to accept a single input, namely the image to be segmented. In order to reconcile the different inputs of the U-Net network in these two tasks, the following strategy is adopted: the value of the diffusion step is manually selected, the network is adjusted into a single-input segmentation model, and this parameter is fixed in the subsequent segmentation fine-tuning process. Experimental results show that the image segmentation model generally achieves optimal performance when the diffusion step length is set to 0. Therefore, in this embodiment, the step length of the pre-training model is modified to 0, so that the image segmentation model achieves a better segmentation effect.
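Fixing the diffusion step can be sketched as wrapping the two-input network into a single-input one. Here `dummy_unet` is a trivial stand-in, not the embodiment's pretrained U-Net:

```python
def make_segmentation_model(unet, fixed_t=0):
    """Wrap a two-input diffusion U-Net (image, step) as a single-input
    segmentation model by freezing the diffusion step at a chosen value."""
    def segment(image):
        return unet(image, fixed_t)
    return segment

# Placeholder network that just echoes its inputs, so the wrapper is testable.
def dummy_unet(image, t):
    return {"input": image, "t": t}

seg_model = make_segmentation_model(dummy_unet, fixed_t=0)
out = seg_model("ct_slice")
print(out["t"])  # the step input is always 0 during segmentation fine-tuning
```

The closure fixes t once; downstream training code then never needs to know the network was ever conditioned on a diffusion step.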
In addition, the output layer network structure is modified: generally, for a grayscale image generation task, the number of channels of the output layer of the U-Net network should be 1 (or 2 if the variance of the predicted noise is also required), while in a segmentation task, the number of output channels should be determined by the number of classes to be segmented in the data set used. Taking the FLARE 2022 dataset as an example, the number of channels of the output layer should be 14. In this embodiment, the output layer of the U-Net network in the noise reduction probability diffusion model is modified to accommodate the downstream organ segmentation task. The output layer consists of two consecutive convolution layers, a BN layer, a ReLU activation function layer and a single convolution layer, wherein the number of output channels of the last convolution layer corresponds to the number of classes to be segmented in the target data set.
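The channel arithmetic of the final convolution can be illustrated with a bare 1×1 convolution. The random weights, feature-map size and channel counts below are placeholders, not values from the embodiment:

```python
import numpy as np

def conv1x1(features, weight):
    """A 1x1 convolution is a per-pixel linear map: (C_in,H,W) -> (C_out,H,W)."""
    return np.einsum('chw,oc->ohw', features, weight)

num_classes = 14                  # e.g. FLARE 2022: organs plus background
c_in, H, W = 64, 256, 256
rng = np.random.default_rng(0)
features = rng.standard_normal((c_in, H, W))     # decoder output (stand-in)
weight = rng.standard_normal((num_classes, c_in)) * 0.01

logits = conv1x1(features, weight)   # one output channel per class
label_map = logits.argmax(axis=0)    # per-pixel class index for the final mask
print(logits.shape, label_map.shape)
```

This is why the last layer's channel count must equal the class count: the per-pixel argmax over channels directly yields the segmentation mask.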
In one implementation, the inputting the pre-processing training set into the modified pre-training model to train to obtain an image segmentation model further includes:
dividing parameters into coding layer parameters, decoding layer parameters, bottleneck layer parameters and classification header parameters according to the function of a network layer where the parameters of the U-Net in the modified pre-training model are located;
and adjusting weights of the decoding layer parameters and the classification head parameters in the modified pre-training model based on the weights corresponding to the coding layer parameters and the bottleneck layer parameters, and training the modified pre-training model based on the decoding layer parameters and the classification head parameters after weight adjustment to obtain the image segmentation model.
In particular, the image segmentation model can already be used for semantic segmentation tasks even without a fine-tuning step. To improve the transferability of the network and the accuracy of the segmentation, the present embodiment also trains an output classification head from scratch for the U-Net network and fine-tunes the rest of the network. In order to complete training and verification of the organ segmentation task in a supervised manner using the labeled dataset, the present embodiment proposes a fine-tuning strategy: the learnable parameters of the whole U-Net network are divided into Encoder, Decoder, Bottleneck and Classification Head (corresponding to the coding layer parameters, decoding layer parameters, bottleneck layer parameters and classification head parameters, respectively); the weights of the Encoder and Bottleneck parts are frozen, and only the Decoder and Classification Head undergo fine-tuning training. Through this fine-tuning strategy, the image semantic features in the pre-training model are retained, and the model has strong transfer capability.
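The freeze-and-fine-tune strategy can be sketched with a toy parameter store. The four groups mirror Encoder, Bottleneck, Decoder and Classification Head; the parameter shapes and gradient values are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
params = {g: rng.standard_normal(4) for g in
          ("encoder", "bottleneck", "decoder", "classification_head")}
frozen = {"encoder", "bottleneck"}   # keep the pretrained semantic features

def sgd_step(params, grads, lr=0.1):
    """Apply a gradient step only to the trainable (non-frozen) groups."""
    return {g: (w if g in frozen else w - lr * grads[g])
            for g, w in params.items()}

grads = {g: np.ones(4) for g in params}
before = {g: w.copy() for g, w in params.items()}
after = sgd_step(params, grads)

changed = sorted(g for g in params if not np.allclose(before[g], after[g]))
print(changed)  # only the decoder and classification head are updated
```

In a real framework the same effect is typically obtained by disabling gradients for the frozen groups rather than masking the update, but the grouping logic is the same.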
And S500, inputting the preprocessing test set into the image segmentation model to obtain an image segmentation result.
In short, after the image segmentation model is trained, the preprocessing test set can be input into the image segmentation model to obtain an image segmentation result. In the CT image generation task at a size of 256×256, this embodiment achieves 11.32, 46.93 and 73.1% on the three indexes FID, sFID and F1-score, respectively, showing that the noise reduction probability diffusion model can generate diverse and realistic two-dimensional CT images. In the downstream organ segmentation task, the DSC and Jaccard Index scores of this embodiment reach 86.91% and 80.38% when the full training set is used, which is comparable to state-of-the-art methods. The advantages of this embodiment are more evident when only a small amount of labeled data is available: with only 1% and 10% of the labeled data set, the DSC scores still reach 71.56% and 78.51%, and the Jaccard Index scores reach 64.21% and 72.43%. Even with only 4 cases of labeled data (about 0.1% of the total training set), the DSC and Jaccard Index scores of the segmentation model are maintained at 51.81% and 44.79%. These results demonstrate that the technical solution of this embodiment overcomes, to a certain extent, the limitation of supervised learning being severely dependent on large-scale labeled data: the method can acquire semantic information of images from a large amount of unlabeled data and can train a high-quality organ segmentation model using only a small amount of labeled data.
Because the imaging mechanism of a CT image differs from that of a common RGB image, the noise reduction probability diffusion model cannot be directly used to process CT images. Therefore, by performing a series of processing steps on the training set and the test set before they are input into the noise reduction probability diffusion model, and by modifying the input and output layer structure of the U-Net network used for noise prediction, the noise reduction probability diffusion model can be successfully applied to the CT image generation task. This embodiment further provides a migration method and a fine-tuning strategy, so that training based on the noise reduction probability diffusion model can be completed at any data scale, good segmentation performance can be maintained even at an extremely small data scale, training and deployment are simple, and manual feature extraction for a specific data set is not needed.
It should be noted that the generative pre-training, migration and fine-tuning strategies are all independent of the network model, so any network used for semantic segmentation tasks can serve as the network model in the present solution. For example, the network model may be replaced with CNN-based models such as ResU-Net, the DeepLab series, V-Net and U-Net++, or with Transformer-based models such as TransUNet, UNETR and Swin UNETR. Besides CT images, the method can also be applied to RGB images in the recognition field, to MRI and X-ray images in the medical field, and even to other specific fields such as remote sensing images.
Based on the above embodiment, the present invention further provides a small sample CT image segmentation system, as shown in fig. 4, including:
the data acquisition module 01 is used for acquiring a target data set, and determining a training set and a testing set according to the target data set, wherein the target data set comprises label-free data and label data, and the label-free data is more than the label data;
the preprocessing module 02 is used for processing the voxel distance and the voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set;
the pre-training module 03 is used for inputting the pre-processing training set into a noise reduction probability diffusion model for training, and determining a pre-training model;
the training module 04 is used for modifying the step length and the output layer network structure of the pre-training model, inputting the pre-processing training set into the modified pre-training model for training, and obtaining an image segmentation model;
the image segmentation module 05 is used for inputting the preprocessing test set into an image segmentation model to obtain an image segmentation result.
Based on the above embodiment, the present invention also provides a terminal, and a functional block diagram thereof may be shown in fig. 5. The terminal comprises a processor, a memory, a network interface and a display screen which are connected through a system bus. Wherein the processor of the terminal is adapted to provide computing and control capabilities. The memory of the terminal includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the terminal is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a small sample CT image segmentation method. The display screen of the terminal may be a liquid crystal display screen or an electronic ink display screen.
It will be appreciated by those skilled in the art that the functional block diagram shown in fig. 5 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the terminal to which the present inventive arrangements may be applied, and that a particular terminal may include more or less components than those shown, or may combine some of the components, or have a different arrangement of components.
In one implementation, the memory of the terminal has stored therein one or more programs, and the one or more programs configured to be executed by one or more processors include instructions for performing a small sample CT image segmentation method.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. The volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
In summary, the invention discloses a small sample CT image segmentation method, system, terminal and storage medium. Based on a noise reduction probability diffusion model, the method uses unlabeled data to realize a generative self-supervised learning task; by pre-training the noise reduction probability diffusion model, a pre-training model carrying CT image semantic features can be obtained from unlabeled data, thereby mitigating, to a certain extent, the requirement of traditional supervised learning methods for large-scale labeled data during training. In addition, in order to effectively transfer the pre-trained noise reduction probability diffusion model to the image segmentation task, the method fine-tunes part of the parameters during training of the downstream task, so that the pre-training model can be adapted to the image segmentation task and the accuracy of the image segmentation result is improved. The invention thus effectively addresses the difficulty, in the prior art, of deploying supervised deep learning methods in practical applications due to insufficient data and poor robustness.
It is to be understood that the invention is not limited in its application to the examples described above, but is capable of modification and variation in light of the above teachings by those skilled in the art, and that all such modifications and variations are intended to be included within the scope of the appended claims.

Claims (10)

1. A method for small sample CT image segmentation, the method comprising:
acquiring a target data set, and determining a training set and a testing set according to the target data set, wherein the target data set comprises label-free data and label data, and the label-free data is more than the label data;
processing the voxel distance and voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set;
inputting the preprocessing training set into a noise reduction probability diffusion model for training, and determining a pre-training model;
modifying the step length and the output layer network structure of the pre-training model, and inputting the pre-processing training set into the modified pre-training model for training to obtain an image segmentation model;
and inputting the preprocessing test set into the image segmentation model to obtain an image segmentation result.
2. The small sample CT image segmentation method as recited in claim 1, wherein said processing the voxel spacing of the image data in the training set and the test set comprises:
resampling the training set and the test set to a target voxel spacing by adopting trilinear interpolation and nearest neighbor interpolation;
and carrying out normalization processing on the voxel distances of the image data in the training set and the testing set based on the target voxel distance.
3. The small sample CT image segmentation method as recited in claim 1, wherein said processing voxel intensities of image data in the training set and the test set comprises:
calculating a first quantile and a second quantile of voxel intensities of the unlabeled data in the training set and the test set;
cutting the voxel intensities of the data in the training set and the test set to a position between a first quantile and a second quantile, and carrying out normalization processing on the cut voxel intensities.
4. The small sample CT image segmentation method according to claim 1, wherein the image data in the pre-processing training set and the pre-processing test set are three-dimensional image data, and the method further comprises, before the pre-processing training set is input to the noise reduction probability diffusion model for training:
splitting the three-dimensional image data in the preprocessing training set and the preprocessing testing set along a transverse plane, and converting the three-dimensional image data into two-dimensional image data;
adjusting the resolution of the label data in the two-dimensional image data to a preset resolution by adopting bilinear interpolation and nearest neighbor interpolation;
and carrying out data inversion on the label-free data in the two-dimensional image data.
5. The small sample CT image segmentation method according to claim 1, wherein the noise reduction probability diffusion model comprises:
in the forward process of the noise reduction probability diffusion model, defining a diffusion process as a Markov chain;
in the backward process of the noise reduction probability diffusion model, the U-Net network is adopted to learn the semantic features of the image data in the preprocessing training set or the preprocessing testing set.
6. The small sample CT image segmentation method as recited in claim 1, wherein said modifying the step size of the pre-training model comprises:
and modifying the step length of the pre-training model to be 0.
7. The method of claim 5, wherein the inputting the pre-processed training set into the modified pre-training model for training to obtain an image segmentation model, further comprises:
dividing parameters into coding layer parameters, decoding layer parameters, bottleneck layer parameters and classification header parameters according to the function of a network layer where the parameters of the U-Net in the modified pre-training model are located;
and adjusting weights of the decoding layer parameters and the classification head parameters in the modified pre-training model based on the weights corresponding to the coding layer parameters and the bottleneck layer parameters, and training the modified pre-training model based on the decoding layer parameters and the classification head parameters after weight adjustment to obtain the image segmentation model.
8. A small sample CT image segmentation system, the system comprising:
the data acquisition module is used for acquiring a target data set, and determining a training set and a testing set according to the target data set, wherein the target data set comprises label-free data and label data, and the label-free data is more than the label data;
the preprocessing module is used for processing the voxel distance and the voxel intensity of the image data in the training set and the test set to obtain a preprocessing training set and a preprocessing test set;
the pre-training module is used for inputting the pre-processing training set into the noise reduction probability diffusion model for training and determining a pre-training model;
the training module is used for modifying the step length and the output layer network structure of the pre-training model, and inputting the pre-processing training set into the modified pre-training model for training to obtain an image segmentation model;
and the image segmentation module is used for inputting the preprocessing test set into an image segmentation model to obtain an image segmentation result.
9. A terminal comprising a memory and one or more processors; the memory stores more than one program; the program comprising instructions for performing the small sample CT image segmentation method of any one of claims 1-7; the processor is configured to execute the program.
10. A computer readable storage medium having stored thereon a plurality of instructions adapted to be loaded and executed by a processor to implement the steps of the small sample CT image segmentation method of any of the preceding claims 1-7.
CN202310831051.8A 2023-07-05 2023-07-05 Small sample CT image segmentation method, system, terminal and storage medium Pending CN116824146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310831051.8A CN116824146A (en) 2023-07-05 2023-07-05 Small sample CT image segmentation method, system, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310831051.8A CN116824146A (en) 2023-07-05 2023-07-05 Small sample CT image segmentation method, system, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN116824146A true CN116824146A (en) 2023-09-29

Family

ID=88139175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310831051.8A Pending CN116824146A (en) 2023-07-05 2023-07-05 Small sample CT image segmentation method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN116824146A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422732A (en) * 2023-12-18 2024-01-19 湖南自兴智慧医疗科技有限公司 Pathological image segmentation method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200349229A1 (en) * 2019-05-02 2020-11-05 King Fahd University Of Petroleum And Minerals Open domain targeted sentiment classification using semisupervised dynamic generation of feature attributes
CN115082493A (en) * 2022-06-02 2022-09-20 陕西科技大学 3D (three-dimensional) atrial image segmentation method and system based on shape-guided dual consistency
CN115409733A (en) * 2022-09-02 2022-11-29 山东财经大学 Low-dose CT image noise reduction method based on image enhancement and diffusion model
CN115861250A (en) * 2022-12-14 2023-03-28 深圳技术大学 Self-adaptive data set semi-supervised medical image organ segmentation method and system
CN116228795A (en) * 2023-03-13 2023-06-06 北京工业大学 Ultrahigh resolution medical image segmentation method based on weak supervised learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200349229A1 (en) * 2019-05-02 2020-11-05 King Fahd University Of Petroleum And Minerals Open domain targeted sentiment classification using semisupervised dynamic generation of feature attributes
CN115082493A (en) * 2022-06-02 2022-09-20 陕西科技大学 3D (three-dimensional) atrial image segmentation method and system based on shape-guided dual consistency
CN115409733A (en) * 2022-09-02 2022-11-29 山东财经大学 Low-dose CT image noise reduction method based on image enhancement and diffusion model
CN115861250A (en) * 2022-12-14 2023-03-28 深圳技术大学 Self-adaptive data set semi-supervised medical image organ segmentation method and system
CN116228795A (en) * 2023-03-13 2023-06-06 北京工业大学 Ultrahigh resolution medical image segmentation method based on weak supervised learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422732A (en) * 2023-12-18 2024-01-19 湖南自兴智慧医疗科技有限公司 Pathological image segmentation method and device
CN117422732B (en) * 2023-12-18 2024-02-23 湖南自兴智慧医疗科技有限公司 Pathological image segmentation method and device

Similar Documents

Publication Publication Date Title
CN108898175B (en) Computer-aided model construction method based on deep learning gastric cancer pathological section
CN110188754B (en) Image segmentation method and device and model training method and device
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
CN109345508B (en) Bone age evaluation method based on two-stage neural network
CN111145181B (en) Skeleton CT image three-dimensional segmentation method based on multi-view separation convolutional neural network
CN110930378B (en) Emphysema image processing method and system based on low data demand
CN111260055A (en) Model training method based on three-dimensional image recognition, storage medium and equipment
CN111640120A (en) Pancreas CT automatic segmentation method based on significance dense connection expansion convolution network
Qiao et al. Dilated squeeze-and-excitation U-Net for fetal ultrasound image segmentation
CN113256592B (en) Training method, system and device of image feature extraction model
CN110866921A (en) Weakly supervised vertebral body segmentation method and system based on self-training and slice propagation
CN114240955B (en) Semi-supervised cross-domain self-adaptive image segmentation method
CN111667483A (en) Training method of segmentation model of multi-modal image, image processing method and device
CN116824146A (en) Small sample CT image segmentation method, system, terminal and storage medium
CN109410189B (en) Image segmentation method, and image similarity calculation method and device
CN115359066B (en) Focus detection method and device for endoscope, electronic device and storage medium
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment
CN112990359B (en) Image data processing method, device, computer and storage medium
CN117522891A (en) 3D medical image segmentation system and method
CN115190999A (en) Classifying data outside of a distribution using contrast loss
CN112785581A (en) Training method and device for extracting and training large blood vessel CTA (computed tomography angiography) imaging based on deep learning
CN117197456A (en) HE dyeing-oriented pathological image cell nucleus simultaneous segmentation classification method
CN115880277A (en) Stomach cancer pathology total section T stage classification prediction method based on Swin transducer and weak supervision
CN115762721A (en) Medical image quality control method and system based on computer vision technology
CN111160346A (en) Ischemic stroke segmentation system based on three-dimensional convolution

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination