CN115861164A - Medical image segmentation method based on multi-field semi-supervision - Google Patents

Info

Publication number
CN115861164A
Authority
CN
China
Prior art keywords
segmentation
field
medical image
domain
teacher
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211130790.6A
Other languages
Chinese (zh)
Inventor
舒禹程
李恒博
肖斌
李伟生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202211130790.6A
Publication of CN115861164A

Abstract

The invention belongs to the field of computer vision and medical image fusion, and particularly relates to a medical image segmentation method based on multi-field semi-supervision, comprising: constructing and training a semi-supervised teacher-student segmentation model, and inputting data from a different domain to be segmented into the trained segmentation model to obtain a segmentation result. In the invention, a teacher-student network mines the high-level semantic features of a large number of unlabeled multi-domain medical images; a network with a self-attention-based disentanglement mechanism extracts the domain features and the segmentation-part features; a domain-feature similarity screening mechanism and a multi-domain high-level semantic contrast loss function are used for robust learning; an exponential moving average algorithm is introduced so that the student model is converted into the teacher model; and a pixel-level error-checking scheme is applied under the teacher-student consistency constraint. This improves segmentation accuracy, extends applicability and generalization across multiple domains, improves the image segmentation performance of the deep model, and promotes the development of related technical fields.

Description

Medical image segmentation method based on multi-field semi-supervision
Technical Field
The invention belongs to the field of computer vision and medical image fusion, and particularly relates to a medical image segmentation method based on multi-field semi-supervision.
Background
Medical image segmentation aims to delineate and extract regions such as organs, tissues and lesions that are visually homogeneous and semantically consistent in images such as CT, MRI and X-ray. Using image processing, computer vision and machine learning methods, it provides theoretical and practical support for digital modeling and for automated computer-aided diagnosis and treatment of disease. As a key step in intelligent medical image analysis, medical image segmentation algorithms have shown broad clinical application prospects, and many research institutions in China and abroad have carried out extensive research in this field. In 2017, Stanford University developed a deep learning algorithm for diagnosing skin cancer; in 2018, the Google Brain team proposed a method for predicting the risk of cardiovascular and cerebrovascular disease from retinal images; in recent years, the endoscopic computer-aided diagnosis system for upper gastrointestinal tumors and the intelligent computer-aided diagnosis system for COVID-19 pneumonia developed by research teams in China have achieved great success. Therefore, researching an accurate, robust and general medical image segmentation method for practical clinical scenarios and requirements, and thereby developing a new generation of intelligent clinical auxiliary diagnosis systems with independent intellectual property rights, is of great practical significance for raising the level and efficiency of clinical diagnosis and treatment in hospitals, for improving primary-level diagnosis and treatment under the tiered diagnosis and treatment system, and for strengthening the informatization of China's medical system.
With the development of deep learning methods and continued research in related fields, deep network architectures with data-driven mechanisms have shown excellent feature learning and knowledge mining capabilities, and medical image segmentation has made great progress in this wave of development. However, because deep neural networks learn from supervised feedback, even a general natural image analysis task often needs a large amount of training data to ensure that the model converges, which places high demands on the quantity and quality of training samples.
By contrast, in the field of medical image segmentation, medical image data often differ considerably in appearance because of physiological differences between tissues, organ motion, and differing equipment parameters and specifications during image acquisition. In addition, because diseases are sparse, the data are sensitive and relatively closed, and medical image labels must be annotated by medical professionals, large numbers of high-quality training samples have always been difficult to obtain. In natural image analysis, sample sets with large amounts of data and labels are available and many deep networks with very large parameter counts have emerged; in medical image analysis, for the reasons above, researchers at the present stage must face the "small sample" problem, that is, a lack of samples and labels, in all kinds of medical image analysis tasks.
Therefore, constrained by the small-sample problem, most existing medical image segmentation methods are limited to a single data modality, a single organ, a fixed disease and a few validation datasets, which brings a series of disadvantages:
1) Because too little data is available, it is difficult to build an accurate and reliable model with the data-driven mechanism of a deep architecture;
2) The generalization ability of the learned model is difficult to keep improving, which restricts its usability and generality;
3) Models for different medical image analysis tasks are relatively independent, so common medical knowledge is difficult to distill effectively.
Disclosure of Invention
To solve the above technical problems, the invention provides a medical image segmentation method based on multi-field semi-supervision, comprising:
constructing and training a semi-supervised teacher-student segmentation model, and inputting data from a different domain to be segmented into the trained segmentation model to obtain a segmentation result;
the training process of the semi-supervised teacher-student segmentation model comprises the following steps:
S1: acquiring source-domain medical images, which consist of a small amount of labeled data and a large amount of unlabeled data;
S2: inputting the labeled data of the source-domain medical images into the student segmentation network, extracting the high-level tissue semantic features of the labeled data with the encoder, restoring the feature channels of these features with the decoder, predicting and restoring each channel pixel by pixel with the segmentation head MLP, binarizing the predicted pixels with a Softmax function to obtain a segmentation prediction confidence matrix, and performing supervised training with the segmentation prediction confidence matrix and the labeled data of the source-domain medical images;
S3: updating the network parameters of the supervised-trained student network with an exponential moving average algorithm to obtain the teacher network, inputting the unlabeled data into the teacher network to obtain a segmentation prediction confidence matrix, and processing this matrix into the teacher pseudo-label through a Detach function;
S4: extracting the high-level semantic features $f_\theta(X_s)$ of the unlabeled data with the student network encoder, decomposing $f_\theta(X_s)$ into three groups of independent semantic features Q, K and V through a grouped convolution operation, and extracting F gating points G through a one-dimensional convolution; making the two groups of independent semantic features Q and K pairwise independent with a Gaussian whitening function and embedding them into the binary-term morphological feature M through a transposed dot product; and forming the unary-term domain feature D from the independent semantic features V through a dot-product operation with the gating points G;
S5: storing the unary-term domain feature D in a queue to build the domain memory storage unit, the Meta Bank;
S6: using a nearest-neighbor algorithm in the Meta Bank to extract the unary-term domain feature D* most similar to the current unary-term domain feature D, treating D and D* as the current positive domain features and the remaining features in the Meta Bank as negative domain features, and using contrastive learning to pull the positive domain feature and its nearest domain feature in the Meta Bank toward each other while pushing the positive and negative domain features apart;
S7: selecting a unary-term domain feature D as the new domain feature, and adding it to the binary-term morphological feature M to obtain the new high-level semantic feature F;
S8: decoding F again with the student segmentation network decoder to obtain a feature map of the same size as the original input image, classifying the feature map with the segmentation head MLP to obtain the student segmentation confidence prediction, and performing pseudo-supervised training against the teacher pseudo-label;
S9: enabling gradient back-propagation, optimizing the loss function of the model, updating the network parameters according to the exponential moving average algorithm, and, when the model converges or the set number of epochs is reached, finishing the training and fixing the model parameters.
Preferably, the predicted pixels are binarized by a Softmax function to obtain the segmentation prediction confidence matrix, expressed as:
Figure BDA0003848477800000031
wherein Y represents the segmentation prediction confidence matrix obtained by the student segmentation network,
Figure BDA0003848477800000032
represents the encoder-decoder and segmentation-head processing inside the student segmentation network, X represents the labeled data in the input source-domain medical images, σ(·) represents the Softmax function, and τ represents the differentiation parameter that controls the segmentation result.
Preferably, supervised training is performed with the segmentation prediction confidence matrix and the labeled data of the source-domain medical images, expressed as:
Figure BDA0003848477800000041
wherein
Figure BDA0003848477800000042
represents the loss function of supervised training, T represents the total pixel count of the labeled data in the source-domain medical images, X represents the labeled data in the input source-domain medical images, GT(X) represents the labels of the labeled data in the source-domain medical images, and Y represents the segmentation prediction confidence matrix obtained by the student segmentation network.
Preferably, the unlabeled data are input into the teacher network to obtain a segmentation prediction confidence matrix, which is processed into the teacher pseudo-label through a Detach function, expressed as:
Figure BDA0003848477800000043
wherein $Y^*$ represents the teacher pseudo-label,
Figure BDA0003848477800000044
represents the encoder-decoder and segmentation-head processing inside the teacher network,
Figure BDA0003848477800000045
represents the Detach and Softmax operations, X represents the labeled data in the input source-domain medical images, and τ represents the differentiation parameter that controls the segmentation result.
Preferably, the two groups of independent semantic features Q and K are made pairwise independent with a Gaussian whitening function and are embedded into the binary-term morphological feature M through a transposed dot product, expressed as:
Figure BDA0003848477800000046
wherein M represents the binary-term morphological feature,
Figure BDA0003848477800000047
represents the independent semantic feature after the transposed Gaussian whitening operation,
Figure BDA0003848477800000048
$Q_i$ denotes the i-th pixel in Q, $K_j$ denotes the j-th pixel in K, T denotes the transpose operation, $\mu_Q$ denotes the mean of the independent semantic feature Q,
Figure BDA0003848477800000049
$\mu_K$ denotes the mean of the independent semantic feature K,
Figure BDA00038484778000000410
Ω denotes the set of all pixels in the feature map, t denotes a specific threshold pixel, σ(·) denotes the Softmax function, G denotes the gating points of the high-level semantic features of the unlabeled data, and exp(·) denotes the exponential function.
Preferably, the independent semantic features V form the unary-term domain feature D through a dot-product operation with the gating points G, expressed as:
Figure BDA0003848477800000051
wherein
Figure BDA0003848477800000052
represents the dot-product operation, V represents a group of independent semantic features obtained by the grouped convolution decomposition, and G represents the gating points of the high-level semantic features of the unlabeled data.
Preferably, a nearest-neighbor algorithm is used to extract the unary-term domain feature D* most similar to the current unary-term domain feature D, expressed as:
$D^* = L_d(D)_{Meta}$
wherein $L_d(\cdot)_{Meta}$ represents the operation that measures the shortest distance between D and D*.
Preferably, contrastive learning is used to pull the positive domain feature and its nearest domain feature in the domain memory storage unit Meta Bank toward each other and to push the positive and negative domain features apart, expressed as:
Figure BDA0003848477800000053
wherein
Figure BDA0003848477800000054
represents the contrast loss function of the i-th pixel between positive and negative domain features, $\exp(D_i \cdot D^*_i / \tau)$ represents the term that pulls the positive feature toward its nearest-neighbor feature, exp(·) represents the exponential function, $D_i$ represents the i-th pixel of the positive domain feature, $D^*_i$ represents the i-th pixel of the nearest-neighbor feature, τ represents the differentiation parameter that controls the segmentation result, $\sum_{D' \in \mathrm{MetaBank}} \exp(D_i \cdot D'_i / \tau)$ represents the term that separates positive and negative domain features, $D'$ represents a negative domain feature, and $D'_i$ is the i-th pixel of the negative domain feature.
Preferably, pseudo-supervised training is performed between the student segmentation confidence prediction and the teacher pseudo-label, expressed as:
Figure BDA0003848477800000055
wherein
Figure BDA0003848477800000056
represents the pseudo-supervised loss function, N represents the total pixel count of the unlabeled data in the source-domain medical images, $Y_b$ represents the student network prediction, and
Figure BDA0003848477800000057
represents the teacher network prediction.
Preferably, the loss function of the model is expressed as:
Figure BDA0003848477800000058
wherein
Figure BDA0003848477800000059
represents the total loss function of the teacher-student network,
Figure BDA00038484778000000510
represents the loss function of supervised training,
Figure BDA00038484778000000511
represents the contrast loss function of the i-th pixel between positive and negative domain features, and
Figure BDA00038484778000000512
is the pseudo-supervised loss function.
The beneficial effects of the invention are as follows: a teacher-student network mines the high-level semantic features of a large number of unlabeled multi-domain medical images; a network with a self-attention disentanglement mechanism extracts the domain features and the segmentation-part features; a domain-feature similarity screening mechanism and a multi-domain high-level semantic contrast loss function are used for robust learning; an exponential moving average algorithm is introduced so that the student model is converted into the teacher model; a pixel-level error-checking scheme is applied under the teacher-student consistency constraint; and the segmentation region is finally determined from the student output. As a result, segmentation accuracy is improved, and applicability and generalization across multiple domains are extended.
Drawings
FIG. 1 is a block diagram of a multi-domain semi-supervised based medical image segmentation method of the present invention;
FIG. 2 is a graph of the segmentation result according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A medical image segmentation method based on multi-domain semi-supervision, as shown in FIG. 1, includes:
constructing and training a semi-supervised teacher-student segmentation model, and inputting data from a different domain to be segmented into the trained segmentation model to obtain a segmentation result;
the training process of the semi-supervised teacher-student segmentation model comprises the following steps:
S1: acquiring source-domain medical images, which consist of a small amount of labeled data and a large amount of unlabeled data;
S2: inputting the labeled data of the source-domain medical images into the student segmentation network, extracting the high-level tissue semantic features of the labeled data with the encoder, restoring the feature channels of these features with the decoder, predicting and restoring each channel pixel by pixel with the segmentation head MLP, binarizing the predicted pixels with a Softmax function to obtain a segmentation prediction confidence matrix, and performing supervised training with the segmentation prediction confidence matrix and the labeled data of the source-domain medical images;
S3: updating the network parameters of the supervised-trained student network with an exponential moving average algorithm to obtain the teacher network, inputting the unlabeled data into the teacher network to obtain a segmentation prediction confidence matrix, and processing this matrix into the teacher pseudo-label through a Detach function;
S4: extracting the high-level semantic features $f_\theta(X_s)$ of the unlabeled data with the student network encoder, decomposing $f_\theta(X_s)$ into three groups of independent semantic features Q, K and V through a grouped convolution operation, and extracting F gating points G through a one-dimensional convolution; making the two groups of independent semantic features Q and K pairwise independent with a Gaussian whitening function and embedding them into the binary-term morphological feature M through a transposed dot product; and forming the unary-term domain feature D from the independent semantic features V through a dot-product operation with the gating points G;
S5: storing the unary-term domain feature D in a queue to build the domain memory storage unit, the Meta Bank;
S6: using a nearest-neighbor algorithm in the Meta Bank to extract the unary-term domain feature D* most similar to the current unary-term domain feature D, treating D and D* as the current positive domain features and the remaining features in the Meta Bank as negative domain features, and using contrastive learning to pull the positive domain feature and its nearest domain feature in the Meta Bank toward each other while pushing the positive and negative domain features apart;
S7: selecting a unary-term domain feature D as the new domain feature, and adding it to the binary-term morphological feature M to obtain the new high-level semantic feature F;
S8: decoding F again with the student segmentation network decoder to obtain a feature map of the same size as the original input image, classifying the feature map with the segmentation head MLP to obtain the student segmentation confidence prediction, and performing pseudo-supervised training against the teacher pseudo-label;
S9: enabling gradient back-propagation, optimizing the loss function of the model, updating the network parameters according to the exponential moving average algorithm, and, when the model converges or the set number of epochs is reached, finishing the training and fixing the model parameters.
The acquired multi-domain medical image data are cropped; in this embodiment the size is set to 255 x 255 pixels. Different data augmentations are then applied to the data before they are input into the teacher and student networks; the augmentations include Mixup, CutMix, Gaussian white noise and the like.
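Two of the augmentations named in this embodiment, Mixup and Gaussian white noise, can be sketched as follows. This is a minimal illustration in plain Python and not the embodiment's own code: the function names `mixup` and `add_gaussian_noise`, the fixed blending weight `lam` and the noise level `sigma` are assumptions, and the images are tiny nested lists rather than 255 x 255 arrays.

```python
import random


def mixup(image_a, image_b, lam):
    """Blend two equally sized images pixel by pixel: lam * a + (1 - lam) * b."""
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
img_a = [[0.0, 1.0], [1.0, 0.0]]
img_b = [[1.0, 1.0], [0.0, 0.0]]

mixed = mixup(img_a, img_b, lam=0.25)
noisy = add_gaussian_noise(img_a, sigma=0.05, rng=rng)
```

In common Mixup usage the weight is drawn at random for each pair and the labels are blended with the same weight; a fixed value is used here only to keep the example short.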
The predicted pixels are binarized by a Softmax function to obtain the segmentation prediction confidence matrix, expressed as:
Figure BDA0003848477800000081
wherein Y represents the segmentation prediction confidence matrix obtained by the student segmentation network,
Figure BDA0003848477800000082
represents the encoder-decoder and segmentation-head processing inside the student segmentation network, X represents the labeled data in the input source-domain medical images, σ(·) represents the Softmax function, and τ represents the differentiation parameter that controls the segmentation result.
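The role of τ can be illustrated with a temperature-scaled Softmax over the class scores of a single pixel. The formula itself is only available as an image in this text, so the sketch assumes the common form in which the scores are divided by τ before the Softmax; the function name `softmax_with_temperature` is hypothetical.

```python
import math


def softmax_with_temperature(logits, tau):
    """Softmax over one pixel's class scores after dividing them by tau (assumed form)."""
    scaled = [z / tau for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


pixel_scores = [2.0, 0.0]  # illustrative foreground / background scores
soft = softmax_with_temperature(pixel_scores, tau=1.0)
sharp = softmax_with_temperature(pixel_scores, tau=0.5)
```

Under this assumption a smaller τ pushes the confidences closer to 0 and 1, which is one reading of "differentiation" of the segmentation result.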
The network parameters of the supervised-trained student network are updated with an exponential moving average algorithm, expressed as:
$\theta'_t = \alpha\theta'_{t-1} + (1-\alpha)\theta_t$
wherein $\theta'_t$ represents the teacher network parameters at round t, $\theta_t$ represents the student network parameters at round t, and α represents a weighting coefficient.
Supervised training is performed with the segmentation prediction confidence matrix and the labeled data of the source-domain medical images, expressed as:
Figure BDA0003848477800000083
wherein
Figure BDA0003848477800000084
represents the loss function of supervised training, T represents the total pixel count of the labeled data in the source-domain medical images, X represents the labeled data in the input source-domain medical images, GT(X) represents the labels of the labeled data in the source-domain medical images, and Y represents the segmentation prediction confidence matrix obtained by the student segmentation network.
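Because the supervised loss formula is only present as an image, the following sketch assumes one common choice that matches the listed symbols: a pixel-wise cross-entropy between the confidence matrix Y and the labels GT(X), averaged over the T pixels. The function name and the small constant `eps` are assumptions made for illustration.

```python
import math


def supervised_cross_entropy(confidence, labels, eps=1e-12):
    """Mean pixel-wise cross-entropy (an assumed form of the supervised loss).

    confidence: one list of class probabilities per pixel (the matrix Y).
    labels:     one ground-truth class index per pixel (GT(X)).
    """
    total_pixels = len(labels)  # plays the role of T
    loss = 0.0
    for probs, label in zip(confidence, labels):
        loss -= math.log(probs[label] + eps)  # eps guards against log(0)
    return loss / total_pixels


Y = [[0.5, 0.5], [0.5, 0.5]]  # two pixels, two classes, maximally uncertain
GT = [0, 1]
loss_sup = supervised_cross_entropy(Y, GT)
```

A Dice-style loss would fit the same symbol list equally well; the cross-entropy form is used here only because it is the simplest to write down.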
The unlabeled data are input into the teacher network to obtain a segmentation prediction confidence matrix, which is processed into the teacher pseudo-label through a Detach function, expressed as:
Figure BDA0003848477800000085
wherein $Y^*$ represents the teacher pseudo-label,
Figure BDA0003848477800000086
represents the encoder-decoder and segmentation-head processing inside the teacher network,
Figure BDA0003848477800000087
represents the Detach and Softmax operations, X represents the labeled data in the input source-domain medical images, and τ represents the differentiation parameter that controls the segmentation result.
The two groups of independent semantic features Q and K are made pairwise independent with a Gaussian whitening function and are embedded into the binary-term morphological feature M through a transposed dot product, expressed as:
Figure BDA0003848477800000091
wherein M represents the binary-term morphological feature,
Figure BDA0003848477800000092
represents the independent semantic feature after the transposed Gaussian whitening operation,
Figure BDA0003848477800000093
$Q_i$ denotes the i-th pixel in Q, $K_j$ denotes the j-th pixel in K, T denotes the transpose operation, $\mu_Q$ denotes the mean of the independent semantic feature Q,
Figure BDA0003848477800000094
$\mu_K$ denotes the mean of the independent semantic feature K,
Figure BDA0003848477800000095
Ω denotes the set of all pixels in the feature map, t denotes a specific threshold pixel, σ(·) denotes the Softmax function, G denotes the gating points of the high-level semantic features of the unlabeled data, and exp(·) denotes the exponential function.
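The whitened pairwise term can be sketched from the symbols defined here: the means μ_Q and μ_K are subtracted from every pixel of Q and K, the dot product between each whitened pair is taken, and a Softmax is applied over the key pixels. The exact formula is an image in this text, so this arrangement is an assumption, as are the helper names `whiten` and `pairwise_term`; only mean subtraction is performed, with no variance normalization.

```python
import math


def mean_vector(features):
    """Channel-wise mean over all pixels (the set Omega)."""
    count = len(features)
    return [sum(channel) / count for channel in zip(*features)]


def whiten(features):
    """Subtract the mean vector from every pixel feature."""
    mu = mean_vector(features)
    return [[x - m for x, m in zip(vec, mu)] for vec in features]


def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_term(Q, K):
    """M[i][j] = softmax over j of (Q_i - mu_Q) . (K_j - mu_K)  (assumed form)."""
    Qw, Kw = whiten(Q), whiten(K)
    scores = [[sum(a * b for a, b in zip(q, k)) for k in Kw] for q in Qw]
    return [softmax(row) for row in scores]


# Three pixels with two channels each; values are illustrative only.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
M = pairwise_term(Q, K)
```

Each row of `M` sums to one, so it can be read as an attention distribution of pixel i over all pixels j.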
The independent semantic features V form the unary-term domain feature D through a dot-product operation with the gating points G, expressed as:
Figure BDA0003848477800000096
wherein
Figure BDA0003848477800000097
represents the dot-product operation, V represents a group of independent semantic features obtained by the grouped convolution decomposition, and G represents the gating points of the high-level semantic features of the unlabeled data.
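One possible reading of this step, sketched under stated assumptions: the gating points G hold one score per pixel, the scores are normalized with a Softmax, and the result weights the pixel features in V so that a single domain vector D is obtained. The text specifies only a dot-product between V and G, so the Softmax normalization and the function name `unary_domain_feature` are assumptions.

```python
import math


def unary_domain_feature(V, G):
    """Gate-weighted sum of pixel features (an assumed form of D).

    V: one feature vector per pixel.
    G: one gating score per pixel.
    """
    peak = max(G)
    exps = [math.exp(g - peak) for g in G]
    total = sum(exps)
    weights = [e / total for e in exps]  # normalized gate per pixel
    channels = len(V[0])
    return [sum(w * v[c] for w, v in zip(weights, V)) for c in range(channels)]


V = [[1.0, 0.0], [0.0, 1.0]]
G = [0.0, 0.0]  # equal gates, so D is the plain average of the pixels
D = unary_domain_feature(V, G)
```

Because D does not depend on any particular query pixel, it summarizes image-wide properties, which is consistent with its use as a domain descriptor.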
A nearest-neighbor algorithm is used to extract the unary-term domain feature D* most similar to the current unary-term domain feature D, expressed as:
$D^* = L_d(D)_{Meta}$
wherein $L_d(\cdot)_{Meta}$ represents the operation that measures the shortest distance between D and D*.
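The queue-based Meta Bank and the nearest-neighbor lookup can be sketched as follows. The class name `MetaBank`, its `capacity` parameter and the choice of Euclidean distance are assumptions; the text states only that the bank is a queue and that the feature at the shortest distance is retrieved.

```python
import math
from collections import deque


class MetaBank:
    """Fixed-size FIFO queue of domain features with nearest-neighbor lookup."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest feature once the bank is full
        self.queue = deque(maxlen=capacity)

    def push(self, feature):
        self.queue.append(list(feature))

    def nearest(self, query):
        """Return the stored feature with the smallest Euclidean distance to query."""
        return min(self.queue, key=lambda feat: math.dist(query, feat))


bank = MetaBank(capacity=3)
for stored in ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]):
    bank.push(stored)

D_current = [1.0, 0.05]
D_star = bank.nearest(D_current)
```

In this sketch the query is looked up before it is pushed, so it cannot be returned as its own nearest neighbor; whether the method searches before or after storing D is not stated in the text.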
Contrastive learning is used to pull the positive domain feature and its nearest domain feature in the domain memory storage unit Meta Bank toward each other and to push the positive and negative domain features apart, expressed as:
Figure BDA0003848477800000098
wherein
Figure BDA0003848477800000099
represents the contrast loss function of the i-th pixel between positive and negative domain features. $\exp(D_i \cdot D^*_i / \tau)$ is the term that pulls the positive domain feature toward its nearest-neighbor domain feature; its purpose is to make similar domains in the domain bank compact so that the network can cluster and model the same domain well. Here exp(·) is the exponential function, $D_i$ is the i-th pixel of the positive domain feature, $D^*_i$ is the i-th pixel of the nearest-neighbor domain feature, and τ represents the differentiation parameter that controls the segmentation result. $\sum_{D' \in \mathrm{MetaBank}} \exp(D_i \cdot D'_i / \tau)$ is the term that separates positive and negative domain features; its purpose is to give different domains in the domain bank separate structured representations so that the network learns robustly under cross-domain perturbation. Here Meta Bank is the domain memory storage unit, $D'$ denotes all features in the Meta Bank other than the positive domain feature D and the nearest-neighbor domain feature D*, also called negative domain features, and $D'_i$ is the i-th pixel of a negative domain feature.
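The two terms described here resemble the numerator and denominator of an InfoNCE-style contrastive loss. Since the formula is an image in this text, the sketch below assumes that standard form, with the positive term also included in the denominator; the function names are hypothetical and each feature is treated as a single vector.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrast_loss(anchor, positive, negatives, tau):
    """InfoNCE-style loss (assumed form): -log(pos / (pos + sum of negatives))."""
    pos = math.exp(dot(anchor, positive) / tau)  # pulling term
    neg = sum(math.exp(dot(anchor, n) / tau) for n in negatives)  # separating term
    return -math.log(pos / (pos + neg))


D_i = [1.0, 0.0]
D_star_i = [1.0, 0.0]  # nearest neighbor from the bank
D_neg = [[0.0, 1.0]]  # remaining bank entries

loss_aligned = contrast_loss(D_i, D_star_i, D_neg, tau=1.0)
loss_misaligned = contrast_loss(D_i, [0.0, 1.0], [[1.0, 0.0]], tau=1.0)
```

The loss is small when the anchor agrees with its nearest neighbor and disagrees with the negatives, and it grows when those roles are swapped.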
Pseudo-supervised training is performed between the student segmentation confidence prediction and the teacher pseudo-label, expressed as:
Figure BDA0003848477800000101
wherein
Figure BDA0003848477800000102
is the pseudo-supervised loss function, N represents the total pixel count of the unlabeled data in the source-domain medical images, $Y_b$ is the student network prediction,
Figure BDA0003848477800000103
is the teacher network prediction,
Figure BDA0003848477800000104
represents the conditional entropy of correct predictions,
Figure BDA0003848477800000105
represents the conditional entropy of incorrect predictions, and the two conditional entropies are added to obtain the pseudo-supervised loss.
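A sum of two entropy terms averaged over N pixels is consistent with a binary cross-entropy between the student confidence and the teacher pseudo-label, and the sketch below assumes that form. The mapping of the "correct" and "incorrect" terms to the two summands, the function name and the constant `eps` are all assumptions.

```python
import math


def pseudo_supervised_loss(student_probs, teacher_labels, eps=1e-12):
    """Binary cross-entropy between student confidences and teacher pseudo-labels.

    student_probs:  foreground probability per pixel predicted by the student.
    teacher_labels: pseudo-label per pixel from the teacher (0.0 or 1.0).
    """
    n_pixels = len(student_probs)  # plays the role of N
    total = 0.0
    for y_b, y_hat in zip(student_probs, teacher_labels):
        total -= y_hat * math.log(y_b + eps)  # pixels the teacher marks as foreground
        total -= (1.0 - y_hat) * math.log(1.0 - y_b + eps)  # background pixels
    return total / n_pixels


student = [0.5, 0.5]
teacher = [1.0, 0.0]
loss_pseudo = pseudo_supervised_loss(student, teacher)
```

In an automatic-differentiation framework the teacher labels would be detached from the graph, so gradients flow only through the student prediction.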
The loss function of the model is expressed as:
Figure BDA0003848477800000106
wherein
Figure BDA0003848477800000107
represents the total loss function of the teacher-student network,
Figure BDA0003848477800000108
represents the loss function of supervised training,
Figure BDA0003848477800000109
represents the contrast loss function of the i-th pixel between positive and negative domain features, and
Figure BDA00038484778000001010
is the pseudo-supervised loss function; the three loss functions are added to obtain the total loss function of the current teacher-student network.
In S9, the network parameters are updated according to the exponential moving average algorithm, expressed as:
$\theta'_t = \alpha\theta'_{t-1} + (1-\alpha)\theta_t$
wherein $\theta'_t$ represents the network parameters of the updated model at round t, $\theta'_{t-1}$ represents the teacher network parameters at round t-1, $\theta_t$ represents the student network parameters at round t, and α represents a weighting coefficient.
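The exponential moving average update can be written directly from this formula. The sketch treats the parameters as flat lists of floats; the function name `ema_update` and the value of `alpha` are illustrative assumptions.

```python
def ema_update(teacher_params, student_params, alpha):
    """theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t, element by element."""
    return [
        alpha * t_prev + (1.0 - alpha) * s
        for t_prev, s in zip(teacher_params, student_params)
    ]


teacher_prev = [1.0, 0.0]  # teacher parameters from the previous round
student_now = [0.0, 1.0]  # student parameters after the current gradient step
teacher_now = ema_update(teacher_prev, student_now, alpha=0.9)
```

A value of α close to 1 makes the teacher change slowly, so it acts as a smoothed history of the student rather than a copy of its latest state.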
A medical image segmentation system based on multi-domain semi-supervision, comprising: a data collection module, a down-sampling module, a feature exchange module, an MLP module, a pooling module, a splicing module, a confidence calculation module, a Gaussian feature whitening module, a domain memory storage module, an attention module, a nearest neighbor candidate feature module, an exponential moving average module, a matching module and a consistency constraint module;
the data collection module is used for acquiring the multi-domain labeled medical images and multi-domain unlabeled medical images that are input into the model;
the down-sampling module is used for screening out high-level semantic features;
the feature exchange module is used for extracting the unary-term domain features and the nearest neighbor unary-term domain features and then exchanging the feature information of the two groups of domains;
the MLP module is used for extracting deep features of the original high-level semantics and the nearest neighbor high-level semantics with a plurality of MLPs, based on the output of the feature segmentation module;
the pooling module is used for max-pooling the deep features to obtain global features of the source domain and global features of the target domain;
the splicing module is used for fusing feature information, concatenating the unary-term domain features and the binary-term morphological features with the deep features of each point;
the confidence calculation module is used for inputting the concatenated features of each pixel into the fully connected layer to obtain the confidence of each pixel;
the Gaussian feature whitening module is used for decoupling the high-level semantic features into a plurality of mutually independent features;
the domain memory storage module is used for storing unary-term domain features, and its internal data structure follows a queue storage mode;
the attention module is used for acquiring the association information among features and enhancing the feature representation, that is, processing the high-level semantic feature representation with an attention mechanism and a cross-attention mechanism to obtain four groups of decoupled, mutually independent feature representations;
the nearest neighbor candidate feature module is used for querying, according to a nearest neighbor algorithm, the unary-term domain feature in the domain memory storage module that is most similar to the current unary-term domain feature;
the exponential moving average module is used for averaging the weight parameters of the student network from the previous round and updating the weight parameters of the teacher network with the result;
the matching module is used for acquiring, from the target point cloud, a corresponding point set matched with the first point set;
and the consistency constraint module is used for introducing a consistency constraint into the teacher-student model to obtain the teacher-student pseudo-supervised loss.
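The domain memory storage module and the nearest neighbor candidate feature module listed above can be sketched together. This is a minimal illustration under stated assumptions: features are flat vectors, similarity is measured by Euclidean distance, and the class name `MetaBank`, its methods and the capacity are hypothetical choices for the example.

```python
from collections import deque
import numpy as np

class MetaBank:
    """FIFO store of domain feature vectors (hypothetical illustration)."""

    def __init__(self, capacity=4):
        self.queue = deque(maxlen=capacity)   # the oldest entry is dropped when full

    def push(self, feature):
        self.queue.append(np.asarray(feature, dtype=float))

    def nearest(self, query):
        """Return (index, feature) of the stored feature closest to `query`."""
        bank = np.stack(self.queue)                    # shape (num_stored, dim)
        dists = np.linalg.norm(bank - query, axis=1)   # Euclidean distance to each entry
        idx = int(np.argmin(dists))
        return idx, bank[idx]

# Fill a bank of capacity 3 with four features; the first one is evicted.
bank = MetaBank(capacity=3)
for f in ([0.0, 0.0], [1.0, 0.0], [0.0, 5.0], [2.0, 2.0]):
    bank.push(f)
idx, d_star = bank.nearest(np.array([0.9, 0.1]))
```

A bounded queue keeps memory use constant and ensures that stale features from early training rounds are eventually discarded.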
In one embodiment, experiments were performed on the M&Ms and SCGM datasets. Multi-center, multi-vendor and multi-disease cardiac image segmentation (M&Ms) dataset: the M&Ms challenge dataset contains 320 subjects scanned at 6 clinical centers in 3 different countries using magnetic resonance scanners from 4 different vendors (Siemens, Philips, GE and Canon), which form domains A, B, C and D. For each subject, only the end-systole and end-diastole phases are annotated. Voxel resolutions range from 0.85 × 0.85 × 10 mm to 1.45 × 1.45 × 9.9 mm. Domain A contains 95 subjects, domain B contains 125 subjects, and domains C and D each contain 50 subjects.
Spinal cord gray matter segmentation (SCGM) dataset: the SCGM data [38] come from 4 different medical centers using different MRI systems (Philips Achieva, Siemens Trio, Siemens Skyra), which form domains 1, 2, 3 and 4. Voxel resolutions range from 0.25 × 0.25 × 2.5 mm to 0.5 × 0.5 × 5 mm. Each domain has 10 labeled subjects and 10 unlabeled subjects.
Fig. 2 shows example images and the segmentation masks predicted by each model under different conditions. The performance of the baseline model drops significantly when it is trained with less labeled data; in contrast, our model produces a satisfactory segmentation mask in each case.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A medical image segmentation method based on multi-field semi-supervision, characterized by comprising the following steps:
constructing and training a semi-supervised teacher-student segmentation model, and inputting data of a different domain to be segmented into the trained segmentation model to obtain a segmentation result;
the training process of the semi-supervised teacher-student segmentation model comprises the following steps:
S1: acquiring source-domain medical images and performing data augmentation on the acquired medical images, wherein the source-domain medical images consist of a small amount of labeled data and a large amount of unlabeled data;
S2: inputting the labeled data of the augmented source-domain medical images into the student segmentation network; extracting high-level tissue semantic features of the labeled data with the encoder; performing feature channel reduction on the high-level tissue semantic features with the decoder; predicting each reduced channel pixel by pixel with the MLP (multilayer perceptron) of the segmentation head; binarizing the predicted pixels with a Softmax function to obtain a segmentation prediction confidence matrix; and performing supervised training according to the segmentation prediction confidence matrix and the labeled data in the source-domain medical images;
S3: updating the network parameters of the supervised-trained student network with the exponential moving average algorithm to obtain the teacher network; inputting the unlabeled data into the teacher network to obtain a segmentation prediction confidence matrix; and processing this segmentation prediction confidence matrix into the teacher pseudo-label through a Detach function;
S4: extracting the high-level semantic features f_θ(X_s) of the unlabeled data with the student network encoder; decomposing f_θ(X_s) into three groups of independent semantic features Q, K and V through a grouped convolution operation, and extracting the gating point G of the high-level semantic features through a one-dimensional convolution; making the two groups of independent semantic features Q and K pairwise independent with a Gaussian whitening function and embedding them as the binary-term morphological feature M through a transposed dot product; and constructing the unary-term domain feature D from the V group of independent semantic features through a dot product operation with the gating point G;
S5: storing the unary-term domain feature D into a queue to construct the domain memory storage unit Meta Bank;
S6: extracting, with a nearest neighbor algorithm in the Meta Bank, the unary-term domain feature D* most similar to the current unary-term domain feature D; regarding D and D* as the current positive domain features and the remaining features in the Meta Bank as negative domain features; and using contrastive learning to pull the positive domain feature and its nearest domain feature in the Meta Bank toward each other while pushing the positive and negative domain features apart;
S7: selecting the unary-term domain feature D* as the new domain feature, and adding D* and the binary-term morphological feature M to obtain a new high-level semantic feature F;
S8: decoding F again with the student segmentation network decoder to obtain a feature map of the same size as the original input image; classifying the feature map with the segmentation head MLP to obtain the student segmentation confidence prediction; and performing pseudo-supervised training against the teacher pseudo-label;
S9: starting the gradient back-propagation mechanism, optimizing the loss function of the model, and updating the network parameters according to the exponential moving average algorithm; after the model converges or the set number of epochs is reached, the training of the model is finished and the model parameters are fixed.
2. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein binarizing the predicted pixels with the Softmax function to obtain the segmentation prediction confidence matrix is expressed as:
[formula image FDA0003848477790000023]
wherein Y represents the segmentation prediction confidence matrix obtained by the student segmentation network, [formula image FDA0003848477790000024] represents the processing operations of the student network encoder-decoder and segmentation head in the student segmentation network, X represents the labeled data in the input source-domain medical image, σ(·) represents the Softmax function, and τ represents the differentiation parameter controlling the segmentation result.
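The role of σ(·) and τ can be illustrated with a short sketch. The claimed formula is available only as an image, so the code assumes that τ acts as a temperature that divides the segmentation head output before the Softmax; the function name `softmax_with_temperature` and the toy logits are illustrative assumptions.

```python
import numpy as np

def softmax_with_temperature(logits, tau=1.0):
    """Softmax over the last axis (class channels) of logits divided by tau."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Two pixels, two classes: a smaller tau gives a sharper confidence matrix.
logits = np.array([[2.0, 0.0], [0.0, 1.0]])
conf_sharp = softmax_with_temperature(logits, tau=0.5)
conf_soft = softmax_with_temperature(logits, tau=2.0)
```

Under this assumption, lowering τ pushes each pixel's confidence toward 0 or 1, which is one way to read the "binarization" mentioned in the claim.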
3. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein performing supervised training according to the segmentation prediction confidence matrix and the labeled data in the source-domain medical image is expressed as:
[formula image FDA0003848477790000021]
wherein [formula image FDA0003848477790000022] represents the loss function of the supervised training, T represents the total number of pixels of the labeled data in the source-domain medical image, X represents the labeled data in the input source-domain medical image, GT(X) represents the label of the labeled data in the source-domain medical image, and Y represents the segmentation prediction confidence matrix obtained by the student segmentation network.
4. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein inputting the unlabeled data into the teacher network to obtain the segmentation prediction confidence matrix and processing it into the teacher pseudo-label through the Detach function is expressed as:
[formula image FDA0003848477790000031]
wherein Y* represents the teacher pseudo-label, [formula image FDA0003848477790000032] represents the processing operations of the teacher network encoder-decoder and segmentation head inside the teacher network, [formula image FDA0003848477790000033] represents the Detach and Softmax operations, X represents the labeled data in the input source-domain medical image, and τ represents the differentiation parameter controlling the segmentation result.
5. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein making the two groups of independent semantic features Q and K pairwise independent with the Gaussian whitening function and embedding them as the binary-term morphological feature M through the transposed dot product is expressed as:
[formula image FDA0003848477790000034]
wherein M represents the binary-term morphological feature; [formula image FDA0003848477790000035] represents the independent semantic features after the transposed Gaussian whitening operation, [formula image FDA0003848477790000036]; Q_i denotes the i-th pixel in Q; K_j denotes the j-th pixel in K; T denotes the transpose operation; μ_Q denotes the mean of the independent semantic feature Q, [formula image FDA0003848477790000037]; μ_K denotes the mean of the independent semantic feature K, [formula image FDA0003848477790000038]; Ω represents the set of all pixels in the feature map; t represents a specific threshold pixel point; σ(·) represents the Softmax function; G represents the gating point of the high-level semantic features of the unlabeled data; and exp(·) represents the exponential function.
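The whitened transposed dot product can be sketched as follows. The exact claimed formula is not recoverable from the page, so this is a guess at its general shape: "whitening" is reduced here to subtracting the per-channel means μ_Q and μ_K taken over all pixels in Ω, and the Softmax is applied over j. The roles of G and of the threshold t are not modeled, and the function name `whitened_pairwise` is hypothetical.

```python
import numpy as np

def whitened_pairwise(Q, K):
    """Q, K: (num_pixels, channels). Returns a (num_pixels, num_pixels) matrix."""
    Qc = Q - Q.mean(axis=0, keepdims=True)        # subtract mu_Q (mean over all pixels)
    Kc = K - K.mean(axis=0, keepdims=True)        # subtract mu_K
    scores = Qc @ Kc.T                            # entry (i, j) = (Q_i - mu_Q)^T (K_j - mu_K)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)       # Softmax over j for each pixel i

# Toy feature maps with 5 pixels and 3 channels.
rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 3))
K = rng.normal(size=(5, 3))
M = whitened_pairwise(Q, K)
```

Mean subtraction makes the result insensitive to a constant offset shared by all pixels, which is the property that separates the pairwise term from per-pixel saliency.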
6. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein constructing the unary-term domain feature D from the V group of independent semantic features through the dot product operation with the gating point G is expressed as:
[formula image FDA0003848477790000039]
wherein [formula image FDA00038484777900000310]; V represents a group of independent semantic features obtained by the decomposition of the grouped convolution operation, and G represents the gating point of the high-level semantic features of the unlabeled data.
7. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein extracting, with the nearest neighbor algorithm, the unary-term domain feature D* most similar to the current unary-term domain feature D is expressed as:
$$D^* = L_d(D)_{Meta}$$
wherein $L_d(\cdot)_{Meta}$ represents the operation of measuring the shortest distance between D and D*.
8. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein using contrastive learning to pull the positive domain feature and its nearest domain feature in the domain memory storage unit Meta Bank toward each other and to push the positive and negative domain features apart is expressed as:
[formula image FDA0003848477790000041]
wherein [formula image FDA0003848477790000042] represents the contrast loss function at the i-th pixel between the positive and negative domain features; $\exp(D_i \cdot D^*_i / \tau)$ represents the closeness term between the positive domain feature and the nearest neighbor domain feature; exp(·) represents the exponential function; $D_i$ represents the i-th pixel in the positive domain feature; $D^*_i$ represents the i-th pixel in the nearest neighbor domain feature; τ represents the differentiation parameter controlling the segmentation result; $\sum_{D' \in \text{Meta Bank}} \exp(D_i \cdot D'_i / \tau)$ represents the separation term between the positive and negative domain features; $D'$ represents a negative domain feature; and $D'_i$ represents the i-th pixel of the negative domain feature.
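The closeness and separation terms named in this claim can be combined into a runnable sketch. The way they are combined here follows the common InfoNCE pattern, with the closeness term in the numerator and the sum of both terms in the denominator; that combination is an assumption, since the claimed formula is available only as an image. The function name `domain_contrast_loss`, the value of `tau` and the toy vectors are illustrative.

```python
import numpy as np

def domain_contrast_loss(d, d_star, negatives, tau=1.0):
    """InfoNCE-style loss for one pixel feature (assumed form).

    d, d_star : (dim,) positive feature D_i and its nearest neighbour D*_i
    negatives : (num_neg, dim) remaining Meta Bank features D'_i
    """
    pos = np.exp(np.dot(d, d_star) / tau)        # closeness term
    neg = np.exp(negatives @ d / tau).sum()      # separation term
    return float(-np.log(pos / (pos + neg)))

# The loss is small when D* is aligned with D and the negative is orthogonal to it.
d = np.array([1.0, 0.0])
close = domain_contrast_loss(d, np.array([1.0, 0.0]), np.array([[0.0, 1.0]]))
far = domain_contrast_loss(d, np.array([0.0, 1.0]), np.array([[1.0, 0.0]]))
```

The sketch assumes features of moderate norm (for example L2-normalized vectors); with large unnormalized features and a small τ the exponentials would overflow.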
9. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein performing pseudo-supervised training between the student segmentation confidence prediction and the teacher pseudo-label is expressed as:
[formula image FDA0003848477790000043]
wherein [formula image FDA0003848477790000044] represents the pseudo-supervised loss function, N represents the total number of pixels of the unlabeled data in the source-domain medical image, Y_b represents the student network prediction result, and [formula image FDA0003848477790000045] represents the teacher network prediction result.
10. The medical image segmentation method based on multi-field semi-supervision as claimed in claim 1, wherein the loss function of the model is expressed as:
[formula image FDA0003848477790000046]
wherein [formula image FDA0003848477790000047] represents the total loss function of the teacher-student network, [formula image FDA0003848477790000048] represents the loss function of the supervised training, [formula image FDA0003848477790000049] represents the contrast loss function at the i-th pixel between the positive and negative domain features, and [formula image FDA00038484777900000410] represents the pseudo-supervised loss function.
CN202211130790.6A 2022-09-16 2022-09-16 Medical image segmentation method based on multi-field semi-supervision Pending CN115861164A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211130790.6A CN115861164A (en) 2022-09-16 2022-09-16 Medical image segmentation method based on multi-field semi-supervision


Publications (1)

Publication Number Publication Date
CN115861164A (en) 2023-03-28

Family

ID=85660973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211130790.6A Pending CN115861164A (en) 2022-09-16 2022-09-16 Medical image segmentation method based on multi-field semi-supervision

Country Status (1)

Country Link
CN (1) CN115861164A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116912535A (en) * 2023-09-08 2023-10-20 中国海洋大学 Unsupervised target re-identification method, device and medium based on similarity screening
CN116912535B (en) * 2023-09-08 2023-11-28 中国海洋大学 Unsupervised target re-identification method, device and medium based on similarity screening
CN117372306A (en) * 2023-11-23 2024-01-09 山东省人工智能研究院 Pulmonary medical image enhancement method based on double encoders
CN117372306B (en) * 2023-11-23 2024-03-01 山东省人工智能研究院 Pulmonary medical image enhancement method based on double encoders
CN117830638A (en) * 2024-03-04 2024-04-05 厦门大学 Omnidirectional supervision semantic segmentation method based on prompt text


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination