CN117274599A - Brain magnetic resonance segmentation method and system based on combined double-task self-encoder - Google Patents


Info

Publication number
CN117274599A
Authority
CN
China
Prior art keywords: segmentation, data, self-encoder, training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311273017.XA
Other languages
Chinese (zh)
Inventor
田智强
李皓冰
施展艺
杜少毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202311273017.XA
Publication of CN117274599A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V 10/26 Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06N 3/045 Combinations of networks; auto-encoder networks; encoder-decoder networks
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06N 3/0895 Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06N 3/0985 Hyperparameter optimisation; meta-learning; learning-to-learn
    • G06V 10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/7753 Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The invention discloses a brain magnetic resonance segmentation method and system based on a combined double-task self-encoder. The segmentation training set of the downstream segmentation task is registered, the registered training set is center-cropped, and the cropped data are resampled to obtain feature data. A pre-trained self-encoder extracts basic features from the feature data, and the basic features are decoded to obtain a segmentation result. A network segmentation model is trained with the decoded segmentation result and the corresponding segmentation training set, and the trained model is used to segment MR images. Segmenting with the dual model greatly improves the accuracy of the segmentation result, and a combined double-task framework at the pixel level and the object level lets the model learn pixel-level detail and object-level discriminative information respectively, fusing modality information through parameter sharing to further improve the segmentation result.

Description

Brain magnetic resonance segmentation method and system based on combined double-task self-encoder
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a brain magnetic resonance segmentation method and system based on a combined double-task self-encoder.
Background
Medical image segmentation plays an important role in computer-aided diagnosis and treatment and helps doctors analyze disease data. The task is to accurately delineate target areas, such as organs, lesions and tissues, at the pixel level in medical images. Current medical image segmentation is mainly performed by manual annotation, which requires deep medical expertise and experience and is generally done by doctors; lay annotators can hardly perform the task, so the output of annotated data is inefficient. Meanwhile, because of imaging noise, divergence in experts' subjective judgments, and fatigue caused by a great deal of tedious repetitive labor, manual annotation is prone to subjective human error. A need therefore arises to develop accurate automated medical image segmentation algorithms. Compared with manual annotation, such algorithms are highly objective, can rapidly annotate images in batches, and greatly reduce the workload of doctors.
There is currently growing interest in self-supervised methods, but little work in the medical field employs them. Some studies indicate that self-supervised learning can be applied directly to the medical domain, since unlabeled medical images carry valuable information about organ structures, and self-supervision enables models to derive concepts about these structures without additional annotation cost. Unlike natural images, medical images are inherently 3D, i.e. they are presented as a sequence of slices. Many current self-supervised approaches convert the 3D imaging task to 2D by extracting slices along an arbitrary axis (e.g., the axial dimension). Obtaining a data representation of a 3D image through a 2D context is a suboptimal solution that reduces the performance of downstream tasks.
Deep learning models are typically trained under a supervised learning paradigm, where the model learns to map inputs (e.g., magnetic resonance images or health records) to outputs. For the model to learn the relevant patterns in the data, the training process requires a large dataset in which each input carries corresponding label information. Deep learning has been very successful with supervised models. However, such work places more emphasis on building and testing models than on building annotated datasets. This is partly because large-scale expert annotation of multi-modality patient data is non-trivial, expensive and time-consuming for most medical tasks, and is associated with privacy-exposure risks; even semi-automated software tools may not adequately reduce annotation costs.
On the other hand, unlike natural images, the physical structure of medical images is relatively stable despite individual differences, because they depict the anatomical structure of the human body; they therefore present naturally consistent contextual information, and lesions also have their specific textures and appearances. Self-supervised proxy tasks can be used to learn the basic patterns of human anatomy; however, when an accurate segmentation result cannot be obtained in the self-supervised learning task, the discrimination accuracy of the model is affected.
Disclosure of Invention
The invention aims to provide a brain magnetic resonance segmentation method and system based on a combined double-task self-encoder, in order to solve the prior-art problems that an accurate segmentation result cannot be obtained and that the discrimination accuracy of the model is affected.
A brain magnetic resonance segmentation method based on a combined double-task self-encoder comprises the following steps:
registering the segmentation training set of the downstream segmentation task, center-cropping the registered segmentation training set, and resampling the center-cropped data to obtain feature data;
extracting the characteristics of the acquired characteristic data by utilizing a pre-trained self-encoder to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result;
and training a network segmentation model by using the decoded segmentation result and the corresponding segmentation training set, and segmenting the MR image by using the trained network segmentation model.
Preferably, the specific training process of the pre-trained self-encoder is: collecting pre-training images as a pre-training set, converting the pre-training set into the Brain Imaging Data Structure (BIDS) format, standardizing the registration of the converted pre-training set to the same template, then center-cropping the registered pre-training data to obtain pre-training feature data, and training the self-encoder with the pre-training feature data.
Preferably, registration of the pre-training set to the same template is standardized using the Clinica platform.
Preferably, random rotation operations are performed on the same batch of pre-training feature data to obtain two positively correlated, data-augmented views of every sample in the batch, and a random masking operation is applied to each view to obtain a feature map in which part of the patches are occluded; the augmented and randomly masked feature maps are input into the self-encoder network for feature extraction, reconstructed image patches and contrastive coding features are obtained through a pixel-level and an object-level prediction head respectively, and the self-encoder is pre-trained with the self-supervision information of the pre-training images.
Preferably, in the pre-training of the self-encoder, the network parameters are optimized with a back-propagation strategy, a loss function assists the training, and the network parameters are updated according to the value of the loss function so that the loss keeps decreasing until it converges to a set value; training then ends, completing the pre-training of the self-encoder.
Preferably, standardizing the registration of the pre-training set to the same template using the Clinica platform specifically comprises:
computing a maximal bounding box around the largest foreground region across all modalities after registration, excluding all-zero regions; unifying the spatial size of every sample in the registered pre-training set to the same level; and then resampling to obtain feature data, where the resampling target spacing is obtained by averaging over the whole dataset.
Preferably, for single-modality data, the self-encoder is loaded directly as the feature-extraction encoder, and through a U-shaped network structure, cross-layer connections link the low-level semantics of the encoding stage with the high-level semantics of the decoding stage at the same downsampling ratio, to obtain the segmentation result.
Preferably, for multi-modality data, a simple modality-shared encoder is adopted: the data of different modalities are input into an encoder with shared parameters, which captures the features common to all modalities, to obtain the segmentation result.
Preferably, the multi-modality encoder output is decoded while cross-layer connections link the low-level semantics of the encoding stage with the high-level semantics of the decoding stage at the same downsampling ratio, and the segmentation result is finally obtained by decoding.
A brain magnetic resonance segmentation system based on a combined double-task self-encoder comprises a data preprocessing module, a self-supervision module and a segmentation module:
the data preprocessing module is used for registering the segmentation training set of the downstream segmentation task, center-cropping the registered segmentation training set, and resampling the center-cropped data to obtain feature data;
the self-supervision module is used for extracting the characteristics of the acquired characteristic data to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result;
the segmentation module trains a network segmentation model by utilizing the decoded segmentation result and the corresponding segmentation training set, and segments the MR image by utilizing the trained network segmentation model.
Compared with the prior art, the invention has the following beneficial technical effects:
the invention provides a brain magnetic resonance segmentation method based on a combined double-task self-encoder, which is characterized in that a segmentation training set of a downstream segmentation task is registered, then the registered segmentation training set is subjected to middle cutting, and then data subjected to center cutting are resampled to obtain characteristic data; extracting the characteristics of the acquired characteristic data by utilizing a pre-trained self-encoder to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result; the decoded segmentation result and the corresponding segmentation training set are utilized to train the network segmentation model, the trained network segmentation model is utilized to segment the MR image, and the double model is utilized to segment, so that the accuracy of the segmentation result can be greatly improved.
Furthermore, a combined double-task framework at the pixel level and the object level is adopted so that the model learns pixel-level detail and object-level discriminative information respectively. In the segmentation task, a modality-based self-encoder loading strategy is provided for multi-modality data, and modality information is fused through parameter sharing, improving the model's segmentation result.
Furthermore, for multi-modality data, a dedicated normalization layer in each branch counts the data distribution of that branch, preserving modality-private information, reducing the amount of data processing, and improving image-processing precision.
Furthermore, the combined use of cross-entropy loss, soft Dice loss and deep supervision loss promotes gradient feedback, strengthens model convergence, and further improves the training effect.
Drawings
Fig. 1 is a flowchart of an implementation of a brain magnetic resonance segmentation method in an embodiment of the present invention.
Fig. 2 is a block diagram of the network segmentation model based on the combined double-task self-encoder in an embodiment of the present invention.
FIG. 3 is a schematic diagram of the focused-mask self-encoder in the self-supervision-stage combined double-task model in an embodiment of the present invention.
FIG. 4 is a schematic diagram of the contrast-based self-encoder architecture in the self-supervision-stage combined double-task model in an embodiment of the present invention.
Figure 5 is a diagram of the modality-based downstream brain magnetic resonance segmentation task network framework in an embodiment of the invention.
Figure 6 is a feature-extraction layer diagram of the modality-based downstream brain magnetic resonance segmentation encoder in an embodiment of the present invention.
Fig. 7 is a graph showing a segmentation effect of a brain magnetic resonance segmentation method model in an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in fig. 1 and 2, the invention provides a brain magnetic resonance segmentation method based on a combined double-task self-encoder, which specifically comprises the following steps:
s1, data preprocessing: the method comprises the steps of collecting a pre-training image as a pre-training set, converting pre-training set data into a brain imaging data structure, standardizing registration of the data structure of the pre-training set converted into the brain imaging data structure to the same template, and then performing center cutting operation on the registered pre-training set data to obtain pre-training characteristic data.
In the invention, a Clinica platform is adopted to standardize the registration of the data structure of the pre-training set to the same template.
Self-encoder pre-training process: random rotation operations are performed on the same batch of pre-training feature data to obtain two positively correlated, data-augmented views of every sample in the batch, and a random masking operation is applied to each view to obtain a feature map in which part of the patches are occluded; the augmented and randomly masked feature maps are input into the self-encoder network for feature extraction, reconstructed image patches and contrastive coding features are obtained through a pixel-level and an object-level prediction head respectively, and the self-encoder is pre-trained with the self-supervision information of the pre-training images.
In the pre-training of the self-encoder, the network parameters are optimized with a back-propagation strategy, a loss function assists the training, and the network parameters are updated according to the value of the loss function so that the loss keeps decreasing until it converges to a set value; training then ends, completing the pre-training of the self-encoder.
Specifically, in the pre-training stage of the self-encoder, a public dataset is adopted as the pre-training dataset, and the whole of it is used as the pre-training set.
The pre-training set is converted into the Brain Imaging Data Structure (BIDS) format, and the Clinica platform is adopted to standardize its registration to the same template, specifically comprising:
computing a maximal bounding box around the largest foreground region across all modalities after registration, excluding all-zero regions; unifying the spatial size of every sample in the registered pre-training set to the same level; and then resampling to obtain feature data, where the resampling target spacing is obtained by averaging over the whole dataset;
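The foreground bounding-box and cropping step can be illustrated with a minimal NumPy sketch (the function names and the zero threshold are illustrative assumptions, not from the patent):

```python
import numpy as np

def foreground_bbox(volume, threshold=0.0):
    """Tightest bounding box around foreground voxels (values above `threshold`)."""
    coords = np.argwhere(volume > threshold)
    lo = coords.min(axis=0)           # inclusive lower corner
    hi = coords.max(axis=0) + 1       # exclusive upper corner
    return tuple(slice(a, b) for a, b in zip(lo, hi))

def crop_to_foreground(volume, threshold=0.0):
    """Crop a 3D volume to its foreground bounding box, excluding all-zero margins."""
    return volume[foreground_bbox(volume, threshold)]
```

Resampling the cropped volume to the dataset-average spacing would follow this crop; it is omitted here for brevity.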
specifically, the pre-dataset is used entirely for training, and the tag information of the dataset is not used in training. In order to make training more stable during cutting, an oversampling strategy is adopted, so that at least one third of data in one batch is guaranteed to contain prospects.
The specific process of self-encoder training is as follows: the self-encoder is pre-trained with a combined dual proxy task at the pixel level and the object level. Given a batch of N input samples of 3D voxel volumes X ∈ R^{H×W×D×C}, a rotation data augmentation is applied at random, i.e. each input instance is randomly transformed into two positively correlated views of the same instance, so that the augmented batch contains 2N data points. The rotation augmentation formula is:
r_k = RandomChoice(R), R = {0°, 90°, 180°, 270°}
wherein: r_k is the random rotation angle and k indexes the samples in a batch of size N.
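The rotation augmentation can be illustrated with a short NumPy sketch; the choice of rotation plane `axes=(0, 1)` is an assumption, since the patent does not fix the axis:

```python
import numpy as np

ANGLES = {0: 0, 90: 1, 180: 2, 270: 3}   # degrees -> number of 90-degree turns

def rotate(volume, degrees, axes=(0, 1)):
    """Rotate a 3D volume by a multiple of 90 degrees in the given plane."""
    return np.rot90(volume, k=ANGLES[degrees], axes=axes)

def two_views(volume, rng):
    """Two independently rotated (positively correlated) views of one sample."""
    choices = list(ANGLES)
    return [rotate(volume, rng.choice(choices)) for _ in range(2)]
```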
The augmented samples x̂_i and x̂_j are first reshaped into a series of flattened 3D patches. To preserve position information, a symmetric position encoding is added to the patch embedding, exploiting the bilaterally symmetric structure of the brain. The position is computed as:
pos = h·x − |w/2 − y| + w/2 + d²·z
and the encoding alternates sine and cosine components:
PE(pos, 2i) = sin(pos / 10000^{2i/dim}), PE(pos, 2i+1) = cos(pos / 10000^{2i/dim})
wherein: dim is the dimension of the patch embedding, pos is the position of the patch embedding with coordinates (x, y, z), and i indexes the dimensions of the position encoding. Since sine and cosine take values between -1 and 1, adding the position encoding to the patch embedding induces no significant distortion. The final input to the self-encoder is the sum of the position encoding and the patch embedding.
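The symmetric position encoding may be sketched as follows; the sinusoidal form mirrors the standard transformer encoding the text describes, and the base 10000 is an assumption carried over from that convention:

```python
import math

def symmetric_pos(x, y, z, h, w, d):
    """Position value in which mirrored left/right y coordinates coincide."""
    return h * x - abs(w / 2 - y) + w / 2 + d ** 2 * z

def sincos_encoding(pos, dim):
    """Alternating sine/cosine encoding of the scalar position `pos`."""
    enc = []
    for i in range(dim):
        angle = pos / (10000 ** ((2 * (i // 2)) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc
```

Note that a voxel at y and its mirror at w - y receive the same position value, which is the bilateral-symmetry property the patent relies on.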
A random masking operation is applied to the patch data, the visible patch regions are sent to the encoder, and the three-dimensional patch sequence is projected to a fixed-dimension space through an embedding layer. To model patch-embedding interactions more efficiently, features of input size H'×W'×D' are uniformly divided into non-overlapping windows of size M×M×M, and local self-attention is computed within each region. The windows are then shifted by (M/2, M/2, M/2) voxels, so that features from different windows of the previous partition appear in the same window after the shift; recomputing local self-attention within the shifted windows realizes information exchange between windows.
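The random masking of the patch sequence can be illustrated with a minimal sketch (the mask ratio is a free parameter here; the shifted-window attention itself is omitted for brevity):

```python
import numpy as np

def random_patch_mask(num_patches, mask_ratio, rng):
    """Boolean mask over a patch sequence; True marks patches hidden from the encoder."""
    n_masked = int(num_patches * mask_ratio)
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.permutation(num_patches)[:n_masked]] = True
    return mask
```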
As shown in fig. 3 and 4, the feature output of the self-encoder is decoded, and finally the reconstructed image patches and the contrastive coding features are obtained through the pixel-level and object-level prediction heads.
Training uses the training data; during training, the network parameters are optimized with a back-propagation strategy, and loss functions assist the training, comprising the focused reconstruction loss and the contrastive loss. These losses are back-propagated to optimize the network parameters, encouraging the model to learn the basic features of the image.
For the focused reconstruction loss, exploiting the relatively stable structure of brain tissue, the gradient of each voxel is computed to obtain a gradient image in each direction; the calculation formula is:
G_i = I * D_i, i ∈ {x, y, z}
wherein: I is the input feature, D_i is the derivative filter in direction i, and * denotes a convolution operation.
The gradient direction θ of each voxel is then obtained as:
θ = arctan(G_y / G_x)
and the gradient magnitude is:
A = sqrt(G_x² + G_y²)
For each image patch, a 2D histogram of oriented gradients is created: each voxel is traversed, the bin (along the histogram's X axis) into which the voxel's gradient direction falls is found, and the voxel's gradient magnitude is added to that bin. After the traversal, the vector representing the histogram is L2-normalized. The importance of each masked image patch within the whole brain tissue is then obtained as:
s_k = ||H_k|| / H̄, with H̄ = (1/N) Σ_j ||H_j||
wherein: H̄ represents the average histogram norm, and N is the number of randomly masked image patches.
According to the importance given by the histogram of oriented gradients, different weights are applied when measuring the pixel differences between the restored image region and the original image, encouraging the model to pay more attention to important regions. The focused reconstruction loss is computed as:
L_rec = (1/N) Σ_k s_k · ||ŷ_k − y_k||²
wherein ŷ_k and y_k are the reconstructed and original masked patches.
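A hedged sketch of the importance weighting and the focused reconstruction loss follows; since the patent gives the formulas only in outline, the histogram-norm importance and the weighted mean-squared error below are plausible reconstructions, not the definitive implementation:

```python
import numpy as np

def patch_importance(histograms):
    """Importance of each masked patch: its gradient-histogram norm over the mean norm."""
    norms = np.linalg.norm(histograms, axis=1)
    return norms / (norms.mean() + 1e-8)

def focused_reconstruction_loss(pred, target, importance):
    """Importance-weighted mean-squared error over the masked patches."""
    per_patch_mse = ((pred - target) ** 2).mean(axis=1)
    return float((importance * per_patch_mse).mean())
```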
loss of contrast reconstruction is to set a pair of enhanced samples in the same batch as positive sample z i And z j Taking other 2 (N-1) enhancement samples in the same batch asNegative examples. And calculating mutual information between the two vectors through cosine similarity, wherein the rest chord similarity formulas are as follows:
the contrast loss calculation formula is thus:
and updating the network parameters according to the value of the loss function, so that the loss function is continuously reduced until the loss function is converged to a smaller value, and at the moment, training is finished, and the trained pre-trained self-encoder is stored.
Collecting an image to be trained of a downstream segmentation task as a segmentation training set, registering the segmentation training set, then performing intermediate cutting on the registered segmentation training set, and then performing resampling to obtain feature data; inputting the acquired feature data into a self-encoder obtained in a self-supervision stage to extract basic features, then performing decoding operation on the extracted basic features, simultaneously connecting low-level semantics of convolution processing with high-level semantics of a decoding stage under the same downsampling multiplying power by using cross-layer connection, and finally decoding to obtain a segmentation result; and training a network segmentation model by using the segmentation result obtained by decoding and the corresponding segmentation training set, and segmenting the MR image by using the trained network segmentation model.
In the downstream brain MRI segmentation stage, for single-mode data, a self-encoder is directly loaded as an extraction feature encoder, and the low-level semantics of the encoding stage are connected with the high-level semantics of the decoding stage under the same downsampling multiplying power by using a cross-layer connection through a U-shaped network structure, so as to obtain a segmentation result. For multi-mode data, a simple mode sharing encoder is adopted, different mode data are input into the encoder sharing parameters, and common characteristics of all modes are captured, so that a segmentation result is obtained.
For the modal private information, after convolution, carrying out separate normalization operation on the multiple modes, and independently counting the modal private characteristics, wherein the specific formula is as follows:
wherein: u (u) LRepresenting the mean and variance over the whole sample. Is a very small constant and prevents the denominator from being 0. Alpha m ,β m Is a trainable parameter, respectively a scaling factor and a translation parameter in affine transformation, for restoring the expressive power on data. By modality privatization (alpha) mm ) So as to achieve the effect of distinguishing statistical modal information.
And decoding the multi-modal coded output, simultaneously connecting the low-level semantics of the coding stage with the high-level semantics of the decoding stage under the same downsampling multiplying power by using cross-layer connection, and finally decoding to obtain a segmentation result.
In the network segmentation model training process, parameters of a network are optimized by using a back propagation strategy, and training is assisted by using a loss function, wherein the loss function comprises cross entropy loss, softprice loss and full resolution depth supervision loss; cross entropy loss, softrace loss, and full resolution depth supervision loss are used to help train, back-propagate parameters of the optimization network.
Cross entropy is the most common loss in image segmentation algorithms; it compares each pixel with the truth map one by one. Its formula is as follows:

L_ce = −(1/(D×W×H)) · Σᵢ [yᵢ · log pᵢ + (1 − yᵢ) · log(1 − pᵢ)]

wherein: D×W×H is the number of pixels of the entire three-dimensional image; yᵢ ∈ {0,1} is the true label of the i-th element, where 0 is background and 1 is foreground; and pᵢ ∈ [0,1] is the probability, predicted by the network, that the pixel belongs to the foreground.
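The cross entropy above can be written directly over a flat list of voxels. The probability clamp is an added numerical safeguard, not part of the formula itself:

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """Voxel-wise binary cross entropy averaged over all D*W*H voxels.
    p: predicted foreground probabilities in [0,1]; y: labels in {0,1}."""
    total = 0.0
    for pi, yi in zip(p, y):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -total / len(p)
```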
The soft Dice loss is formulated as follows:

L_dice = 1 − (2 · Σᵢ pᵢ · yᵢ + ε) / (Σᵢ pᵢ + Σᵢ yᵢ + ε)

wherein: ε is a small constant that prevents the denominator from being 0.
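A direct implementation of this soft Dice formulation (one of several common variants; some square the terms in the denominator):

```python
def soft_dice_loss(p, y, eps=1e-7):
    """Soft Dice loss: 1 - (2*overlap + eps) / (|P| + |Y| + eps).
    p: predicted probabilities; y: binary ground-truth labels."""
    inter = sum(pi * yi for pi, yi in zip(p, y))
    return 1.0 - (2.0 * inter + eps) / (sum(p) + sum(y) + eps)
```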
The decoding layer of each stage of the network is taken as an intermediate output; each such output is upsampled according to the downsampling ratio of its stage, introducing an auxiliary (Side) loss at full resolution for deep supervision. The final loss function is expressed as follows:

L = L_seg(P, Y) + Σᵢ₌₁ᴺ λᵢ · L_seg(g(Pᵢ, uᵢ), Y)

wherein: P is the prediction probability map, Y is the truth map, g(x, u) denotes upsampling x by the ratio u, λᵢ is a hyperparameter regulating the loss of each intermediate layer, and N is the number of intermediate layers.
The network parameters are updated according to the value of the loss function so that the loss keeps decreasing until it converges to a small value; training then ends and the trained network model is saved. The saved trained model is used to construct the brain magnetic resonance segmentation model.
A brain magnetic resonance segmentation system based on a combined double-task self-encoder comprises a data preprocessing module, a self-supervision module and a segmentation module:
the data preprocessing module is used for registering the segmentation training set of the downstream segmentation task, performing center cropping on the registered segmentation training set, and then resampling the center-cropped data to obtain feature data;
the self-supervision module is used for extracting the characteristics of the acquired characteristic data to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result;
the segmentation module trains a network segmentation model by utilizing the decoded segmentation result and the corresponding segmentation training set, and segments the MR image by utilizing the trained network segmentation model.
The invention relates to a brain magnetic resonance segmentation method based on a combined double-task self-encoder, which aims at brain tissue priori knowledge to design a reconstruction agent task to pay attention to important image features; in order to learn the basic mode of the anatomical structure of the brain region, a combined double-task framework of a pixel level and an object level which are suitable for brain magnetic resonance imaging is provided, so that the model learns pixel level detail and object level distinguishing information respectively; the reconstruction loss and the contrast loss are combined, so that gradient feedback is promoted, model convergence is enhanced, and model learning basic characteristics are further encouraged.
In the downstream task, a modality-based self-encoder loading strategy (MALS) is applied according to the number of modalities: for single-modality data the self-encoder is directly loaded as the feature-extraction encoder in segmentation, and a U-shaped network yields the segmentation result. For multi-modal data, a feature-extraction encoder with shared parameters extracts the modality-common information, while a separate normalization of the modality-private information counts the private features of each modality independently.
By combining the cross entropy loss, the soft Dice loss and the deep supervision loss, gradient feedback is promoted, model convergence is strengthened, and the training effect of the model is further improved;
the method and the device have the advantages that the competitive Dice and HD results are obtained by performing fine adjustment of the downstream segmentation task on the three public data sets, and the method and the device can be superior to the current popular self-supervision medical image segmentation model.
Examples
A brain magnetic resonance segmentation method based on a combined double-task self-encoder comprises the following steps:
self-supervised learning of self-encoders: the 3D medical source data is pre-processed to make it suitable for training of the model. The specific working procedure is as follows:
(1.1) employing two sets of public data sets as pre-training data sets;
(1.2) converting the data set in the step (1.1) into a brain imaging data structure, and standardizing data registration on the same template by adopting a Clinica platform;
(1.3) calculating a maximal adjacent bounding box for the maximal foreground region of all modalities and excluding the zero-valued regions of the foreground; unifying the spatial size of each sample of the whole registered training set to a consistent level so that the convolution kernel traverses the data with the same receptive field when extracting features; then resampling, the resampling target spacing being obtained by averaging over the whole data set;
(1.4) dividing the data set processed in the step (1.3) into training sets, wherein the training process does not use label information of any data set.
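Step (1.3) above can be sketched as follows. The volume is represented as a nested Python list and the helper names (`foreground_bbox`, `crop`) are illustrative, not from the patent:

```python
def foreground_bbox(volume):
    """Return the tightest bounding box (z0, z1, y0, y1, x0, x1) enclosing
    all non-zero (foreground) voxels of a 3-D volume."""
    zs, ys, xs = [], [], []
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, v in enumerate(row):
                if v != 0:
                    zs.append(z); ys.append(y); xs.append(x)
    return (min(zs), max(zs) + 1, min(ys), max(ys) + 1, min(xs), max(xs) + 1)

def crop(volume, box):
    """Crop the volume to the given bounding box (half-open intervals)."""
    z0, z1, y0, y1, x0, x1 = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]
```

In practice the box would be taken as the union over all modalities of a sample before the unified resampling step.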
A reconstruction agent task is designed aiming at brain tissue priori knowledge to pay attention to important image features, and an agent task based on contrast coding is added on the basis of the important image features, so that a combined double-task framework of a pixel level and an object level is formed. The specific working procedure is as follows:
(2.1) for the data set obtained in step (1.4), the same batch of input samples X ∈ R^(H×W×D×C) undergoes data enhancement by random rotation, producing two positively correlated views of each sample after enhancement;
(2.2) the enhanced samples of step (2.1) are reshaped into a sequence of flattened 3-D patches, and a symmetric position code is added to the patch embedding so that, in keeping with the bilateral symmetry of the brain structure, positional information is preserved while mirrored positions share a code: pos = H·x − |W/2 − y| + W/2 + D²·z;
(2.3) carrying out random masking operation on the feature patch sequence added with the position codes and obtained in the step (2.2), and sending the visible patch area to an encoder;
(2.4) patch-embedding interactions are modeled more efficiently in the encoder using paired window self-attention and shifted-window self-attention computation modules.
And (2.5) decoding the output of the encoder obtained in the step (2.4), and finally obtaining the reconstructed image patch and the contrast coding characteristic through the prediction heads of the pixel level and the object level.
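The symmetric position code of step (2.2) can be illustrated as follows. The exact form pos = H·x − |W/2 − y| + W/2 + D²·z is a reconstruction of the patent's formula; the key property it demonstrates is that positions mirrored across the mid-sagittal plane receive the same code:

```python
def symmetric_pos(x, y, z, H, W, D):
    """Symmetric position code: the |W/2 - y| term makes y and its mirror
    W - y map to the same value, matching the brain's bilateral symmetry."""
    return H * x - abs(W / 2 - y) + W / 2 + D ** 2 * z
```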
The focus reconstruction loss and the contrast loss are adopted in the acquired self-encoder training process to promote gradient feedback, strengthen model convergence and further encourage model learning basic characteristics;
the 3D medical source data is pre-processed to make it suitable for training of the model. The specific working procedure is as follows:
(4.1) employing three sets of public data sets as training data sets;
(4.2) converting the data set in the step (4.1) into a brain imaging data structure, and standardizing data registration on the same template by adopting a Clinica platform;
(4.3) calculating a maximal adjacent bounding box for the maximal foreground region of all modalities and excluding the zero-valued regions of the foreground; unifying the spatial size of each sample of the whole registered training set to a consistent level so that the convolution kernel traverses the data with the same receptive field when extracting features; then resampling, the resampling target spacing being obtained by averaging over the whole data set;
(4.4) dividing the data set processed in the step (4.3) into a training set and a testing set.
For different modality data, a modality-based self-encoder loading strategy (Modal Autoencoder Loading Strategy, MALS) is proposed, as shown in fig. 5, 6. The specific working procedure is as follows:
(5.1) for single-modality data, the self-encoder is directly loaded as the feature-extraction encoder.
(5.2) for multi-modal data, a simple modality-shared encoder is adopted: the data of the different modalities are input into the parameter-sharing encoder, which captures the features common to all modalities.
(5.3) for the modality-private information in step (5.2), a separate normalization operation is applied to the modalities after convolution, counting each modality's private features independently.
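Steps (5.1)–(5.3) can be sketched as follows. The encoder and per-modality normalization callables are illustrative stand-ins for the shared convolutional encoder and the separate normalization layers:

```python
def shared_encode(modalities, encoder, private_norms):
    """MALS sketch for multi-modal input: one parameter-shared encoder
    processes every modality (common features), while each modality applies
    its own private normalization (private statistics)."""
    return [norm(encoder(x)) for x, norm in zip(modalities, private_norms)]
```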
Decoding basic features subjected to feature extraction by a self-encoder, simultaneously connecting low-level semantics of an encoding stage with high-level semantics of a decoding stage under the same downsampling multiplying power by using cross-layer connection, and finally decoding to obtain a segmentation result;
for the network segmentation model, the cross entropy loss, the soft Dice loss and the full-resolution deep supervision loss are adopted in the training process to promote gradient feedback, strengthen model convergence and further improve the training effect;
for the trained network segmentation model, the test image is taken as input, and the result of automatic segmentation is obtained, as shown in fig. 7. The specific working procedure is as follows:
according to the invention, training set data are converted into brain imaging data structures in a self-supervision stage, data registration is standardized to the same template, then center cutting operation is carried out on the registered training set, and finally resampling is carried out to obtain characteristic data; performing random rotation operation on the obtained same batch of characteristic data to obtain two positive correlation views of each sample in the same batch after data enhancement, and performing random masking operation on each view to obtain the characteristics of the shielded part patch; the method comprises the steps of inputting a positive correlation characteristic map obtained after data enhancement and random mask operation into a self-encoder network to extract characteristic operation, obtaining reconstructed image patches and contrast encoding characteristics through a pixel-level prediction head and an object-level prediction head respectively, finally training the self-encoder by utilizing self-supervision information of an image, introducing symmetrical position encoding according to priori knowledge of brain magnetic resonance imaging, enabling symmetrical positions to have the same position information, simultaneously considering smoothness of a medical image relative to a natural image, classifying the characteristics of different areas according to the distribution of local intensity gradients of a three-dimensional voxel directional gradient histogram, applying different weights when reconstruction loss is measured and restored to pixel differences between an image area and an original image, thereby encouraging the model to pay more attention to the important areas, providing a combined dual-task framework of a pixel level and an object level customized for brain magnetic resonance imaging, enabling the model to learn pixel-level detail and object-level distinguishing information respectively, providing a self-encoder 
loading strategy based on multi-modal data in a downstream brain segmentation task, and improving a mode fusion mode and sharing the mode of the model by measuring and sharing parameter information.

Claims (10)

1. The brain magnetic resonance segmentation method based on the combined double-task self-encoder is characterized by comprising the following steps of:
registering the segmentation training set of the downstream segmentation task, performing center cropping on the registered segmentation training set, and resampling the center-cropped data to obtain feature data;
extracting the characteristics of the acquired characteristic data by utilizing a pre-trained self-encoder to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result;
and training a network segmentation model by using the decoded segmentation result and the corresponding segmentation training set, and segmenting the MR image by using the trained network segmentation model.
2. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 1, wherein the pre-training self-encoder training specific process is as follows: the method comprises the steps of collecting a pre-training image as a pre-training set, converting pre-training set data into a brain imaging data structure, standardizing registration of the data structure of the pre-training set converted into the brain imaging data structure to the same template, then performing center cutting operation on the registered pre-training set data to obtain pre-training characteristic data, and training a self-encoder by utilizing the pre-training characteristic data.
3. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 2, wherein the data structure registration of the pre-training set is standardized to the same template by using the Clinica platform.
4. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 2, wherein the obtained same batch of pre-training characteristic data is subjected to random rotation operation to obtain two positive correlation views of each sample in the same batch after data enhancement, and each positive correlation view is subjected to random masking operation to obtain a positive correlation characteristic map of a shielded part patch; and inputting the positive correlation characteristic map subjected to data enhancement and random mask operation into a self-encoder network to extract characteristic operation, respectively obtaining reconstructed image patches and contrast coding characteristics through a pixel-level prediction head and an object-level prediction head, and performing self-encoder pre-training by utilizing self-supervision information of a pre-training image.
5. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 2, wherein in the pre-training process of the self-encoder, parameters of the network are optimized by using a back propagation strategy, training is assisted by using a loss function, and the parameters of the network are updated according to the value of the loss function, so that the loss function continuously descends until the loss function converges to a set value, and at the moment, the training is finished, and the pre-training of the self-encoder is completed.
6. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 3, characterized in that standardizing the registration of the data structures of the pre-training set onto the same template by using the Clinica platform specifically comprises the following steps:
and calculating a maximum adjacent rectangular frame for the maximum foreground region of all the modes after the registration, excluding the region with 0, unifying the space size of each sample to be consistent level for the whole pre-training set after the registration, and then resampling to obtain feature data, wherein the resampled target space is obtained by averaging the whole data set.
7. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 1, wherein for single-mode data, the self-encoder is directly loaded as an extraction feature encoder, and the low-level semantics of the encoding stage are connected with the high-level semantics of the decoding stage under the same downsampling multiplying power by using a cross-layer connection through a U-shaped network structure, so as to obtain a segmentation result.
8. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 1, wherein for multi-modal data, a simple modal shared encoder is adopted, different modal data are input into the encoder sharing parameters, and common characteristics of all modes are captured to obtain segmentation results.
9. The brain magnetic resonance segmentation method based on the combined double-task self-encoder according to claim 8, wherein the multi-modal encoded output is decoded, and the low-level semantics of the encoding stage are connected with the high-level semantics of the decoding stage under the same downsampling ratio by using cross-layer connection, and the segmentation result is finally obtained by decoding.
10. A brain magnetic resonance segmentation system based on a combined double-task self-encoder, characterized by comprising a data preprocessing module, a self-supervision module and a segmentation module:
the data preprocessing module is used for registering the segmentation training set of the downstream segmentation task, performing center cropping on the registered segmentation training set, and then resampling the center-cropped data to obtain feature data;
the self-supervision module is used for extracting the characteristics of the acquired characteristic data to obtain basic characteristics, and decoding the acquired basic characteristics to obtain a segmentation result;
the segmentation module trains a network segmentation model by utilizing the decoded segmentation result and the corresponding segmentation training set, and segments the MR image by utilizing the trained network segmentation model.
CN202311273017.XA 2023-09-28 2023-09-28 Brain magnetic resonance segmentation method and system based on combined double-task self-encoder Pending CN117274599A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311273017.XA CN117274599A (en) 2023-09-28 2023-09-28 Brain magnetic resonance segmentation method and system based on combined double-task self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311273017.XA CN117274599A (en) 2023-09-28 2023-09-28 Brain magnetic resonance segmentation method and system based on combined double-task self-encoder

Publications (1)

Publication Number Publication Date
CN117274599A true CN117274599A (en) 2023-12-22

Family

ID=89217464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311273017.XA Pending CN117274599A (en) 2023-09-28 2023-09-28 Brain magnetic resonance segmentation method and system based on combined double-task self-encoder

Country Status (1)

Country Link
CN (1) CN117274599A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117476240A (en) * 2023-12-28 2024-01-30 中国科学院自动化研究所 Disease prediction method and device with few samples
CN117476240B (en) * 2023-12-28 2024-04-05 中国科学院自动化研究所 Disease prediction method and device with few samples
CN117557568A (en) * 2024-01-12 2024-02-13 吉林省迈达医疗器械股份有限公司 Focal region segmentation method in thermal therapy process based on infrared image
CN117557568B (en) * 2024-01-12 2024-05-03 吉林省迈达医疗器械股份有限公司 Focal region segmentation method in thermal therapy process based on infrared image

Similar Documents

Publication Publication Date Title
CN111091589B (en) Ultrasonic and nuclear magnetic image registration method and device based on multi-scale supervised learning
CN109978850B (en) Multi-modal medical image semi-supervised deep learning segmentation system
CN112465827B (en) Contour perception multi-organ segmentation network construction method based on class-by-class convolution operation
CN110992351B (en) sMRI image classification method and device based on multi-input convolution neural network
CN117274599A (en) Brain magnetic resonance segmentation method and system based on combined double-task self-encoder
CN113902761B (en) Knowledge distillation-based unsupervised segmentation method for lung disease focus
CN110782427B (en) Magnetic resonance brain tumor automatic segmentation method based on separable cavity convolution
CN112132878B (en) End-to-end brain nuclear magnetic resonance image registration method based on convolutional neural network
CN112785593A (en) Brain image segmentation method based on deep learning
CN115147600A (en) GBM multi-mode MR image segmentation method based on classifier weight converter
CN115496720A (en) Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment
WO2021184195A1 (en) Medical image reconstruction method, and medical image reconstruction network training method and apparatus
Zhai et al. An improved full convolutional network combined with conditional random fields for brain MR image segmentation algorithm and its 3D visualization analysis
CN117218453B (en) Incomplete multi-mode medical image learning method
CN116958094A (en) Method for dynamically enhancing magnetic resonance image characteristics to generate pathological image characteristics
CN115496732B (en) Semi-supervised heart semantic segmentation algorithm
CN116485853A (en) Medical image registration method and device based on deep learning neural network
Mortazi et al. Weakly supervised segmentation by a deep geodesic prior
Lei et al. Generative adversarial networks for medical image synthesis
CN115526898A (en) Medical image segmentation method
CN114581459A (en) Improved 3D U-Net model-based segmentation method for image region of interest of preschool child lung
Kwarciak et al. Deep generative networks for heterogeneous augmentation of cranial defects
CN114332018A (en) Medical image registration method based on deep learning and contour features
CN112967295A (en) Image processing method and system based on residual error network and attention mechanism
Xu et al. Correlation via synthesis: End-to-end image generation and radiogenomic learning based on generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination