CN115239740A - GT-UNet-based whole-heart segmentation algorithm - Google Patents
GT-UNet-based whole-heart segmentation algorithm
- Publication number
- CN115239740A (application CN202210645929.4A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- dimensional
- encoder
- segmentation
- multiplied
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/11 — Region-based segmentation (G06T7/00 Image analysis; G06T7/10 Segmentation; Edge detection)
- G06N3/08 — Learning methods (G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks)
- G06T2207/10004 — Still image; Photographic image
- G06T2207/10012 — Stereo images
- G06T2207/10081 — Computed x-ray tomography [CT]
- G06T2207/10088 — Magnetic resonance imaging [MRI]
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20104 — Interactive definition of region of interest [ROI]
- G06T2207/20132 — Image cropping
- G06T2207/30048 — Heart; Cardiac
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a GT-UNet-based whole-heart segmentation algorithm, comprising the following steps: preprocessing input three-dimensional multi-modality cardiac images (including CT and MRI); converting the preprocessed data into a number of mutually independent slices, feeding the slices to a two-dimensional segmentation network for training, and outputting a class probability map a; cutting the preprocessed data into a number of independent data volumes, feeding them to a three-dimensional segmentation network for training, and outputting a class probability map b; sending the class probability maps a and b to a fusion module, comparing them pixel by pixel, and performing whole-heart segmentation using the class with the maximum probability. The algorithm adaptively adjusts the size of its receptive field according to the input and effectively exploits global information for long-range modeling, thereby effectively improving segmentation accuracy.
Description
Technical Field
The invention relates to the technical field of medical image processing, and in particular to a GT-UNet-based whole-heart segmentation algorithm.
Background
Automatic whole-heart segmentation is an important step in the quantitative evaluation of cardiac structure and the diagnosis of heart disease: accurately extracting the complete region and edges of the heart allows a three-dimensional cardiac model to be built to assist physicians in subsequent clinical diagnosis and treatment. It therefore has important application value and clinical significance for cardiac surgical navigation, interventional therapy guidance, computer-aided diagnosis, and the like.
Computed tomography (CT) and magnetic resonance imaging (MRI) are common imaging modalities for diagnosing heart disease. Although physicians can obtain anatomical information about the internal structure of the heart from a patient's imaging slices, which helps with subsequent non-invasive quantitative assessment of cardiac function, this also greatly increases their workload: the traditional approach is manual reading, after which a radiologist manually delineates the boundaries with professional software. To reduce the heavy workload of imaging physicians and improve the segmentation accuracy of cardiac structures, research on computer-aided automatic image segmentation and diagnosis is urgently needed.
In view of the technical problems of existing whole-heart segmentation techniques, the invention provides a whole-heart segmentation algorithm based on GT-UNet.
Disclosure of Invention
The invention provides a GT-UNet-based whole-heart segmentation algorithm.
The invention adopts the following technical scheme:
a whole-heart segmentation algorithm based on graph reasoning and Transformer modules (GT-UNet), comprising:
step 1, preprocessing input three-dimensional multi-modal cardiac images including CT and MRI;
step 2, converting the preprocessed data into a number of mutually independent slices, feeding the slices to a two-dimensional segmentation network for training, and outputting a class probability map a;
step 3, cutting the preprocessed data into a number of independent data volumes, feeding the data volumes to a three-dimensional segmentation network for training, and outputting a class probability map b;
and step 4, sending the class probability maps a and b to a fusion module, comparing them pixel by pixel, performing whole-heart segmentation using the class with the maximum probability, and outputting the segmentation result.
Further, in step 1, the preprocessing comprises:
step 1.1, cutting a Region of interest (ROI);
step 1.2, resampling and normalizing the ROI.
Further, converting the preprocessed data into a number of mutually independent slices and feeding the slices to a two-dimensional segmentation network for training comprises the following steps:
step 2.1, obtaining a mapping function f(·) representing a linear combination of features, which maps the input feature map X ∈ R^(L×C) of the original coordinate space Ω into an interaction space H, giving the new features V = f(X) ∈ R^(N×C), where N is the number of nodes in H and C is the desired feature dimension; the features are calculated as

v_i = Σ_j b_ij · x_j ……(1),

where b_i ∈ R^(1×L) is a learnable mapping weight, x_j ∈ R^(1×C), v_i ∈ R^(1×C), and b_ij is a binary coefficient generated by a convolution operation, taking the value 0 or 1;
step 2.2, reasoning with a graph convolutional network (GCN) to obtain a fully connected graph storing the new features, the reasoning being performed by learning the interaction edge weights corresponding to each node; the single-layer graph convolution is defined as

Z = GVW_g = ((I − A_g)V)W_g ∈ R^(N×C) ……(2),

where G denotes the node adjacency matrix of size N×N, A_g is randomly initialized and learned during training, I is the identity matrix, Z ∈ R^(N×C) are the globally reasoned nodes, and W_g is the state-update function that updates the state of each node;
step 2.3, projecting the nodes Z of the interaction space H back to the original coordinate space Ω; the reverse mapping Y = g(Z) ∈ R^(L×C) is obtained from equation (3):

y_i = Σ_j d_ij · z_j ……(3),

where y_i ∈ R^(1×C) is a learnable reverse-mapping weight, d_ij is a weighting scalar, and z_j denotes the j-th reasoning node.
Further, in step 2.1, the input dimension is reduced by implementing the mapping with two convolution layers φ(X; W_φ) and B = θ(X; W_θ), where B = [b_1, …, b_N] ∈ R^(N×L) are the mapping weights, φ(X) and θ(X) are two convolution layers, and W_θ and W_φ are the learnable convolution kernels of each layer.
Further, in step 3, the three-dimensional segmentation network comprises: a CNN encoder F_CNN(·) that extracts multi-scale feature maps from the input image, a DeTrans encoder that processes the attention multi-scale feature maps embedded with position encoding in an end-to-end manner, and a CNN decoder that reconstructs the feature maps output by the DeTrans encoder to produce the segmentation.
Further, the CNN encoder F_CNN(·) contains a Conv-In-Relu module and three stages of Resnet modules. The Conv-In-Relu module first performs a convolution with a 7×7×7 kernel, 64 channels, and stride (1,2,2), followed by instance normalization and ReLU. The result is fed to the first-stage Resnet module, which contains three residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 192 channels, then two residual operations with stride (1,1,1) and 3×3×3 kernels with 192 channels, giving 192 feature maps of size 48×40×40, which are fed to the second-stage Resnet module. Except that the number of convolution kernels is updated from 192 to 384, its parameters are the same as in the first stage; it finally yields 384 feature maps of size 24×20×20, which are fed to the third-stage Resnet module. That module has two residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 384 channels, then one with stride (1,1,1) and a 3×3×3 kernel with 384 channels. The feature maps generated by F_CNN(·) are defined as

f^l = F_CNN(x; Θ), l = 1, …, L,

where L denotes the number of feature levels, l is a specific level, x is the input feature map, Θ denotes the parameters required by the encoder, C denotes the number of channels, H the height of the input image, W the width of the input image, and D the depth of the input data, i.e., the number of slices.
Further, the DeTrans encoder comprises a sequence layer that converts the input and a number of stacked deformable DeTrans layers. It flattens the feature maps generated by the CNN encoder into a one-dimensional sequence of image patches and embeds a three-dimensional fixed positional-encoding sequence into the flattened one-dimensional sequence, in order to capture the relative or absolute positions between the various substructures of the heart.
Furthermore, the CNN decoder comprises four upsampling modules. Each of the first three contains a transposed convolution layer with stride 2×2×2 and a 2×2×2 kernel, with 384, 192, and 64 kernels respectively, followed by a three-dimensional residual block that refines the feature map; the feature map output by the encoder and the feature map obtained after the transposed convolution are then connected by a skip connection and summed pixel by pixel, so as to retain more low-level information. The final upsampling module consists of one upsampling layer and one 1×1 convolution layer, mapping the 64-channel feature maps to the desired number of classes.
Compared with the prior art, the invention has the following advantages:
the GT-UNet based full-center segmentation algorithm can effectively capture the global relationship and is suitable for different heart data sets, wherein the graph reasoning unit captures the global relationship by projecting the characteristics into an interactive space to carry out relationship reasoning, the Transformer module can overcome induction deviation of convolution operation and inherent limitation of local sensitivity, the size is adjusted in a self-adaptive mode according to input, and the global information is effectively utilized for remote modeling, so that the segmentation precision of the algorithm is effectively improved.
Drawings
FIG. 1 is a flowchart of the GT-UNet-based whole-heart segmentation algorithm in an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, the present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments, it being understood that the embodiments and features of the embodiments of the present application can be combined with each other without conflict.
As shown in FIG. 1, the GT-UNet-based whole-heart segmentation algorithm comprises:
step 1, preprocessing input three-dimensional multi-modal cardiac images including CT and MRI;
step 2, converting the preprocessed data into a number of mutually independent slices, feeding the slices to a two-dimensional segmentation network for training, and outputting a class probability map a;
step 3, cutting the preprocessed data into a number of independent data volumes, feeding the data volumes to a three-dimensional segmentation network for training, and outputting a class probability map b;
and step 4, sending the class probability maps a and b to a fusion module, comparing them pixel by pixel, performing whole-heart segmentation using the class with the maximum probability, and outputting the segmentation result.
Specifically, in step 1, a non-zero mask is generated from the input image and the image is cropped according to the size and position of its bounding box; resampling uses third-order spline interpolation in the xy-plane and nearest-neighbor interpolation along the z-axis; normalization uses the z-score method;
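The cropping and normalization steps above can be sketched as follows. This is an illustrative NumPy sketch, not the patent's implementation, and the function names are our own:

```python
import numpy as np

def crop_nonzero_roi(volume):
    """Crop a 3D volume to the bounding box of its non-zero voxels (the ROI)."""
    nonzero = np.argwhere(volume != 0)
    lo = nonzero.min(axis=0)       # bounding-box corner nearest the origin
    hi = nonzero.max(axis=0) + 1   # exclusive opposite corner
    return volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

def zscore_normalize(volume, eps=1e-8):
    """z-score normalization: subtract the mean, divide by the standard deviation."""
    return (volume - volume.mean()) / (volume.std() + eps)
```

The resampling step (third-order spline in-plane, nearest-neighbor along z) would typically be done between these two steps with a library routine such as SciPy's `scipy.ndimage.zoom`, choosing the interpolation order per axis.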
In step 2, it is noted that convolutional neural networks are good at extracting local relationships but weak at capturing global ones, usually requiring many stacked layers to achieve the desired effect, which sharply increases the difficulty and cost of global reasoning in a CNN; since global modeling and reasoning generally benefit a global segmentation task, this embodiment adds a graph-convolution-based global semantic reasoning unit to the two-dimensional segmentation network;
the two-dimensional convolution segmentation network specifically comprises:
step 2.1, obtaining a mapping function f(·) representing a linear combination of features, which maps the input feature map X ∈ R^(L×C) of the original coordinate space Ω into an interaction space H, giving the new features V = f(X) ∈ R^(N×C), where N is the number of nodes in H and C is the desired feature dimension; the features are calculated as

v_i = Σ_j b_ij · x_j ……(1),

where b_i ∈ R^(1×L) is a learnable mapping weight, x_j ∈ R^(1×C), v_i ∈ R^(1×C), and b_ij is a binary coefficient generated by a convolution operation, taking the value 0 or 1;
step 2.2, reasoning with a graph convolutional network (GCN) to obtain a fully connected graph storing the new features, the reasoning being performed by learning the interaction edge weights corresponding to each node; the single-layer graph convolution is defined as

Z = GVW_g = ((I − A_g)V)W_g ∈ R^(N×C) ……(2),

where G denotes the node adjacency matrix of size N×N, A_g is randomly initialized and learned during training, I is the identity matrix, Z ∈ R^(N×C) are the globally reasoned nodes, and W_g is the state-update function that updates the state of each node;
step 2.3, projecting the nodes Z of the interaction space H back to the original coordinate space Ω; the reverse mapping Y = g(Z) ∈ R^(L×C) is obtained from equation (3):

y_i = Σ_j d_ij · z_j ……(3),

where y_i ∈ R^(1×C) is a learnable reverse-mapping weight, d_ij is a weighting scalar, and z_j denotes the j-th reasoning node.
As an improvement of this embodiment, in order to further reduce the algorithm's input dimension, the mapping is implemented with two convolution layers φ(X; W_φ) and B = θ(X; W_θ), where B = [b_1, …, b_N] ∈ R^(N×L) are the mapping weights, φ(X) and θ(X) are two convolution layers, and W_θ and W_φ are the learnable convolution kernels of each layer.
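Steps 2.1–2.3 amount to three matrix products plus a residual connection. The NumPy sketch below shows the data flow, with random matrices standing in for the learned parameters B, A_g, W_g and the reverse projection; the shapes are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
L_pix, C, N = 64, 16, 8                 # pixels in Omega, channels, nodes in H

X  = rng.standard_normal((L_pix, C))    # input feature map, one row per pixel
B  = rng.standard_normal((N, L_pix))    # mapping weights (step 2.1)
Ag = rng.standard_normal((N, N)) * 0.1  # learnable node adjacency matrix
Wg = rng.standard_normal((C, C)) * 0.1  # state-update weights
D  = rng.standard_normal((L_pix, N))    # reverse-mapping weights (step 2.3)

V = B @ X                               # (1) project pixels to interaction-space nodes
Z = (np.eye(N) - Ag) @ V @ Wg           # (2) single-layer graph convolution
Y = D @ Z                               # (3) project reasoned nodes back to Omega
out = X + Y                             # residual connection onto the input features
```

In a trained network B, A_g, W_g, and D would come from 1×1 convolutions and be learned end to end; the sketch only demonstrates the shapes and the order of operations.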
In step 3, the network takes a CNN encoder-decoder as its basic framework and inserts a Transformer-based deformable encoder (DeTrans) to model and analyze long-range dependencies. The network mainly comprises a CNN encoder, the DeTrans encoder, and a CNN decoder: the CNN encoder extracts multi-scale feature maps from the input image, the DeTrans encoder processes the attention multi-scale feature maps embedded with position encoding in an end-to-end manner, and the CNN decoder reconstructs the feature maps;
Specifically, the CNN encoder F_CNN(·) contains a Conv-In-Relu module and three stages of Resnet modules. The Conv-In-Relu module first performs a convolution with a 7×7×7 kernel, 64 channels, and stride (1,2,2), followed by instance normalization and ReLU. The result is fed to the first-stage Resnet module, which contains three residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 192 channels, then two residual operations with stride (1,1,1) and 3×3×3 kernels with 192 channels, giving 192 feature maps of size 48×40×40, which are fed to the second-stage Resnet module. Except that the number of convolution kernels is updated from 192 to 384, its parameters are the same as in the first stage; it finally yields 384 feature maps of size 24×20×20, which are fed to the third-stage Resnet module. That module has two residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 384 channels, then one with stride (1,1,1) and a 3×3×3 kernel with 384 channels. The feature maps generated by F_CNN(·) are defined as

f^l = F_CNN(x; Θ), l = 1, …, L,
wherein, L represents the number of feature layers, Θ represents a parameter required by an encoder, C represents the number of channels, H represents the height of an input image, W represents the width of the input image, and D represents the depth of input data, i.e., the number of slices;
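The stage-by-stage spatial sizes quoted for the encoder can be checked with a short shape trace. The 96×160×160 input patch size below is our inference from the stated intermediate sizes (48×40×40 and 24×20×20), not a figure given in the text:

```python
def conv_out(size, stride):
    """Spatial size after a strided convolution with 'same'-style padding."""
    return tuple(-(-d // s) for d, s in zip(size, stride))  # ceiling division

strides = [(1, 2, 2),  # Conv-In-Relu, 64 channels
           (2, 2, 2),  # Resnet stage 1, 192 channels
           (2, 2, 2),  # Resnet stage 2, 384 channels
           (2, 2, 2)]  # Resnet stage 3, 384 channels

size = (96, 160, 160)  # assumed D x H x W input patch
trace = [size]
for s in strides:
    size = conv_out(size, s)
    trace.append(size)
# trace: (96,160,160) -> (96,80,80) -> (48,40,40) -> (24,20,20) -> (12,10,10)
```

Only the stride-(1,1,1) residual units are omitted from the trace, since they leave the spatial size unchanged.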
To overcome the inductive bias of convolution operations and the inherent limitation of local sensitivity, this embodiment introduces a DeTrans encoder whose core is a multi-scale deformable self-attention (MS-DMSA) mechanism for capturing long-range pixel dependencies. The DeTrans encoder is composed of a sequence layer that converts the input and a number of stacked deformable DeTrans layers. Because a Transformer can only process data in sequence-to-sequence fashion and contains no recurrence or convolution operations, the feature maps generated by the CNN encoder must be flattened into a one-dimensional sequence of image patches; but doing so directly inevitably loses some key spatial position relationships, so a three-dimensional fixed positional-encoding sequence is embedded into the flattened one-dimensional sequence to capture the relative or absolute positions between the various substructures of the heart;
In this embodiment, sinusoidal functions whose wavelengths form a geometric progression from 2π to 10000·2π are used to encode each position pos of every dimension, the specific formula being

PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(pos / 10000^(2i/d)),

where pos denotes the position, i the dimension index, and the encoding is computed separately for each of the depth, height, and width axes {D, H, W} of the input image; for each feature level l, PE_D, PE_H, and PE_W are concatenated into a three-dimensional position code p_l, which is then added element by element to the flattened f_l to obtain the input sequence of the DeTrans encoder;
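A per-axis sinusoidal encoding of this kind can be sketched in NumPy as below. The per-axis channel count `d_axis`, the grid sizes, and the helper name are illustrative assumptions; the three axis encodings are concatenated into a position code per voxel as the text describes:

```python
import numpy as np

def sinusoidal_pe(positions, d_model):
    """Sinusoidal position encoding for one axis; wavelengths form a
    geometric progression from 2*pi to 10000*2*pi."""
    pe = np.zeros((len(positions), d_model))
    freq = 1.0 / (10000.0 ** (np.arange(0, d_model, 2) / d_model))
    pe[:, 0::2] = np.sin(np.outer(positions, freq))  # even channels: sine
    pe[:, 1::2] = np.cos(np.outer(positions, freq))  # odd channels: cosine
    return pe

d_axis = 8                                   # channels per axis (even, illustrative)
pe_d = sinusoidal_pe(np.arange(12), d_axis)  # depth axis
pe_h = sinusoidal_pe(np.arange(10), d_axis)  # height axis
pe_w = sinusoidal_pe(np.arange(10), d_axis)  # width axis

# concatenate the three axis encodings into one code per voxel
p = np.concatenate(
    [np.broadcast_to(pe_d[:, None, None, :], (12, 10, 10, d_axis)),
     np.broadcast_to(pe_h[None, :, None, :], (12, 10, 10, d_axis)),
     np.broadcast_to(pe_w[None, None, :, :], (12, 10, 10, d_axis))], axis=-1)
```

Because the encoding is fixed (not learned), the same table can be precomputed once per feature-level size.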
The self-attention layer of the original Transformer attends over all possible spatial positions according to the size of the feature map. The invention instead introduces an attention module that focuses only on key sampling points around a reference point, named MS-DMSA, which reduces the parameter count and computational cost. Let z_q ∈ R^C be the feature representation of query q and p̂_q the normalized three-dimensional coordinate of its reference point. Given the multi-scale feature maps {f^l} extracted from the last L stages of the CNN encoder, the i-th attention head is computed as

head_i = Σ_l Σ_k A(z_q)_ilqk · f^l(σ_l(p̂_q) + Δp_ilqk),

where K is the number of sampled key points, A(z_q)_ilqk ∈ [0,1] is the attention weight, Δp_ilqk ∈ R^3 denotes the sampling offset of the k-th sampling point at the l-th feature level, σ_l(·) rescales p̂_q to the l-th feature level, and both A(z_q)_ilqk and Δp_ilqk are parameter values obtained by linear projection of the query feature z_q. The MS-DMSA layer is then defined as

MS-DMSA(z_q, p̂_q, {f^l}) = Φ(Concat(head_1, …, head_H)),

where H is the number of attention heads and Φ(·) denotes a linear projection layer that weights and aggregates the features of all attention heads. A DeTrans layer is composed of an MS-DMSA layer and a feed-forward network, each with a skip connection and layer normalization; the DeTrans encoder is created by repeatedly stacking DeTrans layers, after which the output sequence is reshaped back into feature maps according to the three-dimensional scale sizes;
In this embodiment, the CNN decoder comprises four upsampling modules. Each of the first three contains a transposed convolution layer with stride 2×2×2 and a 2×2×2 kernel, with 384, 192, and 64 kernels respectively, followed by a three-dimensional residual block that refines the feature map; the feature map output by the encoder and the feature map obtained after the transposed convolution are connected by a skip connection and summed pixel by pixel, retaining more low-level information. The last module consists of one upsampling layer and one 1×1 convolution layer, mapping the 64-channel feature maps to the desired number of classes.
In step 4, the fusion module fuses the output results of the two sub-networks. First, the preprocessed data are converted into a number of mutually independent slices and fed to the two-dimensional segmentation network for training, outputting a class probability map a; the preprocessed data are also cut into a number of data volumes and fed to the three-dimensional segmentation network for training, outputting a label prediction probability map b. The two probability maps are then sent to the fusion module for pixel-by-pixel comparison, and the final whole-heart segmentation is made according to the class with the maximum probability.
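The fusion rule, picking at each voxel the class with the highest probability across the two probability maps, can be sketched as follows (the function name is ours, and the toy example assumes class-first probability maps):

```python
import numpy as np

def fuse_by_max_probability(prob_a, prob_b):
    """Fuse two class-probability maps of shape (num_classes, *spatial):
    at each voxel, pick the class whose probability is highest in either map."""
    per_class_max = np.maximum(prob_a, prob_b)  # element-wise max across the two networks
    return per_class_max.argmax(axis=0)         # label of the overall most probable class

# toy example: 2 classes, a 1x2 "image"
a = np.array([[[0.9, 0.2]],
              [[0.1, 0.8]]])
b = np.array([[[0.3, 0.6]],
              [[0.7, 0.4]]])
labels = fuse_by_max_probability(a, b)  # -> [[0, 1]]
```

In the first voxel the 2D network's class-0 score (0.9) dominates; in the second its class-1 score (0.8) does, so the fused labels are [0, 1].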
The present invention is not limited to the above-described embodiments, which are described in the specification and illustrated only to explain the principle of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims.
Claims (8)
1. A GT-UNet-based whole-heart segmentation algorithm, comprising:
step 1, preprocessing an input three-dimensional multi-modal heart image;
step 2, converting the preprocessed data into a number of mutually independent slices, feeding the slices to a two-dimensional segmentation network for training, and outputting a class probability map a;
step 3, cutting the preprocessed data into a number of independent data volumes, feeding the data volumes to a three-dimensional segmentation network for training, and outputting a class probability map b;
and step 4, sending the class probability maps a and b to a fusion module, comparing them pixel by pixel, and performing whole-heart segmentation using the class with the maximum probability.
2. The GT-UNet-based whole-heart segmentation algorithm according to claim 1, wherein in step 1, the preprocessing comprises:
step 1.1, cutting ROI;
step 1.2, resampling and normalizing the ROI.
3. The GT-UNet-based whole-heart segmentation algorithm according to claim 1, wherein the step 2 of converting the preprocessed data into a number of mutually independent slices and feeding the slices to the two-dimensional segmentation network for training comprises:
step 2.1, obtaining a mapping function f(·) representing a linear combination of features, which maps the input feature map X ∈ R^(L×C) of the original coordinate space Ω into an interaction space H, giving the new features V = f(X) ∈ R^(N×C), where N is the number of nodes in H and C is the desired feature dimension; the features are calculated as

v_i = Σ_j b_ij · x_j ……(1),

where b_i ∈ R^(1×L) is a learnable mapping weight, x_j ∈ R^(1×C), v_i ∈ R^(1×C), and b_ij is a binary coefficient generated by a convolution operation, taking the value 0 or 1;
step 2.2, reasoning with a graph convolutional network (GCN) to obtain a fully connected graph storing the new features, the reasoning being performed by learning the interaction edge weights corresponding to each node; the single-layer graph convolution is defined as

Z = GVW_g = ((I − A_g)V)W_g ∈ R^(N×C) ……(2),

where G denotes the node adjacency matrix of size N×N, A_g is randomly initialized and learned during training, I is the identity matrix, Z ∈ R^(N×C) are the globally reasoned nodes, and W_g is the state-update function that updates the state of each node;
step 2.3, projecting the nodes Z of the interaction space H back to the original coordinate space Ω; the reverse mapping Y = g(Z) ∈ R^(L×C) is obtained from equation (3):

y_i = Σ_j d_ij · z_j ……(3),

where y_i ∈ R^(1×C) is a learnable reverse-mapping weight, d_ij is a weighting scalar, and z_j denotes the j-th reasoning node.
4. The GT-UNet-based whole-heart segmentation algorithm according to claim 3, wherein in step 2.1, the input dimension is reduced by implementing the mapping with two convolution layers φ(X; W_φ) and B = θ(X; W_θ), where B = [b_1, …, b_N] ∈ R^(N×L) are the mapping weights, φ(X) and θ(X) are two convolution layers, and W_θ and W_φ are the learnable convolution kernels of each layer.
5. The GT-UNet-based whole-heart segmentation algorithm according to claim 1, wherein in step 3, the three-dimensional segmentation network comprises: a CNN encoder that extracts multi-scale feature maps from the input image, a DeTrans encoder that processes the attention multi-scale feature maps embedded with position encoding in an end-to-end manner, and a CNN decoder that reconstructs the feature maps output by the DeTrans encoder to produce the segmentation.
6. The GT-UNet-based whole-heart segmentation algorithm according to claim 5, wherein the CNN encoder F_CNN(·) contains a Conv-In-Relu module and three stages of Resnet modules, wherein the Conv-In-Relu module first performs a convolution with a 7×7×7 kernel, 64 channels, and stride (1,2,2), followed by instance normalization and ReLU; the result is fed to the first-stage Resnet module, which contains three residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 192 channels, then two residual operations with stride (1,1,1) and 3×3×3 kernels with 192 channels, giving 192 feature maps of size 48×40×40, which are fed to the second-stage Resnet module; except that the number of convolution kernels is updated from 192 to 384, its parameters are the same as in the first stage, finally yielding 384 feature maps of size 24×20×20, which are fed to the third-stage Resnet module, in which there are two residual units: first a residual operation with stride (2,2,2) and a 3×3×3 kernel with 384 channels, then one residual operation with stride (1,1,1) and a 3×3×3 kernel with 384 channels; the feature maps generated by F_CNN(·) are defined as

f^l = F_CNN(x; Θ), l = 1, …, L,

where L denotes the number of feature levels, l is a specific level, x is the input feature map, Θ denotes the parameters required by the encoder, C denotes the number of channels, H the height of the input image, W the width of the input image, and D the depth of the input data, i.e., the number of slices.
7. The GT-UNet-based whole-heart segmentation algorithm according to claim 5, wherein the DeTrans encoder comprises a sequence layer that converts the input and a number of stacked deformable DeTrans layers; the DeTrans encoder flattens the feature maps generated by the CNN encoder into a one-dimensional sequence of image patches and embeds a three-dimensional fixed positional-encoding sequence into the flattened one-dimensional sequence, in order to capture the relative or absolute positions between the various substructures of the heart.
8. The GT-UNet-based whole-heart segmentation algorithm according to claim 5, wherein the CNN decoder comprises four upsampling modules, each of the first three containing a transposed convolution layer with stride 2×2×2 and a 2×2×2 kernel, with 384, 192, and 64 kernels respectively, followed by a three-dimensional residual block that refines the feature map; the feature map output by the encoder and the feature map obtained after the transposed convolution are then connected by a skip connection and summed pixel by pixel, so as to retain more low-level information; the final upsampling module consists of one upsampling layer and one 1×1 convolution layer, mapping the 64-channel feature maps to the desired number of classes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210645929.4A CN115239740A (en) | 2022-06-08 | 2022-06-08 | GT-UNet-based full-center segmentation algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115239740A true CN115239740A (en) | 2022-10-25 |
Family
ID=83670140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210645929.4A Pending CN115239740A (en) | 2022-06-08 | 2022-06-08 | GT-UNet-based full-center segmentation algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115239740A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117333777A (en) * | 2023-12-01 | 2024-01-02 | 山东元明晴技术有限公司 | Dam anomaly identification method, device and storage medium |
CN117333777B (en) * | 2023-12-01 | 2024-02-13 | 山东元明晴技术有限公司 | Dam anomaly identification method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||