CN113888466A - Pulmonary nodule image detection method and system based on CT image - Google Patents
Pulmonary nodule image detection method and system based on CT image
- Publication number
- CN113888466A (application CN202111030746.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- patch
- cnn
- transformer
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30061—Lung
- G06T2207/30064—Lung nodule
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The invention discloses a lung nodule image detection method and system based on CT images. The detection method comprises the following steps: S1, image serialization: tokenization is performed by reshaping slices of the input lung CT image into a sequence of patches; S2, patch embedding: the vectorized patch sequence is mapped to a latent two-dimensional embedding space using a trainable linear mapping; S3, establishing a hybrid CNN and Transformer encoder: the tokenized image patches from the CNN feature map are encoded by the Transformer as an input sequence for extracting global context; S4, cascaded decoder: the encoded features obtained in step S3 are first upsampled by the decoder, the upsampled features are then combined with high-resolution CNN feature maps to achieve accurate localization, and finally U-Net is used to recover local spatial information and thereby refine detail detection. The method can effectively improve the accuracy of pulmonary nodule detection.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a lung nodule image detection method and system based on a CT image.
Background
Lung cancer is the leading cause of cancer death worldwide. Lung nodules, an early manifestation of lung cancer, appear on CT images as roughly circular lung opacities no more than 3 cm in diameter, and accurately detecting their contours helps doctors distinguish benign from malignant nodules. Because lung nodules are small and resemble other tissues in the lung parenchyma, such as blood vessels, in morphology and brightness, they are difficult to separate by visual observation alone and can seriously interfere with a doctor's judgment. To reduce doctors' workload and improve the efficiency of nodule diagnosis, computer-aided diagnosis techniques have been adopted in clinical work.
Deep learning currently achieves excellent results in the field of computer vision. The U-Net architecture has become a de facto standard in various medical image segmentation tasks and has enjoyed great success. However, due to the inherent locality of convolution operations, U-Net typically shows limitations in explicitly modeling long-range dependencies. The Transformer, designed for sequence-to-sequence prediction, has emerged as an alternative architecture with an innate global self-attention mechanism, but it may yield limited localization capability because its low-level features lack detail.
Disclosure of Invention
Existing methods that encode tokenized image patches using only a Transformer and then directly upsample the hidden feature representation into a dense full-resolution output cannot produce satisfactory results, particularly given the large differences in texture, shape and size among patients. To address this problem, the invention provides a lung nodule image detection method and system based on CT images. The detection method uses a TransformerUnet combined architecture and, building on existing research, proposes a self-attention mechanism based on CNN features. Unlike existing CNN-based methods, TransformerUnet establishes the self-attention mechanism from the perspective of sequence-to-sequence prediction. To compensate for the loss of feature resolution caused by the Transformer, the network adopts a hybrid CNN and Transformer structure that exploits both the detailed high-resolution spatial information from CNN features and the global context encoded by the Transformer. Inspired by the U-shaped architecture, the self-attention features encoded by the Transformer are then upsampled and combined with the different high-resolution CNN features passed through skip connections from the encoding path, to achieve accurate localization.
The lung nodule image detection method and system based on CT images adopt a TransformerUnet combined architecture, which combines a neural network architecture based on attention encoding in deep learning (Transformer) with a biomedical semantic segmentation network architecture based on fully convolutional networks (U-Net). The Transformer, designed for sequence-to-sequence prediction and having an innate global self-attention mechanism, serves as an alternative architecture, and the problem of low localization accuracy caused by insufficient low-level features is addressed by adding a medical image segmentation model. Unlike existing manually designed pulmonary nodule detection models, the detection architecture provided by the invention consists of two parts: a Transformer part and a U-Net part.
Interpretation of terms:
1. Transformer: a neural network architecture based on attention encoding.
2. CNN: Convolutional Neural Network.
3. U-Net: a biomedical semantic segmentation network architecture; a type of fully convolutional neural network.
4. Patch: the feature detector in a convolutional neural network divides the input image into a number of small blocks, each of which is called a patch.
5. CUP: Cascaded Upsampler, a cascaded decoder that upsamples larger images with a less computationally intensive decoder to increase decoding speed.
6. MSA: Multi-Head Self-Attention, which interprets the input sequence from several different perspectives at once by computing multiple attentions.
7. MLP: Multi-Layer Perceptron, also called an artificial neural network; in addition to the input and output layers, it has multiple hidden layers in between.
The technical scheme adopted by the invention for overcoming the technical problems is as follows:
The invention discloses a lung nodule image detection method based on CT images, which detects lung nodule images using a TransformerUnet combined architecture comprising a Transformer part and a U-Net part. The detection method comprises the following steps:
S1, image serialization: tokenization is performed by reshaping slices of the input lung CT image into a sequence of patches;
S2, patch embedding: the vectorized patch sequence is mapped to a latent two-dimensional embedding space using a trainable linear mapping;
S3, establishing a hybrid CNN and Transformer encoder: the tokenized image patches from the CNN feature map are encoded by the Transformer as an input sequence for extracting global context;
S4, cascaded decoder: the encoded features obtained in step S3 are first upsampled by the decoder, the upsampled features are then combined with high-resolution CNN feature maps to achieve accurate localization, and finally U-Net is used to recover local spatial information and thereby refine detail detection.
Further, in step S1, let the lung CT image be $x \in \mathbb{R}^{H \times W \times C}$, where H × W is the spatial resolution and C is the number of channels.
Further, step S1 specifically includes:
tokenization is performed by reshaping the input lung CT image x into a sequence of flattened patches $\{x_p^i \in \mathbb{R}^{p^2 \cdot C} \mid i = 1, \ldots, N\}$, where p is the patch size, so each patch is p × p, and the number of patches per image, $N = \frac{HW}{p^2}$, is the input sequence length.
Further, step S2 specifically includes:
S21: to encode the spatial information of the patch sequence, a specific position embedding is learned and added to the patch embeddings to retain positional information, as shown in the following equation:
$$z_0 = [x_p^1 E;\; x_p^2 E;\; \cdots;\; x_p^N E] + E_{pos} \tag{1}$$
where $E \in \mathbb{R}^{(p^2 \cdot C) \times D}$ is the patch embedding projection, $E_{pos} \in \mathbb{R}^{N \times D}$ represents the position embedding, and D is the dimension of the embedded patch;
S22: to recover the spatial order of the embedded patches, the encoded feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped from $\frac{HW}{p^2}$ to $\frac{H}{p} \times \frac{W}{p}$; a 1 × 1 convolution then reduces the channel size of the features to the number of classes, and the feature map is directly upsampled to the full resolution H × W to predict the final segmentation result.
Further, step S3 specifically includes:
the hybrid CNN and Transformer encoder is built from L layers of multi-head self-attention (MSA) and multi-layer perceptron (MLP) blocks, as expressed in equations (2) and (3), so the output of the l-th layer can be written as follows:
$$z'_l = \mathrm{MSA}(\mathrm{LN}(z_{l-1})) + z_{l-1} \tag{2}$$
$$z_l = \mathrm{MLP}(\mathrm{LN}(z'_l)) + z'_l \tag{3}$$
where MSA denotes multi-head self-attention, MLP denotes the multi-layer perceptron, LN(·) denotes the layer normalization operator, $z'_l$ denotes the multi-head attention output of the l-th layer, and $z_l$ denotes the encoded image representation of the l-th layer.
Further, the method also includes compensating for the information loss of the hybrid CNN and Transformer encoder, specifically:
similar to U-Net, skip connections are used to fuse the multi-scale features from the hybrid encoder with the upsampled features. The CNN is used as a feature extractor to generate feature maps, rather than inputting 1 × 1 patches extracted from the original image, thereby preserving more deep and shallow features to compensate for the information loss.
Further, step S4 specifically includes:
multiple upsampling steps are used to decode the hidden features and output the final segmentation mask, specifically:
the hidden feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped into $\frac{H}{p} \times \frac{W}{p} \times D$. A cascaded decoder is then realized by cascading multiple upsampling blocks to go from $\frac{H}{p} \times \frac{W}{p}$ to the full resolution H × W, wherein each upsampling block consists, in order, of a 2× upsampling operator, a 3 × 3 convolutional layer and a ReLU layer;
finally, the cascaded decoder and the hybrid encoder together form a U-shaped structure, and feature fusion is carried out by combining, through skip connections, the upsampled feature maps at different resolution levels.
The invention also discloses a lung nodule image detection system based on CT images, which adopts the TransformerUnet combined architecture and specifically comprises:
an image serialization module, for performing tokenization by reshaping slices of the input lung CT image into a sequence of patches;
a patch embedding module, for mapping the vectorized patch sequence to a latent two-dimensional embedding space using a trainable linear mapping;
a hybrid CNN and Transformer encoder module, for encoding, through the Transformer, the tokenized image patches from the CNN feature map as an input sequence for extracting global context;
and a cascaded decoder module, for first upsampling, through the decoder, the encoded features obtained by the hybrid CNN and Transformer encoder module, then combining the upsampled features with high-resolution CNN feature maps to achieve accurate localization, and finally using U-Net to recover local spatial information and thereby refine detail detection.
The beneficial effects of the invention are:
1. Existing pulmonary nodule detection algorithms spend a great deal of time on feature extraction. Traditional feature extraction algorithms require extensive manual annotation, and the features require substantial prior knowledge. Detecting and classifying pulmonary nodules with deep learning effectively avoids the subjective uncertainty of doctors' judgment, relieves doctors' workload and improves the accuracy of pulmonary nodule detection, because a deep learning model can automatically learn and extract features suited to the task at hand.
2. Detecting lung nodules directly with a Transformer is less effective than using U-Net or Attention. A Transformer extracts high-level semantic features well, which benefits classification tasks, but it lacks the low-level features needed to segment lung nodule images. A TransformerUnet network, formed by combining the Transformer with the U-Net structure through skip connections, therefore learns both high-level semantic features and low-level detail features well, can effectively improve the accuracy of pulmonary nodule detection, and assists doctors in their judgment.
Drawings
Fig. 1 shows serial slices of a lung CT image according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of the TransformerUnet combined architecture according to an embodiment of the present invention.
Detailed Description
In order to facilitate a better understanding of the invention for those skilled in the art, the invention will be described in further detail with reference to the accompanying drawings and specific examples, which are given by way of illustration only and do not limit the scope of the invention.
Embodiment 1
As shown in Fig. 1 and Fig. 2, this embodiment discloses a lung nodule image detection method based on CT images, which detects lung nodule images using a TransformerUnet combined architecture comprising a Transformer part and a U-Net part.
The lung nodule image detection method based on the CT image comprises the following steps:
Step S1, image preprocessing: image serialization is performed, in which tokenization is carried out by reshaping slices of the input lung CT image into a patch sequence.
Let the lung CT image be $x \in \mathbb{R}^{H \times W \times C}$, where H × W is the spatial resolution and C is the number of channels. The goal is to predict a pixel-wise label map of the corresponding size H × W. Prior methods train a CNN (e.g., U-Net) that encodes the image into a high-level feature representation and then decodes it back to full spatial resolution. In contrast, this method introduces the self-attention mechanism into the encoder design by using a Transformer; the image is likewise first encoded into a high-level feature representation and then decoded to the original resolution.
Pixel size and slice thickness differ between scans, which hinders model training; image serialization effectively avoids this problem. The image serialization described in this embodiment specifically includes:
tokenization is performed by reshaping the input lung CT image x into a sequence of flattened patches $\{x_p^i \in \mathbb{R}^{p^2 \cdot C} \mid i = 1, \ldots, N\}$, where p is the patch size in pixels, so each patch is p × p, and the number of patches per image, $N = \frac{HW}{p^2}$, is the input sequence length.
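The reshaping in step S1 can be illustrated with a minimal NumPy sketch. The function name `image_to_patch_sequence` and the 224 × 224 single-channel input are assumptions made for illustration only; the patent does not specify an implementation.

```python
import numpy as np

def image_to_patch_sequence(x: np.ndarray, p: int) -> np.ndarray:
    """Reshape an (H, W, C) slice into N = H*W/p**2 flattened patches of length p*p*C."""
    H, W, C = x.shape
    if H % p or W % p:
        raise ValueError("H and W must be divisible by the patch size p")
    # Split rows and columns into (H/p, p) and (W/p, p) blocks.
    blocks = x.reshape(H // p, p, W // p, p, C)
    # Bring the two grid axes to the front: (H/p, W/p, p, p, C).
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # Flatten the grid into a sequence and each patch into a vector.
    return blocks.reshape((H // p) * (W // p), p * p * C)

# Hypothetical single-channel 224 x 224 slice with patch size 16.
slice_2d = np.arange(224 * 224, dtype=np.float32).reshape(224, 224, 1)
patches = image_to_patch_sequence(slice_2d, p=16)
print(patches.shape)  # (196, 256)
```

Patches are ordered row by row over the patch grid, so the sequence index can later be mapped back to a spatial position.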
Step S2, patch embedding: a trainable linear mapping is used to map the vectorized patch sequence to a latent two-dimensional embedding space.
In this embodiment, step S2 specifically includes:
S21: to encode the spatial information of the patch sequence, a specific position embedding is learned and added to the patch embeddings to retain positional information, as shown in the following equation:
$$z_0 = [x_p^1 E;\; x_p^2 E;\; \cdots;\; x_p^N E] + E_{pos} \tag{1}$$
where $E \in \mathbb{R}^{(p^2 \cdot C) \times D}$ is the patch embedding projection, $E_{pos} \in \mathbb{R}^{N \times D}$ represents the position embedding, and D is the dimension of the embedded patch;
S22: to recover the spatial order of the embedded patches, the encoded feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped from $\frac{HW}{p^2}$ to $\frac{H}{p} \times \frac{W}{p}$; a 1 × 1 convolution then reduces the channel size of the features to the number of classes, and the feature map is directly upsampled to the full resolution H × W to predict the final segmentation result.
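Steps S21 and S22 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patented implementation: the sizes (224 × 224 input, p = 16, D = 768), the two-class output, the random initialization and the nearest-neighbour upsampling are all hypothetical choices, and the Transformer layers that sit between the two steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: a 224 x 224 single-channel slice, patch size p = 16, D = 768.
H = W = 224
p, C, D = 16, 1, 768
N = (H // p) * (W // p)                          # 196 patches

x_p = rng.standard_normal((N, p * p * C))        # stand-in for the vectorized patch sequence

# Trainable parameters, randomly initialized here and learned during training.
E = 0.02 * rng.standard_normal((p * p * C, D))   # patch embedding projection
E_pos = 0.02 * rng.standard_normal((N, D))       # learned position embedding

# S21: project every patch and add its position embedding.
z0 = x_p @ E + E_pos                             # shape (N, D)

# The Transformer layers are omitted; z0 stands in for the encoded feature.
z_enc = z0

# S22: restore the spatial order, reduce channels with a 1 x 1 convolution,
# then upsample directly to H x W.
num_classes = 2                                  # hypothetical: background / nodule
grid = z_enc.reshape(H // p, W // p, D)          # (14, 14, D)
W_1x1 = 0.02 * rng.standard_normal((D, num_classes))
logits = grid @ W_1x1                            # a 1 x 1 conv is a per-position linear map
full = logits.repeat(p, axis=0).repeat(p, axis=1)  # nearest-neighbour upsampling

print(z0.shape, grid.shape, full.shape)
```

Because the direct upsampling multiplies the resolution by p in one step, every p × p output block shares a single prediction, which illustrates why a cascaded decoder with skip connections is used to recover finer detail.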
Step S3, establishing a hybrid CNN and Transformer encoder: the tokenized image patches from the CNN feature map are encoded by the Transformer as an input sequence for extracting global context.
In the hybrid CNN and Transformer encoder, different sets of suspected lung nodule candidates are obtained after patch embedding. Because of the intrinsic limitations of convolution operations, which remain weak at modeling long-range relationships, CNN-based architectures often perform poorly, especially for patients exhibiting large differences in structure, texture, shape and size. To overcome this limitation, a self-attention mechanism is established on the CNN features: the tokenized image patches from the CNN feature map are encoded into an input sequence from which global context is extracted. Furthermore, unlike previous CNN-based methods, the Transformer is not only powerful at global feature extraction but also transfers well to downstream tasks after large-scale pre-training; as an alternative architecture, it dispenses with convolution operations entirely and relies solely on attention mechanisms.
Specifically, step S3 includes:
the hybrid CNN and Transformer encoder is built from L layers of multi-head self-attention (MSA) and multi-layer perceptron (MLP) blocks, as expressed in equations (2) and (3), so the output of the l-th layer can be written as follows:
$$z'_l = \mathrm{MSA}(\mathrm{LN}(z_{l-1})) + z_{l-1} \tag{2}$$
$$z_l = \mathrm{MLP}(\mathrm{LN}(z'_l)) + z'_l \tag{3}$$
where MSA denotes multi-head self-attention, MLP denotes the multi-layer perceptron, LN(·) denotes the layer normalization operator, $z'_l$ denotes the multi-head attention output of the l-th layer, and $z_l$ denotes the encoded image representation of the l-th layer.
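One encoder layer can be sketched in NumPy, assuming the standard pre-norm residual form z'_l = MSA(LN(z_{l-1})) + z_{l-1} followed by z_l = MLP(LN(z'_l)) + z'_l. The sketch simplifies deliberately: layer normalization has no learned scale or shift, biases are omitted, and the MLP uses ReLU for brevity. All sizes and weight values are hypothetical.

```python
import numpy as np

def layer_norm(z, eps=1e-6):
    """Normalize each token over its feature axis (no learned scale or shift)."""
    mu = z.mean(axis=-1, keepdims=True)
    var = z.var(axis=-1, keepdims=True)
    return (z - mu) / np.sqrt(var + eps)

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def msa(z, Wq, Wk, Wv, Wo, heads):
    """Multi-head self-attention over a token sequence z of shape (N, D)."""
    N, D = z.shape
    d = D // heads
    split = lambda t: t.reshape(N, heads, d).transpose(1, 0, 2)   # (heads, N, d)
    q, k, v = split(z @ Wq), split(z @ Wk), split(z @ Wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))         # (heads, N, N)
    merged = (attn @ v).transpose(1, 0, 2).reshape(N, D)          # concatenate heads
    return merged @ Wo

def mlp(z, W1, W2):
    return np.maximum(z @ W1, 0.0) @ W2    # ReLU used for brevity

def transformer_layer(z_prev, attn_w, mlp_w, heads):
    z_mid = msa(layer_norm(z_prev), *attn_w, heads=heads) + z_prev   # attention + residual
    return mlp(layer_norm(z_mid), *mlp_w) + z_mid                    # MLP + residual

rng = np.random.default_rng(0)
N, D, heads, L = 16, 32, 4, 2               # small hypothetical sizes
z = rng.standard_normal((N, D))
for _ in range(L):
    attn_w = [0.1 * rng.standard_normal((D, D)) for _ in range(4)]
    mlp_w = [0.1 * rng.standard_normal((D, 4 * D)),
             0.1 * rng.standard_normal((4 * D, D))]
    z = transformer_layer(z, attn_w, mlp_w, heads)
print(z.shape)  # (16, 32)
```

The residual additions keep the sequence shape (N, D) unchanged from layer to layer, so layers can be stacked to any depth.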
Because the hybrid CNN and Transformer encoder loses information, this embodiment further compensates for that loss: a hybrid CNN-Transformer architecture is used as the encoder, and cascaded upsampling is performed to achieve accurate localization. Specifically:
similar to U-Net, skip connections are used to fuse the multi-scale features from the hybrid encoder with the upsampled features. The CNN is used as a feature extractor to generate feature maps, rather than inputting 1 × 1 patches extracted from the original image, thereby preserving more deep and shallow features to compensate for the information loss.
Here, shallow and deep features are concatenated to reduce the loss of spatial information caused by downsampling. A linear layer then follows, so that the dimension of the concatenated features remains the same as that of the upsampled features.
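The fusion described here can be sketched as a channel-wise concatenation followed by a per-position linear layer. The function name `fuse_skip`, the feature sizes and the random weights below are hypothetical and serve only to show the shapes involved.

```python
import numpy as np

def fuse_skip(upsampled, skip, W_lin):
    """Concatenate decoder and encoder features along channels, then project back."""
    if upsampled.shape[:2] != skip.shape[:2]:
        raise ValueError("skip and upsampled features must share spatial size")
    fused = np.concatenate([upsampled, skip], axis=-1)   # (h, w, C_up + C_skip)
    return fused @ W_lin                                  # (h, w, C_up)

rng = np.random.default_rng(0)
h, w, C_up, C_skip = 28, 28, 64, 32                      # assumed sizes
upsampled = rng.standard_normal((h, w, C_up))            # deep, upsampled decoder feature
skip = rng.standard_normal((h, w, C_skip))               # shallow CNN feature from the encoder
W_lin = 0.1 * rng.standard_normal((C_up + C_skip, C_up)) # linear layer weights
out = fuse_skip(upsampled, skip, W_lin)
print(out.shape)  # (28, 28, 64)
```

Projecting back to `C_up` channels means the next upsampling block sees the same feature size whether or not a skip connection is present.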
Step S4, cascaded decoder: the encoded features obtained in step S3 are first upsampled by the decoder; the upsampled features are then combined with high-resolution CNN feature maps to achieve accurate localization; finally, U-Net is used to recover local spatial information and thereby refine detail detection, which effectively reduces false positives in lung nodule detection and provides accurate images for the computer-aided diagnosis system.
In this embodiment, step S4 specifically includes:
multiple upsampling steps are used to decode the hidden features and output the final segmentation mask, specifically:
the hidden feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped into $\frac{H}{p} \times \frac{W}{p} \times D$. A cascaded decoder is then realized by cascading multiple upsampling blocks to go from $\frac{H}{p} \times \frac{W}{p}$ to the full resolution H × W, wherein each upsampling block consists, in order, of a 2× upsampling operator, a 3 × 3 convolutional layer and a ReLU layer;
finally, the cascaded decoder and the hybrid encoder together form a U-shaped structure, and feature fusion is carried out by combining, through skip connections, the upsampled feature maps at different resolution levels.
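A cascaded upsampler of this kind can be sketched in NumPy, assuming each block doubles the resolution, applies a same-padded 3 × 3 convolution and ends with a ReLU. Nearest-neighbour upsampling, the channel widths and the random weights are assumptions made for illustration, and skip connections are left out to keep the sketch short.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def conv3x3(x, w):
    """Same-padded 3 x 3 convolution; x is (H, W, Cin), w is (3, 3, Cin, Cout)."""
    H, W, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((H, W, w.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            out += xp[dy:dy + H, dx:dx + W] @ w[dy, dx]
    return out

def upsampling_block(x, w):
    return np.maximum(conv3x3(upsample2x(x), w), 0.0)   # upsample -> conv -> ReLU

def cascaded_upsampler(z_L, grid_hw, weights):
    """Reshape the token sequence to a grid, then apply the blocks in order."""
    gh, gw = grid_hw
    x = z_L.reshape(gh, gw, -1)
    for wt in weights:
        x = upsampling_block(x, wt)
    return x

rng = np.random.default_rng(0)
p, H, W, D = 16, 224, 224, 32                   # D kept small for the sketch
channels = [D, 16, 8, 4, 2]                     # hypothetical widths, four blocks
weights = [0.1 * rng.standard_normal((3, 3, cin, cout))
           for cin, cout in zip(channels[:-1], channels[1:])]
z_L = rng.standard_normal(((H // p) * (W // p), D))
out = cascaded_upsampler(z_L, (H // p, W // p), weights)
print(out.shape)  # (224, 224, 2)
```

With a 14 × 14 token grid, four doubling blocks give 28, 56, 112 and finally 224 pixels per side, which matches the full input resolution for p = 16.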
The TransformerUnet combined architecture provided by the invention is shown in Fig. 2; it establishes the self-attention mechanism from the perspective of sequence-to-sequence prediction. To compensate for the loss of feature resolution caused by the Transformer, TransformerUnet employs a hybrid CNN-Transformer structure that exploits both the high-resolution spatial information from CNN features and the global context encoded by the Transformer. Inspired by the U-Net design, the self-attention features encoded by the Transformer are then upsampled and combined with the different high-resolution CNN features passed through skip connections from the encoding path, to achieve accurate localization. This design lets the overall network retain the advantages of the Transformer while benefiting lung nodule image detection. Fig. 1 shows slices of an acquired lung CT image.
The network is implemented as a deep learning framework in a Python environment, on an Nvidia RTX 2080 Ti GPU hardware platform under the Ubuntu 16 operating system, and is trained on the LUNA16 and LIDC datasets. Extensive experiments demonstrate the feasibility of training and testing the TransformerUnet model.
Data augmentation, such as random rotation and flipping, was used in all experiments. For the Transformer encoder, only ViT with 12 Transformer layers is employed. For the hybrid encoder design, ResNet-50 is combined with ViT. All Transformer backbones (i.e., ViT) and ResNet-50 are pre-trained on ImageNet. The input resolution and patch size are set to 224 × 224 and 16, respectively, and four cascaded upsampling blocks are needed in the CUP to reach the original image resolution. The model is trained using an SGD optimizer with a learning rate of 0.01, momentum of 0.9 and weight decay of 1e-4. The default batch size is 24; the default number of training iterations is 20k for the LUNA16 dataset and 14k for the LIDC dataset.
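The optimizer settings can be made concrete with a minimal NumPy sketch of one SGD update using the stated learning rate, momentum and weight decay. The update convention (weight decay added to the gradient before the momentum accumulation) is an assumption borrowed from common deep learning frameworks; the patent does not state which framework or convention was used, and the toy objective is purely illustrative.

```python
import numpy as np

def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=1e-4):
    """One SGD update with momentum and L2 weight decay (assumed convention)."""
    g = grad + weight_decay * w          # weight decay adds an L2 penalty gradient
    velocity = momentum * velocity + g   # accumulate momentum
    return w - lr * velocity, velocity

# Toy problem: minimize 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
start = np.linalg.norm(w)
for _ in range(200):
    w, v = sgd_step(w, grad=w, velocity=v)
print(np.linalg.norm(w) < 0.1 * start)
```

In a real training run the gradient would come from backpropagation through the network on a batch of 24 slices, repeated for the stated number of iterations.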
The invention is characterized in that, on the one hand, a CNN architecture (U-Net) provides a way to extract low-level feature cues, which supplement fine spatial details well. On the other hand, within the U-Net framework, a Transformer network encodes the tokenized image patches from the convolutional neural network (CNN) feature map into an input sequence for extracting global context. Finally, the decoder upsamples the encoded features, which are combined with high-resolution CNN feature maps to achieve accurate localization. Combined with U-Net, which recovers local spatial information, the Transformer can serve as a powerful encoder for the lung nodule detection task.
Embodiment 2
This embodiment discloses a system implementing the CT-image-based lung nodule image detection method described in Embodiment 1. The system adopts the TransformerUnet combined architecture and specifically comprises:
an image serialization module, for performing tokenization by reshaping slices of the input lung CT image into a sequence of patches;
a patch embedding module, for mapping the vectorized patch sequence to a latent two-dimensional embedding space using a trainable linear mapping;
a hybrid CNN and Transformer encoder module, for encoding, through the Transformer, the tokenized image patches from the CNN feature map as an input sequence for extracting global context;
and a cascaded decoder module, for first upsampling, through the decoder, the encoded features obtained by the hybrid CNN and Transformer encoder module, then combining the upsampled features with high-resolution CNN feature maps to achieve accurate localization, and finally using U-Net to recover local spatial information and thereby refine detail detection.
The functions of the above modules correspond to those described in Embodiment 1 and are not repeated here.
The foregoing merely illustrates the principles and preferred embodiments of the invention. Those skilled in the art may make many variations and modifications in light of the foregoing description, all of which fall within the scope of the invention.
Claims (8)
1. A lung nodule image detection method based on CT images, characterized in that lung nodule image detection is carried out using a TransformerUnet combined architecture, the detection method comprising the following steps:
S1, image serialization: tokenization is performed by reshaping slices of the input lung CT image into a sequence of patches;
S2, patch embedding: the vectorized patch sequence is mapped to a latent two-dimensional embedding space using a trainable linear mapping;
S3, establishing a hybrid CNN and Transformer encoder: the tokenized image patches from the CNN feature map are encoded by the Transformer as an input sequence for extracting global context;
S4, cascaded decoder: the encoded features obtained in step S3 are first upsampled by the decoder, the upsampled features are then combined with high-resolution CNN feature maps to achieve accurate localization, and finally U-Net is used to recover local spatial information and thereby refine detail detection.
4. The method according to claim 3, wherein step S2 specifically comprises:
S21: to encode the spatial information of the patch sequence, a specific position embedding is learned and added to the patch embeddings to retain positional information, as shown in the following equation:
$$z_0 = [x_p^1 E;\; x_p^2 E;\; \cdots;\; x_p^N E] + E_{pos} \tag{1}$$
where $E \in \mathbb{R}^{(p^2 \cdot C) \times D}$ is the patch embedding projection, $E_{pos} \in \mathbb{R}^{N \times D}$ represents the position embedding, and D is the dimension of the embedded patch;
S22: to recover the spatial order of the embedded patches, the encoded feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped from $\frac{HW}{p^2}$ to $\frac{H}{p} \times \frac{W}{p}$; a 1 × 1 convolution then reduces the channel size of the features to the number of classes, and the feature map is directly upsampled to the full resolution H × W to predict the final segmentation result.
5. The method according to claim 1, wherein step S3 specifically comprises:
the hybrid CNN and Transformer encoder is built from L layers of multi-head self-attention (MSA) and multi-layer perceptron (MLP) blocks, as expressed in equations (2) and (3), so the output of the l-th layer can be written as follows:
$$z'_l = \mathrm{MSA}(\mathrm{LN}(z_{l-1})) + z_{l-1} \tag{2}$$
$$z_l = \mathrm{MLP}(\mathrm{LN}(z'_l)) + z'_l \tag{3}$$
6. The method according to any one of claims 1-5, further comprising compensating for the information loss of the hybrid CNN and Transformer encoder, specifically comprising:
similar to U-Net, skip connections are used to fuse the multi-scale features from the hybrid encoder with the upsampled features. The CNN is used as a feature extractor to generate feature maps, rather than inputting 1 × 1 patches extracted from the original image, thereby preserving more deep and shallow features to compensate for the information loss.
7. The method according to claim 5, wherein step S4 specifically comprises:
multiple upsampling steps are used to decode the hidden features and output the final segmentation mask, specifically:
the hidden feature $z_L \in \mathbb{R}^{\frac{HW}{p^2} \times D}$ is first reshaped into $\frac{H}{p} \times \frac{W}{p} \times D$. A cascaded decoder is then realized by cascading multiple upsampling blocks to go from $\frac{H}{p} \times \frac{W}{p}$ to the full resolution H × W, wherein each upsampling block consists, in order, of a 2× upsampling operator, a 3 × 3 convolutional layer and a ReLU layer;
finally, the cascaded decoder and the hybrid encoder together form a U-shaped structure, and feature fusion is carried out by combining, through skip connections, the upsampled feature maps at different resolution levels.
8. A pulmonary nodule image detection system based on CT images, characterized in that a TransformerUnet combined architecture is adopted, the system specifically comprising:
an image serialization module, for performing tokenization by reshaping slices of the input lung CT image into a sequence of patches;
a patch embedding module, for mapping the vectorized patch sequence to a latent two-dimensional embedding space using a trainable linear mapping;
a hybrid CNN and Transformer encoder module, for encoding, through the Transformer, the tokenized image patches from the CNN feature map as an input sequence for extracting global context;
and a cascaded decoder module, for first upsampling, through the decoder, the encoded features obtained by the hybrid CNN and Transformer encoder module, then combining the upsampled features with high-resolution CNN feature maps to achieve accurate localization, and finally using U-Net to recover local spatial information and thereby refine detail detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111030746.3A CN113888466A (en) | 2021-09-03 | 2021-09-03 | Pulmonary nodule image detection method and system based on CT image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113888466A (en) | 2022-01-04 |
Family ID: 79012272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111030746.3A Pending CN113888466A (en) | 2021-09-03 | 2021-09-03 | Pulmonary nodule image detection method and system based on CT image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113888466A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114638842A (en) * | 2022-03-15 | 2022-06-17 | 桂林电子科技大学 | Medical image segmentation method based on MLP |
CN114638842B (en) * | 2022-03-15 | 2024-03-22 | 桂林电子科技大学 | Medical image segmentation method based on MLP |
CN114757942A (en) * | 2022-05-27 | 2022-07-15 | 南通大学 | Method for detecting active tuberculosis by multilayer spiral CT (computed tomography) based on deep learning |
WO2024000161A1 (en) * | 2022-06-28 | 2024-01-04 | 中国科学院深圳先进技术研究院 | Ct pancreatic tumor automatic segmentation method and system, terminal and storage medium |
CN115713661A (en) * | 2022-11-29 | 2023-02-24 | 湘南学院 | Spinal column lateral bending Lenke parting system |
CN115713661B (en) * | 2022-11-29 | 2023-06-23 | 湘南学院 | Scoliosis Lenke parting system |
CN116779170A (en) * | 2023-08-24 | 2023-09-19 | 济南市人民医院 | Pulmonary function attenuation prediction system and device based on self-adaptive deep learning |
CN117636064A (en) * | 2023-12-21 | 2024-03-01 | 浙江大学 | Intelligent neuroblastoma classification system based on pathological sections of children |
CN117636064B (en) * | 2023-12-21 | 2024-05-28 | 浙江大学 | Intelligent neuroblastoma classification system based on pathological sections of children |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113888466A (en) | Pulmonary nodule image detection method and system based on CT image | |
Chen et al. | Recent advances and clinical applications of deep learning in medical image analysis | |
CN113870258B (en) | Adversarial learning-based label-free pancreas image automatic segmentation system | |
WO2020108362A1 (en) | Body posture detection method, apparatus and device, and storage medium | |
CN109903292A (en) | A kind of three-dimensional image segmentation method and system based on full convolutional neural networks | |
CN113012172B (en) | AS-UNet-based medical image segmentation method and system | |
CN116309650B (en) | Medical image segmentation method and system based on double-branch embedded attention mechanism | |
CN112734755A (en) | Lung lobe segmentation method based on 3D full convolution neural network and multitask learning | |
CN114494296A (en) | Brain glioma segmentation method and system based on fusion of Unet and Transformer | |
CN116309648A (en) | Medical image segmentation model construction method based on multi-attention fusion | |
Azad et al. | Enhancing medical image segmentation with TransCeption: a multi-scale feature fusion approach | |
CN115471470A (en) | Esophageal cancer CT image segmentation method | |
CN115861616A (en) | Semantic segmentation system for medical image sequence | |
CN117132595B (en) | Intelligent light-weight processing method and system for DWI (diffusion-weighted imaging) images of rectal cancer and cervical cancer | |
CN113205094A (en) | Tumor image segmentation method and system based on ORSU-Net | |
CN114972378A (en) | Brain tumor MRI image segmentation method based on mask attention mechanism | |
CN116596949A (en) | Medical image segmentation method based on conditional diffusion model | |
CN117455906B (en) | Digital pathological pancreatic cancer nerve segmentation method based on multi-scale cross fusion and boundary guidance | |
CN115526829A (en) | Honeycomb lung focus segmentation method and network based on ViT and context feature fusion | |
Zheng et al. | Self-supervised monocular depth estimation based on combining convolution and multilayer perceptron | |
Wu et al. | Continuous Refinement-based Digital Pathology Image Assistance Scheme in Medical Decision-Making Systems | |
Wen et al. | Short‐term and long‐term memory self‐attention network for segmentation of tumours in 3D medical images | |
CN115375712B (en) | Lung lesion segmentation method for realizing practicality based on bilateral learning branch | |
CN116468887A (en) | Method for segmenting colon polyp with universality | |
CN116342877A (en) | Semantic segmentation method based on improved ASPP and fusion module in complex scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||