CN116563285B - Focus characteristic identifying and dividing method and system based on full neural network - Google Patents
- Publication number
- CN116563285B (application CN202310838165.5A)
- Authority
- CN
- China
- Prior art keywords
- focus
- image
- segmentation
- feature
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The application provides a lesion feature identification and segmentation method and system based on a full neural network, belonging to the field of medical image processing. The method comprises: improving the sharpness and contrast of CT images with a wavelet filter neural network model; identifying each structure type and segmenting each structure in the adjusted CT image with a semantic segmentation network; stacking the lesion data of a plurality of consecutive, adjacent lesion images to construct lesion volume data with three-dimensional context information; and extracting features from the volume data with a multi-scale feature pyramid network, processing and optimizing the features with a conditional random field method, and outputting the feature information and category information of the identified and segmented features. By applying feature extraction and enhancement, image segmentation, and feature-map classification to the medical image, followed by processing and optimization, the application effectively improves the accuracy of identifying and segmenting each structure in the medical image.
Description
Technical Field
The application relates to the field of medical image processing, in particular to a lesion feature identification and segmentation method based on a full neural network, and also to a lesion feature identification and segmentation system based on the full neural network.
Background
At present, lesions can be identified and their features segmented by constructing and training a deep learning neural network, and deep neural networks help to raise the diagnosis rate and to lower the misdiagnosis and missed-diagnosis rates.
However, because CT image features are complex, performing image recognition directly on the global features of a CT image is difficult, trains slowly, increases the workload, and may yield low recognition accuracy. In addition, when an image is recognized after being segmented, the features at the boundaries of the image contours are ignored, which reduces the reliability of lesion identification.
Therefore, conventional medical image recognition and segmentation methods lack effective means of feature extraction and enhancement, resulting in a high error rate when recognizing and segmenting the structures in a medical image.
Disclosure of Invention
The application aims to overcome the defect that prior-art medical image identification and segmentation lacks effective means of feature extraction and enhancement, and provides a lesion feature identification and segmentation method based on a full neural network, together with a lesion feature identification and segmentation system based on the full neural network.
The application provides a lesion feature identification and segmentation method based on a full neural network, which comprises the following steps:
S1, improving the sharpness and contrast of a CT image by adopting a wavelet filter neural network model;
S2, identifying and segmenting the lesions in the CT image with improved sharpness and contrast by adopting a semantic segmentation network, and generating a lesion image;
S3, stacking the lesion data of a plurality of consecutive, adjacent lesion images to construct lesion volume data with three-dimensional context information;
S4, extracting lesion features from the lesion volume data by adopting a multi-scale feature pyramid network, processing and optimizing the lesion features by adopting a conditional random field method, and outputting the identification and segmentation results of the lesion features.
Optionally, the wavelet filter neural network model adopts the Mallat algorithm, wherein x denotes the vector of sampling points, y denotes the filtered output vector of the sampling points, J is the scale parameter, and k is the position parameter at the scale indicated by J; the two remaining terms are the wavelet function at a given scale parameter and position parameter k, and the minimum-scale wavelet function at its scale parameter.
Optionally, the semantic segmentation network includes:
extracting the lesion by adopting dilated (atrous) convolution;
based on the extracted lesion, performing data standardization with group normalization in place of batch normalization;
for the standardized lesion, adopting an ASPP module with 5 parallel dilated-convolution branches and one average-pooling branch to obtain the identified and segmented lesion image.
Optionally, stacking the focus images to obtain a three-dimensional tensor, where each element in the three-dimensional tensor corresponds to a specific pixel point in the original CT image.
Optionally, the multi-scale feature pyramid network comprises: a low-dimensional feature layer, a high-dimensional feature layer and a parallel connection layer;
the low-dimensional feature layer converts low-level focus features in the focus volume data into high-level focus features;
the high-dimensional feature layer extracts focus feature representations from focus features of different scales;
the parallel connection layer fuses the lesion characteristic representations of different levels and scales.
Optionally, the method further comprises:
in the high-dimensional feature layer, the input three-dimensional tensor is converted into a plurality of two-dimensional feature maps of different scales through layer-by-layer downsampling;
fusing the characteristic representations of different scales in the parallel connection layer;
in the global pooling layer, carrying out global pooling on the fused feature representation to obtain a feature vector with a fixed size;
inputting the feature vectors into a full-connection layer for classification, and outputting probability values of each category;
comparing the output probability value with a preset threshold value to obtain object types, positions and confidence degrees;
and performing de-duplication and screening by using non-maximum suppression, and outputting the predicted category, position and confidence.
Optionally, the processing and optimizing the feature according to the conditional random field method includes:
modeling the dependency relationship between the focus characteristic pixels by using a CRF model to eliminate noise, fill holes and smooth segmentation boundaries.
Optionally, the method further comprises:
performance evaluation: performance is evaluated with metrics including the Dice coefficient, precision, recall, and mAP0.5:0.95.
The application also provides a focus characteristic identification and segmentation system based on the full neural network, which comprises a reading module and a processing module;
the reading module reads a DICOM file comprising a CT image by adopting a pydicom library, and inputs the CT image into the processing module;
the processing module adopts a wavelet filter neural network model to improve the sharpness and contrast of the CT image; identifies and segments the lesions in the CT image with improved sharpness and contrast by adopting a semantic segmentation network, generating lesion images; stacks the lesion images based on the adjacency of the lesions in adjacent lesion images to construct lesion volume data with three-dimensional context information; and extracts lesion features from the lesion volume data by adopting a multi-scale feature pyramid network, processes and optimizes the lesion features by adopting a conditional random field method, and outputs the identification and segmentation results of the lesion features.
The application has the advantages and beneficial effects that:
the application provides a lesion feature identification and segmentation method based on a full neural network, comprising: improving the sharpness and contrast of a CT image with a wavelet filter neural network model, and adjusting each structure type in the CT image as needed; identifying each structure type and segmenting each structure in the adjusted CT image with a semantic segmentation network; stacking adjacent CT image slices in different directions, after each structure has been identified and segmented, to construct volume data with three-dimensional context information; and extracting features from the volume data with a multi-scale feature pyramid network, processing and optimizing the features with a conditional random field method, and outputting the feature information and category information of the identified and segmented features. By applying feature extraction and enhancement, image segmentation, and feature-map classification to the medical image, followed by processing and optimization, the application effectively improves the accuracy of identifying and segmenting each structure in the medical image.
Drawings
Fig. 1 is a schematic diagram of a lesion feature recognition and segmentation process based on a full neural network in the present application.
FIG. 2 is a schematic diagram of the multiresolution analysis scheme of the Mallat algorithm in the present application.
FIG. 3 is a schematic diagram of the process flow of DeepLabV3 in the present application.
Fig. 4 is a graph comparing lesion feature recognition and segmentation results in the present application.
Detailed Description
The present application is further described in conjunction with the accompanying drawings and specific embodiments so that those skilled in the art may better understand the present application and practice it.
The following is a detailed description of the embodiments of the present application, but the present application may be implemented in other ways than those described herein, and those skilled in the art can implement the present application by different technical means under the guidance of the inventive concept, so that the present application is not limited by the specific embodiments described below.
The application belongs to the field of medical image processing and addresses the following problem: improving the accuracy of identifying and segmenting each structure in a medical image by enhancing and optimizing the features of the CT image.
The application processes CT images with a wavelet filter neural network, which improves sharpness and contrast; improves the semantic segmentation network, which raises the accuracy of feature identification and segmentation; adds a multi-scale feature pyramid network, which further improves the identification accuracy of the features; and adds conditional random field processing of the features, which improves the integrity of the output features.
The following description will be made with respect to an example of identification and segmentation of individual structures in a lumbar CT image.
Referring to fig. 1, S1 improves the sharpness and contrast of the CT image by using a wavelet filter neural network model.
The pydicom library is used to read DICOM files containing CT images. The CT image is then passed through a preprocessing function that performs operations such as data cleaning, denoising, scaling, and standardization, so that more reliable data are obtained.
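The reading and standardization step can be sketched as follows. This is a minimal illustration, not the application's preprocessing function: the helper names `normalize_window` and `read_ct_slice` and the intensity window of -200 to 1000 are assumptions chosen for the example, and pydicom is imported lazily so the normalization logic can be tried without a DICOM file.

```python
def normalize_window(pixels, slope=1.0, intercept=0.0, lo=-200.0, hi=1000.0):
    """Rescale raw CT values, clip them to [lo, hi], and scale to [0, 1].

    The window limits lo/hi are illustrative assumptions, not values from the source.
    """
    out = []
    for row in pixels:
        new_row = []
        for v in row:
            hu = v * slope + intercept          # rescale stored value
            hu = min(max(hu, lo), hi)           # clip to the chosen window
            new_row.append((hu - lo) / (hi - lo))
        out.append(new_row)
    return out


def read_ct_slice(path):
    """Read one DICOM slice with pydicom and return the normalized pixel rows."""
    import pydicom  # third-party; only needed when a real file is read

    ds = pydicom.dcmread(path)
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    return normalize_window(ds.pixel_array.tolist(), slope, intercept)
```

Calling `read_ct_slice` on a file path yields rows of values in [0, 1]; denoising and data cleaning would be separate steps that this sketch does not cover.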
Specifically, the application adopts a wavelet filter neural network model to improve the sharpness and contrast of the image, and different structure types can be adjusted as required.
The application adopts a wavelet filter neural network model optimized by the Mallat algorithm.
The Mallat algorithm is used because it has good time-frequency locality and stability and can effectively handle singularities in the wavelet transform. Its good time-frequency localization also lets it capture the local characteristics of a signal well, with a high compression ratio and high reconstruction quality.
The flow of the Mallat algorithm is as shown in fig. 2:
S201: The multiresolution analysis of the Mallat algorithm is performed, where x denotes the vector of sampling points, y denotes the filtered output vector of the sampling points, J is the scale parameter and gives the total number of scales, and k is the position parameter at the scale indicated by J; the two remaining terms are the wavelet function at a given scale parameter and position parameter k, and the minimum-scale wavelet function at its scale parameter.
S202: The wavelet transform is applied to the image to obtain its wavelet coefficients at different scales and positions, each coefficient being indexed by a scale parameter and the position parameter k.
S203: Noise reduction and contrast enhancement are applied to the image using these wavelet coefficients.
S204: After the wavelet coefficients have been processed, the inverse wavelet transform combines the coefficients of all scales back into the original image, using wavelet functions calculated from the processed wavelet coefficients.
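The decompose, process, and reconstruct flow of S201 to S204 can be sketched on a one-dimensional signal. The sketch below assumes the Haar wavelet, soft thresholding for noise reduction, and a simple gain on the detail coefficients for contrast enhancement; the source does not state which wavelet or which coefficient rule the application uses, so all three are illustrative choices, as are the function names.

```python
import math


def haar_step(signal):
    """One level of Mallat's filter-bank decomposition with the Haar wavelet."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one decomposition level."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed denoising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, levels=2, threshold=0.5, gain=1.0):
    """Decompose, shrink and scale the detail coefficients, then reconstruct.

    len(signal) must be divisible by 2 ** levels.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    details = [[gain * c for c in soft_threshold(d, threshold)] for d in details]
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx
```

With `threshold=0` and `gain=1` the signal is reconstructed exactly, which is a quick check that the forward and inverse transforms match; a gain above 1 amplifies the surviving detail, which is one simple way to raise contrast.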
S2, recognizing and segmenting the focus of the CT image with improved definition and contrast by adopting a semantic segmentation network, and generating a focus image.
The application adopts a modified DeepLabV3 semantic segmentation network, based on a fully convolutional network, to identify and segment the lesions in lumbar CT.
DeepLabV3 uses dilated (atrous) convolution, an ASPP module, and end-to-end training to address blurred object edges and missing context information in semantic segmentation tasks.
The processing flow of DeepLabV3 is as follows:
S301: An image is input.
Feature extraction is performed in the first convolution layer using dilated (atrous) convolution, which is used because it enlarges the receptive field while capturing multiple scales.
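To illustrate why cavity (dilated, or atrous) convolution enlarges the receptive field, the sketch below implements the commonly used definition y[i] = sum over k of x[i + r*k] * w[k] on a one-dimensional input, where r is the dilation rate. This is the standard textbook form and is offered as an assumption about the expression intended here; the function names are illustrative.

```python
def dilated_conv1d(x, w, rate):
    """Dilated convolution over valid positions: y[i] = sum_k x[i + rate*k] * w[k]."""
    span = rate * (len(w) - 1)
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]


def receptive_field(kernel_size, rate):
    """Number of input positions covered by one dilated kernel."""
    return rate * (kernel_size - 1) + 1
```

A 3-tap kernel with rate 12 spans 25 input positions while still using only three weights, which is how larger rates gather wider context at no extra parameter cost.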
S302: Group normalization (GroupNormalization) is used to normalize the image.
GroupNormalization is used instead of batch normalization (BatchNormalization) because batch normalization degrades severely at small batch sizes, which weakens the training effect.
GroupNormalization does not depend on the batch size; it normalizes over groups of channels, which reduces the dependence on the batch size. In addition, GroupNormalization overcomes the problems that BatchNormalization has when the data distribution is relatively uneven. The expression is as follows:
$$\hat{x}_{i,c} = \frac{x_{i,c} - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}}$$
where $x_{i,c}$ is the value of the $i$-th sample on the $c$-th channel, $\mu_c$ and $\sigma_c^2$ are respectively the mean and variance of all samples on the $c$-th channel, and $\epsilon$ is a constant that avoids division by zero.
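A minimal sketch of group normalization for a single sample follows. It uses the standard formulation, in which the mean and variance are taken over all values in each group of channels of one sample, so the result does not depend on the batch size; whether the application groups its statistics in exactly this way is an assumption, and the learnable scale and shift parameters are omitted.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize one sample; `channels` is a list of C flat channel arrays."""
    c = len(channels)
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = c // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        values = [v for ch in group for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)       # eps avoids division by zero
        out.extend([[(v - mean) * scale for v in ch] for ch in group])
    return out
```

Because the statistics come from one sample only, the output is the same whether the batch holds one image or many.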
S303: The input and the output are added through a residual connection to obtain the mapping feature F1.
The first fully connected layer of the residual structure (called the squeeze layer) is a 1×1 convolution that compresses the input to a smaller dimension (with a compression ratio of 8, the input feature map is compressed to 1/8 of the original), followed by the ReLU activation function.
The second fully connected layer of the residual structure (called the excitation layer) is a 1×1 convolution that further expands the output of the squeeze layer: because the first layer compresses with a ratio of 8, a mapping matrix is needed to map the compressed feature vector back to the original dimension.
The output is expanded to the same dimension as the number of channels of the input feature map, and a sigmoid normalization function gives the attention weight of each channel.
S304: The attention module of the Squeeze-and-Excitation Network (SE-Net) is used to improve the expressive power of the features and the performance of the model.
The convolved image is globally pooled to obtain a channel description vector, which two fully connected layers map to a new channel attention weight vector in which each element represents the weight of the corresponding channel.
The attention weight vector is then reshaped by adding two dimensions of length 1 in front of the original one-dimensional vector of length C, giving a three-dimensional tensor of size 1×1×C whose third dimension holds the attention weight of each channel. In effect, the one-dimensional vector is spread along the channel dimension so that it can be multiplied element-wise with each channel of the feature map.
After reshaping to a 1×1×C tensor, it is multiplied element by element with the input feature map.
The weighted feature map Y is obtained by multiplying s element by element with the input feature map X: $Y = s \odot X$.
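The squeeze, excitation, and re-weighting steps can be sketched as follows. The weight matrices `w1` and `w2` stand in for the two fully connected layers and are hypothetical inputs; biases are omitted, and the tiny channel count in the usage is for readability only (the source uses a compression ratio of 8).

```python
import math


def se_block(feature_map, w1, w2):
    """Channel attention on one sample.

    feature_map: list of C channels, each a flat list of pixel values.
    w1: (C/r) x C weights of the squeeze layer; w2: C x (C/r) weights of
    the excitation layer. Returns (s, y): channel weights and weighted map.
    """
    # Global average pooling gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in feature_map]
    # Squeeze layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation layer followed by sigmoid gives one weight per channel.
    s = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
         for row in w2]
    # Broadcast each weight over its channel (the 1x1xC reshape in effect).
    y = [[s_c * v for v in ch] for s_c, ch in zip(s, feature_map)]
    return s, y
```

For example, with two channels and a single hidden unit, `se_block([[1.0, 3.0], [2.0, 2.0]], [[0.5, 0.5]], [[0.0], [1.0]])` halves the first channel and scales the second by sigmoid(2).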
S305: Y is added to the input feature map X, and convolution and pooling are performed with the ReLU activation function to obtain a feature map, which is defined as the mapping feature F2. The mapping feature F2 is the lesion image generated by the identification and segmentation of the application.
Specifically, the step of obtaining the mapping feature F2 includes:
An Atrous Spatial Pyramid Pooling (ASPP) module is constructed that includes 5 parallel dilated-convolution branches and one average-pooling branch.
The convolution in each branch uses a different dilation rate in order to obtain context information over a different range. The structure is as follows:
Branch 1: a 1×1 convolution, GroupNormalization, ReLU.
Branch 2: a convolution with padding 12, dilation 12, and kernel size 3×3, GroupNormalization, ReLU.
Branch 3: a convolution with padding 24, dilation 24, and kernel size 3×3, GroupNormalization, ReLU.
Branch 4: a convolution with padding 36, dilation 36, and kernel size 3×3, GroupNormalization, ReLU.
Branch 5: a pooling layer sized to the high-level features pools them to 1×1, followed by a 1×1 convolution, GroupNormalization, and ReLU; bilinear interpolation then restores the input width W and height H.
The outputs of these 5 branches are concatenated along the channel dimension (Concat), and the information is finally fused further by a 1×1 convolutional layer.
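The branch settings above pair each dilation with an equal padding, which keeps every branch at the input resolution so that the outputs can be concatenated. The sketch below checks this with the usual convolution output-size formula; the figure of 256 channels per branch in the usage is a hypothetical value, not one given in the source.

```python
def conv_out_size(size, kernel, padding, dilation, stride=1):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Convolution branches 1 to 4 as listed in the text.
ASPP_BRANCHES = [
    {"kernel": 1, "padding": 0, "dilation": 1},
    {"kernel": 3, "padding": 12, "dilation": 12},
    {"kernel": 3, "padding": 24, "dilation": 24},
    {"kernel": 3, "padding": 36, "dilation": 36},
]


def aspp_output_shape(height, width, channels_per_branch):
    """Shape (channels, H, W) after concatenating all five branch outputs."""
    sizes = {
        (conv_out_size(height, **b), conv_out_size(width, **b))
        for b in ASPP_BRANCHES
    }
    if len(sizes) != 1:
        raise ValueError("branches must agree spatially before concatenation")
    h, w = sizes.pop()
    # Four convolution branches plus the pooled branch, which bilinear
    # interpolation restores to (H, W).
    return (len(ASPP_BRANCHES) + 1) * channels_per_branch, h, w
```

With a 64×48 input and 256 channels per branch, the concatenated tensor has 1280 channels at 64×48, and the final 1×1 convolution then reduces the channel count.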
S306: The distance between the model output and the ground truth is calculated with a regularized cross-entropy loss function, and the model parameters are updated by the backpropagation algorithm.
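One common form of a regularized cross-entropy loss is the mean pixel-wise cross-entropy plus an L2 penalty on the weights. The sketch below uses that form purely as an assumption; the source does not state which regularizer the application applies, and the function name and the `weight_decay` parameter are illustrative.

```python
import math


def regularized_cross_entropy(probs, labels, weights, weight_decay=1e-4):
    """Mean cross-entropy over pixels plus an assumed L2 weight penalty.

    probs[i] is the predicted class distribution for pixel i and
    labels[i] is the index of its true class.
    """
    ce = -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
    l2 = weight_decay * sum(w * w for w in weights)
    return ce + l2
```

The penalty term grows with the size of the weights, which discourages overfitting; backpropagation then differentiates the sum of both terms.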
S3: The lesion data of a plurality of consecutive, adjacent lesion images are stacked to construct lesion volume data with three-dimensional context information.
After ASPP has completed feature extraction, volume data with three-dimensional context information is constructed by stacking adjacent CT slices in different directions (for example, when CT images 1 to 10 are selected, the data of the first and third images are stacked for the second CT image).
Since lumbar lesions usually appear on multiple sections, the accuracy of lesion detection can be improved by using the information of multiple sections. In this volume data, ASPP is used for feature extraction. The stacking process yields a three-dimensional tensor in which each element corresponds to a certain pixel point of the original CT image.
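The stacking of a slice with its neighbors can be sketched as follows. Repeating the nearest valid slice at the ends of the series is an assumption made so that every slice yields a volume of the same depth; the source does not say how boundary slices are handled, and the function name is illustrative.

```python
def stack_neighbors(slices, index, radius=1):
    """Return slices index-radius .. index+radius as one (depth, H, W) volume.

    Out-of-range neighbors are replaced by the nearest valid slice
    (an assumed boundary rule).
    """
    last = len(slices) - 1
    return [
        slices[min(max(i, 0), last)]
        for i in range(index - radius, index + radius + 1)
    ]
```

For the second of ten slices (index 1), this returns the first, second, and third slices, matching the example in the text.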
S4: Lesion features are extracted from the lesion volume data by adopting a multi-scale feature pyramid network, the lesion features are processed and optimized by adopting a conditional random field method, and the identification and segmentation results of the lesion features are output. The segmentation result comprises the feature information and the category information of the lesion features.
Features are extracted with a multi-scale feature pyramid network (FPN) to improve the discrimination of lumbar lesions. The FPN is a convolutional neural network used to extract image features. Based on the feature representations of different levels and scales extracted layer by layer, the feature pyramid is compressed into a fixed-size vector by global pooling.
The FPN is designed for object detection tasks, in which feature representations of different scales and levels each contribute to improving accuracy.
The FPN comprises three main modules: a low-dimensional feature layer (lateral layers), a high-dimensional feature layer (top-down pathway), and a parallel connection layer (feature fusion).
The low-dimensional feature layer converts low-level focus features in the focus volume data into high-level focus features;
the high-dimensional feature layer extracts focus feature representations from focus features of different scales;
and the parallel connection layer fuses the focus characteristic representations of different layers and scales to improve detection accuracy.
The FPN further processes the output three-dimensional tensor as follows:
First, in the high-dimensional feature layer, the input three-dimensional tensor is converted into a plurality of two-dimensional feature maps of different scales by layer-by-layer downsampling.
Then, in the parallel connection layer, the feature maps of different scales (the feature representations) are fused to obtain a feature representation rich in context information.
Finally, in the global pooling layer, the fused feature maps are globally pooled to obtain a fixed-size vector representation.
The vector representation contains characteristic information extracted from different levels and scales and can be used for the final lumbar lesion classification task.
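The downsample, fuse, and pool sequence can be illustrated on a one-dimensional feature row. This is a deliberately simplified sketch: a real FPN uses learned convolutions at every level, whereas here averaging, nearest-neighbor upsampling, and addition stand in for them, and the function names are illustrative.

```python
def downsample(x):
    """Halve the resolution by averaging neighboring pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the resolution by nearest-neighbor repetition."""
    return [v for v in x for _ in range(2)]


def pyramid_vector(x, levels=3):
    """Build a pyramid, fuse it from coarse to fine, and pool each level.

    len(x) must be divisible by 2 ** (levels - 1).
    """
    pyramid = [x]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    fused = [pyramid[-1]]                      # start from the coarsest level
    for finer in reversed(pyramid[:-1]):
        coarse_up = upsample(fused[0])
        fused.insert(0, [a + b for a, b in zip(finer, coarse_up)])
    # Global average pooling turns each fused level into one number.
    return [sum(level) / len(level) for level in fused]
```

The returned list has one entry per pyramid level regardless of the input length, which is the fixed-size property that the classification layer relies on.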
After feature extraction is completed, the feature vectors are input into a fully connected (FC) layer for classification, and the probability value of each category is output. Predictions are made by comparing the output probability value with a threshold (between 0 and 1; a higher value indicates a higher probability), which gives the object category, position, and confidence. Non-maximum suppression (NMS) is applied for de-duplication and filtering, and the predicted category is finally output.
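The thresholding and non-maximum suppression steps can be sketched as follows, using the usual greedy procedure: detections below the score threshold are dropped, and among overlapping detections of the same class only the highest-scoring one is kept. The box format (x1, y1, x2, y2) and both threshold defaults are assumptions for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, score_threshold=0.5, iou_threshold=0.5):
    """Greedy NMS over (box, score, label) tuples; returns the kept ones."""
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Keep a detection unless a stronger one of the same class overlaps it.
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

Each kept tuple carries the position, confidence, and category of one prediction.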
Further, a conditional random field (CRF) method is adopted for segmentation post-processing and optimization, in order to remove noise, fill holes, and smooth segmentation boundaries.
For the image segmentation problem, CRF post-processing and optimization mainly labels the pixels of the image, dividing them into two or more categories and segmenting the image by category. In particular, the image may be represented as a grid graph in which each node represents a pixel and is labeled as belonging to a certain category.
In this process, the CRF model is used to model the dependency relationships between pixels so as to remove noise, fill holes, and smooth segmentation boundaries. The CRF model is typically used to learn the interactions between pixels and to predict the label of each pixel from the labels of the other pixels in the image.
In the CRF model, feature functions may be constructed to represent the correlation between pixels. These functions may be defined from information such as the distance between pixels, gray values, and texture, and may be used to score each possible pixel label during image segmentation. The CRF model is then solved with a message-passing algorithm to find the best set of pixel labels. This set of labels removes noise, fills holes, and smooths the segmentation boundaries.
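The effect of pairwise pixel dependencies can be illustrated with a much simpler solver than message passing. The sketch below uses iterated conditional modes (ICM) with a Potts penalty on a 4-connected grid; this is a stand-in technique chosen for brevity, not the solver described in the application, and the function name and `pairwise_weight` parameter are illustrative.

```python
def icm_smooth(unary, pairwise_weight=1.0, iterations=5):
    """Smooth a label grid by iterated conditional modes.

    unary[r][c][k] is the cost of giving pixel (r, c) label k. Each sweep
    re-labels every pixel to minimize its unary cost plus a Potts penalty
    for disagreeing with its 4-connected neighbors.
    """
    rows, cols, n_labels = len(unary), len(unary[0]), len(unary[0][0])
    labels = [
        [min(range(n_labels), key=lambda k: unary[r][c][k]) for c in range(cols)]
        for r in range(rows)
    ]
    for _ in range(iterations):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbors = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def cost(k):
                    disagree = sum(n != k for n in neighbors)
                    return unary[r][c][k] + pairwise_weight * disagree

                best = min(range(n_labels), key=cost)
                if best != labels[r][c]:
                    labels[r][c], changed = best, True
        if not changed:
            break
    return labels
```

A single pixel whose own evidence weakly favors the background, but whose neighbors are all labeled as lesion, is flipped to lesion, which is the hole-filling behavior described above.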
Furthermore, common metrics such as the Dice coefficient, precision, recall, and mAP0.5:0.95 can be used to evaluate the accuracy, robustness, and speed of the segmentation algorithm, and the performance and generalization capability of the model can be further improved by means such as hyperparameter tuning, data augmentation, and ensemble learning.
Evaluation indices:
Dice: the evaluation index adopted in the experiment is Dice; that is, the Dice of every test sample on the test set is calculated and averaged to obtain mDice. Dice is one of the evaluation indices for semantic segmentation and measures the accuracy of the segmentation result.
P: the precision of lesion category determination over all test samples on the test set.
R: the recall of lesion category determination over all test samples on the test set.
mAP0.5:0.95: the mean average precision over all test samples on the test set, averaged over IoU thresholds from 0.5 to 0.95 with a step size of 0.05 (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95).
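The indices above can be computed along the following lines. This is a sketch using the standard definitions (Dice = 2|A∩B| / (|A| + |B|), precision = TP / (TP + FP), recall = TP / (TP + FN)); the function names are assumptions, and `ap_at_iou` is a hypothetical callable standing in for whatever routine computes average precision at a single IoU threshold.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def mean_dice(preds, targets):
    """mDice: average of the per-sample Dice over the test set."""
    return float(np.mean([dice_coefficient(p, t) for p, t in zip(preds, targets)]))

def precision_recall(tp, fp, fn):
    """Precision (P) and recall (R) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def map_50_95(ap_at_iou):
    """Average AP over IoU thresholds 0.5, 0.55, ..., 0.95.

    ap_at_iou is a hypothetical callable: threshold -> average precision.
    """
    thresholds = np.linspace(0.5, 0.95, 10)
    return float(np.mean([ap_at_iou(t) for t in thresholds]))
```

For instance, a prediction covering two pixels of which one lies inside a one-pixel ground-truth mask gives a Dice of 2·1 / (2 + 1) ≈ 0.667.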
The experimental results are as follows:
mDice is 0.9068268537521362, P is 0.993, R is 0.9949, and mAP0.5:0.95 is 0.8509.
As the experimental results show, the Dice index of the application reaches a high level.
In addition, GPU parallel computing and video memory management strategies can be used to accelerate the training and inference of the model, improving computational efficiency and response speed.
The application also provides a focus characteristic identification and segmentation system based on the full neural network, which comprises a reading module and a processing module;
the reading module reads a DICOM file comprising a CT image by adopting a pydicom library, and inputs the CT image into the processing module;
the processing module adopts a wavelet filter neural network model to improve the definition and contrast of the CT image; adopts a semantic segmentation network to identify and segment the focuses in the CT image with improved definition and contrast, generating focus images; stacks the focus images based on the adjacency of the focuses in adjacent focus images to construct focus volume data with three-dimensional context information; and adopts a multi-scale feature pyramid network to extract focus features from the focus volume data, processes and optimizes the focus features by a conditional random field method, and outputs the recognition and segmentation results of the focus features.
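The reading and stacking steps of the system might look like the sketch below. It assumes each DICOM file holds one CT slice and carries the `ImagePositionPatient`, `RescaleSlope`, and `RescaleIntercept` attributes, which is common for CT but not guaranteed; the helper names `read_ct_slice` and `stack_slices` are hypothetical, and ordering slices by their z position is an assumption about how adjacency is determined.

```python
import numpy as np

def read_ct_slice(path):
    """Read one DICOM file; return (slice z position, pixel array in HU)."""
    import pydicom  # imported lazily so stack_slices works without pydicom
    ds = pydicom.dcmread(path)
    # Rescale stored values to Hounsfield units; fall back to identity
    # if the rescale attributes are absent (an assumption for robustness).
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    hu = ds.pixel_array.astype(np.float32) * slope + intercept
    # Assumes ImagePositionPatient is present; index 2 is the z coordinate.
    position = float(ds.ImagePositionPatient[2])
    return position, hu

def stack_slices(slices):
    """Stack 2-D slices into a (D, H, W) volume ordered by slice position.

    slices: iterable of (position, 2-D array) pairs, e.g. CT slices or
    the per-slice focus masks produced by the segmentation network.
    """
    ordered = sorted(slices, key=lambda item: item[0])
    return np.stack([img for _, img in ordered], axis=0)
```

The same `stack_slices` helper can be applied to per-slice focus images, so that each element of the resulting three-dimensional tensor corresponds to a pixel of an original CT slice.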
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (9)
1. The focus characteristic identification and segmentation method based on the full neural network is characterized by comprising the following steps of:
S1, improving the definition and contrast of a CT image by adopting a wavelet filter neural network model, decomposing a signal or an image into wavelet coefficients under different scales and positions based on a Mallat algorithm, then performing wavelet transformation to obtain the wavelet coefficients under different scales and positions, and performing noise reduction and contrast enhancement treatment on the image by using the wavelet coefficients;
S2, recognizing and segmenting focuses of the CT image with improved definition and contrast by adopting a semantic segmentation network, and generating a focus image;
S3, stacking focus data in a plurality of continuous and adjacent focus images to construct focus volume data with three-dimensional context information;
S4, extracting focus features from the focus volume data by adopting a multi-scale feature pyramid network, processing and optimizing the focus features by adopting a conditional random field method, and outputting recognition and segmentation results of the focus features.
2. The lesion feature recognition and segmentation method according to claim 1, wherein the wavelet filter neural network model employs a Mallat algorithm:
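[formula not recoverable from the source text];
wherein x represents the vector of sampling points, y is the filtered output vector of the sampling points, J is a scale parameter, and k represents a position parameter at the scale represented by J; the formula further uses a wavelet function defined by a scale parameter and the position parameter k, and a minimum-scale wavelet function defined by a scale parameter (the symbols for these functions and scale parameters are missing from the source text).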
3. The full neural network-based lesion feature recognition and segmentation method according to claim 1, wherein the semantic segmentation network comprises:
extracting the focus by adopting dilated (atrous) convolution;
based on the extracted focus, adopting group normalization instead of batch normalization for data standardization;
for the standardized focus, adopting an ASPP model with 5 parallel dilated convolution branches and one average pooling branch to obtain the identified and segmented focus images.
4. The method for identifying and segmenting focal features based on the full neural network according to claim 3, wherein the focal images are stacked to obtain three-dimensional tensors, and each element in the three-dimensional tensors corresponds to a pixel point in an original CT image.
5. The full neural network-based lesion feature recognition and segmentation method according to claim 1, wherein the multi-scale feature pyramid network comprises: a low-dimensional feature layer, a high-dimensional feature layer and a parallel connection layer;
the low-dimensional feature layer converts low-level focus features in the focus volume data into high-level focus features;
the high-dimensional feature layer extracts focus feature representations from focus features of different scales;
the parallel connection layer fuses the lesion characteristic representations of different levels and scales.
6. The method for identifying and segmenting lesion features based on the full neural network according to claim 5, further comprising:
in the high-dimensional feature layer, converting the input three-dimensional tensor into a plurality of two-dimensional feature maps of different scales through layer-by-layer downsampling;
fusing the feature representations of different scales in the parallel connection layer;
in the global pooling layer, performing global pooling on the fused feature representation to obtain a feature vector of fixed size;
inputting the feature vectors into a fully connected layer for classification, and outputting a probability value for each category;
comparing the output probability value with a preset threshold value to obtain the object category, position, and confidence;
and performing de-duplication and screening by using non-maximum suppression, and outputting the predicted category, position, and confidence.
7. The method for identifying and segmenting focal features based on the full neural network according to claim 1, wherein the processing and optimizing the focal features by using a conditional random field method comprises:
modeling the dependency relationship between the focus characteristic pixels by using a CRF model to eliminate noise, fill holes and smooth segmentation boundaries.
8. The method for identifying and segmenting focal features based on the full neural network according to any one of claims 1 to 7, further comprising:
performance evaluation using the Dice coefficient, precision, recall, and mAP0.5:0.95.
9. The focus characteristic recognition and segmentation system based on the full neural network is characterized by comprising a reading module and a processing module;
the reading module reads a DICOM file comprising a CT image by adopting a pydicom library, and inputs the CT image into the processing module;
the processing module adopts a wavelet filter neural network model to improve the definition and contrast of a CT image, wherein the wavelet filter neural network model decomposes a signal or an image into wavelet coefficients under different scales and positions based on a Mallat algorithm, then performs wavelet transformation to obtain the wavelet coefficients under different scales and positions, and performs noise reduction and contrast enhancement on the image by using the wavelet coefficients; adopts a semantic segmentation network to identify and segment the focuses in the CT image with improved definition and contrast, generating focus images; stacks the focus images based on the adjacency of the focuses in adjacent focus images to construct focus volume data with three-dimensional context information; and adopts a multi-scale feature pyramid network to extract focus features from the focus volume data, processes and optimizes the focus features by a conditional random field method, and outputs the recognition and segmentation results of the focus features.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310838165.5A CN116563285B (en) | 2023-07-10 | 2023-07-10 | Focus characteristic identifying and dividing method and system based on full neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116563285A (en) | 2023-08-08 |
CN116563285B (en) | 2023-09-19 |
Family
ID=87488329
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202310838165.5A | Active | 2023-07-10 | 2023-07-10 | Focus characteristic identifying and dividing method and system based on full neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116563285B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274278B (en) * | 2023-09-28 | 2024-04-02 | 武汉大学人民医院(湖北省人民医院) | Retina image focus part segmentation method and system based on simulated receptive field |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110148129A (en) * | 2018-05-24 | 2019-08-20 | 深圳科亚医疗科技有限公司 | Training method, dividing method, segmenting device and the medium of the segmentation learning network of 3D rendering |
CN110807462A (en) * | 2019-09-11 | 2020-02-18 | 浙江大学 | Training method insensitive to context of semantic segmentation model |
CN111626300A (en) * | 2020-05-07 | 2020-09-04 | 南京邮电大学 | Image semantic segmentation model and modeling method based on context perception |
CN112102321A (en) * | 2020-08-07 | 2020-12-18 | 深圳大学 | Focal image segmentation method and system based on deep convolutional neural network |
AU2020103901A4 (en) * | 2020-12-04 | 2021-02-11 | Chongqing Normal University | Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field |
CN113012170A (en) * | 2021-03-25 | 2021-06-22 | 推想医疗科技股份有限公司 | Esophagus tumor region segmentation and model training method and device and electronic equipment |
CN113052856A (en) * | 2021-03-12 | 2021-06-29 | 北京工业大学 | Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism |
CN113298826A (en) * | 2021-06-09 | 2021-08-24 | 东北大学 | Image segmentation method based on LA-Net network |
CN113362338A (en) * | 2021-05-24 | 2021-09-07 | 国能朔黄铁路发展有限责任公司 | Rail segmentation method, device, computer equipment and rail segmentation processing system |
WO2021184817A1 (en) * | 2020-03-16 | 2021-09-23 | 苏州科技大学 | Method for segmenting liver and focus thereof in medical image |
CN114332133A (en) * | 2022-01-06 | 2022-04-12 | 福州大学 | New coronary pneumonia CT image infected area segmentation method and system based on improved CE-Net |
CN114565601A (en) * | 2022-03-08 | 2022-05-31 | 江苏师范大学 | Improved liver CT image segmentation algorithm based on DeepLabV3+ |
CN114693719A (en) * | 2022-03-30 | 2022-07-01 | 南京航空航天大学 | Spine image segmentation method and system based on 3D-SE-Vnet |
CN114723669A (en) * | 2022-03-08 | 2022-07-08 | 同济大学 | Liver tumor two-point five-dimensional deep learning segmentation algorithm based on context information perception |
CN115861346A (en) * | 2023-02-16 | 2023-03-28 | 邦世科技(南京)有限公司 | Spine nuclear magnetic resonance image segmentation method based on scene perception fusion network |
CN116012320A (en) * | 2022-12-26 | 2023-04-25 | 南开大学 | Image segmentation method for small irregular pancreatic tumors based on deep learning |
CN116205967A (en) * | 2023-04-27 | 2023-06-02 | 中国科学院长春光学精密机械与物理研究所 | Medical image semantic segmentation method, device, equipment and medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10977530B2 (en) * | 2019-01-03 | 2021-04-13 | Beijing Jingdong Shangke Information Technology Co., Ltd. | ThunderNet: a turbo unified network for real-time semantic segmentation |
CN112686899B (en) * | 2021-03-22 | 2021-06-18 | 深圳科亚医疗科技有限公司 | Medical image analysis method and apparatus, computer device, and storage medium |
US20220366682A1 (en) * | 2021-05-04 | 2022-11-17 | University Of Manitoba | Computer-implemented arrangements for processing image having article of interest |
Non-Patent Citations (10)
Title |
---|
A Method for Improving Accuracy of DeepLabv3+ Semantic Segmentation Model Based on Wavelet Transform;Xin Yin等;International Conference in Communications, Signal Processing, and Systems;688–693 * |
Attention Deeplabv3+: Multi-level Context Attention Mechanism for Skin Lesion Segmentation;Reza Azad等;European Conference on Computer Vision;251–266 * |
Cascaded ASPP and Attention Mechanism-based Deeplabv3+ Semantic Segmentation Model;S. Guo 等;2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems;315-318 * |
Residual Squeeze-and-Excitation Network with Multi-scale Spatial Pyramid Module for Fast Robotic Grasping Detection;H. Cao 等;2021 IEEE International Conference on Robotics and Automation;13445-13451 * |
Semantic segmentation based on DeeplabV3+ with multiple fusions of low-level features;J. Libiao 等;2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference;1957-1963 * |
Structural Damage Evolution of Mesoscale Representative Elementary Areas of Mudstones;Qijun Hu等;Geofluids;1-5 * |
X. Jia 等.Improving the semantic segmentation algorithm of DeepLabv3+.2023 IEEE 6th Information Technology,Networking,Electronic and Automation Control Conference.2023,1730-1734. * |
基于多尺度特征融合的遥感图像语义分割方法;吴宁 等;计算机应用 网络首发;1-10 * |
基于多尺度特征选择网络的新冠肺炎CT图像分割方法;厉恩硕 等;现代仪器与医疗;49-54 * |
基于深度特征的腹部CT影像肝脏占位性病变辅助诊断研究;夏开建;中国博士学位论文全文数据库 (医药卫生科技辑)(第(2021)03期);E060-7 * |
Also Published As
Publication number | Publication date |
---|---|
CN116563285A (en) | 2023-08-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||