CN111105423A - Deep learning-based kidney segmentation method in CT image - Google Patents

Deep learning-based kidney segmentation method in CT image

Info

Publication number
CN111105423A
Authority
CN
China
Prior art keywords
image
kidney
images
models
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911303210.7A
Other languages
Chinese (zh)
Other versions
CN111105423B (en)
Inventor
杜强
李剑楠
郭雨晨
聂方兴
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xbentury Network Technology Co ltd
Original Assignee
Beijing Xbentury Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xbentury Network Technology Co ltd filed Critical Beijing Xbentury Network Technology Co ltd
Priority to CN201911303210.7A
Publication of CN111105423A
Application granted
Publication of CN111105423B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30084Kidney; Renal

Abstract

The invention discloses a deep-learning-based method for segmenting the kidney in a CT image, comprising the following steps: inputting a CT image group; normalizing each image in the group; generating a position-encoding map of the kidney and superimposing it on each image; convolving each image to determine a position region of interest and taking a pixel Hadamard product with each processed image to obtain a segmented image; binarizing the image at the output end; and taking a Hadamard product of the resulting image with the normalized images to determine and output the image of the kidney part. By adding position encoding when the neural network is trained on the images, the method reliably separates the spleen and the kidney, which are otherwise difficult to distinguish; in addition, an attention mechanism makes network fitting converge faster and lets the network attend to the kidney position region, thereby excluding interference from the spleen.

Description

Deep learning-based kidney segmentation method in CT image
Technical Field
The invention relates to the technical field of image segmentation, in particular to a method for segmenting a kidney in a CT image based on deep learning.
Background
With the large-scale growth of image data on the Internet, image segmentation techniques have received wide attention and application. This is especially true in the medical field, where manual segmentation by doctors is costly and inefficient, and segmentation standards are inconsistent.
Existing medical segmentation techniques are generally based on traditional computer vision: CT values in the range (-1000 HU, 1000 HU) are projected onto the RGB space (0-255), features are extracted manually using heuristics drawn from doctors' experience, and the extracted features are then fed to dimension-reduction and machine learning algorithms. Such methods have advantages: the doctor's experience is effectively automated; machine learning algorithms such as random forests and SVMs are interpretable and admit relatively optimal solutions; and the data requirements are modest, so a well-performing model may be obtained from only tens or hundreds of samples. However, these models perform very poorly when predicting on new data, especially under multi-center validation.
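As a concrete illustration of the projection step just described, the following is a minimal sketch that clips CT values to the (-1000 HU, 1000 HU) window named above and rescales them linearly to 0-255; the function name and the linear mapping are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def window_ct_to_rgb(ct_slice: np.ndarray, lo: float = -1000.0, hi: float = 1000.0) -> np.ndarray:
    """Clip a CT slice (in Hounsfield units) to [lo, hi] and rescale to 0-255."""
    clipped = np.clip(ct_slice.astype(np.float64), lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)
```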
Judging from current research, approaches to these problems have in recent years shifted from the traditional computer vision field to the combination of deep learning and computer vision; the deep learning approach has many advantages, among which high robustness is an important one.
For example, although the neural network Unet achieves a certain segmentation effect, for images with similar shapes and similar CT values it confuses the kidneys and the spleen, as shown in FIG. 1. It is therefore desirable to provide a method for accurately identifying and segmenting the kidneys in CT images.
Disclosure of Invention
In order to solve the above technical problem, the technical solution adopted by the invention is a deep-learning-based method for segmenting the kidney in a CT image, comprising the following steps:
s1, inputting a CT image group;
s2, carrying out normalization processing on each image in the image group;
s3, generating a position code pattern of the kidney, and overlapping the position code pattern with each image in the step S2;
s4, performing convolution on the images in the step S3 to determine a position region of interest, and performing pixel Hadmard Product on the position region of interest and the processed images to obtain a segmented image or a characteristic image;
s5, outputting a binary image;
s6, Hadmard Product is performed on the image obtained in step S5 and each image in step S2, and an image obtained by segmenting the kidney part is determined and output.
In the above method, further comprising the step of: setting a loss function, selecting a plurality of models for training, establishing a model set by the trained models, segmenting the image group by the models in the model set respectively, and fusing the results of the plurality of models in a voting mode by using a model fusion technology to obtain a final segmentation result.
In the above method, the normalization process performed on each image in the image group is calculated as follows:
[formula image in original: I_normal = Normalize(I_origin)]
where I_origin represents the original image and I_normal represents the normalized image.
In the above method, the position code generation specifically includes:
PE(i,j)=cos(β*ei)+sin(β*ej),i∈(0,511),j∈(0,511)
wherein (i, j) represents the coordinates of the pixel points of the kidney, the PE function is a position function related to (i, j), and β is a hyper-parameter used to adjust the frequency at different positions;
adding the position coding pattern and each processed image, wherein the calculation formula is as follows:
I_new(i,j) = I_normal(i,j) + α*PE(i,j)
where α is a scale factor that balances the normalized original image against the position information.
In the above method, the pixel Hadamard product of the region of interest (ROI) with the processed images is calculated as follows:

f1_up = concat(f2_up, f2_down) * f2_attention

f2_up = concat(f3_up, f3_down) * f3_attention

f3_up = concat(f4_up, f4_down) * f4_attention

where f1_up to f4_up are the upsampled feature maps output by convolution layers 1-4; f2_down to f4_down are the downsampled output images of layers 2-4; and f2_attention to f4_attention are feature images that have passed through one layer of convolution and pooling.
In the above method, the binarized image is specifically: the binarized image outlining the kidney is distinguished by 0 and 1 values.
In the method, the CT image group is 3 continuous CT images.
According to the method, positional encoding is added when the neural network Unet is trained on the images, so that the spleen and the kidney, otherwise difficult to distinguish, can be reliably separated. In addition, an attention mechanism is used: during back-propagation the network adjusts the ROI according to the PE information, achieving an attention effect on the region of interest, so that network fitting converges faster and the network attends to the kidney position region, thereby excluding interference from the spleen.
Drawings
FIG. 1 is a diagram illustrating a segmentation result obtained by the conventional segmentation method described in the background art;
FIG. 2 is a process flow of the method provided by the present invention;
FIG. 3 is a schematic diagram of a neural network structure provided by the present invention;
FIG. 4 is a schematic diagram of a neural network framework incorporating position coding and a regional attention mechanism according to the present invention;
FIG. 5 is a diagram illustrating the effect of the CT image segmentation according to the present invention.
Detailed Description
The invention provides a kidney CT image segmentation network method that adds positional encoding and an attention mechanism, so that through back-propagation the neural network does not confuse the spleen or other organs when segmenting the kidney. The present invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 2-3, the present invention provides a method for kidney segmentation in a CT image based on deep learning, comprising the following steps:
s1, inputting a CT image group; in this embodiment, the CT image group may be 3 continuous CT images, the length and width are 512 × 512, and the pixel value range is-1000 to 2000HU, and multiple groups of images may be transmitted for training during training, and the 3 continuous images have information of upper and lower layers, so that the model predicts the mask of the middle layer through the information of the upper and lower layers.
S2, normalizing each image in the image group. Since the actually measured kidney pixel values are generally distributed between 25 and 50 HU while spleen values are generally distributed between 35 and 60 HU, the input CT image group is normalized by the following formula:
[formula image in original: I_normal = Normalize(I_origin)]
where I_origin represents the original image and I_normal represents the normalized image.
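Since the formula itself is only available as an image in the source, the following sketch assumes ordinary min-max normalization; the patent's exact formula may differ.

```python
import numpy as np

def normalize(i_origin: np.ndarray) -> np.ndarray:
    """Map I_origin (HU values, roughly -1000..2000) onto [0, 1] by min-max scaling.
    NOTE: assumed form; the patent's exact normalization is not recoverable."""
    i_origin = i_origin.astype(np.float64)
    lo, hi = i_origin.min(), i_origin.max()
    return (i_origin - lo) / (hi - lo + 1e-8)
```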
S3, generating a positional-encoding map (Positional Encoding) of the kidney and superimposing it on each image from step S2. The implementation is as follows:
As shown in FIG. 4, for the neural network with position encoding and the attention mechanism added, the position code is generated as follows:
PE(i,j)=cos(β*ei)+sin(β*ej),i∈(0,511),j∈(0,511)
wherein (i, j) represents the coordinates of the pixel points of the kidney, the PE function is a position function related to (i, j), and β is a hyper-parameter used to adjust the frequency at different positions.
Adding the position-encoding map to the processed images amounts to expressing the position information as a frequency and superimposing it on those images, with the formula:
I_new(i,j) = I_normal(i,j) + α*PE(i,j)
where α adjusts the ratio of the normalized original image to the position information; to prevent the position encoding from unduly affecting the normalized original image, α is set to 0.1.
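A sketch of the position-encoding map and its superposition onto the normalized image, following the two formulas above. Reading the garbled terms "ei" and "ej" as β·e·i and β·e·j (with e Euler's number) is an assumption forced by the source text, and the β value is illustrative.

```python
import numpy as np

def positional_encoding(size: int = 512, beta: float = 0.01) -> np.ndarray:
    """PE(i, j) = cos(beta*e*i) + sin(beta*e*j) over a size x size grid (assumed reading)."""
    i = np.arange(size, dtype=np.float64).reshape(-1, 1)  # row coordinate
    j = np.arange(size, dtype=np.float64).reshape(1, -1)  # column coordinate
    return np.cos(beta * np.e * i) + np.sin(beta * np.e * j)

def add_position(i_normal: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """I_new = I_normal + alpha * PE, with alpha = 0.1 as stated in the text."""
    return i_normal + alpha * positional_encoding(i_normal.shape[-1])
```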
s4, convolving the images in step S3 to determine a region of interest (ROI), and performing a pixel hadamard Product (hadamard Product) on the region of interest (ROI) and the processed images to obtain a segmented image or feature image, specifically:
in the embodiment, because the PE (provider edge) is added, namely the position information, the position information always participates in the calculation in the convolution process, the network has certain sensitivity to the position information in the convolution process, and the attention effect is given to the sensitive position.
After the convolution operation on each image from step S3, the position region is found and 2 × 2 pooling is performed to locate the position region of interest; the obtained region of interest (ROI) then enters a pixel Hadamard product. During back-propagation, the network adjusts the ROI according to the PE information so as to achieve an attention effect on the region of interest. The pixel Hadamard product of the ROI with each processed image is calculated as:

f1_up = concat(f2_up, f2_down) * f2_attention

f2_up = concat(f3_up, f3_down) * f3_attention

f3_up = concat(f4_up, f4_down) * f4_attention

where f1_up to f4_up are the upsampled feature maps output by convolution layers 1-4; f2_down to f4_down are the downsampled output images of layers 2-4; and f2_attention to f4_attention are feature images that have passed through one layer of convolution and pooling. The feature maps corresponding to f_down and f_up are concatenated and then multiplied with f_attention, finally yielding the image of the ROI position region. In this way the network fitting converges faster and the network attends to the kidney position region, thereby excluding interference from the spleen.
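A hedged PyTorch sketch of one such gated skip connection of the form concat(f_up, f_down) * f_attention. The patent does not give layer sizes; the single-channel sigmoid gate standing in for the conv-and-pool attention branch is an illustrative assumption.

```python
import torch
import torch.nn as nn

class AttentionSkip(nn.Module):
    """One gated skip connection: concat(f_up, f_down) * f_attention."""

    def __init__(self, ch_up: int, ch_down: int):
        super().__init__()
        # stand-in for the conv+pool branch that produces f_attention (assumed form)
        self.attn = nn.Sequential(
            nn.Conv2d(ch_up + ch_down, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, f_up: torch.Tensor, f_down: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([f_up, f_down], dim=1)  # concat(f_up, f_down)
        return merged * self.attn(merged)          # pixel Hadamard product with the gate
```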
S5, binarizing the image at the output end: the binarized image outlines the kidney, distinguished by 0 and 1 values. Downstream tasks can be performed from this step onward.
S6, taking a Hadamard product of the image obtained in step S5 with each image from step S2, and determining and outputting the image of the segmented kidney part; FIG. 5 shows a segmentation result obtained with the present invention.
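Steps S5 and S6 together reduce to a threshold followed by a mask multiplication. A minimal sketch, with the 0.5 threshold being an assumption (the patent states only that the output is binarized to 0/1):

```python
import numpy as np

def extract_kidney(prob_map: np.ndarray, i_normal: np.ndarray) -> np.ndarray:
    """S5: binarize the network output into a 0/1 kidney mask;
    S6: pixel Hadamard product with the normalized image keeps only the kidney."""
    mask = (prob_map > 0.5).astype(i_normal.dtype)  # threshold value assumed
    return mask * i_normal
```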
Preferably, in this embodiment, in order to effectively improve the accuracy of the segmentation result, the method further includes the following steps: setting a loss function, selecting a plurality of models for training, building a model set S from the trained models, segmenting the image groups with each model in S, and using model fusion to merge the results of the plurality of models by voting to obtain the final segmentation result. Specifically:
In this embodiment, after the attention mechanism is added to Unet, different backbones serve as candidate models for Unet, which may include ResNet52, VggNet, DenseNet, SENet, etc. With different hyper-parameter settings, a plurality of models are trained through steps S1-S6 until the loss function no longer changes appreciably, and are then placed into the model set S; the models in S each segment the image group, the results are voted on, and the voting results are weighted-averaged to obtain the final segmentation result. The model training process uses known prior techniques and is not a protection point of this embodiment, so it is not described further.
In this embodiment, the results returned by 2 to 4 of the 4 models are selected for voting, and all voting results are weighted-averaged to obtain the final segmentation result. For example, if the results returned by 3 of the 4 models are selected for voting, a total of C(4,3) = 4 groups of voting results is obtained, and the final result is the weighted average of these 4 groups. Voting specifically means that a pixel of the mask is set to 1 if more than 2 models predict kidney there, and to 0 otherwise; see the sketch below. The advantage is that the error of an individual model can be effectively reduced.
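A minimal sketch of this voting rule: each model in S contributes a 0/1 mask, and a pixel is labelled kidney when more than 2 models agree, as stated above; equal model weights are an illustrative simplification of the weighted average.

```python
import numpy as np

def vote_fusion(masks, min_votes: int = 3) -> np.ndarray:
    """masks: iterable of 0/1 arrays predicted by the models in S.
    A pixel becomes 1 when at least min_votes models (i.e. more than 2) vote kidney."""
    votes = np.sum(np.stack(list(masks), axis=0), axis=0)
    return (votes >= min_votes).astype(np.uint8)
```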
According to the method for segmenting the kidney in a CT image based on deep learning provided by the invention, training was carried out on the KiTS19 data set. Without adding any other optimization, the Dice coefficient (a common measure of segmentation quality, also usable as a loss function measuring the difference between the segmentation result and the label) improved from 0.64 for a single Unet to 0.77 for the present method, an improvement of about 20%. With the position encoding and attention mechanism added, convergence needs only 10-15 iterations instead of the 30-35 required by a single Unet, an increase in convergence speed of about 60%; on kidneys with tumors the method also reaches a Dice value of 0.77.
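The Dice coefficient quoted above has the standard definition 2|P ∩ T| / (|P| + |T|); a sketch of the metric follows (1 - Dice serves as the loss form), standard rather than patent-specific code.

```python
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2*|P & T| / (|P| + |T|) for binary masks; 1 - dice is the loss form."""
    inter = float(np.sum(pred * target))
    return 2.0 * inter / (float(np.sum(pred)) + float(np.sum(target)) + eps)
```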
The invention has the beneficial effects that:
(1) The method uses an Attention mechanism, so the training curve converges faster, features in the image are found more easily, and the features are learned in a more refined way. The convergence speed improves markedly over a single Unet: on this task Unet needs about 30 iterations, while with Positional Encoding and Attention added, 10-15 iterations suffice to reach the same effect.
(2) The method performs better on the kidney segmentation task, and the quantitative measure (Dice Coefficient) reaches a higher level. The reason is that the Specificity of the task is greatly improved: compared with prior networks, the importance of the position information can be adjusted through α, so the network not only learns the shapes of the kidney and spleen but also judges by their positions, reducing false positives (non-kidney regions predicted as kidney).
(3) Compared with traditional machine learning methods, this end-to-end prediction approach lets the parameters learned by the neural network form an input-output model; a single model (taking ResNet52 as the backbone, for example) can segment an image within 1e-2 s, improving segmentation efficiency. Compared with manual segmentation the standard is more unified, providing a strong guarantee for downstream tasks.
The present invention is not limited to the above preferred embodiments; any structural change made under the teaching of the present invention that has a technical solution identical or similar to that of the present invention falls within its protection scope.

Claims (7)

1. A kidney segmentation method in a CT image based on deep learning is characterized by comprising the following steps:
s1, inputting a CT image group;
s2, carrying out normalization processing on each image in the image group;
s3, generating a position code pattern of the kidney, and overlapping the position code pattern with each image in the step S2;
s4, performing convolution on the images in the step S3 to determine a position region of interest, and performing pixel Hadmard Product on the position region of interest and the processed images to obtain a segmented image or a characteristic image;
s5, outputting a binary image;
s6, Hadmard Product is performed on the image obtained in step S5 and each image in step S2, and an image obtained by segmenting the kidney part is determined and output.
2. The method of claim 1, further comprising the step of: setting a loss function, selecting a plurality of models for training, establishing a model set by the trained models, segmenting the image group by the models in the model set respectively, and fusing the results of the plurality of models in a voting mode by using a model fusion technology to obtain a final segmentation result.
3. The method of claim 1, wherein the normalizing for each image in the set of images is calculated as:
[formula image in original: I_normal = Normalize(I_origin)]
where I_origin represents the original image and I_normal represents the normalized image.
4. The method of claim 1, wherein the position code generation is specifically as follows:
PE(i,j)=cos(β*ei)+sin(β*ej),i∈(0,511),j∈(0,511)
wherein (i, j) represents the coordinates of the pixel points of the kidney, the PE function is a position function related to (i, j), and β is a hyper-parameter used to adjust the frequency at different positions;
adding the position coding pattern and each processed image, wherein the calculation formula is as follows:
I_new(i,j) = I_normal(i,j) + α*PE(i,j)
where α is a scale factor that balances the normalized original image against the position information.
5. The method of claim 1, wherein the pixel Hadamard product of the region of interest (ROI) with the processed images is calculated as follows:

f1_up = concat(f2_up, f2_down) * f2_attention

f2_up = concat(f3_up, f3_down) * f3_attention

f3_up = concat(f4_up, f4_down) * f4_attention

where f1_up to f4_up are the upsampled feature maps output by convolution layers 1-4; f2_down to f4_down are the downsampled output images of layers 2-4; and f2_attention to f4_attention are feature images that have passed through one layer of convolution and pooling.
6. The method according to claim 1, characterized in that said binarized image is specifically: the binarized image outlining the kidney is distinguished by 0 and 1 values.
7. The method of claim 1, wherein the set of CT images is 3 consecutive CT images.
CN201911303210.7A 2019-12-17 2019-12-17 Deep learning-based kidney segmentation method in CT image Active CN111105423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911303210.7A CN111105423B (en) 2019-12-17 2019-12-17 Deep learning-based kidney segmentation method in CT image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911303210.7A CN111105423B (en) 2019-12-17 2019-12-17 Deep learning-based kidney segmentation method in CT image

Publications (2)

Publication Number Publication Date
CN111105423A 2020-05-05
CN111105423B CN111105423B (en) 2021-06-29

Family

ID=70422001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911303210.7A Active CN111105423B (en) 2019-12-17 2019-12-17 Deep learning-based kidney segmentation method in CT image

Country Status (1)

Country Link
CN (1) CN111105423B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445474A (en) * 2020-05-25 2020-07-24 南京信息工程大学 Kidney CT image segmentation method based on bidirectional complex attention depth network
CN116681892A (en) * 2023-06-02 2023-09-01 山东省人工智能研究院 Image precise segmentation method based on multi-center polar mask model improvement

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492232A (en) * 2018-10-22 2019-03-19 内蒙古工业大学 A kind of illiteracy Chinese machine translation method of the enhancing semantic feature information based on Transformer
CN109993809A (en) * 2019-03-18 2019-07-09 杭州电子科技大学 Rapid magnetic resonance imaging method based on residual error U-net convolutional neural networks

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492232A (en) * 2018-10-22 2019-03-19 内蒙古工业大学 A kind of illiteracy Chinese machine translation method of the enhancing semantic feature information based on Transformer
CN109993809A (en) * 2019-03-18 2019-07-09 杭州电子科技大学 Rapid magnetic resonance imaging method based on residual error U-net convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ashish Vaswani et al.: "Attention Is All You Need", arXiv:1706.03762v4 *
Olaf Ronneberger et al.: "U-Net: Convolutional Networks for Biomedical Image Segmentation", arXiv:1505.04597v1 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445474A (en) * 2020-05-25 2020-07-24 南京信息工程大学 Kidney CT image segmentation method based on bidirectional complex attention depth network
CN116681892A (en) * 2023-06-02 2023-09-01 山东省人工智能研究院 Image precise segmentation method based on multi-center polar mask model improvement
CN116681892B (en) * 2023-06-02 2024-01-26 山东省人工智能研究院 Image precise segmentation method based on multi-center polar mask model improvement

Also Published As

Publication number Publication date
CN111105423B (en) 2021-06-29

Similar Documents

Publication Publication Date Title
US11556797B2 (en) Systems and methods for polygon object annotation and a method of training an object annotation system
Yang et al. Automatic pixel‐level crack detection and measurement using fully convolutional network
CN108230339B (en) Stomach cancer pathological section labeling completion method based on pseudo label iterative labeling
CN110298321B (en) Road blocking information extraction method based on deep learning image classification
CN114120102A (en) Boundary-optimized remote sensing image semantic segmentation method, device, equipment and medium
US20190057507A1 (en) System and method for semantic segmentation of images
Li et al. A robust instance segmentation framework for underground sewer defect detection
CN113160192A (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN113159120A (en) Contraband detection method based on multi-scale cross-image weak supervision learning
CN111105423B (en) Deep learning-based kidney segmentation method in CT image
CN116645592B (en) Crack detection method based on image processing and storage medium
CN114926511A (en) High-resolution remote sensing image change detection method based on self-supervision learning
CN111985381B (en) Guidance area dense crowd counting method based on flexible convolution neural network
US11348349B2 (en) Training data increment method, electronic apparatus and computer-readable medium
CN114359286A (en) Insulator defect identification method, device and medium based on artificial intelligence
CN115147418A (en) Compression training method and device for defect detection model
Zuo et al. A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields
CN111667461A (en) Method for detecting abnormal target of power transmission line
Yang et al. Semantic segmentation of bridge point clouds with a synthetic data augmentation strategy and graph-structured deep metric learning
CN114494786A (en) Fine-grained image classification method based on multilayer coordination convolutional neural network
CN112801021B (en) Method and system for detecting lane line based on multi-level semantic information
CN110472640B (en) Target detection model prediction frame processing method and device
CN116385466A (en) Method and system for dividing targets in image based on boundary box weak annotation
CN114022787B (en) Machine library identification method based on large-scale remote sensing image
CN113657225B (en) Target detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant