CN112785609B - CBCT tooth segmentation method based on deep learning - Google Patents
CBCT tooth segmentation method based on deep learning
- Publication number: CN112785609B
- Application number: CN202110180002.3A
- Authority: CN (China)
- Prior art keywords: feature map, network, tooth, cbct, image
- Prior art date: 2021-02-07
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/11 — Image analysis; Segmentation; Region-based segmentation
- G06N3/02, G06N3/08 — Computing arrangements based on biological models; Neural networks; Learning methods
- G06T9/002 — Image coding using neural networks
- G06T2207/20081 — Special algorithmic details; Training; Learning
- G06T2207/20084 — Special algorithmic details; Artificial neural networks [ANN]
- G06T2207/20112, G06T2207/20132 — Image segmentation details; Image cropping
- G06T2207/30036 — Biomedical image processing; Dental; Teeth
Abstract
The invention relates to a CBCT tooth segmentation method based on deep learning, and belongs to the field of computer graphics. The method comprises the following steps: S1: preprocessing the cone beam CT (CBCT) image using prior knowledge of tooth images and extracting the tooth part to obtain a region of interest; S2: performing feature extraction on the image through a ResNet-FPN network to obtain a feature map; S3: compressing the spatial and channel dimensions of the feature map with a CBAM module, thereby encoding the importance of the feature map; S4: extracting candidate regions from the feature map with a Region Proposal Network (RPN); S5: pooling the corresponding region of the feature map to a fixed size, according to the position coordinates of the preselected boxes, using ROI Align; S6: performing classification, bounding-box regression, segmentation, and segmentation scoring on the candidate regions. The invention simplifies the segmentation steps and improves tooth segmentation accuracy in CBCT images.
Description
Technical Field
The invention belongs to the field of computer graphics, and relates to a CBCT tooth segmentation method based on deep learning.
Background
At present, oral medicine mainly covers orthodontics, dental implantation, and the diagnosis of oral, maxillofacial and joint diseases. Taking dental implants as an example, doctors currently draw up surgical plans from clinical experience, based on the patient's X-ray panoramic and periapical radiographs. Because a panoramic view cannot faithfully reflect the spatial position of the teeth in the oral cavity and lacks accurate root information, the success rate of the operation is affected and the risk of surgical failure increases.
The development of computer vision and graphics in recent years has made digital oral medicine a reality. The key to digital oral medicine is acquiring and segmenting a complete 3D dental model. Three-dimensional information such as tooth shape and root position is very important for clinical operations such as maxillofacial surgery, root canal therapy, and therapy simulation. However, obtaining a complete 3D model of the teeth is difficult. Currently, two mainstream techniques exist for obtaining 3D dental models: (1) intraoral or desktop scanning; (2) cone beam computed tomography (CBCT). Intraoral or desktop scanning is a convenient way to capture the geometry of the crown surface, but in many cases it cannot obtain the root information needed for accurate diagnosis and treatment. Compared with conventional CT, CBCT offers a smaller radiation dose, shorter scanning time and higher spatial resolution, and it provides comprehensive 3D volume information of all oral tissues, including teeth. Segmenting the teeth from the CBCT image therefore yields a more complete and accurate tooth model. However, CBCT dental images have the following characteristics, which make tooth segmentation very challenging: (1) the gaps between adjacent teeth are small, so the contour lines where adjacent teeth touch are lost in the image; (2) tooth density differs from crown to root, so the gray scale of a single tooth in the CBCT image is not uniform; (3) the root is embedded in the alveolar bone, whose density is similar to that of the root, so the edge is unclear; (4) the contour of the tooth varies topologically between the root and crown portions.
In recent years, many experts and scholars at home and abroad have studied CBCT tooth segmentation intensively, and many algorithms now exist, such as level-set segmentation, threshold segmentation and region growing. However, these methods all require the operator to have strong prior knowledge, and a good segmentation result can only be obtained if the algorithm is initialized very well.
Disclosure of Invention
In view of the above, the present invention provides a CBCT tooth segmentation method based on deep learning. It solves the problem that conventional tooth segmentation methods require good initialization to segment teeth accurately, and realizes end-to-end CBCT tooth segmentation with deep learning: teeth are segmented fully automatically, without user labeling or subsequent post-processing steps, so the segmentation procedure is simplified while teeth are segmented from CBCT images more accurately.
In order to achieve the purpose, the invention provides the following technical scheme:
a CBCT tooth segmentation method based on deep learning comprises the following steps:
S1: preprocessing a cone beam computed tomography (CBCT) image using prior knowledge of tooth images, and extracting the tooth part to obtain a region of interest;
S2: extracting features from the image through a deep residual network (ResNet) and a feature pyramid network (FPN) to obtain a feature map;
S3: sequentially compressing the spatial and channel dimensions of the feature map with a Convolutional Block Attention Module (CBAM), thereby encoding the importance of the feature map;
S4: extracting candidate regions from the feature map with a Region Proposal Network (RPN);
S5: pooling the corresponding regions of the feature map into a fixed-size feature map for subsequent operations, according to the position coordinates of the preselected boxes, using an ROI Align network;
S6: performing classification, bounding-box regression, segmentation, and segmentation scoring on the candidate regions.
Further, in step S1, the CBCT image includes dental images and skull information;
preprocessing the CBCT image includes: cropping the image using prior knowledge of the teeth, the cropped size being 384 × 320, and converting the CBCT scan into a jpg-format image with the MicroDicom medical image processing software to facilitate subsequent processing.
Further, step S2 specifically includes: learning the residual representation between input and output with multiple parameter layers of the ResNet network, which mitigates the gradient vanishing, gradient explosion and network degradation caused by increasing the number of network layers; feeding both the deep and shallow tooth feature maps of the ResNet network into the FPN network, which integrates the feature maps efficiently through bottom-up, top-down and lateral connections, improving accuracy without greatly increasing detection time. Passing the CBCT dental picture through the ResNet and FPN networks yields an optimal combined set of CNN features of the dental picture, and a feature map is output.
Further, in step S3, the CBAM model is an attention-based model that encodes the importance of the feature map, and is divided into two parts: a channel attention module (CAM) and a spatial attention module (SAM).
Further, in step S3, the feature map F obtained in step S2 is input into the CBAM model. The CAM module compresses the feature map F spatially to obtain a channel weight coefficient Mc; multiplying the coefficient Mc by the feature map F gives a new feature map F′. The feature map F′ then enters the SAM module, which compresses F′ along the channel dimension to obtain a spatial weight coefficient Ms; multiplying the coefficient Ms by the feature map F′ gives the corrected feature map F″. The corrected feature map F″ incorporates both channel and spatial weight coefficients, so the network can learn the features of the teeth better.
Further, step S3 specifically includes: compressing the feature map F spatially in the CAM module, specifically: performing spatial compression using max pooling MaxPool(F) and average pooling AvgPool(F) to obtain two different spatial descriptors F^c_max and F^c_avg; a shared network composed of a multi-layer perceptron (MLP) operates on these two descriptors to obtain the channel weight coefficient Mc, calculated as:

Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ(·) denotes the sigmoid function, W1 ∈ R^(C/r×C) and W0 ∈ R^(C×C/r), C being the number of channels and r the reduction ratio;

the obtained channel weight coefficient Mc and the feature map F give the feature map F′, calculated as:

F′ = Mc(F) ⊗ F

where ⊗ denotes element-wise multiplication;

the feature map F′ then enters the SAM module, where max pooling and average pooling compress F′ along the channel dimension to obtain two different spatial descriptors F′^s_avg and F′^s_max; these are concatenated and convolved by a standard convolution layer to generate the two-dimensional spatial weight coefficient Ms, calculated as:

Ms(F′) = σ(f^(7×7)([F′^s_avg; F′^s_max]))

where f^(7×7) denotes a convolution operation with a kernel size of 7 × 7;

the obtained spatial weight coefficient Ms and the feature map F′ give the feature map F″, calculated as:

F″ = Ms(F′) ⊗ F′

where F″ is the feature map corrected by channel attention and spatial attention, i.e., the feature map after importance encoding.
Further, in step S4, the importance-encoded feature map F″ from step S3 is input into an RPN network. By sliding a window over the shared feature map, the network generates K target boxes (anchors) with preset aspect ratios and areas for each position; by default K = 9, the initial target boxes covering three areas (128 × 128, 256 × 256, 512 × 512), each with three aspect ratios (1:1, 1:2, 2:1). The RPN is essentially a tree structure whose trunk is a 3 × 3 convolution layer and whose branches are two 1 × 1 convolution layers. The first 1 × 1 convolution layer produces the foreground/background output: each position corresponds to K target boxes, and each target box has a foreground score and a background score, so the output is 2 × K values. The other 1 × 1 convolution layer produces the box-correction output: each position corresponds to K anchors, and each anchor corresponds to 4 correction coordinates, so the output is 4 × K values. In this network, candidate bounding boxes are extracted for the tooth feature map F″ and bounding-box correction is performed.
Further, in step S5, the result of step S4 is input into the ROI Align network: the ROI region of the original tooth image is mapped onto the tooth feature map, and the extracted region is then pooled and normalized to the input size of the convolutional network. ROI Align uses bilinear interpolation directly for the mapping from the original image to the feature-map ROI, which reduces error, so the pooled result corresponds to the original image more accurately. Given floating-point coordinates (X, Y), the four nearest surrounding points are interpolated twice in the Y direction and once more in the X direction to obtain the new value, and the shape of the ROI is not changed.
Further, in step S6, during segmentation, the fixed-size feature map generated by the ROI Align network undergoes 4 convolution operations to produce a feature map of size 14 × 14; upsampling then produces a feature map of size 28 × 28; finally, a convolution operation produces a feature map of size 28 × 28 with depth 80;
For the segmentation score, a correction network with MaskIoU scoring is used; the MaskIoU head aims to regress IoU between the predicted tooth mask and its ground-truth mask. The input of MaskIoU consists of two parts: the RoI feature map obtained by ROI Align and the mask output by the mask branch. After the two are concatenated, MaskIoU is output through 3 convolution layers and 2 fully connected layers. This network module regresses MaskIoU from the instance features and the corresponding predicted mask; it can accurately evaluate the score of the tooth segmentation task and improves the segmentation results.
The invention has the following beneficial effects: the invention is based on a deep learning method and uses a two-stage, deeply supervised neural network. A Convolutional Block Attention Module (CBAM) is introduced to compress the spatial and channel dimensions of the feature map in turn, thereby encoding the importance of the feature map; a network that scores the segmentation result is adopted, regressing IoU between the predicted mask and its ground-truth mask; and the final score is determined jointly by the classification score and the segmentation score, so that more accurately segmented teeth are obtained.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of the deep learning-based CBCT tooth segmentation method according to the present invention;
FIG. 2 is a schematic overall framework diagram of the deep learning-based CBCT tooth segmentation method according to the present invention;
FIG. 3 is an image of a dental crown segmented from CBCT data in accordance with the present invention;
FIG. 4 is an image of a tooth root segmented from CBCT data according to the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, it should be understood that terms such as "upper", "lower", "left", "right", "front" and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience and simplification of description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Terms describing positional relationships in the drawings are therefore illustrative only and are not to be construed as limiting the present invention; their specific meaning can be understood by those skilled in the art according to the specific situation.
Referring to fig. 1 to 4, the CBCT tooth segmentation method based on deep learning according to the present invention includes the following steps:
S1: preprocessing a cone beam computed tomography (CBCT) image using prior knowledge of tooth images, extracting the tooth part and obtaining a region of interest, specifically: the whole cone beam computed tomography image contains not only the tooth image but also skull information; the image is cropped using prior knowledge of the teeth, the cropped size being 384 × 320, and the CBCT scan is converted into a jpg-format picture with the MicroDicom medical image processing software to facilitate subsequent processing;
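For illustration, the preprocessing step can be sketched as follows. The patent uses the MicroDicom GUI for format conversion; this sketch instead assumes programmatic access with pydicom and Pillow, and the crop_box coordinates are hypothetical placeholders — only the 384 × 320 crop size comes from the description.

```python
import numpy as np
import pydicom                      # assumption: DICOM slices are read with pydicom
from PIL import Image

def preprocess_slice(dicom_path, out_path, crop_box=(60, 100, 444, 420)):
    """Crop a CBCT slice to the tooth region and save it as JPEG.

    crop_box is a hypothetical (left, top, right, bottom) window chosen from
    prior knowledge of where the teeth lie; it yields a 384 x 320 crop.
    """
    ds = pydicom.dcmread(dicom_path)
    img = ds.pixel_array.astype(np.float32)
    # Normalize the CT intensities to 0-255 for JPEG export.
    img = (img - img.min()) / (img.max() - img.min() + 1e-8) * 255.0
    left, top, right, bottom = crop_box
    crop = img[top:bottom, left:right]          # 320 rows x 384 columns
    Image.fromarray(crop.astype(np.uint8)).save(out_path, "JPEG")
```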
S2: in extracting tooth features with a deep residual network (ResNet), multiple parameter layers learn the residual representation between input and output, which mitigates the gradient vanishing, gradient explosion and network degradation caused by increasing the number of network layers. The feature pyramid network (FPN) uses not only the deep tooth feature maps of the ResNet network but also the shallow ones, and integrates the feature maps efficiently through bottom-up, top-down and lateral connections, improving accuracy without greatly increasing detection time. Passing the CBCT dental picture through the ResNet and FPN networks yields an optimal combined set of CNN features of the dental picture, and a feature map is output.
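A minimal sketch of this backbone stage, assuming PyTorch/torchvision; the ResNet depth (resnet50) and the use of torchvision's ready-made resnet_fpn_backbone are assumptions, since the patent only names ResNet and FPN generically.

```python
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet-50 with an FPN on top; depth is an assumption. Depending on the
# torchvision version, weights=None may be required instead of pretrained=False.
backbone = resnet_fpn_backbone("resnet50", pretrained=False)

x = torch.randn(1, 3, 320, 384)      # one preprocessed 384 x 320 tooth slice
feature_maps = backbone(x)           # OrderedDict of multi-scale feature maps
for name, fmap in feature_maps.items():
    print(name, tuple(fmap.shape))   # e.g. '0' -> (1, 256, 80, 96), ..., 'pool'
```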
S3: compressing the feature map successively along its spatial and channel dimensions with the Convolutional Block Attention Module (CBAM), thereby encoding the importance of the feature map. The feature map obtained in the previous step (hereinafter denoted F) is input into the CBAM model, an attention-based model that encodes the importance of the feature map and is divided into two parts: the channel attention module (CAM) and the spatial attention module (SAM). In the CAM module, the feature map is compressed spatially using max pooling and average pooling, yielding two different spatial descriptors F^c_max and F^c_avg; a shared network consisting of a multi-layer perceptron (MLP) operates on the two descriptors to obtain the channel weight coefficient Mc, calculated as:

Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ(·) denotes the sigmoid function, W1 ∈ R^(C/r×C) and W0 ∈ R^(C×C/r), C being the number of channels and r the reduction ratio;

the obtained channel weight coefficient Mc and the feature map F give the feature map F′, calculated as:

F′ = Mc(F) ⊗ F

The feature map F′ then enters the SAM spatial attention module, which compresses the feature map along the channel dimension using max pooling and average pooling, yielding two different spatial descriptors F′^s_avg and F′^s_max; these are concatenated and convolved by a standard convolution layer to produce the two-dimensional spatial weight coefficient Ms, calculated as:

Ms(F′) = σ(f^(7×7)([F′^s_avg; F′^s_max]))

where f^(7×7) denotes a convolution operation with a kernel size of 7 × 7.

The obtained spatial weight coefficient Ms and the feature map F′ give the feature map F″, calculated as:

F″ = Ms(F′) ⊗ F′

where F″ is the feature map corrected by channel attention and spatial attention, i.e., the feature map after importance encoding. The positions of teeth in computed tomography images are relatively fixed, and encoding the importance of the tooth features over space and channels lets the network learn tooth features better.
S4: extracting candidate regions from the feature map with a Region Proposal Network (RPN);
The importance-encoded feature map F″ from the previous step is input into the region proposal network. By sliding a window over the shared feature map, the network generates K bounding boxes (anchors) with preset aspect ratios and areas for each position. The default K is 9, but because tooth targets are small, K is set to 1 in this implementation; the initial bounding boxes cover three areas (128 × 128, 256 × 256, 512 × 512), each with three aspect ratios (1:1, 1:2, 2:1). The RPN network is essentially a tree structure whose trunk is one 3 × 3 convolution layer and whose branches are two 1 × 1 convolution layers. The first 1 × 1 convolution layer produces the foreground/background output: each position corresponds to 1 target box, and each target box has a foreground score and a background score, so the output is 2 values. The other 1 × 1 convolution layer produces the box-correction output: each position corresponds to 1 anchor, and each anchor corresponds to 4 correction coordinates, so the output is 4 values. In this network, candidate bounding boxes are extracted for the tooth feature map F″ and bounding-box correction is performed.
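The tree-structured RPN head described here (one 3 × 3 trunk convolution and two 1 × 1 branch convolutions) can be sketched as follows; the channel width of 256 is an assumed value, and k matches the anchor count per position (9 by default, 1 in this embodiment).

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """3x3 'trunk' convolution with two 1x1 'branch' convolutions:
    one for foreground/background scores, one for box corrections."""

    def __init__(self, in_channels=256, k=1):
        super().__init__()
        self.trunk = nn.Conv2d(in_channels, in_channels, 3, padding=1)
        self.cls = nn.Conv2d(in_channels, 2 * k, 1)   # 2 scores per anchor
        self.reg = nn.Conv2d(in_channels, 4 * k, 1)   # 4 corrections per anchor

    def forward(self, fmap):
        t = torch.relu(self.trunk(fmap))
        return self.cls(t), self.reg(t)
```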
S5: pooling the corresponding regions of the feature map into a fixed-size feature map for subsequent operations, according to the position coordinates of the preselected boxes, using ROI Align;
The result of the previous step is input into the ROI Align network. The network first maps the ROI region of the original image onto the feature-map ROI, and then pools the region, normalizing it to the input size of the convolutional network. ROI Align uses bilinear interpolation directly for the mapping from the original image to the feature-map ROI, without rounding, so the error is small and the pooled result corresponds to the original image more accurately. Given floating-point coordinates (X, Y), the four nearest surrounding points are interpolated twice in the Y direction and once more in the X direction to obtain the new value, and the shape of the ROI is not changed.
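As a usage sketch, torchvision's roi_align performs this quantization-free bilinear pooling; the feature shape, proposal coordinates and stride below are illustrative values, not ones taken from the patent.

```python
import torch
from torchvision.ops import roi_align

# fmap: one FPN level, e.g. stride-4 features of a 384 x 320 slice.
fmap = torch.randn(1, 256, 80, 96)
# One proposal in original-image coordinates: (batch_index, x1, y1, x2, y2).
rois = torch.tensor([[0, 120.0, 80.0, 200.0, 180.0]])
# spatial_scale maps image coordinates onto this feature map (1/4 here);
# bilinear sampling is used internally, with no coordinate rounding.
pooled = roi_align(fmap, rois, output_size=(7, 7), spatial_scale=0.25,
                   sampling_ratio=2, aligned=True)
print(pooled.shape)   # torch.Size([1, 256, 7, 7])
```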
S6: performing classification, bounding-box regression, segmentation, and segmentation scoring on the candidate regions.
Multi-class classification, bounding-box regression, instance segmentation and segmentation scoring are performed on the tooth ROI region. During instance segmentation, the fixed-size feature map of the tooth ROI generated by the ROI Align operation undergoes 4 convolution operations to produce a feature map of size 14 × 14; upsampling then produces a feature map of size 28 × 28; finally, a convolution operation produces a feature map of size 28 × 28 with depth 80. For the segmentation score, a correction network with MaskIoU scoring is used; the MaskIoU head aims to regress IoU between the predicted mask and its ground-truth mask. The input of MaskIoU consists of two parts: the RoI feature map obtained by ROI Align and the mask output by the mask branch. After the two are concatenated, MaskIoU is output through 3 convolution layers and 2 fully connected layers. This network module regresses MaskIoU from the instance features and the corresponding predicted mask. The module comes in three types: the first learns MaskIoU only for the target class and ignores the other classes in the proposal; the second learns MaskIoU for all classes — if a class does not appear in the RoI, its target MaskIoU is set to 0, and this setup uses only regression to predict MaskIoU, requiring the regressor to be aware of irrelevant classes; the third learns MaskIoU for all foreground classes, where a foreground class is one that appears in the RoI region, and the remaining classes in the proposal are ignored. For tooth segmentation we need the third type, because in instance segmentation different teeth belong to different classes that must be segmented.
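A sketch of the two heads just described, assuming PyTorch; the RoI feature width (256), the max-pool used to shrink the predicted mask to 14 × 14, and the fully connected width (1024) are assumptions beyond what the patent specifies (4 convolutions + upsampling + a 28 × 28 × 80 output for the mask branch; 3 convolutions + 2 fully connected layers for MaskIoU).

```python
import torch
import torch.nn as nn

class MaskHead(nn.Module):
    """Mask branch: four 3x3 convolutions on 14x14 RoI features, a 2x
    upsampling to 28x28, then a final convolution producing 80-channel
    (per-class) mask logits."""

    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        convs = []
        for _ in range(4):
            convs += [nn.Conv2d(in_channels, in_channels, 3, padding=1),
                      nn.ReLU(inplace=True)]
        self.convs = nn.Sequential(*convs)
        self.up = nn.ConvTranspose2d(in_channels, in_channels, 2, stride=2)
        self.logits = nn.Conv2d(in_channels, num_classes, 1)

    def forward(self, roi_feats):                  # (N, 256, 14, 14)
        x = self.convs(roi_feats)
        x = torch.relu(self.up(x))                 # (N, 256, 28, 28)
        return self.logits(x)                      # (N, 80, 28, 28)

class MaskIoUHead(nn.Module):
    """MaskIoU branch: RoI features concatenated with the (downsampled)
    predicted mask pass through 3 convolutions and 2 fully connected
    layers to regress a per-class IoU."""

    def __init__(self, in_channels=257, num_classes=80):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.fc = nn.Sequential(
            nn.Flatten(), nn.Linear(256 * 7 * 7, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),
        )

    def forward(self, roi_feats, mask):            # (N,256,14,14), (N,1,28,28)
        mask_small = nn.functional.max_pool2d(mask, 2)   # -> (N, 1, 14, 14)
        x = torch.cat([roi_feats, mask_small], dim=1)    # 257 channels
        return self.fc(self.convs(x))              # per-class MaskIoU
```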
Let s_mask be defined as the score of the predicted tooth mask. The ideal s_mask equals the pixel-level IoU between the predicted mask and its matched ground-truth mask — the MaskIoU mentioned earlier. The ideal s_mask should be positive only for the ground-truth class and zero for all other classes, because a mask belongs to only one class. This requires the mask score s_mask to do well on two tasks: classifying the mask into the correct category, and regressing the proposal's MaskIoU for the foreground object class. It is difficult to train both tasks with a single objective function, so the mask score learning task is decomposed into mask classification and IoU regression for all object classes, expressed as s_mask = s_cls · s_iou. Here s_cls focuses on classifying which category a proposal belongs to, which is done in the classification task of the R-CNN stage, so the corresponding classification score can be taken directly; s_iou focuses on regressing MaskIoU and is obtained from the MaskIoU scoring correction network described above. Determining the score of the whole tooth segmentation from the classification score and the segmentation score together makes the score more accurate and improves segmentation accuracy.
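Numerically, the decomposition is a simple product; the values below are hypothetical:

```python
# Hypothetical scores for one detected tooth: classification confidence from
# the R-CNN head and the regressed MaskIoU from the scoring head.
s_cls, s_iou = 0.92, 0.85
s_mask = s_cls * s_iou   # final mask score: 0.782
```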
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.
Claims (5)
1. A CBCT tooth segmentation method based on deep learning is characterized by comprising the following steps:
S1: preprocessing a cone beam computed tomography (CBCT) image using prior knowledge of tooth images, and extracting the tooth part to obtain a region of interest;
S2: extracting features from the image through a deep residual network (ResNet) and a feature pyramid network (FPN) to obtain a feature map, specifically: learning residual representations between inputs and outputs using multiple parameter layers of the ResNet network; feeding the deep and shallow tooth feature maps of the ResNet network into the FPN network, integrating the feature maps through bottom-up, top-down and lateral connections to obtain an optimal combined set of CNN features of the tooth picture, and outputting the feature map;
S3: sequentially compressing the feature map with a Convolutional Block Attention Module (CBAM) to encode the importance of the feature map, the CBAM model being divided into two parts: a channel attention module (CAM) and a spatial attention module (SAM);
inputting the feature map F obtained in step S2 into the CBAM model; performing spatial compression on the feature map F in the CAM module to obtain a channel weight coefficient Mc(F); multiplying the coefficient Mc(F) by the feature map F to obtain a new feature map F′; the feature map F′ then entering the SAM module, where channel compression of F′ yields a spatial weight coefficient Ms(F′); multiplying the coefficient Ms(F′) by the feature map F′ to obtain the corrected feature map F″;
S4: extracting candidate regions from the feature map compressed in step S3 with a Region Proposal Network (RPN);
S5: pooling the corresponding regions of the feature map processed in step S4 into a fixed-size feature map, according to the position coordinates of the preselected boxes, using the ROI Align network;
S6: performing classification, bounding-box regression, segmentation, and segmentation scoring on the candidate regions;
during segmentation, performing 4 convolution operations on the fixed-size feature map generated by the ROI Align network, then upsampling, and finally performing a convolution operation;
for the segmentation score, using a correction network with MaskIoU scoring to regress IoU between the tooth mask and its ground-truth mask; the input of MaskIoU consisting of two parts, namely the RoI feature map obtained by ROI Align and the mask output by the mask branch; after the two are concatenated, MaskIoU being output through 3 convolution layers and 2 fully connected layers.
2. The CBCT tooth segmentation method based on deep learning of claim 1, wherein in step S1 the CBCT image includes dental images and skull information;
preprocessing the CBCT image includes: cropping the image using prior knowledge of the teeth, and converting the CBCT scan into a jpg-format image.
3. The CBCT tooth segmentation method based on deep learning of claim 1, wherein step S3 specifically includes: compressing the feature map F spatially in the CAM module, specifically: performing spatial compression using max pooling MaxPool(F) and average pooling AvgPool(F) to obtain two different spatial descriptors F^c_max and F^c_avg; a shared network composed of a multi-layer perceptron (MLP) operating on the two descriptors to obtain the channel weight coefficient Mc(F), calculated as:

Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W1(W0(F^c_avg)) + W1(W0(F^c_max)))

where σ(·) denotes the sigmoid function, W1 ∈ R^(C/r×C) and W0 ∈ R^(C×C/r), C being the number of channels and r the reduction ratio;

performing element-wise multiplication of the obtained channel weight coefficient Mc(F) with the feature map F to obtain the feature map F′, calculated as:

F′ = Mc(F) ⊗ F

the feature map F′ entering the SAM module, where max pooling and average pooling compress F′ along the channel dimension to obtain two different spatial descriptors F′^s_avg and F′^s_max, which are concatenated and convolved by a standard convolution layer to generate the two-dimensional spatial weight coefficient Ms(F′), calculated as:

Ms(F′) = σ(f^(7×7)([F′^s_avg; F′^s_max]))

where f^(7×7) denotes a convolution operation with a kernel size of 7 × 7;

performing element-wise multiplication of the obtained spatial weight coefficient Ms(F′) with the feature map F′ to obtain the feature map F″, calculated as:

F″ = Ms(F′) ⊗ F′

wherein F″ is the feature map corrected by channel attention and spatial attention.
4. The deep-learning-based CBCT tooth segmentation method of claim 1, wherein in step S4 the importance-encoded feature map F″ from step S3 is input into the RPN network; by sliding a window over the shared feature map, the network generates, for each position, K target boxes (anchors) with preset aspect ratios and areas, the initial target boxes covering three areas, each with three aspect ratios (1:1, 1:2, 2:1); the RPN network has a tree structure, the trunk being one 3 × 3 convolution layer and the branches being two 1 × 1 convolution layers.
5. The CBCT tooth segmentation method based on deep learning of claim 4, wherein in step S5 the result of step S4 is input into the ROI Align network; the ROI region of the original tooth image is first mapped onto the feature-map ROI, and the extracted region is then pooled and normalized to the input size of the convolutional network; ROI Align uses bilinear interpolation directly for the mapping from the original image to the feature-map ROI; given floating-point coordinates (X, Y), the four nearest surrounding points are interpolated twice in the Y direction and once more in the X direction to obtain a new value, and the shape of the ROI is not changed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110180002.3A CN112785609B (en) | 2021-02-07 | 2021-02-07 | CBCT tooth segmentation method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110180002.3A CN112785609B (en) | 2021-02-07 | 2021-02-07 | CBCT tooth segmentation method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112785609A CN112785609A (en) | 2021-05-11 |
CN112785609B true CN112785609B (en) | 2022-06-03 |
Family
ID=75761447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110180002.3A Active CN112785609B (en) | 2021-02-07 | 2021-02-07 | CBCT tooth segmentation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112785609B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113344950A (en) * | 2021-07-28 | 2021-09-03 | 北京朗视仪器股份有限公司 | CBCT image tooth segmentation method combining deep learning with point cloud semantics |
CN114332107A (en) * | 2021-12-01 | 2022-04-12 | 石家庄铁路职业技术学院 | Improved tunnel lining water leakage image segmentation method |
CN114332463A (en) * | 2021-12-31 | 2022-04-12 | 成都工业职业技术学院 | MR brain tumor image example segmentation method, device, equipment and storage medium |
CN114187293B (en) * | 2022-02-15 | 2022-06-03 | 四川大学 | Oral cavity palate part soft and hard tissue segmentation method based on attention mechanism and integrated registration |
CN114549559A (en) * | 2022-03-01 | 2022-05-27 | 上海博恩登特科技有限公司 | Post-processing method and system for segmenting tooth result based on CBCT (Cone Beam computed tomography) data AI (Artificial Intelligence) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110930421A (en) * | 2019-11-22 | 2020-03-27 | 电子科技大学 | Segmentation method for CBCT (Cone Beam computed tomography) tooth image |
CN111968120A (en) * | 2020-07-15 | 2020-11-20 | 电子科技大学 | Tooth CT image segmentation method for 3D multi-feature fusion |
CN112017196A (en) * | 2020-08-27 | 2020-12-01 | 重庆邮电大学 | Three-dimensional tooth model mesh segmentation method based on local attention mechanism |
CN112150472A (en) * | 2020-09-24 | 2020-12-29 | 北京羽医甘蓝信息技术有限公司 | Three-dimensional jaw bone image segmentation method and device based on CBCT (cone beam computed tomography) and terminal equipment |
CN112257758A (en) * | 2020-09-27 | 2021-01-22 | 浙江大华技术股份有限公司 | Fine-grained image recognition method, convolutional neural network and training method thereof |
CN112308867A (en) * | 2020-11-10 | 2021-02-02 | 上海商汤智能科技有限公司 | Tooth image processing method and device, electronic equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100158332A1 (en) * | 2008-12-22 | 2010-06-24 | Dan Rico | Method and system of automated detection of lesions in medical images |
US8761493B2 (en) * | 2011-07-21 | 2014-06-24 | Carestream Health, Inc. | Method and system for tooth segmentation in dental images |
US11645746B2 (en) * | 2018-11-28 | 2023-05-09 | Orca Dental AI Ltd. | Dental image segmentation and registration with machine learning |
- 2021-02-07: application CN202110180002.3A filed in China; granted as CN112785609B (status: Active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110930421A (en) * | 2019-11-22 | 2020-03-27 | 电子科技大学 | Segmentation method for CBCT (Cone Beam computed tomography) tooth image |
CN111968120A (en) * | 2020-07-15 | 2020-11-20 | 电子科技大学 | Tooth CT image segmentation method for 3D multi-feature fusion |
CN112017196A (en) * | 2020-08-27 | 2020-12-01 | 重庆邮电大学 | Three-dimensional tooth model mesh segmentation method based on local attention mechanism |
CN112150472A (en) * | 2020-09-24 | 2020-12-29 | 北京羽医甘蓝信息技术有限公司 | Three-dimensional jaw bone image segmentation method and device based on CBCT (cone beam computed tomography) and terminal equipment |
CN112257758A (en) * | 2020-09-27 | 2021-01-22 | 浙江大华技术股份有限公司 | Fine-grained image recognition method, convolutional neural network and training method thereof |
CN112308867A (en) * | 2020-11-10 | 2021-02-02 | 上海商汤智能科技有限公司 | Tooth image processing method and device, electronic equipment and storage medium |
Non-Patent Citations (5)
Title |
---|
Yuma Miki et al., "Classification of teeth in cone-beam CT using deep convolutional neural network," Computers in Biology and Medicine, vol. 80, 2017. *
Zhiming Cui et al., "ToothNet: Automatic Tooth Instance Segmentation and Identification from Cone Beam CT Images," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. *
Wang Ge et al., "Tooth CT image segmentation technology based on level sets," Journal of Computer Applications, vol. 36, no. 3, 2016. *
Gou Miao, "Research on segmentation based on dental CT image data," China Master's Theses Full-text Database (Medicine & Health Sciences), 2020. *
Qian Jiahong et al., "Fast CBCT image tooth segmentation algorithm based on a regional energy function," Journal of Computer-Aided Design & Computer Graphics, vol. 30, no. 6, 2018. *
Also Published As
Publication number | Publication date |
---|---|
CN112785609A (en) | 2021-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112785609B (en) | CBCT tooth segmentation method based on deep learning | |
JP7451406B2 (en) | Automatic 3D root shape prediction using deep learning methods | |
US11735306B2 (en) | Method, system and computer readable storage media for creating three-dimensional dental restorations from two dimensional sketches | |
US11464467B2 (en) | Automated tooth localization, enumeration, and diagnostic system and method | |
US20200350059A1 (en) | Method and system of teeth alignment based on simulating of crown and root movement | |
Tian et al. | DCPR-GAN: dental crown prosthesis restoration using two-stage generative adversarial networks | |
US11443423B2 (en) | System and method for constructing elements of interest (EoI)-focused panoramas of an oral complex | |
Zanjani et al. | Mask-MCNet: Tooth instance segmentation in 3D point clouds of intra-oral scans | |
US20220084267A1 (en) | Systems and Methods for Generating Quick-Glance Interactive Diagnostic Reports | |
Tian et al. | Efficient computer-aided design of dental inlay restoration: a deep adversarial framework | |
WO2022213654A1 (en) | Ultrasonic image segmentation method and apparatus, terminal device, and storage medium | |
CN107203998A (en) | A kind of method that denture segmentation is carried out to pyramidal CT image | |
CN115205469A (en) | Tooth and alveolar bone reconstruction method, equipment and medium based on CBCT | |
CN114757960A (en) | Tooth segmentation and reconstruction method based on CBCT image and storage medium | |
US20220358740A1 (en) | System and Method for Alignment of Volumetric and Surface Scan Images | |
Du et al. | Mandibular canal segmentation from CBCT image using 3D convolutional neural network with scSE attention | |
Tian et al. | Efficient tooth gingival margin line reconstruction via adversarial learning | |
Chen et al. | Detection of various dental conditions on dental panoramic radiography using Faster R-CNN | |
US20230252748A1 (en) | System and Method for a Patch-Loaded Multi-Planar Reconstruction (MPR) | |
CN113393470A (en) | Full-automatic tooth segmentation method | |
US20230419631A1 (en) | Guided Implant Surgery Planning System and Method | |
CN112201349A (en) | Orthodontic operation scheme generation system based on artificial intelligence | |
CN116797731A (en) | Artificial intelligence-based oral cavity CBCT image section generation method | |
CN116421341A (en) | Orthognathic surgery planning method, orthognathic surgery planning equipment, orthognathic surgery planning storage medium and orthognathic surgery navigation system | |
Anusree et al. | A Deep Learning Approach to Generating Flattened CBCT Volume Across Dental Arch From 2D Panoramic X-ray for 3D Oral Cavity Reconstruction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |