CN114549523A

CN114549523A - Automatic detection method for median supernumerary teeth in panoramic radiographs based on a single-stage deep network

Info

Publication number: CN114549523A
Application number: CN202210436787.0A
Authority: CN
Inventors: 戴修斌; 蒋昕; 朱书进; 冒添逸; 刘天亮
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-04-25
Filing date: 2022-04-25
Publication date: 2022-05-27

Abstract

The invention discloses a method for automatically detecting median supernumerary teeth (mesiodens) in panoramic radiographs based on a single-stage deep network, comprising the following steps: acquiring a new panoramic radiograph; inputting the new panoramic radiograph into a trained single-stage deep network model to obtain the positions of a plurality of candidate bounding boxes and the target confidence that each contains the target class, median supernumerary teeth; and screening a final bounding box from the plurality of candidate bounding boxes by non-maximum suppression to obtain the final localization of the median supernumerary teeth. By scanning the panoramic radiograph only once, the invention quickly and automatically identifies whether the image contains median supernumerary teeth and gives their position, thereby avoiding the effect of differences in clinicians' experience on accuracy and helping clinicians diagnose median supernumerary teeth quickly and correctly.

Description

Automatic detection method for median supernumerary teeth in panoramic radiographs based on a single-stage deep network

Technical Field

The invention relates to a method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network, and belongs to the technical field of computer vision prediction.

Background

Supernumerary teeth are teeth or tooth-like tissue present in the jaw in addition to the 20 teeth of the deciduous dentition or the 32 teeth of the permanent dentition, and are a common developmental anomaly of tooth number. Supernumerary teeth often cause complications such as cystic lesions, delayed or ectopic eruption of adjacent teeth, and root resorption or rotation of adjacent teeth. Because most supernumerary teeth are impacted in the jaw, two-dimensional panoramic radiographs are a common means of diagnosing and locating them. Compared with cone beam computed tomography (CBCT), panoramic radiography has the advantages of a lower radiation dose and lower cost. At present, clinicians mainly rely on manual reading to determine from a panoramic radiograph whether median supernumerary teeth are present. However, the accuracy of manual diagnosis is strongly affected by differences in clinicians' experience. Developing an algorithm that automatically detects median supernumerary teeth in panoramic radiographs therefore has important clinical value.

Given the excellent performance of deep learning in computer vision, many researchers have begun to use deep convolutional neural networks for intelligent detection and diagnosis in dentistry. For example, Fukuda et al. built a CNN-based deep learning model with DetectNet to detect root fractures, and performed five-fold cross-validation to guard against selection bias in the test data and improve reliability. Ekert et al. proposed a 7-layer CNN model to detect apical lesions on panoramic dental radiographs, trained and validated over 10 repeated reshuffles of the data. Inspired by this work, Kuwada et al. created three learning models using AlexNet, VGG-16 and DetectNet for automatic detection of median supernumerary teeth in panoramic radiographs; the models were trained on panoramic radiographs of the maxillary incisor region, and the classification performance of the three deep learning systems on maxillary supernumerary teeth was verified and compared in patients with full permanent dentition. Although this work performs well on the classification task, it has some deficiencies. An important step of the Kuwada method is to manually crop, from the test image, an image patch covering the possible location of the lesion and feed it to the network before classification begins. The size of the cropped patch is also set manually from experience rather than from the image content, so errors often arise in clinical diagnosis and data annotation. In addition, feeding image patches that contain only local information into the training and testing stages may lose critical global information.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides a method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network.

To achieve the above object, the invention provides a method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network, comprising:

acquiring a new panoramic radiograph;

inputting the new panoramic radiograph into a trained single-stage deep network model to obtain the positions of a plurality of candidate bounding boxes and the target confidence that each contains the target class, median supernumerary teeth;

and screening a final bounding box from the plurality of candidate bounding boxes by non-maximum suppression to obtain the final localization of the median supernumerary teeth.

Preferably, training to obtain the single-stage deep network model comprises:

(1) augmenting a training set, wherein the training set comprises panoramic radiographs, cone beam CT images, classification annotation information and localization annotation information;

(2) constructing the single-stage deep network model by iterative training on the training set;

in each training iteration, automatic anchor learning and an Adam optimizer are adopted, and the initial values of the hyperparameters are set;

(3) performing a hyperparameter search with a genetic algorithm, training the single-stage deep network model for the set number of epochs, and ranking all hyperparameter combinations by the total loss function to obtain the optimal hyperparameter combination.
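Step (3) can be illustrated with a minimal sketch of a mutation-based genetic search. It assumes that each candidate is produced by randomly scaling the best combination found so far, which the text does not specify; `evolve_hyperparameters` and `toy_total_loss` are hypothetical names, and the toy loss merely stands in for "train the model for the set number of epochs and return the total loss".

```python
import random


def evolve_hyperparameters(evaluate, initial, generations=20, sigma=0.2, seed=0):
    """Evaluate one mutated child per generation and rank every combination by loss."""
    rng = random.Random(seed)
    history = [(evaluate(initial), dict(initial))]
    for _ in range(generations):
        # Parent: the combination with the lowest total loss seen so far.
        _, parent = min(history, key=lambda item: item[0])
        # Child: every value of the parent scaled by a random positive factor.
        child = {name: value * max(0.01, 1.0 + rng.gauss(0.0, sigma))
                 for name, value in parent.items()}
        history.append((evaluate(child), child))
    # Rank all evaluated combinations by ascending total loss.
    history.sort(key=lambda item: item[0])
    return history


def toy_total_loss(hp):
    # Hypothetical stand-in for training the network and measuring the total loss.
    return (hp["lr"] - 0.01) ** 2 + (hp["depth"] - 1.0) ** 2


ranked = evolve_hyperparameters(toy_total_loss, {"lr": 0.0032, "depth": 1.66})
best_loss, best_hp = ranked[0]
print(best_loss, best_hp)
```

The first entry of the ranked list plays the role of the optimal hyperparameter combination; in a real run the evaluation function would be far more expensive than this toy.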

Preferably, the single-stage deep network model comprises a backbone, a detection end and an output end;

the backbone comprises a focus layer, $S_1$ convolutional layers, $S_2$ cross-stage partial-1 modules, a spatial pyramid pooling module and a cross-stage partial-2 module, which are connected in sequence;

the focus layer applies 4 slicing operations to the input panoramic radiograph, concatenates the slices along the channel dimension, and obtains a feature map whose resolution is reduced to half of the original;

the cross-stage partial-2 module comprises an upper branch and a lower branch; the upper branch extracts a feature map through a convolution-batch normalization-LeakyReLU module, a residual module and a convolutional layer; the lower branch extracts a feature map through a convolutional layer; the feature maps of the two branches are concatenated along the channel dimension and then passed through a batch normalization module, a LeakyReLU activation module and a convolution-batch normalization-LeakyReLU module;

the detection end comprises a three-layer feature pyramid and a three-layer path aggregation network, the three-layer feature pyramid being connected in parallel with the three-layer path aggregation network;

and the output end encodes the multi-scale hierarchical feature maps and, after a convolution operation, outputs the positions of a plurality of candidate bounding boxes and the target confidence of each candidate bounding box.
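The focus-layer slicing described above can be sketched in plain Python. The sketch assumes that the 4 slices are the four interleaved every-other-pixel samplings of the input, which the text does not state explicitly; `focus_slice` is a hypothetical helper, and the image is held as nested lists rather than a tensor.

```python
def focus_slice(image):
    """image: list of channels, each a list of rows (H x W, with H and W even).

    Returns 4 times as many channels, each with half the height and half the width.
    """
    out = []
    # Four interleaved samplings, given as (row offset, column offset).
    for dy, dx in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for channel in image:
            out.append([row[dx::2] for row in channel[dy::2]])
    return out


image = [[[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]]  # 1 channel, 4 x 4 pixels
sliced = focus_slice(image)
print(len(sliced), len(sliced[0]), len(sliced[0][0]))  # channels, height, width
```

No pixel is discarded: the spatial resolution is halved while the channel count grows fourfold, so the subsequent convolutional layers still see all of the input information.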

Preferably, the localization annotation information and the classification annotation information of the training set are obtained by the following steps:

marking the region of the panoramic radiograph that contains median supernumerary teeth as the ROI with a bounding box;

denoting by $R_i$ the ROI of panoramic radiograph $i$, extracting from the panoramic radiographs the center-point coordinates $(x_i, y_i)$ and the width and height $(w_i, h_i)$ of every $R_i$ as the localization annotation information;

and judging whether median supernumerary teeth are present from the cone beam CT image acquired for the case corresponding to the panoramic radiograph, and using the result as the classification annotation information of the panoramic radiograph.
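The annotation record described above can be sketched as follows. The sketch assumes that the annotator draws each bounding box as pixel corners (x_min, y_min, x_max, y_max), which the text does not specify; `corners_to_center`, `make_label` and the dictionary keys are hypothetical names.

```python
def corners_to_center(x_min, y_min, x_max, y_max):
    """Convert a corner-format box to center coordinates plus width and height."""
    w = x_max - x_min
    h = y_max - y_min
    return (x_min + w / 2.0, y_min + h / 2.0, w, h)


def make_label(box_corners, cbct_confirms_mesiodens):
    """Combine the localization annotation with the CBCT-based class annotation."""
    x, y, w, h = corners_to_center(*box_corners)
    return {"x": x, "y": y, "w": w, "h": h,
            "cls": 1 if cbct_confirms_mesiodens else 0}


label = make_label((1400, 600, 1500, 760), True)
print(label)
```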

Preferably, the total loss function $L_{total}$ is expressed as:

$$L_{total} = L_{box} + L_{obj} + L_{cls},$$

where $L_{box}$ is the generalized intersection-over-union (GIOU) loss, $L_{obj}$ is the target loss, and $L_{cls}$ is the predicted object class loss.

Preferably, $L_{obj}$ and $L_{cls}$ are both binary cross-entropy loss functions $L_{BCE}$;

$$GIOU = IOU - \frac{|C \setminus (B_{label} \cup B_{pred})|}{|C|},$$

$$IOU = \frac{|B_{label} \cap B_{pred}|}{|B_{label} \cup B_{pred}|},$$

$$L_{box} = 1 - GIOU,$$

where $B_{label}$ is the annotated bounding box, $B_{pred}$ is the predicted candidate bounding box, and $C$ is the smallest box enclosing $B_{label}$ and $B_{pred}$; $IOU$ is the ratio of the intersection to the union of the predicted candidate bounding box and the annotated bounding box;

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right],$$

where $i \in [1, N]$, $N$ is the number of samples in the training set, and $y_i \in \{0, 1\}$ is the ground-truth class label of sample $i$: 1 if the bounding box contains median supernumerary teeth and 0 if it does not;

$\hat{y}_i \in [0, 1]$ is the target confidence, predicted by the single-stage multi-scale deep learning network model, that sample $i$ contains median supernumerary teeth.
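The three loss terms above can be sketched in plain Python. The sketch assumes corner-format boxes (x1, y1, x2, y2) with positive area and uses the mean form of binary cross-entropy; the function names and the clipping constant `eps` are hypothetical choices, not taken from the text.

```python
import math


def area(box):
    """Area of a corner-format box; 0 if the box is empty."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def giou(b_label, b_pred):
    """Generalized IoU of two boxes, using the smallest enclosing box C."""
    inter = area((max(b_label[0], b_pred[0]), max(b_label[1], b_pred[1]),
                  min(b_label[2], b_pred[2]), min(b_label[3], b_pred[3])))
    union = area(b_label) + area(b_pred) - inter
    c = area((min(b_label[0], b_pred[0]), min(b_label[1], b_pred[1]),
              max(b_label[2], b_pred[2]), max(b_label[3], b_pred[3])))
    return inter / union - (c - union) / c


def bce(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def total_loss(b_label, b_pred, obj_true, obj_pred, cls_true, cls_pred):
    """L_total = L_box + L_obj + L_cls for one annotated box and its samples."""
    l_box = 1.0 - giou(b_label, b_pred)
    return l_box + bce(obj_true, obj_pred) + bce(cls_true, cls_pred)


print(total_loss((0, 0, 2, 2), (1, 0, 3, 2), [1], [0.9], [1], [0.8]))
```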

Preferably, before the new panoramic radiograph is input into the trained single-stage deep network model, the acquired new panoramic radiograph is divided into an S × S grid;

screening a final bounding box from the plurality of candidate bounding boxes by non-maximum suppression to obtain the final localization of the median supernumerary teeth comprises:

multiplying the target confidence of each candidate bounding box by the confidence that the corresponding grid cell contains the target class, median supernumerary teeth, to obtain the probability that the candidate bounding box contains the target class, median supernumerary teeth;

removing redundant candidate bounding boxes according to these probability values and the IOU between boxes, and determining the final bounding box;

and if the probability that the final bounding box contains the target class, median supernumerary teeth, exceeds a set threshold, accepting the position of the median supernumerary teeth in the new panoramic radiograph indicated by the final bounding box as valid.
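The screening procedure above can be sketched as a greedy non-maximum suppression in plain Python. The IOU threshold and score threshold of 0.5 are assumptions, since the text only refers to "a set threshold"; `detect` and the tuple layout of the candidates are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two corner-format boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detect(candidates, iou_threshold=0.5, score_threshold=0.5):
    """candidates: list of (box, box_confidence, class_confidence) tuples."""
    # Score of a candidate = box confidence times class confidence.
    scored = [(obj * cls, box) for box, obj, cls in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    kept = []
    for score, box in scored:
        # Keep a box only if it does not overlap a higher-scoring kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    # Accept only the surviving boxes whose score exceeds the set threshold.
    return [(score, box) for score, box in kept if score > score_threshold]


candidates = [((100, 100, 200, 200), 0.95, 0.95),
              ((105, 102, 205, 198), 0.80, 0.90),
              ((400, 300, 480, 380), 0.30, 0.40)]
final = detect(candidates)
print(final)
```

In this example the second candidate is suppressed because it overlaps the first with an IOU of about 0.87, and the third survives suppression but is rejected by the score threshold.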

Preferably, augmenting the training set comprises:

applying brightness modification, random pixel cropping, random saturation modification, image flipping and mosaic augmentation to the panoramic radiographs.
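Two of the augmentations listed above can be sketched in plain Python. The sketch assumes grayscale images stored as nested lists and boxes given as pixel center coordinates with width and height; `flip_horizontal` and `adjust_brightness` are hypothetical helpers. The point it illustrates is that a geometric augmentation must update the localization annotation together with the pixels, while a photometric one leaves the boxes unchanged.

```python
def flip_horizontal(image, boxes):
    """Mirror the image left to right and move each box center accordingly.

    image: list of rows of pixel values; boxes: list of (x, y, w, h) tuples.
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x, y, w, h) for x, y, w, h in boxes]
    return flipped, new_boxes


def adjust_brightness(image, factor, max_value=255):
    """Scale every pixel by factor and clip to the valid range; boxes are unaffected."""
    return [[min(max_value, max(0, round(p * factor))) for p in row] for row in image]


image = [[10, 20, 30, 40]]
flipped, boxes = flip_horizontal(image, [(1.0, 0.5, 2.0, 1.0)])
brighter = adjust_brightness([[100, 200]], 1.5)
print(flipped, boxes, brighter)
```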

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the above methods when executing the program.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any of the above.

The invention achieves the following beneficial effects:

using cone beam CT images to judge whether median supernumerary teeth are present adds extra 3D spatial information to the training of the single-stage deep network model and reduces human error when the data set is built;

the training set is expanded by data augmentation; enlarging the training set allows a model that generalizes better to be trained, and the augmented data set effectively helps the single-stage deep network model reduce the negative effects of perturbations such as position, illumination and saturation when it extracts and learns features;

a genetic algorithm is used to search the hyperparameters and finally obtain the optimal hyperparameter combination; hyperparameter evolution lets the single-stage deep network model minimize the given loss function on independent data and achieve better prediction accuracy;

the GIOU loss is used as the bounding box regression loss; as a metric, L = 1 - GIOU has non-negativity, identity of indiscernibles, symmetry, the triangle inequality and scale invariance; GIOU also accounts for the non-overlapping region that IOU ignores, and can reflect how two boxes A and B overlap;

the new panoramic radiograph is input into the trained single-stage deep network model for testing; the testing stage needs no manual operation and scans the whole panoramic radiograph only once, so that whether the image contains median supernumerary teeth is identified quickly and automatically and their position is given at the same time, helping clinicians diagnose median supernumerary teeth quickly and correctly;

non-maximum suppression is used to obtain the final localization of the median supernumerary teeth; because many different bounding boxes may be detected at the position of the same target, producing multiple overlapping boxes, non-maximum suppression suppresses the non-maximum values, searches for the local maximum, obtains the optimal solution in the local region, and eliminates the redundant bounding boxes;

the invention needs no manual operation and, by scanning the panoramic radiograph only once, quickly and automatically identifies whether the image contains median supernumerary teeth and gives their position, thereby avoiding the effect of differences in clinicians' experience on accuracy and helping clinicians diagnose median supernumerary teeth quickly and correctly.
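The point made above about GIOU and non-overlapping boxes can be checked with a small worked example. The unit-square coordinates are chosen purely for illustration, and the GIOU expression used is the standard definition with $C$ as the smallest enclosing box. Take $A = [0,1]\times[0,1]$, a box $B_1 = [1,2]\times[0,1]$ that touches $A$, and a box $B_2 = [2,3]\times[0,1]$ that lies one unit away from $A$:

$$
\begin{aligned}
IOU(A, B_1) &= IOU(A, B_2) = 0,\\
GIOU(A, B_1) &= 0 - \frac{2 - 2}{2} = 0, & L &= 1 - 0 = 1,\\
GIOU(A, B_2) &= 0 - \frac{3 - 2}{3} = -\tfrac{1}{3}, & L &= 1 + \tfrac{1}{3} = \tfrac{4}{3}.
\end{aligned}
$$

IOU cannot tell the two predictions apart, whereas the GIOU loss is larger for the box that lies farther from $A$, so it still indicates which prediction is closer when the boxes do not overlap.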

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of the structure of the single-stage deep network model adopted by the method of the present invention;

FIG. 3 is a diagram of the results of detecting median supernumerary teeth in panoramic radiographs according to the second embodiment of the present invention.

Detailed Description

The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.

Example one

As shown in FIG. 1, the method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network comprises the following steps:

acquiring a new panoramic radiograph;

Further, in this embodiment, training to obtain the single-stage deep network model comprises:

in each training iteration, automatic anchor learning and an Adam optimizer are adopted, and the initial hyperparameters are set, wherein the hyperparameters comprise an initial learning rate, a number of training epochs, a depth coefficient and a width coefficient;

Further, as shown in FIG. 2, the single-stage deep network model in this embodiment comprises a backbone, a detection end and an output end;

Further, in this embodiment, the localization annotation information and the classification annotation information of the training set are obtained through the following steps:

denoting by $R_i$ the ROI of panoramic radiograph $i$, extracting from the panoramic radiographs the center-point coordinates $(x_i, y_i)$ and the width and height $(w_i, h_i)$ of every $R_i$ as the localization annotation information;

and judging whether median supernumerary teeth are present from the cone beam CT image acquired for the case corresponding to the panoramic radiograph, and using the result as the classification annotation information of the panoramic radiograph.

Further, the total loss function $L_{total}$ in this embodiment is expressed as:

$$L_{total} = L_{box} + L_{obj} + L_{cls},$$

Further, in this embodiment $L_{obj}$ and $L_{cls}$ are both binary cross-entropy loss functions $L_{BCE}$;

$$GIOU = IOU - \frac{|C \setminus (B_{label} \cup B_{pred})|}{|C|},$$

$$IOU = \frac{|B_{label} \cap B_{pred}|}{|B_{label} \cup B_{pred}|},$$

$$L_{box} = 1 - GIOU,$$

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right],$$

where $i \in [1, N]$, $N$ is the number of samples in the training set, and $y_i \in \{0, 1\}$ is the ground-truth class label of sample $i$: 1 if the bounding box contains median supernumerary teeth and 0 if it does not;

The acquired new panoramic radiograph is automatically divided into an S × S grid, and a plurality of candidate bounding boxes is generated in each grid cell.
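The grid division can be illustrated with a small sketch that finds the grid cell containing a given point, for example the center of a bounding box. The grid size S = 8 is an assumption, since the text leaves S unspecified; the image size of 2976 × 1536 pixels is the one reported for the training images in Example two, and `grid_cell` is a hypothetical helper.

```python
def grid_cell(x, y, image_width, image_height, s):
    """Return (row, col) of the S x S grid cell that contains the point (x, y)."""
    # min() keeps points lying on the right or bottom border in the last cell.
    col = min(s - 1, int(x * s / image_width))
    row = min(s - 1, int(y * s / image_height))
    return row, col


row, col = grid_cell(1450, 680, 2976, 1536, 8)
print(row, col)
```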

Further, in this embodiment, screening a final bounding box from the plurality of candidate bounding boxes by non-maximum suppression to obtain the final localization of the median supernumerary teeth comprises:

Further, augmenting the training set in this embodiment comprises:

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods described above when executing the program.

A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of any of the methods described above.

Each of the obtained candidate bounding box positions comprises the location and the size of the candidate bounding box.

Example two

In practical application, detecting a panoramic radiograph specifically comprises the following steps:

(1) The training set consists of the panoramic radiographs, classification annotation information and localization annotation information of 342 study cases; the panoramic radiographs used for training measure 2976 × 1536 pixels, with a pixel size of 0.07 × 0.07 mm².

(2) Whether median supernumerary teeth are present is determined from the cone beam CT image of the study case corresponding to each panoramic radiograph, and the result is used as the classification annotation information of that panoramic radiograph. The panoramic radiographs of the study cases are then annotated: in the training set, the region containing median supernumerary teeth is taken as the ROI and marked with a bounding box. Denoting by $R_i$ the ROI of panoramic radiograph $i$, the center-point coordinates $(x_i, y_i)$ and the width and height $(w_i, h_i)$ of every $R_i$ are extracted from the panoramic radiographs as the localization annotation information; 230 cases have median supernumerary teeth and 112 cases do not.

(3) The training set is expanded by data augmentation, which includes brightness modification, random pixel cropping, random saturation modification, image flipping, mosaic augmentation, and the like.

(4) The initial values of the hyperparameters of the single-stage deep network model are set: the depth coefficient is 1.66, the width coefficient is 1.50, the initial learning rate is 0.0032, the fusion probability is 0.243 and the number of training epochs is 250; automatic anchor learning and an Adam optimizer are adopted;

after the initial hyperparameters are set, a hyperparameter search is carried out with a genetic algorithm; the search terminates after 300 generations, in each of which the single-stage deep network model is trained for 100 epochs, and all hyperparameter combinations are ranked by the total loss function to finally obtain the optimal hyperparameter combination.

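The total loss function of the single-stage deep network model is as follows:

the total loss function $L_{total}$ comprises the bounding box loss $L_{box}$, the target loss $L_{obj}$ and the predicted object class loss $L_{cls}$:

$$L_{total} = L_{box} + L_{obj} + L_{cls},$$

where $L_{box}$ is the generalized intersection-over-union (GIOU) loss, and $L_{obj}$ and $L_{cls}$ are both binary cross-entropy loss functions $L_{BCE}$;

$$GIOU = IOU - \frac{|C \setminus (B_{label} \cup B_{pred})|}{|C|},$$

$$IOU = \frac{|B_{label} \cap B_{pred}|}{|B_{label} \cup B_{pred}|},$$

$$L_{box} = 1 - GIOU,$$

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right],$$

$\hat{y}_i \in [0, 1]$ is the predicted target confidence that sample $i$ contains median supernumerary teeth.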

(5) Steps (3) to (4) are repeated as the training stage of the single-stage deep network model, yielding the final model weights.

(6) The single-stage deep network model trained in step (5) is used to test the newly acquired panoramic radiograph. The new panoramic radiograph is input into the single-stage deep network model, which outputs the candidate bounding box positions, the candidate bounding box sizes, the target confidence of each candidate bounding box, and the confidence that each grid cell contains the target class, median supernumerary teeth.

(7) A final bounding box is screened from the candidate bounding boxes by non-maximum suppression to determine the final localization of the median supernumerary teeth: the target confidence of each candidate bounding box is multiplied by the confidence that the corresponding grid cell contains the target class, median supernumerary teeth, to obtain the probability that the candidate bounding box contains the target class, median supernumerary teeth;

redundant bounding boxes are removed according to these probability values and the IOU between boxes, and the final bounding box is determined;

if the probability that the final bounding box contains the target class, median supernumerary teeth, exceeds the set threshold, the indicated position of the median supernumerary teeth in the new panoramic radiograph is accepted as valid.

As shown in FIG. 3, Yes indicates that the method of the present invention detected median supernumerary teeth in the panoramic radiograph, and No indicates that it detected none; in FIG. 3, the left side shows the panoramic radiographs to be detected and the right side shows the final bounding boxes and target confidences output by the present invention; from top to bottom, the target confidences of the three panoramic radiographs are 0.90, 0.93 and 0.08.
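The mapping from target confidence to a Yes/No result can be sketched as below. The threshold of 0.5 is an assumption, since the text only refers to "a set threshold", and `has_mesiodens` is a hypothetical helper; the three confidences are the ones reported for FIG. 3.

```python
def has_mesiodens(confidences, threshold=0.5):
    """Map each target confidence to "Yes" or "No" under the given threshold."""
    return ["Yes" if c > threshold else "No" for c in confidences]


results = has_mesiodens([0.90, 0.93, 0.08])
print(results)
```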

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims

1. A method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network, characterized by comprising the following steps:

acquiring a new panoramic radiograph;

2. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 1, characterized in that training to obtain the single-stage deep network model comprises:

3. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 2, characterized in that the single-stage deep network model comprises a backbone, a detection end and an output end;

4. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 2, characterized in that the localization annotation information and the classification annotation information of the training set are obtained by the following steps:

and judging whether median supernumerary teeth are present from the cone beam CT image acquired for the case corresponding to the panoramic radiograph, and using the result as the classification annotation information of the panoramic radiograph.

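5. The method according to claim 2, characterized in that the total loss function $L_{total}$ is expressed as:

$$L_{total} = L_{box} + L_{obj} + L_{cls},$$

6. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 5, characterized in that $L_{obj}$ and $L_{cls}$ are both binary cross-entropy loss functions $L_{BCE}$;

$$GIOU = IOU - \frac{|C \setminus (B_{label} \cup B_{pred})|}{|C|},$$

$$IOU = \frac{|B_{label} \cap B_{pred}|}{|B_{label} \cup B_{pred}|},$$

$$L_{box} = 1 - GIOU,$$

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right],$$

$\hat{y}_i \in [0, 1]$ is the target confidence, predicted by the single-stage multi-scale deep learning network model, that sample $i$ contains median supernumerary teeth.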

7. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 1, characterized in that, before the new panoramic radiograph is input into the trained single-stage deep network model, the acquired new panoramic radiograph is divided into an S × S grid;

8. The method for automatically detecting median supernumerary teeth in panoramic radiographs based on a single-stage deep network according to claim 2, characterized in that augmenting the training set comprises:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 8 are implemented when the processor executes the program.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.