CN117252847A - Method and device for detecting and identifying defects of maxillary anterior alveolar bone - Google Patents


Info

Publication number: CN117252847A
Application number: CN202311287033.4A
Authority: CN (China)
Prior art keywords: result, processing, target, enhancement, data enhancement
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 王亚杰, 龚蓓文, 张淮
Current and Original Assignee: Beijing Jigu Intelligent Technology Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Beijing Jigu Intelligent Technology Co., Ltd.
Priority: CN202311287033.4A
Publication: CN117252847A
Classifications

All codes fall under G (Physics), G06 (Computing; calculating or counting).

    • G06T7/0012 Biomedical image inspection (under G06T7/00 Image analysis; G06T7/0002 Inspection of images, e.g. flaw detection)
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V10/40 Extraction of image or video features
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G06V10/764 Recognition using pattern recognition or machine learning: classification, e.g. of video objects
    • G06V10/766 Recognition using pattern recognition or machine learning: regression, e.g. by projecting features on hyperplanes
    • G06V10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V10/82 Recognition using neural networks
    • G06T2200/32 Indexing scheme involving image mosaicing
    • G06T2207/10081 Computed x-ray tomography [CT] (under G06T2207/10 Image acquisition modality; G06T2207/10072 Tomographic images)
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30036 Dental; Teeth (under G06T2207/30 Subject of image; G06T2207/30004 Biomedical image processing)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for detecting and identifying defects of the maxillary anterior alveolar bone. The method comprises the following steps: acquiring a target image set, wherein the target image set comprises CBCT images of sagittal and coronal double cross-sections of the maxillary anterior teeth; performing data enhancement processing on the target image set according to a data enhancement strategy to obtain a data enhancement result, wherein the data enhancement processing comprises image stitching and data amplification based on a self-learning data enhancement strategy; and performing feature processing on the data enhancement result according to a determined target network model to obtain a feature processing result, wherein the feature processing result comprises either first state information indicating that the alveolar bone is normal or second state information indicating fenestration (windowing) and/or dehiscence (cracking) of the alveolar bone. The feature processing operation comprises operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction. The invention can thus improve the suitability and accuracy of detecting and identifying small alveolar bone regions on digital dental CBCT images.

Description

Method and device for detecting and identifying defects of maxillary anterior alveolar bone
Technical Field
The invention relates to the technical field of deep learning and data identification, in particular to a method and a device for detecting and identifying defects of maxillary anterior alveolar bone.
Background
Current deep-learning methods for identifying and detecting abnormal tooth conditions are mainly based on dental X-ray films, and the main network used is the convolutional neural network (CNN). As convolutional neural networks have developed and matured, various convolutional network structures have been applied to identifying and detecting dental abnormalities, including the VGG-19 network, a 7-layer CNN structure, the Region-CNN (R-CNN) family of networks, the Single Shot MultiBox Detector (SSD), and U-Net. In addition, researchers have used Faster R-CNN for tooth numbering to aid the detection and identification of tooth-defect images. However, how to improve the accuracy of identifying and detecting abnormal tooth images remains an open technical problem under active research.
Disclosure of Invention
The technical problem addressed by the invention is to provide a method and a device for detecting and identifying defects of the maxillary anterior alveolar bone that can improve the suitability and accuracy of detecting and identifying small alveolar bone regions on digital dental CBCT images.
To solve the above technical problem, a first aspect of the present invention discloses a method for detecting and identifying defects of maxillary anterior alveolar bone, the method comprising:
acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to sagittal and coronal double cross-sections of the maxillary anterior teeth; the target image set is annotated with preset data labels;
according to a preset data enhancement strategy, performing data enhancement processing on the target image set to obtain a data enhancement result corresponding to the target image set; the data enhancement processing comprises image splicing and data amplification based on a self-learning data enhancement strategy;
performing a preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result, wherein the feature processing result comprises state information of the alveolar bone, and the state information comprises first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has fenestration (windowing) and/or dehiscence (cracking);
the feature processing operation comprises the operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
In an optional implementation manner, in a first aspect of the present invention, the performing, according to a preset data enhancement policy, data enhancement processing on the target image set to obtain a data enhancement result corresponding to the target image set includes:
performing image stitching on the target image set according to a preset data enhancement strategy to obtain an image stitching result of the target image set, wherein the image stitching comprises random stitching or similar stitching according to image similarity; the image similarity corresponding to the two or more target images which are subjected to similar splicing is within a preset similarity threshold;
according to the determined self-learning data enhancement strategy, performing an algorithm search over a plurality of predefined image enhancement operations to obtain a target enhancement algorithm adapted to the image stitching result and its corresponding enhancement parameters;
and then, according to the target enhancement algorithm and the corresponding enhancement parameters thereof, performing enhancement operation on the image splicing result to obtain an enhancement result of the image splicing result, wherein the enhancement result is used as a data enhancement result corresponding to the target image set.
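As a concrete illustration of the similarity-based stitching step above, the sketch below pairs two CBCT slices side by side only when a simple similarity measure falls within a threshold. The patent does not specify the similarity metric, so the histogram distance, the threshold value and the function name here are assumptions:

```python
def stitch_if_similar(img_a, img_b, sim_threshold=0.2, bins=32):
    """'Similar stitching' of two equally sized grayscale slices given as
    lists of rows (pixel values 0-255). The similarity metric (L1 distance
    between normalised intensity histograms) and the threshold are
    illustrative assumptions; dissimilar pairs return None so the caller
    can fall back to random stitching instead."""
    def hist(img):
        h = [0] * bins
        n = 0
        for row in img:
            for v in row:
                h[min(v * bins // 256, bins - 1)] += 1
                n += 1
        return [c / n for c in h]

    ha, hb = hist(img_a), hist(img_b)
    if sum(abs(a - b) for a, b in zip(ha, hb)) > sim_threshold:
        return None  # too dissimilar for similarity-based stitching
    # concatenate the two slices along the horizontal axis
    return [row_a + row_b for row_a, row_b in zip(img_a, img_b)]
```

In practice the stitched mosaic would then be fed to the searched enhancement operations described next.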
As an optional implementation manner, in the first aspect of the present invention, the self-learning data enhancement strategy includes a search space and a search algorithm;
the search space comprises a first preset number of sub-strategies; each sub-strategy comprises a second preset number of target strategy operations, the target strategy operations within a single sub-strategy have an operation order, and two adjacent target strategy operations differ in operation type; each target strategy operation is an image enhancement operation;
wherein, each target strategy operation has corresponding operation probability and operation strength; the operation probability corresponding to each target strategy operation is a third preset number of first discrete values, and the operation intensity is a fourth preset number of second discrete values; and all the first discrete values follow a uniform distribution, all the second discrete values follow a uniform distribution.
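Structurally, this search space resembles AutoAugment-style policy search. A minimal sketch follows; the operation names, the counts and the discrete value grids are placeholders, not the patent's actual presets:

```python
import random

# Placeholder image-enhancement operation types (assumed, not from the patent).
OP_TYPES = ["rotate", "shear_x", "contrast", "brightness", "sharpness"]
# "Third preset number" of uniformly spaced discrete operation probabilities.
PROB_VALUES = [i / 10 for i in range(11)]
# "Fourth preset number" of uniformly spaced discrete operation intensities.
MAG_VALUES = list(range(10))

def sample_sub_policy(num_ops=2, rng=random):
    """Sample one sub-policy: an ordered list of (operation, probability,
    magnitude) triples in which adjacent operations differ in type,
    mirroring the constraint on adjacent target strategy operations."""
    ops, prev = [], None
    for _ in range(num_ops):
        name = rng.choice([o for o in OP_TYPES if o != prev])
        ops.append((name, rng.choice(PROB_VALUES), rng.choice(MAG_VALUES)))
        prev = name
    return ops
```

Each sampled sub-policy is one candidate point in the search space that the search algorithm evaluates.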
In a first aspect of the present invention, according to the determined self-learning data enhancement strategy, algorithm searching is performed from a predefined plurality of image enhancement operations to obtain the target enhancement algorithm adapted to the image stitching result and the corresponding enhancement parameters thereof, including:
for each sub-strategy, according to the search algorithm, executing search pairing on all the first discrete values and all the second discrete values corresponding to the sub-strategy and the image splicing result to obtain a pairing set corresponding to the sub-strategy, wherein the pairing set corresponding to the sub-strategy comprises a plurality of pairing groups, each pairing group corresponds to one group of the first discrete values and the second discrete values, and each pairing group has a pairing value with the image splicing result;
Determining a pairing group with the highest pairing value from all pairing sets and all pairing groups included in the pairing sets, and marking the pairing group as a target pairing group;
and determining the sub-strategy corresponding to the target pairing group as a target enhancement algorithm matched with the image splicing result, and determining the first discrete value and the second discrete value corresponding to the target pairing group as enhancement parameters corresponding to the target enhancement algorithm.
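The pairing procedure above amounts to a grid search over each sub-strategy's discrete probability/magnitude values, scored against the stitched images. A sketch under that reading; the scoring function is left abstract because the patent does not define how pairing values are computed:

```python
def search_best_pairing(sub_policies, prob_values, mag_values, score_fn):
    """Pair every (probability, magnitude) grid point of every sub-policy
    with the stitching result via score_fn (the 'pairing value'), and
    return the winning sub-policy with its parameters, i.e. the target
    enhancement algorithm and its enhancement parameters."""
    best_policy, best_params, best_value = None, None, float("-inf")
    for policy in sub_policies:
        for p in prob_values:
            for m in mag_values:
                value = score_fn(policy, p, m)
                if value > best_value:  # keep the highest pairing value
                    best_policy, best_params, best_value = policy, (p, m), value
    return best_policy, best_params, best_value
```

With a monotone score, the search simply returns the sub-strategy and parameter pair with the highest pairing value, as the claim describes.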
As an optional implementation manner, in the first aspect of the present invention, the target network model is an optimized YOLO v8 model; the target network model comprises an improved Backbone network, an improved Neck network and a Head network;
the Backbone network comprises a fifth preset number of layers of CSPModule modules and one layer of SPPF module, and the two types of modules adopt a cascade structure in the Backbone network;
the Neck network adopts an AFPN progressive characteristic pyramid structure;
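For reference, the SPPF stage of a YOLO-style Backbone chains three identical max-poolings and concatenates their outputs with the input. A minimal pure-Python sketch on a single-channel 2-D feature map (real backbones operate on multi-channel tensors, so this only illustrates the pooling-and-concatenate pattern):

```python
def maxpool_same(x, k=5):
    """Stride-1 max pooling with edge padding on a 2-D list-of-lists map."""
    h, w, pad = len(x), len(x[0]), k // 2

    def at(i, j):  # clamp indices to emulate edge padding
        return x[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    return [[max(at(i + di, j + dj)
                 for di in range(-pad, pad + 1)
                 for dj in range(-pad, pad + 1))
             for j in range(w)] for i in range(h)]

def sppf(x, k=5):
    """SPPF sketch: three chained max-poolings; the input and all three
    pooled maps are returned together, standing in for the channel-wise
    concatenation a real SPPF module performs."""
    p1 = maxpool_same(x, k)
    p2 = maxpool_same(p1, k)
    p3 = maxpool_same(p2, k)
    return [x, p1, p2, p3]
```

Because the poolings are chained, the three pooled maps cover progressively larger receptive fields at no extra kernel cost, which is the point of SPPF over parallel SPP.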
the step of executing a preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result, includes:
executing a first processing operation on the input data enhancement result according to all the CSPModule modules and the SPPF module to obtain a first processing result corresponding to the data enhancement result; the first processing operation includes the feature extraction and the downsampling;
Executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result, wherein the second processing operation comprises at least three layers of progressive feature fusion operation; and the feature fusion operations corresponding to different levels correspond to different space weights;
executing a third processing operation on the second processing result according to the Head network to obtain a third processing result corresponding to the second processing result, wherein the third processing result is used as a characteristic processing result corresponding to the data enhancement result;
the third processing operation comprises at least three operations corresponding to loss calculation, loss weighting calculation and back propagation optimization.
As an optional implementation manner, in the first aspect of the present invention, each time the data enhancement result passes through a layer of CSPModule module, the output result corresponding to that layer of CSPModule module is recorded as a first sub-result; the output result obtained after the data enhancement result passes through the last-layer CSPModule module and the SPPF module is recorded as a second sub-result; the first processing result comprises all the first sub-results and the second sub-result;
the step of executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result includes:
selecting the second sub-result as the first feature of the Neck network; selecting the two first sub-results immediately preceding and adjacent to the second sub-result as the second feature and the third feature of the Neck network;
combining the second feature and the third feature with the first spatial weight and the second spatial weight assigned in the Neck network, and inputting them into the feature pyramid corresponding to the Neck network; and inputting the first feature, together with the third spatial weight assigned in the Neck network, into the feature pyramid to obtain multi-scale features corresponding to the first, second and third features as the second processing result.
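The spatially weighted fusion step can be illustrated as ASFF-style blending of two same-sized feature maps with per-pixel, softmax-normalised weights. This is a sketch of the idea only: the actual AFPN Neck also resamples features across scales, and the weight maps are learned rather than supplied by hand:

```python
import math

def weighted_fusion(feat_a, feat_b, w_a, w_b):
    """Blend two same-shape 2-D feature maps with per-pixel spatial
    weight logits, softmax-normalised so the weights sum to 1 at every
    location (adaptive spatial fusion)."""
    fused = []
    for ra, rb, wa, wb in zip(feat_a, feat_b, w_a, w_b):
        row = []
        for a, b, la, lb in zip(ra, rb, wa, wb):
            ea, eb = math.exp(la), math.exp(lb)
            alpha = ea / (ea + eb)       # normalised weight for feat_a
            row.append(alpha * a + (1 - alpha) * b)
        fused.append(row)
    return fused
```

With equal logits the fusion reduces to a plain average; skewed logits let the network favour one scale's features at each spatial location, which is how different levels can carry different spatial weights.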
As an alternative implementation manner, in the first aspect of the present invention, the Head network is composed of a decoupled classification branch and a regression branch, wherein the classification branch comprises a VFL loss function, and the regression branch comprises a DFL loss function and a CIoU loss function;
and the step of executing a third processing operation on the second processing result according to the Head network to obtain a third processing result corresponding to the second processing result includes:
Sequentially inputting the second processing result into the DFL loss function, the CIoU loss function and the VFL loss function, and sequentially calculating to obtain a first loss value, a second loss value and a third loss value;
multiplying the first loss value, the second loss value and the third loss value by their corresponding weighted values and summing to obtain the total network loss of the Head network;
performing minimization processing on the total loss of the network through a back propagation algorithm to obtain a plurality of minimization processing results;
and selecting, from all the minimization processing results, a minimization processing result whose total network loss is within a preset target loss threshold, as the third processing result.
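The Head's training objective described above reduces to a weighted sum of the three losses minimised by backpropagation, with the final result selected under a loss threshold. A sketch of that bookkeeping; the weight values and the threshold are illustrative assumptions, not the patent's:

```python
def total_head_loss(dfl_loss, ciou_loss, vfl_loss, weights=(1.0, 2.0, 1.0)):
    """Weighted sum of the regression (DFL, CIoU) and classification
    (VFL) losses; the weight triple here is a placeholder."""
    w_dfl, w_ciou, w_vfl = weights
    return w_dfl * dfl_loss + w_ciou * ciou_loss + w_vfl * vfl_loss

def select_result(candidates, loss_threshold):
    """From (total_loss, prediction) candidates produced across
    backpropagation steps, return the lowest-loss prediction whose total
    loss is within the target threshold, or None if none qualifies."""
    qualified = [c for c in candidates if c[0] <= loss_threshold]
    return min(qualified)[1] if qualified else None
```

In a real training loop the loss values would come from the network's forward pass each step, and `select_result` would pick the checkpointed prediction meeting the target loss.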
The second aspect of the present invention discloses a device for detecting and identifying defects of maxillary anterior alveolar bone, the device comprising:
the acquisition module is used for acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to a sagittal plane of the anterior maxillary tooth and a coronal double-section; the target image set is marked by preset data;
the data enhancement processing module is used for executing data enhancement processing on the target image set according to a preset data enhancement strategy to obtain a data enhancement result corresponding to the target image set; the data enhancement processing comprises image splicing and data amplification based on a self-learning data enhancement strategy;
The characteristic processing module is used for executing preset characteristic processing operation on the data enhancement result according to the determined target network model to obtain a characteristic processing result corresponding to the data enhancement result, wherein the characteristic processing result comprises state information of an alveolar bone, and the state information comprises first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has windowing and/or cracking;
the feature processing operation comprises the operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
In a second aspect of the present invention, the data enhancement processing module performs data enhancement processing on the target image set according to a preset data enhancement policy, and the manner of obtaining the data enhancement result corresponding to the target image set specifically includes:
performing image stitching on the target image set according to a preset data enhancement strategy to obtain an image stitching result of the target image set, wherein the image stitching comprises random stitching or similar stitching according to image similarity; the image similarity corresponding to the two or more target images which are subjected to similar splicing is within a preset similarity threshold;
according to the determined self-learning data enhancement strategy, performing an algorithm search over a plurality of predefined image enhancement operations to obtain a target enhancement algorithm adapted to the image stitching result and its corresponding enhancement parameters;
and then, according to the target enhancement algorithm and the corresponding enhancement parameters thereof, performing enhancement operation on the image splicing result to obtain an enhancement result of the image splicing result, wherein the enhancement result is used as a data enhancement result corresponding to the target image set.
As an optional implementation manner, in the second aspect of the present invention, the self-learning data enhancement strategy includes a search space and a search algorithm;
the search space comprises a first preset number of sub-strategies; each sub-strategy comprises a second preset number of target strategy operations, the target strategy operations within a single sub-strategy have an operation order, and two adjacent target strategy operations differ in operation type; each target strategy operation is an image enhancement operation;
wherein, each target strategy operation has corresponding operation probability and operation strength; the operation probability corresponding to each target strategy operation is a third preset number of first discrete values, and the operation intensity is a fourth preset number of second discrete values; and all the first discrete values follow a uniform distribution, all the second discrete values follow a uniform distribution.
In a second aspect of the present invention, the data enhancement processing module performs algorithm searching from a predefined plurality of image enhancement operations according to the determined self-learning data enhancement policy, and the manner of obtaining the target enhancement algorithm adapted to the image stitching result and the corresponding enhancement parameters thereof specifically includes:
for each sub-strategy, according to the search algorithm, executing search pairing on all the first discrete values and all the second discrete values corresponding to the sub-strategy and the image splicing result to obtain a pairing set corresponding to the sub-strategy, wherein the pairing set corresponding to the sub-strategy comprises a plurality of pairing groups, each pairing group corresponds to one group of the first discrete values and the second discrete values, and each pairing group has a pairing value with the image splicing result;
determining a pairing group with the highest pairing value from all pairing sets and all pairing groups included in the pairing sets, and marking the pairing group as a target pairing group;
and determining the sub-strategy corresponding to the target pairing group as a target enhancement algorithm matched with the image splicing result, and determining the first discrete value and the second discrete value corresponding to the target pairing group as enhancement parameters corresponding to the target enhancement algorithm.
As an alternative embodiment, in the second aspect of the present invention, the target network model is an optimized YOLO v8 model; the target network model comprises an improved Backbone network, an improved Neck network and a Head network;
the Backbone network comprises a fifth preset number of layers of CSPModule modules and one layer of SPPF module, and the two types of modules adopt a cascade structure in the Backbone network;
the Neck network adopts an AFPN progressive characteristic pyramid structure;
the feature processing module comprises:
the first processing sub-module is used for executing a first processing operation on the input data enhancement result according to all the CSPModule modules and the SPPF module to obtain a first processing result corresponding to the data enhancement result; the first processing operation includes the feature extraction and the downsampling;
the second processing sub-module is used for executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result, wherein the second processing operation comprises at least three layers of progressive feature fusion operation; and the feature fusion operations corresponding to different levels correspond to different space weights;
A third processing sub-module, configured to perform a third processing operation on the second processing result according to the Head network, to obtain a third processing result corresponding to the second processing result, where the third processing result is used as a feature processing result corresponding to the data enhancement result;
the third processing operation comprises at least three operations corresponding to loss calculation, loss weighting calculation and back propagation optimization.
As an alternative implementation manner, in the second aspect of the present invention, each time the data enhancement result passes through a layer of CSPModule module, the output result corresponding to that layer of CSPModule module is recorded as a first sub-result; the output result obtained after the data enhancement result passes through the last-layer CSPModule module and the SPPF module is recorded as a second sub-result; the first processing result comprises all the first sub-results and the second sub-result;
the second processing sub-module executes a second processing operation on the first processing result according to the Neck network, and the manner of obtaining the second processing result corresponding to the first processing result specifically includes:
selecting the second sub-result as the first feature of the Neck network; selecting the two first sub-results immediately preceding and adjacent to the second sub-result as the second feature and the third feature of the Neck network;
combining the second feature and the third feature with the first spatial weight and the second spatial weight assigned in the Neck network, and inputting them into the feature pyramid corresponding to the Neck network; and inputting the first feature, together with the third spatial weight assigned in the Neck network, into the feature pyramid to obtain multi-scale features corresponding to the first, second and third features as the second processing result.
As an alternative embodiment, in the second aspect of the present invention, the Head network is composed of a decoupled classification branch including a VFL loss function and a regression branch including a DFL loss function and a CIoU loss function;
the third processing sub-module executes a third processing operation on the second processing result according to the Head network, and the method for obtaining the third processing result corresponding to the second processing result specifically includes:
sequentially inputting the second processing result into the DFL loss function, the CIoU loss function and the VFL loss function, and sequentially calculating to obtain a first loss value, a second loss value and a third loss value;
multiplying the first loss value, the second loss value and the third loss value by their corresponding weighted values and summing to obtain the total network loss of the Head network;
Performing minimization processing on the total loss of the network through a back propagation algorithm to obtain a plurality of minimization processing results;
and selecting, from all the minimization processing results, a minimization processing result whose total network loss is within a preset target loss threshold, as the third processing result.
In a third aspect, the present invention discloses another maxillary anterior alveolar bone defect detection and identification device, the device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the maxillary anterior alveolar bone defect detection and identification method disclosed in the first aspect of the present invention.
A fourth aspect of the present invention discloses a computer storage medium storing computer instructions for performing the maxillary anterior alveolar bone defect detection and identification method disclosed in the first aspect of the present invention when the computer instructions are called.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
in an embodiment of the present invention, there is provided a method for detecting and identifying defects of the maxillary anterior alveolar bone, the method comprising: acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to the sagittal and coronal double-sections of the maxillary anterior teeth, and the target image set is marked by preset data; performing data enhancement processing on the target image set according to a preset data enhancement strategy to obtain a data enhancement result corresponding to the target image set, wherein the data enhancement processing comprises image stitching and data amplification based on a self-learning data enhancement strategy; and performing a preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result, wherein the feature processing result comprises state information of the alveolar bone, the state information comprising first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has windowing and/or cracking; the feature processing operation comprises operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
Therefore, after the target image set is acquired, a data enhancement module is provided (adding image stitching and self-learning enhancement strategy processing on the basis of existing enhancement methods), so that the data set is enriched and expanded by the data enhancement module, which improves the robustness of the model when the data set is subsequently input into the target network model for training; furthermore, feature processing is performed by the optimized target network model, so that the model can adapt to digital dental CBCT images with higher spatial resolution; meanwhile, through the improvement of the feature extraction and fusion modules, the optimized target network model improves the corresponding adaptation degree, detection accuracy and recognition accuracy when detecting and identifying the small change areas of alveolar bone defects.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for detecting and identifying defects of an alveolar bone of a front maxillary tooth according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for detecting and identifying defects of the alveolar bone of the anterior maxillary tooth according to an embodiment of the present invention;
fig. 3 is a schematic structural view of a device for detecting and recognizing defects of an alveolar bone of a front maxillary tooth according to an embodiment of the present invention;
FIG. 4 is a schematic view showing the structure of another device for detecting and recognizing defects of maxillary anterior alveolar bone according to an embodiment of the present invention;
FIG. 5 is a schematic view showing the structure of a device for detecting and recognizing defects of an alveolar bone of a maxillary anterior tooth according to an embodiment of the present invention;
FIG. 6 is a flow chart of a method for detecting and identifying defects of an alveolar bone of a maxillary anterior tooth according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a Backbone feature extraction network according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a comparison structure before and after a new network is improved according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms first, second and the like in the description and in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, article, or article that comprises a list of steps or elements is not limited to only those listed but may optionally include other steps or elements not listed or inherent to such process, method, article, or article.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The invention discloses a method and a device for detecting and identifying defects of the maxillary anterior alveolar bone. After a target image set is acquired, a data enhancement module is provided (adding image stitching and self-learning enhancement strategy processing on the basis of existing enhancement methods), so that the data set is enriched and expanded by the data enhancement module, which improves the robustness of the model when the data set is input into a target network model for training; furthermore, feature processing is performed by the optimized target network model, so that the model can adapt to digital dental CBCT images with higher spatial resolution; meanwhile, through the improvement of the feature extraction and fusion modules, the optimized target network model improves the corresponding adaptation degree, detection accuracy and recognition accuracy when detecting and identifying the small change areas of alveolar bone defects. The following will describe in detail.
Example One
Referring to fig. 1 and 6, fig. 1 is a schematic flow chart of a method for detecting and identifying defects of an alveolar bone of a front maxillary tooth according to an embodiment of the present invention; fig. 6 is a flow chart illustrating a method for detecting and identifying defects of an alveolar bone of a maxillary anterior tooth according to an embodiment of the present invention. The method for detecting and identifying the defects of the maxillary anterior alveolar bone described in fig. 1 and 6 may be applied to a device for detecting and identifying defects of the maxillary anterior alveolar bone, which is not limited in the embodiments of the present invention. As shown in fig. 1, the method for detecting and identifying defects of maxillary anterior alveolar bone may include the operations of:
101. and acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to the sagittal plane of the maxillary anterior tooth and the coronal double-section.
In the embodiment of the invention, the target image set is marked by preset data.
In the embodiment of the invention, after a CBCT image corresponding to a sagittal plane of the maxillary anterior tooth and a coronal double-section is initially acquired, standard marking is carried out on the acquired image, detection data in a standard format is obtained, and the marked detection data is used as a target image set.
In the embodiment of the present invention, in the target image set, the CBCT images of the maxillary anterior sagittal plane and the CBCT images corresponding to the coronal double-section may be stored together in a mixed manner, or stored separately according to their classification; the embodiment of the present invention is not limited in this respect.
102. And executing data enhancement processing on the target image set according to a preset data enhancement strategy to obtain a data enhancement result corresponding to the target image set.
In the embodiment of the invention, the data enhancement processing comprises image splicing and data amplification based on a self-learning data enhancement strategy.
According to the embodiment of the invention, before the target image set is input into the target network model, data enhancement processing is performed on the target image set through the improved data enhancement strategy, so that the robustness of the target network model is improved when the enhanced data is subsequently input into the target network model for training.
103. And executing preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result.
In the embodiment of the invention, the characteristic processing result comprises state information of the alveolar bone, wherein the state information comprises first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has windowing and/or cracking;
in the embodiment of the invention, the feature processing operation comprises the operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
It can be seen that, after the target image set is obtained, implementing the method for detecting and identifying defects of the maxillary anterior alveolar bone described in fig. 1 enriches and expands the data set through the provided data enhancement module (adding image stitching and self-learning enhancement strategy processing on the basis of existing enhancement methods), so that the robustness of the model is improved when the data set is input into the target network model for training; furthermore, feature processing is performed by the optimized target network model, so that the model can adapt to digital dental CBCT images with higher spatial resolution; meanwhile, through the improvement of the feature extraction and fusion modules, the optimized target network model improves the corresponding adaptation degree, detection accuracy and recognition accuracy when detecting and identifying the small change areas of alveolar bone defects.
In an optional embodiment, step 102 performs data enhancement processing on the target image set according to a preset data enhancement policy, and the manner of obtaining the data enhancement result corresponding to the target image set specifically includes:
performing image stitching on the target image set according to a preset data enhancement strategy to obtain an image stitching result of the target image set, wherein the image stitching comprises random stitching or similarity stitching according to image similarity; the image similarity corresponding to two or more target images stitched by similarity is within a preset similarity threshold;
according to the determined self-learning data enhancement strategy, performing an algorithm search over a plurality of predefined image enhancement operations to obtain a target enhancement algorithm adapted to the image stitching result and its corresponding enhancement parameters;
and then, according to the target enhancement algorithm and its corresponding enhancement parameters, performing an enhancement operation on the image stitching result to obtain an enhancement result of the image stitching result, which serves as the data enhancement result corresponding to the target image set.
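The enhancement-application step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: `translate_x` and `mixup` stand in for the patent's predefined operations (Mosaic, SnapMix, CutMix, Mixup, TranslateX/Y), and the probability/magnitude values are made-up placeholders.

```python
import numpy as np

def translate_x(img, magnitude):
    """Shift the image right by `magnitude` pixels, filling with zeros."""
    out = np.zeros_like(img)
    if magnitude > 0:
        out[:, magnitude:] = img[:, :-magnitude]
    else:
        out = img.copy()
    return out

def mixup(img_a, img_b, magnitude):
    """Blend two images; a magnitude in [0, 10] maps to a mix ratio in [0, 1]."""
    lam = magnitude / 10.0
    return (lam * img_a + (1.0 - lam) * img_b).astype(img_a.dtype)

def apply_sub_policy(img, img_aux, sub_policy, rng):
    """Apply each (operation, probability, magnitude) triple of a searched
    sub-policy in its fixed order, firing each op with its probability."""
    for op_name, prob, mag in sub_policy:
        if rng.random() < prob:
            if op_name == "translate_x":
                img = translate_x(img, mag)
            elif op_name == "mixup":
                img = mixup(img, img_aux, mag)
    return img

rng = np.random.default_rng(0)
stitched = np.ones((8, 8))   # stands in for an image stitching result
aux = np.zeros((8, 8))       # second image used by the mixing operation
# One sub-policy with two ordered operations, as in the patent's search space.
policy = [("translate_x", 1.0, 2), ("mixup", 1.0, 5)]
result = apply_sub_policy(stitched, aux, policy, rng)
```

The ordered application of the two operations inside a sub-policy mirrors the "sequential application order" required by the search space.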
Therefore, in the optional embodiment, before the target image set is input into the subsequent target network model, the expansion and enrichment of the data set are realized through the improved data enhancement strategy, so that the subsequent training and application of the target network model based on the expanded and enriched data set are facilitated, the robustness of the target network model is improved, and the target network model is more suitable for the actual application scene.
In this alternative embodiment, the self-learning data enhancement strategy includes a search space and a search algorithm;
the search space comprises a first preset number of sub-strategies, each sub-strategy comprises a second preset number of target strategy operations, each target strategy operation in a single sub-strategy has an operation sequence, and two adjacent target strategy operations are different in operation type; each target policy operation is for image enhancement;
Wherein, each target strategy operation has corresponding operation probability and operation strength; the operation probability corresponding to each target strategy operation is a third preset number of first discrete values, and the operation strength is a fourth preset number of second discrete values; and all first discrete values follow a uniform distribution, all second discrete values follow a uniform distribution.
In this alternative embodiment, specifically, after comprehensively considering uniform distribution, computational complexity and model performance, the search space may be set to include 5 sub-strategies; each sub-strategy includes 2 target strategy operations, which may specifically be 2 simple image enhancement operations, and the 2 image enhancement operations have a corresponding sequential application order.
The predefined image enhancement operations include: Mosaic, SnapMix, CutMix, Mixup and TranslateX/Y. 2 image enhancement operations are randomly selected from these 5 image enhancement operations, the application order of the 2 operations is set, and they are then added into 1 sub-strategy.
Furthermore, each image enhancement operation in each sub-strategy corresponds to a default operation intensity range; before the operation is applied, this range is discretized into 11 values (corresponding to the fourth preset number of second discrete values) in a manner that follows a uniform distribution, and the uniform discretization facilitates subsequent searching with a discrete search algorithm. Similarly, each image enhancement operation corresponds to a default operation probability, which is likewise discretized into 10 values (corresponding to the third preset number of first discrete values) in a manner that follows a uniform distribution.
It should be noted that the choice of 11 and 10 discrete values is a trade-off among uniform distribution, computational complexity and model performance; this choice provides enough flexibility and diversity while keeping the computational overhead and search cost within a reasonable range. However, the specific number and range of discrete values are not limited to 11 and 10, and may be adjusted according to specific task requirements and available computing resources.
Finally, in actual application, each image enhancement operation is regarded as a hyperparameter, and trial-and-error optimization is performed continuously in a reinforcement learning manner over the 21 discrete values corresponding to the operation, so that the optimal data enhancement strategy is found.
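The search space described above can be sketched as follows. This is an illustrative construction only, assuming the patent's counts (5 sub-strategies, 2 operations each, 11 intensity values, 10 probability values); the concrete grid values are placeholders, not values given by the patent.

```python
import random

OPS = ["Mosaic", "SnapMix", "CutMix", "Mixup", "TranslateX/Y"]

# Uniformly discretized grids as described: 11 intensity values and
# 10 probability values per operation (both counts are adjustable).
INTENSITIES = [i / 10.0 for i in range(11)]          # 0.0, 0.1, ..., 1.0
PROBABILITIES = [(i + 1) / 10.0 for i in range(10)]  # 0.1, 0.2, ..., 1.0

def sample_sub_policy(rng):
    """One sub-policy: 2 distinct ordered operations, each paired with a
    (probability, intensity) drawn from the discrete grids."""
    op_a, op_b = rng.sample(OPS, 2)  # adjacent operations must differ
    return [(op, rng.choice(PROBABILITIES), rng.choice(INTENSITIES))
            for op in (op_a, op_b)]

rng = random.Random(42)
search_space = [sample_sub_policy(rng) for _ in range(5)]  # 5 sub-strategies
```

A search algorithm (e.g. the reinforcement-learning loop mentioned above) would then evaluate candidate sub-policies drawn from this space rather than random samples.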
Further, the method for performing algorithm search from a plurality of predefined image enhancement operations according to the determined self-learning data enhancement strategy to obtain the target enhancement algorithm adapted to the image stitching result and the corresponding enhancement parameters thereof specifically includes:
for each sub-strategy, according to a search algorithm, carrying out search pairing on all first discrete values and all second discrete values corresponding to the sub-strategy and an image splicing result to obtain a pairing set corresponding to the sub-strategy, wherein the pairing set corresponding to the sub-strategy comprises a plurality of pairing groups, each pairing group corresponds to one group of first discrete values and second discrete values, and each pairing group has a pairing value with the image splicing result;
Determining a pairing group with the highest pairing value from all pairing sets and all pairing groups included in the pairing sets, and marking the pairing group as a target pairing group;
and determining the sub-strategy corresponding to the target pairing group as a target enhancement algorithm matched with the image splicing result, and determining the first discrete value and the second discrete value corresponding to the target pairing group as enhancement parameters corresponding to the target enhancement algorithm.
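The selection of the target pairing group can be sketched as a simple argmax over all pairing values. The data layout and names below are hypothetical; how pairing values are scored against the image stitching result is left abstract, since the patent does not fix a concrete metric.

```python
def select_target_pairing(pairing_sets):
    """pairing_sets: {sub_policy_id: [((prob, intensity), pairing_value), ...]}.
    Returns the sub-policy id, parameter pair and score with the highest
    pairing value across all pairing sets."""
    best = None
    for policy_id, groups in pairing_sets.items():
        for params, value in groups:
            if best is None or value > best[2]:
                best = (policy_id, params, value)
    return best

# Hypothetical pairing values produced by searching two sub-strategies
# against an image stitching result.
sets = {
    "sub_policy_0": [((0.5, 0.3), 0.61), ((0.8, 0.7), 0.74)],
    "sub_policy_1": [((0.2, 0.9), 0.69)],
}
policy_id, params, score = select_target_pairing(sets)
```

The winning sub-strategy becomes the target enhancement algorithm, and its (probability, intensity) pair becomes the enhancement parameters.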
Therefore, in the optional embodiment, through the search space (including the sub-strategy, the uniform discretized operation intensity and the operation probability thereof) and the set of the search algorithm, the intelligent learning optimization of the data enhancement strategy is realized, so that the optimal data enhancement strategy is finally obtained, the determination accuracy and reliability of the data enhancement strategy are improved, and the robustness of the target network model is improved when the data set enhanced by the data enhancement strategy is input into the target network model for training.
Example Two
Referring to fig. 2, fig. 2 is a flowchart illustrating another method for detecting and identifying defects of an alveolar bone of an anterior maxillary tooth according to an embodiment of the present invention. The method for detecting and identifying the defects of the maxillary anterior alveolar bone described in fig. 2 may be applied to a device for detecting and identifying defects of the maxillary anterior alveolar bone, which is not limited in the embodiment of the present invention. As shown in fig. 2, the method for detecting and identifying defects of maxillary anterior alveolar bone may include the operations of:
201. And acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to the sagittal plane of the maxillary anterior tooth and the coronal double-section.
202. And executing data enhancement processing on the target image set according to a preset data enhancement strategy to obtain a data enhancement result corresponding to the target image set.
In the embodiment of the invention, the target network model is an optimized YOLO v8 model; the target network model comprises an improved Backbone network, an improved Neck network and a Head network;
the Backbone network comprises a fifth preset number of layers of CSPModule modules and one layer of SPPF module, and the two types of modules adopt a cascade structure in the Backbone network;
in the embodiment of the present invention, referring to fig. 7, fig. 7 is a schematic structural diagram of a backhaul feature extraction Backbone network disclosed in the embodiment of the present invention; as shown in FIG. 7, the backhaul network may comprise a cascade of 4-layer CSPModule modules and one-layer SPPF modules. Wherein, the 4-layer CSPModule module is a basic building unit of a backbone network and adopts a CSPDarkNet-53 network structure; the 4-layer CSPModule module functions to perform feature extraction and downsampling to obtain higher level feature representations. In practical application, the CSPModule module for each layer is specifically set as follows: the convolution kernel size of the first convolution layer is set to 3 and the step size is 2.
The number of layers of the CSPModule module can be increased or decreased according to practical application, but the number of layers is not lower than 3.
In the embodiment of the invention, the SPPF module is a module of the backbone network and is formed by three MaxPool layers connected in series. The purpose of this module is to introduce the idea of spatial pyramid pooling (Spatial Pyramid Pooling) to obtain feature representations of different scales (including shallow, middle and deep features). In practical application, the size of the pooling kernel is set to 5×5.
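The spatial-size bookkeeping of this cascade can be traced with a small sketch. The sizes below assume a 640×640 input and padding 1 for the 3×3 stride-2 convolutions, and that the SPPF's serial 5×5 max-pools use stride 1 with padding 2 (so they preserve spatial size) — assumptions for illustration, since the patent does not state the input resolution or paddings.

```python
def conv_out(size, k=3, s=2, p=1):
    """Output spatial size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * p - k) // s + 1

def backbone_shapes(h, w, num_csp=4):
    """Trace the spatial size after each CSPModule; each module's first
    3x3 stride-2 convolution halves the size. The SPPF (three serial 5x5
    max-pools, stride 1, padding 2) leaves the size unchanged."""
    shapes = []
    for _ in range(num_csp):
        h, w = conv_out(h), conv_out(w)
        shapes.append((h, w))
    sppf = shapes[-1]  # SPPF preserves spatial size
    return shapes, sppf

shapes, sppf = backbone_shapes(640, 640)
# Four halvings: 640 -> 320 -> 160 -> 80 -> 40
```

The last three of these feature maps are the ones the Neck network later fuses as shallow, middle and deep features.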
In the embodiment of the invention, the improved Neck network adopts an AFPN progressive feature pyramid structure.
In the embodiment of the present invention, for other descriptions of steps 201 to 202, refer to the specific descriptions of steps 101 to 102 in the first embodiment, which are not repeated here.
203. Executing a first processing operation on the input data enhancement result according to all CSPModule modules and SPPF modules to obtain a first processing result corresponding to the data enhancement result; the first processing operation includes feature extraction and downsampling.
204. Executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result, wherein the second processing operation comprises at least three layers of progressive feature fusion operation; and the feature fusion operations corresponding to different levels correspond to different spatial weights.
205. And executing a third processing operation on the second processing result according to the Head network to obtain a third processing result corresponding to the second processing result, wherein the third processing result is used as a characteristic processing result corresponding to the data enhancement result.
In the embodiment of the invention, the third processing operation comprises at least three operations corresponding to loss calculation, loss weighting calculation and back propagation optimization.
It can be seen that, by implementing the method for detecting and identifying defects of the maxillary anterior alveolar bone described in fig. 2, the idea of spatial pyramid pooling can be fused through an improved YOLO v8 model (specifically including an improved Backbone network, Neck network and Head network): the Backbone network is used to capture visual information of different levels, so that the input image is converted into a feature representation with stronger characterization capability, which is then further processed and fused by the Neck fusion network, thereby helping to improve the expressive power and discriminability of the features; finally, the Head network is adopted in the prediction stage to perform the corresponding loss calculation, weighting processing, back propagation optimization and the like, so as to improve the prediction accuracy and robustness.
In an alternative embodiment, as shown in FIG. 7, each time the data enhancement result passes through a layer of CSPModule module, the output result corresponding to that layer is recorded as a first sub-result; the output result obtained after the data enhancement result passes through the last layer of CSPModule module and the SPPF module is recorded as a second sub-result; the first processing result comprises all first sub-results and the second sub-result;
Step 204, executing a second processing operation on the first processing result according to the Neck network, where the manner of obtaining the second processing result corresponding to the first processing result specifically includes:
selecting the second sub-result as the first feature of the Neck network; and selecting the two first sub-results immediately preceding and adjacent to the second sub-result as the second feature and the third feature of the Neck network;
combining the second feature and the third feature with the first spatial weight and the second spatial weight allocated in the Neck network, and inputting the combined result into the feature pyramid corresponding to the Neck network; and then combining the first feature with the third spatial weight allocated in the Neck network and inputting it into the feature pyramid, to obtain multi-scale features corresponding to the first feature, the second feature and the third feature as a second processing result.
In this alternative embodiment, referring specifically to fig. 8, fig. 8 is a schematic diagram of the comparison structure before and after the network improvement according to an embodiment of the present invention. As shown in FIG. 8, when the Neck feature fusion network is applied, the last-layer features of the last three CSPModule modules are extracted from all feature layers of the Backbone network, thereby generating a set of features of different scales (shallow, middle and deep features), denoted as {C3, C4, C5}. To perform deep feature fusion on features of different scales, first the low-level and middle-level features {C3, C4} are input into the feature pyramid network, and then the high-level feature C5 is added. After the feature fusion step, a set of multi-scale features is generated, denoted as {P3, P4, P5}.
In this alternative embodiment, the architecture of the AFPN is shown in fig. 8, where the AFPN progressively integrates shallow, middle and deep features during the bottom-up feature extraction of the Backbone network. In addition, in the multi-level feature fusion process, features of different levels are allocated different spatial weights in a proportional manner (shallow : middle : deep = 5 : 3 : 2), so that the importance of key levels is enhanced and the influence of contradictory information from different targets is alleviated. The spatial weight allocation for each level may also be adjusted according to practical applications, which is not limited in the embodiment of the present invention.
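The 5:3:2 weighted fusion can be sketched as below. This is a simplification for illustration: it assumes the three feature maps have already been resized to a common spatial size (the full AFPN performs adaptive, per-location fusion), and the constant-valued feature maps are made-up inputs.

```python
import numpy as np

def fuse_levels(shallow, middle, deep, weights=(5, 3, 2)):
    """Fuse three same-sized feature maps with normalized spatial weights
    shallow:middle:deep = 5:3:2, as in the improved Neck network."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()  # 5:3:2 -> 0.5, 0.3, 0.2
    return w[0] * shallow + w[1] * middle + w[2] * deep

c3 = np.full((4, 4), 1.0)  # shallow-level feature map (illustrative)
c4 = np.full((4, 4), 2.0)  # middle-level feature map
c5 = np.full((4, 4), 3.0)  # deep-level feature map
fused = fuse_levels(c3, c4, c5)
# each element: 0.5*1 + 0.3*2 + 0.2*3 = 1.7
```

The larger weight on the shallow level reflects the scheme's emphasis on fine spatial detail, which matters for small defect regions.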
In the alternative embodiment, the improved Neck feature fusion network introduces a gradual layered fusion structure, and meanwhile, different spatial weights are allocated to different levels of features, so that the importance of a key level is enhanced, the influence of contradictory information of different targets is reduced, and the fusion accuracy and reliability of the Neck feature fusion network are improved.
In another alternative embodiment, the Head network is comprised of decoupled classification branches including VFL loss functions and regression branches including DFL loss functions and CIoU loss functions;
the Head network adopts a decoupling Head structure, extracts category characteristics and position characteristics through two parallel branches (a classification branch and a regression branch) respectively, and then completes classification and positioning tasks by a layer of 1X 1 convolution. Wherein the classification branch is used to perform the computation of the VFL loss function; the regression branch is used to perform the calculations of the DFL loss function and CIoU loss function.
In this optional embodiment, the step 205 performs a third processing operation on the second processing result according to the Head network, and the manner of obtaining the third processing result corresponding to the second processing result specifically includes:
sequentially inputting the second processing result into a DFL loss function, a CIoU loss function and a VFL loss function, and sequentially calculating to obtain a first loss value, a second loss value and a third loss value;
multiplying the first loss value, the second loss value and the third loss value by their corresponding weighted values and summing to obtain the total network loss of the Head network;
performing minimization processing on the total loss of the network through a back propagation algorithm to obtain a plurality of minimization processing results;
And selecting a minimum processing result of the total network loss within a preset target loss threshold from all the minimum processing results as a third processing result.
In this alternative embodiment, each loss function provides a different constraint to guide the learning process of the network. Specifically, the DFL loss function and the CIoU loss function constrain the regression branch so that it can accurately predict the location and shape of the target bounding box, while the VFL loss function constrains the classification branch so that it can accurately predict the class of the target. Therefore, the whole network can optimize the classification and regression tasks simultaneously during training, so as to improve the accuracy and robustness of prediction. The DFL loss models the location of the detection box as a general distribution, ensuring that the network quickly focuses on the distribution of locations closest to the target location. The specific calculation formula is shown as formula (1):
DFL(S_i, S_{i+1}) = -((y_{i+1} - y)·log(S_i) + (y - y_i)·log(S_{i+1}))    (1)
wherein S_i and S_{i+1} represent the location distribution (a general distribution) output by the regression branch, which can also be understood as the network's prediction of the target location; y_i and y_{i+1} represent the discrete position labels corresponding to S_i and S_{i+1}, i.e. the two positions bracketing the true target position y.
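Formula (1) can be checked numerically with a small sketch. The bin indices and predicted probabilities below are made-up illustrations; the sketch also shows the expected property that the loss is smallest when the predicted mass matches the target's distance to each bracketing bin.

```python
import math

def dfl_loss(s_i, s_i1, y, y_i, y_i1):
    """Distribution Focal Loss for one coordinate, following formula (1):
    a cross-entropy that pushes probability mass toward the two discrete
    bins bracketing the continuous target y (y_i <= y <= y_i1)."""
    return -((y_i1 - y) * math.log(s_i) + (y - y_i) * math.log(s_i1))

# Target position y = 4.3 lies between bins 4 and 5. A prediction of
# (0.7, 0.3) for those bins matches the target's fractional position.
loss_matched = dfl_loss(0.7, 0.3, 4.3, 4, 5)
loss_uniform = dfl_loss(0.5, 0.5, 4.3, 4, 5)  # less focused prediction
```

Since (y_{i+1} - y) + (y - y_i) = 1, the loss is minimized exactly when S_i = y_{i+1} - y and S_{i+1} = y - y_i, so `loss_matched` is smaller than `loss_uniform`.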
The CIoU loss function is used as the loss function for bounding box regression; on the basis of IoU, it further considers the distance between bounding boxes and the consistency of their shapes. The calculation formula is shown as formula (2):
L_CIoU = 1 - IoU + ρ²(b, b^gt)/c² + αv    (2)
wherein IoU denotes the intersection-over-union ratio; b and b^gt represent the center points of the predicted bounding box and the ground-truth bounding box, respectively; ρ(b, b^gt) represents the Euclidean distance between the two center points; c represents the diagonal length of the smallest enclosing box covering both boxes, acting as a normalization factor; α represents the balance coefficient; and αv represents the penalty term for the aspect-ratio difference between the two boxes, which remains informative even when the boxes do not overlap.
The VFL loss function is used for the loss calculation of the classification task; it reduces the weight of easy samples, thereby increasing the focus on hard samples. The calculation formula is shown as formula (3):
VFL(p, q) = -q·(q·log(p) + (1-q)·log(1-p)) when q > 0, and VFL(p, q) = -α·p^γ·log(1-p) when q = 0   (3)
where p denotes the predicted probability, output by the classification branch, that the sample belongs to the positive class, and q denotes the true label of the sample (a positive quality score for positive samples and 0 for negative samples). α is the positive/negative sample balance coefficient, used to adjust the weight between positive and negative samples and control the model's attention to samples of different categories. γ is the balance coefficient of hard and easy samples, used to adjust the weight among samples of different difficulty.
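A sketch of formula (3) using the published Varifocal Loss form; the default α and γ values below are illustrative assumptions, not the patent's settings:

```python
import math

def varifocal_loss(p: float, q: float, alpha: float = 0.75, gamma: float = 2.0) -> float:
    """Varifocal Loss of formula (3) for a single sample: positives
    (q > 0) get a binary cross-entropy weighted by the target score q;
    negatives (q == 0) are down-weighted by alpha * p**gamma so that
    easy negatives contribute almost nothing."""
    if q > 0:
        return -q * (q * math.log(p) + (1 - q) * math.log(1 - p))
    return -alpha * (p ** gamma) * math.log(1 - p)
```

An easy negative (p close to 0) contributes far less loss than a hard negative (p large), which is exactly the down-weighting of easily separable samples described above.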
It can be seen that in this alternative embodiment, the total loss of the target network model consists of a weighted sum of the three loss functions described above, and a back-propagation algorithm is used to minimize the total loss, thereby adjusting the weights and parameters of the network so that the predicted result is as close as possible to the true target class and bounding box. Finally, through the continuous iterative training process, the network can gradually improve its prediction performance and, in the prediction stage, identify the tiny change region and judge the state of the alveolar bone (normal, alveolar bone windowing, or cracking); that is, by setting the Head network to perform the final data prediction and recognition, the detection accuracy and recognition accuracy for small change regions of the alveolar bone are improved.
Example III
Referring to fig. 3, fig. 3 is a schematic structural diagram of a device for detecting and identifying defects of the maxillary anterior alveolar bone according to an embodiment of the present invention. The device may be a terminal, a device, a system or a server for detecting and identifying defects of the maxillary anterior alveolar bone, and the server may be a local server, a remote server or a cloud server; when the server is a non-cloud server, the non-cloud server can be communicatively connected with a cloud server. As shown in fig. 3, the maxillary anterior alveolar bone defect detection and identification device may include an acquisition module 301, a data enhancement processing module 302, and a feature processing module 303, wherein:
the acquisition module 301 is configured to acquire a target image set to be processed, where the target image set includes at least one CBCT image corresponding to the sagittal plane and coronal double-section of the maxillary anterior teeth; the target image set is marked by preset data.
The data enhancement processing module 302 is configured to perform data enhancement processing on the target image set according to a preset data enhancement policy, so as to obtain a data enhancement result corresponding to the target image set. The data enhancement processing includes image stitching and data amplification based on a self-learning data enhancement strategy.
The feature processing module 303 is configured to perform a preset feature processing operation on the data enhancement result according to the determined target network model, so as to obtain a feature processing result corresponding to the data enhancement result, where the feature processing result includes state information of an alveolar bone, and the state information includes first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has a window and/or a crack;
the feature processing operation comprises operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
As can be seen, after the maxillary anterior alveolar bone defect detection and identification device described in fig. 3 acquires the target image set, the data set is enriched and expanded by the data enhancement processing module (which adds image stitching and a self-learning enhancement strategy on the basis of existing enhancement modes), so that the robustness of the model is improved when the data set is input into the target network model for model training; furthermore, feature processing is performed through the optimized target network model, which can adapt to digital dental CBCT images with higher spatial resolution; meanwhile, through the improvement of the feature extraction and fusion modules, the optimized target network model improves the corresponding adaptation degree, detection accuracy and recognition accuracy when detecting and identifying tiny change regions of tooth bone defects.
In an optional embodiment, the data enhancement processing module 302 performs data enhancement processing on the target image set according to a preset data enhancement policy, and the manner of obtaining the data enhancement result corresponding to the target image set specifically includes:
performing image stitching on the target image set according to a preset data enhancement strategy to obtain an image stitching result of the target image set, wherein the image stitching comprises random stitching or similar stitching according to image similarity; the image similarity corresponding to two or more target images which are spliced in a similar way is within a preset similarity threshold;
according to the determined self-learning data enhancement strategy, algorithm searching is carried out on a plurality of predefined image enhancement operations, and a target enhancement algorithm with an adaptive image splicing result and corresponding enhancement parameters are obtained;
and then, according to the target enhancement algorithm and the corresponding enhancement parameters thereof, performing enhancement operation on the image splicing result to obtain an enhancement result of the image splicing result, wherein the enhancement result is used as a data enhancement result corresponding to the target image set.
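The stitching step above can be sketched in plain Python; the similarity metric (mean absolute pixel difference on flattened images), the list-concatenation stand-in for stitching, and all names are illustrative assumptions, not the patent's exact procedure:

```python
import random

def mean_abs_diff(a, b):
    """Mean absolute intensity difference between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def stitch_by_similarity(images, threshold=0.1, seed=0):
    """Pair each image with its most similar counterpart; if the difference
    is within `threshold`, stitch the similar pair, otherwise fall back to
    random stitching with an arbitrary partner."""
    rng = random.Random(seed)
    stitched = []
    for i, img in enumerate(images):
        candidates = [(mean_abs_diff(img, other), j)
                      for j, other in enumerate(images) if j != i]
        diff, j = min(candidates)
        if diff > threshold:              # no similar image: random stitching
            j = rng.randrange(len(images))
        stitched.append(img + images[j])  # concatenation stands in for stitching
    return stitched
```

In a real pipeline the stitched pairs would then be passed to the searched enhancement algorithm to produce the data enhancement result.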
Therefore, in the optional embodiment, before the target image set is input into the subsequent target network model, the expansion and enrichment of the data set are realized through the improved data enhancement strategy, so that the subsequent training and application of the target network model based on the expanded and enriched data set are facilitated, the robustness of the target network model is improved, and the target network model is more suitable for the actual application scene.
In another alternative embodiment, the self-learning data enhancement strategy includes a search space and a search algorithm;
the search space comprises a first preset number of sub-strategies, each sub-strategy comprises a second preset number of target strategy operations, each target strategy operation in a single sub-strategy has an operation sequence, and two adjacent target strategy operations are different in operation type; each target policy operation is for image enhancement;
wherein, each target strategy operation has corresponding operation probability and operation strength; the operation probability corresponding to each target strategy operation is a third preset number of first discrete values, and the operation strength is a fourth preset number of second discrete values; and all first discrete values follow a uniform distribution, all second discrete values follow a uniform distribution.
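The search space above can be sketched as follows; the operation names, grid sizes and counts are illustrative assumptions, not the patent's concrete values:

```python
import random

OPS = ["rotate", "shear", "contrast", "brightness"]
PROBS = [i / 10 for i in range(11)]   # third preset number of uniform first discrete values
MAGS = [i / 9 for i in range(10)]     # fourth preset number of uniform second discrete values

def build_search_space(num_sub_policies=5, ops_per_policy=2, seed=0):
    """Builds sub-policies of ordered operations; adjacent operations in a
    sub-policy differ in type, and each operation carries a (probability,
    magnitude) pair drawn from the uniformly spaced discrete grids."""
    rng = random.Random(seed)
    space = []
    for _ in range(num_sub_policies):
        ops = rng.sample(OPS, ops_per_policy)  # sampling without replacement keeps adjacent ops distinct
        space.append([(op, rng.choice(PROBS), rng.choice(MAGS)) for op in ops])
    return space
```

Each sub-policy is thus an ordered list of (operation, probability, magnitude) triples, which is the structure the subsequent search algorithm scores against the image stitching result.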
In this optional embodiment, further, the data enhancement processing module 302 performs algorithm searching from a predefined plurality of image enhancement operations according to the determined self-learning data enhancement policy, and the manner of obtaining the target enhancement algorithm adapted by the image stitching result and the corresponding enhancement parameters thereof specifically includes:
for each sub-strategy, according to a search algorithm, carrying out search pairing on all first discrete values and all second discrete values corresponding to the sub-strategy and an image splicing result to obtain a pairing set corresponding to the sub-strategy, wherein the pairing set corresponding to the sub-strategy comprises a plurality of pairing groups, each pairing group corresponds to one group of first discrete values and second discrete values, and each pairing group has a pairing value with the image splicing result;
Determining a pairing group with the highest pairing value from all pairing sets and all pairing groups included in the pairing sets, and marking the pairing group as a target pairing group;
and determining the sub-strategy corresponding to the target pairing group as a target enhancement algorithm matched with the image splicing result, and determining the first discrete value and the second discrete value corresponding to the target pairing group as enhancement parameters corresponding to the target enhancement algorithm.
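The pairing search above reduces to an argmax over all (probability, magnitude) pairs; a sketch, with `pairing_value` standing in for whatever score the search algorithm assigns a pair against the stitching result (an assumption, since the patent does not fix the scoring function):

```python
def select_target_enhancement(space, pairing_value):
    """Scores every (probability, magnitude) pair of every sub-policy and
    returns the sub-policy owning the highest-scoring pair as the target
    enhancement algorithm, with that pair as its enhancement parameters."""
    best_score, best_policy, best_params = float("-inf"), None, None
    for policy in space:
        for op, prob, mag in policy:
            score = pairing_value(op, prob, mag)  # pairing value with the stitched images
            if score > best_score:
                best_score, best_policy, best_params = score, policy, (prob, mag)
    return best_policy, best_params
```

The returned policy and parameters are then applied to the image stitching result to produce the final data enhancement result.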
Therefore, in the optional embodiment, through the search space (including the sub-strategy, the uniform discretized operation intensity and the operation probability thereof) and the set of the search algorithm, the intelligent learning optimization of the data enhancement strategy is realized, so that the optimal data enhancement strategy is finally obtained, the determination accuracy and reliability of the data enhancement strategy are improved, and the robustness of the target network model is improved when the data set enhanced by the data enhancement strategy is input into the target network model for training.
In yet another alternative embodiment, the target network model is an optimized YOLO v8 model; the target network model comprises an improved Backbone network, an improved Neck network and a Head network;
the Backbone network comprises a fifth preset number of layers of CSPModule modules and one layer of SPPF module, and the two types of modules adopt a cascade structure in the Backbone network;
The Neck network adopts an AFPN progressive characteristic pyramid structure;
the feature processing module 303 includes:
the first processing sub-module 3031 is configured to execute a first processing operation on the input data enhancement result according to all the CSPModule modules and the SPPF module, so as to obtain a first processing result corresponding to the data enhancement result; the first processing operation includes feature extraction and downsampling;
the second processing sub-module 3032 is configured to perform a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result, where the second processing operation includes at least three layers of progressive feature fusion operations; and the feature fusion operations corresponding to different levels correspond to different space weights;
a third processing sub-module 3033, configured to perform a third processing operation on the second processing result according to the Head network, to obtain a third processing result corresponding to the second processing result, as a feature processing result corresponding to the data enhancement result;
the third processing operation comprises at least three operations corresponding to loss calculation, loss weighting calculation and back propagation optimization.
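The flow through the three sub-modules can be sketched with each network stubbed as a callable; the names are placeholders, not the real YOLO v8 layers:

```python
def target_network_forward(x, csp_layers, sppf, neck, head):
    """High-level sketch of the three processing stages: cascaded CSPModule
    layers plus the SPPF module produce the first processing result, the
    Neck network fuses it into the second processing result, and the Head
    network turns that into the third processing result."""
    first_subs = []
    for csp in csp_layers:                 # first processing: extraction + downsampling
        x = csp(x)
        first_subs.append(x)
    second_sub = sppf(first_subs[-1])      # last-layer output through SPPF
    fused = neck(first_subs, second_sub)   # second processing: progressive fusion
    return head(fused)                     # third processing: loss calc, weighting, back-prop
```

The per-layer outputs collected in `first_subs` correspond to the first sub-results described below, and `second_sub` to the second sub-result.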
It can be seen that in this alternative embodiment, the idea of spatial pyramid pooling can be fused through the improved YOLO v8 model (specifically including the improved Backbone network, the Neck network and the Head network), and the Backbone network is used to capture visual information of different levels, so that an input image is converted into a feature representation with stronger characterization capability and then input into the Neck fusion network for further processing and fusion, which is beneficial to improving the expression capability and discriminability of the features; finally, the Head network performs the corresponding loss calculation, weighting processing, back-propagation optimization and the like in the prediction stage, so as to improve the prediction accuracy and robustness.
In another alternative embodiment, each time the data enhancement result passes through a layer of CSPModule modules, the output result corresponding to that layer is recorded as a first sub-result; the output result of the data enhancement result after passing through the last CSPModule layer and the SPPF module is recorded as a second sub-result; the first processing result comprises all the first sub-results and the second sub-result;
the second processing sub-module 3032 executes a second processing operation on the first processing result according to the Neck network, and the manner of obtaining the second processing result corresponding to the first processing result specifically includes:
selecting the second sub-result as the first feature of the Neck network, and selecting the two first sub-results immediately preceding the second sub-result as the second feature and the third feature of the Neck network;
combining the second feature and the third feature with the first spatial weight and the second spatial weight allocated in the Neck network, and inputting them into the feature pyramid corresponding to the Neck network; then inputting the first feature, together with the third spatial weight allocated in the Neck network, into the feature pyramid to obtain the multi-scale features corresponding to the first feature, the second feature and the third feature as the second processing result.
In the alternative embodiment, the improved Neck feature fusion network introduces a gradual layered fusion structure, and meanwhile, different spatial weights are allocated to different levels of features, so that the importance of a key level is enhanced, the influence of contradictory information of different targets is reduced, and the fusion accuracy and reliability of the Neck feature fusion network are improved.
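The weighted combination at the heart of this fusion can be illustrated with a toy sketch; real AFPN learns per-location weights from convolutions, whereas here the weights are given constants and the feature maps are flattened lists of equal length (an assumption for brevity):

```python
def weighted_fusion(features, weights):
    """Per-element weighted fusion of same-shaped feature maps from
    different levels, with level weights normalised to sum to one, so a
    level with a larger weight dominates the fused feature."""
    total = sum(weights)
    return [sum(w * f[i] for w, f in zip(weights, features)) / total
            for i in range(len(features[0]))]
```

With equal weights the fusion is a plain average; raising one level's weight pulls the fused feature toward that level, which is how the key levels are emphasised.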
In yet another alternative embodiment, the Head network is comprised of decoupled classification branches including VFL loss functions and regression branches including DFL loss functions and CIoU loss functions;
the third processing sub-module 3033 executes a third processing operation on the second processing result according to the Head network, so as to obtain a third processing result corresponding to the second processing result specifically includes:
sequentially inputting the second processing result into a DFL loss function, a CIoU loss function and a VFL loss function, and sequentially calculating to obtain a first loss value, a second loss value and a third loss value;
multiplying the first loss value, the second loss value and the third loss value by their corresponding weighted values and summing to obtain the total network loss of the Head network;
performing minimization processing on the total loss of the network through a back propagation algorithm to obtain a plurality of minimization processing results;
And selecting a minimum processing result of the total network loss within a preset target loss threshold from all the minimum processing results as a third processing result.
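A toy sketch of this minimisation loop on a single scalar parameter; the learning rate, step count and loss threshold below are made-up values, and plain gradient descent stands in for full back-propagation:

```python
def minimise_total_loss(param, loss_and_grad, lr=0.1, steps=200, target=1e-3):
    """Gradient descent on the weighted total loss, collecting every
    iterate whose loss falls within the preset target loss threshold
    (the 'minimisation processing results' described above)."""
    within_target = []
    for _ in range(steps):
        loss, grad = loss_and_grad(param)
        if loss <= target:
            within_target.append((loss, param))
        param -= lr * grad
    return param, within_target
```

Any entry of `within_target` is a candidate third processing result; the text selects one whose total loss lies within the preset threshold.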
It can be seen that in this alternative embodiment, the total loss of the target network model consists of a weighted sum of the three loss functions described above, and a back-propagation algorithm is used to minimize the total loss, thereby adjusting the weights and parameters of the network so that the predicted result is as close as possible to the true target class and bounding box. Finally, through the continuous iterative training process, the network can gradually improve its prediction performance and, in the prediction stage, identify the tiny change region and judge the state of the alveolar bone (normal, alveolar bone windowing, or cracking); that is, by setting the Head network to perform the final data prediction and recognition, the detection accuracy and recognition accuracy for small change regions of the alveolar bone are improved.
Example IV
Referring to fig. 5, fig. 5 is a schematic structural view of a detection and identification device for defects of maxillary anterior alveolar bone according to an embodiment of the present invention. As shown in fig. 5, the maxillary anterior alveolar bone defect detecting and identifying device may include:
a memory 401 storing executable program codes;
A processor 402 coupled with the memory 401;
the processor 402 invokes executable program codes stored in the memory 401 to perform the steps in the maxillary anterior alveolar bone defect detection and identification method described in the first or second embodiment of the present invention.
Example five
The embodiment of the invention discloses a computer storage medium which stores computer instructions for executing the steps in the maxillary anterior alveolar bone defect detection and identification method described in the first or second embodiment of the invention when the computer instructions are called.
Example six
An embodiment of the present invention discloses a computer program product comprising a non-transitory computer storage medium storing a computer program, and the computer program is operable to cause a computer to perform the steps of the method for detecting and identifying a defect of an anterior maxillary alveolar bone as described in the first or second embodiment.
The apparatus embodiments described above are merely illustrative, in which the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer storage medium including Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), programmable Read-Only Memory (PROM), erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable Read-Only Memory (OTPROM), electrically erasable programmable Read-Only Memory (EEPROM), compact disc Read-Only Memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc Memory, magnetic disk Memory, tape Memory, or any other medium readable by a computer that can be used to carry or store data.
Finally, it should be noted that the method and device for detecting and identifying defects of the maxillary anterior alveolar bone disclosed in the embodiments of the present invention are only preferred embodiments of the invention, used to illustrate rather than limit the technical solution of the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of the technical features therein can be replaced equivalently; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. A method for detecting and identifying defects of maxillary anterior alveolar bone, the method comprising:
acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to a sagittal plane and a coronal double-section of the anterior maxillary tooth; the target image set is marked by preset data;
according to a preset data enhancement strategy, performing data enhancement processing on the target image set to obtain a data enhancement result corresponding to the target image set; the data enhancement processing comprises image splicing and data amplification based on a self-learning data enhancement strategy;
Performing a preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result, wherein the feature processing result comprises state information of an alveolar bone, and the state information comprises first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has windowing and/or cracking;
the feature processing operation comprises the operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
2. The method for detecting and identifying defects of maxillary anterior alveolar bone according to claim 1, wherein the performing data enhancement processing on the target image set according to a preset data enhancement policy to obtain a data enhancement result corresponding to the target image set includes:
performing image stitching on the target image set according to a preset data enhancement strategy to obtain an image stitching result of the target image set, wherein the image stitching comprises random stitching or similar stitching according to image similarity; the image similarity corresponding to the two or more target images which are subjected to similar splicing is within a preset similarity threshold;
According to the determined self-learning data enhancement strategy, algorithm searching is carried out on a plurality of predefined image enhancement operations, and a target enhancement algorithm and corresponding enhancement parameters of the image splicing result adaptation are obtained;
and then, according to the target enhancement algorithm and the corresponding enhancement parameters thereof, performing enhancement operation on the image splicing result to obtain an enhancement result of the image splicing result, wherein the enhancement result is used as a data enhancement result corresponding to the target image set.
3. The method for detecting and identifying defects of maxillary anterior alveolar bone according to claim 2, wherein the self-learning data enhancement strategy comprises a search space and a search algorithm;
the search space comprises a first preset number of sub-strategies, each sub-strategy comprises a second preset number of target strategy operations, the target strategy operations in a single sub-strategy have an operation sequence, and two adjacent target strategy operations are different in operation type; each of the target strategy operations is for image enhancement;
wherein, each target strategy operation has corresponding operation probability and operation strength; the operation probability corresponding to each target strategy operation is a third preset number of first discrete values, and the operation intensity is a fourth preset number of second discrete values; and all the first discrete values follow a uniform distribution, all the second discrete values follow a uniform distribution.
4. The method for detecting and identifying defects of maxillary anterior alveolar bone according to claim 3, wherein the performing an algorithm search from a predefined plurality of image enhancement operations according to the determined self-learning data enhancement strategy, to obtain the target enhancement algorithm adapted to the image stitching result and the corresponding enhancement parameters thereof, comprises:
for each sub-strategy, according to the search algorithm, executing search pairing on all the first discrete values and all the second discrete values corresponding to the sub-strategy and the image splicing result to obtain a pairing set corresponding to the sub-strategy, wherein the pairing set corresponding to the sub-strategy comprises a plurality of pairing groups, each pairing group corresponds to one group of the first discrete values and the second discrete values, and each pairing group has a pairing value with the image splicing result;
determining a pairing group with the highest pairing value from all pairing sets and all pairing groups included in the pairing sets, and marking the pairing group as a target pairing group;
and determining the sub-strategy corresponding to the target pairing group as a target enhancement algorithm matched with the image splicing result, and determining the first discrete value and the second discrete value corresponding to the target pairing group as enhancement parameters corresponding to the target enhancement algorithm.
5. The method for detecting and identifying a defect in an alveolar bone of a maxillary anterior tooth according to any one of claims 1-4, wherein said target network model is an optimized YOLO v8 model; the target network model comprises an improved Backbone network, an improved Neck network and a Head network;
the Backbone network comprises a fifth preset number of layers of CSPModule modules and one layer of SPPF module, and the two types of modules adopt a cascade structure in the Backbone network;
the Neck network adopts an AFPN progressive characteristic pyramid structure;
the step of executing a preset feature processing operation on the data enhancement result according to the determined target network model to obtain a feature processing result corresponding to the data enhancement result, includes:
executing a first processing operation on the input data enhancement result according to all the CSPModule modules and the SPPF module to obtain a first processing result corresponding to the data enhancement result; the first processing operation includes the feature extraction and the downsampling;
executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result, wherein the second processing operation comprises at least three layers of progressive feature fusion operation; and the feature fusion operations corresponding to different levels correspond to different space weights;
Executing a third processing operation on the second processing result according to the Head network to obtain a third processing result corresponding to the second processing result, wherein the third processing result is used as a characteristic processing result corresponding to the data enhancement result;
the third processing operation comprises at least three operations corresponding to loss calculation, loss weighting calculation and back propagation optimization.
6. The method according to claim 5, wherein each time the data enhancement result passes through a layer of the CSPModule modules, an output result corresponding to that layer of the CSPModule modules is output and recorded as a first sub-result; the output result of the data enhancement result after passing through the last layer of the CSPModule modules and the SPPF module is recorded as a second sub-result; the first processing result comprises all the first sub-results and the second sub-result;
the step of executing a second processing operation on the first processing result according to the Neck network to obtain a second processing result corresponding to the first processing result includes:
selecting the second sub-result as the first feature of the Neck network; selecting the two first sub-results immediately preceding the second sub-result as the second feature and the third feature of the Neck network;
combining the second feature and the third feature with a first spatial weight and a second spatial weight allocated in the Neck network, and inputting them into the feature pyramid corresponding to the Neck network; and inputting the first feature and the third spatial weight allocated in the Neck network into the feature pyramid to obtain multi-scale features corresponding to the first feature, the second feature and the third feature as a second processing result.
7. The method of detecting and identifying anterior maxillary alveolar bone defects according to claim 5 or 6, wherein the Head network is comprised of decoupled classification branches including VFL loss functions and regression branches including DFL loss functions and CIoU loss functions;
and executing a third processing operation on the second processing result according to the Head network to obtain a third processing result corresponding to the second processing result, where the third processing result includes:
sequentially inputting the second processing result into the DFL loss function, the CIoU loss function and the VFL loss function, and sequentially calculating to obtain a first loss value, a second loss value and a third loss value;
multiplying the first loss value, the second loss value and the third loss value by their corresponding weighted values and summing to obtain the total network loss of the Head network;
Performing minimization processing on the total loss of the network through a back propagation algorithm to obtain a plurality of minimization processing results;
and selecting a minimum processing result of the total network loss within a preset target loss threshold from all the minimum processing results as a third processing result.
8. A device for detecting and identifying defects of an alveolar bone of a front maxilla, the device comprising:
the acquisition module is used for acquiring a target image set to be processed, wherein the target image set comprises at least one CBCT image corresponding to a sagittal plane of the anterior maxillary tooth and a coronal double-section; the target image set is marked by preset data;
the data enhancement processing module is used for executing data enhancement processing on the target image set according to a preset data enhancement strategy to obtain a data enhancement result corresponding to the target image set; the data enhancement processing comprises image splicing and data amplification based on a self-learning data enhancement strategy;
the characteristic processing module is used for executing preset characteristic processing operation on the data enhancement result according to the determined target network model to obtain a characteristic processing result corresponding to the data enhancement result, wherein the characteristic processing result comprises state information of an alveolar bone, and the state information comprises first state information indicating that the alveolar bone is normal or second state information indicating that the alveolar bone has windowing and/or cracking;
The feature processing operation comprises the operations corresponding to feature extraction, downsampling, feature fusion, loss calculation and data prediction.
9. A device for detecting and identifying defects of an alveolar bone of a front maxilla, the device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the maxillary anterior alveolar bone defect detection and identification method of any one of claims 1-7.
10. A computer storage medium storing computer instructions which, when invoked, cause the method for detecting and identifying defects of the maxillary anterior alveolar bone as defined in any one of claims 1-7 to be performed.
CN202311287033.4A 2023-10-07 2023-10-07 Method and device for detecting and identifying defects of maxillary anterior alveolar bone Pending CN117252847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311287033.4A CN117252847A (en) 2023-10-07 2023-10-07 Method and device for detecting and identifying defects of maxillary anterior alveolar bone

Publications (1)

Publication Number Publication Date
CN117252847A (en) 2023-12-19

Family

ID=89129074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311287033.4A Pending CN117252847A (en) 2023-10-07 2023-10-07 Method and device for detecting and identifying defects of maxillary anterior alveolar bone

Country Status (1)

Country Link
CN (1) CN117252847A (en)

Similar Documents

Publication Publication Date Title
CN111899245B (en) Image segmentation method, image segmentation device, model training method, model training device, electronic equipment and storage medium
US11263434B2 (en) Fast side-face interference resistant face detection method
CN111178197B (en) Mass R-CNN and Soft-NMS fusion based group-fed adherent pig example segmentation method
JP2020009402A (en) Method and system for automatic chromosome classification
CN109271958B (en) Face age identification method and device
CN110796199B (en) Image processing method and device and electronic medical equipment
JPWO2019026104A1 (en) Information processing apparatus, information processing program, and information processing method
CN112949704B (en) Tobacco leaf maturity state identification method and device based on image analysis
CN111626349A (en) Target detection method and system based on deep learning
CN110245620B (en) Non-maximization inhibition method based on attention
CN112560710B (en) Method for constructing finger vein recognition system and finger vein recognition system
CN112836625A (en) Face living body detection method and device and electronic equipment
CN112614133A (en) Three-dimensional pulmonary nodule detection model training method and device without anchor point frame
CN111797705A (en) Action recognition method based on character relation modeling
CN111126155B (en) Pedestrian re-identification method for generating countermeasure network based on semantic constraint
CN111783716A (en) Pedestrian detection method, system and device based on attitude information
CN115359264A (en) Intensive distribution adhesion cell deep learning identification method
US20230206601A1 (en) Device and method for classifying images and accessing the robustness of the classification
CN117315380B (en) Deep learning-based pneumonia CT image classification method and system
CN112418299B (en) Coronary artery segmentation model training method, coronary artery segmentation method and device
CN113283388A (en) Training method, device and equipment of living human face detection model and storage medium
CN117371511A (en) Training method, device, equipment and storage medium for image classification model
CN112001921A (en) New coronary pneumonia CT image focus segmentation image processing method based on focus weighting loss function
CN117252847A (en) Method and device for detecting and identifying defects of maxillary anterior alveolar bone
CN113780444B (en) Training method of tongue fur image classification model based on progressive learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination