CN115761226A - Oral cavity image segmentation identification method and device, electronic equipment and storage medium - Google Patents

Oral cavity image segmentation identification method and device, electronic equipment and storage medium

Info

Publication number: CN115761226A
Authority: CN (China)
Prior art keywords: image, segmentation, oral cavity, information, result
Legal status: Pending
Application number: CN202211390075.6A
Other languages: Chinese (zh)
Inventors: 杨慧芳, 李刚
Current Assignee: Peking University School of Stomatology
Original Assignee: Peking University School of Stomatology
Application filed by Peking University School of Stomatology
Priority to CN202211390075.6A
Publication of CN115761226A

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an oral cavity image segmentation and identification method and device, electronic equipment and a storage medium. The method comprises the following steps: preprocessing and registering the oral cavity image to obtain an image to be segmented; inputting the image to be segmented into a pre-trained neural network segmentation model to obtain a semantic classification of each tissue in the oral cavity; and comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue. In this way a panoramic segmentation scheme is realized: training complexity and workload are reduced, a general-purpose segmentation model is obtained, and segmentation and recognition efficiency is improved. Moreover, artificial intelligence and semantic segmentation are combined and applied to oral image processing, so that multiple tissues in homologous data can be segmented simultaneously.

Description

Oral cavity image segmentation identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of medical image processing, in particular to an oral image segmentation and identification method and device, electronic equipment and a storage medium.
Background
Medical practice and scientific research often measure the boundary, shape, cross-section or volume of a tissue or organ to obtain its structural or texture information; accurate segmentation and measurement therefore have important clinical significance for the diagnosis and treatment of disease. Image segmentation divides an image into several meaningful parts according to a uniformity (or consistency) criterion, so that each part satisfies the consistency requirement. In many cases image segmentation is a classification problem over pixels or voxels.
Tooth and alveolar bone morphology is important information for dentists, especially in orthodontics, implantology and periodontics. As Cone Beam Computed Tomography (CBCT) has become widely used in oral medicine, periodontal state and root morphology are receiving more and more attention; applying three-dimensional segmentation and three-dimensional virtual reconstruction to build anatomical models of teeth and surrounding tissues can provide clinicians with information for more accurate diagnosis.
Tooth CBCT image segmentation can yield information on the crown and root of a single tooth, periodontal information, information on the whole dental arch and alveolar bone, information on cranial bone tissue, and so on. Most current research segments only a single tissue and does not fully exploit the morphology and texture differences of tissue interface regions for joint multi-tissue analysis; moreover, existing methods for segmenting tissues in three-dimensional volume data remain at the laboratory research stage, with processing pipelines too complex to be popularized in radiology and clinical practice.
Disclosure of Invention
In view of the above problems, the present invention is proposed to provide an oral cavity image segmentation recognition method, apparatus, electronic device and storage medium that overcome the above problems or at least partially solve the above problems.
According to an aspect of the present invention, there is provided an oral cavity image segmentation and identification method, the method including:
preprocessing and registering the oral cavity image to obtain an image to be segmented;
inputting the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classification of each tissue in the oral cavity;
and comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue.
Optionally, when the oral cavity image is preprocessed and registered, or when the neural network segmentation model is pre-trained, the oral cavity image or the training sample image is processed by any one or more of the following:
carrying out coordinate and gray value correction or normalization processing;
setting an HU threshold for each tissue in the image, and performing data filtering;
selecting reasonable and effective images from the original data and performing data enhancement, including scaling up, scaling down, rotating, translating or cropping the selected images;
performing feature-based registration of three-dimensional volume layer data;
registration based on voxel characteristics of the anatomical region is performed.
Optionally, the step of inputting the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classification of each tissue in the oral cavity specifically includes the following steps:
processing the image to be segmented by a visual parser to obtain partial features and overall features;
inputting the partial features and the overall features into a pre-trained neural network segmentation model, wherein the neural network segmentation model is a Transformer model constructed based on a scaled dot-product attention mechanism and/or a multi-head attention mechanism;
extracting partial features with semantic information from the partial features and overall features output by the previous layer, using the encoder in the Transformer model;
and fusing the partial features with semantic information into the overall features by utilizing a decoder in a Transformer model.
Optionally, the neural network segmentation model is a segmentation model formed by fusing a Transformer model and a U-net model.
Optionally, before comparing and analyzing the semantic classification with the anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue, the method further includes:
dividing typical clinical oral cavity images into regions according to different anatomical positions, or dividing morphological features according to different functions;
and storing the spatial information, morphological information, texture information, pathological information, feature quantification information and/or tissue-to-surrounding-tissue correlation information of the corresponding tissues into an anatomical knowledge base.
Optionally, comparing and analyzing the semantic classification with the anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue specifically includes at least one of the following:
analyzing the spatial position, morphological characteristics and texture characteristics of each tissue in the instance segmentation result or the disease feature quantification result, and analyzing the corresponding diagnosis result or clinical significance;
evaluating the obtained disease feature quantification result to obtain disease state information;
and verifying the anatomical knowledge base by using the instance segmentation result or the disease feature quantification result.
Optionally, the method further includes:
saving the instance segmentation result or the disease feature quantification result to the anatomical knowledge base.
According to another aspect of the present invention, there is provided an oral image segmentation and identification device, the device comprising:
the preprocessing module is used for preprocessing and registering the oral cavity image to obtain an image to be segmented;
the semantic classification module is suitable for inputting the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classification of each tissue in the oral cavity;
and the instance segmentation module is suitable for comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result and a disease feature quantification result for each tissue.
According to still another aspect of the present invention, there is provided an electronic apparatus including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface are communicated with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the oral cavity image segmentation identification method.
According to still another aspect of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the above-mentioned method for identifying oral cavity image segmentation.
According to the oral cavity image segmentation and identification scheme disclosed by the invention, the oral cavity image is first preprocessed and registered to obtain an image to be segmented; the image to be segmented is then input into a pre-trained neural network segmentation model to obtain a semantic classification of each tissue in the oral cavity, and the semantic classification is compared and analyzed with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue, thereby realizing a panoramic segmentation scheme. With this scheme, training samples can be accurately annotated from small-sample data, so that a trained neural network segmentation model automatically segments and identifies oral cavity images; training complexity and workload are reduced, efficiency is improved, and a general-purpose segmentation model is obtained. Moreover, combining artificial intelligence with semantic segmentation and applying them to oral image processing enables simultaneous segmentation of multiple tissues in homologous data.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a flowchart illustrating an oral cavity image segmentation and identification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a principle of achieving registration of two images by using a mutual information algorithm according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a calculation process of a zoom dot product attention mechanism according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a Transformer model provided in an embodiment of the present invention;
FIG. 5 is a diagram illustrating a training process of a neural network segmentation model according to an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating the structure of a disease state provided by an embodiment of the present invention;
fig. 7 is a schematic structural diagram illustrating an apparatus for segmenting and recognizing an oral cavity image according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 is a flowchart illustrating an embodiment of the oral cavity image segmentation and identification method disclosed in the present invention, applied to an electronic device. The electronic device is an intelligent terminal device, computer device and/or cloud provided with a computer program for oral cavity image segmentation and identification. The intelligent terminal device includes but is not limited to a smartphone and a tablet (PAD); the computer device includes but is not limited to a personal computer, a notebook computer, an industrial computer, a network host, a single network server, or a set of multiple network servers; the cloud is made up of a large number of computers or web servers based on Cloud Computing, a type of distributed computing in which a collection of loosely coupled computers forms a virtual supercomputer.
As shown in fig. 1, the method comprises the steps of:
step 110: and preprocessing and registering the oral cavity image to obtain an image to be segmented.
Here the image refers to two-dimensional or three-dimensional image data. Preprocessing includes standardizing or normalizing the image; registration is the process of matching and superimposing two or more images acquired at different times, by different sensors (imaging devices), or under different conditions (weather, illumination, camera position and angle, etc.). Both steps mainly improve the consistency of the images, thereby helping to guarantee the accuracy of segmentation and identification.
Step 120: inputting the image to be segmented into a pre-trained neural network segmentation model to obtain the semantic classification of each tissue in the oral cavity.
The type and composition of the neural network segmentation model in this step are not limited, and include network models such as CNN, U-net and Transformer. It should be noted that, during pre-training, reasonable regularization is adopted to avoid over-fitting of the trained network; data cleaning is therefore an important step, and regularization schemes such as data filtering, morphological methods, data enhancement, feature extractor selection, loss function selection, Bayesian inference with different weights, three-dimensional mesh reconstruction, denoising and boundary smoothing are adopted to improve the generalization ability of the segmentation model.
Step 130: comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue.
In this step, relevant prior knowledge of oral anatomy, including possible distribution positions, distribution characteristics and other information, facilitates accurate segmentation of oral medical images and introduces new diagnostic models and schemes for orthodontics and periodontics. The degree of disease damage (e.g. damage from dental caries, the height and width of periodontal recession, the area and distribution pattern of periapical lesion shadows) can thus be analyzed effectively and intuitively in terms of mathematical and physical models, making the resulting models of oral teeth and bone tissue more accurate.
In summary, the solution disclosed in the above embodiments solves the annotation and segmentation problems of medical big data by means of an automatic annotation model for small-sample, weakly annotated multi-dimensional data. The segmentation result can be denoised, boundary-smoothed and reconstructed into a three-dimensional mesh in combination with morphological and other methods, and the segmentation and identification results can serve clinical auxiliary diagnosis, such as mechanical modeling, 3D printing, disease typing and clinical decision-making.
In one embodiment, when the oral cavity image is preprocessed and registered in step 110, or when the neural network segmentation model is pre-trained, the oral cavity image or the training sample image is processed by any one of the following:
firstly, the pre-processing can be performed by performing coordinate and gray value correction or normalization processing on the image to be segmented or the sample during training.
Next, an HU threshold may be set for each tissue in the oral image or training sample, and data filtering may be performed.
It should be noted that the traditional neural network segmentation model is mainly based on a CNN structure and relies on convolution to extract image feature information; gradient descent, a common first-order optimization method and one of the simplest and most classical methods for unconstrained optimization, determines optimal parameter values by minimizing the error of a cost function, thereby improving network performance. However, oral CBCT images are three-dimensional volume data with a large data volume, and prior information indicates that the HU values of teeth and bone tissues are generally greater than 500, so data filtering is a very critical step: this information effectively helps identify the category of each three-dimensional voxel.
In this embodiment, redundant information in input images of the same type can be removed by high-pass or band-pass filtering of the Hounsfield Unit (HU) values of the CBCT images, saving memory and GPU resources. At the same time, noise data are filtered out and effective features are extracted from the images; training the neural network on these features reduces computation and helps avoid over-fitting and under-fitting.
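To make the filtering step concrete, the following is a minimal sketch assuming the CBCT volume is already loaded as a NumPy array of HU values; the exact HU windows are illustrative assumptions, with only the teeth/bone > 500 HU prior taken from the description above.

```python
import numpy as np

# Hypothetical HU windows per tissue; only the >500 HU prior for teeth and
# bone comes from the text above, the exact bounds are assumptions.
HU_RANGES = {
    "bone": (500.0, 2000.0),
    "tooth": (1500.0, 3500.0),
}

def hu_bandpass(volume_hu: np.ndarray, low: float, high: float) -> np.ndarray:
    """Band-pass filter on HU values: voxels outside [low, high] are zeroed,
    keeping only candidate voxels for the tissue of interest."""
    mask = (volume_hu >= low) & (volume_hu <= high)
    return np.where(mask, volume_hu, 0.0)

def normalize(volume_hu: np.ndarray, low: float, high: float) -> np.ndarray:
    """Clip to the HU window and min-max normalize to [0, 1] before feeding
    the volume to the segmentation network."""
    clipped = np.clip(volume_hu, low, high)
    return (clipped - low) / (high - low)

# Usage: keep only candidate bone voxels, then normalize for training.
volume = np.random.uniform(-1000, 3000, size=(64, 64, 64))  # stand-in CBCT data
bone_only = hu_bandpass(volume, *HU_RANGES["bone"])
net_input = normalize(bone_only, *HU_RANGES["bone"])
```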
During pre-training, following the idea of small sampling with high precision, reasonable and effective images are selected from the original data samples and scaled up, scaled down, rotated, translated or cropped to achieve data enhancement.
Data Augmentation can improve algorithm performance and satisfy the demand of deep learning models for large amounts of data. The quality of an image segmentation model depends on many factors, such as the type and quantity of input data, the data enhancement method, the characteristics and categories of the data to be identified, the image annotation method, the filtered region range during preprocessing, the gray-value normalization method, the patch size of the training data, the choice of optimization function, and the stopping condition for training. Preliminary experiments in this embodiment show that, while more input data is generally better, reasonable and effective data selection is very important.
To expand the quantity and diversity of images, a generative adversarial network (GAN) can be designed, and adversarial training examples produced by targeted transformations can be used as input to increase the training data and expose the model to varied inputs. Data enhancement enables data expansion of small-sample data: operations such as scaling up, scaling down, rotation, translation, gray-scale adjustment, Gaussian noise addition and cropping create different training data while avoiding over-fitting. Alternatively, to prevent over-fitting, L1 regularization, L2 regularization, dropout, DropConnect, early stopping, etc. may also be used, as sketched below.
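As a concrete illustration, here is a minimal sketch of such augmentation on a CBCT patch using SciPy; the parameter ranges (rotation angle, shift, zoom factor, noise level, crop size) are illustrative assumptions rather than values prescribed by the embodiment.

```python
import numpy as np
from scipy import ndimage

def augment(volume: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly rotate, translate, scale, noise and crop one CBCT patch;
    all parameter ranges below are illustrative assumptions."""
    out = ndimage.rotate(volume, angle=rng.uniform(-10, 10),
                         axes=(0, 1), reshape=False, order=1)   # rotation
    out = ndimage.shift(out, shift=rng.uniform(-4, 4, size=3),
                        order=1)                                 # translation
    out = ndimage.zoom(out, rng.uniform(0.9, 1.1), order=1)      # scale up/down
    out = out + rng.normal(0.0, 0.01, size=out.shape)            # Gaussian noise
    s = (np.array(out.shape) - 48) // 2                          # center crop
    return out[s[0]:s[0] + 48, s[1]:s[1] + 48, s[2]:s[2] + 48]

rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 1.0, size=(64, 64, 64))  # stand-in normalized patch
augmented = augment(patch, rng)                   # one new training sample
```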
With regard to the scheme of image registration, first, registration of feature-based three-dimensional volumetric layer data may be performed.
In this registration scheme, suppose two images $I_F(x)$ and $I_M(x)$ are given, where $x$ is a point in the image domains $\Omega_F$ and $\Omega_M$. The aim of the image registration algorithm is to find a transformation $T: \Omega_F \to \Omega_M$ that maximizes the similarity of the two images after one of them is transformed. The similarity is a function of the transformation, computed from the two images to measure their resemblance; for example, it can be the sum of squared errors (SSD) of the image gray values, computed as in formula (1):

$$E_{\mathrm{SSD}}(T) = \sum_{x \in \Omega_F} \big( I_F(x) - I_M(T(x)) \big)^2 \qquad (1)$$
and finally, finding the optimal solution of the function through a mathematical optimization algorithm, namely realizing rigid body transformation or affine transformation of the space.
If image F is to be registered to image M, then in addition to spatial translation and rotation, scaling (zooming in or out) and tilting of the image are required; this is an affine transformation, a form of linear transformation. When actually processing oral DICOM data, the radiology system software exports data at different voxel sizes, such as 200 × 200 or 300 × 300 (unit: μm); to ensure the data dimensions are not distorted, an affine transformation can be used to restore the data to its true size.
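A minimal sketch of formula (1) in code follows, assuming a translation-plus-isotropic-scale transform and a generic Powell optimizer; this parameterization and optimizer choice are illustrative assumptions, not the embodiment's prescribed configuration.

```python
import numpy as np
from scipy import ndimage, optimize

def ssd(params, fixed, moving):
    """Formula (1): sum of squared gray-value differences between the fixed
    image and the moving image resampled under a translation + isotropic
    scale (a simple affine transform)."""
    tx, ty, tz, s = params
    warped = ndimage.affine_transform(moving, np.eye(3) / s,
                                      offset=(tx, ty, tz), order=1)
    return float(np.sum((fixed - warped) ** 2))

def register_ssd(fixed, moving):
    """Minimize SSD over translation and scale with a generic optimizer."""
    res = optimize.minimize(ssd, x0=[0.0, 0.0, 0.0, 1.0],
                            args=(fixed, moving), method="Powell")
    return res.x

# Usage on a smooth synthetic volume: the optimizer should approximately
# recover the known displacement (boundary effects cause small errors).
fixed = ndimage.gaussian_filter(np.random.rand(32, 32, 32), 3)
moving = ndimage.shift(fixed, (2.0, -1.5, 0.5), order=1)
best = register_ssd(fixed, moving)  # offset ≈ (2.0, -1.5, 0.5), scale ≈ 1
```

Restoring the true voxel size mentioned above is the same isotropic-scale affine transform, applied with a known factor instead of an optimized one.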
Secondly, this embodiment may also perform registration between images based on voxel characteristics of anatomical regions.
For example, registration of features based on anatomical regions can be achieved by mutual information or least squares. Mutual Information (MI) describes the amount of information shared by corresponding parts of two subsystems of the same system: for two random variables, MI is the amount of information (usually in bits) obtained about one random variable by observing the other. Since registration of medical image data can use a mutual information method with the statistical characteristics of the gray-level information as reference, a multi-layer information input approach is considered: the initial state is pre-registered manually and the registration is then computed automatically by the machine. Manual initialization prevents instability in the output caused by errors in the spatial three-dimensional coordinates. In oral medicine the temporomandibular joint can move, so images taken at different moments capture the joints in different occlusion states (e.g. open and closed mouth, facial expressions, etc.). Consequently, in oral-specific tasks, the tissues in the mouth cannot be registered globally, and the clinician's empirical knowledge is used to fix the registration region during computation. After the fixed region is determined, the mutual information of the corresponding regions is calculated as in formula (2):

$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y)\, \log \frac{p(x,y)}{p(x)\, p(y)} \qquad (2)$$
where p(x, y) is the joint probability mass function of the two images X and Y, and p(x) and p(y) are the marginal probability mass functions of X and Y, respectively. If the chosen region is too large, convergence is difficult; if it is too small, robustness to selection error cannot be guaranteed. Taking intelligent tooth extraction as an example, image registration before and after extraction is realized with the mutual information method. The registration principle is shown in fig. 2: a joint matrix of the two registered images is obtained by the maximum mutual information method (in fig. 2, A is the CBCT image after wisdom tooth extraction and B is the CBCT image before extraction), C is an overlay of the registered result, the Z axis of D represents the voxel values of images A and B, and the result in D shows that images A and B match best in the peak region.
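A minimal sketch of formula (2) follows, assuming the fixed registration regions of the two images are supplied as voxel arrays and that gray values are discretized into histogram bins; the bin count is an illustrative assumption.

```python
import numpy as np

def mutual_information(x: np.ndarray, y: np.ndarray, bins: int = 32) -> float:
    """Formula (2): I(X;Y) estimated from a joint gray-value histogram of
    the fixed registration region in two images."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()            # joint probability p(x, y)
    px = pxy.sum(axis=1, keepdims=True)  # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)  # marginal p(y)
    nz = pxy > 0                         # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

Registration then searches over transforms of the moving image for the one that maximizes this quantity within the fixed region.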
Registration of different regions can be achieved by the two methods above. In addition, image feature-based registration (Feature-Based Registration) can be adopted, using SIFT, SURF, Harris or FAST features, etc.; intensity-based registration (Intensity-Based Registration), such as voxel-entropy-based methods; image annotation-based methods; plane-feature-based methods; manually controlled region-based methods; and so on.
In one or some embodiments, the step of inputting the image to be segmented into a pre-trained neural network segmentation model in S120 to obtain semantic classification of each tissue in the oral cavity specifically includes the following steps:
step 121, processing the image to be segmented by a visual parser to obtain partial features and overall features;
step 122, inputting the partial features and the overall features into a pre-trained neural network segmentation model, wherein the neural network segmentation model is a Transformer model constructed based on a scaled dot-product attention mechanism and/or a multi-head attention mechanism;
step 123, extracting partial features with semantic information from the partial features and overall features output by the previous layer, using the encoder in the Transformer model;
and step 124, fusing the partial features with the semantic information into the overall features by using a decoder in a Transformer model.
At present, deep learning sequence models are mostly Encoder-Decoder models built on recurrent networks (RNN) or convolutional neural networks (CNN). Feeding all the global information of oral CBCT volume data into the network makes the receptive field too large and multiplies the computation; moreover, people normally have 28-32 teeth, and the overall distribution of the teeth shows similar characteristics. For example, there are relatively independent image features between the dentin region and the enamel, between the dentin and the pulp cavity, and at the tooth-bone boundary. In this embodiment it is therefore preferable to decode the multi-level feature information of the Transformer structure based on Attention, improving the model's semantic segmentation capability.
Further, in this embodiment, network generalization is increased through Part-Whole feature extraction.
The human visual system is able to capture Part-Whole information from a scene. For part information, the human visual system can distinguish high-level semantic entities such as teeth, bones, dental pulp and maxillary sinus cavities within the whole picture. For whole information, the human visual system can model the global information of the picture, thereby aiding the understanding of the part information. This embodiment proposes a visual parser that separates visual features into part and whole levels. By explicitly modeling part features and whole features, the semantic modeling capability of the model is improved.
Specifically, two inputs are first determined: part features (local features, such as the texture information of an image) and whole features (global features, such as spatial coordinate distribution features). Denoting the whole and part features of the i-th layer together, the input of the i-th layer is the whole feature of the previous layer, and the interaction between part features and whole features is itself an attention computation.
In addition, the main body of the Transformer structure is an Attention model; in a preferred embodiment, the Transformer structure may be constructed by stacking Scaled Dot-Product Attention (SDPA) in a certain order. In general, SDPA is computed as in formula (3):

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V \qquad (3)$$

The Query vector (Q) represents the current input information, while the Key vector (K) and Value vector (V) represent the information matrix the model is about to focus on in the current state; the keys and values are generated by different linear mappings from a feature map obtained from a CNN structure. A calculation schematic is shown in the upper part of fig. 3. $d_k$ denotes the vector dimension. The Q and K vectors are multiplied and normalized to give the distribution of attention over the current input, i.e. the attention weights, which are finally multiplied by the information of the corresponding region to represent the region to be attended to.
To let the model learn more information from different feature subspaces and enhance the generalization capability of the network, a Multi-Head Attention (MHA) mechanism can be incorporated. MHA first applies separate linear mappings, then computes attention over each mapping, and concatenates and weights the different results as the final output, enabling feature extraction under small-sample training. A sketch of both mechanisms follows.
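The following sketch implements formula (3) together with a standard multi-head attention module in PyTorch; the module layout (four linear maps and a per-head split) is the conventional construction and is an assumption, not the patent's exact network.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Formula (3): softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    weights = F.softmax(q @ k.transpose(-2, -1) / math.sqrt(d_k), dim=-1)
    return weights @ v

class MultiHeadAttention(nn.Module):
    """MHA: separate linear maps per head, SDPA in parallel, then
    concatenation and a final linear map."""
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.d_k = num_heads, d_model // num_heads
        self.wq = nn.Linear(d_model, d_model)
        self.wk = nn.Linear(d_model, d_model)
        self.wv = nn.Linear(d_model, d_model)
        self.wo = nn.Linear(d_model, d_model)

    def forward(self, q, k, v):
        b = q.size(0)
        def split(x, w):  # (b, n, d_model) -> (b, h, n, d_k)
            return w(x).view(b, -1, self.h, self.d_k).transpose(1, 2)
        out = scaled_dot_product_attention(
            split(q, self.wq), split(k, self.wk), split(v, self.wv))
        out = out.transpose(1, 2).contiguous().view(b, -1, self.h * self.d_k)
        return self.wo(out)

mha = MultiHeadAttention(d_model=64, num_heads=8)
x = torch.randn(2, 16, 64)   # (batch, tokens, d_model) feature-map tokens
y = mha(x, x, x)             # self-attention over the token sequence
```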
The structure of the Transformer model constructed in this example is shown in fig. 4. The Transformer is a sequence-to-sequence (Seq2Seq) model composed of an Attention-based Encoder-Decoder. The Encoder refines and compresses the input to extract feature information at different levels, and the Decoder decodes the multi-level feature information using the attention structure. In this network, the Encoder extracts part features carrying high-level semantic information from the part and whole features of the previous layer; once the Encoder finishes, the part features can be considered to contain the most essential information describing the visual input. In the Decoder, the encoded part features are propagated back into the whole features, so that each pixel on the feature map can exchange information over a wider range. This visual feature analysis method can cope with the Long-Distance Dependency (LDD) problem; a sketch follows.
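Here is a minimal sketch of one such part-whole layer under the reading above: learned part tokens attend to the whole feature map in the encoder step, and the decoder propagates the encoded parts back to every position. The learned part tokens and the residual connection are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PartWholeBlock(nn.Module):
    """One encoder/decoder layer in the part-whole style sketched above."""
    def __init__(self, d_model: int, num_heads: int, num_parts: int):
        super().__init__()
        self.parts = nn.Parameter(torch.randn(1, num_parts, d_model))
        self.encoder = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.decoder = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, whole):  # whole: (batch, n_pixels, d_model)
        parts = self.parts.expand(whole.size(0), -1, -1)
        # Encoder: part tokens gather high-level semantics from the whole.
        parts, _ = self.encoder(parts, whole, whole)
        # Decoder: encoded parts are broadcast back to every whole position.
        update, _ = self.decoder(whole, parts, parts)
        return whole + update, parts  # residual whole feature, refined parts

block = PartWholeBlock(d_model=64, num_heads=8, num_parts=16)
whole = torch.randn(2, 1024, 64)        # e.g. a flattened 32x32 feature map
whole, parts = block(whole)
```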
In addition, this structure can be combined with a U-net structure, using the Transformer structure to replace the traditional convolution method; the global information-capturing capability of the Transformer is analyzed by comparison with U-net-series methods.
In the experimental stage, the embodiment of the invention realizes pixel-level segmentation through a Two-Stage approach; compared with a Mask R-CNN method using ResNet-series structures, the superiority of the Transformer model structure is demonstrated.
In one embodiment, as shown in fig. 5, the neural network segmentation model is a segmentation model formed by fusing a Transformer model and a U-net model. Preferably, the Transformer model is placed before the U-net model: images are first processed by the Transformer model and then computed by the U-net model; the two models can also be fused in a crossed-serial manner, as sketched below.
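A minimal sketch of the serial fusion follows, assuming a token-based Transformer stage followed by a volumetric U-net backbone; both backbones passed in here are placeholders, not the embodiment's exact networks.

```python
import torch
import torch.nn as nn

class TransformerUNet(nn.Module):
    """Serial fusion: a Transformer stage enriches the input with global
    context before a U-net computes the dense segmentation."""
    def __init__(self, transformer: nn.Module, unet: nn.Module):
        super().__init__()
        self.transformer = transformer  # maps (b, n, c) -> (b, n, c)
        self.unet = unet                # maps (b, c, d, h, w) -> logits

    def forward(self, x):               # x: (b, c, d, h, w) CBCT patch
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)     # voxels as (b, n, c) tokens
        tokens = self.transformer(tokens)         # global attention stage
        x = tokens.transpose(1, 2).reshape(b, c, d, h, w)
        return self.unet(x)                       # local convolutional stage

# e.g. a stock encoder stack as the Transformer stage (U-net left abstract):
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=8, nhead=4, batch_first=True),
    num_layers=2)
model = TransformerUNet(encoder, unet=nn.Identity())  # nn.Identity stand-in
out = model(torch.randn(1, 8, 16, 16, 16))
```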
In one embodiment, before comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue, the method further includes:
dividing typical clinical oral cavity images into regions according to different anatomical positions, or dividing morphological features according to different functions;
and storing the spatial information, morphological information, texture information, pathological information, feature quantification information and/or tissue-to-surrounding-tissue correlation information of the corresponding tissues into an anatomical knowledge base.
In addition, the anatomical knowledge base also includes information such as clinical diagnosis models and diagnosis protocols.
Preferably, comparing and analyzing the semantic classification with the anatomical knowledge base in step 130, the obtained instance segmentation result or disease feature quantification result for each tissue specifically includes at least one of the following:
analyzing the spatial position, morphological characteristics and texture characteristics of each tissue in the instance segmentation result or the disease feature quantification result, and analyzing the corresponding diagnosis result or clinical significance;
evaluating the obtained disease feature quantification result to obtain disease state information;
and verifying the anatomical knowledge base by using the instance segmentation result or the disease feature quantification result.
Specifically, the medical data are divided into regions according to different anatomical positions, and morphological features can be divided according to different functions. A tissue region division corresponds to the spatial or local coordinate system of a certain tissue in the image. During data analysis, the spatial anatomical positions of different tissues in the CBCT image need to be identified, such as teeth at different tooth positions, the corresponding pulp cavities, and the surrounding alveolar bone regions; these constitute the spatial coordinate description of the tissues. Since the spatial, morphological and texture information and the tissue-to-surrounding-tissue relations are described differently for different tissues, a tissue feature library can be constructed in the anatomical knowledge base, the image information of a given tissue can be described in a structured language, and the corresponding clinical tissues and disease features can be searched.
Existing instance segmentation and semantic segmentation of CBCT images do not consider the periodontal ligament soft tissue, the intrinsic alveolar bone microstructure, and the like. Addressing this, the present embodiment uses relevant prior knowledge of oral anatomy, including possible distribution positions and distribution characteristics, to help achieve accurate segmentation of oral medical images, introducing a new diagnostic scheme to orthodontics and periodontics. The degree of disease damage (e.g. damage from dental caries, the height and width of periodontal recession, the area and distribution pattern of periapical lesion shadows) can thus be analyzed effectively and intuitively in terms of mathematical and physical models, making the resulting models of oral teeth and bone tissue more accurate.
1) Calculate and analyze the spatial position, morphological characteristics and texture characteristics of tissues in three-dimensional space, and analyze the corresponding diagnosis result or clinical significance. Taking periodontitis as an example, the characteristics of the alveolar bone around the roots of different tooth positions are analyzed; with scientific and reasonable spatial positioning (for example, dividing the tooth space into 6 regions and analyzing bone tissue information at the apical third of the root on the lingual side), the morphological characteristics of the surrounding bone tissue are described in mathematical or physical language. As shown in fig. 6, the height from the two-dimensional cementoenamel junction (CEJ) to the alveolar bone surface is measured parallel to the tooth long axis; a sketch of this measurement follows.
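This is a minimal sketch of the measurement, assuming the CEJ point, the alveolar bone crest point and the tooth long-axis direction have already been extracted from the instance segmentation result; the landmark values in the usage example are hypothetical.

```python
import numpy as np

def cej_to_crest_height(cej: np.ndarray, crest: np.ndarray,
                        tooth_axis: np.ndarray) -> float:
    """Height from the cementoenamel junction (CEJ) to the alveolar bone
    surface, measured parallel to the tooth long axis: the CEJ-to-crest
    vector is projected onto the unit axis direction."""
    axis = tooth_axis / np.linalg.norm(tooth_axis)
    return float(abs(np.dot(crest - cej, axis)))

# Hypothetical landmark coordinates in millimetres:
height = cej_to_crest_height(np.array([1.0, 2.0, 10.0]),
                             np.array([1.2, 2.1, 7.5]),
                             np.array([0.0, 0.0, 1.0]))  # ≈ 2.5 mm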
The mathematical and physical characteristics of the data are interpreted in medical language, and the corresponding diagnosis results and clinical significance for medical diseases are analyzed. Using this method of extracting and quantifying image information features, the location and severity of a disease can be described more precisely, and the interpretable disease model constructed in this way has scientific significance and clinical reference value.
2) Evaluate the obtained disease feature quantification result to obtain disease state information. In particular, the quantified indicators can be evaluated for effectiveness, creating an image information base of the disease within the anatomical knowledge base. The quantified indicators in the image are compared with current expert analyses of clinical images to evaluate their effectiveness. The evaluation can follow this procedure: hypothesize descriptions of the disease at varying degrees of severity; different disease progressions present different image information in the corresponding tissues; a disease model of the image data is preset for each disease state, and the quantification results are analyzed to see whether they conform to the hypothesis. For example, based on existing three-dimensional magnetic resonance tomography data, hippocampal texture features can be analyzed to distinguish normal subjects, mild cognitive impairment and Alzheimer's disease. For bone analysis in oral CBCT, the following parameters can be considered: bone height, bone width, trabecular distribution, cortical bone thickness in the corresponding tissue, cancellous bone density, and the mechanical properties of the corresponding location. Meanwhile, gray-value changes due to bone inflammation can be considered, image information changes in bone cancer can be analyzed, and the distribution relations among the hard tissue, ligaments and soft tissue of bone joints and the spatial relations between bones can be explored.
3) Evaluate the effect of the segmentation result, and verify the information in the anatomical knowledge base against pathological or clinical data. Fine tissue segmentation supports artificial intelligence research into the disease progression rules of tissues, and in particular provides a powerful tool for studying changes in morphology and texture during disease progression.
Different quantitative parameters of a medical image may correspond to different disease states of a tissue or organ, and theoretically more detailed disease definition schemes and diagnostic modes may allow a more comprehensive analysis of the disease cause and progression. Thus, through the methods disclosed in the above embodiments, a set of standardized, organizational partitioning standards systems and methods can be established to validate and service models and assumptions.
In a preferred embodiment, the method further comprises: saving the instance segmentation result or the disease feature quantification result to the anatomical knowledge base, so that the constructed anatomical knowledge base is enriched with new identification and segmentation results.
Fig. 7 is a schematic structural diagram illustrating an embodiment of the device for segmentation and identification of an oral cavity image according to the present invention. As shown in fig. 7, the apparatus 700 includes:
the preprocessing module 710 is used for preprocessing and registering the oral cavity image to obtain an image to be segmented;
the semantic classification module 720 is adapted to input the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classifications of various tissues in the oral cavity;
and the instance segmentation module 730 is suitable for comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result and a disease feature quantification result for each tissue.
In a preferred embodiment, the pre-processing module 710 is further adapted to:
carrying out coordinate and gray value correction or normalization processing;
setting an HU threshold for each tissue in the image, and performing data filtering;
selecting reasonable and effective images from the original data and performing data enhancement, including scaling up, scaling down, rotating, translating or cropping the selected images;
performing feature-based registration of three-dimensional volume layer data;
registration based on voxel characteristics of the anatomical region is performed.
In one embodiment, the semantic classification module 720 is further adapted to:
processing the image to be segmented by a visual parser to obtain partial features and overall features;
inputting the partial features and the overall features into a pre-trained neural network segmentation model, wherein the neural network segmentation model is a Transformer model constructed based on a scaled dot-product attention mechanism and/or a multi-head attention mechanism;
extracting partial features with semantic information from the partial features and overall features output by the previous layer, using the encoder in the Transformer model;
and fusing the partial features with semantic information into the overall features by utilizing a decoder in a Transformer model.
In one embodiment, the neural network segmentation model is a segmentation model formed by fusing a Transformer model and a U-net model.
In one embodiment, the instance segmentation module 730 is further adapted to:
dividing typical clinical oral cavity images into regions according to different anatomical positions, or dividing morphological features according to different functions;
and storing the spatial information, morphological information, texture information, pathological information, feature quantification information and/or tissue-to-surrounding-tissue correlation information of the corresponding tissues into an anatomical knowledge base.
In one embodiment, the instance segmentation module 730 is further adapted to:
analyzing the spatial position, morphological characteristics and texture characteristics of each tissue in the instance segmentation result or the disease feature quantification result, and analyzing the corresponding diagnosis result or clinical significance;
evaluating the obtained disease feature quantification result to obtain disease state information;
and verifying the anatomical knowledge base by using the instance segmentation result or the disease feature quantification result.
Preferably, the apparatus 700 further comprises a database update module adapted to:
saving the instance segmentation result or the disease feature quantification result to the anatomical knowledge base.
In conclusion, the scheme of the invention can obtain the following beneficial effects:
Because medical data come from diverse sources and high-quality data are scarce, the method can reduce the over-fitting problem during training with a small amount of high-quality data, thereby enabling annotation with small-sample data and yielding a general-purpose, universal image segmentation network.
Data annotation requires substantial labor. In practice, more and more accurate annotation is better, but the annotation process must balance machine operability against practical clinical requirements; annotating multiple classes simultaneously within small samples achieves efficient and accurate segmentation of the data. Using semantic segmentation for annotation can effectively provide the three-dimensional characteristics of multiple anatomical tissues of interest and the adjacency relations between tissues.
Based on small-sample data and a weak-annotation method, and by applying artificial intelligence and semantic segmentation together to oral CBCT, the method realizes automatic segmentation and identification of data and can segment multiple tissues of homologous data simultaneously.
The embodiment of the invention provides a nonvolatile computer storage medium, wherein at least one executable instruction is stored in the computer storage medium, and the computer executable instruction can execute the oral cavity image segmentation identification method in any method embodiment.
Fig. 8 is a schematic structural diagram of an embodiment of the electronic device according to the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 8, the electronic device may include: a processor (processor) 802, a communication Interface 804, a memory 806, and a communication bus 808.
Wherein: the processor 802, communication interface 804, and memory 806 communicate with one another via a communication bus 808. A communication interface 804 for communicating with network elements of other devices, such as clients or other servers. The processor 802 is configured to execute the program 810, and may specifically execute the relevant steps in the above-described embodiments of the method for identifying an oral cavity image segmentation for an electronic device.
In particular, the program 810 may include program code comprising computer operating instructions.
The processor 802 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 806 stores a program 810. The memory 806 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The program 810 may be specifically configured to enable the processor 802 to execute operations corresponding to the oral cavity image segmentation recognition method in any of the above-described method embodiments.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the devices in an embodiment may be adaptively changed and arranged in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Moreover, those of skill in the art will appreciate that while some embodiments herein include some features included in other embodiments, not others, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. An oral cavity image segmentation identification method, the method comprising:
preprocessing and registering the oral cavity image to obtain an image to be segmented;
inputting the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classification of each tissue in the oral cavity;
and comparing and analyzing the semantic classification with an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result for each tissue.
2. The method of claim 1, wherein when the oral cavity image is preprocessed and registered, or when the neural network segmentation model is pre-trained, the oral cavity image or the training sample image is processed by any one or more of the following:
carrying out coordinate and gray value correction or normalization processing;
setting an HU threshold for each tissue in the image, and performing data filtering;
selecting reasonable and effective images from the original data and performing data enhancement, including scaling up, scaling down, rotating, translating or cropping the selected images;
performing feature-based registration of three-dimensional volume layer data;
registration based on voxel characteristics of the anatomical region is performed.
3. The method according to claim 1, wherein the step of inputting the image to be segmented into a pre-trained neural network segmentation model to obtain semantic classification of each tissue in the oral cavity specifically comprises the steps of:
processing the image to be segmented by a visual parser to obtain partial features and overall features;
inputting the partial features and the overall features into a pre-trained neural network segmentation model, wherein the neural network segmentation model is a Transformer model constructed based on a scaled dot-product attention mechanism and/or a multi-head attention mechanism;
extracting partial features with semantic information from the partial features and overall features output by the previous layer, using the encoder in the Transformer model;
and fusing the partial features with semantic information into the overall features by utilizing a decoder in a Transformer model.
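Claim 3 names scaled dot-product attention and multi-head attention as the building blocks of the Transformer model. For reference, a minimal PyTorch sketch of those two mechanisms follows; the dimensions are hypothetical, and the code illustrates the textbook mechanism, not the patented model itself.

```python
import math
import torch
import torch.nn as nn

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    return torch.softmax(scores, dim=-1) @ v

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 256, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads, self.d_head = num_heads, d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):                            # x: (batch, tokens, d_model)
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split into heads: (batch, heads, tokens, d_head)
        q, k, v = (t.view(b, n, self.num_heads, self.d_head).transpose(1, 2) for t in (q, k, v))
        attended = scaled_dot_product_attention(q, k, v)
        return self.out(attended.transpose(1, 2).reshape(b, n, -1))
```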
4. The method according to claim 3, wherein the neural network segmentation model is a segmentation model formed by fusing a Transformer model and a U-net model.
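A segmentation model fusing a Transformer with a U-net, as recited in claim 4, is commonly realized by inserting a Transformer encoder at the bottleneck of a convolutional encoder-decoder (as in TransUNet-style designs). The sketch below shows that general pattern under assumed dimensions; skip connections and other U-net details are omitted for brevity, and nothing here should be read as the patent's concrete architecture.

```python
import torch
import torch.nn as nn

class TransUNetSketch(nn.Module):
    """U-net style encoder/decoder with a Transformer bottleneck (illustrative only)."""
    def __init__(self, in_ch=1, num_classes=4, width=64, depth=2, heads=4):
        super().__init__()
        # Convolutional encoder: extracts local features and downsamples 4x.
        self.enc = nn.Sequential(nn.Conv2d(in_ch, width, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU())
        layer = nn.TransformerEncoderLayer(d_model=width, nhead=heads, batch_first=True)
        self.bottleneck = nn.TransformerEncoder(layer, num_layers=depth)
        # Convolutional decoder: upsamples back to input resolution.
        self.dec = nn.Sequential(nn.ConvTranspose2d(width, width, 2, stride=2), nn.ReLU(),
                                 nn.ConvTranspose2d(width, num_classes, 2, stride=2))

    def forward(self, x):
        f = self.enc(x)                          # (b, c, h, w) local CNN features
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)    # (b, h*w, c) tokens for attention
        tokens = self.bottleneck(tokens)         # global context via self-attention
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.dec(f)                       # per-pixel semantic logits

logits = TransUNetSketch()(torch.randn(1, 1, 128, 128))   # -> (1, 4, 128, 128)
```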
5. The method according to any one of claims 1-4, wherein before the semantic classification is compared and analyzed against the anatomical knowledge base to obtain the instance segmentation result or the disease feature quantification result of each tissue, the method further comprises:
dividing typical clinical oral cavity images into regions according to their different anatomical positions, or into morphological features according to their different functions;
and storing, for each corresponding tissue, its spatial information, morphological information, texture information, pathological information, quantitative feature information, and/or its correlation with surrounding tissues in the anatomical knowledge base.
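Claim 5 describes the anatomical knowledge base as holding per-tissue spatial, morphological, texture, pathological, and quantitative-feature information. One minimal way to structure such records is sketched below; the field names and the example mandible entry are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class TissueRecord:
    """One anatomical-knowledge-base entry for a tissue (fields mirror claim 5)."""
    name: str
    spatial_info: dict          # e.g. expected anatomical region, laterality
    morphology: dict            # e.g. expected shape descriptors, volume range
    texture: dict               # e.g. expected HU/intensity statistics
    pathology: dict = field(default_factory=dict)
    neighbors: list = field(default_factory=list)   # correlated surrounding tissues

knowledge_base = {
    "mandible": TissueRecord(
        name="mandible",
        spatial_info={"region": "lower jaw"},
        morphology={"volume_cm3_range": (30.0, 80.0)},   # hypothetical range
        texture={"hu_range": (300, 3000)},
        neighbors=["lower_teeth", "mandibular_canal"],
    ),
}
```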
6. The method according to any one of claims 1-4, wherein comparing and analyzing the semantic classification against the anatomical knowledge base to obtain the instance segmentation result or the disease feature quantification result of each tissue specifically comprises at least one of:
analyzing the spatial position, morphological features and texture features of each tissue in the instance segmentation result or the disease feature quantification result, and deriving the corresponding diagnostic result or clinical significance;
evaluating the obtained disease feature quantification result to obtain disease state information;
and verifying the anatomical knowledge base using the instance segmentation result or the disease feature quantification result.
7. The method of claim 6, further comprising:
saving the instance segmentation result or the disease feature quantification result to the anatomical knowledge base.
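Reading claims 6 and 7 together, the comparison step can be pictured as: measure each semantically labeled tissue, check the measurement against its knowledge-base record, and save the result back into the base. The sketch below (reusing the hypothetical TissueRecord and knowledge_base from the claim 5 sketch) quantifies the volume of one label and flags out-of-range values; the label ID, voxel spacing, and expected range are all assumptions.

```python
import numpy as np

def quantify_and_compare(label_map: np.ndarray, label_id: int,
                         voxel_mm3: float, record) -> dict:
    """Measure one tissue from a semantic label map and compare it to its record."""
    volume_cm3 = (label_map == label_id).sum() * voxel_mm3 / 1000.0
    lo, hi = record.morphology["volume_cm3_range"]
    result = {
        "tissue": record.name,
        "volume_cm3": volume_cm3,
        "within_expected_range": lo <= volume_cm3 <= hi,  # crude disease-state hint
    }
    record.pathology.setdefault("measurements", []).append(result)  # save back (claim 7)
    return result

labels = np.zeros((64, 128, 128), dtype=np.int32)
labels[20:40, 40:90, 40:90] = 3   # pretend label 3 is the mandible
print(quantify_and_compare(labels, 3, voxel_mm3=0.3 ** 3, record=knowledge_base["mandible"]))
```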
8. An oral cavity image segmentation identification device, the device comprising:
a preprocessing module adapted to preprocess and register the oral cavity image to obtain an image to be segmented;
a semantic classification module adapted to input the image to be segmented into a pre-trained neural network segmentation model to obtain the semantic classification of each tissue in the oral cavity;
and an instance segmentation module adapted to compare and analyze the semantic classification against an anatomical knowledge base to obtain an instance segmentation result or a disease feature quantification result of each tissue.
9. An electronic device, comprising: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another via the communication bus;
the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the oral cavity image segmentation identification method according to any one of claims 1-7.
10. A computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform the operations corresponding to the oral cavity image segmentation identification method according to any one of claims 1-7.
CN202211390075.6A 2022-11-08 2022-11-08 Oral cavity image segmentation identification method and device, electronic equipment and storage medium Pending CN115761226A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211390075.6A CN115761226A (en) 2022-11-08 2022-11-08 Oral cavity image segmentation identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211390075.6A CN115761226A (en) 2022-11-08 2022-11-08 Oral cavity image segmentation identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115761226A true CN115761226A (en) 2023-03-07

Family

ID=85357420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211390075.6A Pending CN115761226A (en) 2022-11-08 2022-11-08 Oral cavity image segmentation identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115761226A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116309751A (en) * 2023-03-15 2023-06-23 北京医准智能科技有限公司 Image processing method, device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination