CN112634293A - Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network

Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network

Info

Publication number
CN112634293A
CN112634293A (application CN202110045206.6A)
Authority
CN
China
Prior art keywords: segmentation, temporal bone, convolution, bone, stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110045206.6A
Other languages
Chinese (zh)
Inventor
李晓光
伏鹏
朱梓垚
卓力
张辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202110045206.6A
Publication of CN112634293A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/11 - Region-based segmentation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10072 - Tomographic images
    • G06T 2207/10081 - Computed x-ray tomography [CT]
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20092 - Interactive image processing based on input by user
    • G06T 2207/20104 - Interactive definition of region of interest [ROI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

An automatic segmentation method for the inner ear bone cavity structures of the temporal bone, based on a coarse-to-fine dense encoding-decoding network, belonging to the field of medical imaging. The invention adopts a coarse-to-fine framework: it first coarsely segments the anatomical structures to be segmented in the temporal bone region and computes the coordinates of their center point. Around this center point, the image is expanded outward to a region that fully contains the inner ear bone cavity structures while retaining part of the background information, and this sub-region is used for further fine segmentation. In the fine segmentation stage, dense connection modules are introduced into the encoding process to extract richer features, and hole (dilated) convolutions are added inside the dense connection modules so that the segmentation algorithm obtains a larger receptive field over the target to be segmented and extracts more complete surrounding features and spatial information. In the decoding stage, the features extracted during encoding are upsampled by transposed convolutions, and a dense connection module is introduced after each transposed convolution to strengthen the reuse of decoded information. The invention achieves more accurate segmentation.

Description

Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network
Technical Field
The invention belongs to the field of medical image processing, and particularly relates to an automatic segmentation method for the inner ear bone cavity structures of the temporal bone based on a coarse-to-fine dense encoding-decoding network.
Background
Temporal bone CT is an important reference for physicians examining ear diseases. The temporal bone region is divided into three parts, the outer ear, the middle ear, and the inner ear, containing some 30 tiny anatomical structures in total. The inner ear region is one of the most important parts of the temporal bone, helping the human body hear sound and maintain balance. It mainly comprises the cochlea, the vestibule, and the lateral, posterior, and anterior semicircular canals; these structures are formed mainly of interconnected bone cavities, and each plays a different role in ensuring human hearing and balance. The cochlea acts together with the outer and middle ear: external sound waves drive the fluid within the cochlea to flow, bending the hair cells inside it, so that the motion signal of the sound wave is converted into an electrical signal and sent to the brain through the auditory nerve. The vestibule and the three semicircular canals are essential to maintaining balance: the vestibule, the junction between the cochlea and the semicircular canals, contains fluid and hair cells and senses body motion through fluid flow; the three semicircular canals sit at right angles to one another, and when the body moves, the fluid inside them flows, stimulating the internal hair cells and helping people sense the direction of motion. The inner ear region is therefore an important reference for symptoms such as hearing loss and dizziness. In recent years, with the development of medical imaging technology, temporal bone CT image data has grown rapidly, but for lack of automatic analysis tools, large amounts of this data are difficult to put to effective use in related analysis and research.
Medical image segmentation is a complex and key basic step in automatic medical image processing and analysis, medical research, and clinical diagnosis. Its aim is to segment the parts of a medical image with particular meaning and to provide a reliable reference for tasks such as clinical diagnosis, surgical planning, and clinical teaching. Accurate segmentation not only reduces physicians' workload but also helps them further understand the characteristics of anatomical structures and perform physiological analysis, such as measuring the cochlear turn count, the vestibule size, and the included angles of semicircular canal curvature.
Small-object segmentation of anatomical structures has long been a challenging task in medical image segmentation. In the temporal bone, the structures to be segmented account for less than 1% of the whole CT volume, and owing to the particularities of temporal bone anatomy there is usually no clear boundary between a structure and its surroundings, which makes segmentation of the temporal bone anatomy difficult.
Classic medical image segmentation algorithms include thresholding and region growing, but because the boundaries between anatomical structures in temporal bone CT are indistinct and the structures are fine and small in volume, such methods struggle to produce accurate results. More efficient and accurate segmentation algorithms are therefore urgently needed.
In recent years, with the development of deep learning, many medical image segmentation algorithms based on convolutional neural networks have emerged. To fully capture both in-slice and inter-slice information, these algorithms usually adopt a three-dimensional neural network with a U-shaped encoding-decoding structure. Because of computational resource limits, they typically slide a window over the complete CT volume and segment CT blocks one by one to predict voxel classes. Since the temporal bone targets to be segmented are tiny and easily disturbed by the complex background, both segmentation speed and segmentation accuracy remain unsatisfactory.
The invention designs a temporal bone CT image segmentation algorithm built on a coarse-to-fine densely connected encoding-decoding network for automatically segmenting the inner ear bone cavity structures in the temporal bone region. First, within the coarse-to-fine framework, an efficient lightweight segmentation algorithm coarsely segments the anatomical structures to be segmented in the temporal bone region, and the center point coordinates are computed from the set of foreground points predicted by the coarse result. Then, around this center point, the image is expanded outward to a region that fully contains the inner ear bone cavity structures while retaining part of the background information; this sub-region is used for further fine segmentation. In the fine segmentation stage, dense connection modules are introduced into the encoding process to extract richer features, and hole convolutions are added inside them so that the segmentation algorithm obtains a larger receptive field over the target and extracts more complete surrounding features and spatial information. In the decoding stage, the features extracted during encoding are upsampled by transposed convolutions, and a dense connection module is introduced after each transposed convolution to strengthen the reuse of decoded information. In the remaining parts, the fine segmentation network continues to use the 3D deep supervision mechanism and the 3D multi-pooling feature fusion strategy of patent CN110544264A (published December 6, 2019) to guide the training of the segmentation algorithm.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing methods for small-target segmentation in medical images. Segmenting small anatomical structures has long been challenging, and the temporal bone is no exception: in a typical temporal bone CT sequence, the voxels of the target anatomy account for less than 1% of the complete volume, so the large, complex background can negatively affect the segmentation of the target. Moreover, owing to the particularities of temporal bone CT, the boundaries between anatomical structures are not clearly delineated, which further complicates their segmentation. To address these problems, a coarse-to-fine framework for automatic segmentation of the key anatomical structures of the temporal bone is designed. The fine segmentation algorithm innovates on the small-target segmentation method based on a 3D deep supervision mechanism proposed in patent CN110544264A (published December 6, 2019): by introducing dense connection blocks and hole convolution, it obtains a larger receptive field over the target and richer features, segmenting the temporal bone anatomy automatically and more accurately.
The invention is realized by adopting the following technical means:
A method for automatic segmentation of the temporal bone inner ear bone cavity structures based on a coarse-to-fine dense network. The method is divided into 3 stages overall: coarse localization of the inner ear bone cavity structures based on coarse segmentation, candidate region extraction for the inner ear bone cavity structures, and fine segmentation of the inner ear bone cavity structures. The overall flow is shown in Figure 1 of the specification.
The method specifically comprises the following steps:
1) Coarse localization stage of the temporal bone inner ear bone cavity structures based on coarse segmentation:
The first step: the temporal bone inner ear bone cavity structures are coarsely segmented using a conventional medical image segmentation method with a lightweight network structure and high segmentation speed. In the coarse segmentation model training stage, 48 × 48 × 48 cubes are randomly extracted at the same position from a complete temporal bone CT volume and its annotation file (the actual physical extent of a CT image in the anterior-posterior and left-right directions is the number of pixels times the pixel spacing, and in the superior-inferior direction the number of slices times the slice spacing). A cube whose actual physical extent exceeds 24 mm can fully contain the inner ear bone cavity structures; since the pixel spacing and slice spacing of the data used in this method are both 0.5 mm, a 48 × 48 × 48 voxel cube corresponding to a 24 × 24 × 24 mm physical cube is extracted. If the cube contains labels of the target structure, the HU values of the extracted temporal bone CT cube are truncated: HU values below T_min are set to T_min and HU values above T_max are set to T_max, where T_min and T_max range between the lowest and the highest HU value. Generally, in temporal bone CT the HU value of air is -1024 and that of bone exceeds 300, so truncating the data at a minimum of -1024 and a maximum of 2000 or above fully preserves the temporal bone region for further temporal bone analysis (this method uses 2347 as the upper truncation limit) and reduces the influence of the irrelevant background on target structure segmentation. The truncated HU values are normalized to a data distribution with mean 0 and variance 1 and fed into the coarse segmentation network for training; otherwise, a cube of the same size is re-extracted until one containing labels of the anatomical structures is obtained. A minimal sketch of the truncation and normalization follows.
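The HU truncation and normalization step can be sketched as follows, assuming numpy volumes; the function and variable names are illustrative, not from the patent:

```python
import numpy as np

def truncate_and_normalize(ct_cube, t_min=-1024, t_max=2347):
    """Clip HU values to [t_min, t_max], then normalize to mean 0, variance 1."""
    clipped = np.clip(ct_cube, t_min, t_max).astype(np.float32)
    # z-score normalization before the cube is fed to the coarse network
    return (clipped - clipped.mean()) / (clipped.std() + 1e-8)
```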
The second step: the complete temporal bone CT volume is partitioned into groups of blocks using an overlapping sliding window, the blocks are fed in order into the trained coarse segmentation network for segmentation, and the segmentation results are stitched back in input order into a volume consistent with the full temporal bone CT size. In the coarse segmentation test, to balance coarse localization time against segmentation accuracy, an overlap rate of 2 is chosen, and the input complete temporal bone CT volume is divided into groups of overlapping blocks that are fed to the segmentation network for prediction, as in the sketch below.
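A sketch of the overlapped sliding-window tiling, under the assumption that an overlap rate of 2 means a stride of half the window size (names are illustrative):

```python
import itertools

def sliding_window_starts(vol_shape, window=48, overlap_rate=2):
    """Start corners of overlapping windows covering a volume.

    Assumes every dimension of vol_shape is at least `window`;
    overlap_rate=2 is read as stride = window // 2 = 24 voxels.
    """
    stride = window // overlap_rate
    axes = []
    for dim in vol_shape:
        starts = list(range(0, dim - window + 1, stride))
        if starts[-1] != dim - window:  # make sure the trailing border is covered
            starts.append(dim - window)
        axes.append(starts)
    return itertools.product(*axes)
```

Each block is segmented in turn and the predictions are written back (averaging the overlaps) to rebuild a mask of the full CT size.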
The third step: and removing outliers generated in the rough segmentation by adopting an absolute median difference method according to the structural voxel level label. Let the absolute median difference be calculated as shown in equation (1):
MAD=median(|Xi-median(X)|) (1)
where X is the set of all points predicted to be foreground points, XiThe epsilon X is the ith point in the point set X, the mean (-) is the median calculated for the point set, and the absolute median algorithm process is as follows:
(1) calculating median (X) of all predicted foreground point coordinates;
(2) calculating the absolute deviation value abs (X) of each predicted foreground point and the mediani-median(X));
(3) Calculating Median Absolute Deviation (MAD) of Absolute Deviation values in (2);
(4) and (3) dividing the value in the step (2) by the value in the step (3) to obtain a group of distances Dis of all the predicted foreground points from the center based on the MAD. The calculation formula is as shown in formula (2):
Figure BDA0002897096800000041
(5) and removing the point with the maximum Dis value larger than the threshold Th in the dimensions of x, y and z as an abnormal point. Th is the screening threshold value of the ratio Dis between the absolute deviation value and the median of the absolute deviation value. The selection can be performed according to the proportional relation between the real foreground point and the outlier. When the ratio of the current scenic spot to the outlier is small, the Th selection is large, which indicates that the position difference of the foreground scenic spot is more robust, more foreground points can be reserved, but the outlier cannot be completely removed; when the ratio of the current scenic spots to the outliers is large, Th selection is small, which means that the condition for screening the foreground points is stricter, and the correct foreground points can be deleted while the outliers are removed and most of the foreground points are reserved. The threshold value of the conventional MAD algorithm is between 1 and 10, when 5 inner ear bone cavity structures are removed and coarse positioning is carried out, outliers can interfere with the center coordinates of a target area, when the outliers are removed, the number of foreground points is large, therefore, the selection of Th values is small, 3.5 is adopted as a screening threshold value in the method, and the outliers far away from the target area can be removed.
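A minimal sketch of the MAD-based outlier removal on the predicted foreground coordinates, assuming numpy (the function name is illustrative):

```python
import numpy as np

def remove_outliers_mad(points, th=3.5):
    """Filter predicted foreground points by their MAD-based distance.

    points: (N, 3) array of (x, y, z) voxel coordinates predicted as foreground.
    A point is dropped if its distance exceeds th in any of x, y, z.
    """
    med = np.median(points, axis=0)            # step (1): per-axis median
    abs_dev = np.abs(points - med)             # step (2): absolute deviations
    mad = np.median(abs_dev, axis=0)           # step (3): median absolute deviation
    dis = abs_dev / (mad + 1e-8)               # step (4): MAD-based distance Dis
    keep = (dis <= th).all(axis=1)             # step (5): outliers exceed th
    return points[keep]
```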
2) Temporal bone key anatomical structure candidate region extraction stage:
First, the extraction size of the candidate region of the inner ear bone cavity structures is determined statistically. Based on the statistical voxel-level annotation data, the maxima and minima of all annotated voxel points of the structures in the x, y, and z dimensions are counted, and the voxel coordinate span of the anatomy is estimated from the differences between the maxima and minima. Because pixel spacing and slice spacing may differ between CT scans, extracting voxel regions of identical voxel size does not guarantee temporal bone regions of identical physical size; therefore, the actual physical bounding-box size of the bone cavity structures is computed for each CT from its pixel spacing and slice spacing. The maximum actual physical size over the structures is extended outward to 24 mm × 24 mm × 24 mm; taking a temporal bone CT with 0.5 mm slice and pixel spacing as an example, a 48 × 48 × 48 voxel cube is taken, which fully encloses the target segmentation structure while matching the input of the segmentation algorithm, and serves as the candidate region extraction size.
Second, the region of interest is extracted by combining the coarse localization center point of the anatomy to be segmented with the prior bounding-box size of that anatomy. The region of interest is described by a region center point and a three-dimensional size: centered on the center point of the key temporal bone structures predicted in the first stage, cube data is extracted by extending outward, with the three-dimensional size of the cube computed from the bounding-box statistics of the first step. The extracted sub-region serves as the candidate region for further fine segmentation, and its position is recorded so the result can be placed back later; a cropping sketch follows.
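A sketch of the candidate-region cropping, assuming resampled volumes with isotropic 0.5 mm spacing so that the 24 mm box is 48 voxels per side (names are illustrative):

```python
import numpy as np

def extract_candidate_region(ct_volume, center, size=48):
    """Crop a size^3 sub-volume centered on the coarse-localization point.

    center: (x, y, z) voxel coordinates from the coarse stage. Returns the
    crop and its start corner so the fine result can be pasted back later.
    Assumes every dimension of ct_volume is at least `size`.
    """
    half = size // 2
    start = np.array(center, dtype=int) - half
    # keep the box inside the volume while preserving its size
    start = np.clip(start, 0, np.array(ct_volume.shape) - size)
    x, y, z = start
    crop = ct_volume[x:x + size, y:y + size, z:z + size]
    return crop, start
```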
3) Temporal bone inner ear bone cavity structure fine segmentation stage:
The fine segmentation stage comprises two processes, encoding and decoding; the overall network architecture is shown in Figure 2 of the specification.
a) Encoding stage
The first step is as follows: data truncation and normalization. And sending the 48 x 48 voxel length sub-area of the temporal bone inner ear bone cavity structure candidate area extracted in the stage 2) into a precise segmentation algorithm. In order to reduce the influence of a complex irrelevant background on a segmented target, data truncation is carried out on the sub-region CT value of the CT image data according to the HU value distribution range of the temporal bone CT, and for the temporal bone CT, the sub-region CT value can be smaller than TminIs truncated to TminIs greater than TmaxIs truncated to Tmax,TminAnd TmaxThe selection is consistent with the coarse positioning value of the bone cavity structure of the inner ear of the temporales based on coarse segmentation in the first stage, and the coarse positioning value is normalized into data distribution with the mean value of 0 and the variance of 1.
The second step is that: and extracting features by adopting a dense connection network with cavity convolution. And (3) feeding the 48 x 48 complete sub-area containing the bone cavity structure of the inner ear of the temporal bone to be segmented after the data truncation and normalization in the first step into a network. 3 groups of dense connection modules are designed in the encoding stage, the modules can enhance the information transmission in the network, reduce gradient disappearance or gradient explosion, and simultaneously can repeatedly utilize extracted features to obtain rich semantic information. The module consists of three parts, namely batch normalization-modified linear unit-convolution layer, splicing and bottleneck layer, as shown in the description attached figure 3, wherein all convolution sizes of the densely connected module are 3 multiplied by 3. The batch normalization-correction linear unit-convolution layer in the dense connecting block consists of three operations of batch normalization, correction linear unit and convolution layer; splicing is to cascade the characteristic diagrams at the channel level; the bottleneck layer is used for reducing the number of characteristic graphs output by the dense connection block. In the third group of densely connected modules, a hole convolution module is adopted, and the schematic diagram of the module is shown in the description attached to fig. 4, so that the convolution output can increase the receptive field, namely the spatial information containing a large range around the anatomical structure. This helps to extract the spatial information of the tiny critical anatomical structures of the temporal bone and their surroundings in the three-dimensional data.
The third step: pooling features are fused. Following the multi-pooling feature fusion strategy of the patent (publication No. 110544264a, published: 2019, 12/6/10), batch normalization-modified activated cell-convolutional layer was used after the dense-connected module output at each level, after which Dropout layer was typically used to prevent overfitting, Dropout rate was set to 0.5, after which both 3D max pooling and 3D average pooling were used and the results after pooling were put together for a splice. The 3D max pooling may preserve edge features of the volumetric data and the 3D average pooling may preserve background information of the volumetric data. The splicing of the two can provide rich characteristic information for subsequent segmentation.
b) Decoding stage
The first step is as follows: the feature upsampling is performed using a transposed convolution and dense join module. In the decoding process, the transposition convolution with the dense connection module is adopted to decode the semantic information. And (3) carrying out up-sampling on the tensor feature data with the size of 12 × 12 × 12 in the last layer in the encoding stage by adopting twice transposition convolution, and restoring the tensor feature data to the original input size of 48 × 48 × 48. Different from the conventional method which adopts the common convolution to extract the features after the transposition convolution, after the two times of transposition convolution in the decoding stage, the dense connecting blocks replace the common convolution layer, so that the features after the transposition convolution can be more efficiently utilized, and the voxel type can be better predicted.
The second step: 3D deep supervision mechanism. Following the 3D deep supervision mechanism of patent CN110544264A (published December 6, 2019), in the encoding stage the features output by the first densely connected network block are extracted with 64 convolution kernels, passed through a 1 × 1 × 1 convolution, and followed by a softmax layer to output an auxiliary segmentation result. The second layer of the decoding stage applies convolutions to the concatenated features to extract them further; the obtained features are first upsampled by a transposed convolution to raise the resolution, then passed through a 1 × 1 × 1 convolution kernel and a softmax layer to obtain a second auxiliary segmentation result. The last layer of the decoding stage applies convolutions with different kernels to the concatenated features and outputs the prediction of the backbone network; this prediction, together with the predictions of the branch networks, guides the training of the network. During training, the loss function of the backbone network and the loss functions of the branch networks together form a joint objective function comprising the Dice similarity coefficient (DSC) loss function and the cross-entropy loss function. The DSC loss function is defined in equation (3):
L_DSC = 1 - (1/n) Σ_{i=1..n} 2 x_i y_i / (x_i + y_i)  (3)
where X and Y denote the predicted voxels and the ground-truth target voxels respectively, n is the number of classes to be segmented (including background), and x_i and y_i are the numbers of voxels labeled as class i in the predicted and ground-truth voxel data. A weight W is introduced for the cross-entropy loss function, as in equation (4):
W = 1 - N_k / N_c  (4)
where N_k is the number of target voxel labels in the voxel data to be segmented and N_c is the total number of voxels in that data. The cross-entropy loss function is given in equation (5):
H = -Σ_i W y_i log(x_i)  (5)
The joint objective function built from the loss functions defined above is given in equation (6):
L_joint = L_0 + H_0 + Σ_{k=1..m} (λ_1k L_k + λ_2k H_k)  (6)
where m is the number of supervised hidden layers, and λ_1k and λ_2k are the hyperparameters of the k-th supervised hidden layer's loss function, with values in the range 0-1. Because the joint loss should be dominated by the loss of the network backbone and merely assisted by the hidden-layer losses, this invention takes m = 2 and sets λ_1k and λ_2k to 0.6 and 0.3 respectively; L_k and H_k are the DSC loss and the cross-entropy loss of the k-th supervised hidden layer. The objective built from the backbone and branch losses jointly guides network training, reduces gradient vanishing, and accelerates the convergence of the network. A sketch of this joint loss follows.
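A minimal PyTorch-style sketch of the joint deep-supervision loss. Equations (3) to (6) appear only as images in the source, so the exact formulas used here, and all names, are assumptions:

```python
import torch
import torch.nn.functional as F

def dsc_loss(probs, onehot, eps=1e-6):
    """Multi-class soft Dice loss over (B, C, D, H, W) tensors."""
    dims = (0, 2, 3, 4)
    inter = (probs * onehot).sum(dims)
    denom = probs.sum(dims) + onehot.sum(dims)
    return 1.0 - (2.0 * inter / (denom + eps)).mean()

def joint_loss(main_logits, aux_logits_list, labels, class_weights,
               n_classes, lam1=0.6, lam2=0.3):
    """Backbone DSC + weighted cross-entropy loss, plus down-weighted
    losses of the m=2 supervised hidden layers (lambda_1k=0.6, lambda_2k=0.3)."""
    onehot = F.one_hot(labels, n_classes).permute(0, 4, 1, 2, 3).float()

    def both(logits):
        l = dsc_loss(F.softmax(logits, dim=1), onehot)
        h = F.cross_entropy(logits, labels, weight=class_weights)
        return l, h

    l0, h0 = both(main_logits)
    total = l0 + h0
    for aux in aux_logits_list:         # the two auxiliary predictions
        lk, hk = both(aux)
        total = total + lam1 * lk + lam2 * hk
    return total
```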
The third step: and predicting a segmentation result with the resolution of 48 multiplied by 48 by adopting a precise segmentation algorithm, and reducing the extracted position of the candidate region recorded in the 2) stage to a corresponding position in the complete CT as a final segmentation result.
Compared with the prior art, the invention has the following obvious advantages and beneficial effects:
the invention provides an automatic segmentation algorithm for a temporal bone inner ear bone cavity structure of a coarse-to-fine dense coding and decoding network. Combining the characteristic of tiny volume of the anatomical structure of the temporal bone, firstly adopting a light-weight and high-efficiency general medical image anatomical segmentation algorithm to carry out rough segmentation on the anatomical structure to be segmented, obtaining the spatial position distribution range of the anatomical structure, extracting small sub-regions by extending the center of the distribution range outwards, and reducing the candidate region of the structure for accurate segmentation. On the basis of reducing the area to be segmented, an accurate segmentation algorithm with dense connection and cavity convolution is adopted to automatically segment the bone cavity structure of the inner ear of the temporal bone more accurately. According to the method, a coarse-to-fine segmentation frame is adopted, a candidate region is determined by performing coarse segmentation on the anatomical structure to be segmented, and then accurate segmentation is performed in the candidate region, so that the problems that the anatomical structure target of the temporal bone region is small and the influence of a complex background is large are further solved, and the segmentation precision and the segmentation speed of the temporal bone inner ear bone cavity structure are improved. The invention can replace manual drawing and realize automatic segmentation of the bone cavity structure of the inner ear of the temporal bone by a computer.
The invention has the characteristics that:
1. Addressing the complexity and fineness of the inner ear bone cavity structures in temporal bone CT, the algorithm proposes a coarse-to-fine segmentation framework: a lightweight coarse segmentation method determines the candidate region for precise anatomical segmentation, and a more complex fine segmentation algorithm then segments precisely within that region;
2. In the encoding stage of the fine segmentation algorithm, a dense connection module with hole convolution is introduced to capture the complex spatial information around the anatomy to be segmented; in the decoding stage, combining transposed convolutions with dense connection blocks strengthens the reuse of decoded information. Together these improve segmentation performance on the tiny anatomical structures of the temporal bone.
Drawings
FIG. 1 is the overall framework diagram of the coarse-to-fine segmentation algorithm for the temporal bone inner ear bone cavity structures;
FIG. 2 is a diagram of the overall network architecture of the temporal bone inner ear bone cavity structure fine segmentation algorithm;
FIG. 3 is a schematic diagram of the standard dense connection module and the dense connection module with hole convolution;
FIG. 4 is a diagram illustrating standard convolution and hole convolution;
FIG. 5 shows the segmentation results of the temporal bone inner ear bone cavity structures.
The specific embodiments are as follows:
the following description of the embodiments of the present invention is provided in conjunction with the accompanying drawings:
We collected 64 manually standardized temporal bone CT datasets approved by the ethics committee of Beijing Friendship Hospital, affiliated with Capital Medical University. Patient information in all data was de-identified according to hospital requirements. Of the 64 normal-subject temporal bone CT datasets, 33 were male and 31 female, with an average age of 44 years. An experienced radiologist at the Friendship Hospital was invited to perform voxel-level annotation of the 5 inner ear bone cavity structures in the temporal bone CT. In the experiments, 56 cases were used as the training set and 8 cases as the test set.
The data preprocessing employed by the present invention includes resampling of the CT image and the labeled data.
To avoid inconsistent CT data distributions caused by the differing pixel spacing and slice spacing of CT acquisition equipment of different brands and parameters, the CT image data are resampled to a uniform pixel spacing and slice spacing of 0.5 mm with a B-spline interpolation algorithm, and the annotation data are resampled to the same uniform 0.5 mm spacing with a nearest-neighbor algorithm, as in the sketch below.
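A minimal resampling sketch using SimpleITK; the library choice and function name are assumptions, as the patent only names the interpolation schemes:

```python
import SimpleITK as sitk

def resample_to_spacing(image, spacing=(0.5, 0.5, 0.5), is_label=False):
    """Resample a CT (B-spline) or its label map (nearest neighbor)
    to uniform 0.5 mm pixel and slice spacing."""
    orig_spacing = image.GetSpacing()
    orig_size = image.GetSize()
    new_size = [int(round(sz * osp / nsp))
                for sz, osp, nsp in zip(orig_size, orig_spacing, spacing)]
    return sitk.Resample(
        image, new_size,
        sitk.Transform(),  # identity transform
        sitk.sitkNearestNeighbor if is_label else sitk.sitkBSpline,
        image.GetOrigin(), spacing, image.GetDirection(),
        0, image.GetPixelID())
```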
1) Coarse positioning stage of critical anatomical structures of temporal bones:
the first step is as follows: the 3D Unet algorithm which is light and is generally applied in the field of medical image segmentation is adopted to carry out rough segmentation on the 5 inner ear bone cavity. The segmentation test was performed using 56 training sets and 8 test sets. In the training stage, one example is randomly extracted from 56 complete cases of temporal bone CT image data and corresponding labeling data of the inner ear bone cavity structure, a cubic region with the actual physical size of 24mm is randomly extracted from the same position in the case of the CT image data and the labeling, and all the CT regions after resampling have uniform pixel spacing and layer spacing of 0.5mm, so that the voxel regions corresponding to the region are uniformly 48 multiplied by 48 cubes, if the labeled cube contains a labeling foreground of a target structure, namely the extracted labeled cube contains a label value of 1, the HU value of the extracted temporal bone CT image cube is cut off, and T is used for cutting off the HU value of the extracted temporal bone CT image cubeminIs set to-1024, TmaxAnd 2347, the influence of the irrelevant background on the target structure segmentation is reduced, the truncated HU values are normalized to be data distribution with the mean value of 0 and the variance of 1, and finally the data distribution is sent to a rough segmentation network for training, otherwise, the cubes with the same size are extracted again until the data distribution is labeled in the range. The net training setting batch size is 2, the initial learning rate is 0.0001, and the momentum factor is set to 0.5. A DSC loss function and a cross-entropy loss function with weights are used. After every 20 batchs are trained, randomly extracting a piece of data in a verification set for verification, adopting a DSC coefficient as an evaluation result, wherein a formula is shown as a formula (7), storing a model every 2000 times, guiding the loss convergence of the accurate segmentation model, verifying that the DSC index is close to saturation, and selecting the model with the highest verification result index as the optimal model for the accurate segmentation network training.
Figure BDA0002897096800000091
The second step is that: and in the coarse segmentation algorithm testing stage, the complete 8 cases of test data are input into a 3D Unet algorithm for segmentation testing in a sliding window mode with overlapping, the overlapping rate is 2, the cube size of each sliding window is 48 multiplied by 48.
The third step: for the rough segmentation result, removing abnormal outliers of the 3D Unet segmentation by adopting an absolute median difference algorithm and taking a threshold value as 2.5;
the third step: and taking the central point of the voxel-level segmentation result of the inner ear bone cavity structure as the central point for further candidate region extraction.
2) Temporal bone key anatomical structure candidate region extraction stage:
the first step is as follows: and counting the extraction size of the key anatomical candidate region of the temporal bone CT. According to voxel-level labeled data based on an inner ear bone cavity structure, in 56 cases of training data sets, the maximum values and the minimum values of all labeled voxel points of the inner ear bone cavity structure in the x, y and z dimensions are counted, the corresponding actual physical distances are counted and calculated according to the difference values between the maximum values of the x, y and z and the minimum values of the x, y and z dimensions, the actual physical distances are expanded outwards, and the requirement of a 3D Unet segmentation algorithm for three-time down-sampling is met. The actual physical distance of the candidate bounding box region for finally determining the inner ear bone cavity structure is 24 × 24 × 24, and the corresponding voxel size is 48 × 48 × 48.
The second step is that: and according to the structural center point of the inner ear bone cavity in the first stage, respectively extending the inner ear bone cavity in the positive and negative directions of x, y and z by 12mm, namely, 24 voxel distances, and taking the inner ear bone cavity as a candidate region for further accurate segmentation. .
3) Accurate segmentation stage of key anatomical structure of temporal bone:
the precise segmentation stage of the key anatomical structure of the temporal bone is mainly divided into an encoding stage and a decoding stage.
a) Encoding stage
The first step is as follows: sending the 48 multiplied by 48 sub-area of the inner ear bone cavity area extracted in the 2) stage into a precise segmentation algorithm. And (3) truncating a value of the CT value of the sub-region of the CT image data, which is smaller than-1024, to-1024, and truncating a value of the CT value, which is larger than 2347, to 2347, and normalizing the values into data distribution with the mean value of 0 and the variance of 1.
The second step is that: and the dense connection module is combined with the cavity convolution to extract the characteristics. The inner ear bone cavity structure has small volume, in order to more fully extract the structure to be segmented, dense connecting blocks are adopted for feature extraction, and each layer of dense connecting module consists of three parts, namely a group of batch normalization-correction linear units-convolution layer, splicing layer and bottleneck layer. In the encoding phase, we use 3 layers of densely connected modules. In order to obtain a larger receptive field through convolution output and extract a larger range of spatial information, the convolution layer of the 3 rd densely connected module is replaced by a cavity convolution with the expansion rate of 2.
The third step: pooling features are fused. The multi-pooling feature fusion strategy is used, batch normalization-correction activation unit-convolution layer is adopted after the dense connection module of each level outputs, in order to prevent overfitting in the training process, a dropout layer with a discarding rate of 0.5 is usually adopted, 3D maximum pooling and 3D average pooling are simultaneously adopted after the dropout layer, and results after pooling are spliced. The 3D max pooling may preserve edge features of the volumetric data and the 3D average pooling may preserve background information of the volumetric data. The splicing of the two can provide rich characteristic information for subsequent segmentation.
b) Decoding stage
i. Feature upsampling with transposed convolution and dense connection modules.
The first step is as follows: the characteristics of the output of the first, second and third densely connected network blocks in the encoding stage are respectively F1,F2,F3The resolutions thereof were 48X 48, 24X 2412X 12, respectively. To F3Adopting transposition convolution operation, the convolution step length is 2, the boundary filling adopts 0 to fill, and outputting characteristic T after the first group of transposition convolution2The size is 24 × 24 × 24;
the second step is that: outputting F from the second densely-connected block in the encoding stage2And T2Channel splicing is carried out to form a new characteristic group C2Set of characteristics C2Extracting features through the dense connecting blocks 4 in the decoding stage to obtain features D2
The third step: output F of the first densely packed block of the encoding stage1Passing through a 3D convolution layer to obtain 64 features M1By using a transposed convolution operation, D2Up-sampling is carried out until the sampling rate is 48 multiplied by 48, and the output characteristic is recorded as T1Will feature F1、M1、T1Performing characteristic splicing to obtain a characteristic group C1Wherein M is1And F1The splicing of the method is respectively spliced in the form of short connection and long connection, and the semantic gap between the low-level space characteristic and the high-level semantic characteristic is reduced. Will be characteristic group C1Inputting the dense connection block 5 in the decoding stage, increasing the reuse of the features and obtaining the features D1Finally, by a bottleneck convolution of 1 × 1 × 1, pair D1And (5) performing channel screening as final output.
ii. 3D deep supervision mechanism.
The first step is as follows: in the decoding stage, for feature D2Performing transposed convolution operation, up-sampling feature data to 48 × 48 × 48, performing bottleneck convolution of 1 × 1 × 1, outputting 2 feature cubes, calculating the probability that each voxel is a target anatomical structure according to softmax, and recording as auxiliary prediction aux _ pred 1;
the second step is that: encoding stage feature set M1Calculating the probability that each voxel is the target anatomical structure according to softmax by using 1 × 1 × 1 convolution, and recording as auxiliary prediction aux _ pred 2;
the third step: for feature group D1Inputting 1 × 1 × 1 bottleneck convolution for channel screening, calculating classification probability of each voxel according to softmax, and recording as trunk prediction main _ pred;
the fourth step: joint loss guides network training. And respectively comparing the network prediction results aux _ pred1, aux _ pred2 and main _ pred with a manually marked gold standard GT, calculating cross entropy loss and DSC loss, and forming joint loss guide network training by the loss obtained by the auxiliary prediction result and the loss obtained by the main prediction result.
The training and testing process of the fine segmentation network is as follows:
a) Model training
In the model training stage, the 48 × 48 × 48 sub-volumes extracted in the previous stage are input to the segmentation algorithm for training. Training uses batch size 2, initial learning rate 0.0001, and momentum coefficient 0.5. After every 20 batches, one piece of data is drawn at random from the validation set for validation, with the DSC coefficient as the evaluation measure; the model is saved every 2000 iterations until the loss of the fine segmentation model converges and the validation DSC index approaches saturation, and the model with the highest validation index is selected as the optimal model of the fine segmentation network.
b) Model testing
In the model testing stage, fine segmentation prediction is performed on the 48 × 48 × 48 sub-region extracted in the second stage; after prediction, the fine segmentation result is restored to the corresponding position according to the position recorded when the sub-region was extracted, giving the final segmentation result of the temporal bone CT image.
Subjective segmentation results of the algorithm on the inner ear bone cavity structures are shown in Figure 5.

Claims (1)

1. A method for automatic segmentation of the temporal bone inner ear bone cavity structures based on a coarse-to-fine dense network, characterized by comprising 3 stages overall: a coarse localization stage of the inner ear bone cavity structures based on coarse segmentation, candidate region extraction for the inner ear bone cavity structures, and fine segmentation of the inner ear bone cavity structures;
the method specifically comprises the following steps:
1) coarse localization stage of the temporal bone inner ear bone cavity structures based on coarse segmentation:
the first step: coarsely segmenting the temporal bone inner ear bone cavity structures; in the coarse segmentation model training stage, a 48 × 48 × 48 cube is randomly extracted at the same position from a complete temporal bone CT volume and its annotation file; if the cube contains labels of the target structure, the HU values of the extracted temporal bone CT cube are truncated, HU values below T_min being set to T_min and HU values above T_max being set to T_max, where T_min and T_max range between the lowest and the highest HU value; in temporal bone CT, the HU value of air is -1024 and that of bone exceeds 300; truncating the data at a minimum of -1024 and a maximum of 2000 or above fully preserves the temporal bone region for further temporal bone analysis; the truncated HU values are normalized to a data distribution with mean 0 and variance 1 and fed into the coarse segmentation network for training; otherwise, a cube of the same size is re-extracted until one containing labels of the anatomical structures is obtained;
the second step: partitioning the complete temporal bone CT volume into groups of blocks using an overlapping sliding window, feeding the blocks in order into the trained coarse segmentation network for segmentation, and stitching the segmentation results back in input order into a volume consistent with the full temporal bone CT size; an overlap rate of 2 is chosen, and the input complete temporal bone CT volume is divided into groups of overlapping blocks input to the segmentation network for prediction;
the third step: removing, based on the voxel-level labels of each structure, the outliers produced by the coarse segmentation with the median absolute deviation method; the median absolute deviation is computed as in equation (1):
MAD = median(|X_i - median(X)|)  (1)
where X is the set of all points predicted as foreground, X_i ∈ X is the i-th point of X, and median(·) is the median of a point set; the MAD procedure is as follows:
(1) compute the median, median(X), of the coordinates of all predicted foreground points;
(2) compute each predicted foreground point's absolute deviation from the median, abs(X_i - median(X));
(3) compute the median MAD of the absolute deviations from (2);
(4) divide the values from (2) by the value from (3) to obtain, for all predicted foreground points, a MAD-based distance Dis from the center, as in equation (2):
Dis_i = abs(X_i - median(X)) / MAD  (2)
(5) remove as outliers the points whose largest Dis value across the x, y, and z dimensions exceeds the threshold Th; 3.5 is used as the screening threshold to remove outliers far from the target region;
2) temporal bone key anatomical structure candidate region extraction stage:
first, determining the extraction size of the candidate region of the inner ear bone cavity structures: based on the statistical voxel-level annotation data, the maxima and minima of all annotated voxel points of the structures in the x, y, and z dimensions are counted and the voxel coordinate span of the anatomy is estimated from the differences between the maxima and minima; the actual physical bounding-box size of the bone cavity structures is computed for each CT from its pixel spacing and slice spacing, the maximum actual physical size of each structure is extended outward to 24 mm × 24 mm × 24 mm, and for a temporal bone CT with 0.5 mm slice and pixel spacing a 48 × 48 × 48 cube is taken, which fully encloses the target segmentation structure while matching the input of the segmentation algorithm, and serves as the candidate region extraction size;
second, extracting the region of interest by combining the coarse localization center point of the anatomy to be segmented with the prior bounding-box size of that anatomy: the region of interest is described by a region center point and a three-dimensional size, the cube data being extracted by extending outward from the center point of the key temporal bone structures predicted in the first stage, with the three-dimensional size of the cube computed from the bounding-box statistics of the first step; the extracted sub-region serves as the candidate region for further fine segmentation and its position is recorded;
3) temporal bone inner ear bone cavity structure fine segmentation stage:
the fine segmentation stage comprises two processes, encoding and decoding;
a) encoding stage
the first step: data truncation and normalization; the 48 × 48 × 48 voxel sub-region of the candidate area extracted in stage 2) is fed into the fine segmentation algorithm; the CT values of the sub-region are truncated according to the HU value distribution of temporal bone CT, values below T_min being truncated to T_min and values above T_max to T_max, with T_min and T_max chosen consistently with the coarse localization stage; the data are then normalized to a distribution with mean 0 and variance 1;
the second step: extracting features with a densely connected network containing hole convolutions; the 48 × 48 × 48 sub-region containing the inner ear bone cavity structures to be segmented, truncated and normalized in the first step, is fed into the network; 3 groups of dense connection modules are designed in the encoding stage; each module consists of three parts, batch normalization-rectified linear unit-convolution layers, concatenation, and a bottleneck layer, and all convolutions of the dense connection module are 3 × 3 × 3; the batch normalization-rectified linear unit-convolution layer consists of the operations batch normalization, rectified linear unit, and convolution; concatenation cascades the feature maps at the channel level; the bottleneck layer reduces the number of feature maps output by the dense connection blocks; the third group of dense connection modules uses a hole convolution module;
the third step: multi-pooling feature fusion; a batch normalization-rectified linear unit-convolution layer is applied after the output of the dense connection module at each level; to prevent overfitting, a Dropout layer with rate 0.5 follows, after which 3D max pooling and 3D average pooling are applied simultaneously and the pooled results are concatenated;
b) decoding stage
the first step: feature upsampling with transposed convolution and dense connection modules; during decoding, transposed convolutions with dense connection modules decode the semantic information; the 12 × 12 × 12 tensor features of the last encoding layer are upsampled by two transposed convolutions back to the original input size of 48 × 48 × 48; after the two transposed convolutions of the decoding stage, dense connection blocks replace ordinary convolution layers;
the second step: 3D deep supervision mechanism; in the encoding stage, the features output by the first densely connected network block are extracted with 64 convolution kernels, passed through a 1 × 1 × 1 convolution, and followed by a softmax layer to output an auxiliary segmentation result; the second layer of the decoding stage applies convolutions to the concatenated features to extract them further, first upsampling the obtained features with a transposed convolution to raise the resolution, then applying a 1 × 1 × 1 convolution kernel and a softmax layer to obtain a second auxiliary segmentation result; the last layer of the decoding stage applies convolutions with different kernels to the concatenated features and outputs the prediction of the backbone network; during network training, the loss function of the backbone network and the loss functions of the branch networks together form a joint objective function comprising the Dice similarity coefficient (DSC) loss function and the cross-entropy loss function; the DSC loss function is defined in equation (3):
L_DSC = 1 - (1/n) Σ_{i=1..n} 2 x_i y_i / (x_i + y_i)  (3)
where X and Y denote the predicted voxels and the ground-truth target voxels respectively, n is the number of classes to be segmented, and x_i and y_i are the numbers of voxels labeled as class i in the predicted and ground-truth voxel data; a weight W is introduced for the cross-entropy loss function, as in equation (4):
W = 1 - N_k / N_c  (4)
where N_k is the number of target voxel labels in the voxel data to be segmented and N_c is the total number of voxels in that data; the cross-entropy loss function is given in equation (5):
H = -Σ_i W y_i log(x_i)  (5)
the joint objective function built from the loss functions defined above is given in equation (6):
L_joint = L_0 + H_0 + Σ_{k=1..m} (λ_1k L_k + λ_2k H_k)  (6)
where m is the number of supervised hidden layers, λ_1k and λ_2k are the hyperparameters of the k-th supervised hidden layer's loss function, m is 2, and λ_1k and λ_2k take the values 0.6 and 0.3 respectively; L_k and H_k are the DSC loss and the cross-entropy loss of the k-th supervised hidden layer; the objective built from the backbone and branch losses jointly guides network training, reduces gradient vanishing, and accelerates the convergence of the network;
the third step: the fine segmentation algorithm predicts a segmentation result at 48 × 48 × 48 resolution, which is restored to the corresponding position in the complete CT according to the candidate-region position recorded in stage 2), as the final segmentation result.
CN202110045206.6A (filed 2021-01-14; priority 2021-01-14). Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network. Status: Pending. Publication: CN112634293A.

Priority Applications (1)

CN202110045206.6A (priority date 2021-01-14; filing date 2021-01-14): Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network

Applications Claiming Priority (1)

CN202110045206.6A (priority date 2021-01-14; filing date 2021-01-14): Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network

Publications (1)

CN112634293A, published 2021-04-09

Family

ID=75294120

Family Applications (1)

CN202110045206.6A (priority date 2021-01-14; filing date 2021-01-14): Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network

Country status (1): CN (CN112634293A)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544264A (en) * 2019-08-28 2019-12-06 北京工业大学 Temporal bone key anatomical structure small target segmentation method based on 3D deep supervision mechanism
CN111192245A (en) * 2019-12-26 2020-05-22 河南工业大学 Brain tumor segmentation network and method based on U-Net network
CN112116605A (en) * 2020-09-29 2020-12-22 西北工业大学深圳研究院 Pancreas CT image segmentation method based on integrated depth convolution neural network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850760A (en) * 2021-08-27 2021-12-28 北京工业大学 Vestibule detection method based on ear CT (computed tomography) image
CN113850760B (en) * 2021-08-27 2024-05-28 北京工业大学 Ear CT image vestibule detection method
CN117455935A (en) * 2023-12-22 2024-01-26 中国人民解放军总医院第一医学中心 Abdominal CT (computed tomography) -based medical image fusion and organ segmentation method and system
CN117455935B (en) * 2023-12-22 2024-03-19 中国人民解放军总医院第一医学中心 Abdominal CT (computed tomography) -based medical image fusion and organ segmentation method and system


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination