WO2023045231A1 - A decoupled divide-and-conquer facial nerve segmentation method and device - Google Patents

A decoupled divide-and-conquer facial nerve segmentation method and device (一种解耦分治的面神经分割方法和装置)

Info

Publication number
WO2023045231A1
WO2023045231A1 PCT/CN2022/076927 CN2022076927W WO2023045231A1 WO 2023045231 A1 WO2023045231 A1 WO 2023045231A1 CN 2022076927 W CN2022076927 W CN 2022076927W WO 2023045231 A1 WO2023045231 A1 WO 2023045231A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
facial nerve
segmentation
feature
label
Prior art date
Application number
PCT/CN2022/076927
Other languages
English (en)
French (fr)
Inventor
王静
董波
何宏建
蔡秀军
Original Assignee
浙江大学
Priority date
Filing date
Publication date
Application filed by 浙江大学 filed Critical 浙江大学
Publication of WO2023045231A1 publication Critical patent/WO2023045231A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/10Computer-aided planning, simulation or modelling of surgical operations
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/20Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/10Computer-aided planning, simulation or modelling of surgical operations
    • A61B2034/101Computer-aided simulation of surgical operations
    • A61B2034/105Modelling of the patient, e.g. for ligaments or bones
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/10Computer-aided planning, simulation or modelling of surgical operations
    • A61B2034/107Visualisation of planned trajectories or target regions
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B34/00Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B34/10Computer-aided planning, simulation or modelling of surgical operations
    • A61B2034/108Computer aided selection or customisation of medical implants or cutting guides
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Definitions

  • The invention relates to the field of medical image processing, and in particular to a decoupled divide-and-conquer facial nerve segmentation method and device.
  • Robotic cochlear implantation, an automated procedure for restoring hearing to patients, relies on precise preoperative planning to avoid damage to critical anatomical structures.
  • The identification of pericochlear tissue structures plays a crucial role in preoperative path planning.
  • The facial nerve is the most critical tissue structure around the cochlea: damage to it may lead to permanent facial nerve palsy, and it lies less than 1 mm from the surrounding tissues.
  • Accurate segmentation of the facial nerve faces two major challenges: (1) the facial nerve structure is very small and occupies only a tiny region of a CT image (in a 512 × 512-pixel whole-brain CT image, the facial nerve region covers only 9–16 pixels, about 0.0034% of the image); (2) the contrast between the facial nerve and the surrounding tissue is low, and its boundary with the surroundings is usually blurred, lacking the strong contrast that traditional segmentation methods require.
  • The purpose of the present invention is to provide a decoupled divide-and-conquer facial nerve segmentation method and device that effectively address the impact of the facial nerve's small size and low contrast on segmentation, improve the accuracy and speed of automatic facial nerve segmentation, and meet the path-planning needs of robotic cochlear implantation.
  • An embodiment provides a decoupled divide-and-conquer facial nerve segmentation method, comprising the following steps:
  • Construct a facial nerve segmentation model comprising a feature extraction module, a coarse segmentation module, and a fine segmentation module. An input CT image sample passes through the feature extraction module to obtain one low-level feature map and multiple high-level feature maps at different levels. The coarse segmentation module comprises a search-and-identification unit and a pyramid fusion unit: the high-level feature maps at different levels pass in parallel through the search-and-identification unit for a global facial nerve search, and the resulting facial nerve feature maps are fused by the pyramid fusion unit into a fusion feature map. The fine segmentation module comprises a decoupling unit and a spatial attention unit: the fusion feature map undergoes a feature-space transformation in the decoupling unit to obtain a central subject feature map, which is combined with the low-level feature map to obtain an edge detail feature map; the central subject feature map and the edge detail feature map each pass through the spatial attention unit for attention feature extraction, and the extracted results are fused and processed by the spatial attention unit again to obtain the facial nerve segmentation map.
  • Construct a loss function that includes the difference between the fusion feature map and the original label of the CT image sample, the difference between the facial nerve segmentation map and the original label, the difference between the subject label and the prediction obtained from the central subject feature map after the spatial attention unit, and the difference between the detail label and the prediction obtained from the edge detail feature map after the spatial attention unit.
  • After the parameters of the facial nerve segmentation model are optimized with the sample set and the loss function, facial nerve segmentation is performed on input CT image samples with the trained model to obtain the facial nerve segmentation map.
  • The feature extraction module adopts a modified Res2Net50: all fully connected layers and the last convolution group of the original Res2Net50 are removed, and the remaining convolution groups form the modified Res2Net50. The input CT image sample is fed into this network; the output of the first convolution group is the low-level feature map, and the outputs of the other convolution groups are the high-level feature maps at different levels. A minimal backbone sketch follows below.
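  • A minimal sketch of such a feature-extraction backbone is given below. torchvision's ResNet-50 (recent torchvision API) is used purely as a stand-in for the modified Res2Net50 — an assumption for illustration, since both share a stem followed by four convolution groups; the last convolution group and the fully connected head are dropped, and the downsampling of the second group is disabled to keep a larger resolution, as the embodiment later describes.

```python
# Sketch only: torchvision ResNet-50 stands in for the modified Res2Net50.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        net = resnet50(weights=None)                                        # stand-in backbone
        net.conv1 = nn.Conv2d(1, 64, 7, stride=2, padding=3, bias=False)    # single-channel CT input
        self.group1 = nn.Sequential(net.conv1, net.bn1, net.relu)           # conv group 1 -> f1
        self.group2 = nn.Sequential(nn.Identity(), net.layer1)              # group 2, downsampling disabled
        self.group3 = net.layer2                                            # conv group 3 -> f3
        self.group4 = net.layer3                                            # conv group 4 -> f4
        # net.layer4 and net.fc are intentionally unused ("last conv group and FC layers removed").

    def forward(self, x):
        f1 = self.group1(x)      # low-level feature map,  H/2 x W/2
        f2 = self.group2(f1)     # high-level feature map, H/2 x W/2
        f3 = self.group3(f2)     # high-level feature map, H/4 x W/4
        f4 = self.group4(f3)     # high-level feature map, H/8 x W/8
        return f1, (f2, f3, f4)


if __name__ == "__main__":
    f1, highs = Backbone()(torch.randn(1, 1, 352, 352))
    print(f1.shape, [h.shape for h in highs])
```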
  • In the search-and-identification unit, the global facial nerve search on a high-level feature map proceeds as follows:
  • The high-level feature map is separated by channel to obtain a separated feature map, which is then processed by a multi-branch operation: in the first branch, the separated feature map undergoes a convolution operation to convert the number of channels; in the remaining branches, the separated feature map undergoes a convolution operation to convert the number of channels, followed by an asymmetric convolution operation and a dilated convolution operation.
  • The operation results of the feature maps of all branches are fused, expanding the high-level feature map;
  • The inverse of the separation operation is applied to the fused result to reconstruct the features and obtain the facial nerve feature map. A sketch of this unit appears below.
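  • A rough sketch of such a search-and-identification unit (SIM) is shown below, using PixelUnshuffle/PixelShuffle for the channel separation and reconstruction. The branch widths, kernel sizes, and dilation rates are assumptions; the patent fixes only the 2 × 2 separation, the 32-channel branches, and the use of asymmetric and dilated convolutions.

```python
import torch
import torch.nn as nn


def conv_bn_relu(cin, cout, k, padding=0, dilation=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, padding=padding, dilation=dilation, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )


class SIM(nn.Module):
    def __init__(self, cin, width=32):
        super().__init__()
        self.separate = nn.PixelUnshuffle(2)          # N x N x c -> N/2 x N/2 x 4c (channel separation)
        c = 4 * cin
        self.branch0 = conv_bn_relu(c, width, 1)      # first branch: channel conversion only
        self.branches = nn.ModuleList()
        for rate, k in [(3, 3), (5, 5), (7, 7)]:      # assumed dilation rates / kernel sizes
            self.branches.append(nn.Sequential(
                conv_bn_relu(c, width, 1),
                conv_bn_relu(width, width, (1, k), padding=(0, k // 2)),    # asymmetric convolution
                conv_bn_relu(width, width, (k, 1), padding=(k // 2, 0)),    # asymmetric convolution
                conv_bn_relu(width, width, 3, padding=rate, dilation=rate)  # dilated convolution
            ))
        self.fuse = conv_bn_relu(4 * width, 4 * width, 3, padding=1)        # fuse all branches
        self.reconstruct = nn.PixelShuffle(2)         # inverse of the separation operation
        self.out = conv_bn_relu(width, width, 1)

    def forward(self, x):
        s = self.separate(x)
        feats = [self.branch0(s)] + [b(s) for b in self.branches]
        fused = self.fuse(torch.cat(feats, dim=1))
        return self.out(self.reconstruct(fused))      # facial nerve feature map


if __name__ == "__main__":
    print(SIM(256)(torch.randn(1, 256, 88, 88)).shape)   # -> (1, 32, 88, 88)
```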
  • In the decoupling unit, a spatial transformation model performs a feature-space transformation on the fusion feature map to obtain the central subject feature map; the transformation consists of parameter prediction, coordinate mapping, and pixel sampling:
  • Parameter prediction: a convolutional layer predicts a transformation from the fusion feature map, yielding a parameter matrix;
  • Coordinate mapping: the element values of the parameter matrix are used as per-pixel offsets, and these offsets map the pixels of the fusion feature map from the regular spatial grid to new coordinates, producing a new fusion feature map;
  • Pixel sampling: a differentiable bilinear sampling mechanism samples the new fusion feature map, and the sampled pixels form the central subject feature map. A sketch follows below.
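  • A minimal sketch of such a decoupling unit (DOM) is given below: a 3 × 3 convolution predicts a two-channel offset field, the offsets warp the regular sampling grid, and differentiable bilinear sampling (F.grid_sample) yields the central subject feature map. The offset scaling is an assumption; the patent does not specify it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DOM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.offset = nn.Conv2d(channels, 2, kernel_size=3, padding=1)   # parameter prediction

    def forward(self, fused):
        b, _, h, w = fused.shape
        theta = self.offset(fused)                                        # (B, 2, H, W) offsets
        # Regular grid in normalized [-1, 1] coordinates (the "standard space grid").
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=fused.device),
            torch.linspace(-1, 1, w, device=fused.device),
            indexing="ij",
        )
        grid = torch.stack((xs, ys), dim=-1).expand(b, h, w, 2)
        # Coordinate mapping: p_l + theta(p_l), offsets treated as normalized units (assumption).
        warped_grid = grid + theta.permute(0, 2, 3, 1)
        # Pixel sampling: differentiable bilinear interpolation over the warped grid.
        body = F.grid_sample(fused, warped_grid, mode="bilinear", align_corners=True)
        return body                                                        # central subject feature map T1


if __name__ == "__main__":
    print(DOM(32)(torch.randn(2, 32, 88, 88)).shape)                       # -> (2, 32, 88, 88)
```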
  • The obtained central subject feature map is combined with the low-level feature map to obtain the edge detail feature map, as follows:
  • The low-level feature map passes through the search-and-identification unit for a global facial nerve search, yielding a facial nerve feature map, and the difference between the central subject feature map and the fusion feature map is computed.
  • The difference is concatenated with that facial nerve feature map, and the edge detail feature map is obtained through convolutional-layer fusion (see the sketch below).
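  • A short sketch of this edge-detail branch, under the assumption that the fusion feature map, the central subject map, and the low-level facial nerve feature map have already been brought to the same spatial size:

```python
import torch
import torch.nn as nn


class EdgeDetail(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, fused, body, c1):
        detail = fused - body                              # fusion map minus central subject map
        return self.fuse(torch.cat([detail, c1], dim=1))   # edge detail feature map T2


if __name__ == "__main__":
    x = torch.randn(1, 32, 88, 88)
    print(EdgeDetail()(x, x, x).shape)                     # -> (1, 32, 88, 88)
```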
  • In the spatial attention unit, attention feature extraction from an input map proceeds as follows:
  • The input map undergoes a convolution operation and a global average pooling operation, is further screened by a gating mechanism composed of an activation layer and fully connected layers, and is then activated by an activation function to obtain the prediction result of attention feature extraction for that input map.
  • The input map is, in turn, the central subject feature map, the edge detail feature map, and the fusion of the two attention-extraction predictions corresponding to the central subject feature map and the edge detail feature map. A sketch of such a gating block follows.
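  • A sketch of such a spatial attention unit (SAM), read here as a squeeze-and-excitation-style gating block, is given below. The reduction ratio of the fully connected layers is an assumption.

```python
import torch
import torch.nn as nn


class SAM(nn.Module):
    def __init__(self, channels=32, reduction=4):
        super().__init__()
        self.compress = nn.Sequential(                   # convolution compresses positional information
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.gate = nn.Sequential(                       # gating mechanism: FC -> activation -> FC
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                                # final activation giving per-channel dependence
        )

    def forward(self, x):
        feat = self.compress(x)
        stats = feat.mean(dim=(2, 3))                    # global average pooling per channel
        weights = self.gate(stats).unsqueeze(-1).unsqueeze(-1)
        return feat * weights                            # re-weighted attention features


if __name__ == "__main__":
    print(SAM()(torch.randn(1, 32, 88, 88)).shape)       # -> (1, 32, 88, 88)
```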
  • The subject label and detail label of a CT image sample are constructed as follows: the original label I is split into foreground I_fg and background I_bg, the distance f(p, q) between each foreground pixel p and the background pixels q is computed, a distance transform yields a transformed label I′, and min–max normalization of I′ gives the normalized label I″.
  • The subject label I_b and detail label I_d of the CT image sample are then determined as:
  • I_b = I · I″
  • I_d = I · (I − I″). A sketch of this label decoupling appears below.
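  • A sketch of this label decoupling is given below, using SciPy's Euclidean distance transform as the distance function f(p, q) — the patent does not name a particular distance — followed by min–max normalization and the two products above.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def decouple_label(label: np.ndarray):
    """label: binary mask (H, W) with 1 = facial nerve foreground."""
    fg = label.astype(bool)
    # Distance of every foreground pixel to the nearest background pixel: transformed label I'.
    dist = distance_transform_edt(fg).astype(np.float32)
    denom = dist.max() - dist.min()
    norm = (dist - dist.min()) / (denom + 1e-8)            # normalized label I''
    body = label * norm                                     # subject label I_b = I * I''
    detail = label * (label - norm)                         # detail label  I_d = I * (I - I'')
    return body, detail


if __name__ == "__main__":
    toy = np.zeros((9, 9), dtype=np.float32)
    toy[3:6, 3:6] = 1.0
    body, detail = decouple_label(toy)
    print(body.round(2), detail.round(2), sep="\n\n")
```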
  • The loss function is expressed as Loss = L(p_b, I_b) + L(p_d, I_d) + L(p_1, I) + L(p_2, I), with L(p_i, I) = L_bce(p_i, I, α) + L_iou(p_i, I, α), where:
  • L_bce(·) denotes the binary cross-entropy loss function,
  • L_iou(·) denotes the intersection-over-union (IoU) loss function,
  • p_b denotes the prediction of the central subject feature map after the spatial attention unit,
  • I_b denotes the subject label,
  • p_d denotes the prediction of the edge detail feature map after the spatial attention unit,
  • I_d denotes the detail label,
  • α is a weight factor,
  • I denotes the original label of the CT image sample,
  • p_1 and p_2 denote the predictions corresponding to the fusion feature map and the facial nerve segmentation map, respectively. A sketch of this combined loss appears below.
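  • A sketch of such a combined BCE + IoU loss applied to the four supervised outputs is given below; the way the weight factor α enters is an assumption.

```python
import torch
import torch.nn.functional as F


def bce_iou_loss(pred, target, alpha=1.0):
    """pred: logits (B, 1, H, W); target: soft or binary label in [0, 1]."""
    bce = F.binary_cross_entropy_with_logits(pred, target, reduction="mean")
    prob = torch.sigmoid(pred)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = (prob + target - prob * target).sum(dim=(1, 2, 3))
    iou = 1.0 - (inter + 1.0) / (union + 1.0)
    return alpha * (bce + iou.mean())


def total_loss(p1, p2, p_body, p_detail, label, label_body, label_detail):
    # Four supervision terms, mirroring the loss description above.
    return (bce_iou_loss(p1, label) + bce_iou_loss(p2, label)
            + bce_iou_loss(p_body, label_body) + bce_iou_loss(p_detail, label_detail))


if __name__ == "__main__":
    x = torch.randn(2, 1, 64, 64)
    y = (torch.rand(2, 1, 64, 64) > 0.5).float()
    print(total_loss(x, x, x, x, y, y, y).item())
```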
  • Preprocessing of the CT images includes data augmentation by random flipping and cropping; the augmented CT images form the sample set.
  • An embodiment also provides a decoupled divide-and-conquer facial nerve segmentation device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above decoupled divide-and-conquer facial nerve segmentation method are implemented.
  • The decoupled divide-and-conquer facial nerve segmentation method and device provided by the embodiments have beneficial effects that include at least the following:
  • Targeting the facial nerve's small size and low contrast, a facial nerve segmentation model comprising a feature extraction module, a coarse segmentation module, and a fine segmentation module is constructed, and the feature extraction module is used to extract low-level features and multiple high-level features at different levels.
  • On this basis, the coarse segmentation module performs a global search and fusion of facial nerve features over the high-level features at different levels, and the fine segmentation module then decouples the fused features to obtain the central subject features.
  • After the central subject features are combined with the low-level features to obtain the edge detail features, spatial-attention feature extraction is applied to both the central subject features and the edge detail features, yielding the facial nerve segmentation map. This method improves the accuracy and speed of automatic facial nerve segmentation and meets the path-planning needs of robotic cochlear implantation.
  • Fig. 2 is a structural diagram of the facial nerve segmentation model in an embodiment of the present invention.
  • Fig. 3 is a schematic structural diagram of the search-and-identification unit in an embodiment of the present invention.
  • Fig. 4 is a schematic structural diagram of the pyramid fusion unit in an embodiment of the present invention.
  • Fig. 5 is a schematic diagram of obtaining the subject labels and detail labels in an embodiment of the present invention.
  • Fig. 6 is a plot of the loss function in an embodiment of the present invention.
  • Fig. 7 shows segmentation results in an embodiment of the present invention.
  • Fig. 8 compares the segmentation results of various methods in an embodiment of the present invention.
  • Fig. 9 is a schematic diagram of the result metrics Dice score and FLOPs in an embodiment of the present invention.
  • To address the facial nerve's small size and low contrast and the high error rate and low speed of traditional facial nerve segmentation, the embodiment provides a decoupled divide-and-conquer facial nerve segmentation method.
  • Fig. 1 is a flow chart of the decoupled divide-and-conquer facial nerve segmentation method in an embodiment of the present invention.
  • The decoupled divide-and-conquer facial nerve segmentation method provided by the embodiment mainly comprises a training stage and a testing stage. First, all collected CT image data are randomly divided into a training set and a test set. In the training stage, the training set is preprocessed with data augmentation such as random flipping and cropping, the augmented training samples are input to the facial nerve segmentation model, and the parameters of the facial nerve segmentation model are updated. The trained facial nerve segmentation model is then validated: the test set is likewise preprocessed with data augmentation such as random flipping and cropping, and the augmented test samples are input to the facial nerve segmentation model.
  • The predictions output by the model are evaluated to judge whether the facial nerve segmentation model is the best model so far; if so, the model parameters are saved, otherwise they are not, and the model is updated iteratively in this way.
  • In the testing stage, the obtained optimal parameters are loaded into the facial nerve segmentation model for testing, and the prediction results are output.
  • Fig. 2 is a structural diagram of a facial nerve segmentation model in an embodiment of the present invention.
  • The facial nerve segmentation model provided by the embodiment includes a feature extraction module, a coarse segmentation module, and a fine segmentation module.
  • The feature extraction module performs feature extraction on the input CT image samples to obtain one low-level feature map and multiple high-level feature maps at different levels.
  • A modified Res2Net50 is used as the backbone network of the feature extraction module. To suit the facial nerve segmentation task, all fully connected layers and the last convolution group of the original Res2Net50 are removed, and the remaining 4 convolution groups form the modified Res2Net50.
  • Because the facial nerve region is small, the embodiment sets the stride of the backbone's second convolution group to 1 to retain a larger resolution, so the feature maps have resolutions {[H/k, W/k], k = 2, 2, 4, 8}. Recent studies show that deeper layers of a network tend to extract the low-frequency subject location of a target, while shallower layers tend to preserve the high-frequency details of the image; the embodiment therefore divides the extracted features into high-level feature maps f2, f3, f4 and a low-level feature map f1.
  • The coarse segmentation module includes the search-and-identification unit SIM and the pyramid fusion unit PFM.
  • Three search-and-identification units SIM perform a global facial nerve search on the high-level feature maps f2, f3, and f4 respectively, capturing small objects by expanding the receptive field to obtain the facial nerve feature maps C2, C3, C4, which are then fused by an effective pyramid fusion unit PFM to obtain the fusion feature map.
  • FIG. 3 is a schematic structural diagram of a search and identification unit in an embodiment of the present invention.
  • In the search-and-identification unit SIM, for a high-level feature map of size N × N × c, a sliding window of size 2 × 2 searches the high-level feature map for 2 × 2 × c patches, and each patch is then rearranged along the channel dimension, so that the high-level feature map is separated by channel.
  • Each patch is unpacked into 4 channels, so the separation operation produces an N/2 × N/2 × 4c feature map in which every point represents 4 feature points of the original high-level feature map. A receptive-field block then processes the feature map of each channel.
  • The receptive-field block contains 4 branches.
  • The first branch uses a 1 × 1 convolutional layer to convolve the feature map of the first channel and convert the number of channels to 32.
  • The remaining three branches use 1 × 1 convolutional layers to convolve the feature maps of the remaining three channels and convert the number of channels to 32, and then apply asymmetric convolutions and dilated convolutions with the corresponding kernels.
  • The processing results of the 4 branches are concatenated and fused to expand the high-level feature map, and finally the inverse of the separation operation is applied to the fused result to reconstruct the features and obtain the facial nerve feature map.
  • Through this separation, expansion, and reconstruction of the high-level feature map, the receptive field of the network is enlarged and the facial nerve is searched globally, yielding an accurate facial nerve feature map.
  • To further strengthen the expressiveness of the high-level features, the pyramid fusion unit PFM fuses the facial nerve feature maps C2, C3, and C4 to obtain the fusion feature map.
  • As shown in Fig. 4, the pyramid fusion unit PFM consists of a first unit, Unit-1, and a second unit, Unit-2.
  • In Unit-1, for the facial nerve feature map C2, which contains both high-level semantic information and low-level detail information, three different convolutional layers convolve C2 to obtain three feature maps α, β, and γ.
  • The feature map α is multiplied with the facial nerve feature map C3 to measure self-similarity, and the result is concatenated with the feature map β to obtain the feature map M1, which retains more information.
  • Finally, two convolutional layers smooth the feature map M1 to obtain a smoothed M1.
  • In Unit-2, a convolution is applied to the facial nerve feature map C3, the resulting feature map is multiplied with the facial nerve feature map C4 to obtain C34, and a strong feature is constructed as the product of γ and C34. The strong feature is concatenated with the smoothed M1, the concatenation is smoothed by two convolutional layers, one convolutional layer compresses the number of channels to 32, and the fusion feature map F1 is output. A sketch of this unit appears below.
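  • A rough sketch of such a pyramid fusion unit (PFM) is given below. The kernel sizes and the bilinear upsampling used to align C3 and C4 with C2 are assumptions; the α/β/γ branches, the C3–C4 product, the strong feature, and the 32-channel output follow the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv3(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1, bias=False),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))


class PFM(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.alpha = conv3(channels, channels)
        self.beta = conv3(channels, channels)
        self.gamma = conv3(channels, channels)
        self.smooth1 = nn.Sequential(conv3(2 * channels, channels), conv3(channels, channels))
        self.conv_c3 = conv3(channels, channels)
        self.smooth2 = nn.Sequential(conv3(2 * channels, channels), conv3(channels, channels))
        self.compress = nn.Conv2d(channels, 32, 1)

    def forward(self, c2, c3, c4):
        # Align the coarser maps to C2's resolution (assumption: bilinear upsampling).
        c3 = F.interpolate(c3, size=c2.shape[2:], mode="bilinear", align_corners=False)
        c4 = F.interpolate(c4, size=c2.shape[2:], mode="bilinear", align_corners=False)
        a, b, g = self.alpha(c2), self.beta(c2), self.gamma(c2)
        m1 = self.smooth1(torch.cat([a * c3, b], dim=1))   # Unit-1: self-similarity then concat
        c34 = self.conv_c3(c3) * c4                        # Unit-2: conv(C3) * C4
        strong = g * c34                                   # strong feature
        fused = self.smooth2(torch.cat([strong, m1], dim=1))
        return self.compress(fused)                        # fusion feature map F1 (32 channels)


if __name__ == "__main__":
    c2, c3, c4 = (torch.randn(1, 32, s, s) for s in (176, 88, 44))
    print(PFM()(c2, c3, c4).shape)                          # -> (1, 32, 176, 176)
```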
  • The fine segmentation module includes a decoupling unit DOM and a spatial attention unit SAM.
  • The decoupling unit DOM performs a feature-space transformation on the fusion feature map to obtain the central subject feature map T1, which is combined with the low-level feature map to obtain the edge detail feature map T2. The spatial attention unit SAM performs attention feature extraction on the central subject feature map T1 to obtain the prediction result F3, and on the edge detail feature map T2 to obtain the prediction result F2; the fusion of F2 and F3 is then processed by the spatial attention unit SAM, and the facial nerve segmentation map F4 is output.
  • The decoupling unit DOM adopts a spatial transformation model to perform the feature-space transformation on the fusion feature map and obtain the central subject feature map T1.
  • The feature-space transformation consists of parameter prediction, coordinate mapping, and pixel sampling. In the parameter prediction step, the input fusion feature map F1 ∈ R^(H×W×c) is processed by a convolutional layer with a 3 × 3 kernel to predict a two-channel parameter (offset) matrix of the same spatial size.
  • To obtain the edge detail feature map, the difference between the original fusion feature map and the obtained central subject feature map is used to extract a detail feature map.
  • Owing to the characteristics of the backbone network, its shallow low-level feature map f1 mainly contains low-level image features and richer detail information; the combination of the above detail feature map with the low-level feature map f1 is therefore used as the edge detail feature map T2 of the facial nerve.
  • The specific process is: the low-level feature map f1 passes through the search-and-identification unit for a global facial nerve search to obtain the facial nerve feature map C1; the difference between the central subject feature map and the fusion feature map is computed, the difference is concatenated with C1, and the edge detail feature map T2 is obtained by convolutional-layer fusion.
  • The core of the spatial attention unit SAM is how to refine the central subject feature map T1 and the edge detail feature map T2 and obtain the final facial nerve segmentation map. Because they are outputs of convolutional layers, the central subject features and edge detail features do not account for the dependencies between positions; the SAM is therefore used so that the facial nerve segmentation model selectively enhances features that carry much information and suppresses features that carry little or no useful information.
  • The central subject feature map T1 and the edge detail feature map T2 are each processed by a spatial attention unit SAM.
  • The specific process is: the feature signal at every position of the central subject feature map T1 and the edge detail feature map T2 is first examined, a convolution operation compresses the global information into positional descriptors, and a global average pooling operation then extracts the statistics of each channel.
  • To strengthen the generalization of the extracted features, a gating mechanism formed by one activation layer and two fully connected layers further filters the features, and an activation function then computes the degree of dependence of each position, so that non-mutually-exclusive relations between features are learned and the influence of multiple channels on the facial nerve segmentation features is reduced.
  • The prediction difficulty of a pixel is closely related to its position: because of the cluttered gray values of CT images, pixels near the facial nerve edge are more prone to prediction errors, whereas, thanks to the consistency inside the facial nerve, the central pixels are predicted more accurately. Rather than treating these pixels equally, an embodiment handles them according to their respective characteristics and decouples the original label into a subject label and a detail label, as shown in FIG. 5.
  • the distance transform function converts a binary image into a new image where each foreground pixel has a minimum distance from the background.
  • Specifically, the original label I of the CT image sample is split into foreground I_fg and background I_bg, the distance f(p, q) between a foreground pixel p and a background pixel q is computed, and the transformed label I′ is obtained through the distance transform function I′(p) = min_{q∈I_bg} f(p, q) for p ∈ I_fg (and 0 for background pixels).
  • The transformed label I′ has the form of a matrix.
  • In the normalization I″ = (I′ − min(I′)) / (max(I′) − min(I′)), the term I′ − min(I′) means that min(I′) is subtracted from every element of the matrix I′, and the result is divided by the difference between max(I′) and min(I′) to obtain the normalized result I″.
  • The pixels of the normalized label I″ are no longer distinguished simply as foreground or background; instead, their values increase towards the foreground center. The product of I″ with the original label is therefore used as the foreground subject label I_b, and the detail label I_d of the edge detail part is computed from the original label by:
  • I_b = I · I″
  • I_d = I · (I − I″)
  • To optimize the network parameters of the facial nerve segmentation model, a loss function is constructed with four parts: the difference between the fusion feature map and the original label of the CT image sample, the difference between the facial nerve segmentation map and the original label, the difference between the subject label and the prediction of the central subject feature map after the spatial attention unit, and the difference between the detail label and the prediction of the edge detail feature map after the spatial attention unit. It is expressed as Loss = L(p_b, I_b) + L(p_d, I_d) + L(p_1, I) + L(p_2, I), with L(p_i, I) = L_bce(p_i, I, α) + L_iou(p_i, I, α), where:
  • L_bce(·) denotes the binary cross-entropy loss function,
  • L_iou(·) denotes the intersection-over-union (IoU) loss function,
  • p_b denotes the prediction of the central subject feature map after the spatial attention unit,
  • I_b denotes the subject label,
  • p_d denotes the prediction of the edge detail feature map after the spatial attention unit,
  • I_d denotes the detail label,
  • α is a weight factor,
  • I denotes the original label of the CT image sample,
  • p_1 and p_2 denote the predictions corresponding to the fusion feature map and the facial nerve segmentation map, respectively.
  • The loss function curve during training is shown in Fig. 6.
  • When training the facial nerve segmentation model with this loss function, the Adam optimizer is used; the initial learning rate is set to 1e-4 and is reduced by a factor of 10 every 60 epochs.
  • The input images are resized to 352 × 352, and multi-scale training is used with rescaling rates of [0.5, 0.75, 1, 1.25, 1.5]. All training samples are augmented with random flips, rotations, and boundary cropping. A configuration sketch follows below.
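  • A sketch of such a training configuration (Adam, initial learning rate 1e-4 decayed by a factor of 10 every 60 epochs, 352 × 352 inputs, multi-scale rates [0.5, 0.75, 1, 1.25, 1.5]) is given below; the model is a placeholder and the loop is trimmed for illustration.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Conv2d(1, 1, 3, padding=1)                       # placeholder for the segmentation model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=60, gamma=0.1)
scales = [0.5, 0.75, 1.0, 1.25, 1.5]

for epoch in range(2):                                      # trimmed loop for illustration
    image = torch.randn(2, 1, 352, 352)                      # stands in for an augmented CT batch
    label = (torch.rand(2, 1, 352, 352) > 0.5).float()
    s = random.choice(scales)                                # multi-scale training
    size = (int(352 * s), int(352 * s))
    x = F.interpolate(image, size=size, mode="bilinear", align_corners=False)
    y = F.interpolate(label, size=size, mode="nearest")
    loss = F.binary_cross_entropy_with_logits(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```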
  • This embodiment also presents the results of the decoupled divide-and-conquer facial nerve segmentation method.
  • As shown in Fig. 7, the facial nerve segmentation results obtained by the facial nerve segmentation model provided in the embodiment are essentially consistent with the manual segmentation results, demonstrating the accuracy of the facial nerve segmentation model of this method.
  • As shown in Table 1, compared with the classic Unet, Unet++, AttUnet, and R2AttUnet models, the proposed facial nerve segmentation model (Ours) shows a consistent improvement in every metric, with a Dice coefficient of 0.858 and a 95% Hausdorff distance of 0.363.
  • In terms of computational complexity, the model requires only 13.33 GFLOPs, less than one tenth of Unet (123.77 GFLOPs), and its parameter count (9.86 M) is only about one quarter of Unet's (34.53 M); a parameter-count sketch follows below.
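  • A small sketch of how such a parameter count can be checked for any PyTorch model is given below; FLOPs are typically measured with a separate profiling tool, which is not reproduced here.

```python
import torch.nn as nn
from torchvision.models import resnet50


def count_params(model: nn.Module) -> float:
    """Return the number of parameters in millions."""
    return sum(p.numel() for p in model.parameters()) / 1e6


if __name__ == "__main__":
    print(f"ResNet-50 parameters: {count_params(resnet50(weights=None)):.2f} M")
```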
  • As can be seen from the 2D facial nerve segmentation results of the various models in Fig. 8, compared with the other methods, the facial nerve segmentation model Ours is more accurate and does not identify other tissues as the facial nerve, which would lead to erroneous segmentation.
  • The edges of the segmentation results of the model Ours are also closer to the labels.
  • As shown in Fig. 9, the facial nerve segmentation model achieves the best results in both computational complexity and Dice score.
  • An embodiment also provides a decoupled divide-and-conquer facial nerve segmentation device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the above decoupled divide-and-conquer facial nerve segmentation method are implemented.
  • In practice, the computer memory can be local volatile memory such as RAM, non-volatile memory such as ROM, FLASH, a floppy disk, or a mechanical hard disk, or remote cloud storage.
  • The computer processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA); the steps of the decoupled divide-and-conquer facial nerve segmentation method can be implemented by these processors.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Robotics (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a decoupled divide-and-conquer facial nerve segmentation method and device. Targeting the facial nerve's small size and low contrast, a facial nerve segmentation model comprising a feature extraction module, a coarse segmentation module, and a fine segmentation module is constructed. On the basis of the low-level features and the multiple high-level features at different levels extracted by the feature extraction module, the coarse segmentation module performs a global search and fusion of facial nerve features over the high-level features at different levels; the fine segmentation module then decouples the fused features to obtain central subject features, combines the central subject features with the low-level features to obtain edge detail features, and applies spatial-attention feature extraction to both the central subject features and the edge detail features to obtain the facial nerve segmentation map. The method improves the accuracy and speed of automatic facial nerve segmentation and meets the path-planning needs of robotic cochlear implantation.

Description

一种解耦分治的面神经分割方法和装置 技术领域
本发明涉及医学图像处理领域,具体地说,涉及一种解耦分治的面神经分割方法和装置。
背景技术
机器人人工耳蜗植入是一种帮助患者恢复听力的自动治疗方法,它依赖于精确的术前规划以避免对关键解剖结构的损坏。耳蜗周围组织结构的识别在术前路径规划中起着至关重要的作用。其中,面神经的损伤可能导致永久性颞面神经麻痹,是耳蜗周围最重要的组织结构,并且与周围组织之间的距离不到1mm。然而,面神经的准确分割面临两大挑战:(1)面神经结构非常小。面神经在CT图像中仅占一个特别小的区域(512×512像素的全脑CT图像,面神经区域仅有9-16个像素点,图像占比0.0034%)。(2)面神经与周围组织结构之间的对比度低。面神经与周围环境的边界通常很模糊,缺乏传统分割方法所需的强对比度。综上所述,自动、准确的面神经分割是机器人人工耳蜗植入术前路径规划中的一大难题。
面神经分割的传统方法依赖于手动提取特征,例如中心线和设置点。这些方法通常训练一个分类器来区分面神经和复杂结构,错误分割率很高。主要是因为面神经区域与周围高度相似区域之间的类间差异较弱,导致人工特征提取的表示能力非常有限,如文献Noble J H,Warren F M,Labadie R F,et al.Automatic segmentation of the facial nerve and chorda tympani using image registration and statistical priors[C]//Medical Imaging 2008:Image  Processing.International Society for Optics and Photonics,2008,6914:69140P。
近年来,随着深度学习的发展,医学图像分析取得了重大突破。尤其是Unet模型,它利用多层次信息重建高分辨率特征图,在少量数据的前提下收敛,推动了医学图像分割的发展。然而,基于U形模型的解码器中的特征,高度依赖于从编码器中提取的特征。这些方法直接将特征从编码器引入到解码器,忽略了不同层次特征聚合的有效性,限制了特征的有效利用并引入了误导性特征,从而导致面神经与其他区域混淆。现有的基于深度学习的面神经分割方法中,如文献Ronneberger O,Fischer P,Brox T.U-net:Convolutional networks for biomedical image segmentation[C]//International Conference on Medical image computing and computer-assisted intervention.Springer,Cham,2015:234-241.所述的采用Unet模型的分割精度Dice系数为0.756,如文献Zhou Z,Siddiquee M M R,Tajbakhsh N,et al.Unet++:A nested u-net architecture for medical image segmentation[M]//Deep learning in medical image analysis and multimodal learning for clinical decision support.Springer,Cham,2018:3-11.所述的采用Unet++模型的分割精度Dice系数为0.764。
发明内容
鉴于上述,本发明的目的是提供一种解耦分治的面神经分割方法和装置,该方法和装置有效解决了面神经结构小、对比度低对分割的影响,提高了面神经自动分割的精度和速度,满足机器人人工耳蜗植入术前路径规划需求。
实施例提供了一种解耦分治的面神经分割方法,包括以下步骤:
获取并预处理CT影像,得到样本集;
构建面神经分割模型,包括特征提取模块、粗分割模块、以及精分割模块;输入的CT影像样本经过特征提取模块特征提取,得到1个低级特征图和多个不同层次的高级特征图;粗分割模块包括搜索识别单元和金字塔融合单元,多个不同层次的高级特征图并列分别经过搜索识别单元进行全局面神经搜索,得到的多个面神经特征图经过金字塔融合单元融合得到融合特征图;精分割模块包括解耦单元和空间注意力单元,融合特征图经过解耦单元特征空间转换,得到的中心主体特征图与低级特征图结合后得到边缘细节特征图,中心主体特征图与边缘细节特征图分别经过空间注意力单元进行注意力特征提取后,得到的提取结果融合后再经过空间注意力单元处理,得到面神经分割图;
构建损失函数,损失函数包括融合特征图与CT影像样本的原始标签的差异、面神经分割图与CT影像样本的原始标签的差异、中心主体特征图经过空间注意力单元处理的预测结果与主体标签的差异、边缘细节特征图经过空间注意力单元处理的预测结果与细节标签的差异;
采用样本集和损失函数优化面神经分割模型的参数后,利用参数确定的面神经分割模型对输入的CT影像样本进行面部神经分割,得到面神经分割图。
在一个实施例中,所述特征提取模块采用改进的Res2Net50,去掉原始Res2Net50的所有全连接层和最后1个卷积组,剩下的多个卷积组形成改进的Res2Net50,输入的CT影像样本输入至Res2Net50中,第一个卷积组的输出为低级特征图,其他卷积组的输出分别为不同层次的高级特征图。
在一个实施例中,所述搜索识别单元中,对高级特征图的全局面神经 搜索过程为:
对高级特征图按照通道分离,得到分离后的特征图;接着利用多分支的操作处理分离后的特征图,第一分支中,分离后的特征图经过卷积操作以转换通道数,在剩下分支中,分离后的特征图经过卷积操作以转换通道数后,再经过非对称卷积操作和扩张卷积操作后,将所有分支的特征图的操作结果融合,实现高级特征图的扩张;对融合结果进行分离的逆运算,实现特征重构,以得到面神经特征图。
在一个实施例中,所述解耦单元中,采用空间转换模型对融合特征图进行特征空间转换,得到中心主体特征图,特征空间转换过程包括参数预测、坐标映射以及像素采样;
其中,参数预测过程:采用卷积层对融合特征图进行变换预测,得到参数矩阵;
坐标映射过程:将参数矩阵中的元素值作为像素点的偏移量,利用偏移量对将处于标准空间网络中的融合特征图的像素点进行坐标映射,以得到新融合特征图;
像素采样过程:采用可微双线性采样机制对新融合特征图进行采样,得到的像素点组成中心主体特征图。
在一个实施例中,所述得到的中心主体特征图与低级特征图结合后得到边缘细节特征图,包括:
低级特征图经过搜索识别单元进行全局面神经搜索后得到面神经特征图,计算中心主体特征图与融合特征图的差值后,差值与面神经特征图拼接后,通过卷积层融合得到边缘细节特征图。
在一个实施例中,所述注意力机制单元中,对输入图进行注意力特征提取过程为:
输入图经过卷积操作和全局平均池化操作后,经过由激活层和全连接层组成的门限机制进一步筛选,然后通过激活函数激活,得到输入图对应的注意力特征提取的预测结果;
其中,输入图为中心主体特征图、边缘细节特征图、中心主体特征图与边缘细节特征图对应的两个注意力特征提取的预测结果的融合结果。
在一个实施例中,CT影像样本的主体标签和细节标签的构建过程为:
将CT影像样本的原始标签I拆分成前景I fg和背景I bg,计算属于前景I fg的像素点p与属于背景I bg的像素点q之间的距离f(p,q),则通过以下距离变换函数得到变换后标签I′;
I′(p) = min_{q∈I bg} f(p,q)，若 p∈I fg；I′(p) = 0，若 p∈I bg
对变换后标签I′进行归一化处理,得到归一化标签I″:
I″ = (I′ − min(I′)) / (max(I′) − min(I′))
依据归一化标签I″确定CT影像样本的主体标签I b和细节标签I d分别为:
I b=I*I″ I d=I*(I-I″)。
在一个实施例中,所述损失函数表示为:
Loss = L(p b, I b) + L(p d, I d) + L(p 1, I) + L(p 2, I)
L(p i,I)=L bce(p i,I,α)+L iou(p i,I,α)
其中,L(·)表示交叉熵损失函数,L iou(·)表示交并比损失函数,p b表示中心主体特征图经过空间注意力单元处理的预测结果,I b表示主体标签,p d表示边缘细节特征图经过空间注意力单元处理的预测结果,I d表示细节 标签,α为权重因子,I表示CT影像样本的原始标签,p 1和p 2表示分别表示融合特征图与面神经分割图。
在一个实施例中,预处理CT影像包括:采用随机翻转,剪切方式进行数据增强,数据增强后的CT影像形成样本集。
实施例还提供了一种解耦分治的面神经分割装置,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上执行的计算机程序,所述处理器执行所述计算机程序时实现上述解耦分治的面神经分割方法步骤。
实施例提供的解耦分治的面神经分割方法和装置,具有的有益效果至少包括:
针对面神经结构小和对比度低的特性,构建了包含特征提取模块、粗分割模块、以及精分割模块的面神经分割模型,利用特征提取模块提取低级特征和多个不同层次的高级特征的基础上,采用粗分割模块对不同层次的高级特征进行面神经特征的全局搜索和融合,再利用精分割模块对融合特征进行解耦得到中心主体特征,综合中心主体特征与低级特征得到边缘细节特征后,对中心主体特征和边缘细节特征进行空间注意力机制特征提取,得到面神经分割图,该方法提升了面神经的自动分割精度和速度,满足机器人人工耳蜗植入术前路径规划需求。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图做简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动前提下,还可以根据这些附图获得其他附图。
图1本发明实施例中解耦分治的面神经分割方法的流程图;
图2为本发明实施例中面神经分割模型的结构图;
图3为本发明实施例中搜索识别单元的结构示意图;
图4为本发明实施例中金字塔融合单元的结构示意图;
图5为本发明实施例中获取主体标签和细节标签的示意图;
图6为本发明实施例中损失函数曲线图;
图7为本发明实施例中分割结果的展示图;
图8为本发明实施例中各方法的分割结果对比图;
图9为本发明实施例中结果指标Dice分数和FLOPs的示意图。
具体实施方式
为使本发明的目的、技术方案及优点更加清楚明白,以下结合附图及实施例对本发明进行进一步的详细说明。应当理解,此处所描述的具体实施方式仅仅用以解释本发明,并不限定本发明的保护范围。
针对面神经结构小和对比度低的特点,为解决传统面神经分割错误分割率高、分割速度低的问题,实施例提供了一种解耦分治的面神经分割方法。
图1本发明实施例中解耦分治的面神经分割方法的流程图。如图1所示,实施例提供的解耦分治的面神经分割方法,主要包括训练和测试两个阶段。首先,将采集的所有CT影像数据随机划分为训练集和测试集。在训练阶段使用训练集进行训练,对于训练集,采用随机翻转,剪切等方式进行数据增强以实现预处理,接着将数据增强后的训练样本输入到面神经分割模型,并更新面神经分割模型的参数。再将训练的面神经分割模型进行验证,对于测试集,采用随机翻转,剪切等方式进行数据增强以实现预处理,使用数据增强后的测试样本输入至面神经分割模型,并且对面神经 分割模型输出的预测结果进行评估,以判断面神经分割模型是否是最优模型,如果是,则保存模型参数,不是则不保存模型参数,这样进行迭代更新。在测试阶段,将得到的最优参数加载到面神经分割模型中进行测试,输出预测结果。
图2为本发明实施例中面神经分割模型的结构图。如图2所示,实施例提供的面神经分割模型包括特征提取模块、粗分割模块、以及精分割模块。
特征提取模块用于对输入的CT影像样本进行特征提取,得到1个低级特征图和多个不同层次的高级特征图。实施例中,采用改进的Res2Net50作为特征提取模块的骨干网络,为了适用于面神经分割任务,去掉了原始Res2Net50的所有全连接层和最后1个卷积组,剩下的4个卷积组形成改进的Res2Net50。对于输入的CT影像样本I∈R H×W×c,其中H表示高度,W表示宽度,c表示通道数目,此处c=1,改进的Res2Net50提取不同层次特征图表示为
{f k | k = 1, 2, 3, 4}
由于面神经区域较小,实施例中将骨干网络的第二层卷积的步长设置为1,保留更大的分辨率,因此每个层次特征图的分辨率为{[H/k,W/k],k=2,2,4,8}。最近的研究表明网络较深层倾向于提取目标的低频的主体位置信息,较浅层容易保留图像中高频的细节信息。因此,实施例将提取的特征分为高级特征图f 2,f 3,f 4和低级特征图f 1
粗分割模块包括搜索识别单元SIM和金字塔融合单元PFM,三个通过搜索识别单元SIM分别对高级特征图f 2,f 3,f 4进行全局面神经搜索,通过扩展感受野来捕获小物体,得到面神经特征图C 2,C 3,C 4,并结合有效的金字塔融合单元PFM进行面神经特征图C 2,C 3,C 4的融合得到融合特征图。
图3为本发明实施例中搜索识别单元的结构示意图。如图3所示,搜索识别单元SIM中,对于尺寸为N×N×c的高级特征图,采用尺寸为2×2 的滑动窗口在高级特征图中搜索2×2×c的patch,然后将每个patch按照通道进行排列,实现高级特征图按照通道分离。每一个patch都会被拆解转换到4个通道,因此在分离操作后,可以得到一个N/2×N/2×4c的特征图,该特征图中的每个点都表示了原高级特征图中的4个特征点。然后,采用感受野对每个通道的特征图进行处理,针对4个通道的特征图,感受野包含4个分支,第一个分支采用1×1的卷积层对第一通道的特征图进行卷积操作以转换通道数到32,剩下三个分支采用1×1的卷积层对剩下三个通道的特征图进行卷积操作以转换通道数到32后,再采用非对称卷积和具有相应卷积核的扩张卷积进行非对称卷积操作和扩张卷积操作,拼接融合4个分支的处理结果,以实现高级特征图的扩张,最后,对拼接融合结果进行分离的逆运算,实现特征重构,以得到面神经特征图。这样通过对高级特征图的分离、扩张和重建,拓展了神经网络的感受野,实现了在全局范围内搜索面神经,以得到准确的面神经特征图。
为进一步增强高层特征的表达能力,采用金字塔融合单元PFM对面神经特征图C 2,C 3,C 4的融合得到融合特征图,如图4所示,金字塔融合单元PFM包括第一单元Unit-1和第二单元Unit-2。在第一单元Unit-1中,对于同时包含高级语义信息和低级细节信息的面神经特征图C 2,采用三个不同的卷积层分别对C 2进行卷积操作以获得三个特征图α,β,γ,然后将特征图α和面神经特征图C 3相乘以测量自相似性,将得到的特征与特征图β连接以获得特征图M 1,特征图M 1保留了更多的信息,最后再使用两个卷积层对特征图M 1进行平滑,得到平滑的M 1。在第二单元Unit-2中,对面神经特征图C 3进行卷积操作,将得到的特征图与面神经特征图C 4相乘得到C 34后,通过取γ与C 34的乘积构造一个强特征,然后,将强特征与平滑的M 1拼接后,再通过两个卷积层对拼接结果进行平滑处理,并且利用一个卷积 层将其通道数进行压缩到32,输出融合特征图F 1
精分割模块包括解耦单元DOM和空间注意力单元SAM,解耦单元DOM对融合特征图进行特征空间转换以得到中心主体特征图T 1,该中心主体特征图与低级特征图结合后得到边缘细节特征图T 2,空间注意力单元SAM对中心主体特征图T 1进行注意力特征提取以得到预测结果F 3,空间注意力单元SAM对边缘细节特征图T 2进行注意力特征提取以得到预测结果F 2,预测结果F 2与预测结果F 3的融合结果经过空间注意力单元SAM处理后,输出面神经分割图F 4
解耦单元DOM采用空间转换模型对融合特征图进行特征空间转换，以获得中心主体特征图T 1。特征空间转换过程包括参数预测、坐标映射以及像素采样；其中，参数预测过程中，采用卷积核大小为3×3的卷积层对输入的融合特征图F 1∈R H×W×c进行变换预测，得到参数矩阵θ∈R H×W×c 1，c 1=2，即大小相同、通道数为2的变换矩阵；坐标映射过程中，将参数矩阵中的元素值作为像素点的偏移量，利用偏移量对处于标准空间网格中的融合特征图的像素点进行坐标映射，以得到新融合特征图，即处于标准空间网格中的融合特征图的像素点p l通过p l+θ(p l)的方式映射到扭曲空间网格上的新像素点，实现坐标映射，这些新像素点组成新融合特征图；像素采样过程中，采用可微双线性采样机制对新融合特征图进行采样，得到的像素点组成中心主体特征图T 1，即采用在空间变换网络（Spatial Transformer Network）中提出的可微双线性采样机制来逼近中心主体特征中的每个点，该可微双线性采样机制对像素点p l的四个最近邻像素的值进行线性插值来逼近中心主体特征图中的每个像素点。
为了获得边缘细节特征图,利用原始融合特征图与获得的中心主体特征图做差值来提取出细节特征图。但是由于骨干网络的特性,其浅层的低 级特征图f 1主要包括图像中的低级特征,所含细节信息更加丰富。因此利用上述细节特征图与低级特征图f 1的结合作为面神经的边缘细节特征图T 2。具体过程为:低级特征图f 1经过搜索识别单元进行全局面神经搜索后得到面神经特征图C 1,计算中心主体特征图与融合特征图的差值后,差值与面神经特征图C 1拼接后,通过卷积层融合得到边缘细节特征图T 2
空间注意力单元SAM的核心就是怎样去优化中心主体特征图T 1和边缘细节特征图T 2,并且得到最终面神经分割图。由于卷积层的输出,中心主体特征和边缘细节特征并没有考虑位置之间的相关依赖性,为了使得面神经分割模型有选择性地增强具有信息量较大的特征,并且对信息量小或者无用特征进行抑制。采用空间注意力单元SAM分别对中心主体特征图T 1和边缘细节特征图T 2进行处理。具体过程为:首先考察中心主体特征图T 1和边缘细节特征图T 2中的每个位置的特征信号,采用卷积操作压缩全局信息为特征的位置描述符,接着利用全局平均池化操作来提取各个通道的统计量,为了增强提取特征的泛化能力,还使用1个激活层和2个全连接层组成门限机制来进行进一步筛选特征,再利用激活函数计算出各个位置的依赖程度,以学习特征之间非互斥的关系,并且降低多个通道对面神经分割特征的影响,最终得到中心主体特征图T 1和边缘细节特征图T 2对应的注意力特征提取的预测结果F 3和F 2,然后,将预测结果F 3和F 2求和得到特征图
，再利用空间注意力单元SAM对该求和后的特征图进行处理，得到最终的面神经分割图F 4。
像素的预测难度与其位置密切相关。由于CT影像灰度值杂乱,面神经边缘附近的像素更容易预测失误。相比之下,由于面神经内部的一致性,中心像素具有更高的预测精度。与其平等地对待这些像素,不如根据它们各自的特性来处理它们。因此,实施例将原始标签解耦为主体标签和细节 标签,如图5所示。距离变换函数可以将二值图像转换成一个新图像,其中每个前景像素具有一个相对于背景的最小距离。
具体来说,将CT影像样本的原始标签I拆分成前景I fg和背景I bg,计算属于前景I fg的像素点p与属于背景I bg的像素点q之间的距离f(p,q),则通过以下距离变换函数得到变换后标签I′;
I′(p) = min_{q∈I bg} f(p,q)，若 p∈I fg；I′(p) = 0，若 p∈I bg
对变换后标签I′进行归一化处理,得到归一化标签I″:
I″ = (I′ − min(I′)) / (max(I′) − min(I′))
变换后标签I′为矩阵形式,在进行归一化时,I′-min(I′)表示矩阵I′中的每个元素值减去min(I′),得到的结果再与max(I′)和min(I′)之差做比值,得到归一化结果I″。
归一化标签I″中的像素并不依靠前景或者背景区分,而是越靠近前景中心的像素值越高,因此,将归一化标签I″作为与原始标签求积之后作为前景的主体标签I b,其边缘细节部分的细节标签I d可以由原始标签计算所得,具体计算公式为:
I b=I*I″ I d=I*(I-I″)
至此,原始标签I已解耦为两种不同的标签,监督协助网络分别学习不同特征的中心主体特征和边缘细节特征。
为优化面神经分割模型的网络参数,还需要构建损失函数。构建的损失函数包括4部分,分别为融合特征图与CT影像样本的原始标签的差异、面神经分割图与CT影像样本的原始标签的差异、中心主体特征图经过空间注意力单元处理的预测结果与主体标签的差异、边缘细节特征图经过空 间注意力单元处理的预测结果与细节标签的差异。具体表示为:
Loss = L(p b, I b) + L(p d, I d) + L(p 1, I) + L(p 2, I)
L(p i,I)=L bce(p i,I,α)+L iou(p i,I,α)
其中，L bce(·)表示交叉熵损失函数，L iou(·)表示交并比损失函数，p b表示中心主体特征图经过空间注意力单元处理的预测结果，I b表示主体标签，p d表示边缘细节特征图经过空间注意力单元处理的预测结果，I d表示细节标签，α为权重因子，I表示CT影像样本的原始标签，p 1和p 2分别表示融合特征图与面神经分割图。训练过程中的损失函数曲线如图6所示。
在利用损失函数训练面神经分割模型时,利用Adam优化器优化模型,初始学习率设定为1e-4,每60轮降低10次。输入图像调整为352×352,利用多尺度进行训练,尺度调整率为[0.5,0.75,1,1.25,1.5]。所有训练样本都通过随机翻转、旋转和边界剪裁来增强。
本实施例还给出解耦分治的面神经分割方法的实施结果。如图7所示,实施例提供的面神经分割模型得到的面神经分割结果与人工分割结果基本一致,表明了本方法的面神经分割模型的准确性。如表1所示,面神经分割模型Ours与经典的Unet模型、Unet++模型、AttUnet模型以及R2AttUnet模型的结果比较,各个指标均有稳定的提升,Dice系数为0.858,95%的豪夫距离为0.363。计算复杂度方面,面神经分割模型对算力的需求(FLOPs)仅为13.33G不及Unet(123.77G)的1/10,同时参数量(9.86M)也仅为Unet(34.53M)的1/4左右。如图8所示,从各个模型的面神经的2D分割结果可以得到,相较于其他方法,面神经分割模型Ours更为准确,而且不会将其他组织认定为是面神经,导致错误的分割,同时面神经分割 模型Ours分割结果的边缘也更加贴近标签。如图9所示,面神经分割模型在计算复杂度和Dice分数上都实现了最优。
表1
实施例还提供了一种解耦分治的面神经分割装置,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上执行的计算机程序,所述处理器执行所述计算机程序时实现上述解耦分治的面神经分割方法步骤。
实际应用中,计算机存储器可以为在近端的易失性存储器,如RAM,还可以是非易失性存储器,如ROM,FLASH,软盘,机械硬盘等,还可以是远端的存储云。计算机处理器可以为中央处理器(CPU)、微处理器(MPU)、数字信号处理器(DSP)、或现场可编程门阵列(FPGA),即可以通过这些处理器实现解耦分治的面神经分割方法步骤。
以上所述的具体实施方式对本发明的技术方案和有益效果进行了详细说明,应理解的是以上所述仅为本发明的最优选实施例,并不用于限制本发明,凡在本发明的原则范围内所做的任何修改、补充和等同替换等,均应包含在本发明的保护范围之内。

Claims (10)

  1. 一种解耦分治的面神经分割方法,其特征在于,包括以下步骤:
    获取并预处理CT影像,得到样本集;
    构建面神经分割模型,包括特征提取模块、粗分割模块、以及精分割模块;输入的CT影像样本经过特征提取模块特征提取,得到1个低级特征图和多个不同层次的高级特征图;粗分割模块包括搜索识别单元和金字塔融合单元,多个不同层次的高级特征图并列分别经过搜索识别单元进行全局面神经搜索,得到的多个面神经特征图经过金字塔融合单元融合得到融合特征图;精分割模块包括解耦单元和空间注意力单元,融合特征图经过解耦单元特征空间转换,得到的中心主体特征图与低级特征图结合后得到边缘细节特征图,中心主体特征图与边缘细节特征图分别经过空间注意力单元进行注意力特征提取后,得到的提取结果融合后再经过空间注意力单元处理,得到面神经分割图;
    构建损失函数,损失函数包括融合特征图与CT影像样本的原始标签的差异、面神经分割图与CT影像样本的原始标签的差异、中心主体特征图经过空间注意力单元处理的预测结果与主体标签的差异、边缘细节特征图经过空间注意力单元处理的预测结果与细节标签的差异;
    采用样本集和损失函数优化面神经分割模型的参数后,利用参数确定的面神经分割模型对输入的CT影像样本进行面部神经分割,得到面神经分割图。
  2. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,所述特征提取模块采用改进的Res2Net50,去掉原始Res2Net50的所有全连接层和最后1个卷积组,剩下的多个卷积组形成改进的Res2Net50,输 入的CT影像样本输入至Res2Net50中,第一个卷积组的输出为低级特征图,其他卷积组的输出分别为不同层次的高级特征图。
  3. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,所述搜索识别单元中,对高级特征图的全局面神经搜索过程为:
    对高级特征图按照通道分离,得到分离后的特征图;接着利用多分支的操作处理分离后的特征图,第一分支中,分离后的特征图经过卷积操作以转换通道数,在剩下分支中,分离后的特征图经过卷积操作以转换通道数后,再经过非对称卷积操作和扩张卷积操作后,将所有分支的特征图的操作结果融合,实现高级特征图的扩张;对融合结果进行分离的逆运算,实现特征重构,以得到面神经特征图。
  4. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,所述解耦单元中,采用空间转换模型对融合特征图进行特征空间转换,得到中心主体特征图,特征空间转换过程包括参数预测、坐标映射以及像素采样;
    其中,参数预测过程:采用卷积层对融合特征图进行变换预测,得到参数矩阵;
    坐标映射过程:将参数矩阵中的元素值作为像素点的偏移量,利用偏移量对将处于标准空间网络中的融合特征图的像素点进行坐标映射,以得到新融合特征图;
    像素采样过程:采用可微双线性采样机制对新融合特征图进行采样,得到的像素点组成中心主体特征图。
  5. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,所述得到的中心主体特征图与低级特征图结合后得到边缘细节特征图,包括:
    低级特征图经过搜索识别单元进行全局面神经搜索后得到面神经特征图,计算中心主体特征图与融合特征图的差值后,差值与面神经特征图拼接后,通过卷积层融合得到边缘细节特征图。
  6. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,所述注意力机制单元中,对输入图进行注意力特征提取过程为:
    输入图经过卷积操作和全局平均池化操作后,经过由激活层和全连接层组成的门限机制进一步筛选,然后通过激活函数激活,得到输入图对应的注意力特征提取的预测结果;
    其中,输入图为中心主体特征图、边缘细节特征图、中心主体特征图与边缘细节特征图对应的两个注意力特征提取的预测结果的融合结果。
  7. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,CT影像样本的主体标签和细节标签的构建过程为:
    将CT影像样本的原始标签I拆分成前景I fg和背景I bg,计算属于前景I fg的像素点p与属于背景I bg的像素点q之间的距离f(p,q),则通过以下距离变换函数得到变换后标签I′;
    I′(p) = min_{q∈I bg} f(p,q)，若 p∈I fg；I′(p) = 0，若 p∈I bg
    对变换后标签I′进行归一化处理,得到归一化标签I″:
    I″ = (I′ − min(I′)) / (max(I′) − min(I′))
    依据归一化标签I″确定CT影像样本的主体标签I b和细节标签I d分别为:
    I b=I*I″  I d=I*(I-I″)。
  8. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于, 所述损失函数表示为:
    Loss = L(p b, I b) + L(p d, I d) + L(p 1, I) + L(p 2, I)
    L(p i,I)=L bce(p i,I,α)+L iou(p i,I,α)
    其中，L bce(·)表示交叉熵损失函数，L iou(·)表示交并比损失函数，p b表示中心主体特征图经过空间注意力单元处理的预测结果，I b表示主体标签，p d表示边缘细节特征图经过空间注意力单元处理的预测结果，I d表示细节标签，α为权重因子，I表示CT影像样本的原始标签，p 1和p 2分别表示融合特征图与面神经分割图。
  9. 根据权利要求1所述的解耦分治的面神经分割方法,其特征在于,预处理CT影像包括:采用随机翻转,剪切方式进行数据增强,数据增强后的CT影像形成样本集。
  10. 一种解耦分治的面神经分割装置,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上执行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1-9任一项所述的解耦分治的面神经分割方法步骤。
PCT/CN2022/076927 2021-09-22 2022-02-18 一种解耦分治的面神经分割方法和装置 WO2023045231A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111106992.2 2021-09-22
CN202111106992.2A CN113870289B (zh) 2021-09-22 2021-09-22 一种解耦分治的面神经分割方法和装置

Publications (1)

Publication Number Publication Date
WO2023045231A1 true WO2023045231A1 (zh) 2023-03-30

Family

ID=78993217

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076927 WO2023045231A1 (zh) 2021-09-22 2022-02-18 一种解耦分治的面神经分割方法和装置

Country Status (2)

Country Link
CN (1) CN113870289B (zh)
WO (1) WO2023045231A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870289B (zh) * 2021-09-22 2022-03-15 浙江大学 一种解耦分治的面神经分割方法和装置
CN114398979A (zh) * 2022-01-13 2022-04-26 四川大学华西医院 一种基于特征解耦的超声图像甲状腺结节分类方法
CN116503933B (zh) * 2023-05-24 2023-12-12 北京万里红科技有限公司 一种眼周特征提取方法、装置、电子设备及存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651973B (zh) * 2020-12-14 2022-10-28 南京理工大学 基于特征金字塔注意力和混合注意力级联的语义分割方法
CN113177943B (zh) * 2021-06-29 2021-09-07 中南大学 一种脑卒中ct影像分割方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100183222A1 (en) * 2009-01-21 2010-07-22 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. System and method for edge-enhancement of digital images using wavelets
CN112102321A (zh) * 2020-08-07 2020-12-18 深圳大学 一种基于深度卷积神经网络的病灶图像分割方法及系统
CN112465827A (zh) * 2020-12-09 2021-03-09 北京航空航天大学 一种基于逐类卷积操作的轮廓感知多器官分割网络构建方法
CN112862805A (zh) * 2021-03-04 2021-05-28 同济大学 听神经瘤图像自动化分割方法及系统
CN113870289A (zh) * 2021-09-22 2021-12-31 浙江大学 一种解耦分治的面神经分割方法和装置

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116228797A (zh) * 2023-05-09 2023-06-06 中国石油大学(华东) 基于注意力和U-Net的页岩扫描电镜图像分割方法
CN116228797B (zh) * 2023-05-09 2023-08-15 中国石油大学(华东) 基于注意力和U-Net的页岩扫描电镜图像分割方法
CN117392153A (zh) * 2023-12-06 2024-01-12 江西师范大学 一种基于局部补偿和多尺度自适应变形的胰腺分割方法
CN117392153B (zh) * 2023-12-06 2024-02-23 江西师范大学 一种基于局部补偿和多尺度自适应变形的胰腺分割方法
CN117765410A (zh) * 2024-01-05 2024-03-26 浙江时空智子大数据有限公司 遥感影像双分支特征融合固废识别方法、系统及电子设备
CN117765410B (zh) * 2024-01-05 2024-05-28 浙江时空智子大数据有限公司 遥感影像双分支特征融合固废识别方法、系统及电子设备
CN117547353A (zh) * 2024-01-12 2024-02-13 中科璀璨机器人(成都)有限公司 一种双源ct成像的肿瘤精确定位与机器人穿刺方法及系统
CN117547353B (zh) * 2024-01-12 2024-03-19 中科璀璨机器人(成都)有限公司 一种双源ct成像的肿瘤定位方法及系统

Also Published As

Publication number Publication date
CN113870289B (zh) 2022-03-15
CN113870289A (zh) 2021-12-31

Similar Documents

Publication Publication Date Title
WO2023045231A1 (zh) 一种解耦分治的面神经分割方法和装置
Adegun et al. Deep learning techniques for skin lesion analysis and melanoma cancer detection: a survey of state-of-the-art
CN106056595B (zh) 基于深度卷积神经网络自动识别甲状腺结节良恶性的辅助诊断系统
CN112150428B (zh) 一种基于深度学习的医学图像分割方法
Lahiri et al. Generative adversarial learning for reducing manual annotation in semantic segmentation on large scale miscroscopy images: Automated vessel segmentation in retinal fundus image as test case
CN109389584A (zh) 基于cnn的多尺度鼻咽肿瘤分割方法
CN110930416A (zh) 一种基于u型网络的mri图像前列腺分割方法
CN111583210B (zh) 基于卷积神经网络模型集成的乳腺癌图像自动识别方法
CN102831614B (zh) 基于交互式字典迁移的序列医学图像快速分割方法
CN112581458B (zh) 一种图像处理方法和装置
CN109670489B (zh) 基于多实例学习的弱监督式早期老年性黄斑病变分类方法
CN112132827A (zh) 病理图像的处理方法、装置、电子设备及可读存储介质
CN112348059A (zh) 基于深度学习的多种染色病理图像分类方法及系统
Sornapudi et al. Comparing deep learning models for multi-cell classification in liquid-based cervical cytology image
Al-Masni et al. A deep learning model integrating FrCN and residual convolutional networks for skin lesion segmentation and classification
CN114841947A (zh) 肺腺癌h&e染色病理图像肿瘤区域多尺度特征提取与预后分析方法、装置
CN110047075A (zh) 一种基于对抗网络的ct图像分割方法
CN115546466A (zh) 一种基于多尺度显著特征融合的弱监督图像目标定位方法
CN115205520A (zh) 胃镜图像智能目标检测方法、系统、电子设备及存储介质
CN112419335B (zh) 一种细胞核分割网络的形状损失计算方法
WO2024104035A1 (zh) 基于长短期记忆自注意力模型的三维医学图像分割方法及系统
CN113191393A (zh) 基于多模态融合的对比增强能谱乳腺摄影分类方法及系统
CN105528791B (zh) 一种面向触摸屏手绘图像的质量评价装置及其评价方法
CN116883341A (zh) 一种基于深度学习的肝脏肿瘤ct图像自动分割方法
CN106650629A (zh) 一种基于核稀疏表示的快速遥感目标检测识别方法

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 17802953

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE