WO2023207820A1 - Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning - Google Patents

Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning

Info

Publication number
WO2023207820A1
WO2023207820A1 PCT/CN2023/089985 CN2023089985W WO2023207820A1 WO 2023207820 A1 WO2023207820 A1 WO 2023207820A1 CN 2023089985 W CN2023089985 W CN 2023089985W WO 2023207820 A1 WO2023207820 A1 WO 2023207820A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
pancreatic
image
pancreas
features
Prior art date
Application number
PCT/CN2023/089985
Other languages
English (en)
French (fr)
Inventor
李劲松
胡佩君
田雨
周天舒
Original Assignee
之江实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 之江实验室
Publication of WO2023207820A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B6/00 Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
    • A61B6/02 Arrangements for diagnosis sequentially in different planes; Stereoscopic radiation diagnosis
    • A61B6/03 Computed tomography [CT]
    • A61B6/032 Transmission computed tomography [CT]
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B6/00 Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
    • A61B6/50 Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment specially adapted for specific body parts; specially adapted for specific clinical applications
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B6/00 Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
    • A61B6/52 Devices using data or image processing specially adapted for radiation diagnosis
    • A61B6/5211 Devices using data or image processing specially adapted for radiation diagnosis involving processing of medical diagnostic data
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B6/00 Apparatus or devices for radiation diagnosis; Apparatus or devices for radiation diagnosis combined with radiation therapy equipment
    • A61B6/52 Devices using data or image processing specially adapted for radiation diagnosis
    • A61B6/5258 Devices using data or image processing specially adapted for radiation diagnosis involving detection or reduction of artifacts or noise
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]

Definitions

  • the invention relates to the field of medical and health information technology, and in particular to a post-pancreatic surgery diabetes prediction system based on supervised deep subspace learning.
  • pancreas is an important organ that produces endocrine and exocrine hormones and is critical for glucose metabolism.
  • Patients undergoing pancreatectomy are at risk for impaired glucose tolerance/diabetes mellitus (IGT/DM), with a reported incidence of 4.8%-60%, and the period of development of diabetes ranging from within 30 days to 3.5 years after surgery. Incidence rates vary depending on factors such as heterogeneity of the patient population, loss of pancreatic parenchymal volume, type of pancreatic resection, and other factors.
  • Distal pancreatectomy (DP) is a standard surgical treatment for the removal of neoplastic and non-neoplastic lesions in the body and tail of the pancreas.
  • Existing post-pancreatectomy diabetes risk prediction is generally based on electronic medical records. It extracts demographic information such as age, gender, and BMI, and laboratory data such as rapid blood glucose level, glucose tolerance, and serum glycated hemoglobin, and considers related factors such as residual pancreatic volume and pancreatic volume resection rate.
  • Risk factors for diabetes have been mined with statistical testing methods, but no risk prediction system has been established.
  • BMI is usually used to measure obesity and is used as a risk factor for diabetes. But in fact, different fat and muscle components are closely related to human metabolic diseases, and more directly reflect the degree of obesity.
  • Radiomics prediction for other diseases generally builds disease prediction models through steps such as manually delineating regions of interest, calculating shallow image features such as texture, feature screening, and machine learning model construction.
  • However, when this type of method is applied to risk prediction after pancreatic resection, it requires delineating regions of interest on both preoperative and postoperative CT scans, which is time-consuming and labor-intensive.
  • In feature calculation and screening, generally only shallow image features are considered, and statistical analysis or recursive feature elimination is used to screen features.
  • Feature screening and the classification model are independent of each other, so dimensionality reduction of high-dimensional features performs poorly.
  • In classification model construction, traditional machine learning classifiers such as logistic regression, support vector machines, and random forests are generally selected, but their accuracy in predicting diabetes risk after pancreatectomy is not high enough.
  • the purpose of the present invention is to address the shortcomings of the existing technology and propose a post-pancreatectomy diabetes risk prediction system based on supervised deep subspace learning that combines clinical and imaging features.
  • the present invention proposes to use a deep convolutional neural network to automatically segment the preoperative CT pancreatic area, and then uses MITK software to simulate the pancreatic resection margin to obtain the postoperative pancreatic area, greatly reducing the workload of marking the area of interest.
  • This invention can combine patient clinical information and high-dimensional image features, find low-dimensional features related to the risk of postoperative diabetes through deep subspace learning, and at the same time predict the patient's risk of postoperative diabetes, with a high degree of automation and high discrimination accuracy.
  • a pancreatic postoperative diabetes prediction system based on supervised deep subspace learning.
  • the system includes a preoperative CT image data acquisition module, a residual pancreas region of interest acquisition module, an image feature calculation module, a clinical feature calculation module, and a deep subspace learning module.
  • the preoperative CT image data acquisition module is used to acquire preoperative CT image data for pancreatic resection and input it into the residual pancreas region of interest acquisition module and image feature calculation module;
  • the residual pancreas region of interest acquisition module is used to input preoperative CT image data into the trained pancreas segmentation network to obtain the pancreas prediction region; on the pancreas prediction region, the edge of pancreatic resection is simulated in software to obtain the residual pancreas region after resection, which serves as the region of interest for subsequent calculation of image features and is input to the image feature calculation module;
  • the image feature calculation module is used to calculate the pancreatic image features based on the preoperative CT image data and the region of interest of the image features, and input them into the deep subspace learning module;
  • the clinical feature calculation module is used to obtain clinical information related to the patient's postoperative diabetes, including demographic information, living habits, pancreatic volume resection rate, remaining pancreatic volume, and abdominal fat and muscle content characteristics, and concatenates these into the clinical features, which are input to the deep subspace learning module;
  • the deep subspace learning module performs feature dimensionality reduction and fusion through a deep subspace learning network.
  • the deep subspace learning network includes an encoder, a latent space variable self-expression layer and a decoder, and supervised learning of the latent space variable self-expression layer;
  • the deep subspace learning network inputs pancreatic image features and clinical features, and outputs latent space variables through the encoder.
  • the output latent space variables are connected to a fully connected layer and applied to the activation function to obtain the predictive value of diabetes risk.
  • the preoperative CT image data acquisition module acquires the preoperative CT image data, truncates the HU values of the CT image data to [-100, 240], and then discretizes them to [0, 255].
  • based on the residual pancreas region, the rectangular bounding box of its surrounding area is calculated, an edge extension value is set, and the CT image data and the residual pancreas annotation image are cropped to that box.
  • the preoperative pancreatic CT image is automatically segmented based on a deep convolutional neural network to obtain a complete pancreas prediction region.
  • the surgical cutting plane is simulated in the medical image processing toolkit MITK, and the residual pancreas area after resection is obtained as the residual pancreas region of interest for subsequent image feature calculation.
  • the pancreas segmentation network selects a densely connected dilated convolutional network.
  • the image feature calculation module filters the preoperative CT image data and uses the filtered image and the residual pancreas region of interest to calculate the first-order statistical feature vector, the shape feature vector and the texture feature vector, which are concatenated to obtain the filtered feature vector; from the input to the fully connected layer of the pancreas segmentation network, the feature mean over all pixels in the residual pancreas region of interest is calculated and standardized to obtain a high-level semantic feature vector; the filtered feature vector is concatenated with the high-level semantic feature vector to obtain the pancreatic image features.
  • $X_f = (X_f - \min(X_f)) / (\max(X_f) - \min(X_f))$
  • $X_f$ represents the feature vector
  • $f$ represents the specific feature name
  • the vector length is the number $n$ of all CT image data.
  • $d_1$ is the dimension of the image feature
  • $n$ is the number of CT image data
  • $X_{radiomics}$ is the feature vector after filtering, representing radiomics features
  • $X_{deep}$ is the high-level semantic feature vector.
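The min-max normalization and the concatenation of radiomics and high-level semantic features described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `min_max_normalize`, the toy arrays, the choice to scale both feature groups the same way, and the handling of constant columns are all assumptions.

```python
import numpy as np

def min_max_normalize(features):
    """Scale each feature column to [0, 1]: (x - min) / (max - min)."""
    features = np.asarray(features, dtype=float)
    col_min = features.min(axis=0)
    col_range = features.max(axis=0) - col_min
    # Assumption: a constant column maps to 0 instead of dividing by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (features - col_min) / safe_range

# Hypothetical toy data: n = 3 scans, 2 radiomics features, 1 deep feature.
x_radiomics = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
x_deep = np.array([[0.5], [0.5], [0.9]])

# Concatenate the two normalized groups into one image feature matrix (n x d1).
x_img = np.hstack([min_max_normalize(x_radiomics), min_max_normalize(x_deep)])
```

Each row of `x_img` then holds the image features of one scan, so the number of columns plays the role of $d_1$.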
  • the clinical feature calculation module includes a body composition feature calculation module, a clinical information feature calculation module and a pancreatic resection feature calculation module;
  • the body composition feature calculation module is used to calculate the area of visceral fat, subcutaneous fat and skeletal muscle in the cross-sectional image at the third vertebral position of the CT volume data, and to calculate the ratio of visceral fat to skeletal muscle and the ratio of visceral fat to subcutaneous fat, to obtain the body composition features;
  • the clinical information feature acquisition module is used to obtain basic clinical information of patients, including demographic characteristics and living habits, to form clinical information features;
  • the pancreatic resection feature calculation module calculates the preoperative and postoperative volumes of the pancreas, calculates the pancreatic resection ratio, and constructs the pancreatic resection features;
  • the results of the body composition feature calculation module, the clinical information feature calculation module and the pancreatic resection feature calculation module are connected to form clinical features, which are input to the deep subspace learning module.
  • $X = \{X_{img}, X_{clinic}\}$, where $X_{img}$ is the image feature and $X_{clinic}$ is the clinical feature; the decoder output is the reconstruction of $X$; $y$ is the patient's true postoperative diabetes status, and the model output is the predicted diabetes risk; $Z$ is the latent space variable output by the encoder; $L$ is the Laplacian matrix; the symbol $\mathrm{Tr}$ denotes the trace of a matrix and the symbol $T$ denotes the matrix transpose; the trainable parameters are all parameters in the network, namely the encoder parameters, the self-representation coefficient matrix $C$, the supervision module parameters and the decoder parameters; the loss terms are weighted by three regularization coefficients.
  • the present invention proposes to use preoperative CT images and clinical information to predict the risk of diabetes after pancreatectomy, filling the gap in using preoperative images to predict the risk of diabetes.
  • the present invention proposes to automatically segment preoperative CT images based on deep convolutional neural networks.
  • the pancreatic cutting edge is simulated to obtain the postoperative residual pancreas area of interest, which can greatly reduce the amount of manual annotation.
  • the present invention establishes a high-dimensional image feature set that combines radiomics features and high-level semantic features, and a diabetes-related clinical feature set including pancreatic resection rate, fat and muscle tissue composition, demographic information, and lifestyle habits.
  • a supervised deep subspace learning network is innovatively proposed to achieve feature dimensionality reduction and fusion in subspace, while training the prediction model, mining sensitive features related to the prediction task, and improving the accuracy of the prediction model.
  • Figure 1 is a diagram of the diabetes prediction system after pancreatic surgery based on supervised deep subspace learning of the present invention.
  • Figure 2 is a schematic diagram of the densely connected dilated convolutional neural network architecture.
  • Figure 3 is a schematic diagram of body composition boundaries.
  • Figure 4 is a schematic diagram of the supervised deep subspace learning network structure.
  • the purpose of the present invention is to address the current gap in using clinical characteristics and preoperative imaging features to predict the risk of diabetes after pancreatectomy, together with the problems that existing radiomics methods rely on manual delineation of regions of interest, reduce high-dimensional features poorly, and have insufficient discrimination ability. To this end, a post-pancreatectomy diabetes risk prediction system based on supervised deep subspace learning is proposed.
  • the present invention obtains the pancreas prediction region through the residual pancreas region of interest acquisition module and then obtains the residual pancreas region.
  • the preoperative pancreatic CT image is automatically segmented based on a deep convolutional neural network to obtain a complete pancreas prediction region.
  • the surgical cutting plane is simulated in the medical image processing tool MITK software, and the residual pancreas area after resection is obtained as the residual pancreas area of interest for subsequent image feature calculations.
  • the image feature calculation module is used to calculate pancreatic image features.
  • the clinical feature calculation module is used to extract clinical information related to the patient's postoperative diabetes, including demographic information (gender, age, etc.), living habits (drinking, smoking, etc.), pancreatic volume resection rate, remaining pancreatic volume, and abdominal fat and muscle content, to establish the clinical feature set.
  • a supervised deep subspace learning network is established through the deep subspace learning module; the added supervised learning module reduces and fuses the image features and clinical features to obtain their sparse representation in the low-dimensional subspace, from which the similarity matrix between features is calculated.
  • the supervised learning module in the trained deep subspace learning network is used to predict the patient's risk of postoperative diabetes.
  • the deep subspace learning network utilized in the present invention is shown in Figure 4.
  • after the preoperative CT image data acquisition module acquires the preoperative CT image data for pancreatic resection, it preprocesses the CT image data and divides the data set, specifically as follows:
  • the HU value is the CT value, a unit of measurement for the density of a local tissue or organ in the human body, usually called the Hounsfield unit (HU); air is -1000 and dense bone is +1000. Then, the rectangular bounding box of the area surrounding the residual pancreas is calculated, an edge extension value is set, and the CT image and the annotation image are cropped to that box. This step reduces the amount of subsequent image feature calculation.
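The bounding-box cropping step can be illustrated with a short sketch. It assumes the CT volume and the residual pancreas annotation are NumPy arrays of the same shape; the function name `crop_to_mask` and the default margin of 2 voxels are hypothetical, since the text only says that an edge extension value is set.

```python
import numpy as np

def crop_to_mask(image, mask, margin=2):
    """Crop image and mask to the mask's bounding box, extended by `margin` voxels."""
    coords = np.argwhere(mask > 0)
    if coords.size == 0:
        raise ValueError("mask is empty")
    # Clamp the extended box to the array bounds.
    lower = np.maximum(coords.min(axis=0) - margin, 0)
    upper = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    window = tuple(slice(a, b) for a, b in zip(lower, upper))
    return image[window], mask[window]

# Hypothetical 10 x 10 x 10 volume with a small annotated region.
volume = np.arange(1000, dtype=float).reshape(10, 10, 10)
annotation = np.zeros((10, 10, 10), dtype=np.uint8)
annotation[4:6, 3:7, 5] = 1
cropped_volume, cropped_annotation = crop_to_mask(volume, annotation, margin=2)
```

Cropping both arrays with the same window keeps them aligned, so later feature calculations can index the image and the annotation identically.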
  • the residual pancreas region of interest acquisition module is used for automatic segmentation of pancreatic CT images and extraction of residual pancreas region of interest.
  • the preoperative pancreatic CT volume data is $I$, with size $512 \times 512 \times L$, where $L$ is the number of slices in the volume data.
  • contrast adjustment and image frame selection are performed for each two-dimensional image $I_{A,l}$.
  • specifically, the HU values of the image are truncated to [-100, 240] and then normalized to [0, 1].
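A minimal sketch of the contrast adjustment, assuming the slice is available as an array of HU values. The function names are hypothetical; the window [-100, 240] and the target range [0, 1] come from the text, and the 8-bit variant corresponds to the [0, 255] discretization mentioned earlier for the acquisition module.

```python
import numpy as np

def window_hu(image_hu, lower=-100.0, upper=240.0):
    """Truncate HU values to [lower, upper] and rescale linearly to [0, 1]."""
    clipped = np.clip(np.asarray(image_hu, dtype=float), lower, upper)
    return (clipped - lower) / (upper - lower)

def to_uint8(normalized):
    """Discretize a [0, 1] image to integer gray levels in [0, 255]."""
    return np.round(np.asarray(normalized) * 255.0).astype(np.uint8)

# Air (-1000 HU) and dense bone (+1000 HU) saturate at the ends of the window.
sample = np.array([-1000.0, -100.0, 70.0, 240.0, 1000.0])
normalized = window_hu(sample)
gray = to_uint8(normalized)
```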
  • the pancreas segmentation network can choose a densely connected dilated convolutional network or other end-to-end fully convolutional neural network.
  • the densely connected dilated convolutional neural network (DenseASPP) is selected as the pancreas segmentation network.
  • DenseASPP is a generalized densely connected network with densely connected atrous spatial pyramid pooling layers (ASPP). It encodes multi-scale information by concatenating feature maps from atrous convolutions with different dilation rates.
  • DenseASPP uses dense connections to the output of each atrous convolutional layer and obtains increasingly larger receptive fields with reasonable expansion rates. Therefore, DenseASPP can obtain output feature maps covering a larger receptive field in a very dense manner.
  • the present invention uses DenseNet161 followed by dilated convolutional layers to build the basic DenseASPP network (see Figure 2).
  • the first part of DenseASPP is a feature generation block that outputs a basic feature map $y_0$ whose size is 1/8 of the input image size. Specifically, it consists of a convolutional layer, 4 dense blocks, and 3 transition layers. The initial number of feature maps in the first dense block is 96 and the growth rate is 48.
  • the second part of DenseASPP is a dense ASPP module built on the feature map $y_0$ through densely connected atrous convolutional layers, in which the number of atrous convolutional layers is 3 and the dilation rates $d$ are 3, 6, and 12 respectively.
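To give a sense of how the three dilation rates enlarge the receptive field, the arithmetic can be worked through in a few lines. The kernel size of 3 is an assumption (the text does not state it), and the receptive field is measured on the feature map $y_0$ with stride 1.

```python
def effective_kernel(kernel_size, dilation):
    """Extent covered by one atrous convolution: (k - 1) * d + 1."""
    return (kernel_size - 1) * dilation + 1

def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 atrous layers applied one after another."""
    field = 1
    for dilation in dilations:
        field += effective_kernel(kernel_size, dilation) - 1
    return field

rates = [3, 6, 12]                                  # dilation rates from the text
per_layer = [effective_kernel(3, d) for d in rates]
print(per_layer)                                    # [7, 13, 25]
print(stacked_receptive_field(3, rates))            # 43
```

Under these assumptions the longest path through the dense ASPP module covers 43 positions of $y_0$ per axis, while shorter paths through the dense connections cover the intermediate scales.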
  • on the automatically segmented pancreas prediction area, the edge of pancreatic resection is simulated in MITK software according to the surgical records or the pancreatic tumor location, and the residual pancreas area after resection is obtained; it is recorded as R and used as the residual pancreas region of interest for subsequent calculation of image features.
  • the image feature calculation module uses the residual pancreas region of interest to calculate radiomic features. The original image of most pancreatic regions is usually piecewise constant and contains highly redundant signals.
  • the present invention therefore uses wavelet analysis to extract high-frequency and low-frequency information from images. Wavelet analysis uses basis functions called wavelets to convert signals into a spatial/frequency representation. After wavelet transformation, the image is decomposed into multi-resolution subspaces, and the corresponding wavelet coefficients reflect the high-frequency and low-frequency signals of the original image. Wavelet filtering is performed on the CT image data acquired before pancreatectomy, with db1, db5, sym7, coif3 and haar selected as the wavelet bases.
  • the MATLAB-based wavelet toolkit decomposes the image along three directions into high-frequency and low-frequency signals; specifically, wavedec3 in the wavelet toolkit is used to decompose the high-frequency and low-frequency information of the image.
  • the 3D wavelet transform can be expressed as
  • H and L represent high-pass filtering and low-pass filtering respectively
  • x, y and z represent the three-dimensional coordinate axes.
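The idea of applying a high-pass (H) or low-pass (L) filter along each of the three axes can be sketched with a one-level Haar transform. This is a simplified stand-in written in NumPy, not the MATLAB wavedec3 routine named above: it handles only the Haar basis, assumes even array dimensions, and the function names and the mapping of key letters to array axes 0, 1, 2 are assumptions.

```python
import numpy as np

def haar_split(volume, axis):
    """Split along one axis into low-pass and high-pass halves (orthonormal Haar)."""
    even = np.take(volume, np.arange(0, volume.shape[axis], 2), axis=axis)
    odd = np.take(volume, np.arange(1, volume.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_dwt3(volume):
    """One-level 3D Haar transform; returns 8 sub-bands keyed 'LLL' to 'HHH'."""
    bands = {"": np.asarray(volume, dtype=float)}
    for axis in range(3):
        next_bands = {}
        for key, data in bands.items():
            low, high = haar_split(data, axis)
            next_bands[key + "L"] = low    # low-pass along this axis
            next_bands[key + "H"] = high   # high-pass along this axis
        bands = next_bands
    return bands

# A constant volume has no detail, so only the all-low-pass band is non-zero.
flat = np.ones((4, 4, 4))
flat_bands = haar_dwt3(flat)
```

Radiomics features would then be computed on each sub-band inside the region of interest, which is how one input image yields several filtered feature groups.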
  • the fully connected layer input of the pancreas segmentation network is extracted as a high-level semantic feature, and the feature mean of all pixels in the residual pancreas area of interest is calculated.
  • the densely connected dilated convolutional neural network (DenseASPP) is selected as the pancreas segmentation network, and the pre-pancreatectomy CT image data is contrast-adjusted and then input into the trained segmentation network.
  • the input to the network's last fully connected layer is taken as the extracted high-level semantic feature, and its mean over all pixels in the region of interest is calculated, finally yielding a 1488-dimensional feature.
  • the clinical feature calculation module includes a body composition feature calculation module, a clinical information feature calculation module and a pancreatic resection feature calculation module;
  • the body composition feature calculation module extracts the cross-sectional image at the third vertebral position of the CT volume data, and the body peripheral boundary 1, the subcutaneous fat and skeletal muscle boundary 2, and the skeletal muscle and abdominal cavity boundary 3 are marked manually (see Figure 3).
  • the areas of visceral fat, subcutaneous fat and skeletal muscle are calculated from the CT HU values. Specifically, the HU value range of adipose tissue is set to [-190, -30], and the HU value range of muscle tissue to [-29, 150].
  • the area between boundary 1 and boundary 2 is the location of subcutaneous fat; a threshold set according to the adipose tissue HU value range extracts the subcutaneous adipose tissue (SAT) area.
  • the area within boundary 3 is where visceral fat is located; a threshold set according to the adipose tissue HU value range extracts the visceral adipose tissue (VAT) area.
  • the area between boundary 2 and boundary 3 is the location of skeletal muscle; a threshold set according to the muscle tissue HU value range extracts the skeletal muscle (SKM) area.
  • the areas are denoted $S_i$, $i \in \{SAT, VAT, SKM\}$, and the total fat area is $S_{AT} = S_{VAT} + S_{SAT}$.
  • the ratio of visceral fat to skeletal muscle $S_{VAT}/S_{SKM}$ and the ratio of visceral fat to subcutaneous fat $S_{VAT}/S_{SAT}$ are calculated.
  • Body composition characteristics $X_{composition} = \{S_{VAT}, S_{SAT}, S_{SKM}, S_{VAT}/S_{SAT}, S_{VAT}/S_{SKM}, S_{AT}\}$.
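The thresholding and ratio calculations above can be sketched as follows. The HU ranges are taken from the text; the function names, the boolean region masks (standing in for the manually marked boundaries) and the `pixel_area_mm2` parameter are assumptions, and the sketch presumes that no tissue area is zero.

```python
import numpy as np

FAT_HU = (-190, -30)      # adipose tissue range from the text
MUSCLE_HU = (-29, 150)    # muscle tissue range from the text

def tissue_area(slice_hu, region_mask, hu_range, pixel_area_mm2=1.0):
    """Area of pixels inside `region_mask` whose HU value lies in `hu_range`."""
    low, high = hu_range
    in_range = (slice_hu >= low) & (slice_hu <= high)
    return float(np.count_nonzero(in_range & region_mask)) * pixel_area_mm2

def body_composition(slice_hu, subcut_mask, muscle_mask, visceral_mask,
                     pixel_area_mm2=1.0):
    """Return the six body composition features for one axial slice."""
    s_sat = tissue_area(slice_hu, subcut_mask, FAT_HU, pixel_area_mm2)
    s_vat = tissue_area(slice_hu, visceral_mask, FAT_HU, pixel_area_mm2)
    s_skm = tissue_area(slice_hu, muscle_mask, MUSCLE_HU, pixel_area_mm2)
    return {
        "S_VAT": s_vat, "S_SAT": s_sat, "S_SKM": s_skm,
        "S_VAT/S_SAT": s_vat / s_sat, "S_VAT/S_SKM": s_vat / s_skm,
        "S_AT": s_vat + s_sat,
    }

# Hypothetical 4 x 4 slice: top-left fat, top-right muscle, bottom rows visceral.
slice_hu = np.array([[-100, -100, 50, 50],
                     [-100, -100, 50, 50],
                     [-60, -60, -60, 200],
                     [-60, -60, -60, 200]], dtype=float)
subcut = np.zeros((4, 4), dtype=bool); subcut[:2, :2] = True
muscle = np.zeros((4, 4), dtype=bool); muscle[:2, 2:] = True
visceral = np.zeros((4, 4), dtype=bool); visceral[2:, :] = True
features = body_composition(slice_hu, subcut, muscle, visceral, pixel_area_mm2=0.5)
```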
  • the clinical information feature acquisition module acquires the patient's basic clinical information, including demographic characteristics (gender, age, etc.) and living habits (smoking, drinking, etc.), to form the clinical information feature $X_{info}$.
  • the above image feature sets and clinical feature sets contain highly redundant or irrelevant features that require feature dimensionality reduction. At the same time, imaging features and clinical features need to be fused.
  • the present invention performs a deep subspace learning process through a deep subspace learning module to achieve feature dimensionality reduction and fusion.
  • the deep subspace learning module includes a deep subspace learning network based on autoencoders.
  • the specific construction process is as follows: First, design an autoencoder network AE.
  • Ordinary autoencoders generally consist of an encoder and a decoder, and learn efficient representations of input data through unsupervised learning. These efficient representations of input data are called codings, and their dimensions are generally much smaller than the input data, making autoencoders useful for dimensionality reduction.
  • in order to use the autoencoder to realize subspace clustering, the present invention also adds a latent space variable self-expression layer to the autoencoder. The AE therefore consists of an encoder E, a decoder D and a latent space variable self-expression layer.
  • the encoder E includes three convolutional layers-activation layers, the decoder D consists of three corresponding convolutional layers-activation layers, and the latent space variable self-expression layer includes a fully connected layer. Then, a supervised learning module is added to the autoencoder network, that is, a fully connected layer is connected to the latent space variable Z output by the encoder, and the activation function is applied to obtain the predicted value of the label.
  • the output of the encoder E is the latent space variable Z, determined by the input X and the encoder parameters
  • the output of the decoder D is the reconstruction of X, determined by the decoder parameters.
  • S is the similarity matrix that describes the similarity between data, where the element $S_{ij}$ of the similarity matrix represents the coefficient with which data $X_i$ can be reconstructed from the $j$-th data sample.
  • the decoder output should differ from the original data X as little as possible, which defines the reconstruction loss of the network
  • the symbol Tr represents the trace of the matrix
  • the symbol T represents the matrix transpose
  • the symbol diag( ⁇ ) represents the diagonal matrix
  • n represents the number of samples.
  • the loss function of the self-expression layer can be defined as
  • the similarity matrix S is calculated from the self-representation coefficient matrix C, and the calculation formula is
  • the first term of the loss function formula requires the coefficient matrix C to be sparse, that is, the filtered features are sparse.
  • the second term requires that data in the same subspace be self-representable, that is, linearly represented by other data in the same subspace.
  • $\|\cdot\|_1$ represents the L1 norm
  • $\|\cdot\|_F$ represents the Frobenius norm
  • the weight balancing the two terms is the regularization coefficient.
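Because the formulas themselves did not survive in this text, the following is only a sketch of the form that a self-expression loss commonly takes in deep subspace clustering, not a reconstruction of the patent's equations. The assumptions are: rows of Z are samples, so self-expression reads Z ≈ CZ; the loss is the L1 norm of C plus a weighted squared Frobenius residual; the similarity matrix is the symmetrized magnitude of C; and the Laplacian is degree matrix minus similarity. The names `lam`, `self_expression_loss`, `similarity_from_coefficients` and `graph_laplacian` are hypothetical.

```python
import numpy as np

def self_expression_loss(z, c, lam=1.0):
    """Sparsity term ||C||_1 plus (lam / 2) * ||Z - C Z||_F^2 (rows of Z are samples)."""
    sparsity = np.abs(c).sum()
    residual = z - c @ z
    return float(sparsity + 0.5 * lam * np.sum(residual ** 2))

def similarity_from_coefficients(c):
    """Assumed symmetrization S = (|C| + |C|^T) / 2."""
    magnitude = np.abs(c)
    return 0.5 * (magnitude + magnitude.T)

def graph_laplacian(s):
    """L = D - S, with D the diagonal matrix of the row sums of S."""
    return np.diag(s.sum(axis=1)) - s

# Toy example: samples 0 and 1 are identical and represent each other exactly.
z = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
c = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
loss = self_expression_loss(z, c, lam=1.0)
s = similarity_from_coefficients(c)
smoothness = float(np.trace(z.T @ graph_laplacian(s) @ z))
```

In this toy case the sparsity term contributes 2 and the unrepresented third sample contributes a residual of 2, while the trace term is zero because the two connected samples have identical latent variables.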
  • the traditional autoencoder learning method is unsupervised.
  • the present invention adds a supervised learning module to the autoencoder.
  • the introduced fully connected layer has its own parameters, the supervision module parameters
  • the activation function is the softmax function
  • the predicted label is obtained after the latent space variable Z passes through the fully connected layer and the activation function.
  • the real label of the data is recorded as $y$
  • the loss function of the supervised learning module is expressed as
  • the trainable parameters are all parameters in the network, including the encoder parameters, the self-representation layer coefficients C, the supervision module parameters and the decoder parameters.
  • the terms of the overall loss are weighted by three regularization coefficients.
  • the network structure hyperparameters and model parameters are set by grid search, including the number of neurons, the number of network layers, the learning rate, the batch size, the number of iteration steps, and the regularization coefficients.
  • the divided training set S tr is used to train the subspace learning network, and the ADAM method is used to optimize the network parameters, and finally the trained deep subspace learning network model is obtained.
  • the cut-off points were determined at the point with the largest positive likelihood ratio on the ROC curve, and the sensitivity, specificity, area under the ROC curve (AUC) and accuracy of assessing pathological response were calculated. The larger the AUC value, the higher the evaluation accuracy of the system of the present invention.
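The evaluation metrics named above can be computed from predicted risks with a few lines of plain Python. This is an illustrative sketch with made-up labels and scores; the function names are hypothetical, AUC is computed as the probability that a positive case outscores a negative one, and the sketch assumes both classes are present. Note that the positive likelihood ratio is unbounded when specificity reaches 1, so a cut-off search based on it needs a rule for that case, which the text does not specify.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def summarize(y_true, scores, threshold):
    """Return sensitivity, specificity, accuracy and positive likelihood ratio."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    lr_plus = float("inf") if fp == 0 else sensitivity / (1.0 - specificity)
    return sensitivity, specificity, accuracy, lr_plus

def auc(y_true, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = developed diabetes) and predicted risks.
labels = [1, 1, 1, 0, 0, 0]
risks = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
sens, spec, acc, lr_plus = summarize(labels, risks, threshold=0.5)
area = auc(labels, risks)
```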
  • Feature calculation: for the residual pancreas region of interest, wavelet-filtered image features and deep learning-based high-level semantic features are calculated to form the image feature set. Body composition features and pancreatic resection features are calculated and patient clinical information is extracted to form the clinical feature set.
  • the five-fold cross-validation results show that the features screened out by the subspace learning network include 36 image features and 4 clinical features.
  • Clinical features include alcohol consumption, muscle mass, age and residual pancreatic volume, and image features include 9 db5 filter features, 8 sym7 filter features, and 19 haar filter features.
  • the method of the present invention jointly mines imaging and clinical features, and the clinical variables mined are consistent with the relevant factors reported in the literature, which illustrates the effectiveness of this method in screening diabetes-related risk factors.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Pathology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Veterinary Medicine (AREA)
  • Animal Behavior & Ethology (AREA)
  • Surgery (AREA)
  • Optics & Photonics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Pulmonology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Primary Health Care (AREA)
  • Dentistry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Quality & Reliability (AREA)

Abstract

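A pancreatic postoperative diabetes prediction system based on supervised deep subspace learning. A deep convolutional neural network automatically segments the pancreas region in the preoperative CT, and MITK software is used to simulate the pancreatic resection margin and obtain the postoperative residual pancreas region, which greatly reduces the workload of annotating regions of interest. Traditional radiomics features and deep semantic features are extracted from the residual pancreas region to build a high-dimensional image feature set; diabetes-related clinical factors, including pancreatic resection rate, fat and muscle tissue composition, demographic information and living habits, are extracted to build a clinical feature set. A supervised deep subspace learning network reduces the dimensionality of the image and clinical features and fuses them in a subspace while training the prediction model at the same time, mines sensitive features that are highly relevant to the prediction task, and predicts the patient's risk of developing diabetes after surgery, with a high degree of automation and discriminative accuracy.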

Description

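Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning
Technical Field
The present invention relates to the field of medical and health information technology, and in particular to a pancreatic postoperative diabetes prediction system based on supervised deep subspace learning.
Background
The pancreas is an important organ that produces endocrine and exocrine hormones and is essential for glucose metabolism. Patients who undergo pancreatectomy are at risk of impaired glucose tolerance/diabetes mellitus (IGT/DM); the reported incidence is 4.8%-60%, and the time to onset ranges from within 30 days to 3.5 years after surgery. The incidence varies with the heterogeneity of the patient population, the reduction in pancreatic parenchymal volume, the type of pancreatectomy and other factors. Distal pancreatectomy (DP) is a standard surgical treatment for removing neoplastic and non-neoplastic lesions of the body and tail of the pancreas. However, the factors associated with new-onset diabetes after distal pancreatectomy remain unclear and have not been studied adequately. Because a healthy lifestyle and timely medical intervention are considered effective ways to reduce the risk of diabetes, early prediction of new-onset diabetes after pancreatectomy is necessary and meaningful. The literature on predicting new-onset diabetes after pancreatectomy is nevertheless very limited, and diabetes classification based on imaging data is even more challenging.
In the small body of existing work, demographic information such as sex, age and BMI and laboratory indicators such as glucose tolerance and fasting blood glucose are extracted from electronic medical records to mine diabetes risk factors. No study has yet used imaging data to build a risk prediction model. CT is one of the routine imaging examinations for pancreatic disease; it is non-invasive and gives clear visualization, and CT images can reflect the texture of the pancreas. Combining basic clinical information with CT image features therefore has significant clinical value for predicting the risk of diabetes after pancreatectomy.
Existing approaches to predicting the risk of diabetes after pancreatectomy are generally based on electronic medical records. They extract demographic information such as age, sex and BMI, laboratory data such as fasting blood glucose, glucose tolerance and serum glycated hemoglobin, and related factors such as residual pancreatic volume and pancreatic volume resection rate, and mine diabetes risk factors with statistical tests, but they do not build a risk prediction system. BMI is usually used to measure obesity as a risk factor for diabetes, whereas the individual fat and muscle components are in fact closely related to metabolic disease and reflect the degree of obesity more directly. No existing method combines clinical information and imaging data to predict the risk of diabetes after pancreatectomy. Radiomics-based prediction for other diseases generally builds the prediction model through manual delineation of the region of interest, computation of shallow image features such as texture, feature selection and construction of a machine learning model. Applying such methods to risk prediction after pancreatectomy would require delineating regions of interest on both preoperative and postoperative CT, which is time-consuming and laborious. In feature computation and selection, only shallow image features are usually considered, and features are selected with statistical analysis or recursive feature elimination; feature selection and the classification model are independent of each other, so dimensionality reduction of high-dimensional features works poorly. When the classification model is built, traditional machine learning classifiers such as logistic regression, support vector machines and random forests are usually chosen, and their accuracy in predicting the risk of diabetes after pancreatectomy is not high enough.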
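Summary of the Invention
The object of the present invention is to address the shortcomings of the prior art by providing a system for predicting the risk of diabetes after pancreatectomy that combines clinical and image features and is based on supervised deep subspace learning. The invention uses a deep convolutional neural network to automatically segment the pancreas region in the preoperative CT and then uses MITK software to simulate the pancreatic resection margin and obtain the postoperative pancreas region, which greatly reduces the workload of annotating regions of interest. First-order statistics, shape and texture features of wavelet-filtered images are extracted from the residual pancreas region, and high-level semantic features of the postoperative pancreas region are extracted with the deep convolutional neural network, to build the image feature set of the residual pancreas region. Risk factors related to pancreatectomy, including demographic information, living habits, pancreatic volume resection rate, residual pancreatic volume and abdominal fat and muscle content, are further extracted or calculated to build the clinical feature set. A supervised deep subspace learning method is then proposed that performs feature selection and fusion on the image feature set and the clinical feature set simultaneously, obtains their sparse representation in a low-dimensional subspace, and computes from it the similarity matrix between data. Finally, the supervised module of the deep subspace learning network predicts the patient's risk of developing diabetes after surgery. The invention can combine patient clinical information with high-dimensional image features, find low-dimensional features related to the risk of postoperative diabetes through deep subspace learning, and at the same time predict that risk, with a high degree of automation and discriminative accuracy.
The object of the present invention is achieved through the following technical solution: a pancreatic postoperative diabetes prediction system based on supervised deep subspace learning, comprising a preoperative CT image data acquisition module, a residual pancreas region-of-interest (ROI) acquisition module, an image feature calculation module, a clinical feature calculation module and a deep subspace learning module.
The preoperative CT image data acquisition module acquires the CT image data taken before pancreatectomy and passes it to the residual pancreas ROI acquisition module and the image feature calculation module.
The residual pancreas ROI acquisition module feeds the preoperative CT image data into a trained pancreas segmentation network to obtain the predicted pancreas region; on the predicted pancreas region, the pancreatic resection margin is simulated in software to obtain the residual pancreas region after resection, which serves as the region of interest for subsequent image feature calculation and is passed to the image feature calculation module.
The image feature calculation module computes the pancreatic image features from the preoperative CT image data and the region of interest, and passes them to the deep subspace learning module.
The clinical feature calculation module acquires clinical information related to postoperative diabetes, including demographic information, living habits, pancreatic volume resection rate, residual pancreatic volume and abdominal fat and muscle content features, concatenates these into the clinical features, and passes them to the deep subspace learning module.
The deep subspace learning module performs feature dimensionality reduction and fusion through a deep subspace learning network. The network comprises an encoder, a latent-variable self-expression layer and a decoder, with supervised learning applied to the latent-variable self-expression layer. The network takes the pancreatic image features and clinical features as input; the encoder outputs the latent variables, and a fully connected layer followed by an activation function is attached to the latent variables output by the encoder to obtain the predicted value of diabetes risk.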
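Further, after acquiring the pre-pancreatectomy CT image data, the preoperative CT image data acquisition module clips the HU values of the CT image data to [-100, 240] and then discretizes them to [0, 255]; a bounding box enclosing the residual pancreas is computed, a margin is set, and the CT image data and the residual pancreas annotation image are cropped to that box.
Further, in the residual pancreas ROI acquisition module, the preoperative pancreas CT image is segmented automatically with a deep convolutional neural network to obtain the complete predicted pancreas region; based on the surgical record or the tumor location, the surgical cutting plane is simulated in the medical image processing toolkit MITK, and the resulting residual pancreas region after resection is used as the residual pancreas region of interest for subsequent image feature calculation.
Further, in the residual pancreas ROI acquisition module, the pancreas segmentation network is a densely connected dilated convolutional network.
Further, the image feature calculation module filters the preoperative CT image data and, from the filtered images and the residual pancreas region of interest, computes a first-order statistics feature vector, a shape feature vector and a texture feature vector; the three are concatenated to give the filtered feature vector. From the input to the fully connected layer of the pancreas segmentation network, the mean feature value over all pixels in the residual pancreas region of interest is computed and standardized to give the high-level semantic feature vector. The filtered feature vector and the high-level semantic feature vector are concatenated to give the pancreatic image features.
Further, the filtered feature vectors of all CT image data are processed as follows:
$X_f = \dfrac{X_f - \min(X_f)}{\max(X_f) - \min(X_f)}$
where $X_f$ denotes a feature vector, $f$ denotes the specific feature name, and the vector length is the number $n$ of CT image data.
Further, the filtered feature vector and the high-level semantic feature vector are concatenated to give the image features $X_{img}$:
$X_{img} = \mathrm{concat}\big((X_{radiomics}, X_{deep}),\ \mathrm{axis}=1\big)$
where $X_{img} \in \mathbb{R}^{n \times d_1}$, $d_1$ is the dimension of the image features, $n$ is the number of CT image data, $X_{radiomics}$ is the filtered feature vector, i.e. the radiomics features, and $X_{deep}$ is the high-level semantic feature vector.
Further, the clinical feature calculation module comprises a body composition feature calculation module, a clinical information feature calculation module and a pancreatic resection feature calculation module.
The body composition feature calculation module computes the areas of visceral fat, subcutaneous fat and skeletal muscle on the cross-sectional image at the level of the third vertebra of the CT volume, and computes the ratio of visceral fat to skeletal muscle and the ratio of visceral fat to subcutaneous fat, to obtain the body composition features.
The clinical information feature calculation module acquires the patient's basic clinical information, including demographic features and living habits, which make up the clinical information features.
The pancreatic resection feature calculation module computes the preoperative and postoperative pancreatic volumes and the pancreatic resection ratio, which make up the pancreatic resection features.
The results of the body composition feature calculation module, the clinical information feature calculation module and the pancreatic resection feature calculation module are concatenated to form the clinical features, which are passed to the deep subspace learning module.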
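Further, the loss function of the deep subspace learning network in the deep subspace learning module is:
[loss-function formula not reproduced in this text]
s.t. $\operatorname{diag}(C) = 0$
where $X = \{X_{img}, X_{clinic}\}$, $X_{img}$ denotes the image features, $X_{clinic}$ the clinical features, $\hat{X}$ the decoder output, $y$ the patient's true postoperative diabetes status, $\hat{y}$ the diabetes risk predicted by the model, $Z$ the latent variables output by the encoder, and $L$ the Laplacian matrix; $\mathrm{Tr}$ denotes the matrix trace and the superscript $T$ the matrix transpose; $\Theta$ denotes all parameters of the network, including the encoder parameters $\Theta_e$, the self-representation coefficient matrix $C$, the supervised module parameters $\Theta_s$ and the decoder parameters $\Theta_d$; $\alpha$, $\gamma_1$ and $\gamma_2$ are regularization coefficients; $\|\cdot\|_F$ denotes the Frobenius norm and $\mathrm{BCE}(\cdot)$ the cross-entropy loss.
Beneficial effects of the present invention: the invention uses preoperative CT images and clinical information to predict the risk of diabetes after pancreatectomy, filling the gap in using preoperative imaging to predict diabetes risk. Unlike traditional radiomics methods, the invention segments the preoperative CT image automatically with a deep convolutional neural network and simulates the pancreatic resection margin on that basis to obtain the postoperative residual pancreas region of interest, which greatly reduces manual annotation. The invention further builds a high-dimensional image feature set that combines radiomics features with high-level semantic features, and a diabetes-related clinical feature set that includes pancreatic resection rate, fat and muscle tissue composition, demographic information and living habits; these sets are the basis of, and the key to, mining features associated with diabetes after pancreatectomy. A supervised deep subspace learning network is also proposed that performs feature dimensionality reduction and fusion in a subspace while training the prediction model, mines sensitive features relevant to the prediction task, and improves the accuracy of the prediction model.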
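Brief Description of the Drawings
FIG. 1 is a diagram of the pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to the present invention.
FIG. 2 is a schematic diagram of the densely connected dilated convolutional neural network architecture.
FIG. 3 is a schematic diagram of the body composition boundaries.
FIG. 4 is a schematic diagram of the structure of the supervised deep subspace learning network.
Detailed Description
Specific embodiments of the present invention are described in further detail below with reference to the drawings.
As shown in FIG. 1, the present invention addresses the current gap in predicting the risk of diabetes after pancreatectomy from clinical features and preoperative image features, as well as the reliance of existing radiomics methods on manually delineated regions of interest and their poor dimensionality reduction and insufficient discriminative power on high-dimensional features, by proposing a system for predicting the risk of diabetes after pancreatectomy based on supervised deep subspace learning.
In the region-of-interest annotation stage, the residual pancreas ROI acquisition module obtains the predicted pancreas region and from it the residual pancreas region. Specifically, the preoperative pancreas CT image is segmented automatically with a deep convolutional neural network to obtain the complete predicted pancreas region; then, based on the surgical record or the tumor location, the surgical cutting plane is simulated in the medical image processing tool MITK, and the resulting residual pancreas region after resection is used as the residual pancreas region of interest for subsequent image feature calculation. In the feature extraction stage, the image feature calculation module computes the pancreatic image features: on the one hand, traditional radiomics features such as first-order statistics, shape features and higher-order texture features are extracted from the wavelet-filtered images; on the other hand, high-level semantic features of the residual pancreas region are obtained from the feature maps output by the deep convolutional neural network. Together these form the pancreatic image features. The clinical feature calculation module then extracts clinical information related to postoperative diabetes, including demographic information (sex, age, etc.), living habits (drinking, smoking, etc.), pancreatic volume resection rate, residual pancreatic volume and abdominal fat and muscle content, to build the clinical feature set. In the feature selection stage, the deep subspace learning module builds a supervised deep subspace learning network with an added supervised learning module, reduces and fuses the image and clinical features, obtains their sparse representation in a low-dimensional subspace, and computes from it the similarity matrix between features. The supervised learning module of the trained deep subspace learning network predicts the patient's risk of developing diabetes after surgery. The deep subspace learning network used in the present invention is shown in FIG. 4.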
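After acquiring the pre-pancreatectomy CT image data, the preoperative CT image data acquisition module preprocesses the CT image data and partitions the data set, as follows:
1) CT image resampling, gray-value discretization and image region cropping. The preoperative pancreas CT image is resampled to a spatial resolution of 1×1×1 mm; its HU values are clipped to [-100, 240] and then discretized to [0, 255]. The HU value, i.e. the CT value, is a unit of measurement for the density of a local tissue or organ of the human body, usually called the Hounsfield unit (HU); air is -1000 and dense bone is +1000. A bounding box enclosing the residual pancreas is then computed, a margin is set, and the CT image and the annotation image are cropped to that box. This step reduces the computation needed for the subsequent image features.
2) The cohort of patients who underwent distal pancreatectomy is randomly divided into a training set $S_{tr}$, a validation set $S_{val}$ and a test set $S_{tt}$ in proportions of 50%, 20% and 30%. Whether the patient developed diabetes after surgery is used as the ground-truth label $y$ of the prediction model: $y = 1$ indicates that the patient developed diabetes or impaired glucose tolerance after surgery, and $y = 0$ indicates normal blood glucose function.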
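The preprocessing and partitioning steps above can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: the function names, the default margin of 5 voxels, the rounding rule and the random seed are assumptions, resampling to 1×1×1 mm is omitted, and the mask is assumed to be non-empty.

```python
import numpy as np

def clip_and_discretize(hu, lo=-100.0, hi=240.0):
    """Clip HU values to [lo, hi] and map them linearly onto integers 0..255."""
    clipped = np.clip(np.asarray(hu, dtype=np.float64), lo, hi)
    return np.rint((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

def crop_to_mask(volume, mask, margin=5):
    """Crop volume and mask to the mask's bounding box, enlarged by `margin` voxels.

    `margin` is a hypothetical default; the source only says a margin is set.
    """
    coords = np.argwhere(mask > 0)
    start = np.maximum(coords.min(axis=0) - margin, 0)
    stop = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    box = tuple(slice(int(a), int(b)) for a, b in zip(start, stop))
    return volume[box], mask[box]

def split_cohort(n_patients, seed=0):
    """Random 50% / 20% / 30% split into training, validation and test indices."""
    order = np.random.default_rng(seed).permutation(n_patients)
    n_train = int(round(0.5 * n_patients))
    n_val = int(round(0.2 * n_patients))
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]
```

Cropping before feature extraction keeps every later filter and feature computation restricted to a small sub-volume, which is the saving the step is meant to provide.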
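The residual pancreas ROI acquisition module performs automatic segmentation of the pancreas CT image and extraction of the residual pancreas region of interest, as follows. Denote the preoperative pancreas CT volume as $I$, of size 512×512×L, where $L$ is the number of slices. $I$ is sliced along the axial plane to obtain a sequence of two-dimensional images. Every three consecutive images are combined into one three-channel pseudo-color image, denoted $I_{A,l}$ ($l = 1, \dots, L$). Each two-dimensional image $I_{A,l}$ undergoes contrast adjustment and cropping: the HU values are clipped to [-100, 240] and then normalized to [0, 1]. From each $I_{A,l}$ an image patch of size 448×448 is cropped and fed into the trained pancreas segmentation network to obtain the predicted pancreas region, denoted $R_0$.
The pancreas segmentation network may be a densely connected dilated convolutional network or another end-to-end fully convolutional neural network. In the present invention, the densely connected dilated convolutional network (DenseASPP) is selected. DenseASPP is a generalized densely connected network with a densely connected atrous spatial pyramid pooling (ASPP) layer. It encodes multi-scale information by concatenating feature maps from atrous convolutions with different dilation rates. Compared with the original ASPP, DenseASPP densely connects the outputs of the atrous convolution layers and obtains an increasingly large receptive field with reasonable dilation rates, so it can produce output feature maps that cover a larger receptive field in a very dense manner. The present invention builds the basic DenseASPP network from DenseNet161 followed by atrous convolution layers (see FIG. 2). The first part of DenseASPP is a feature generation block that outputs a basic feature map $y_0$ at 1/8 of the input image size. Specifically, it consists of one convolution layer, 4 dense blocks and 3 transition layers; the initial number of feature maps in the first dense block is 96 and the growth rate is 48. The second part of DenseASPP is a dense ASPP module built on the feature map $y_0$ from densely connected atrous convolution layers; there are 3 atrous convolution layers, with dilation rates $d$ of 3, 6 and 12.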
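The construction of the three-channel inputs described above can be sketched as follows. The text does not say how the first and last slices are handled or where the 448×448 patch is taken, so this sketch assumes edge slices are replicated and the patch is a centre crop; the function names are illustrative only.

```python
import numpy as np

def normalize_hu(volume, lo=-100.0, hi=240.0):
    """Clip HU values to [lo, hi] and rescale them to [0, 1]."""
    return (np.clip(np.asarray(volume, dtype=np.float64), lo, hi) - lo) / (hi - lo)

def make_three_channel_slices(volume):
    """Turn an (L, H, W) axial stack into (L, H, W, 3) images.

    Channel 1 is the slice itself; channels 0 and 2 are its neighbours.
    Assumption: the first and last slices are replicated at the borders.
    """
    padded = np.concatenate([volume[:1], volume, volume[-1:]], axis=0)
    return np.stack([padded[:-2], padded[1:-1], padded[2:]], axis=-1)

def center_crop(image, size=448):
    """Assumed centre crop of an (H, W, ...) image to (size, size, ...)."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]
```

Feeding neighbouring slices as channels gives a 2D network some through-plane context without the memory cost of a full 3D model.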
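On the automatically segmented predicted pancreas region, the pancreatic resection margin is simulated with MITK software according to the surgical record or the location of the pancreatic tumor, giving the residual pancreas region after resection, denoted $R$, which serves as the residual pancreas region of interest for subsequent image feature calculation.
The image feature calculation module computes the radiomics features on the residual pancreas region of interest. Because the original image of most of the pancreas region is usually piecewise constant and contains highly redundant signals, the present invention uses wavelet analysis to extract the high-frequency and low-frequency information of the image. Wavelet analysis uses wavelet functions to transform a signal into a space/frequency representation. After the wavelet transform, the image is decomposed into multi-resolution subspaces, and the corresponding wavelet coefficients reflect the high- and low-frequency signals of the original image. Wavelet filtering is applied to the pre-pancreatectomy CT image data with the wavelet bases db1, db5, sym7, coif3 and haar; using the MATLAB Wavelet Toolbox, the image is decomposed into high-frequency and low-frequency signals along each of the three directions. Specifically, wavedec3 from the toolbox is used for the decomposition. The 3D wavelet transform can be expressed as
[3D wavelet transform formula not reproduced in this text]
where the two operator symbols in the formula denote addition and convolution, respectively; $H$ and $L$ denote high-pass and low-pass filtering, and $x$, $y$ and $z$ denote the three coordinate axes.
Under each wavelet basis, 8 decomposition coefficients are obtained (LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH), so 40 filtered images are obtained in total (5 wavelet bases × 8 sub-bands). From the wavelet-filtered images and the residual pancreas region of interest, first-order statistics, shape features and texture features (GLCM, GLRLM, NGTDM, GLDM) are computed with the Pyradiomics toolkit; the feature names in each class are listed in Table 1. 85 features can be computed from each filtered image. Denote a computed feature vector as $X_f$, where $f$ is the specific feature name. The final result is a 3400-dimensional feature (40 × 85).
Table 1. Image feature names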
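To make the eight sub-band names concrete, the sketch below implements a single-level 3D Haar decomposition in plain NumPy. It is an illustration only: the source uses MATLAB's wavedec3 with five wavelet bases, whereas this sketch covers just the Haar case, assumes even array sizes, and assumes that the letters of each sub-band name follow array axes 0, 1, 2 in order.

```python
import numpy as np

def haar_step(x, axis):
    """One Haar analysis step along `axis`: returns (low-pass, high-pass) halves."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt3(volume):
    """Single-level 3D Haar transform; returns a dict keyed 'LLL' ... 'HHH'."""
    bands = {"": np.asarray(volume, dtype=np.float64)}
    for axis in range(3):
        split = {}
        for name, data in bands.items():
            low, high = haar_step(data, axis)
            split[name + "L"] = low
            split[name + "H"] = high
        bands = split
    return bands

# Feature count stated in the text: 5 wavelet bases x 8 sub-bands x 85 features.
WAVELETS = ["db1", "db5", "sym7", "coif3", "haar"]
N_FILTERED_IMAGES = len(WAVELETS) * 8
N_RADIOMICS_FEATURES = N_FILTERED_IMAGES * 85
```

Each sub-band would then be passed, together with the residual pancreas mask, to the radiomics feature extractor; that step is not shown here.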
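Next, the input to the fully connected layer of the pancreas segmentation network is extracted as the high-level semantic features, and its mean over all pixels in the residual pancreas region of interest is computed. In the present invention, the densely connected dilated convolutional network (DenseASPP) is selected as the pancreas segmentation network; the pre-pancreatectomy CT image data, after contrast adjustment, is fed into the trained segmentation network. After forward propagation, the input to the last fully connected layer of the network is taken as the extracted high-level semantic features, and their mean over all pixels in the region of interest is computed, giving a 1488-dimensional feature.
The image features of all CT image data are stretched to the range [0, 1], i.e. $X_f = (X_f - \min(X_f)) / (\max(X_f) - \min(X_f))$, where $X_f$ denotes a feature vector whose length is the number $n$ of CT image data. The wavelet-based feature vector and the high-level semantic feature vector are concatenated to give the image features $X_{img} = \mathrm{concat}\big((X_{radiomics}, X_{deep}),\ \mathrm{axis}=1\big)$, with $X_{img} \in \mathbb{R}^{n \times d_1}$. Here $d_1$ is the dimension of the image features, $n$ is the number of CT image data, $X_{radiomics}$ is the filtered feature vector, i.e. the radiomics features, and $X_{deep}$ is the high-level semantic feature vector.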
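The min-max scaling and concatenation can be written in a few lines of NumPy. This is a sketch under assumptions: features are stored with one row per CT case and one column per feature, and a constant feature (zero range) is mapped to 0 rather than dividing by zero, a case the source does not address.

```python
import numpy as np

def min_max_scale(features):
    """Scale each column (one feature across the n cases) to [0, 1]."""
    features = np.asarray(features, dtype=np.float64)
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # assumption: constant features become all zeros
    return (features - lo) / span

def build_image_features(x_radiomics, x_deep):
    """Concatenate scaled radiomics and deep features along the feature axis."""
    return np.concatenate((min_max_scale(x_radiomics), min_max_scale(x_deep)), axis=1)
```

With the dimensions given in the text, concatenating 3400 radiomics columns and 1488 deep-feature columns would give 4888 image-feature columns per case.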
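The clinical feature calculation module comprises a body composition feature calculation module, a clinical information feature calculation module and a pancreatic resection feature calculation module.
The body composition feature calculation module extracts the cross-sectional image at the level of the third vertebra from the CT volume, on which the outer body boundary (boundary 1), the boundary between subcutaneous fat and skeletal muscle (boundary 2) and the boundary between skeletal muscle and the abdominal cavity (boundary 3) are annotated manually (see FIG. 3). The areas of visceral fat, subcutaneous fat and skeletal muscle are computed from the CT HU values. Specifically, the HU range of adipose tissue is set to [-190, -30] and that of muscle tissue to [-29, 150]. The region between boundary 1 and boundary 2 is where subcutaneous fat lies; thresholding with the adipose HU range extracts the subcutaneous adipose tissue region SAT. The region inside boundary 3 is where visceral fat lies; thresholding with the adipose HU range extracts the visceral adipose tissue region VAT. The region between boundary 2 and boundary 3 is where skeletal muscle lies; the skeletal muscle region SKM is extracted with the muscle HU range. For the three extracted tissue regions, the areas $S_i$, $i \in \{SAT, VAT, SKM\}$, and the total fat area $S_{AT} = S_{VAT} + S_{SAT}$ are computed. In addition, the ratio of visceral fat to skeletal muscle, $S_{VAT}/S_{SKM}$, and the ratio of visceral fat to subcutaneous fat, $S_{VAT}/S_{SAT}$, are computed. The body composition features are $X_{composition} = \{S_{VAT}, S_{SAT}, S_{SKM}, S_{VAT}/S_{SAT}, S_{VAT}/S_{SKM}, S_{AT}\}$.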
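The thresholding step can be sketched as below, assuming the three manually annotated boundaries have already been converted into three boolean region masks on the selected slice (that conversion is not shown). The function names and the `pixel_area_mm2` parameter are illustrative, and the ratios assume non-zero subcutaneous fat and muscle areas.

```python
import numpy as np

FAT_HU = (-190, -30)     # adipose tissue range given in the text
MUSCLE_HU = (-29, 150)   # muscle tissue range given in the text

def tissue_area(hu_slice, region_mask, hu_range, pixel_area_mm2=1.0):
    """Area of pixels inside `region_mask` whose HU value lies in `hu_range`."""
    lo, hi = hu_range
    region = np.asarray(region_mask, dtype=bool)
    tissue = region & (hu_slice >= lo) & (hu_slice <= hi)
    return float(np.count_nonzero(tissue)) * pixel_area_mm2

def body_composition(hu_slice, subcutaneous_region, muscle_region, visceral_region,
                     pixel_area_mm2=1.0):
    """Return the six body composition features named in the text."""
    s_sat = tissue_area(hu_slice, subcutaneous_region, FAT_HU, pixel_area_mm2)
    s_skm = tissue_area(hu_slice, muscle_region, MUSCLE_HU, pixel_area_mm2)
    s_vat = tissue_area(hu_slice, visceral_region, FAT_HU, pixel_area_mm2)
    return {
        "S_VAT": s_vat,
        "S_SAT": s_sat,
        "S_SKM": s_skm,
        "S_VAT/S_SAT": s_vat / s_sat,
        "S_VAT/S_SKM": s_vat / s_skm,
        "S_AT": s_vat + s_sat,
    }
```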
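The clinical information feature calculation module acquires the patient's basic clinical information, including demographic features (sex, age, etc.) and living habits (smoking, drinking, etc.), which make up the clinical information features $X_{info}$.
The pancreatic resection feature calculation module uses the predicted pancreas region and the residual pancreas region of interest to compute the preoperative pancreatic volume $V_0$ and the postoperative volume $V_1$, computes the pancreatic resection ratio $\text{resect\_rate} = (V_0 - V_1)/V_0$, and builds the pancreatic resection features $X_{resect} = \{(V_0 - V_1)/V_0,\ V_1\}$.
The results of the body composition feature calculation module, the clinical information feature calculation module and the pancreatic resection feature calculation module are concatenated to form the clinical features, i.e. $X_{clinic} = \{X_{composition}, X_{info}, X_{resect}\}$, where $d_2$ is the dimension of the clinical features.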
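A small sketch of the resection features and the final concatenation follows. It assumes volumes are obtained by counting mask voxels and multiplying by the voxel volume; the function names and the ordering of the concatenated vector are illustrative choices, not specified by the source.

```python
import numpy as np

def resection_features(pancreas_mask, remnant_mask, voxel_volume_mm3=1.0):
    """Return (resection ratio, remnant volume) from the two binary masks."""
    v0 = float(np.count_nonzero(pancreas_mask)) * voxel_volume_mm3  # preoperative volume V0
    v1 = float(np.count_nonzero(remnant_mask)) * voxel_volume_mm3   # postoperative volume V1
    return (v0 - v1) / v0, v1

def clinical_vector(x_composition, x_info, x_resect):
    """Concatenate the three feature groups into one clinical feature vector."""
    return np.concatenate([np.asarray(x_composition, dtype=np.float64),
                           np.asarray(x_info, dtype=np.float64),
                           np.asarray(x_resect, dtype=np.float64)])
```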
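The image feature set and clinical feature set described above contain features that are highly redundant or unrelated to the target, so feature dimensionality reduction is required. The image features and clinical features also need to be fused. The present invention carries out deep subspace learning in the deep subspace learning module to achieve feature dimensionality reduction and fusion.
The deep subspace learning module comprises an autoencoder-based deep subspace learning network, built as follows. First, an autoencoder network AE is designed. An ordinary autoencoder generally consists of an encoder and a decoder and learns an efficient representation of the input data through unsupervised learning. This efficient representation of the input is called the coding; its dimension is generally much smaller than that of the input, so an autoencoder can be used for dimensionality reduction. In the present invention, in order to use the autoencoder for subspace clustering, a latent-variable self-expression layer is added to the encoder. The AE therefore consists of an encoder E, a decoder D and a latent-variable self-expression layer. The encoder E comprises three convolution-activation layers, the decoder D consists of three corresponding convolution-activation layers, and the latent-variable self-expression layer comprises one fully connected layer. A supervised learning module is then added to the autoencoder network: a fully connected layer followed by an activation function is attached to the latent variable $Z$ output by the encoder to obtain the predicted label.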
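The data flow through the four components can be sketched as a NumPy forward pass. This is a simplified stand-in, not the patented network: plain dense layers replace the three convolution-activation layers, samples are stored as rows so that self-expression is the product `C @ Z`, and feeding the decoder with the self-expressed code follows common deep subspace clustering practice rather than anything stated in the source. All names and layer sizes are hypothetical.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(X, enc_weights, C, dec_weights, W_s, b_s):
    """Encoder -> self-expression -> decoder, plus a supervised head on Z."""
    Z = X
    for W in enc_weights:                 # encoder: three layers with activation
        Z = relu(Z @ W)
    C0 = C - np.diag(np.diag(C))          # enforce diag(C) = 0
    Z_self = C0 @ Z                       # each sample expressed by the others
    X_hat = Z_self
    for W in dec_weights[:-1]:            # decoder mirrors the encoder
        X_hat = relu(X_hat @ W)
    X_hat = X_hat @ dec_weights[-1]       # assumption: linear output layer
    y_prob = softmax(Z @ W_s + b_s)       # fully connected layer + softmax on Z
    return Z, Z_self, X_hat, y_prob
```

Because `C` has one row and one column per sample, this formulation processes the whole data set as a single batch, which is a practical consequence of any self-expression layer.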
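Denote the data input to the deep subspace learning network as $X = \{X_{img}, X_{clinic}\}$. The encoder E outputs the latent variable $Z$, where $\Theta_e$ denotes the encoder parameters; the decoder then outputs $\hat{X}$, where $\Theta_d$ denotes the decoder parameters. Let $S$ be the similarity matrix that characterizes the similarity between data, whose element $S_{ij}$ is the coefficient with which the data $X_i$ can be represented by the reconstructed data $\hat{X}_j$. So that the difference between the data $\hat{X}$ reconstructed by encoding and decoding and the original data $X$ is as small as possible, the reconstruction loss $L_0$ of the network is defined as
[reconstruction-loss formula not reproduced in this text]
where $D$ is a diagonal matrix and $L = D - S$ is the Laplacian matrix. $\mathrm{Tr}$ denotes the matrix trace, the superscript $T$ the matrix transpose, $\operatorname{diag}(\cdot)$ a diagonal matrix, and $n$ the number of samples.
In subspace clustering, a sample is assumed to be linearly representable by a dictionary formed from the data itself, i.e.
$X = XZ, \quad \text{s.t. } \operatorname{diag}(Z) = 0$
where $Z$ is the representation matrix. A similarity graph describing the relations among the data can then be constructed as a similarity matrix. Because the relations among high-dimensional data may be nonlinear and cannot be expressed linearly, the present invention first projects the data $X$ into the latent space with the encoder of the autoencoder and assumes that the representation $Z$ in the low-dimensional latent space can be linearly represented by itself. This representation is also required to be sparse. The loss function of the self-expression layer can therefore be defined as
[self-expression loss formula not reproduced in this text]
where the similarity matrix $S$ is computed from the self-representation coefficient matrix $C$. The first term of the loss requires the coefficient matrix $C$ to be sparse, i.e. the selected features are sparse. The second term requires that data in the same subspace be self-representable, i.e. linearly representable by the other data in the same subspace. $\|\cdot\|_1$ denotes the L1 norm, $\|\cdot\|_F$ the Frobenius norm, and $\alpha$ the regularization coefficient.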
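For orientation, one form of the self-expression loss that matches the verbal description above (a sparsity term first, a self-representation term second, weighted by $\alpha$) is given below. It is an assumption, since the formula itself is not available in this text; the label $L_{\text{self}}$ is chosen for this sketch, and the symmetrization used for $S$ is a common choice in the subspace clustering literature rather than something the source states.

$$
L_{\text{self}} = \|C\|_1 + \alpha\,\|Z - ZC\|_F^2
\quad \text{s.t. } \operatorname{diag}(C) = 0,
\qquad
S = \tfrac{1}{2}\left(|C| + |C|^{T}\right)
$$

Here the columns of $Z$ are taken to be the samples, so that $ZC$ expresses each sample as a combination of the others, and the constraint $\operatorname{diag}(C) = 0$ rules out the trivial solution in which every sample represents itself.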
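The matrix $D$ above is normalized; denote the normalized diagonal matrix as $D_n = I$ and the corresponding normalized Laplacian matrix as $L_n = D^{-1/2} L D^{-1/2}$. Substituting into $L_0$, $L_0$ can be rewritten as
[rewritten reconstruction-loss formula not reproduced in this text]
Traditional autoencoder learning is unsupervised. To strengthen the specificity of the selected features, a supervised learning module is added to the autoencoder. Denote the parameters of the added fully connected layer as $\Theta_s$ and the activation function as the softmax function $\sigma$; passing the latent variable $Z$ through the fully connected layer and the activation function gives the prediction $\hat{y}$. With the true label of the data denoted $y$, the loss of the supervised learning module is the cross-entropy between $\hat{y}$ and $y$, $\mathrm{BCE}(\hat{y}, y)$, where $\mathrm{BCE}(\cdot)$ denotes the cross-entropy loss.
In summary, the loss function of the deep subspace learning network is defined as
[loss-function formula not reproduced in this text]
s.t. $\operatorname{diag}(C) = 0$
where $\Theta$ denotes all parameters of the network, including the encoder parameters $\Theta_e$, the self-expression layer parameters $C$, the supervised module parameters $\Theta_s$ and the decoder parameters $\Theta_d$; $\alpha$, $\gamma_1$ and $\gamma_2$ are regularization coefficients.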
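As a rough guide to how the three parts combine, the sketch below evaluates one possible total loss in NumPy. Its exact form is an assumption: it takes the reconstruction term as $\tfrac{1}{2}\sum_{ij} S_{ij}\|X_i - \hat{X}_j\|^2$, the self-expression term as $\|C\|_1 + \alpha\|Z - CZ\|_F^2$ (samples as rows), and weights the self-expression and supervised terms by $\gamma_1$ and $\gamma_2$. The source's own formula is not available in this text and may differ in coefficients or arrangement.

```python
import numpy as np

def similarity_from_coefficients(C):
    """Assumed symmetrization S = (|C| + |C|^T) / 2."""
    A = np.abs(C)
    return 0.5 * (A + A.T)

def reconstruction_loss(X, X_hat, S):
    """0.5 * sum_ij S_ij * ||x_i - x_hat_j||^2, with samples as rows."""
    sq = ((X[:, None, :] - X_hat[None, :, :]) ** 2).sum(axis=2)
    return 0.5 * float((S * sq).sum())

def self_expression_loss(Z, C, alpha):
    """||C||_1 + alpha * ||Z - C Z||_F^2, with samples as rows of Z."""
    return float(np.abs(C).sum() + alpha * ((Z - C @ Z) ** 2).sum())

def bce(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy averaged over samples."""
    p = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def total_loss(X, X_hat, Z, C, y, y_prob, alpha, gamma1, gamma2):
    C = C - np.diag(np.diag(C))           # enforce diag(C) = 0
    S = similarity_from_coefficients(C)
    return (reconstruction_loss(X, X_hat, S)
            + gamma1 * self_expression_loss(Z, C, alpha)
            + gamma2 * bce(y, y_prob))
```

In training, every argument except `X` and `y` would be produced by the network, and the loss would be minimized with a gradient-based optimizer; the sketch only shows how the terms are evaluated.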
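Using the training set $S_{tr}$ and the validation set $S_{val}$, the network structure hyperparameters and the model parameters, including the number of neurons, the number of network layers, the learning rate, the batch size, the number of iterations and the regularization coefficients $\gamma_1$ and $\gamma_2$, are set by grid search. The subspace learning network is trained on the training set $S_{tr}$ with the ADAM method used to optimize the network parameters, giving the trained deep subspace learning network model.
The test data $X_{tt}$ is fed into the subspace learning network model to obtain the representation $Z$ of the data in the low-dimensional subspace and the self-representation $ZC$ of $Z$. The supervised learning module of the subspace learning network predicts the class of $ZC$; the predicted values are compared with the labels $y_{tt}$ and ROC analysis is performed. The cut-off point is determined as the point on the ROC curve with the largest positive likelihood ratio, and the sensitivity, specificity, area under the ROC curve (AUC) and the accuracy of assessing pathological response are calculated. The larger the AUC value, the higher the evaluation accuracy of the system of the present invention.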
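The ROC analysis can be sketched in plain Python. The positive likelihood ratio is sensitivity / (1 - specificity), i.e. TPR / FPR. The source does not say how to treat points with FPR = 0, where the ratio is unbounded; this sketch treats them as infinite and breaks ties by higher sensitivity, which is an assumption, as are the function names.

```python
import math

def roc_points(y_true, scores):
    """(threshold, TPR, FPR) per distinct score; positive when score >= threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((t, tp / pos, fp / neg))
    return points

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def best_cutoff(y_true, scores):
    """ROC point with the largest positive likelihood ratio TPR / FPR."""
    def lr_plus(tpr, fpr):
        if fpr == 0:
            return math.inf if tpr > 0 else 0.0
        return tpr / fpr
    return max(roc_points(y_true, scores), key=lambda p: (lr_plus(p[1], p[2]), p[1]))
```

For the returned point, sensitivity is the TPR and specificity is 1 - FPR.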
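Specific application example: a distal pancreatectomy cohort was constructed containing data from 212 patients, 65 of whom developed diabetes after surgery. The patients' preoperative CT images and basic characteristics from the electronic medical records were extracted. The method was validated on this cohort with five-fold cross-validation. All data were divided into 5 parts, numbered 1, 2, 3, 4 and 5. The first experiment trained on parts 2, 3, 4 and 5 and tested on part 1; the second trained on parts 1, 3, 4 and 5 and tested on part 2; and so on. The accuracies of the 5 experiments were then averaged. The data were processed in the following steps:
1. Image processing. The pancreas region of the CT image is segmented with the densely connected dilated convolutional network. MITK software is then used to simulate the pancreatic resection margin and obtain the postoperative residual pancreas region of interest.
2. Feature calculation. For the residual pancreas region of interest, the wavelet-filtering-based image features and the deep-learning-based high-level semantic features are computed to form the image feature set. The body composition features and pancreatic resection features are computed and the patient clinical information is extracted to form the clinical feature set.
3. The training set data are fed into the subspace learning network to train the network model.
4. The test data are fed into the subspace learning network to obtain the predicted diabetes risk value.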
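The five-fold protocol used in this example can be sketched with the standard library alone. The shuffling, the seed and the round-robin assignment to folds are assumptions; the source only states that the data are divided into five parts and that each part serves once as the test set.

```python
import random

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each fold in turn."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

With 212 patients this gives test folds of 43, 43, 42, 42 and 42 cases; the reported metric would be the mean over the five test folds.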
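The five-fold cross-validation results show that the features selected by the subspace learning network comprise 36 image features and 4 clinical features. The clinical features are alcohol consumption, muscle content, age and residual pancreatic volume; the image features comprise 9 db5-filtered features, 8 sym7-filtered features and 19 haar-filtered features. The model for predicting the risk of diabetes after pancreatectomy achieved an AUC of 0.824. The method of the present invention mines image and clinical features jointly, and the clinical variables it identifies agree with the relevant factors reported in the literature, which demonstrates the effectiveness of the method in screening diabetes-related risk factors.
The above embodiments are intended to explain the present invention, not to limit it; any modification or change made to the present invention within its spirit and the scope of protection of the claims falls within the scope of protection of the present invention.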

Claims (8)

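  1. A pancreatic postoperative diabetes prediction system based on supervised deep subspace learning, characterized in that the system comprises a preoperative CT image data acquisition module, a residual pancreas region-of-interest (ROI) acquisition module, an image feature calculation module, a clinical feature calculation module and a deep subspace learning module;
    the preoperative CT image data acquisition module is configured to acquire CT image data taken before pancreatectomy and pass it to the residual pancreas ROI acquisition module and the image feature calculation module;
    the residual pancreas ROI acquisition module is configured to feed the preoperative CT image data into a trained pancreas segmentation network to obtain a predicted pancreas region; on the predicted pancreas region, the pancreatic resection margin is simulated in software to obtain the residual pancreas region after resection, which serves as the region of interest for subsequent image feature calculation and is passed to the image feature calculation module;
    the image feature calculation module is configured to compute pancreatic image features from the preoperative CT image data and the region of interest, and pass them to the deep subspace learning module;
    the clinical feature calculation module is configured to acquire clinical information related to postoperative diabetes, including demographic information, living habits, pancreatic volume resection rate, residual pancreatic volume and abdominal fat and muscle content features, concatenate these into clinical features, and pass them to the deep subspace learning module;
    the deep subspace learning module performs feature dimensionality reduction and fusion through a deep subspace learning network, the deep subspace learning network comprising an encoder, a latent-variable self-expression layer and a decoder, with supervised learning applied to the latent-variable self-expression layer; the deep subspace learning network takes the pancreatic image features and clinical features as input, the encoder outputs latent variables, and a fully connected layer followed by an activation function is attached to the latent variables output by the encoder to obtain the predicted value of diabetes risk; the loss function of the deep subspace learning network in the deep subspace learning module is:
    [loss-function formula not reproduced in this text]
    s.t. $\operatorname{diag}(C) = 0$
    where $X = \{X_{img}, X_{clinic}\}$, $X_{img}$ denotes the image features, $X_{clinic}$ the clinical features, $\hat{X}$ the decoder output, $y$ the patient's true postoperative diabetes status, $\hat{y}$ the diabetes risk predicted by the model, $Z$ the latent variables output by the encoder, and $L$ the Laplacian matrix; $\mathrm{Tr}$ denotes the matrix trace and the superscript $T$ the matrix transpose; $\Theta$ denotes all parameters of the network, including the encoder parameters $\Theta_e$, the self-representation coefficient matrix $C$, the supervised module parameters $\Theta_s$ and the decoder parameters $\Theta_d$; $\alpha$, $\gamma_1$ and $\gamma_2$ are regularization coefficients; $\|\cdot\|_F$ denotes the Frobenius norm and $\mathrm{BCE}(\cdot)$ the cross-entropy loss.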
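  2. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 1, characterized in that, after acquiring the pre-pancreatectomy CT image data, the preoperative CT image data acquisition module clips the HU values of the CT image data to [-100, 240] and then discretizes them to [0, 255]; a bounding box enclosing the residual pancreas is computed, a margin is set, and the CT image data and the residual pancreas annotation image are cropped to that box.
  3. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 1, characterized in that, in the residual pancreas ROI acquisition module, the preoperative pancreas CT image is segmented automatically with a deep convolutional neural network to obtain the complete predicted pancreas region; based on the surgical record or the tumor location, the surgical cutting plane is simulated in the medical image processing toolkit MITK, and the resulting residual pancreas region after resection is used as the residual pancreas region of interest for subsequent image feature calculation.
  4. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 3, characterized in that, in the residual pancreas ROI acquisition module, the pancreas segmentation network is a densely connected dilated convolutional network.
  5. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 1, characterized in that the image feature calculation module is configured to filter the preoperative CT image data and, from the filtered images and the residual pancreas region of interest, compute a first-order statistics feature vector, a shape feature vector and a texture feature vector, the three being concatenated to give the filtered feature vector; from the input to the fully connected layer of the pancreas segmentation network, the mean feature value over all pixels in the residual pancreas region of interest is computed and standardized to give the high-level semantic feature vector; and the filtered feature vector and the high-level semantic feature vector are concatenated to give the pancreatic image features.
  6. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 5, characterized in that the filtered feature vectors of all CT image data are processed as follows:
    $X_f = \dfrac{X_f - \min(X_f)}{\max(X_f) - \min(X_f)}$
    where $X_f$ denotes a feature vector, $f$ denotes the specific feature name, and the vector length is the number $n$ of CT image data.
  7. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 6, characterized in that the filtered feature vector and the high-level semantic feature vector are concatenated to give the image features $X_{img}$:
    $X_{img} = \mathrm{concat}\big((X_{radiomics}, X_{deep}),\ \mathrm{axis}=1\big)$
    where $X_{img} \in \mathbb{R}^{n \times d_1}$, $d_1$ is the dimension of the image features, $n$ is the number of CT image data, $X_{radiomics}$ is the filtered feature vector, i.e. the radiomics features, and $X_{deep}$ is the high-level semantic feature vector.
  8. The pancreatic postoperative diabetes prediction system based on supervised deep subspace learning according to claim 1, characterized in that the clinical feature calculation module comprises a body composition feature calculation module, a clinical information feature calculation module and a pancreatic resection feature calculation module;
    the body composition feature calculation module is configured to compute the areas of visceral fat, subcutaneous fat and skeletal muscle on the cross-sectional image at the level of the third vertebra of the CT volume, and to compute the ratio of visceral fat to skeletal muscle and the ratio of visceral fat to subcutaneous fat, to obtain the body composition features;
    the clinical information feature calculation module is configured to acquire the patient's basic clinical information, including demographic features and living habits, which make up the clinical information features;
    the pancreatic resection feature calculation module computes the preoperative and postoperative pancreatic volumes and the pancreatic resection ratio, which make up the pancreatic resection features;
    the results of the body composition feature calculation module, the clinical information feature calculation module and the pancreatic resection feature calculation module are concatenated to form the clinical features, which are passed to the deep subspace learning module.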
PCT/CN2023/089985 2022-04-29 2023-04-23 Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning WO2023207820A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210466102.7 2022-04-29
CN202210466102.7A CN114565613B (zh) 2022-04-29 2022-04-29 Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning

Publications (1)

Publication Number Publication Date
WO2023207820A1 true WO2023207820A1 (zh) 2023-11-02

Family

ID=81721134

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/089985 WO2023207820A1 (zh) 2022-04-29 2023-04-23 Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning

Country Status (2)

Country Link
CN (1) CN114565613B (zh)
WO (1) WO2023207820A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
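CN117975004A (zh) * 2024-01-25 2024-05-03 扬州大学 Field ridge segmentation method based on an encoder-decoder architecture combined with strip pooling and ASPP
CN118154593A (zh) * 2024-05-10 2024-06-07 吉林大学 Data-analysis-based system for detecting anastomotic complications after rectal surgery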

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
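CN114565613B (zh) * 2022-04-29 2022-08-23 之江实验室 Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning
CN115131642B (zh) * 2022-08-30 2022-12-27 之江实验室 Multimodal medical data fusion system based on multi-view subspace clustering
CN116309385B (zh) * 2023-02-27 2023-10-10 之江实验室 Method and system for measuring abdominal fat and muscle tissue based on weakly supervised learning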

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2313832C1 (ru) * 2006-05-16 2007-12-27 Государственное учреждение Научный центр реконструктивной и восстановительной хирургии Восточно-Сибирского научного центра Сибирского отделения Российской Академии медицинских наук Способ моделирования пострезекционной гипергликемии
US20190019300A1 (en) * 2015-05-26 2019-01-17 Memorial Sloan-Kettering Cancer Center System, method and computer-accessible medium for texture analysis of hepatopancreatobiliary diseases
CN110599461A (zh) * 2019-08-21 2019-12-20 东南大学 一种基于子空间特征学习的丘脑功能分区方法
CN113160229A (zh) * 2021-03-15 2021-07-23 西北大学 基于层级监督级联金字塔网络的胰腺分割方法及装置
CN113284151A (zh) * 2021-06-07 2021-08-20 山东澳望德信息科技有限责任公司 一种基于深度卷积神经网络的胰腺分割方法及系统
CN113570619A (zh) * 2021-07-13 2021-10-29 清影医疗科技(深圳)有限公司 基于人工智能的计算机辅助胰腺病理图像诊断系统
CN113870258A (zh) * 2021-12-01 2021-12-31 浙江大学 一种基于对抗学习的无标签胰腺影像自动分割系统
CN114565613A (zh) * 2022-04-29 2022-05-31 之江实验室 基于有监督深度子空间学习的胰腺术后糖尿病预测系统

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
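CN110047082B (zh) * 2019-03-27 2023-05-16 深圳大学 Deep-learning-based automatic segmentation method and system for pancreatic neuroendocrine tumors
CN111739076B (zh) * 2020-06-15 2022-09-30 大连理工大学 Unsupervised content-preserving domain adaptation method for recognition of multiple CT lung textures
CN111915596A (zh) * 2020-08-07 2020-11-10 杭州深睿博联科技有限公司 Method and device for predicting whether pulmonary nodules are benign or malignant
CN112164067A (zh) * 2020-10-12 2021-01-01 西南科技大学 Medical image segmentation method and device based on multimodal subspace clustering
CN113113140B (zh) * 2021-04-02 2022-09-23 中山大学 Diabetes early-warning method, system, device and storage medium based on a self-supervised DNN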

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
" Master's Thesis ", 15 April 2021, NANJING UNIVERSITY, China, article WU, TING: "Convolutional Neural Network and Variation Based Segmentation Algorithm of Pancreas and Cyst", pages: 1 - 64, XP009550043, DOI: 10.27235/d.cnki.gnjiu.2020.002020 *
LI FEIYAN; LI WEISHENG; SHU YUCHENG; QIN SHENG; XIAO BIN; ZHAN ZIWEI: "Multiscale receptive field based on residual network for pancreas segmentation in CT images", BIOMEDICAL SIGNAL PROCESSING AND CONTROL, ELSEVIER, AMSTERDAM, NL, vol. 57, 20 December 2019 (2019-12-20), NL , XP086012360, ISSN: 1746-8094, DOI: 10.1016/j.bspc.2019.101828 *

Also Published As

Publication number Publication date
CN114565613A (zh) 2022-05-31
CN114565613B (zh) 2022-08-23

Similar Documents

Publication Publication Date Title
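WO2023207820A1 (zh) Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning
Basheera et al. A novel CNN based Alzheimer’s disease classification using hybrid enhanced ICA segmented gray matter of MRI
CN113674281B (zh) Automatic liver CT segmentation method based on deep shape learning
US9980704B2 (en) Non-invasive image analysis techniques for diagnosing diseases
US20170249739A1 (en) Computer analysis of mammograms
CN115131642B (zh) Multimodal medical data fusion system based on multi-view subspace clustering
CN111553892B (zh) Deep-learning-based pulmonary nodule segmentation and calculation method, device and system
Yao et al. Automated hematoma segmentation and outcome prediction for patients with traumatic brain injury
CN112263217B (zh) Method for detecting lesion regions in non-melanoma skin cancer pathology images based on an improved convolutional neural network
Ye et al. Medical image diagnosis of prostate tumor based on PSP-Net+ VGG16 deep learning network
CN112884759B (zh) Method and related device for detecting axillary lymph node metastasis status in breast cancer
CN112638262B (zh) Similarity determination device, method and program
CN112819765A (zh) Liver image processing method
Irene et al. Segmentation and approximation of blood volume in intracranial hemorrhage patients based on computed tomography scan images using deep learning method
CN113539476A (zh) Artificial-intelligence-based auxiliary diagnosis method and system for Raman images of gastric endoscopic biopsies
CN112861881A (zh) Honeycomb lung recognition method based on an improved MobileNet model
CN116740386A (zh) Image processing method, apparatus, device and computer-readable storage medium
CN115409812A (zh) Automatic CT image classification method based on a fused temporal attention mechanism
Armand et al. Transformers Effectiveness in Medical Image Segmentation: A Comparative Analysis of UNet-Based Architectures
CN114822842A (zh) Method and system for predicting the T stage of colorectal cancer from magnetic resonance imaging
CN114445374A (zh) Image feature processing method and system based on diffusion kurtosis imaging MK maps
CN113889235A (zh) Unsupervised feature extraction system for three-dimensional medical images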
Mousavi Moghaddam et al. Lung parenchyma segmentation from CT images with a fully automatic method
Reddy et al. Detection and prediction of lung cancer using different algorithms
Fathima et al. Deep Learning and Machine Learning Approaches for Brain Tumor Detection and Classification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795253

Country of ref document: EP

Kind code of ref document: A1