CN118553407A - Lung tumor diagnosis and prediction system based on multi-mode deep learning - Google Patents


Info

Publication number: CN118553407A
Application number: CN202410665633.8A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Prior art keywords: model, data, features, pathological, clinical
Inventors: 孙艳芹, 林潍轩, 王永炫, 成旭晨, 周锶琪, 黄建迪
Current Assignee: Guangdong Medical University
Original Assignee: Guangdong Medical University
Application filed by Guangdong Medical University
Priority to CN202410665633.8A


Classifications

    • G16H50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06N3/048: Activation functions
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06T7/0012: Biomedical image inspection
    • G06V10/764: Image or video recognition using classification, e.g. of video objects
    • G06V10/765: Classification using rules for classification or partitioning the feature space
    • G06V10/806: Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V10/82: Image or video recognition using neural networks
    • G16B25/00: ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B40/00: ICT specially adapted for biostatistics; bioinformatics-related machine learning or data mining
    • G16H10/60: Patient-specific healthcare data, e.g. electronic patient records
    • G16H30/40: Processing of medical images, e.g. editing
    • G16H50/70: Mining of medical data, e.g. analysing previous cases of other patients
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30096: Tumor; Lesion
    • Y02A90/10: ICT supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The present application provides a lung tumor diagnosis and prediction system based on multimodal deep learning, comprising: an input unit that acquires at least two of a patient's lung tumor pathological data, transcription data and clinical prognosis data; a preprocessing unit that separately preprocesses the pathological data, the transcription data and the clinical prognosis data; a hierarchical processing unit that inputs the preprocessed data into a multimodal prediction model, which predicts result features for each modality; a fusion processing unit that fuses these features through the multimodal prediction model to obtain a tumor prediction result; and a survival time prediction unit that inputs the pathology input image into a survival time prediction model to obtain the final survival time prediction for the patient. The application uses deep learning to effectively fuse lung tumor pathology, transcriptome and clinical prognosis data, improving diagnostic accuracy and model interpretability; the survival time prediction can provide doctors with results in real time and can be integrated effectively into the current clinical workflow for lung tumor diagnosis and treatment.

Description

A lung tumor diagnosis and prediction system based on multimodal deep learning

Technical Field

The present application relates to the field of tumor diagnosis, and in particular to a lung tumor diagnosis and prediction system based on multimodal deep learning.

Background Art

With advances in medical imaging technology and genomics, the diagnosis and prognosis prediction of lung tumors increasingly depend on multi-dimensional data analysis. Although medical images such as CT and MRI provide important information on lung tumor morphology, distinguishing benign from malignant tumors and predicting prognosis from imaging data alone remains challenging, and clinicopathological diagnosis is still the clinical gold standard for determining whether a lung tumor is benign or malignant. In recent years, deep learning has achieved remarkable breakthroughs in medical image analysis, but for pathological images, research on classifying lung tumors as benign or malignant remains insufficient, mainly because of the complex tissue structures and texture features of pathological images and the subtle differences between pathological types. Combining data from multiple modalities for comprehensive analysis has therefore become a key research direction. Beyond medical imaging omics, omics data such as pathomics and transcriptomics can also reveal the microstructure and gene-expression information within tumors, improving diagnosis and prognosis prediction.

At present, deep learning has seen initial application in lung tumor diagnosis and prognosis prediction, such as convolutional neural networks (CNNs) for CT image feature extraction and classification, and recurrent neural networks (RNNs) for modeling time-series medical data to predict lung tumor prognosis. However, these approaches focus mainly on single-modality data and fail to fully exploit the correlation and complementarity among multimodal data, so there is still room for improvement in accurately distinguishing benign from malignant lung tumors and in predicting prognosis.

Although some techniques attempt to integrate multimodal data, the heterogeneity and complexity of the data make the integration unsatisfactory, limiting its application in lung tumor diagnosis and treatment. Moreover, because data collection and annotation are difficult, existing models of this kind perform poorly on new data, limiting their generalization ability and practical value. In addition, many existing deep learning models have not been effectively integrated into clinical workflows, which increases the workload of doctors and limits the models' practical effect.

Summary of the Invention

In view of this, the present application proposes a lung tumor diagnosis and prediction system based on multimodal deep learning to solve at least one of the technical problems described in the background art above. The specific solution is as follows:

A lung tumor diagnosis and prediction system based on multimodal deep learning, comprising:

an input unit, configured to acquire at least two of the following patient data: pathological data involving lung tumor pathology images, transcription data involving RNA transcription in lung tumor tissue, and clinical prognosis data involving the clinical treatment of lung tumor patients;

a preprocessing unit, configured to perform a preset first preprocessing on the pathological data to obtain a pathology input image, a preset second preprocessing on the transcription data to obtain transcription input data, and a third preprocessing on the clinical prognosis data to obtain clinical input data;

a hierarchical processing unit, configured to input the pathology input image, the transcription input data and the clinical input data into a multimodal prediction model that integrates a pathology model, a transcription model and a clinical model, so that the pathology model outputs n pathological result features from the pathology input image, the transcription model outputs n transcription result features from the transcription input data, and the clinical model outputs n clinical result features from the clinical input data;

a fusion processing unit, configured to fuse the n pathological result features, the n transcription result features and the n clinical result features through the multimodal prediction model to obtain a final tumor prediction result; and

a survival time prediction unit, configured to input the pathology input image into a preset survival time prediction model to obtain the final predicted survival time of the patient.

Here, the pathological result features, the transcription result features, the clinical result features and the tumor prediction result all concern predicted tumor types and their probabilities, and n is an integer not less than 2.

In some specific embodiments, the expression of the multimodal prediction model is:

ŷ = W_class · BatchNorm(Dropout(ReLU(W_fusion · [f₁; f₂; f₃] + b_fusion)))

BatchNorm(x) = γ · (x − μ_batch) / √(σ²_batch + ε) + β

where ŷ denotes the output of the multimodal prediction model; W_class is the weight matrix applied to the outputs of the pathology model, the transcription model and the clinical model; BatchNorm denotes the batch normalization layer, Dropout the dropout function and ReLU the ReLU activation function; W_fusion is the weight matrix of the fusion layer and b_fusion its bias term; f₁, f₂ and f₃ denote the pathological result features, the transcription result features and the clinical result features, respectively; μ_batch is the batch mean and σ²_batch the batch variance; ε is a small scalar for numerical stability; γ and β are the trainable parameters of the BatchNorm layer; and x is the output of the fusion layer and the preceding operations (ReLU, Dropout).

In some specific embodiments, in the multimodal prediction model:

the fusion layer fuses the n pathological result features, the n transcription result features and the n clinical result features, as input features, into a low-dimensional space to obtain fused features;

a ReLU activation function is applied to the fused features to introduce non-linearity;

a dropout function is applied to the ReLU-activated features, discarding a preset proportion of the input features to reduce overfitting;

a batch normalization layer is applied to the features after dropout to normalize them; and

a preset fully connected layer maps the fused features onto the final tumor classes to obtain the final tumor prediction result.
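
As an illustration, the fusion pipeline above (fusion layer, ReLU, dropout, batch normalization, final fully connected layer) can be sketched at inference time in plain NumPy. The feature values, layer sizes and n = 4 classes below are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # number of tumor classes (n >= 2 per the claims; 4 is an assumption)

# n result features from each of the three branch models (hypothetical values)
f1, f2, f3 = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)

# Fusion layer: project the concatenated 3n features into a low-dimensional space
hidden = 8
W_fusion = rng.standard_normal((hidden, 3 * n)) * 0.1
b_fusion = np.zeros(hidden)
x = W_fusion @ np.concatenate([f1, f2, f3]) + b_fusion

x = np.maximum(x, 0.0)  # ReLU non-linearity
# Dropout is active only during training; at inference it acts as the identity

# Batch normalization with running statistics (gamma and beta are trainable)
mu, var, gamma, beta, eps = 0.0, 1.0, 1.0, 0.0, 1e-5
x = gamma * (x - mu) / np.sqrt(var + eps) + beta

# Final fully connected layer maps the fused features to the n tumor classes
W_class = rng.standard_normal((n, hidden)) * 0.1
logits = W_class @ x
probs = np.exp(logits - logits.max())
probs /= probs.sum()            # softmax over tumor classes
pred = int(np.argmax(probs))    # predicted tumor type index
```

The ordering (ReLU, then dropout, then batch normalization, then the classifier) follows the expression of the multimodal prediction model given above.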

In some specific embodiments, the first preprocessing includes image brightness adjustment, contrast adjustment, saturation adjustment, gamma correction and pixel standardization; the second preprocessing includes standardizing the transcription data and extracting genes with non-zero expression copy numbers; the third preprocessing includes missing-value handling and outlier handling of the clinical prognosis data.
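
A minimal sketch of the first preprocessing (brightness, contrast, saturation, gamma correction and pixel standardization) on an RGB patch; the adjustment factors and the simple luminance-based saturation step are illustrative assumptions, not parameters from the patent:

```python
import numpy as np

def preprocess_patch(img, brightness=1.1, contrast=1.2, saturation=1.1, gamma=0.9):
    """Sketch of the first preprocessing stage; all factor values are illustrative."""
    x = img.astype(np.float64) / 255.0              # H x W x 3, values in [0, 1]
    x = np.clip(x * brightness, 0.0, 1.0)           # brightness adjustment
    x = np.clip((x - 0.5) * contrast + 0.5, 0.0, 1.0)  # contrast adjustment
    gray = x.mean(axis=2, keepdims=True)            # per-pixel luminance proxy
    x = np.clip(gray + (x - gray) * saturation, 0.0, 1.0)  # saturation adjustment
    x = x ** gamma                                  # gamma correction
    return (x - x.mean()) / (x.std() + 1e-8)        # pixel standardization

# Toy 4x4 RGB patch standing in for a pathology image tile
patch = (np.arange(4 * 4 * 3, dtype=np.uint8) % 255).reshape(4, 4, 3)
out = preprocess_patch(patch)
```

After standardization the patch has approximately zero mean and unit variance, which is the usual input convention for convolutional networks.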

In some specific embodiments, the second preprocessing further includes: after standardizing the transcription data, extracting the genes with non-zero expression copy numbers; reducing the extracted gene expression matrix to a dimension DIM matching the total number of preset tumor type classes; and extracting m*DIM genes with a high contribution to classification from the reduced data to obtain the transcription input data, where m is an integer not less than 2.
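
The second preprocessing can be sketched as follows, using SVD-based PCA as one plausible dimensionality-reduction choice (the patent does not name a specific method) and ranking genes by loading magnitude as a stand-in for "high contribution to classification"; all sizes are toy values:

```python
import numpy as np

rng = np.random.default_rng(1)
expr = rng.poisson(2.0, size=(50, 200)).astype(float)  # samples x genes (toy counts)

# Keep genes whose expression copy number is non-zero in at least one sample
nonzero = expr.sum(axis=0) > 0
expr = expr[:, nonzero]

# Standardize each gene, then reduce to DIM components via PCA (through SVD)
z = (expr - expr.mean(axis=0)) / (expr.std(axis=0) + 1e-8)
DIM = 4                        # matches the number of tumor-type classes (assumption)
U, S, Vt = np.linalg.svd(z, full_matrices=False)
components = Vt[:DIM]          # DIM x genes loading matrix

# Rank genes by total loading magnitude and keep the top m*DIM contributors
m = 5
scores = np.abs(components).sum(axis=0)
top = np.argsort(scores)[::-1][: m * DIM]
transcription_input = z[:, top]  # samples x (m*DIM) model input
```

The resulting matrix has exactly m*DIM columns, matching the m*DIM input neurons of the transcription model described below.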

In some specific embodiments, the input layer of the transcription model has m*DIM neurons;

the transcription model has two fully connected layers; a ReLU activation function follows the first fully connected layer to introduce non-linearity; a dropout function is applied after the first fully connected layer to randomly discard a preset proportion of its outputs to reduce overfitting; and the second fully connected layer has n neurons.
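
A NumPy sketch of a forward pass through such a two-layer network (an input layer of m*DIM neurons, ReLU and inverted dropout after the first fully connected layer, and n output neurons); the weights, hidden size and dropout rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
m, DIM, n = 5, 4, 4
in_dim, hidden = m * DIM, 16      # input layer has m*DIM neurons

W1 = rng.standard_normal((hidden, in_dim)) * 0.1
b1 = np.zeros(hidden)
W2 = rng.standard_normal((n, hidden)) * 0.1  # second FC layer: n neurons
b2 = np.zeros(n)

def transcription_model(x, train=False, p_drop=0.5):
    h = np.maximum(W1 @ x + b1, 0.0)          # FC1 + ReLU
    if train:                                  # inverted dropout after FC1
        mask = rng.random(hidden) >= p_drop
        h = h * mask / (1.0 - p_drop)
    return W2 @ h + b2                         # FC2 -> n result features

features = transcription_model(rng.standard_normal(in_dim))
```

At inference (train=False) dropout is skipped; the inverted-dropout scaling during training keeps the expected activation magnitude unchanged.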

In some specific embodiments, the clinical prognosis data are statistical indicators, presented as text descriptions or charts, used to evaluate the possible course of the disease as well as the patient's treatment outcome and survival prospects;

the third preprocessing further includes extracting key information from the clinical prognosis data, including tumor stage and survival time.

In some specific embodiments, the clinical input data contain numerical features and categorical features;

in the clinical model, an embedding layer processes the categorical features and maps them into a high-dimensional space; the numerical features and the categorical features output by the embedding layer are concatenated to form a single input vector; and the input vector undergoes feature extraction and dimensionality reduction through two fully connected layers, with a ReLU activation function introducing non-linearity and a dropout function applied to reduce overfitting.
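
The clinical branch can be sketched as an embedding lookup for a categorical feature, concatenated with the numerical features and passed through two fully connected layers. The choice of tumor stage as the categorical feature, the embedding size and the layer widths are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n_stage, emb_dim = 4, 8          # e.g. a 4-level tumor-stage category (assumption)
embedding = rng.standard_normal((n_stage, emb_dim)) * 0.1  # embedding table

numeric = np.array([0.63, -0.2, 1.1])  # standardized numerical features (toy)
stage = 2                               # categorical feature: stage index
x = np.concatenate([numeric, embedding[stage]])  # single concatenated input vector

hidden, n = 6, 4
W1 = rng.standard_normal((hidden, x.size)) * 0.1
W2 = rng.standard_normal((n, hidden)) * 0.1
clinical_features = W2 @ np.maximum(W1 @ x, 0.0)  # two FC layers with ReLU between
```

An embedding lookup is simply a row selection from a trainable table, which is why it handles categorical inputs that have no natural numeric ordering.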

In some specific embodiments, in the pathology model, the output layer is a fully connected layer with n neurons;

during training of the pathology model, a cross-entropy loss function measures the difference between the model output and the training data, and the gradient of the loss with respect to the model parameters is computed by backpropagation; an SGD optimizer updates the model weights according to the computed gradients, with the learning rate initially set to 0.001.
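
The training step described above (cross-entropy loss, gradient via backpropagation, one SGD update with learning rate 0.001) can be sketched for a single linear output layer standing in for the full pathology model; the toy example verifies that one update reduces the loss:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, lr = 4, 10, 0.001                 # n classes, feature dim, initial SGD lr
W = rng.standard_normal((n, d)) * 0.1   # stand-in for the n-neuron output layer

x, y = rng.standard_normal(d), 2        # one training example with class label 2

def ce_loss_and_grad(W, x, y):
    logits = W @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()                         # softmax probabilities
    loss = -np.log(p[y])                 # cross-entropy loss
    dlogits = p.copy()
    dlogits[y] -= 1.0                    # gradient through softmax + cross-entropy
    return loss, np.outer(dlogits, x)    # gradient of the loss w.r.t. W

loss0, grad = ce_loss_and_grad(W, x, y)
W -= lr * grad                           # one SGD weight update
loss1, _ = ce_loss_and_grad(W, x, y)
```

The softmax-plus-cross-entropy gradient has the well-known closed form p − onehot(y), which is what backpropagation computes at the output layer.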

In some specific embodiments, in the fusion processing unit, the pathological result features, the transcription result features and the clinical result features each have corresponding weights, and the multimodal prediction model fuses the n pathological result features, the n transcription result features and the n clinical result features together with their corresponding weights to obtain the final tumor prediction result.

In some specific embodiments, in the survival time prediction model, a pre-trained convolutional neural network extracts image features from the pathology input image, and a multi-layer perceptron then predicts survival time from those image features to obtain the final survival time prediction result.
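
A minimal stand-in for the survival time predictor: a feature extractor (here simple global average pooling plays the role of the pre-trained CNN backbone, purely for illustration) followed by a small multi-layer perceptron producing a scalar survival-time estimate. All sizes and weights are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in for the pre-trained CNN backbone: global average pooling per channel
# plays the role of the extracted image-feature vector (illustration only).
def extract_features(img):                 # img: C x H x W
    return img.reshape(img.shape[0], -1).mean(axis=1)

feat_dim, hidden = 3, 8
W1 = rng.standard_normal((hidden, feat_dim)) * 0.1
W2 = rng.standard_normal((1, hidden)) * 0.1

def predict_survival_time(img):
    f = extract_features(img)              # CNN-style feature extraction
    h = np.maximum(W1 @ f, 0.0)            # MLP hidden layer with ReLU
    return float((W2 @ h)[0])              # scalar survival-time estimate

estimate = predict_survival_time(rng.random((3, 16, 16)))
```

In the system described here the backbone would be a real pre-trained CNN and the MLP head would be trained on survival labels; only the wiring is shown.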

In some specific embodiments, during training of the survival time prediction model:

mean squared error (MSE) is used as the loss function, and the Adam optimizer performs parameter optimization;

in each training epoch, predictions are computed by forward propagation and the weights are updated by backpropagation.
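
The training cycle above (forward pass, MSE loss, gradient by backpropagation, Adam update with bias-corrected moment estimates) can be sketched for a linear model on toy data; the hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((32, 5))              # toy feature batch
true_w = rng.standard_normal(5)
t = X @ true_w                                # toy survival-time targets

w = np.zeros(5)                               # model parameters
m_t, v_t = np.zeros(5), np.zeros(5)           # Adam first/second moment estimates
lr, b1, b2, eps = 0.01, 0.9, 0.999, 1e-8

def mse(w):
    return float(np.mean((X @ w - t) ** 2))   # mean squared error loss

loss_start = mse(w)
for step in range(1, 201):
    pred = X @ w                               # forward propagation
    grad = 2.0 * X.T @ (pred - t) / len(X)     # backpropagated MSE gradient
    m_t = b1 * m_t + (1 - b1) * grad           # update first moment
    v_t = b2 * v_t + (1 - b2) * grad ** 2      # update second moment
    m_hat = m_t / (1 - b1 ** step)             # bias correction
    v_hat = v_t / (1 - b2 ** step)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)   # Adam parameter update
loss_end = mse(w)
```

Adam's per-parameter step size adapts to the gradient history, which is why it is a common default for training such regression heads.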

Beneficial effects: the present application provides a lung tumor diagnosis and prediction system based on multimodal deep learning. To address the single-modality limitation of traditional deep learning prediction models, it uses multimodal deep learning to effectively fuse lung tumor pathomics, transcriptomics and clinical prognosis data, improving diagnostic accuracy and model interpretability. The survival time prediction can provide doctors with results in real time, offering more comprehensive and in-depth diagnostic and prognostic information and helping doctors formulate better treatment plans and decisions.

To handle the heterogeneity among multimodal data, technical means such as data preprocessing and modality fusion eliminate differences and conflicts, ensuring the effectiveness and stability of the model while adaptively adjusting the weights and contributions of the different modalities. Techniques such as attention mechanisms and residual connections optimize model performance; optimization algorithms and techniques such as gradient descent, regularization and batch normalization accelerate model convergence; and data augmentation and transfer learning expand the training dataset, improving model accuracy and generalization, so that the model can process data of multiple modalities simultaneously with stronger feature extraction and classification capabilities. Combining the multimodal deep learning model with the clinical workflow provides doctors with real-time diagnostic suggestions and prognosis predictions, and can adapt to the needs of different clinical scenarios.

BRIEF DESCRIPTION OF THE DRAWINGS

To explain the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present application and therefore should not be regarded as limiting its scope; those of ordinary skill in the art may derive other related drawings from them without inventive effort.

FIG. 1 is a block diagram of the modules of the lung tumor diagnosis and prediction system of the present application;

FIG. 2 is a schematic diagram of the principle of the lung tumor diagnosis and prediction system of the present application;

FIG. 3 compares image brightness before and after adjustment in the present application;

FIG. 4 compares image contrast before and after adjustment in the present application;

FIG. 5 compares image saturation before and after adjustment in the present application;

FIG. 6 shows the accuracy of the fusion model of the present application;

FIG. 7 shows the accuracy of the survival time prediction of the present application.

Reference numerals: 1 - input unit; 2 - preprocessing unit; 3 - hierarchical processing unit; 4 - fusion processing unit; 5 - survival time prediction unit.

DETAILED DESCRIPTION

Hereinafter, various embodiments disclosed in the present application are described more fully. The present disclosure can take various embodiments, and adjustments and changes may be made therein. However, it should be understood that there is no intention to limit the various disclosed embodiments to the specific embodiments disclosed herein; rather, the present disclosure should be understood to cover all adjustments, equivalents and/or alternatives falling within the spirit and scope of the various disclosed embodiments.

The terms used in the various embodiments disclosed in the present application serve only to describe particular embodiments and are not intended to limit those embodiments. As used herein, singular forms are intended to include plural forms as well, unless the context clearly indicates otherwise. Unless otherwise defined, all terms used here (including technical and scientific terms) have the same meaning as commonly understood by those of ordinary skill in the art to which the disclosed embodiments belong. Such terms (including those defined in commonly used dictionaries) are to be interpreted as having a meaning consistent with their contextual meaning in the relevant technical field and are not to be interpreted in an idealized or overly formal sense, unless clearly so defined in the various embodiments disclosed in the present application.

The present application discloses a lung tumor diagnosis and prediction system based on multimodal deep learning. The system integrates pathomics, transcriptomics and clinical prognosis data to build a multimodal deep learning model that accurately distinguishes benign from malignant lung tumors and predicts prognosis. It also handles the heterogeneity among the multimodal data, ensuring the effectiveness and stability of the model. The modules of the lung tumor diagnosis and prediction system are shown in Figure 1 and its principle in Figure 2; the specific scheme is as follows:

A lung tumor diagnosis and prediction system based on multimodal deep learning, comprising:

an input unit 1 for acquiring at least two of the following patient data types: pathological data comprising lung tumor pathology images, transcription data reflecting RNA transcription in lung tumor tissue, and clinical prognosis data relating to the clinical treatment of the lung tumor patient;

a preprocessing unit 2 for applying a preset first preprocessing to the pathological data to obtain a pathological input image, a preset second preprocessing to the transcription data to obtain transcription input data, and a third preprocessing to the clinical prognosis data to obtain clinical input data;

a hierarchical processing unit 3 for feeding the pathological input image, the transcription input data and the clinical input data into a multimodal prediction model that integrates a pathological model, a transcription model and a clinical model, such that the pathological model outputs n pathological result features from the pathological input image, the transcription model outputs n transcription result features from the transcription input data, and the clinical model outputs n clinical result features from the clinical input data;

a fusion processing unit 4 for fusing the n pathological result features, the n transcription result features and the n clinical result features through the multimodal prediction model to obtain a final tumor prediction result; and

a survival time prediction unit 5 for feeding the pathological input image into a preset survival time prediction model to obtain a final predicted survival time for the patient.

Here, the pathological result features, transcription result features, clinical result features and tumor prediction result all encode predicted tumor types and their probabilities, and n is an integer not less than 2. The larger n is, the more data are fused and the more reliable the final tumor prediction result, but the more complex the fusion process; n must therefore be set reasonably. In practice, n can be adjusted via the number of output nodes of each model, which corresponds to n. For example, if the pathological model has 128 output neurons and thus outputs 128 pathological result features, the outputs of the transcription and clinical models are likewise set to 128, so that the three feature sets have equal size and can be conveniently fused by the multimodal prediction model.

By integrating multimodal data, the solution of the present application improves the accuracy and reliability of lung tumor diagnosis and prognosis prediction. It also emphasizes the generalization ability of the model and its integration into clinical workflows, providing physicians with convenient, fast and accurate diagnostic support. During training, appropriate hyperparameter settings and optimization strategies are adopted, such as learning rate scheduling, batch size selection and regularization. The multimodal data of the present application comprise at least pathological data, transcription data and clinical prognosis data, and the data fed into the pretrained multimodal prediction model comprise at least two of the foregoing, thereby achieving multimodal prediction.

Pathological data comprise information about disease status obtained through histology, cytology, immunohistochemistry, molecular biology and similar means during disease diagnosis, research and treatment. Such data are usually derived from the analysis of patient biological samples (e.g. tissue sections and cell smears) and form the key basis for understanding the nature of a disease, assessing its state, guiding treatment planning and evaluating prognosis. In the present application, pathological data exist mainly in image form, such as whole-slide images (WSIs) and ordinary digital pathology images. Such image data play a central role in modern medicine: in clinical practice they are used to visualize and analyze disease status and to assist physicians in making more accurate diagnostic and treatment decisions. Unlike transcription data, which are mainly numerical, and clinical prognosis data, which are mainly textual, pathology images carry richer feature information and require more complex processing.

Pathological data therefore require image preprocessing: a series of preprocessing steps enhances image features to improve the performance of image recognition or classification tasks and to adapt the image data to the input requirements of deep learning models. Exemplarily, the first preprocessing comprises brightness adjustment, contrast adjustment, saturation adjustment, gamma correction and pixel standardization. Exemplarily, to suit the PyTorch framework, the image data are converted into tensor format so that they can be processed and analyzed in PyTorch in an appropriate form. Exemplarily, the pixel values are normalized to the range [-1, 1]; this step helps ensure data stability and model convergence, laying a solid foundation for subsequent training.

Specifically, a brightness adjustment factor, a contrast adjustment factor, a gamma correction value and a saturation adjustment factor are set to adjust the overall visual appearance of each image and to reduce the brightness, contrast and saturation differences between images. Comparisons of brightness, contrast and saturation before and after adjustment are shown in Figures 3-5.

1) Brightness adjustment. The reference value of the brightness adjustment factor is parameter a; varying the factor controls the overall lightness of the image. If the brightness adjustment factor is greater than a, the image becomes brighter, which helps reveal detail in dark regions. If it is less than a, the image becomes darker, which helps reveal detail in bright regions. If it equals a, the brightness remains unchanged.

2) Contrast adjustment. The reference value of the contrast adjustment factor is parameter b, which controls the difference between light and dark regions of the image. If the contrast adjustment factor is greater than b, contrast increases and image features become more distinct. If it is less than b, contrast decreases and the image appears smoother, with reduced detail differences. If it equals b, contrast remains unchanged.

3) Gamma correction

Gamma correction is a nonlinear operation used to improve the brightness response of an image; parameter c is the reciprocal of the gamma value. When the gamma value is less than c, dark regions become brighter, increasing the overall contrast of the image; when the gamma value is greater than c, bright regions become darker, reducing the overall contrast. The mathematical expression of gamma correction is o = i^(1/e), where i is the input pixel value, e is the gamma correction value and o is the corrected pixel value.
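The gamma relation above can be sketched as a minimal function, assuming pixel values are normalized to [0, 1] (the function name is illustrative, not from the source):

```python
def gamma_correct(pixel: float, gamma: float) -> float:
    """Apply gamma correction o = i ** (1 / e) to a pixel value in [0, 1]."""
    return pixel ** (1.0 / gamma)

# A gamma value greater than 1 brightens dark pixels:
bright = gamma_correct(0.25, 2.0)   # 0.25 ** 0.5 = 0.5
# A gamma value of 1 leaves the pixel unchanged:
same = gamma_correct(0.5, 1.0)      # 0.5
```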

4) Saturation adjustment

The saturation adjustment factor affects the intensity and purity of the colors in the image. Lowering saturation shifts colors toward gray and reduces the differences between them; parameter d controls this process. Raising saturation makes colors more vivid and prominent, enhancing visual impact. Saturation adjustment operates on each RGB channel of the image and involves chroma and luminance adjustments within the color space.

5) Pixel standardization

The pixel values of each channel are standardized by the following formula:

normalized_pixel = (original_pixel - mean) / std

For each pixel in the RGB image, original_pixel denotes the original pixel value, mean the channel mean and std the channel standard deviation; the standardized pixel value normalized_pixel is computed by the above formula.
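A minimal sketch of this per-channel standardization (function names are illustrative); note that with mean = std = 0.5, pixel values in [0, 1] map exactly onto the [-1, 1] range mentioned earlier:

```python
def standardize_pixel(original_pixel: float, mean: float, std: float) -> float:
    """normalized_pixel = (original_pixel - mean) / std, applied per channel."""
    return (original_pixel - mean) / std

def standardize_channel(channel, mean, std):
    """Standardize every pixel value of one image channel."""
    return [standardize_pixel(p, mean, std) for p in channel]

# With mean=0.5, std=0.5 the [0, 1] pixel range maps to [-1, 1]:
low, mid, high = standardize_channel([0.0, 0.5, 1.0], mean=0.5, std=0.5)
```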

6) Visualization of parameters before and after adjustment

A histogram visualization scheme is used to display the brightness and saturation distributions of the images before and after adjustment, so that the effect of the first preprocessing can be compared and the specific values of parameters a, b and c can be tuned. Specifically, this comprises: collecting statistics for each image; computing the brightness/contrast distribution of the original images, including the original mean (original brightness) and original standard deviation (original contrast); computing the brightness/contrast distribution of the adjusted images, including the adjusted mean (adjusted brightness) and adjusted standard deviation (adjusted contrast); and using histograms to display the distributions of the brightness/contrast means and standard deviations before and after processing.

When training the model on pathological data, the images may additionally be augmented so that the model can adapt to images with varying parameters. The training data are expanded by random cropping, random horizontal flipping, color jittering (random adjustment of brightness, contrast, saturation and hue) and similar operations. Specifically, the visually adjusted images are batch-preprocessed to expand the number of pathology slides and improve the generalization ability of the model, as follows: randomly resize each image and crop it to 224x224 pixels; randomly flip the image horizontally; randomly adjust its brightness, contrast, saturation and hue; standardize the pixel values of each channel; and save the processed image in PNG format.

For the pathological model, deep learning architectures such as deep convolutional neural networks (CNNs), recurrent neural networks (RNNs) or graph neural networks (GNNs) may serve as the base image model, to suit different data structures and features. Preferably, the ResNet18 model is selected as the base model of the pathological model. ResNet18 is a classic convolutional neural network which, through pretraining on large-scale image datasets, can effectively extract key image features of significant reference value for tumor diagnosis.

Exemplarily, constructing and training the image model comprises: using a pretrained ResNet-18 model as the feature extractor, and replacing its final fully connected layer with a new fully connected layer having n output nodes to suit the specific pathology image classification task. During training, the model parameters are fine-tuned for the tumor diagnosis task to improve its adaptability, and the model outputs tumor labels and their corresponding probabilities. The cross-entropy loss function measures the difference between the predicted probability distribution and the true labels, and backpropagation computes the gradient of the loss with respect to the model parameters. An SGD optimizer updates the model weights according to the computed gradients; the learning rate is initially set to 0.001 and multiplied by 0.1 every a epochs for decay. Over b epochs of training, the learning rate is continuously adjusted to optimize model performance. An epoch denotes one full pass of the model over the entire training dataset, i.e. one complete forward pass and backward pass over all training samples.

By analyzing differences in RNA transcription data between tumor tissue and normal tissue, biomarkers of diagnostic value can be screened. Transcription data can also be used to monitor tumor progression and treatment effect: if the expression levels of relevant genes return to normal or shift in a favorable direction after treatment, the treatment may be effective. For example, abnormal expression patterns of certain genes may be highly correlated with specific tumor types. In the present application, transcription data mainly concern RNA and reflect gene expression in the human body over a period of time. Transcription data include a gene expression matrix, obtainable by high-throughput sequencing or similar means. Exemplarily, in the gene expression matrix, rows represent genes, columns represent samples, and each element represents the expression level of the corresponding gene in the corresponding sample.

The transcription data undergo a second preprocessing involving data cleaning, standardization and normalization to ensure accuracy and consistency. In some specific embodiments, during model training the FPKM (Fragments Per Kilobase of exon model per Million mapped reads) method is used to standardize the transcription data, eliminating the effects of inter-sample differences and gene length and effectively converting the RNA sequencing data into a format suitable for model training.
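The standard FPKM formula is not written out in the text; a minimal sketch using the conventional definition (fragments scaled by gene length in kilobases and library size in millions) would be:

```python
def fpkm(fragment_count: float, gene_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of exon model per Million mapped reads:
    FPKM = fragments / ((length / 1e3) * (total / 1e6))."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)

# 1000 fragments on a 2 kb gene, in a library of 10 million mapped fragments:
value = fpkm(1000, 2000, 1e7)   # 50.0
```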

In some specific embodiments, the second preprocessing further comprises: after standardizing the transcription data, extracting the genes whose expression copy number is non-zero; reducing the extracted gene expression matrix to a dimension DIM matching the preset total number of tumor type classes; and extracting m*DIM genes with high contribution to classification from the reduced data to obtain the transcription input data. A gene with non-zero copy number is one whose copy count in a genome or specific region is greater than zero. A high-contribution gene is one that strongly influences the occurrence or phenotype of a given tumor disease; variations in such genes, or changes in their expression levels, may be closely related to specific biological phenomena or disease states. The dimension DIM matches the total number of tumor type classes the model can predict.

A gene with non-zero copy number has at least one copy in the genome. Such genes have a certain expression level under specific conditions, i.e. their transcription products (such as mRNA) are present in cells, and they may play important roles in cellular physiology, development and disease. Because their expression copy number is non-zero, their presence and expression levels can be detected by appropriate techniques (e.g. transcriptome sequencing or qPCR).

Preferably, principal component analysis (PCA) is used for dimensionality reduction, reducing the data to a dimension matching the total number of tumor type classes, thereby lowering the data dimensionality and the computational complexity. The total number of tumor type classes can be preset and represents all tumor types covered by the model.
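A minimal PCA sketch via NumPy's SVD (in practice scikit-learn's PCA would typically be used; the expression matrix is assumed here transposed to samples x genes, and dim would equal DIM, the number of tumor classes):

```python
import numpy as np

def pca_reduce(expression: np.ndarray, dim: int) -> np.ndarray:
    """Project a samples x genes matrix onto its top `dim` principal components."""
    centered = expression - expression.mean(axis=0)
    # SVD of the centered matrix; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

rng = np.random.default_rng(0)
X = rng.random((20, 500))          # 20 samples, 500 non-zero-copy genes
reduced = pca_reduce(X, dim=5)     # 20 samples x 5 components
```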

In some specific embodiments, the input layer of the transcription model has m*DIM neurons; the transcription model has two fully connected layers; a ReLU activation function follows the first fully connected layer to introduce nonlinearity; a Dropout function is applied after the first fully connected layer to randomly discard a preset proportion of outputs, reducing overfitting; and the second fully connected layer has n neurons, outputting n prediction results. Exemplarily, a neural network with two fully connected layers is constructed, in which the input layer has Class*50 neurons, the first hidden layer has 512 neurons and the second hidden layer has 128 neurons, outputting 128 transcription result features. A ReLU activation function after the first hidden layer introduces nonlinearity and enhances the expressive power of the model, and Dropout randomly discards the outputs of the first hidden layer with probability 0.5 to reduce overfitting and improve generalization.
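Under the two-fully-connected-layer reading of the embodiment above, the transcription model could be sketched as follows (Class = 16 is an illustrative class count; the source does not fix it):

```python
import torch
import torch.nn as nn

Class = 16   # illustrative number of tumor classes (DIM)
m = 50       # high-contribution genes kept per class -> input size Class * 50

# Two fully connected layers: Class*50 -> 512 -> 128,
# with ReLU nonlinearity and Dropout(p=0.5) after the first layer.
transcription_model = nn.Sequential(
    nn.Linear(Class * m, 512),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, 128),
)

out = transcription_model(torch.rand(4, Class * m))  # batch of 4 -> (4, 128)
```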

Clinical prognosis data are statistical indicators, presented as textual descriptions or tables, used to evaluate the likely course of a disease and a patient's treatment outcome and survival prospects. They typically include survival time, disease recurrence and recovery status, helping to elucidate the natural course of the disease and its influencing factors and providing a basis for developing new treatments and drugs. Clinical prognosis data also include patient characteristics such as age and sex, which likewise affect tumor prediction to some extent. In practice, clinical prognosis data may be either a physician's textual description of the patient's prognosis or related diagnosis and treatment data recorded in tabular form.

In some specific embodiments, the third preprocessing of the clinical prognosis data performs missing value handling and outlier handling to ensure data completeness and accuracy. Missing values are handled by imputation, deletion or interpolation; outliers are handled by truncation, replacement or removal. In some embodiments, the third preprocessing further comprises extracting key information from the clinical prognosis data, including tumor stage and survival time; because this key information strongly influences tumor prediction, it is processed separately. The key information contains at least two indicators: tumor stage and survival time. Tumor stage is an important indicator of tumor severity and spread. Staging systems vary, but usually consider factors such as tumor size, depth of invasion, lymph node metastasis and distant metastasis; an early stage usually means a small, localized tumor, whereas a late stage indicates the tumor may have spread widely. Survival time is the span from diagnosis to death or to a specific time point.

In some specific embodiments, the clinical input data contain both numerical features and categorical features. In the clinical model, an embedding layer maps the categorical features into a high-dimensional space; the numerical features and the embedded categorical features are concatenated into a single input vector; and two fully connected layers perform feature extraction and dimensionality reduction on this vector, with a ReLU activation function introducing nonlinearity and a Dropout function reducing overfitting. The clinical model likewise takes the form of a fully connected network that receives the clinical input data and outputs tumor labels and their corresponding probabilities. During model construction, the network structure and parameters are also tuned to improve performance and robustness.
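The embed-concatenate-FC structure described above might look like this in PyTorch (all sizes, including the embedding dimension and the 5 numerical features, are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ClinicalModel(nn.Module):
    """Embeds categorical features, concatenates them with numerical
    features, then applies two fully connected layers with ReLU + Dropout."""
    def __init__(self, num_numeric: int, num_categories: int,
                 embed_dim: int = 8, n: int = 128):
        super().__init__()
        self.embed = nn.Embedding(num_categories, embed_dim)
        self.fc1 = nn.Linear(num_numeric + embed_dim, 64)
        self.drop = nn.Dropout(p=0.5)
        self.fc2 = nn.Linear(64, n)

    def forward(self, numeric: torch.Tensor, category: torch.Tensor):
        x = torch.cat([numeric, self.embed(category)], dim=1)  # single vector
        x = self.drop(torch.relu(self.fc1(x)))
        return self.fc2(x)

model = ClinicalModel(num_numeric=5, num_categories=4)
out = model(torch.rand(3, 5), torch.tensor([0, 2, 1]))  # batch of 3 -> (3, 128)
```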

To build a deep learning model capable of accurate tumor diagnosis, the present application adopts a multi-information fusion strategy that resolves the heterogeneity of multimodal data. First, three independent deep learning models are trained on the pathological, transcription and clinical data respectively; each is responsible for extracting key information from its dataset and generating corresponding outputs. The outputs of the three models are then fed into a fully connected layer for fusion. The fusion layer integrates information from the different data sources to generate a more comprehensive and accurate tumor label. The multimodal tumor prediction model thus exploits multi-source information to improve diagnostic accuracy and reliability.

To build a deep learning model capable of accurately predicting patient survival time, the present application uses a pretrained ResNet50 model as the pathological feature extractor and replaces its final fully connected layer with a new fully connected layer having a single output node, suiting the survival time prediction task. In this way, the model extracts deep features from the pathology images and outputs the predicted survival time.

In the training phase, the fusion model constructed above is trained as a whole. Optimization methods such as gradient descent are used to optimize the model's parameters and weights; through repeated iteration and updating, the parameters are gradually adjusted so that the error between the model output and the true labels decreases. A suitable loss function must also be selected to evaluate model performance; for multi-class problems, the cross-entropy loss is commonly used. Minimizing the loss function ensures that the model keeps improving during training. After sufficient training and optimization, the deep learning fusion model possesses high diagnostic capability, and the trained model can then be used to make predictions on patient samples.

In the prediction phase, the pathological data, transcription data and clinical prognosis data of a patient sample are fed into the multimodal tumor prediction model, which generates the corresponding tumor label according to the knowledge and rules it has learned, providing physicians with a powerful auxiliary tool. The model can supply diagnostic suggestions and prognosis predictions in real time, helping physicians formulate better treatment plans and decisions; physicians can combine the model's predictions with the patient's clinical information to devise more precise treatment plans. The model can also be used for early tumor screening and prevention, improving quality of life and health. Integrating the multimodal deep learning model into the clinical workflow provides physicians with a convenient, fast and accurate diagnostic support system; this integration not only improves physicians' efficiency but also allows the model to better meet actual clinical needs.

To address the heterogeneity among multimodal data, the present application proposes an effective processing mechanism comprising data preprocessing, modality fusion and related steps. This mechanism ensures the effectiveness and stability of the model and enables it to adaptively adjust the weights and contributions of the different modalities.

In some specific embodiments, the multimodal prediction model can be expressed as:

ŷ = W_class · BatchNorm(Dropout(ReLU([f₁, f₂, f₃])))

ReLU(x) = max(0, x)

BatchNorm(x) = γ · (x − μ_batch) / √(σ²_batch + ε) + β

where ŷ denotes the output of the multimodal prediction model; W_class is the weight matrix applied to the outputs of the pathological, transcription and clinical models; BatchNorm denotes the batch normalization layer, Dropout the Dropout function and ReLU the ReLU activation function; f₁, f₂ and f₃ denote the pathological, transcription and clinical result features respectively; μ_batch is the batch mean and σ²_batch the batch variance; ε is a small scalar for numerical stability; γ and β are trainable parameters of the BatchNorm layer; and x is the output of the fusion layer and the preceding operations (ReLU, Dropout).

In some specific embodiments, within the multimodal prediction model: a first fully connected layer fuses the n pathological result features, n transcription result features and n clinical result features, taken as input features, into a low-dimensional space to obtain fused features; a ReLU activation function introduces nonlinearity; a Dropout function discards a preset proportion of the input features to reduce overfitting; a batch normalization layer normalizes the features; and a second fully connected layer maps the fused features onto the final tumor classes.
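The fusion head just described, FC, then ReLU, Dropout and BatchNorm, then a second FC onto the tumor classes, might be sketched as follows (the hidden width of 64 and the two-class output are illustrative):

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Fuses the three n-dimensional modality feature vectors:
    FC -> ReLU -> Dropout -> BatchNorm -> FC onto the tumor classes."""
    def __init__(self, n: int = 128, hidden: int = 64, num_classes: int = 2):
        super().__init__()
        self.fc1 = nn.Linear(3 * n, hidden)   # fuse into a low-dim space
        self.drop = nn.Dropout(p=0.5)
        self.bn = nn.BatchNorm1d(hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, f1, f2, f3):
        x = torch.cat([f1, f2, f3], dim=1)    # concatenate modality features
        x = self.bn(self.drop(torch.relu(self.fc1(x))))
        return self.fc2(x)                    # class logits

head = FusionHead()
logits = head(torch.rand(4, 128), torch.rand(4, 128), torch.rand(4, 128))
```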

在融合不同模态的数据时,可以根据实际情况调整各模态数据的权重和贡献度。在一些具体实施例中,在融合处理单元中,病理结果特征、转录结果特征、临床结果特征均具有对应的权重,通过多模态预测模型融合n个病理结果特征、n个转录结果特征、n个临床结果特征及其对应的权重,得到最终的肿瘤预测结果。例如,在某些情况下,病理数据可能更为重要,而在其他情况下,转录数据可能更为关键。通过灵活调整权重,可以更好地平衡各模态数据的影响。When fusing data of different modalities, the weight and contribution of each modality data can be adjusted according to the actual situation. In some specific embodiments, in the fusion processing unit, the pathological result features, transcription result features, and clinical result features all have corresponding weights, and the multimodal prediction model is used to fuse n pathological result features, n transcription result features, n clinical result features and their corresponding weights to obtain the final tumor prediction result. For example, in some cases, pathological data may be more important, while in other cases, transcription data may be more critical. By flexibly adjusting the weights, the influence of each modality data can be better balanced.
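按模态权重加权融合结果特征的一个最小示例如下（权重0.5/0.3/0.2与各特征数值均为虚构，仅用于说明加权方式）：A minimal example of weighting the modality result features is shown below (the weights 0.5/0.3/0.2 and the feature values are hypothetical, shown only to illustrate the weighting):

```python
# Hypothetical modality weights; the application leaves their values adjustable.
w_path, w_trans, w_clin = 0.5, 0.3, 0.2

f_path  = [0.7, 0.1, 0.1, 0.1]   # n = 4 pathological result features (class probabilities)
f_trans = [0.6, 0.2, 0.1, 0.1]   # transcription result features
f_clin  = [0.4, 0.3, 0.2, 0.1]   # clinical result features

# Weighted fusion of the three modality feature vectors
fused = [w_path * p + w_trans * t + w_clin * c
         for p, t, c in zip(f_path, f_trans, f_clin)]
pred_class = max(range(len(fused)), key=fused.__getitem__)   # index of the largest fused score
```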

不同模态的数据往往存在异质性,即数据分布和特征表达上的差异。本申请提出的异质性处理机制,通过数据预处理和模态融合等技术手段,能够消除这种异质性,保证模型的有效性和稳定性。这种机制使得模型能够更好地适应不同的数据环境和应用场景,提高了模型的泛化能力。在处理异质性时,除了使用数据预处理和模态融合等技术手段外,还可以考虑采用其他先进的异质性处理方法,如基于生成对抗网络(GAN)的数据增强或基于迁移学习的特征转换。Data of different modalities often have heterogeneity, that is, differences in data distribution and feature expression. The heterogeneity processing mechanism proposed in this application can eliminate this heterogeneity and ensure the effectiveness and stability of the model through technical means such as data preprocessing and modal fusion. This mechanism enables the model to better adapt to different data environments and application scenarios, and improves the generalization ability of the model. When dealing with heterogeneity, in addition to using technical means such as data preprocessing and modal fusion, other advanced heterogeneity processing methods can also be considered, such as data enhancement based on generative adversarial networks (GAN) or feature transformation based on transfer learning.

在一些具体实施例中,在生存时间预测模型中,使用预训练的卷积神经网络模型提取所述病理输入图像的图像特征,然后使用多层感知机基于所述图像特征进行生存时间预测,得到最终的生存时间预测结果。示例性的,获取患者的病理输入图像后,输入到预训练的卷积神经网络模型中,提取图像特征;将提取的图像特征输入到多层感知机(MLP)模型中,预测患者的生存时间。利用深度学习技术对病理图像数据进行有效分析,更全面地反映肿瘤的特征,不仅提高了生存时间预测的准确性,而且为医生提供了更全面、深入的病理信息。In some specific embodiments, in the survival time prediction model, a pre-trained convolutional neural network model is used to extract the image features of the pathological input image, and then a multi-layer perceptron is used to predict the survival time based on the image features to obtain the final survival time prediction result. Exemplarily, after obtaining the patient's pathological input image, it is input into a pre-trained convolutional neural network model to extract image features; the extracted image features are input into a multi-layer perceptron (MLP) model to predict the patient's survival time. Using deep learning technology to effectively analyze pathological image data and more comprehensively reflect the characteristics of the tumor not only improves the accuracy of survival time prediction, but also provides doctors with more comprehensive and in-depth pathological information.

在一些具体实施例中,生存时间预测模型的表达式为:In some specific embodiments, the expression of the survival time prediction model is:

H1=ReLU(W1X+b1)

ŷ=W2·BatchNorm(Dropout(H1))+b2

其中，X表示输入的病理图像，W1和b1是第一层全连接层的权重和偏置，H1是经过ReLU激活函数后的隐藏层输出；W2和b2是第二层全连接层的权重和偏置，ŷ是最终的生存时间预测值；BatchNorm表示批归一化层，Dropout表示Dropout函数，ReLU表示ReLU激活函数。Here, X represents the input pathological image, W1 and b1 are the weights and biases of the first fully connected layer, and H1 is the hidden-layer output after the ReLU activation function; W2 and b2 are the weights and biases of the second fully connected layer, and ŷ is the final survival time prediction; BatchNorm denotes the batch normalization layer, Dropout the Dropout function, and ReLU the ReLU activation function.

具体地,对患者的病理切片图像数据进行预处理得到病理输入图像。病理切片图像提供了肿瘤的形态学信息,是模型进行生存时间预测的基础数据。预处理操作包括图像的亮度调整、对比度调整、饱和度调整、Gamma校正以及像素标准化,得到标准化的病理输入图像。将病理输入图像输入到预训练的卷积神经网络(CNN)模型中,如ResNet50,提取图像特征。通过利用预训练模型,可以有效提取病理图像中的深层次特征,增强模型的特征表达能力。之后,将提取的图像特征输入到多层感知机(MLP)模型中,预测患者的生存时间。MLP模型包括多个全连接层、ReLU激活函数、Dropout层以及批量归一化层,通过这些层的组合,模型能够有效捕捉特征间的非线性关系,提高预测的准确性。Specifically, the patient's pathological slice image data is preprocessed to obtain a pathological input image. The pathological slice image provides morphological information of the tumor and is the basic data for the model to predict survival time. The preprocessing operations include image brightness adjustment, contrast adjustment, saturation adjustment, Gamma correction and pixel standardization to obtain a standardized pathological input image. The pathological input image is input into a pre-trained convolutional neural network (CNN) model, such as ResNet50, to extract image features. By using the pre-trained model, deep features in the pathological image can be effectively extracted and the feature expression ability of the model can be enhanced. Afterwards, the extracted image features are input into a multi-layer perceptron (MLP) model to predict the patient's survival time. The MLP model includes multiple fully connected layers, ReLU activation functions, Dropout layers, and batch normalization layers. Through the combination of these layers, the model can effectively capture the nonlinear relationship between features and improve the accuracy of prediction.

在一些具体实施例中,在生存时间预测模型的训练过程中:使用均方误差(MSE)作为损失函数,Adam优化器进行参数优化;在每个训练周期中,通过前向传播计算预测值,通过反向传播更新权重。In some specific embodiments, during the training process of the survival time prediction model: the mean square error (MSE) is used as the loss function, and the Adam optimizer is used for parameter optimization; in each training cycle, the predicted value is calculated by forward propagation, and the weight is updated by back propagation.

示例性的,生存时间预测模型的输出层为具有一个神经元的全连接层,用于回归生存时间。在生存时间预测模型的训练过程中,利用均方误差(MSE)损失函数来衡量模型输出与训练数据之间的差异,并通过反向传播算法计算损失相对于模型参数的梯度。使用Adam优化器根据计算出的梯度更新模型权重,其中学习率初始设置为0.001,以确保模型能够有效地收敛。Exemplarily, the output layer of the survival time prediction model is a fully connected layer with one neuron, which is used to regress the survival time. During the training process of the survival time prediction model, the mean square error (MSE) loss function is used to measure the difference between the model output and the training data, and the gradient of the loss relative to the model parameters is calculated by the back propagation algorithm. The Adam optimizer is used to update the model weights according to the calculated gradient, where the learning rate is initially set to 0.001 to ensure that the model can converge effectively.
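上述训练过程（MSE损失、学习率0.001的Adam优化器、前向传播与反向传播）的一个最小草图如下（模型结构与玩具数据均为示意性假设）：A minimal sketch of the training procedure above (MSE loss, Adam at lr=0.001, forward then backward propagation) is given below (the model layout and toy data are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Stand-in regressor: the real pipeline feeds ResNet50 features to an MLP.
model = nn.Sequential(nn.Linear(2048, 64), nn.ReLU(), nn.Linear(64, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

features = torch.randn(16, 2048)        # placeholder CNN features
targets = torch.rand(16, 1) * 1825.0    # survival times in days (toy values)

for epoch in range(3):
    optimizer.zero_grad()
    pred = model(features)              # forward propagation computes predictions
    loss = criterion(pred, targets)     # MSE between predictions and labels
    loss.backward()                     # backward propagation computes gradients
    optimizer.step()                    # Adam updates the weights
```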

在一些具体实施例中，生存时间预测模型的训练过程包括：对训练用的病理图像进行包括亮度调整、对比度调整、饱和度调整、Gamma校正、像素标准化等形式的预处理，得到图像数据。读取图像数据和生存时间数据，确保图像与生存时间数据一一对应。筛选生存时间小于等于五年（1825天）的数据。处理缺失值和负值：将负值转换为正值，并用平均值填补缺失值。采用ResNet50模型提取图像特征，然后使用多层感知机（MLP）进行生存时间预测。使用均方误差（MSE）作为损失函数，Adam优化器进行参数优化。在每个训练周期中，模型通过前向传播计算预测值，通过反向传播更新权重。评估过程：评估模型的预测性能，计算预测值与实际值之间的误差；使用对数逆变换将预测值和实际值还原为原始尺度，并计算在±180天范围内的准确度；绘制实际值与预测值的散点图、残差图以及实际值与预测值的对比直方图。In some specific embodiments, the training process of the survival time prediction model includes: preprocessing the pathological images used for training, including brightness adjustment, contrast adjustment, saturation adjustment, Gamma correction and pixel standardization, to obtain image data. The image data and survival time data are read, ensuring that images and survival times correspond one to one. Data with a survival time of no more than five years (1825 days) are retained. Missing and negative values are handled: negative values are converted to positive values, and missing values are filled with the mean. The ResNet50 model extracts image features, and a multi-layer perceptron (MLP) then predicts survival time. The mean square error (MSE) serves as the loss function, and the Adam optimizer performs parameter optimization. In each training cycle, the model computes predictions through forward propagation and updates the weights through back propagation. Evaluation process: the prediction performance of the model is evaluated by computing the error between predicted and actual values; the inverse logarithmic transform restores predicted and actual values to the original scale, and accuracy within a ±180-day window is computed; scatter plots of actual versus predicted values, residual plots, and comparison histograms of actual and predicted values are drawn.
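上述数据清洗与±180天准确度评估步骤可用NumPy勾勒如下（样例数值均为虚构）：The data-cleaning and ±180-day accuracy steps above can be sketched in NumPy as follows (the sample values are fabricated):

```python
import numpy as np

# Cleaning sketch following the steps above.
days = np.array([400.0, -120.0, np.nan, 2200.0, 900.0])
days = np.abs(days)                        # convert negative values to positive
days[np.isnan(days)] = np.nanmean(days)    # fill missing values with the mean
days = days[days <= 1825]                  # keep survival times of at most five years

# Evaluation sketch: restore log-scale predictions with the inverse transform,
# then score accuracy within a +/-180-day window.
actual = np.array([400.0, 900.0, 1500.0])
pred_log = np.log1p(np.array([500.0, 1200.0, 1400.0]))   # pretend model output (log scale)
pred = np.expm1(pred_log)                                # inverse logarithmic transform
acc = np.mean(np.abs(pred - actual) <= 180.0)            # fraction within 180 days
```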

对于生存时间预测模型,可采用深度卷积神经网络(CNN)、循环神经网络(RNN)或图神经网络(GNN)等深度学习网络模型作为基础图像模型,以适应不同的数据结构和特征。优选地,选择ResNet50模型作为病理模型的基础模型。ResNet50是一种经典的卷积神经网络模型,通过在大规模图像数据集上的预训练,能够有效地提取图像中的关键特征,这些特征对于肿瘤诊断具有重要的参考价值。For the survival time prediction model, deep learning network models such as deep convolutional neural network (CNN), recurrent neural network (RNN) or graph neural network (GNN) can be used as the basic image model to adapt to different data structures and features. Preferably, the ResNet50 model is selected as the basic model of the pathological model. ResNet50 is a classic convolutional neural network model that can effectively extract key features in the image through pre-training on large-scale image datasets. These features have important reference value for tumor diagnosis.

示例性的，模型构建：使用预训练的ResNet50模型作为特征提取器，将模型最后的全连接层替换为一个具有1个输出节点的新全连接层，以适应生存时间预测任务。通过这种方式，模型能够提取病理图像中的深层次特征，并输出预测的生存时间。Exemplarily, model construction: the pre-trained ResNet50 model is used as a feature extractor, and its final fully connected layer is replaced with a new fully connected layer having a single output node to suit the survival time prediction task. In this way, the model can extract deep features from pathological images and output the predicted survival time.

模型训练:在训练过程中,针对生存时间预测任务对模型的参数进行了精细调整,以提高其对该任务的适应性。具体训练步骤如下:Model training: During the training process, the parameters of the model were fine-tuned for the survival time prediction task to improve its adaptability to the task. The specific training steps are as follows:

利用均方误差(MSE)损失函数来衡量模型预测的生存时间与真实标签之间的差异。通过反向传播算法计算损失相对于模型参数的梯度。使用Adam优化器根据计算出的梯度更新模型权重,其中学习率初始设置为0.001,并在每若干个epoch后进行调整,以优化模型性能。The mean square error (MSE) loss function is used to measure the difference between the survival time predicted by the model and the true label. The gradient of the loss relative to the model parameters is calculated by the back propagation algorithm. The Adam optimizer is used to update the model weights according to the calculated gradient, where the learning rate is initially set to 0.001 and adjusted after every several epochs to optimize the model performance.

在训练过程中,具体的超参数设置如下:During the training process, the specific hyperparameters are set as follows:

损失函数:均方误差(MSE)损失函数,用于衡量模型输出与真实生存时间之间的差异。Loss function: Mean squared error (MSE) loss function, which measures the difference between the model output and the true survival time.

优化器:Adam优化器,初始学习率设置为0.001。Optimizer: Adam optimizer, with the initial learning rate set to 0.001.

学习率调整:在每若干个epoch后,将学习率乘以0.1进行衰减,以提高模型的收敛效果。Learning rate adjustment: After every several epochs, the learning rate is multiplied by 0.1 to decay to improve the convergence effect of the model.
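上述“每若干个epoch将学习率乘以0.1”的衰减策略可用PyTorch的StepLR示意（step_size=10为假设值，本申请仅说明“每若干个epoch”）：The decay policy above ("multiply the learning rate by 0.1 every several epochs") can be sketched with PyTorch's StepLR (step_size=10 is an assumption; the application only says "every several epochs"):

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.Adam([param], lr=0.001)
# Multiply the learning rate by 0.1 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

lrs = []
for epoch in range(20):
    optimizer.step()                    # the model update would happen here
    scheduler.step()                    # advance the schedule once per epoch
    lrs.append(optimizer.param_groups[0]["lr"])
# lr decays 0.001 -> 1e-4 after epoch 10, then -> 1e-5 after epoch 20
```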

示例性的,训练过程中每个epoch代表整个训练数据集被模型遍历并学习一次的过程。一个epoch包括一次完整的前向传播(forward pass)和反向传播(backward pass),以便模型在每次迭代中逐步优化其参数。For example, each epoch in the training process represents the process in which the entire training data set is traversed and learned by the model once. An epoch includes a complete forward pass and backward pass, so that the model gradually optimizes its parameters in each iteration.

在训练阶段,将上述构建的深度学习模型作为一个整体进行训练。为了优化模型的参数和权重,采用了梯度下降算法等优化方法。通过不断地迭代和更新,逐渐调整模型的参数,使得模型的输出结果与真实标签之间的误差逐渐减小。在这个过程中,还需要选择合适的损失函数来评估模型的性能。对于回归问题常用的损失函数包括均方误差损失函数(MSE)等。通过最小化损失函数,确保模型在训练过程中不断改善其性能。经过充分的训练和优化后的深度学习模型已经具备了较高的生存时间预测能力。在实际应用中,可以使用这个训练好的模型对患者样本进行生存时间预测。在训练过程中,采用了适当的超参数设置和优化策略,如学习率调整、批量大小选择和正则化等。本申请的模型通过预训练的ResNet50提取图像特征,然后使用多层感知机(MLP)进行生存时间预测,实现了对单一模态数据的高效利用。In the training phase, the deep learning model constructed above is trained as a whole. In order to optimize the parameters and weights of the model, optimization methods such as the gradient descent algorithm are used. Through continuous iteration and updating, the parameters of the model are gradually adjusted so that the error between the output result of the model and the true label is gradually reduced. In this process, it is also necessary to select a suitable loss function to evaluate the performance of the model. Commonly used loss functions for regression problems include mean square error loss function (MSE) and the like. By minimizing the loss function, it is ensured that the model continuously improves its performance during the training process. The deep learning model after sufficient training and optimization has a high ability to predict survival time. In practical applications, this trained model can be used to predict the survival time of patient samples. During the training process, appropriate hyperparameter settings and optimization strategies are adopted, such as learning rate adjustment, batch size selection and regularization. The model of this application extracts image features through pre-trained ResNet50, and then uses a multi-layer perceptron (MLP) to predict survival time, thereby realizing efficient use of single modality data.

在预测阶段,将患者样本的病理输入图像输入到生存时间预测模型中,预测模型会根据其学习到的知识和规则生成对应的生存时间预测结果,从而为医生提供有力的辅助工具。生存时间预测模型能够实时地为医生提供预测结果,帮助医生更好地制定治疗方案和决策。医生可以结合模型的预测结果和患者的临床信息制定更加精准的治疗方案。同时,模型还可以用于癌症患者的预后预测,提高人们的生活质量和健康水平。将深度学习模型与临床工作流程相结合,为医生提供了便捷、快速且准确的诊断支持系统。这种集成方式不仅提高了医生的工作效率,而且使得模型能够更好地适应实际临床需求。In the prediction stage, the pathological input image of the patient sample is input into the survival time prediction model. The prediction model will generate the corresponding survival time prediction results based on the knowledge and rules it has learned, thus providing doctors with a powerful auxiliary tool. The survival time prediction model can provide doctors with prediction results in real time, helping them to better formulate treatment plans and make decisions. Doctors can formulate more accurate treatment plans based on the prediction results of the model and the clinical information of the patient. At the same time, the model can also be used to predict the prognosis of cancer patients and improve people's quality of life and health level. Combining deep learning models with clinical workflows provides doctors with a convenient, fast and accurate diagnostic support system. This integration method not only improves the work efficiency of doctors, but also enables the model to better adapt to actual clinical needs.

将本申请的方案应用到肺肿瘤实验中,实验结果与对比分析如下:经过多轮实验和参数调整,取得了令人满意的实验结果。具体来说,在肺肿瘤良恶性鉴别任务上,本申请的多模态深度学习模型相比传统的单一模态模型,具有更高的准确性和更低的误判率。此外,还通过与其他先进的肺肿瘤鉴别方法进行比较,发现本申请在多个评价指标上均取得了显著的优势。附图6为本申请的融合模型在训练阶段和预测阶段的准确率。图7表示生存时间预测模型在不同的截断时间内,生存时间预测的准确率。The scheme of the present application was applied to lung tumor experiments, and the experimental results and comparative analysis are as follows: After multiple rounds of experiments and parameter adjustments, satisfactory experimental results were obtained. Specifically, in the task of distinguishing benign and malignant lung tumors, the multimodal deep learning model of the present application has higher accuracy and lower misjudgment rate than the traditional single modality model. In addition, by comparing with other advanced lung tumor identification methods, it was found that the present application has achieved significant advantages in multiple evaluation indicators. Figure 6 shows the accuracy of the fusion model of the present application in the training stage and the prediction stage. Figure 7 shows the accuracy of the survival time prediction model at different cutoff times.

为了验证本申请的实际应用效果,将其集成到了医院的临床工作流程中,并进行了为期数月的试用。期间,医生们积极反馈了本申请的使用情况和改进意见。根据实际应用反馈,医生们普遍认为本申请提供的诊断建议和预后预测结果准确可靠,能够为他们制定治疗方案和决策提供有力的支持。In order to verify the actual application effect of this application, it was integrated into the hospital's clinical workflow and tested for several months. During this period, doctors actively provided feedback on the use of this application and suggestions for improvement. According to the feedback from actual applications, doctors generally believe that the diagnostic suggestions and prognosis prediction results provided by this application are accurate and reliable, and can provide strong support for them to formulate treatment plans and make decisions.

通过一系列的实验、模拟和实际应用,证明了本申请的可行性和有效性。实验结果表明,本申请在肺肿瘤良恶性鉴别任务上具有较高的准确性和实用性,有望为临床诊断和治疗带来实质性的改进。同时,也将根据实际应用反馈和医生建议,不断完善和优化本申请的技术方案,以更好地服务于临床医生和患者。Through a series of experiments, simulations and practical applications, the feasibility and effectiveness of this application have been proven. The experimental results show that this application has high accuracy and practicality in the task of distinguishing benign and malignant lung tumors, and is expected to bring substantial improvements to clinical diagnosis and treatment. At the same time, the technical solution of this application will be continuously improved and optimized based on actual application feedback and doctor's suggestions to better serve clinicians and patients.

虽然本申请主要针对肺肿瘤进行了实验和验证,但其核心技术和方法同样适用于其他类型的肿瘤。通过调整模型结构和参数设置,也可以将其应用于乳腺癌、肝癌、胃癌等多种肿瘤类型的鉴别和预后预测。Although this application is mainly used for lung tumor experiments and verification, its core technology and methods are also applicable to other types of tumors. By adjusting the model structure and parameter settings, it can also be applied to the identification and prognosis prediction of various tumor types such as breast cancer, liver cancer, and gastric cancer.

在实际应用中,除了病理组学、转录组学和临床预后数据外,还可以考虑集成其他模态的数据,如蛋白质组学、代谢组学等,以更全面地反映肿瘤的特征和状态。这将进一步提高肿瘤鉴别和预后预测的准确性和可靠性。In practical applications, in addition to pathological omics, transcriptomics and clinical prognosis data, other modalities of data, such as proteomics, metabolomics, etc., can also be considered to be integrated to more comprehensively reflect the characteristics and status of the tumor. This will further improve the accuracy and reliability of tumor identification and prognosis prediction.

本申请提出了一种基于多模态深度学习的肺肿瘤诊断及预测系统,针对传统诊断方法中数据单一的问题,利用深度学习技术对肺肿瘤病理组学、转录组学和临床预后数据进行有效融合,提高了诊断的准确性和模型的可解释性,其中生存时间预测能够实时地为医生提供预测结果,为医生提供更全面、深入的诊断及预后信息,帮助医生更好地制定治疗方案和决策,能够有效融合到目前肺肿瘤临床诊疗流程中。This application proposes a lung tumor diagnosis and prediction system based on multimodal deep learning. To address the problem of single data in traditional diagnostic methods, deep learning technology is used to effectively integrate lung tumor pathological genomics, transcriptomics and clinical prognosis data, thereby improving the accuracy of diagnosis and the interpretability of the model. The survival time prediction can provide doctors with prediction results in real time, providing doctors with more comprehensive and in-depth diagnostic and prognostic information, helping doctors to better formulate treatment plans and make decisions, and can be effectively integrated into the current clinical diagnosis and treatment process of lung tumors.

针对多模态数据间的异质性,通过数据预处理和模态融合等技术手段消除差异和冲突,确保模型的有效性和稳定性,同时自适应调整不同模态数据的权重和贡献度。采用注意力机制、残差连接等先进技术优化模型性能,采用梯度下降、正则化、批量归一化等优化算法和技术手段加速模型收敛,利用数据增强和迁移学习等技术手段扩展训练数据集,提高模型准确性和泛化能力,使得模型能够同时处理多种模态的数据,并且具有更强的特征提取和分类能力。将多模态深度学习模型与临床工作流程相结合,为医生提供实时诊断建议和预后预测结果,能够适应不同临床场景需求。In view of the heterogeneity between multimodal data, technical means such as data preprocessing and modal fusion are used to eliminate differences and conflicts, ensure the effectiveness and stability of the model, and adaptively adjust the weights and contributions of different modal data. Advanced technologies such as attention mechanism and residual connection are used to optimize model performance. Optimization algorithms and technical means such as gradient descent, regularization, and batch normalization are used to accelerate model convergence. Technical means such as data enhancement and transfer learning are used to expand the training data set to improve model accuracy and generalization ability, so that the model can process data of multiple modalities at the same time and has stronger feature extraction and classification capabilities. The multimodal deep learning model is combined with the clinical workflow to provide doctors with real-time diagnostic suggestions and prognosis prediction results, which can adapt to the needs of different clinical scenarios.

本领域技术人员可以理解附图只是一个优选实施场景的示意图,附图中的模块或流程并不一定是实施本申请所必须的。本领域技术人员可以理解实施场景中的装置中的模块可以按照实施场景描述进行分布于实施场景的装置中,也可以进行相应变化位于不同于本实施场景的一个或多个装置中。上述实施场景的模块可以合并为一个模块,也可以进一步拆分成多个子模块。上述本申请序号仅仅为了描述,不代表实施场景的优劣。以上公开的仅为本申请的几个具体实施场景,但是,本申请并非局限于此,任何本领域的技术人员能思之的变化都应落入本申请的保护范围。Those skilled in the art will appreciate that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and the modules or processes in the accompanying drawings are not necessarily necessary for the implementation of the present application. Those skilled in the art will appreciate that the modules in the devices in the implementation scenario can be distributed in the devices of the implementation scenario according to the description of the implementation scenario, or can be changed accordingly and located in one or more devices different from the present implementation scenario. The modules of the above-mentioned implementation scenarios can be combined into one module, or can be further split into multiple sub-modules. The above-mentioned serial numbers of this application are only for description and do not represent the pros and cons of the implementation scenarios. The above disclosure is only a few specific implementation scenarios of the present application, but the present application is not limited thereto, and any changes that can be thought of by a technician in this field should fall within the scope of protection of the present application.

Claims (12)

1.一种基于多模态深度学习的肺肿瘤诊断及预测系统，其特征在于，包括：1. A lung tumor diagnosis and prediction system based on multimodal deep learning, characterized by comprising:

输入单元，用于获取患者包括涉及肺肿瘤病理图像的病理数据、涉及肺肿瘤组织RNA转录情况的转录数据和涉及肺肿瘤患者临床治疗的临床预后数据中的至少两种数据；an input unit, used to obtain at least two of a patient's data including pathological data relating to lung tumor pathological images, transcription data relating to RNA transcription of lung tumor tissue, and clinical prognosis data relating to clinical treatment of lung tumor patients;

预处理单元，用于对所述病理数据进行预设第一预处理得到病理输入图像、对所述转录数据进行预设第二预处理得到转录输入数据、对所述临床预后数据进行第三预处理得到临床输入数据；a preprocessing unit, configured to perform a preset first preprocessing on the pathological data to obtain a pathological input image, perform a preset second preprocessing on the transcription data to obtain transcription input data, and perform a third preprocessing on the clinical prognosis data to obtain clinical input data;

分层处理单元，用于将所述病理输入图像、所述转录输入数据和所述临床输入数据输入到融合有病理模型、转录模型和临床模型的多模态预测模型中，使所述病理模型根据所述病理输入图像输出n个病理结果特征，使所述转录模型根据所述转录输入数据输出n个转录结果特征，使所述临床模型根据所述临床输入数据输出n个临床结果特征；a hierarchical processing unit, configured to input the pathological input image, the transcription input data and the clinical input data into a multimodal prediction model integrating a pathological model, a transcription model and a clinical model, so that the pathological model outputs n pathological result features according to the pathological input image, the transcription model outputs n transcription result features according to the transcription input data, and the clinical model outputs n clinical result features according to the clinical input data;

融合处理单元，用于通过所述多模态预测模型融合n个所述病理结果特征、n个所述转录结果特征以及n个临床结果特征，得到最终的肿瘤预测结果；a fusion processing unit, used for fusing the n pathological result features, the n transcription result features and the n clinical result features through the multimodal prediction model to obtain a final tumor prediction result;

生存时间预测单元，用于将所述病理输入图像输入到预设生存时间预测模型中，得到最终预测的患者的生存时间。The survival
time prediction unit is used to input the pathological input image into a preset survival time prediction model to obtain the final predicted survival time of the patient. 其中,所述病理结果特征、所述转录结果特征、所述临床结果特征和所述肿瘤预测结果均涉及预测的肿瘤类型及其概率,n为不小于2的整数。Among them, the pathological result characteristics, the transcription result characteristics, the clinical result characteristics and the tumor prediction results all involve predicted tumor types and their probabilities, and n is an integer not less than 2. 2.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,所述多模态预测模型的表达式为:2. The lung tumor diagnosis and prediction system according to claim 1, wherein the expression of the multimodal prediction model is: 其中,表示多模态预测模型的输出,Wclass是所述病理模型、所述转录模型和所述临床模型的输出的权重矩阵,BatchNorm表示批归一化层,Dropout表示Dropout函数,ReLU表示ReLU激活函数,Wfusion是融合层的权重矩阵,bfusion是融合层的偏置项,f1、f2和f3分别表示所述病理结果特征、所述转录结果特征和所述临床结果特征;μbatch表示批次的平均值;表示批次的方差;∈是用于数值稳定性的小标量,γ和β是BatchNorm层的可训练参数;x是融合层和之前操作的输出(ReLU,Dropout)。in, represents the output of the multimodal prediction model, W class is the weight matrix of the output of the pathological model, the transcriptional model and the clinical model, BatchNorm represents the batch normalization layer, Dropout represents the Dropout function, ReLU represents the ReLU activation function, W fusion is the weight matrix of the fusion layer, b fusion is the bias term of the fusion layer, f 1 , f 2 and f 3 represent the pathological result feature, the transcriptional result feature and the clinical result feature respectively; μ batch represents the average value of the batch; represents the variance of the batch; ∈ is a small scalar for numerical stability, γ and β are trainable parameters of the BatchNorm layer; x is the output of the fusion layer and the previous operation (ReLU, Dropout). 3.根据权利要求2所述的肺肿瘤诊断及预测系统,其特征在于,在所述多模态预测模型中:3. 
The lung tumor diagnosis and prediction system according to claim 2, characterized in that, in the multimodal prediction model: 通过融合层将n个病理结果特征、n个转录结果特征以及n个临床结果特征作为输入特征融合到低维空间中,得到融合特征;The n pathological result features, n transcription result features, and n clinical result features are fused into the low-dimensional space as input features through the fusion layer to obtain fusion features; 在融合特征上应用ReLU激活函数引入非线性;Applying ReLU activation function to the fused features introduces nonlinearity; 在ReLU激活后的特征上应用Dropout函数,丢弃预设比例的输入特征以减少过拟合;Apply the Dropout function to the features after ReLU activation to discard a preset proportion of input features to reduce overfitting; 在Dropout后的特征上应用批量归一化层进行特征规一化;Apply batch normalization layer to the features after Dropout to normalize the features; 通过预设全连接层将所述融合特征映射最终的肿瘤类别上,得到最终的肿瘤预测结果。The fusion features are mapped to the final tumor category through a preset fully connected layer to obtain the final tumor prediction result. 4.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,所述第一预处理包括图像的亮度调整、对比度调整、饱和度调整、Gamma校正以及像素标准化;所述第二预处理包括转录数据的标准化处理以及提取表达拷贝数非零的基因;所述第三预处理包括临床预后数据的缺失值处理和离群值处理。4. The lung tumor diagnosis and prediction system according to claim 1 is characterized in that the first preprocessing includes brightness adjustment, contrast adjustment, saturation adjustment, gamma correction and pixel standardization of the image; the second preprocessing includes standardization of transcription data and extraction of genes with non-zero expression copy number; the third preprocessing includes missing value processing and outlier processing of clinical prognosis data. 5.根据权利要求4所述的肺肿瘤诊断及预测系统,其特征在于,所述第二预处理还包括:对所述转录数据进行标准化处理后,提取表达拷贝数非零的基因;对提取的基因表达矩阵降维至与预设肿瘤类型分类总数相匹配的维度DIM,从降维后的数据中提取m*DIM个分类高贡献基因得到所述转录输入数据,m为不小于2的整数。5. 
The lung tumor diagnosis and prediction system according to claim 4 is characterized in that the second preprocessing also includes: after standardizing the transcription data, extracting genes with non-zero expression copy numbers; reducing the dimension of the extracted gene expression matrix to a dimension DIM that matches the total number of preset tumor type classifications, and extracting m*DIM classification high contribution genes from the reduced-dimensional data to obtain the transcription input data, where m is an integer not less than 2. 6.根据权利要求5所述的肺肿瘤诊断及预测系统,其特征在于,所述转录模型的输入层具有m*DIM个神经元;6. The lung tumor diagnosis and prediction system according to claim 5, characterized in that the input layer of the transcription model has m*DIM neurons; 所述转录模型具有两个全连接层;第一个全连接层之后引入有ReLU激活函数,以引入非线性;第一个全连接层之后应用有Dropout函数来随机丢弃预设比例的输出,以减少过拟合;第二个全连接层具有n个神经元。The transcription model has two fully connected layers; a ReLU activation function is introduced after the first fully connected layer to introduce nonlinearity; a Dropout function is applied after the first fully connected layer to randomly discard a preset proportion of outputs to reduce overfitting; and the second fully connected layer has n neurons. 7.根据权利要求4所述的肺肿瘤诊断及预测系统,其特征在于,所述临床预后数据为以文字描述或图表形式展示的用于评估疾病发展可能的结果以及患者治疗效果和生存前景的统计指标;7. The lung tumor diagnosis and prediction system according to claim 4, characterized in that the clinical prognosis data are statistical indicators presented in the form of text descriptions or charts for evaluating possible outcomes of disease development and patient treatment effects and survival prospects; 所述第三预处理还包括:提取临床预后数据中包括肿瘤分期、生存时间在内的关键信息。The third preprocessing further includes: extracting key information including tumor stage and survival time from the clinical prognosis data. 8.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,所述临床输入数据中包含数值特征和类别特征;8. 
The lung tumor diagnosis and prediction system according to claim 1, wherein the clinical input data includes numerical features and categorical features; 在所述临床模型中,使用嵌入层来处理所述类别特征,将其映射到高维空间;将所述数值特征和嵌入层输出的类别特征拼接,以形成单一的输入向量;通过两个全连接层对所述输入向量进行特征提取和降维,通过ReLU激活函数引入非线性,应用Dropout函数以减少过拟合。In the clinical model, an embedding layer is used to process the category features and map them to a high-dimensional space; the numerical features and the category features output by the embedding layer are concatenated to form a single input vector; the input vector is subjected to feature extraction and dimensionality reduction through two fully connected layers, nonlinearity is introduced through the ReLU activation function, and the Dropout function is applied to reduce overfitting. 9.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,在所述病理模型中,输出层为具有n个神经元的全连接层;9. The lung tumor diagnosis and prediction system according to claim 1, characterized in that, in the pathological model, the output layer is a fully connected layer with n neurons; 在所述病理模型训练过程中,利用交叉熵损失函数来衡量模型输出与训练数据之间的差异,并通过反向传播算法计算损失相对于模型参数的梯度;使用SGD优化器根据计算出的梯度更新模型权重,其中学习率初始设置为0.001。During the pathological model training process, the cross entropy loss function is used to measure the difference between the model output and the training data, and the gradient of the loss relative to the model parameters is calculated by the back propagation algorithm; the SGD optimizer is used to update the model weights according to the calculated gradient, and the learning rate is initially set to 0.001. 10.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,在所述融合处理单元中,所述病理结果特征、所述转录结果特征、所述临床结果特征均具有对应的权重,通过所述多模态预测模型融合n个所述病理结果特征、n个所述转录结果特征、n个所述临床结果特征及其对应的权重,得到最终的肿瘤预测结果。10. 
The lung tumor diagnosis and prediction system according to claim 1 is characterized in that, in the fusion processing unit, the pathological result features, the transcription result features, and the clinical result features all have corresponding weights, and the multimodal prediction model is used to fuse n of the pathological result features, n of the transcription result features, n of the clinical result features and their corresponding weights to obtain a final tumor prediction result. 11.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,在所述生存时间预测模型中,使用预训练的卷积神经网络模型提取所述病理输入图像的图像特征,然后使用多层感知机基于所述图像特征进行生存时间预测,得到最终的生存时间预测结果。11. The lung tumor diagnosis and prediction system according to claim 1 is characterized in that, in the survival time prediction model, a pre-trained convolutional neural network model is used to extract image features of the pathological input image, and then a multi-layer perceptron is used to predict the survival time based on the image features to obtain the final survival time prediction result. 12.根据权利要求1所述的肺肿瘤诊断及预测系统,其特征在于,在所述生存时间预测模型的训练过程中:12. The lung tumor diagnosis and prediction system according to claim 1, characterized in that during the training process of the survival time prediction model: 使用均方误差(MSE)作为损失函数,Adam优化器进行参数优化;The mean square error (MSE) is used as the loss function and the Adam optimizer is used for parameter optimization; 在每个训练周期中,通过前向传播计算预测值,通过反向传播更新权重。In each training cycle, predictions are calculated via forward propagation and weights are updated via backpropagation.
CN202410665633.8A 2024-05-27 2024-05-27 Lung tumor diagnosis and prediction system based on multi-mode deep learning Pending CN118553407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410665633.8A CN118553407A (en) 2024-05-27 2024-05-27 Lung tumor diagnosis and prediction system based on multi-mode deep learning


Publications (1)

Publication Number Publication Date
CN118553407A true CN118553407A (en) 2024-08-27

Family

ID=92449444

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410665633.8A Pending CN118553407A (en) 2024-05-27 2024-05-27 Lung tumor diagnosis and prediction system based on multi-mode deep learning

Country Status (1)

Country Link
CN (1) CN118553407A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118824524A (en) * 2024-09-13 2024-10-22 江苏省疾病预防控制中心(江苏省预防医学科学院) Multimodal lung disease diagnosis and screening system based on GAN algorithm
CN118969290A (en) * 2024-10-16 2024-11-15 南京汉卫公共卫生研究院有限公司 Tumor disease risk prediction and drug screening method and system based on artificial intelligence
CN119339956A (en) * 2024-12-20 2025-01-21 吉林大学第一医院 Endocrine disease data processing system and method based on AI intelligence
CN119400336A (en) * 2024-09-10 2025-02-07 中国人民解放军总医院第一医学中心 A multivariate-based pathological data analysis method
CN119446495A (en) * 2024-11-05 2025-02-14 南通大学 A multi-task method for full-volume histopathology images based on fuzzy kernel
CN119626571A (en) * 2024-11-27 2025-03-14 广东医科大学 Liver cancer analysis method, system, equipment and medium based on multimodal deep learning
CN119650072A (en) * 2025-02-18 2025-03-18 温州市中心医院 Intelligent classification prediction method, system and equipment for tumor treatment
CN119811667A (en) * 2024-12-23 2025-04-11 南京医科大学 Cardiovascular disease detection method and system integrating medical imaging and molecular omics features
CN119833128A (en) * 2025-03-19 2025-04-15 杭州一真医疗器械有限公司 Tumor diagnosis method and device based on meridian digital human and continuous behavior data
CN119905237A (en) * 2024-11-29 2025-04-29 上海交通大学 A prognosis prediction method based on multimodal intermediate fusion
CN120432189A (en) * 2025-04-25 2025-08-05 重庆医科大学 Deep learning-assisted prevention and treatment management platform for latent tuberculosis infection

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090098538A1 (en) * 2006-03-31 2009-04-16 Glinsky Gennadi V Prognostic and diagnostic method for disease therapy
JP2018068752A (en) * 2016-10-31 2018-05-10 株式会社Preferred Networks Machine learning device, machine learning method and program
US20190114511A1 (en) * 2017-10-16 2019-04-18 Illumina, Inc. Deep Learning-Based Techniques for Training Deep Convolutional Neural Networks
CN114359666A (en) * 2021-12-28 2022-04-15 清华珠三角研究院 Multi-mode fusion lung cancer patient curative effect prediction method, system, device and medium
CN116386725A (en) * 2023-03-21 2023-07-04 南开大学 Method and system for predicting tumor differential gene expression profile combined with pathomic features
CN116403721A (en) * 2023-03-24 2023-07-07 首都医科大学附属北京胸科医院 A prediction system for small cell lung cancer
CN117198536A (en) * 2023-10-18 2023-12-08 宁波市临床病理诊断中心 Multi-mode group low-level glioma auxiliary treatment decision-making system based on machine learning
CN117594225A (en) * 2023-12-05 2024-02-23 桂林电子科技大学 Multimodal fusion survival prognosis method and device based on pathology and genes
CN117912694A (en) * 2024-02-02 2024-04-19 安徽工业大学 Multi-mode cancer survival risk prediction method based on deep learning



Similar Documents

Publication Publication Date Title
CN118553407A (en) Lung tumor diagnosis and prediction system based on multi-mode deep learning
Wells et al. Artificial intelligence in dermatopathology: Diagnosis, education, and research
US11328798B2 (en) Utilizing multiple sub-models via a multi-model medical scan analysis system
CN114846507B (en) Methods and systems for non-invasive genetic testing using artificial intelligence (AI) models
Mahesh et al. An XAI-enhanced efficientNetB0 framework for precision brain tumor detection in MRI imaging
WO2021114130A1 (en) Unsupervised self-adaptive mammary gland lesion segmentation method
US20240161035A1 (en) Multi-model medical scan analysis system and methods for use therewith
CN113643297B (en) A computer-aided tooth age analysis method based on neural network
CN113469958A (en) Method, system, equipment and storage medium for predicting development potential of embryo
Tsaniya et al. Automatic radiology report generator using transformer with contrast-based image enhancement
CN118196013B (en) Multi-task medical image segmentation method and system supporting collaborative supervision of multiple doctors
CN117976185A (en) A breast cancer risk assessment method and system based on deep learning
CN111179277B (en) An Unsupervised Adaptive Breast Lesion Segmentation Method
Geroski et al. Enhancing COVID-19 disease severity classification through advanced transfer learning techniques and optimal weight initialization schemes
CN120126779A (en) A risk prediction method for thyroid disease
CN119626571B (en) Liver cancer analysis method, system, equipment and medium based on multimodal deep learning
Abraham et al. Transparent brain tumor detection using DenseNet169 and LIME
CN120163789A (en) A medical image analysis method based on deep learning
Parthiban et al. Prediction of Lymphoma cancer by Analyzing histopathological image using machine learning
Hrizi et al. Lung cancer detection and classification using CNN and image segmentation
KR20250047009A (en) Image diagnosis apparatus for diagnosing otitis media and method thereof
Srilakshmi et al. Data augmentation-based diabetic retinopathy classification and grading with the dynamic weighted optimization approach
Venkatachalam et al. Hybrid deep learning model combining xception and resnet with backpropagation and sgd for robust lung and colon cancer classification
Hashmi et al. A hybrid deep learning approach for rice plant disease detection
Sunil et al. Efficient diabetic retinopathy classification grading using GAN based EM and PCA learning framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination