WO2022083140A1 - Patient length of stay prediction method and apparatus, electronic device, and storage medium - Google Patents

Publication number
WO2022083140A1
WO2022083140A1 PCT/CN2021/099644 CN2021099644W WO2022083140A1 WO 2022083140 A1 WO2022083140 A1 WO 2022083140A1 CN 2021099644 W CN2021099644 W CN 2021099644W WO 2022083140 A1 WO2022083140 A1 WO 2022083140A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
training
data
prediction model
classification
Prior art date
Application number
PCT/CN2021/099644
Other languages
French (fr)
Chinese (zh)
Inventor
吴静依
李鹏飞
李青
张路霞
Original Assignee
杭州未名信科科技有限公司
浙江省北大信息技术高等研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州未名信科科技有限公司, 浙江省北大信息技术高等研究院
Publication of WO2022083140A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2451 Classification techniques relating to the decision surface linear, e.g. hyperplane
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/20 ICT specially adapted for the handling or processing of medical references relating to practices or guidelines

Definitions

  • the present application relates to the technical field of data processing, and in particular to a method, device, electronic device and storage medium for predicting the length of hospitalization of a patient.
  • the length of hospital stay is a key indicator for evaluating the efficiency of medical resource utilization.
  • an intelligent length-of-stay prediction system can help clinicians identify patients at high disease risk and provide timely medical intervention, thereby improving the patient's in-hospital prognosis; it can also help doctors allocate limited medical resources so that their utilization efficiency is maximized; and it can give patients and their families information about the expected length of stay early in the admission, so that they better understand the illness and the likely course of hospitalization, thereby improving patient satisfaction with medical services and reducing doctor-patient conflicts caused by information asymmetry.
  • taking kidney disease as an example, chronic kidney disease is a group of common chronic diseases arising from kidney damage caused by various primary kidney diseases, diabetes, hypertension and the like.
  • China's kidney disease healthcare system urgently needs an intelligent clinical decision support system to improve medical efficiency and patient prognosis.
  • existing predictions of a patient's length of stay generally rely on the clinician's work experience. Because patients' conditions are complex and a doctor's experience is highly subjective, such prediction is difficult, inefficient and inaccurate, and it cannot effectively assist doctors' clinical decision-making or improve medical efficiency.
  • a model that predicts the length of hospitalization as a number accurate to the day often has a large error. Converting length-of-stay prediction from a numerical prediction problem into an ordered multi-classification problem makes the differences in patient characteristics between classification groups more distinct, which can improve the prediction accuracy of the model, and the classification results still provide enough information for clinical decision support and patient consultation.
  • ordered multi-classification problems are generally solved with numerical prediction models or unordered multi-class prediction models:
  • numerical prediction models assume that the categories of the outcome variable follow a proportional relationship, whereas in real-world ordered multi-class data the categories often do not follow a strict proportional relationship; unordered multi-class prediction models ignore the progressive relationship between the categories of the ordered outcome variable altogether, so the performance of the prediction model is often limited.
  • when the data are imbalanced across the categories of the ordered multi-class outcome variable, the unordered multi-class prediction model produces large prediction errors.
  • the purpose of the present application is to provide a method, device, electronic device and storage medium for predicting the length of a patient's hospital stay.
  • a brief summary is given below. This summary is not intended to be an extensive review, nor is it intended to identify key/critical elements or delineate the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
  • a method for predicting the length of hospitalization of a patient including:
  • the samples to be predicted are selected and input into the trained prediction model to obtain the prediction result.
  • the prediction method further includes:
  • training data is extracted to form a training data set.
  • the prediction method further includes:
  • the selected predictive features are supplemented and adjusted to obtain preset predictive features.
  • performing data cleaning includes:
  • the binary classification base learner is a gradient boosting decision tree algorithm.
  • the use of the training data set to train each of the two-class base learners until each of the two-class base learners meets performance index requirements including:
  • step S2: determine whether m < M; if so, go to step S3; if not, skip to step S7;
  • the random hyperparameter search combined with the five-fold cross-validation method is used to realize the hyperparameter optimization of each basic learner, and the F1 score is used as the reference index of the model prediction performance of the hyperparameter optimization.
  • the prediction method also includes:
  • the prediction model is updated periodically and synchronously.
  • a device for predicting hospitalization length of a patient including:
  • the building module is used to construct an ordered multi-class prediction model by cascading and concatenating multiple binary classification base learners;
  • a training module used for training each of the basic learners by using the training data set until each of the basic learners meets the performance index requirements, and obtains a trained prediction model
  • the prediction module is used for selecting samples to be predicted and inputting the trained prediction model according to the preset prediction features to obtain prediction results.
  • an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the above-mentioned method for predicting the length of hospitalization of a patient.
  • a computer-readable storage medium on which a computer program is stored, and the program is executed by a processor to implement the above-mentioned method for predicting the length of hospitalization of a patient.
  • an ordered multi-class prediction model is constructed by cascading multiple binary base learners in series, and the ordered multi-class prediction task is divided into several progressive, layer-by-layer binary classification tasks.
  • each layer has a base learner, and the information of the sample to be predicted is input into the trained base learners layer by layer to obtain the predicted category; this preserves the sequential progressive relationship between the categories of the ordered multi-class outcome variable and does not assume a proportional relationship between the ordered categories, which is more in line with the characteristics of real data.
  • FIG. 1 shows a flowchart of a method for predicting hospitalization length of a patient according to an embodiment of the present application
  • FIG. 2 shows a flowchart of the training process of the base learner in an embodiment of the present application
  • FIG. 3 shows a flowchart of selecting a sample to be predicted and inputting a trained prediction model to obtain a prediction result in an embodiment of the present application
  • FIG. 4 shows a structural block diagram of an apparatus for predicting hospitalization length of a patient provided by another embodiment of the present application
  • FIG. 5 shows a structural block diagram of an electronic device provided by another embodiment of the present application.
  • FIG. 6 shows a structural block diagram of an intelligent prediction system for the length of stay of a patient with kidney disease provided by another embodiment of the present application.
  • an embodiment of the present application provides a method for predicting the length of hospitalization of a patient, including the following steps:
  • a patient with renal disease is used as an example.
  • the method in this embodiment is not limited to being used for patients with renal disease, but can also be used for predicting the length of hospitalization for patients with other diseases.
  • Based on the electronic medical record data of kidney disease patients in the hospital information management system after data cleaning, effective modeling data is extracted. Modeling data is the training data used to train the base learner.
  • a certain number of predictive features with high predictive value and easy to collect in clinical practice are selected from the modeling database to form a feature subset for modeling.
  • a predictive feature set is extracted from the electronic medical record data in the hospital information management system, wherein the predictive feature set includes: demographic features, kidney disease features, medical treatment features, general disease features, laboratory test index features, etc.
  • Demographic characteristics include: age, gender, marital status, occupation, education level, medical insurance type and other parameter data;
  • kidney disease characteristics include: chronic kidney disease stage, primary disease underlying the kidney disease, years since diagnosis of kidney disease and other parameter data;
  • the characteristics of medical treatment include: type of medical institution, number of hospitalizations, admission status, admission route, admission department and other parameter data;
  • General disease characteristics include: the cause of admission, whether there is comorbidity (diabetes, hypertension, tumor, chronic obstructive pulmonary disease, pulmonary infection, cardiovascular disease, cerebrovascular disease, chronic liver disease) and other parameter data;
  • Laboratory test index characteristics include: blood routine, urine routine, urine protein/creatinine, serum creatinine, blood glucose, blood lipids, electrolytes, serum calcium, serum phosphorus, parathyroid hormone and other parameter data.
  • first, a recursive feature elimination algorithm is used to screen out a subset of predictive features with high predictive value for the length of stay of patients with renal disease; second, the selected subset is supplemented and adjusted in combination with expert knowledge.
  • the feature selection combining expert knowledge and feature screening algorithm is beneficial to ensure the accuracy of screening features and the feasibility of clinical practice. Feature screening can reduce the complexity of predictive models and facilitate clinical practice.
  • a multi-class prediction model is constructed by cascading and concatenating multiple binary base learners.
  • the length of hospitalization of patients with renal disease is divided into M categories ordered from low to high, and the predictive feature subset screened in step S2 is used as the input of the prediction model. Using the cascaded layer-by-layer modeling algorithm with the gradient boosting decision tree algorithm as the base learner, a prediction model for the length of stay of patients with kidney disease is constructed. The hyperparameters of each base learner are optimized by random hyperparameter search combined with five-fold cross-validation, with the F1 score as the reference indicator of model prediction performance during hyperparameter optimization.
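As a minimal sketch of dividing the length of stay into M ordered categories, the snippet below maps a stay in days to a category index from 1 to M with the standard-library `bisect` module. The cut points (7, 14 and 28 days) and the function name `los_category` are illustrative assumptions; the application does not state the category boundaries.

```python
from bisect import bisect_left

# Hypothetical upper bounds (in days) of categories 1..M-1; not taken from the application.
CUT_POINTS = [7, 14, 28]  # three cut points give M = 4 ordered categories


def los_category(days, cut_points=CUT_POINTS):
    """Return the ordered category (1..M) for a length of stay in days.

    Category k covers stays up to and including cut_points[k-1] that are
    longer than the previous cut point; the last category is open-ended.
    """
    if days < 0:
        raise ValueError("length of stay cannot be negative")
    return bisect_left(cut_points, days) + 1


if __name__ == "__main__":
    for d in (3, 7, 8, 30):
        print(d, "->", los_category(d))
```

Because the categories are plain integers in increasing order, they can be used directly as the ordered outcome variable y in the cascaded model.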
  • the basic structure of the cascaded layer-by-layer modeling algorithm in this embodiment adopts a multi-level integrated architecture, which is composed of multiple binary classification base learners connected in series. Each layer trains a base learner respectively.
  • the prediction model contains M-1 base learners. M is the number of classification categories of the prediction model.
  • the M categories of outcome variables are arranged in increasing order.
  • the training data subset of the mth base learner is the data with y ≥ the mth category.
  • y is an outcome variable containing an ordered M classification, and the M categories of the outcome variable are arranged in increasing order to obtain the first category < the second category < ... < the mth category < ... < the Mth category;
  • x represents the set of predicted features for the training samples.
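The per-layer subset construction can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: categories are encoded as integers 1..M, the two training labels are encoded as 0 (y equals the mth category) and 1 (y is above the mth category), and `fit` is a placeholder for whatever routine trains a binary base learner, such as a gradient boosting decision tree. The function names are hypothetical.

```python
def layer_subset(samples, m):
    """Build the training subset and binary labels for the m-th base learner.

    samples: list of (x, y) pairs, with y an integer category in 1..M.
    Keeps samples with y >= m; label 0 marks y == m, label 1 marks y > m.
    """
    xs, labels = [], []
    for x, y in samples:
        if y >= m:
            xs.append(x)
            labels.append(0 if y == m else 1)
    return xs, labels


def train_cascade(samples, num_classes, fit):
    """Train M-1 binary base learners, one per layer, and return them in order.

    fit(xs, labels) is any callable that returns a trained binary model.
    """
    learners = []
    m = 1
    while m < num_classes:  # one learner for each of the first M-1 categories
        xs, labels = layer_subset(samples, m)
        learners.append(fit(xs, labels))
        m += 1
    return learners


if __name__ == "__main__":
    data = [([0.1], 1), ([0.4], 2), ([0.5], 2), ([0.9], 3)]
    # Stand-in "learner" that only records how many samples carry each label.
    count_fit = lambda xs, labels: (labels.count(0), labels.count(1))
    print(train_cascade(data, 3, count_fit))
```

Each layer drops the samples already assigned to lower categories, so every base learner only has to separate one category from everything above it.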
  • the training process of the base learner includes the following steps:
  • the random hyperparameter search combined with the five-fold cross-validation method is used to realize the hyperparameter optimization of each basic learner, and the F1 score is used as the reference index of the model prediction performance of the hyperparameter optimization.
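The combination of random hyperparameter search, five-fold cross-validation and the F1 score can be sketched with the standard library alone. Everything named here (`f1_score`, `k_fold_indices`, `random_search`, the `fit` and `sample_params` callables) is a hypothetical illustration; the folds are contiguous and unshuffled for brevity, whereas a real pipeline would normally shuffle or stratify them.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 score of the positive class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n, k=5):
    """Yield (train_idx, valid_idx) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, valid
        start += size


def random_search(xs, ys, fit, sample_params, n_iter=20, k=5, seed=0):
    """Return (best_params, best_mean_f1) over n_iter random hyperparameter draws.

    fit(xs, ys, **params) returns a callable model; sample_params(rng) returns a dict.
    """
    rng = random.Random(seed)
    best_params, best_score = None, -1.0
    for _ in range(n_iter):
        params = sample_params(rng)
        scores = []
        for train, valid in k_fold_indices(len(xs), k):
            model = fit([xs[i] for i in train], [ys[i] for i in train], **params)
            preds = [model(xs[i]) for i in valid]
            scores.append(f1_score([ys[i] for i in valid], preds))
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_params, best_score = params, mean
    return best_params, best_score
```

In this sketch the search would be run once per layer, on that layer's training subset, so that each base learner receives its own tuned hyperparameters.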
  • the information of newly admitted patients is input into the hospitalization length prediction model, and the prediction results are obtained, and the prediction results and diagnosis and treatment suggestions are displayed visually.
  • step S5 specifically includes:
  • step S54: determine whether m is equal to M; if yes, the final prediction category of the sample is the Mth category, and skip to step S55; if not, return to step S52;
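Only step S54 is reproduced in this excerpt, so the following is a sketch of layer-by-layer prediction inferred from that step and from the training scheme, not a transcription of steps S51 to S55. It assumes that each trained base learner is a callable returning 0 when the sample is judged to belong to the mth category and 1 when it is judged to belong to a higher category; the function name `predict_cascade` is hypothetical.

```python
def predict_cascade(learners, x):
    """Return the predicted ordered category (1..M) for one sample x.

    learners: list of M-1 callables; learners[m-1](x) returns 0 if the
    sample is judged to belong to category m, or 1 if it is judged to
    belong to a higher category.
    """
    num_classes = len(learners) + 1
    m = 1
    while m < num_classes:
        if learners[m - 1](x) == 0:
            return m  # stop at the first layer that claims the sample
        m += 1
    return num_classes  # every layer passed the sample on: the Mth category


if __name__ == "__main__":
    # Toy learners on a single numeric feature, for illustration only.
    toy_learners = [lambda x: int(x > 7), lambda x: int(x > 14)]
    for value in (5, 10, 20):
        print(value, "->", predict_cascade(toy_learners, value))
```

A sample therefore visits at most M-1 learners, and the order in which the learners are consulted is what encodes the ordering of the categories.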
  • the hospitalization duration prediction model is updated synchronously on a regular basis.
  • at the end of each year, the modeling data is updated based on the system data of the past three years, a new length-of-stay prediction model is constructed according to the method described in step S3, and the updated model replaces the historical prediction model, thereby realizing regular synchronous updates of the length-of-stay prediction model.
  • the method for predicting the length of stay of a patient in the embodiment of the present application is based on a cascaded layer-by-layer modeling algorithm for ordered multi-classification prediction, adopts a multi-level integrated architecture formed by cascading a plurality of base learners, and is suitable for situations where there are multiple ordered categories that do not follow a proportional relationship or where the data are imbalanced across the categories.
  • the method provided by the embodiment of the present application divides the ordered multi-category prediction task into several progressive binary classification tasks; each layer trains a base learner, and the information of the new sample to be predicted is input layer by layer into each trained base learner until its predicted class is obtained and output.
  • the cascaded layer-by-layer modeling algorithm retains the sequential progressive relationship between the categories in the ordered multi-category outcome variable, and does not assume the proportional relationship between the ordered categories, which is more in line with the real data characteristics.
  • the data of the two categories in the data set used for training each layer of the base learner is relatively balanced, which can effectively solve the problem of data imbalance between multiple categories.
  • as shown in FIG. 4, another embodiment of the present application provides a device for predicting the length of hospitalization of a patient, including:
  • the building module 30 is used for constructing an ordered multi-classification prediction model by cascading and concatenating a plurality of binary classification base learners;
  • a training module 40 configured to train each of the basic learners by using the training data set until each of the basic learners meets the performance index requirements, and obtain a trained prediction model
  • the prediction module 50 is configured to select the samples to be predicted and input the trained prediction model according to the preset prediction feature to obtain the prediction result.
  • the prediction device further includes a data extraction module 10 for performing data cleaning based on the patient's electronic medical record data in the hospital information management system before using the training data set to train each basic learner, and extracting the training data to form a training dataset.
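  • the prediction device further includes a prediction feature acquisition module 20, which is used, before the samples to be predicted are selected according to the preset prediction features and input into the trained prediction model, to screen out predictive features with high predictive value for the patient's length of stay from the electronic medical record data of the hospital information management system or from the training data set;
  • the selected predictive features are then supplemented and adjusted in combination with expert knowledge to obtain the preset prediction features.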
  • the data extraction module 10 includes a cleaning unit for performing data cleaning, and the cleaning unit is specifically used for:
  • the binary classification base learner is a gradient boosting decision tree algorithm.
  • the training module 40 is specifically used to:
  • step S12: determine whether m < M; if yes, go to step S13; if not, go to step S17;
  • the training module 40 is further configured to use random hyperparameter search combined with five-fold cross-validation to optimize the hyperparameters of each base learner, with the F1 score as the reference index of model prediction performance for hyperparameter optimization.
  • the prediction apparatus further includes an update module 60, and the update module 60 is configured to periodically update the prediction model synchronously based on the update of the electronic medical record data in the hospital information management system.
  • the electronic device 70 may include a processor 700, a memory 701, a bus 702 and a communication interface 703; the processor 700, the communication interface 703 and the memory 701 are connected through the bus 702; the memory 701 stores a computer program that can be run on the processor 700, and when the processor 700 runs the computer program, the method for predicting the length of hospitalization of a patient provided by any of the foregoing embodiments of the present application is executed.
  • Another embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and the program is executed by a processor to implement the above-mentioned method for predicting the length of hospitalization of a patient.
  • as shown in FIG. 6, another embodiment of the present application provides an intelligent prediction system for the length of hospitalization of patients with renal disease, including:
  • an input module, at least used for entering information on newly admitted kidney disease patients;
  • a prediction module, at least used to predict the length of stay from the data of a newly admitted patient with the kidney disease length-of-stay prediction model constructed and trained by the aforementioned method;
  • a display module at least used to display the visual prediction results.
  • the method for predicting the length of stay of a patient in the embodiment of the present application can achieve the following beneficial effects: the cascaded layer-by-layer modeling algorithm based on ordered multi-classification retains the sequential progressive relationship between categories and does not assume a proportional relationship between the ordered categories, which is more in line with real data characteristics; by splitting the data set layer by layer, the data of the two categories in the data set used to train each layer's base learner are made relatively balanced, which can effectively address the problem of data imbalance between multiple categories.
  • the present disclosure mines the patient data collected by the hospital electronic case data management system based on the cascaded layer-by-layer modeling algorithm, and uses the gradient boosting decision tree algorithm as the base learner to construct a patient-oriented hospitalization duration prediction model and system.
  • the method, apparatus, electronic device, and computer-readable storage medium provided by the embodiments of the present application are not limited to predicting the length of hospitalization for patients with kidney disease, but can also be widely used for predicting the length of hospitalization for patients with other diseases.
  • the term "module" is not intended to be limited to a particular physical form. Depending on the specific application, a module may be implemented in hardware, firmware, software, and/or a combination thereof. Furthermore, different modules can share common components or even be implemented by the same components. There may or may not be clear boundaries between different modules.

Abstract

Disclosed in the present application are a patient length of stay prediction method and apparatus, an electronic device, and a storage medium. The method comprises: constructing an ordered multi-classification prediction model by means of cascade concatenation of a plurality of binary classification base learners; training each base learner by using a training data set until each base learner meets performance index requirements to obtain a trained prediction model; and according to a preset prediction feature, selecting a sample to be predicted, and inputting same into the trained prediction model to obtain a prediction result. According to the method of the present application, the ordered multi-classification prediction model is constructed by means of cascade concatenation of the plurality of binary classification base learners; a sequence progressive relationship among categories in ordered multi-classification outcome variables is reserved, and the ordered categories are not assumed to be a geometrically proportional relationship, thereby more meeting real data features; the data set is split layer by layer, so that data of two categories in the data set for training each base learner is relatively balanced, thereby effectively solving the problem of unbalance among multi-category data, and improving the accuracy of the prediction result.

Description

Method, device, electronic device and storage medium for predicting the length of hospital stay of patients
Technical Field
The present application relates to the technical field of data processing, and in particular to a method, device, electronic device and storage medium for predicting the length of hospitalization of a patient.
Background Art
The length of hospital stay is a key indicator for evaluating the efficiency of medical resource utilization. An intelligent length-of-stay prediction system can help clinicians identify patients at high disease risk and provide timely medical intervention, thereby improving the patient's in-hospital prognosis; it can also help doctors allocate limited medical resources so that their utilization efficiency is maximized; and it can give patients and their families information about the expected length of stay early in the admission, so that they better understand the illness and the likely course of hospitalization, thereby improving patient satisfaction with medical services and reducing doctor-patient conflicts caused by information asymmetry.
Taking kidney disease as an example, chronic kidney disease is a group of common chronic diseases arising from kidney damage caused by various primary kidney diseases, diabetes, hypertension and the like. China's kidney disease healthcare system urgently needs an intelligent clinical decision support system to improve medical efficiency and patient prognosis.
Existing predictions of a patient's length of stay generally rely on the clinician's work experience. Because patients' conditions are complex and a doctor's experience is highly subjective, such prediction is difficult, inefficient and inaccurate, and it cannot effectively assist doctors' clinical decision-making or improve medical efficiency.
Because the length of stay in the real world fluctuates under the influence of human factors, a model that predicts the numerical length of stay accurate to the day often has a large error. Converting length-of-stay prediction from a numerical prediction problem into an ordered multi-classification problem makes the differences in patient characteristics between classification groups more distinct, which can improve the prediction accuracy of the model, and the classification results still provide enough information for clinical decision support and patient consultation. At present, ordered multi-classification problems are generally solved with numerical prediction models or unordered multi-class prediction models: numerical prediction models assume that the categories of the outcome variable follow a proportional relationship, whereas in real-world ordered multi-class data the categories often do not follow a strict proportional relationship; unordered multi-class prediction models ignore the progressive relationship between the categories of the ordered outcome variable altogether, so the performance of the prediction model is often limited. At the same time, when the data are imbalanced across the categories of the ordered multi-class outcome variable, unordered multi-class prediction models produce large prediction errors.
Summary of the Invention
The purpose of the present application is to provide a method, device, electronic device and storage medium for predicting the length of a patient's hospital stay. In order to provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. This summary is not intended to be an extensive review, nor is it intended to identify key/critical elements or delineate the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
According to one aspect of the embodiments of the present application, a method for predicting the length of hospitalization of a patient is provided, including:
constructing an ordered multi-classification prediction model by cascading a plurality of binary classification base learners in series;
training each of the base learners using a training data set until each base learner meets the performance index requirements, to obtain a trained prediction model;
selecting, according to preset prediction features, a sample to be predicted and inputting it into the trained prediction model to obtain a prediction result.
Further, before the training of each of the base learners using the training data set, the prediction method further includes:
performing data cleaning based on the electronic medical record data of patients in the hospital information management system, and extracting training data to form the training data set.
Further, before the selecting of the sample to be predicted according to the preset prediction features and inputting it into the trained prediction model, the prediction method further includes:
screening out, from the electronic medical record data of the hospital information management system or from the training data set, predictive features with high predictive value for the patient's length of stay;
supplementing and adjusting the screened predictive features in combination with expert knowledge to obtain the preset prediction features.
Further, the data cleaning includes:
removing patient data with an excessively high missing rate, removing abnormal data, and randomly imputing missing data values.
进一步地,所述二分类基学习器为梯度提升决策树算法。Further, the binary classification base learner is a gradient boosting decision tree algorithm.
进一步地,所述利用训练数据集训练各个所述二分类基学习器直至每一所述二分类基学习器达到性能指标要求,包括:Further, the use of the training data set to train each of the two-class base learners until each of the two-class base learners meets performance index requirements, including:
S1. Input the training data set into the prediction model and set the initial value m=1; a single training sample has the input format (x, y), where y is the outcome variable with M ordered categories, x is the set of prediction features of the training sample, and M is the number of classification categories of the prediction model;
S2. Determine whether m<M; if so, go to step S3; if not, skip to step S7;
S3. Extract the data with y ≥ the m-th category as the training data subset of the m-th base learner;
S4. Mark the data with y = the m-th category in the training data subset with a first training label, and mark the data with y > the m-th category in the training data subset with a second training label;
S5. Based on the training data subset and training labels obtained in the above steps, train the binary-classification base learner to obtain the m-th base learner;
S6. Increment m by 1 and return to step S2;
S7. Output the M-1 trained base learners.
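The data decomposition in steps S3 and S4 can be illustrated with a minimal Python sketch. It assumes the ordered categories are coded as the integers 1..M and uses 0 and 1 as the first and second training labels; the toy samples and the function name `layer_subset` are hypothetical and serve only to show how the subset shrinks from layer to layer.

```python
from collections import Counter

def layer_subset(samples, m):
    """Steps S3-S4 for layer m: keep samples with y >= m,
    label y == m as 0 and y > m as 1."""
    subset = [(x, y) for x, y in samples if y >= m]
    features = [x for x, _ in subset]
    labels = [0 if y == m else 1 for _, y in subset]
    return features, labels

# Toy data (invented): M = 4 ordered categories coded 1..4,
# x is a placeholder one-element feature list.
M = 4
samples = [([i], y) for i, y in enumerate([1, 1, 1, 2, 2, 3, 4, 4])]

layer_counts = {}
for m in range(1, M):  # S2: loop while m < M, giving M - 1 layers
    _, labels = layer_subset(samples, m)
    layer_counts[m] = Counter(labels)
    print(m, dict(layer_counts[m]))
```

Each layer only sees the samples that have not been assigned to a lower category, so the two labels at each layer are compared within a progressively smaller subset.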
Further, hyperparameter optimization of each base learner is performed by random hyperparameter search combined with five-fold cross-validation, with the F1 score used as the reference index of model prediction performance during hyperparameter tuning.
Further, the prediction method further includes:
periodically updating the prediction model in step with updates to the electronic medical record data in the hospital information management system.
According to another aspect of the embodiments of the present application, an apparatus for predicting a patient's length of hospital stay is provided, including:
a construction module, configured to construct an ordered multi-class prediction model by cascading multiple binary-classification base learners in series;
a training module, configured to train each of the base learners with a training data set until every base learner meets the performance index requirement, to obtain a trained prediction model;
a prediction module, configured to select, according to preset prediction features, a sample to be predicted and input it into the trained prediction model to obtain a prediction result.
According to another aspect of the embodiments of the present application, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the program to implement the above method for predicting a patient's length of hospital stay.
According to another aspect of the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored, the program being executed by a processor to implement the above method for predicting a patient's length of hospital stay.
The technical solution provided by one aspect of the embodiments of the present application may have the following beneficial effects:
The method for predicting a patient's length of stay provided by the embodiments of the present application constructs an ordered multi-class prediction model by cascading multiple binary-classification base learners in series, splitting the ordered multi-class prediction task into several binary classification tasks that proceed layer by layer, with one base learner per layer. The information of a sample to be predicted is fed layer by layer into the trained base learners to obtain its predicted category. This preserves the ordinal progression among the categories of the ordered multi-class outcome variable without assuming a proportional relationship between the ordered categories, which is more consistent with the characteristics of real data. Splitting the data set layer by layer makes the two classes in the data used to train each layer's base learner relatively balanced, which can effectively address data imbalance across multiple categories and improve the accuracy of the prediction results.
Other features and advantages of the present application will be set forth in the description that follows, and in part will be apparent from the description, or may be inferred or unambiguously determined from the description, or may be learned by practicing the embodiments of the present application. The objectives and other advantages of the present application may be realized and attained by the structure particularly pointed out in the written description, the claims, and the accompanying drawings.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows a flowchart of a method for predicting a patient's length of hospital stay according to an embodiment of the present application;
FIG. 2 shows a flowchart of the training process of the base learners in an embodiment of the present application;
FIG. 3 shows a flowchart of selecting a sample to be predicted and inputting it into the trained prediction model to obtain a prediction result in an embodiment of the present application;
FIG. 4 shows a structural block diagram of an apparatus for predicting a patient's length of hospital stay provided by another embodiment of the present application;
FIG. 5 shows a structural block diagram of an electronic device provided by another embodiment of the present application;
FIG. 6 shows a structural block diagram of an intelligent system for predicting the length of hospital stay of kidney disease patients provided by another embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It should also be understood that terms such as those defined in general dictionaries are to be interpreted as having meanings consistent with their meanings in the context of the prior art and, unless specifically defined as herein, are not to be interpreted in an idealized or overly formal sense.
As shown in FIG. 1, an embodiment of the present application provides a method for predicting a patient's length of hospital stay, including the following steps:
S1. Collect valid modeling data.
This embodiment takes patients with kidney disease as an example. Those skilled in the art will understand that the method of this embodiment is not limited to patients with kidney disease and can also be used to predict the length of stay of patients with other diseases. Based on the electronic medical record data of kidney disease patients in the hospital information management system, valid modeling data is extracted after data cleaning. The modeling data is the training data used to train the base learners.
Electronic medical record data is collected from the hospital information management system, and patients with kidney disease are screened out based on the diagnostic criteria for chronic kidney disease given in the internationally used KDIGO clinical practice guidelines for kidney disease. Patient data and feature indicators whose missing rate exceeds 30%, as well as abnormal data values, are deleted and excluded from the final model construction. Missing values are filled using a random imputation algorithm, which allows the imputed data to retain the distribution characteristics of the real data. Valid modeling data is thereby extracted and used to form the modeling database.
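The cleaning rules described above might be sketched as follows. The text does not specify the random imputation algorithm, so drawing each missing value from the observed values of the same feature is an assumption; the record layout, feature names, and sample values are hypothetical, and outlier removal is omitted because the text gives no criterion for it.

```python
import random

MISSING_THRESHOLD = 0.30  # from the text: drop data with a missing rate above 30%

def missing_rate(values):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def clean(records, feature_names, rng):
    # 1) Drop patients whose share of missing features exceeds the threshold.
    kept = [r for r in records
            if missing_rate([r[f] for f in feature_names]) <= MISSING_THRESHOLD]
    # 2) Drop features whose missing rate among the kept patients exceeds it.
    features = [f for f in feature_names
                if missing_rate([r[f] for r in kept]) <= MISSING_THRESHOLD]
    # 3) Random imputation (assumed variant): fill each gap with a value drawn
    #    at random from the observed values of the same feature.
    cleaned = [{f: r[f] for f in features} for r in kept]
    for f in features:
        observed = [r[f] for r in cleaned if r[f] is not None]
        for r in cleaned:
            if r[f] is None:
                r[f] = rng.choice(observed)
    return cleaned, features

# Hypothetical records; None marks a missing value.
names = ["age", "creatinine", "calcium", "glucose"]
records = [
    {"age": 60, "creatinine": 1.2, "calcium": 2.3, "glucose": 5.1},
    {"age": 71, "creatinine": None, "calcium": 2.1, "glucose": 6.0},
    {"age": None, "creatinine": None, "calcium": 2.2, "glucose": None},
    {"age": 55, "creatinine": 0.9, "calcium": 2.4, "glucose": 5.6},
    {"age": 48, "creatinine": 1.1, "calcium": 2.0, "glucose": 7.2},
]
cleaned, kept_features = clean(records, names, random.Random(0))
```

Sampling from observed values keeps imputed entries within the empirical distribution of each feature, which matches the stated goal of preserving the distribution of the real data.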
S2. Screen prediction features.
Expert knowledge is combined with a feature screening algorithm to select, from the modeling database, a certain number of prediction features that have high predictive value and are easy to collect in clinical practice; these form the feature subset used for modeling.
A prediction feature set is extracted from the electronic medical record data in the hospital information management system. The prediction feature set includes demographic features, kidney disease features, medical visit features, general disease features, and laboratory test features.
1) Demographic features include parameters such as age, sex, marital status, occupation, education level, and type of medical insurance;
2) Kidney disease features include parameters such as chronic kidney disease stage, primary cause of the kidney disease, and years since diagnosis of the kidney disease;
3) Medical visit features include parameters such as type of medical institution, number of hospitalizations, admission status, admission route, and admitting department;
4) General disease features include parameters such as cause of admission and the presence of comorbidities (diabetes, hypertension, tumor, chronic obstructive pulmonary disease, pulmonary infection, cardiovascular disease, cerebrovascular disease, chronic liver disease);
5) Laboratory test features include parameters such as routine blood test, routine urine test, urine protein/creatinine, serum creatinine, blood glucose, blood lipids, electrolytes, serum calcium, serum phosphorus, and intact parathyroid hormone.
A recursive feature elimination algorithm is first used to screen out a subset of a certain number of prediction features with high predictive value for the length of stay of kidney disease patients; the screened subset is then supplemented and adjusted in combination with expert knowledge. Combining expert knowledge with a feature screening algorithm helps ensure both the accuracy of the selected features and their feasibility in clinical practice. Feature screening also reduces the complexity of the prediction model, which makes it easier to use in clinical practice.
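As a rough illustration of this two-stage selection, the sketch below implements a generic recursive feature elimination loop followed by a manual expert adjustment. The callable `importance_fn` is a hypothetical stand-in for refitting a model on the remaining features and reading its feature importances; the importance values and feature names are invented purely for the example.

```python
def recursive_feature_elimination(features, importance_fn, n_keep):
    """Repeatedly re-score the remaining features and drop the weakest one
    until only n_keep features are left."""
    remaining = list(features)
    while len(remaining) > n_keep:
        scores = importance_fn(remaining)  # re-evaluated on the current subset
        weakest = min(remaining, key=lambda f: scores[f])
        remaining.remove(weakest)
    return remaining

def apply_expert_adjustments(selected, add=(), drop=()):
    """Supplement and adjust the algorithmic selection with expert choices."""
    adjusted = [f for f in selected if f not in drop]
    adjusted += [f for f in add if f not in adjusted]
    return adjusted

# Invented importances standing in for a refitted model's output.
TOY_IMPORTANCE = {
    "age": 0.30, "ckd_stage": 0.25, "serum_creatinine": 0.20,
    "admission_route": 0.10, "marital_status": 0.03, "occupation": 0.02,
}

def toy_importance(remaining):
    return {f: TOY_IMPORTANCE[f] for f in remaining}

selected = recursive_feature_elimination(list(TOY_IMPORTANCE), toy_importance, n_keep=3)
preset_features = apply_expert_adjustments(selected, add=["diabetes"])
print(preset_features)
```

In a real pipeline the importance function would retrain the model at every round, which is what distinguishes recursive elimination from a single ranking pass.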
S3. Construct the prediction model.
An ordered multi-class prediction model is constructed by cascading multiple binary-classification base learners in series.
Specifically, the length of stay of kidney disease patients is divided into M categories in ascending order. The prediction feature subset screened in step S2 is used as the input of the prediction model, and the cascaded layer-by-layer modeling algorithm is applied with the gradient boosting decision tree algorithm as the base learner to construct the length-of-stay prediction model for kidney disease patients. Hyperparameter optimization of each base learner uses random hyperparameter search combined with five-fold cross-validation, with the F1 score as the reference index of model prediction performance during hyperparameter tuning.
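Dividing the length of stay into M ordered categories can be sketched as a simple binning step. The text does not give the category boundaries, so the cut points below (7, 14, and 28 days, giving M = 4) are purely hypothetical.

```python
from bisect import bisect_left

# Hypothetical upper bounds in days for categories 1..3; anything longer
# falls into category 4, so M = len(CUT_POINTS) + 1.
CUT_POINTS = [7, 14, 28]

def los_category(days, cut_points=CUT_POINTS):
    """Map a length of stay in days to an ordered category 1..M."""
    return bisect_left(cut_points, days) + 1

for days in (3, 7, 8, 14, 15, 28, 29):
    print(days, los_category(days))
```

With these assumed boundaries, stays of up to 7 days fall in category 1, 8 to 14 days in category 2, 15 to 28 days in category 3, and longer stays in category 4.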
S4. Train each of the base learners with the training data set until every base learner meets the performance index requirement, to obtain a trained prediction model.
The cascaded layer-by-layer modeling algorithm of this embodiment uses a multi-level ensemble architecture composed of multiple binary-classification base learners cascaded in series. One base learner is trained at each layer, so a prediction model for M ordered categories contains M-1 base learners, where M is the number of classification categories of the prediction model.
The M categories of the outcome variable are arranged in ascending order. For the m-th base learner (m=1, 2, …, M-1), the training data subset consists of the data with y ≥ the m-th category.
Given a training data set D, a single training sample has the input format (x, y). Here y is the outcome variable with M ordered categories, arranged in ascending order so that category 1 < category 2 < … < category m < … < category M, and x is the set of prediction features of the training sample.
As shown in FIG. 2, in some embodiments, the training process of the base learners includes the following steps:
S11. Input the training data set D and set the initial value m=1;
S12. Determine whether m<M: if so, go to step S13; if not, skip to step S17;
S13. Extract the training data subset: extract the data with y ≥ the m-th category as the training subset of the m-th base learner;
S14. Assign the data labels: in the extracted training data subset, set the training label of the data with y = the m-th category to 0 and the training label of the data with y > the m-th category to 1;
S15. Train the base learner: based on the training data subset and data labels obtained in the above steps, train the preset binary-classification base learner to obtain the m-th base learner;
S16. Increment m by 1 (that is, m = m + 1, or m++) and return to step S12;
S17. Output the M-1 trained base learners.
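Steps S11 to S17 can be sketched in Python as a loop over layers. To keep the example free of third-party dependencies, it uses a trivial threshold stump as a stand-in for the gradient boosting decision tree base learner named in the text; `ThresholdStump`, `train_cascade`, and the toy data are hypothetical, and the check that each learner "meets the performance index requirement" is not modeled.

```python
class ThresholdStump:
    """Stand-in binary learner (NOT the GBDT of the text): a single threshold
    on feature 0, placed midway between the two class means.
    Assumes both labels are present in the training subset."""

    def fit(self, X, y):
        zeros = [x[0] for x, label in zip(X, y) if label == 0]
        ones = [x[0] for x, label in zip(X, y) if label == 1]
        self.threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [1 if x[0] > self.threshold else 0 for x in X]

def train_cascade(X, y, M, make_learner):
    """Train M - 1 binary base learners, one per layer."""
    learners = []
    m = 1                                                     # S11
    while m < M:                                              # S12
        idx = [i for i, label in enumerate(y) if label >= m]  # S13
        X_m = [X[i] for i in idx]
        y_m = [0 if y[i] == m else 1 for i in idx]            # S14
        learners.append(make_learner().fit(X_m, y_m))         # S15
        m += 1                                                # S16
    return learners                                           # S17

# Toy data (invented): one feature, categories coded 1..4.
X = [[2], [3], [9], [10], [16], [18], [30], [35]]
y = [1, 1, 2, 2, 3, 3, 4, 4]
M = 4
learners = train_cascade(X, y, M, ThresholdStump)
print([round(l.threshold, 3) for l in learners])
```

Any object exposing `fit` and `predict` could be passed as `make_learner`, so a gradient boosting classifier from an external library could be substituted without changing the loop.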
Hyperparameter optimization of each base learner is performed by random hyperparameter search combined with five-fold cross-validation, with the F1 score used as the reference index of model prediction performance during hyperparameter tuning.
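A minimal, dependency-free sketch of random search with five-fold cross-validation scored by F1 is given below. The stand-in learner is a one-dimensional k-nearest-neighbour vote rather than the gradient boosting decision tree of the text, the search space contains a single hypothetical hyperparameter `k`, and the folds are not stratified; all of these are assumptions made for illustration.

```python
import random

def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def five_fold_indices(n, rng, k=5):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_f1(X, y, params, fit_predict, folds):
    """Mean F1 over the folds, holding out one fold at a time."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in held_out], **params)
        scores.append(f1_score([y[i] for i in held_out], preds))
    return sum(scores) / len(scores)

def random_search(X, y, space, fit_predict, n_iter, rng):
    """Sample n_iter random configurations and keep the best mean F1."""
    folds = five_fold_indices(len(y), rng)
    best_params, best_score = None, -1.0
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = cross_val_f1(X, y, params, fit_predict, folds)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def knn_fit_predict(X_train, y_train, X_test, k):
    """Stand-in learner: majority vote of the k nearest points on feature 0."""
    preds = []
    for x in X_test:
        nearest = sorted(range(len(y_train)),
                         key=lambda i: abs(X_train[i][0] - x[0]))[:k]
        votes = sum(y_train[i] for i in nearest)
        preds.append(1 if 2 * votes > k else 0)
    return preds

# Toy binary data (invented) for one layer of the cascade.
X = [[i] for i in range(20)]
y = [0] * 10 + [1] * 10
best_params, best_score = random_search(
    X, y, {"k": [1, 3, 5]}, knn_fit_predict, n_iter=5, rng=random.Random(0))
print(best_params, best_score)
```

The same folds are reused for every sampled configuration so that scores are compared on identical splits.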
S5. According to the preset prediction features, select a sample to be predicted and input it into the trained prediction model to obtain a prediction result.
The sample to be predicted is input into the prediction model to obtain a prediction result. In some embodiments, this step further includes displaying the prediction result visually.
The information of a newly admitted patient is input into the length-of-stay prediction model to obtain a prediction result. The prediction result and diagnosis and treatment suggestions are displayed visually, and, based on the SHAP algorithm, a visualization is given of how the patient's prediction features affect the patient's length of stay.
The information of the new sample to be predicted is fed layer by layer into the trained base learners until its predicted category is obtained and output.
In some embodiments, as shown in FIG. 3, step S5 specifically includes:
S51. Input the information of the sample to be predicted and set the initial value m=1;
S52. Determine whether m is less than M: if so, input the sample information into the trained m-th base learner to obtain an output of 0 or 1;
S53. If the output is 0, the final predicted category of the sample is the m-th category, and the process skips to step S55; if the output is 1, m is incremented by 1 (that is, m = m + 1), and the process goes to step S54;
S54. Determine whether m is equal to M: if so, the final predicted category of the sample is the M-th category, and the process skips to step S55; if not, the process returns to step S52;
S55. Output the final predicted category of the sample.
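The layer-by-layer prediction in steps S51 to S55 can be sketched as follows. The `FixedThreshold` class is a hypothetical stand-in for a trained base learner, with invented thresholds, so that the example runs on its own.

```python
class FixedThreshold:
    """Hypothetical stand-in for a trained binary base learner:
    outputs 1 if feature 0 exceeds a fixed threshold, else 0."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        return [1 if x[0] > self.threshold else 0 for x in X]

def predict_cascade(x, learners, M):
    """Return the predicted ordered category (1..M) for one sample."""
    m = 1                                          # S51
    while m < M:                                   # S52
        output = learners[m - 1].predict([x])[0]
        if output == 0:                            # S53: stop at category m
            return m                               # S55
        m += 1                                     # S53: move to the next layer
    return M                                       # S54: all layers output 1

# Three stand-in learners for M = 4 categories (thresholds are invented).
M = 4
learners = [FixedThreshold(7), FixedThreshold(14), FixedThreshold(28)]
for value in (5, 10, 20, 40):
    print(value, predict_cascade([value], learners, M))
```

A sample stops at the first layer whose learner outputs 0; only samples that pass every layer are assigned to the highest category M.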
S6. Automatically update the prediction model.
The length-of-stay prediction model is updated periodically, in step with updates to the data collected by the hospital electronic medical record data management system.
Specifically, at the end of each year the modeling data is refreshed with the system data from the most recent three years, a new length-of-stay prediction model is constructed according to the method described in step S3, and the updated model replaces the previous prediction model. This achieves the periodic, synchronized update of the length-of-stay prediction model.
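The year-end refresh can be sketched as a rolling three-year window over the records. Interpreting "the most recent three years" as three full calendar years ending on the refresh date is an assumption, and the `record_date` field and the `build_model` callable are hypothetical placeholders.

```python
from datetime import date

def modeling_window(records, year_end, years=3):
    """Keep records dated within the last `years` calendar years ending at year_end."""
    start = date(year_end.year - years + 1, 1, 1)
    return [r for r in records if start <= r["record_date"] <= year_end]

def refresh_model(records, year_end, build_model):
    """Rebuild the model on the rolling window; the caller swaps it in for the old one."""
    return build_model(modeling_window(records, year_end))

# Hypothetical records and a placeholder model builder.
records = [
    {"record_date": date(2018, 6, 1)},
    {"record_date": date(2019, 1, 1)},
    {"record_date": date(2020, 5, 5)},
    {"record_date": date(2021, 12, 31)},
]
new_model = refresh_model(
    records, date(2021, 12, 31),
    build_model=lambda data: {"n_training_records": len(data)})
print(new_model)
```

In practice `build_model` would run the cleaning, feature selection, and cascade training steps on the windowed data.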
The method for predicting a patient's length of stay in the embodiments of the present application is based on a cascaded layer-by-layer modeling algorithm for ordered multi-class prediction. It uses a multi-level ensemble architecture formed by cascading multiple base learners in series, and is suited to prediction problems with ordered multiple categories where the categories do not follow a proportional relationship or where the data is imbalanced across categories. The method splits the ordered multi-class prediction task into several binary classification tasks that proceed layer by layer, with one base learner trained at each layer; the information of a new sample to be predicted is fed layer by layer into the trained base learners until its predicted category is obtained and output. The cascaded layer-by-layer modeling algorithm preserves the ordinal progression among the categories of the ordered multi-class outcome variable and does not assume a proportional relationship between the ordered categories, which is more consistent with the characteristics of real data. In addition, splitting the data set layer by layer makes the two classes in the data used to train each layer's base learner relatively balanced, which can effectively address data imbalance across multiple categories.
As shown in FIG. 4, another embodiment of the present application provides an apparatus for predicting a patient's length of hospital stay, including:
a construction module 30, configured to construct an ordered multi-class prediction model by cascading multiple binary-classification base learners in series;
a training module 40, configured to train each of the base learners with a training data set until every base learner meets the performance index requirement, to obtain a trained prediction model;
a prediction module 50, configured to select, according to preset prediction features, a sample to be predicted and input it into the trained prediction model to obtain a prediction result.
In some embodiments, the prediction apparatus further includes a data extraction module 10, configured to, before each base learner is trained with the training data set, clean the electronic medical record data of patients in the hospital information management system and extract training data to form the training data set.
In some embodiments, the prediction apparatus further includes a prediction feature acquisition module 20, configured to perform the following before the sample to be predicted is selected according to the preset prediction features and input into the trained prediction model:
screening, from the electronic medical record data of the hospital information management system or from the training data set, prediction features with high predictive value for the patient's length of stay;
supplementing and adjusting the screened prediction features in combination with expert knowledge to obtain the preset prediction features.
In some embodiments, the data extraction module 10 includes a cleaning unit for data cleaning, and the cleaning unit is specifically configured to:
remove patient data with an excessively high missing rate, remove abnormal data, and fill missing values by random imputation.
The binary-classification base learner is a gradient boosting decision tree algorithm.
In some embodiments, the training module 40 is specifically configured to perform the following steps:
S11. Input the training data set into the prediction model and set the initial value m=1; a single training sample has the input format (x, y), where y is the outcome variable with M ordered categories, x is the set of prediction features of the training sample, and M is the number of classification categories of the prediction model;
S12. Determine whether m<M; if so, go to step S13; if not, skip to step S17;
S13. Extract the data with y ≥ the m-th category as the training data subset of the m-th base learner;
S14. Mark the data with y = the m-th category in the training data subset with a first training label, and mark the data with y > the m-th category in the training data subset with a second training label;
S15. Based on the training data subset and training labels obtained in the above steps, train the binary-classification base learner to obtain the m-th base learner;
S16. Increment m by 1 and return to step S12;
S17. Output the M-1 trained base learners.
In some embodiments, the training module 40 is further configured to perform hyperparameter optimization of each base learner by random hyperparameter search combined with five-fold cross-validation, with the F1 score used as the reference index of model prediction performance during hyperparameter tuning.
In some embodiments, the prediction apparatus further includes an update module 60, configured to periodically update the prediction model in step with updates to the electronic medical record data in the hospital information management system.
Another embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the program to implement the above method for predicting a patient's length of hospital stay. As shown in FIG. 5, in some embodiments, the electronic device 70 may include a processor 700, a memory 701, a bus 702, and a communication interface 703, where the processor 700, the communication interface 703, and the memory 701 are connected through the bus 702. The memory 701 stores a computer program executable on the processor 700, and when the processor 700 runs the computer program, it executes the method for predicting a patient's length of hospital stay provided by any of the foregoing embodiments of the present application.
Another embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the program being executed by a processor to implement the above method for predicting a patient's length of hospital stay.
As shown in FIG. 6, another embodiment of the present application provides an intelligent system for predicting the length of hospital stay of kidney disease patients, including:
an input module, at least configured to input the information of a newly admitted kidney disease patient;
a prediction module, at least configured to predict the length of stay from the newly admitted patient's data using the kidney disease length-of-stay prediction model constructed and trained by the foregoing method;
a display module, at least configured to display the visualized prediction result.
Compared with the prior art, the method for predicting a patient's length of stay in the embodiments of the present application can achieve the following beneficial effects. The cascaded layer-by-layer modeling algorithm for ordered multi-class prediction preserves the ordinal progression among the categories of the ordered multi-class outcome variable and does not assume a proportional relationship between the ordered categories, which is more consistent with the characteristics of real data. Splitting the data set layer by layer makes the two classes in the data used to train each layer's base learner relatively balanced, which can effectively address data imbalance across multiple categories. In addition, the present disclosure applies the cascaded layer-by-layer modeling algorithm to mine the patient data collected by the hospital electronic medical record data management system and, with the gradient boosting decision tree algorithm as the base learner, constructs a patient-oriented length-of-stay prediction model and system. The system provides a visual display of the prediction results for newly admitted patients and updates the length-of-stay prediction model in step with data updates in the hospital electronic medical record data management system. This addresses the shortcoming of existing length-of-stay prediction, which relies on subjective estimates based on clinicians' experience, and effectively improves the efficiency and accuracy of length-of-stay prediction, thereby supporting clinical decision-making and the allocation of medical resources and improving patients' in-hospital prognosis and satisfaction with care.
The method, apparatus, electronic device, and computer-readable storage medium provided by the embodiments of the present application are not limited to predicting the length of hospital stay of patients with kidney disease; they can also be widely used to predict the length of hospital stay of patients with other diseases.
It should be noted that:
The term "module" is not intended to be limited to a particular physical form. Depending on the specific application, a module may be implemented as hardware, firmware, software, and/or a combination thereof. Furthermore, different modules may share common components or even be implemented by the same components. Clear boundaries between different modules may or may not exist.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual apparatus, or other device. Various general-purpose apparatuses may also be used with the teachings herein. From the description above, the structure required to construct such apparatuses is apparent. Furthermore, the present application is not directed to any particular programming language. It should be understood that the content of the present application described herein can be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the present application.
Similarly, it should be understood that, in order to streamline the present disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the present application, various features of the present application are sometimes grouped together into a single embodiment, figure, or description thereof. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of the present application.
It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the accompanying drawings may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and they are not necessarily executed one after another but may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
The embodiments described above express only implementations of the present application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the patent scope of the present application. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be determined by the appended claims.

Claims (11)

  1. A method for predicting a patient's length of hospital stay, characterized by comprising:
    constructing an ordered multi-class prediction model by cascading a plurality of binary classification base learners in series;
    training each of the base learners with a training data set until every base learner meets a performance index requirement, to obtain a trained prediction model;
    selecting, according to preset prediction features, a sample to be predicted and inputting it into the trained prediction model to obtain a prediction result.
  2. The method according to claim 1, characterized in that, before the training of each of the base learners with the training data set, the prediction method further comprises:
    performing data cleaning on patients' electronic medical record data in a hospital information management system, and extracting training data to form the training data set.
  3. The method according to claim 2, characterized in that, before the selecting, according to the preset prediction features, of the sample to be predicted and inputting it into the trained prediction model, the prediction method further comprises:
    screening, from the electronic medical record data of the hospital information management system or from the training data set, prediction features with high predictive value for a patient's length of hospital stay;
    supplementing and adjusting the screened prediction features in light of expert knowledge to obtain the preset prediction features.
  4. The method according to claim 2, characterized in that the performing of data cleaning comprises:
    removing patient data with an excessively high missing rate, removing abnormal data, and randomly imputing missing data values.
  5. The method according to claim 1, characterized in that the binary classification base learner is a gradient boosting decision tree algorithm.
  6. The method according to claim 1, characterized in that the training of each of the binary classification base learners with the training data set until every binary classification base learner meets the performance index requirement comprises:
    S1. inputting the training data set into the prediction model and setting an initial value m=1, where a single training sample has the input format (x, y), y is an outcome variable with M ordered categories, x represents the set of prediction features of the training sample, and M is the number of classification categories of the prediction model;
    S2. determining whether m<M; if so, proceeding to step S3; if not, jumping to step S7;
    S3. extracting the data with y ≥ the m-th category as the training data subset of the m-th base learner;
    S4. marking, with a first training label, the data in the training data subset with y = the m-th category, and marking, with a second training label, the data in the training data subset with y > the m-th category;
    S5. training the binary classification base learner on the training data subset and training labels obtained in the above steps, to obtain the m-th base learner;
    S6. incrementing m by 1 and returning to step S2;
    S7. outputting the M-1 trained base learners.
  7. The method according to claim 6, characterized in that hyperparameter optimization of each base learner is performed by random hyperparameter search combined with five-fold cross-validation, with the F1 score used as the reference index of model prediction performance for the hyperparameter search.
  8. The method according to claim 1, characterized in that the prediction method further comprises:
    updating the prediction model periodically and synchronously, based on updates to the electronic medical record data in the hospital information management system.
  9. An apparatus for predicting a patient's length of hospital stay, characterized by comprising:
    a construction module, configured to construct an ordered multi-class prediction model by cascading a plurality of binary classification base learners in series;
    a training module, configured to train each of the base learners with a training data set until every base learner meets a performance index requirement, to obtain a trained prediction model;
    a prediction module, configured to select, according to preset prediction features, a sample to be predicted and input it into the trained prediction model to obtain a prediction result.
  10. An electronic device, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method for predicting a patient's length of hospital stay according to any one of claims 1-8.
  11. A computer-readable storage medium having a computer program stored thereon, characterized in that the program is executed by a processor to implement the method for predicting a patient's length of hospital stay according to any one of claims 1-8.
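For readers who want to trace steps S1 to S7 of claim 6 concretely, the following is a minimal sketch rather than the applicants' implementation. Several parts are assumptions made for illustration: the names `ThresholdStump`, `train_cascade`, and `predict_cascade` are hypothetical; the stump is a deliberately trivial stand-in for the gradient boosting decision tree base learner of claim 5; and the decoding rule in `predict_cascade` (stop at the first learner that answers "equal to category m") is one plausible reading, since the claims do not spell out how the M-1 learners are combined at prediction time. The hyperparameter search of claim 7 is omitted.

```python
class ThresholdStump:
    """Trivial stand-in binary learner (NOT a gradient boosting decision tree).

    It thresholds feature 0 at the midpoint between the two class means and
    assumes both classes are present in the training subset.
    """

    def fit(self, X, y):
        lo = [x[0] for x, t in zip(X, y) if t == 0]
        hi = [x[0] for x, t in zip(X, y) if t == 1]
        self.threshold = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2
        return self

    def predict_one(self, x):
        return 1 if x[0] > self.threshold else 0


def train_cascade(X, y, num_classes, make_learner=ThresholdStump):
    """Train M-1 binary learners following steps S1 to S7."""
    learners = []
    m = 1                                                    # S1: initial value
    while m < num_classes:                                   # S2: continue while m < M
        subset = [(x, t) for x, t in zip(X, y) if t >= m]    # S3: keep y >= m
        Xm = [x for x, _ in subset]
        ym = [0 if t == m else 1 for _, t in subset]         # S4: y == m vs y > m
        learners.append(make_learner().fit(Xm, ym))          # S5: m-th base learner
        m += 1                                               # S6: increment, back to S2
    return learners                                          # S7: M-1 learners


def predict_cascade(learners, x):
    """Assumed decoding rule: walk down the cascade and stop at the first
    learner that predicts 'equal to category m'; otherwise return category M."""
    for m, learner in enumerate(learners, start=1):
        if learner.predict_one(x) == 0:
            return m
    return len(learners) + 1


# Toy one-feature data with three ordered categories (illustration only).
X = [[1.0], [2.0], [5.0], [6.0], [9.0], [10.0]]
y = [1, 1, 2, 2, 3, 3]
cascade = train_cascade(X, y, num_classes=3)
print([predict_cascade(cascade, x) for x in ([1.5], [5.5], [9.5])])
```

Because `make_learner` is only a factory argument, any binary classifier wrapped to expose `fit` and `predict_one` could be substituted for the stump, including a gradient boosting implementation.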
PCT/CN2021/099644 2020-10-22 2021-06-11 Patient length of stay prediction method and apparatus, electronic device, and storage medium WO2022083140A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011136028.X 2020-10-22
CN202011136028.XA CN112365943A (en) 2020-10-22 2020-10-22 Method and device for predicting length of stay of patient, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022083140A1 true WO2022083140A1 (en) 2022-04-28

Family

ID=74511555

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/099644 WO2022083140A1 (en) 2020-10-22 2021-06-11 Patient length of stay prediction method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN112365943A (en)
WO (1) WO2022083140A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115831300A (en) * 2022-09-29 2023-03-21 广州金域医学检验中心有限公司 Detection method, device, equipment and medium based on patient information
CN116434893A (en) * 2023-06-12 2023-07-14 中才邦业(杭州)智能技术有限公司 Concrete compressive strength prediction model, construction method, medium and electronic equipment
CN117472789A (en) * 2023-12-28 2024-01-30 成都工业学院 Software defect prediction model construction method and device based on ensemble learning

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365943A (en) * 2020-10-22 2021-02-12 杭州未名信科科技有限公司 Method and device for predicting length of stay of patient, electronic equipment and storage medium
CN113393939A (en) * 2021-04-26 2021-09-14 上海米健信息技术有限公司 Intensive care unit patient hospitalization day prediction method and system
CN113197578B (en) * 2021-05-07 2023-06-09 天津医科大学 Schizophrenia classification method and system based on multi-center model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202883A (en) * 2016-06-28 2016-12-07 成都中医药大学 A kind of method setting up disease cloud atlas based on big data analysis
CN107403198A (en) * 2017-07-31 2017-11-28 广州探迹科技有限公司 A kind of official website recognition methods based on cascade classifier
CN108231146A (en) * 2017-12-01 2018-06-29 华南师范大学 A kind of medical records model building method, system and device based on deep learning
US20190357797A1 (en) * 2018-05-28 2019-11-28 The Governing Council Of The University Of Toronto System and method for generating visual identity and category reconstruction from electroencephalography (eeg) signals
CN112365943A (en) * 2020-10-22 2021-02-12 杭州未名信科科技有限公司 Method and device for predicting length of stay of patient, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103198A (en) * 2017-04-26 2017-08-29 上海联影医疗科技有限公司 Medical data processing method, device and equipment

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115831300A (en) * 2022-09-29 2023-03-21 广州金域医学检验中心有限公司 Detection method, device, equipment and medium based on patient information
CN115831300B (en) * 2022-09-29 2023-12-29 广州金域医学检验中心有限公司 Detection method, device, equipment and medium based on patient information
CN116434893A (en) * 2023-06-12 2023-07-14 中才邦业(杭州)智能技术有限公司 Concrete compressive strength prediction model, construction method, medium and electronic equipment
CN116434893B (en) * 2023-06-12 2023-08-29 中才邦业(杭州)智能技术有限公司 Concrete compressive strength prediction model, construction method, medium and electronic equipment
CN117472789A (en) * 2023-12-28 2024-01-30 成都工业学院 Software defect prediction model construction method and device based on ensemble learning
CN117472789B (en) * 2023-12-28 2024-03-12 成都工业学院 Software defect prediction model construction method and device based on ensemble learning

Also Published As

Publication number Publication date
CN112365943A (en) 2021-02-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 21881562; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 21881562; Country of ref document: EP; Kind code of ref document: A1