WO2021114635A1

WO2021114635A1 - Patient grouping model constructing method, patient grouping method, and related device

Info

Publication number: WO2021114635A1
Application number: PCT/CN2020/099530
Authority: WO
Inventors: 徐卓扬; 孙行智; 赵惟; 左磊; 胡岗
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-05-13
Filing date: 2020-06-30
Publication date: 2021-06-17
Also published as: CN111696661B; CN111696661A

Abstract

A patient grouping model constructing method, a patient grouping method, and a related device. The patient grouping model constructing method comprises: acquiring a disease prevention and control guide, performing keyword recognition on the guide to produce a division attribute set for each disease in a combined disease, calculating the information gain rate of each division attribute in the set to generate a first knowledge grouping decision tree for each disease, and producing, on the basis of the first knowledge grouping decision trees, n first candidate combined grouping solutions for patients suffering from the combined disease (S21); acquiring n pieces of sample data of the patients suffering from the combined disease, and generating an outcome label for each piece of sample data on the basis of the indicators in that piece of sample data (S22); and using the sample data with outcome labels and the first candidate combined grouping solutions to train a LambdaMART model to produce a constructed patient grouping model (S23). The resulting patient grouping model helps improve the grouping of patients suffering from multiple diseases.

Description

Method for constructing patient grouping model, method for grouping patient and related equipment

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 13, 2020, with application number 202010404637.2 and the invention title "Patient Grouping Model Construction Method, Patient Grouping Method and Related Equipment", the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the field of machine learning technology, and in particular to a method for constructing a patient clustering model, a patient clustering method, and related equipment.

Background technique

The development of artificial intelligence is inseparable from the advancement of machine learning. As the core of artificial intelligence, machine learning studies how computers simulate or realize human learning behaviors in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. In the medical field, machine learning has been widely used in patient grouping, and patient grouping is extremely important in precision medicine. Current patient grouping methods give either a single grouping result for a patient or several different grouping results. The inventor realizes that these grouping results are obtained by grouping patients for one disease only. When patients with multiple diseases need to be grouped jointly, the existing grouping methods are not effective.

Summary of the invention

In order to solve the above-mentioned problems, the present application provides a method for constructing a patient grouping model, a method for grouping patients, and related equipment, which are beneficial to improve the grouping effect of comprehensive grouping of patients with multiple diseases.

In the first aspect, an embodiment of the present application provides a method for constructing a patient grouping model, the method including:

Obtain a preset disease prevention and control guide, perform keyword recognition on the guide to obtain the partition attribute set of each disease in the combined disease, calculate the information gain rate of each partition attribute in the set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;

Obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the sample data and the first candidate joint grouping schemes are in one-to-one correspondence, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

The lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.

In the second aspect, an embodiment of the present application provides a method for grouping patients, which includes:

Receiving a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

Acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining a second candidate joint grouping plan of the patient to be grouped according to the second knowledge grouping decision tree;

Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;

According to the ranking result of the second candidate joint grouping scheme, a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
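The last two steps of the second aspect (sorting the candidates and returning a preset number of them) can be sketched as follows. This is only a minimal illustration: `select_top_schemes`, `score_fn`, and the values in `mock_scores` are hypothetical names and numbers standing in for the trained patient grouping model, which the application does not specify at this level of detail.

```python
def select_top_schemes(candidates, score_fn, k):
    """Return the k candidate joint grouping schemes with the highest scores.

    `score_fn` is a stand-in (assumption) for the trained ranking model:
    it maps one candidate scheme to a predicted score.
    """
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:k]


# Hypothetical scores, used here only in place of real model predictions.
mock_scores = {"A1+B1": 0.42, "A1+B2": 0.91, "A2+B1": 0.17, "A2+B2": 0.66}

# Keep the two best-scoring schemes as the grouping result for the patient.
top = select_top_schemes(list(mock_scores), mock_scores.get, k=2)
```

Here `k` plays the role of the preset number of second candidate joint grouping schemes returned to the user terminal.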

In a third aspect, an embodiment of the present application provides an apparatus for constructing a patient grouping model, the apparatus including:

The first grouping scheme acquisition module is used to acquire a preset disease prevention and control guide, perform keyword recognition on the guide to obtain the partition attribute set of each disease in the combined disease, calculate the information gain rate of each partition attribute in the set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;

The outcome label generation module is used to obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the sample data and the first candidate joint grouping schemes are in one-to-one correspondence, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

The clustering model training module is used to train the lambdaMART model by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.

In a fourth aspect, an embodiment of the present application provides a patient grouping device, which includes:

The grouping request acquisition module is configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

The second grouping scheme acquisition module is used to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and to obtain a second candidate joint grouping scheme for the patient to be grouped according to the second knowledge grouping decision tree;

The clustering scheme ranking module is configured to input the second candidate joint clustering plan into a pre-trained patient clustering model for ranking, and obtain a ranking result of the second candidate joint clustering plan;

The grouping result output module is configured to select a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped and return to the user terminal according to the sorting result of the second candidate joint grouping scheme.

In a fifth aspect, an embodiment of the present application provides an electronic device that includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a computer-readable storage medium. The computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by the processor to execute the following steps:

In a sixth aspect, an embodiment of the present application provides an electronic device that includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a computer-readable storage medium. The computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by the processor to execute the following steps:

In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium that stores one or more instructions, and the one or more instructions are suitable for being loaded by a processor to execute the following steps:

In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium that stores one or more instructions, and the one or more instructions are suitable for being loaded by a processor to execute the following steps:

In the embodiments of this application, the training stage of the patient grouping model no longer considers the grouping scheme of a single disease. Instead, it works out joint grouping schemes across multiple diseases, taking into account the interactions between different grouping decisions. The outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to a certain extent eliminates the problem that biased samples are difficult to learn when only the absolute outcome is used. Moreover, the lambdaMART model is used for training, so the resulting patient grouping model pays attention not only to each first candidate joint grouping scheme itself but also to the priority order among the first candidate joint grouping schemes, which improves the grouping of patients with multiple diseases.

Description of the drawings

In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

FIG. 1 is a diagram of a network system architecture provided by an embodiment of this application;

FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application;

FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application;

FIG. 4 is an example diagram of constructing a patient grouping model provided by an embodiment of the application;

FIG. 5 is a schematic flowchart of a method for grouping patients according to an embodiment of the application;

FIG. 6 is an example diagram of a patient grouping provided by an embodiment of the application;

FIG. 7 is a schematic structural diagram of an apparatus for constructing a patient grouping model provided by an embodiment of the application;

FIG. 8 is a schematic structural diagram of a patient grouping device provided by an embodiment of the application;

FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application;

FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.

Detailed description

The embodiment of the application provides a solution for constructing a patient grouping model suitable for patients with multiple diseases. In the model training stage, the knowledge grouping decision tree of each disease in the combined disease is used to obtain candidate joint grouping schemes for patients suffering from the combined disease, which fully considers the interactions between the grouping schemes of the individual diseases. The patients' follow-up data is used as sample data, and an outcome label is generated for each piece of sample data according to the importance of indicators in the sample data such as demographic information, medication history, laboratory examinations, and vital signs. Compared with the prior art, which considers only the absolute outcome and therefore learns poorly, this application also considers the relative outcome, which is more objective and reasonable. In addition, the patient grouping model is based on the lambdaMART model, which pays more attention to the order of the top-ranked candidate joint grouping schemes during learning, so that the trained patient grouping model yields better grouping results in multi-disease patient grouping scenarios and is better suited to precision medicine.

Specifically, the patient grouping model construction solution can be implemented based on the network system architecture shown in FIG. 1. As shown in FIG. 1, the network system architecture includes at least a user terminal, a server, and a database, which communicate over a wired or wireless network; the specific communication protocol is not limited. The user terminal can be used to submit disease prevention and control guidelines, follow-up data of patients with combined diseases, etc. to the server through program code or touch signals, so as to request the server to execute the steps of constructing the patient grouping model. The server carries out the series of construction steps, such as deriving the knowledge grouping decision trees, generating the outcome labels, and calculating the lambda values, and, based on the lambdaMART model, uses the sample data with outcome labels and the candidate joint grouping schemes to train the patient grouping model. The database can be used to store disease prevention and control guidelines and a large amount of patients' demographic information, medical treatment data, follow-up data, etc. Developers can enter conditional query statements at the user terminal to extract the required information from the database, for example extracting the follow-up data of patients with hypertension and diabetes as sample data. The database can be a database in the server, a database independent of the server, or a cloud database. It is understandable that the user terminal in this application can be a desktop computer, a tablet computer, a supercomputer, etc., and the server can be a local server, a cloud server, a server cluster, and so on.

Based on the network system architecture shown in FIG. 1, the method for constructing a patient grouping model proposed in an embodiment of the application is described in detail below in conjunction with the related drawings. Please refer to FIG. 2. FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 2, the method includes steps S21-S23:

S21: Obtain a preset disease prevention and control guide, perform keyword recognition on the guide to obtain the partition attribute set of each disease in the combined disease, calculate the information gain rate of each partition attribute in the set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease.

In the specific embodiment of the present application, the combined disease refers to a combination of at least two diseases, such as diabetes + hypertension or diabetes + hypertension + heart disease. The disease prevention and control guidelines may be the guidelines corresponding to each disease in the combined disease, for example diabetes prevention guidelines, hypertension prevention guidelines, and heart disease prevention guidelines. They can be stored in the database and obtained by the server from the database, or sent to the server by a developer through the user terminal. The partition attribute set can be extracted from the guidelines by keyword recognition, text processing, and similar techniques; for example, the partition attribute set for hypertension can be {age, blood pressure, glucose tolerance, ..., high salt, ankle/brachial blood pressure index}. The first knowledge grouping decision tree is the knowledge grouping decision tree derived for each disease in the combined disease during the model training stage; it can be constructed with the C4.5 algorithm by calculating the information gain rate of each partition attribute in the partition attribute set. A first candidate joint grouping scheme is a scheme obtained by the server, in the model training stage, by combining the grouping schemes under the first knowledge grouping decision trees.

The disease prevention and control guidelines contain treatment decision knowledge for the related diseases, such as treatment suggestions and drug suggestions. Sorting out the guidelines related to each disease in the combined disease yields the first knowledge grouping decision tree corresponding to each disease. The first knowledge grouping decision trees are independent of each other, and each includes the grouping schemes of its disease. For example, the grouping schemes under the first knowledge grouping decision tree corresponding to diabetes are A={A1,A2,...,An} (where each Ai is a grouping scheme, indicating that the patient may be assigned to patient group Ai), and the grouping schemes under the first knowledge grouping decision tree corresponding to hypertension are B={B1,B2,...,Bm} (where each Bj is a grouping scheme).
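The information gain rate used by C4.5 to choose a partition attribute can be illustrated with a short sketch. The application does not state what records or target classes the gain rate is computed over, so the toy `records`, the attribute names, and the `group` target below are assumptions made purely for illustration; only the gain-ratio criterion itself is the standard C4.5 definition.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain ratio of `attribute` with respect to `target` (C4.5 criterion)."""
    total = len(rows)
    # Partition the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Information gain = entropy before the split minus weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = -sum(len(p) / total * math.log2(len(p) / total) for p in partitions.values())
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; attribute names and groups are illustrative only.
records = [
    {"blood_pressure": "high", "high_salt": "yes", "group": "B1"},
    {"blood_pressure": "high", "high_salt": "no", "group": "B1"},
    {"blood_pressure": "normal", "high_salt": "yes", "group": "B2"},
    {"blood_pressure": "normal", "high_salt": "no", "group": "B2"},
]
ratios = {a: gain_ratio(records, a, "group") for a in ("blood_pressure", "high_salt")}
# The attribute with the highest gain ratio would be chosen for the next split.
best_attribute = max(ratios, key=ratios.get)
```

In this toy data `blood_pressure` separates the two groups perfectly while `high_salt` carries no information, so a C4.5-style builder would split on `blood_pressure` first.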

In addition, if the diseases in the combined disease are diabetes and hypertension, step S21 yields the grouping schemes A={A1,A2,...,An} under the first knowledge grouping decision tree corresponding to diabetes and the grouping schemes B={B1,B2,...,Bm} under the first knowledge grouping decision tree corresponding to hypertension, and each Ai+Bj is a first candidate joint grouping scheme. For example, if the selectable grouping schemes for a patient with hypertension and diabetes are {A1,A2} under the diabetes tree and {B1,B2} under the hypertension tree, then the patient's candidate joint grouping schemes are {A1+B1, A2+B1, A1+B2, A2+B2}. Combining the schemes in this way yields the n (multiple) first candidate joint grouping schemes for the patient.
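The combination step described above is a Cartesian product of the per-disease scheme sets. A minimal sketch, assuming schemes are represented as plain strings and joined with "+" (the function name `joint_schemes` is hypothetical):

```python
from itertools import product


def joint_schemes(per_disease_options):
    """Combine the selectable grouping schemes of each disease into joint schemes."""
    return ["+".join(combo) for combo in product(*per_disease_options)]


diabetes_options = ["A1", "A2"]        # selectable schemes under the diabetes tree
hypertension_options = ["B1", "B2"]    # selectable schemes under the hypertension tree
candidates = joint_schemes([diabetes_options, hypertension_options])
```

The same function extends to three or more diseases by passing one more list of options, in which case the number of candidates is the product of the sizes of the individual lists.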

S22: Obtain n pieces of sample data of the patient suffering from the combined disease, and generate an outcome label for each piece of sample data according to each index in each piece of sample data.

In the specific embodiment of this application, the sample data refers to the follow-up data of patients suffering from the combined disease. Follow-up is an observation method in which the hospital, by communication or other means, regularly tracks changes in the condition of patients who have been treated at the hospital and guides their recovery. A patient usually has multiple follow-up visits. The data from one follow-up visit can be used as one piece of sample data, and each piece of sample data has a corresponding first candidate joint grouping scheme. Optionally, each piece of sample data includes the patient's demographic information, medication history for all diseases, test and examination indicators, doctor's prescriptions, and vital signs. Each of these may contain multiple indicators; for example, the test and examination indicators include glycosylated hemoglobin (HbA1c) for diabetes and blood pressure (BP) for hypertension. The importance of each indicator in the sample data is obtained and then used to generate an outcome label for each piece of sample data.

Specifically, the indicators in each piece of sample data are used to train the logistic regression model $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, and gradient descent is used during training to reduce the logarithmic loss and thereby estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \dots, \beta_n$. During gradient descent, the regression model is considered to have converged when the difference in log loss between two iterations is less than a preset threshold. Here $y$ is the output of the regression model, that is, whether complications increase or the patient has died by the next follow-up, which is a binary classification; $X$ is the input of the regression model, that is, the indicators in the sample data, with $X_n$ the n-th input indicator; and $\beta$ denotes the regression coefficients. Each regression coefficient is taken as the importance of the corresponding indicator, so $\beta_1$ is the importance of indicator $X_1$.
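The training loop described above (gradient descent on the log loss, stopping when the loss changes by less than a preset threshold) can be sketched in plain Python. The learning rate `lr`, threshold `tol`, iteration cap `max_iter`, and the toy data are all assumptions for illustration; the application does not give concrete values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(beta, x):
    """y = 1 / (1 + e^-(beta0 + beta1*x1 + ... + betan*xn))."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def log_loss(X, y, beta):
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(beta, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def fit_logistic(X, y, lr=0.1, tol=1e-6, max_iter=10000):
    """Estimate beta by gradient descent; stop when the loss change falls below tol."""
    beta = [0.0] * (len(X[0]) + 1)
    prev = log_loss(X, y, beta)
    for _ in range(max_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = predict(beta, xi) - yi      # derivative of the log loss w.r.t. the logit
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / len(X) for b, g in zip(beta, grad)]
        cur = log_loss(X, y, beta)
        if abs(prev - cur) < tol:             # convergence test from the description
            break
        prev = cur
    return beta


# Hypothetical toy data: two indicators per sample, binary outcome at next follow-up.
X_train = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0], [5.0, 0.0]]
y_train = [0, 0, 0, 1, 1, 1]
beta = fit_logistic(X_train, y_train)
importance = beta[1:]  # coefficients beta1..betan read as indicator importances
```

Because a raw coefficient depends on the scale of its indicator, a practical implementation would likely standardize the indicators before reading the coefficients as importances; that step is an assumption and is not described in the application.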

When machine learning methods are used to group patients, outcome labels must be generated for the sample data to identify the effect of a specific grouping under a specific patient condition, so that grouping schemes with good outcomes can be learned. However, existing methods consider only the absolute outcome, which makes machine learning less effective. In this scheme, both the absolute outcome and the relative outcome are considered when generating an outcome label for each piece of sample data, using the formula effect(i) = absolute(i) * relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, defined according to the importance of each indicator in that piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined in terms of absolute(i).

For example, for patients with diabetes and hypertension, take glycosylated hemoglobin (HbA1c) as the test indicator for diabetes and blood pressure (BP) as the test indicator for hypertension. The absolute outcome is defined as $\mathrm{absolute}(i) = \beta_{\mathrm{HbA1c}} \cdot (\mathrm{HbA1c}(i) - \mathrm{HbA1c}(i+1)) + \beta_{\mathrm{BP}} \cdot (\mathrm{BP}(i) - \mathrm{BP}(i+1))$, where $\beta_{\mathrm{HbA1c}}$ is the importance of glycosylated hemoglobin, taken from the regression coefficient estimated in the above regression model; $\beta_{\mathrm{BP}}$ is the importance of blood pressure; HbA1c(i) and BP(i) are the glycosylated hemoglobin and blood pressure in the i-th piece of sample data; and HbA1c(i+1) and BP(i+1) are the glycosylated hemoglobin and blood pressure in the next piece of sample data. The relative outcome is defined as $\mathrm{relative}(i) = \frac{\sum_{k \in N(p_i, d_i)} \mathrm{absolute}(k)}{\sum_{j \in N(p_i)} \mathrm{absolute}(j)}$, where $N(p_i)$ is the set of samples that the first knowledge grouping decision trees assign to the same leaf node as sample i, and $N(p_i, d_i)$ is the subset of $N(p_i)$ in which the grouping scheme actually adopted is the same as for sample i. Since each piece of sample data has a corresponding first candidate joint grouping scheme, the outcome label of each piece of sample data can be used to indicate the score of that sample's candidate joint grouping scheme.
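A small worked sketch of the outcome label effect(i) = absolute(i) * relative(i) follows. The sample layout (`leaf`, `scheme`, `now`, `next`), the importance values in `beta_importance`, and the measurements are all hypothetical; the zero-denominator fallback is also an assumption, since the application does not say how that case is handled.

```python
from collections import defaultdict


def absolute_outcome(sample, beta):
    """Weighted improvement of each indicator between this follow-up and the next."""
    return sum(beta[k] * (sample["now"][k] - sample["next"][k]) for k in beta)


def outcome_labels(samples, beta):
    """effect(i) = absolute(i) * relative(i) for every sample."""
    absolutes = [absolute_outcome(s, beta) for s in samples]
    leaf_total = defaultdict(float)          # sum over N(p_i)
    leaf_scheme_total = defaultdict(float)   # sum over N(p_i, d_i)
    for s, a in zip(samples, absolutes):
        leaf_total[s["leaf"]] += a
        leaf_scheme_total[(s["leaf"], s["scheme"])] += a
    labels = []
    for s, a in zip(samples, absolutes):
        denom = leaf_total[s["leaf"]]
        # Assumption: fall back to 0 when the leaf-level sum is zero.
        relative = leaf_scheme_total[(s["leaf"], s["scheme"])] / denom if denom else 0.0
        labels.append(a * relative)
    return labels


# Hypothetical importances and follow-up records, for illustration only.
beta_importance = {"HbA1c": 0.8, "BP": 0.05}
samples = [
    {"leaf": "L1", "scheme": "A1+B1", "now": {"HbA1c": 8.0, "BP": 150}, "next": {"HbA1c": 7.0, "BP": 140}},
    {"leaf": "L1", "scheme": "A1+B1", "now": {"HbA1c": 7.5, "BP": 145}, "next": {"HbA1c": 7.0, "BP": 140}},
    {"leaf": "L1", "scheme": "A2+B1", "now": {"HbA1c": 8.0, "BP": 150}, "next": {"HbA1c": 7.5, "BP": 145}},
]
effects = outcome_labels(samples, beta_importance)
```

With these numbers the absolute outcomes are 1.3, 0.65, and 0.65. Scheme A1+B1 accounts for 1.95 of the leaf total 2.6, so its relative outcome is 0.75, while A2+B1 gets 0.25. The second and third samples have the same absolute outcome but different labels, which is the effect the relative term is meant to introduce.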

S23: Train a lambdaMART model using the sample data with an outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.

In the specific embodiment of the present application, the lambdaMART model is originally a method for ranking documents in information retrieval: when the user issues a query, the candidate documents are ranked. In this scheme, the demographic information, test and examination indicators, and medication history in each piece of sample data serve as the query, and the first candidate joint grouping schemes serve as the documents. Each query-document pair has an outcome label. For each document, the lambda value is calculated first, and a regression tree is trained with the lambda values as labels. The final output score (the predicted score) is calculated from the regression result predicted at each leaf node of the regression tree. The score of each piece of sample data with an outcome label is predicted in this way, and the first candidate joint grouping schemes corresponding to the sample data are sorted by score. The process then returns to the step of calculating the lambda values and repeats the steps of training a regression tree, predicting scores, and sorting, forming a random forest. Training stops when one of the preset convergence conditions is met, which yields the required patient grouping model. The convergence conditions are: the number of regression trees reaches the preset parameter setting, or the random forest no longer improves on the validation set.
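The lambda calculation at the start of each training round can be sketched for a single query as follows. This is the commonly used LambdaRank-style gradient weighted by the change in NDCG, not necessarily the exact variant used in the application; the gain function `2**label - 1`, the sign convention (positive lambda pushes a document up), and the toy scores and labels are assumptions. A full lambdaMART implementation would then fit a regression tree to these lambda values, which is omitted here.

```python
import math


def dcg_gain(label):
    # Assumption: the usual exponential gain; it also accepts real-valued labels.
    return 2.0 ** label - 1.0


def ideal_dcg(labels):
    ordered = sorted(labels, reverse=True)
    return sum(dcg_gain(l) / math.log2(pos + 2) for pos, l in enumerate(ordered))


def lambdas_for_query(scores, labels):
    """Per-document lambda values for one query (positive = should move up)."""
    n = len(scores)
    order = sorted(range(n), key=lambda d: scores[d], reverse=True)
    rank = {doc: pos for pos, doc in enumerate(order)}  # current position of each document
    idcg = ideal_dcg(labels)
    lambdas = [0.0] * n
    if idcg == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where i should rank above j
            disc_i = 1.0 / math.log2(rank[i] + 2)
            disc_j = 1.0 / math.log2(rank[j] + 2)
            # |change in NDCG| if documents i and j swapped positions
            delta = abs((dcg_gain(labels[i]) - dcg_gain(labels[j])) * (disc_i - disc_j)) / idcg
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            lambdas[i] += rho * delta
            lambdas[j] -= rho * delta
    return lambdas


# Hypothetical current model scores and outcome labels for three candidate schemes.
current_scores = [0.2, 0.9, 0.5]
outcome = [0.975, 0.1625, 0.4]
lam = lambdas_for_query(current_scores, outcome)
```

In this toy query the first scheme has the best outcome label but the lowest current score, so its lambda is positive, while the second scheme is over-ranked and receives a negative lambda. Pairs that would change NDCG the most when swapped contribute the largest weights, which is why the method concentrates on the top of the ranking.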

It can be seen that the embodiment of the application obtains a preset disease prevention and control guide, performs keyword recognition on the guide to obtain the partition attribute set of each disease in the combined disease, calculates the information gain rate of each partition attribute in the set to generate the first knowledge grouping decision tree of each disease, and obtains, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients with the combined disease; obtains n pieces of sample data of patients with the combined disease and generates an outcome label for each piece of sample data according to the indicators in that piece of sample data; and trains the lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain the constructed patient grouping model. In this way, the training stage of the patient grouping model no longer considers the grouping scheme of a single disease but works out joint grouping schemes across multiple diseases, taking into account the interactions between different grouping decisions. At the same time, the outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to a certain extent eliminates the problem that biased samples are difficult to learn when only the absolute outcome is used. Moreover, the lambdaMART model is used for training, so the resulting patient grouping model pays attention both to each first candidate joint grouping scheme itself and to the priority order among the first candidate joint grouping schemes, which helps improve the grouping of patients with multiple diseases.

Please refer to FIG. 3. FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 3, it includes steps S31-S35:

S31: Obtain a preset disease prevention and control guide, perform keyword recognition on the guide to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain rate of each partition attribute in the set to generate a first knowledge grouping decision tree for each disease in the combined disease;

S32: Obtain n first candidate joint grouping schemes of patients suffering from the joint disease according to the first knowledge grouping decision tree;

S33: Obtain n pieces of sample data of patients suffering from the combined disease, and obtain the importance of each index in each piece of sample data.

In a possible implementation manner, the foregoing obtaining the importance of each indicator in each piece of sample data includes:

Train the logistic regression model $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$ using the indicators in each piece of sample data, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator;

In the training process, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \dots, \beta_n$ and thereby obtain the importance of each indicator.

In this embodiment, the regression coefficient β is used as the importance of each indicator in the sample data, which is conducive to the subsequent definition of absolute and relative outcomes.

S34, generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data;

In a possible implementation manner, the foregoing generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data includes:

Use the preset formula effect(i) = absolute(i) * relative(i) to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data, defined according to the importance of each indicator in that piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data, defined in terms of absolute(i).

In this embodiment, an outcome label is generated for each piece of sample data on the basis of the importance of each indicator obtained in step S33. The outcome label considers not only the absolute outcome but also the relative outcome, which addresses the lack of objectivity caused by considering only the absolute outcome and helps reduce the learning difficulty of the patient grouping model.

S35. Train a lambdaMART model by using the sample data with an outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.

The specific implementations of steps S31-S35 have been described in detail in the embodiment shown in FIG. 2 and, to avoid repetition, are not described again here.

In order to better understand the patient grouping model construction scheme proposed in the embodiments of the present application, a brief description is now given taking diabetes and hypertension as the combined disease. As shown in FIG. 4, the diabetes prevention guide (Guide 1) is used to derive the knowledge grouping decision tree for diabetes, and the hypertension prevention guide (Guide 2) is used to derive the knowledge grouping decision tree for hypertension. The grouping schemes under the diabetes tree are combined with the grouping schemes under the hypertension tree to obtain the candidate joint grouping schemes for diabetes and hypertension. Multiple pieces of follow-up data of patients with diabetes and hypertension are obtained from the database, and the glycosylated hemoglobin, blood pressure, and other indicators in each piece of follow-up data are used to train the logistic regression model. The estimated regression coefficients are taken as the importance of each indicator, the absolute outcome is defined based on the importance of each indicator, and the relative outcome is defined based on the absolute outcome. A formula that considers both the absolute outcome and the relative outcome is used to label each piece of follow-up data, yielding sample data with outcome labels. Finally, the sample data with outcome labels and the candidate joint grouping schemes for diabetes and hypertension are used for lambdaMART training. When a preset convergence condition is met, training stops and a usable patient grouping model is obtained.

Based on the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3, please refer to FIG. 5, which is a schematic flowchart of a patient grouping method provided by an embodiment of the present application. The patient grouping method can also be implemented on the network system architecture described above. As shown in FIG. 5, the method includes steps S51-S54:

S51: Receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

In a specific embodiment of the present application, the patient grouping request is used to request the server to group the patient to be grouped. The patient to be grouped suffers from the same combined disease as the sample patients in the model training stage, for example diabetes with hypertension. The patient grouping request may include the combined disease that the patient to be grouped suffers from, and may also include the prevention and control guidelines for each disease in the combined disease, basic information about the patient to be grouped, diagnosis information, and so on. The user terminal may be a terminal used by medical staff, a terminal in a medical research office, a terminal used by staff of a medical and health enterprise, and so on. For example, after diagnosing the patient to be grouped, a medical staff member can send the patient grouping request to the server through the user terminal.
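One possible shape for such a request is sketched below. The class name `PatientGroupingRequest` and all field names are assumptions made for illustration; the original text only states which kinds of information the request may carry and that it includes at least two diseases.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PatientGroupingRequest:
    """Hypothetical request payload sent from the user terminal to the server."""

    diseases: List[str]                                        # diseases making up the combined disease
    guidelines: Dict[str, str] = field(default_factory=dict)   # optional: disease -> guideline text
    basic_info: Dict[str, str] = field(default_factory=dict)   # optional: basic patient information
    diagnosis_info: Optional[str] = None                       # optional: diagnosis information

    def __post_init__(self):
        # Step S51 requires at least two diseases in the request.
        if len(set(self.diseases)) < 2:
            raise ValueError("a grouping request must name at least two diseases")


request = PatientGroupingRequest(diseases=["diabetes", "hypertension"])
```

Validating the disease count when the request is built keeps malformed requests from reaching the later steps.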

S52: Obtain a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain a second candidate joint grouping solution for the patient to be grouped according to the second knowledge grouping decision tree;

In a specific embodiment of the present application, the second knowledge grouping decision tree is the knowledge grouping decision tree generated during the use stage of the patient grouping model by processing the disease prevention and control guidelines through keyword recognition and calculating the information gain rate. The grouping schemes under the second knowledge grouping decision trees are combined to obtain the second candidate joint grouping schemes.
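The information gain rate used to choose division attributes can be sketched as below, using the usual C4.5-style definition (information gain divided by split information). The attribute names, the target column `group`, and the toy rows are hypothetical; the original text does not specify the data from which the rate is computed.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain rate of `attribute` with respect to `target`."""
    total = len(rows)
    base = entropy([r[target] for r in rows])
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[target])
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # the attribute takes a single value and cannot split the data
    return (base - conditional) / split_info


# Hypothetical toy data: two candidate division attributes and a target group.
rows = [
    {"hba1c_high": True, "age_over_65": True, "group": "intensive"},
    {"hba1c_high": True, "age_over_65": False, "group": "intensive"},
    {"hba1c_high": False, "age_over_65": True, "group": "standard"},
    {"hba1c_high": False, "age_over_65": False, "group": "standard"},
]
ratios = {a: gain_ratio(rows, a, "group") for a in ("hba1c_high", "age_over_65")}
best_attribute = max(ratios, key=ratios.get)
```

In this toy data `hba1c_high` separates the two groups perfectly while `age_over_65` carries no information, so the former would be chosen as the split attribute.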

S53: Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;

In a specific embodiment of the present application, the patient grouping model uses the trained regression trees to predict a score for each second candidate joint grouping scheme and ranks the second candidate joint grouping schemes by this score: a scheme with a higher score is ranked higher, and a scheme with a lower score is ranked lower.

S54: According to the sorting result of the second candidate joint grouping solution, a preset number of the second candidate joint grouping solutions are selected as the grouping result of the patient to be grouped and returned to the user terminal.

In a specific embodiment of the present application, the preset number of second candidate joint grouping schemes can be set according to actual conditions; for example, it may be the single top-ranked second candidate joint grouping scheme or the top three, which is not specifically limited. For example, suppose the second candidate joint grouping schemes of the patient to be grouped are A1+B1, A2+B1, A1+B2, and A2+B2, and their ranking result is A2+B1, A2+B2, A1+B1, A1+B2. If the top 2 second candidate joint grouping schemes are selected as the final joint grouping schemes of the patient to be grouped, the result returned to the user terminal is A2+B1, A2+B2.
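The ranking and selection in steps S53 and S54 can be sketched as follows. The scores are hypothetical values chosen only so that the ordering matches the example above; in practice they would come from the trained patient grouping model. The function name `top_k_schemes` is likewise illustrative.

```python
def top_k_schemes(scores, k):
    """Sort schemes by predicted score, highest first, and keep the first k."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical model scores for the four candidate schemes in the example.
scores = {"A1+B1": 0.4, "A2+B1": 0.9, "A1+B2": 0.1, "A2+B2": 0.7}

full_ranking = top_k_schemes(scores, k=len(scores))
result = top_k_schemes(scores, k=2)
print(result)  # ['A2+B1', 'A2+B2']
```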

With the patient grouping method provided by the embodiments of the present application, if the patient to be grouped has diabetes and hypertension, then upon receiving a patient grouping request sent by the user terminal, the method can be implemented as shown in FIG. 6. The diabetes knowledge grouping decision tree and the hypertension knowledge grouping decision tree are derived from the diabetes prevention and control guideline and the hypertension prevention and control guideline respectively. Multiple second candidate joint grouping schemes are obtained from the two knowledge grouping decision trees and input into the patient grouping model for score prediction and ranking, and the top-k second candidate joint grouping schemes are finally output. Because the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3 is used for prediction and ranking, this helps improve the grouping of patients with multiple combined diseases and is better suited to precision medicine.

Based on the description of the embodiment of the method for constructing a patient grouping model shown in FIG. 2, an embodiment of the present application also provides a device for constructing a patient grouping model. The device may be a computer program (including program code) running in a terminal, and it can execute the method shown in FIG. 2 or FIG. 3. Referring to FIG. 7, the device includes:

The first grouping scheme acquisition module 71 is configured to acquire a preset disease prevention and control guideline, perform keyword recognition on the disease prevention and control guideline to obtain the division attribute set of each disease in the combined disease, calculate the information gain rate of each division attribute in the division attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;

The outcome label generation module 72 is configured to obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in each piece of sample data; the sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

The grouping model training module 73 is configured to train the LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.

Based on the description of the embodiment of the method for grouping patients shown in FIG. 5, an embodiment of the present application also provides a device for grouping patients. Referring to FIG. 8, the device includes:

The grouping request obtaining module 81 is configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

The second grouping scheme acquisition module 82 is configured to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain the second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision tree;

The grouping scheme ranking module 83 is configured to input the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, and obtain a ranking result of the second candidate joint grouping schemes;

The grouping result output module 84 is configured to select, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and return the grouping result to the user terminal.

According to an embodiment of the present application, the units in the patient grouping model construction device and the patient grouping device shown in FIG. 7 and FIG. 8 may be separately or wholly combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units, which can achieve the same operations without affecting the technical effects of the embodiments of the present application. The above units are divided based on logical functions. In practical applications, the function of one unit may be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present application, the patient grouping model construction device and the patient grouping device may also include other units. In practical applications, these functions may also be realized with the assistance of other units and may be realized by multiple units in cooperation.

According to another embodiment of the present application, a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2, FIG. 3, or FIG. 5 may be run on a general-purpose computing device, such as a computer that includes processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM), to construct the patient grouping model construction device shown in FIG. 7 or the patient grouping device shown in FIG. 8, and to implement the patient grouping model construction method and the patient grouping method of the embodiments of the present application. The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above computing device through the computer-readable recording medium, and run therein.

Please refer to FIG. 9, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 9, the electronic device includes at least a processor 901, an input device 902, an output device 903, and a computer-readable storage medium 904. The processor 901, the input device 902, the output device 903, and the computer-readable storage medium 904 in the electronic device may be connected by a bus or in other ways.

The computer-readable storage medium 904 may be stored in the memory of the electronic device. The computer-readable storage medium 904 is used to store a computer program, the computer program includes program instructions, and the processor 901 is used to execute the program instructions stored in the computer-readable storage medium 904. The processor 901 (or CPU, Central Processing Unit) is the computing core and control core of the electronic device. It is adapted to implement one or more instructions, and specifically adapted to load and execute one or more instructions so as to implement the corresponding method flow or function.

In one embodiment, the processor 901 of the electronic device provided in the embodiment of the present application may be used to perform a series of patient grouping model construction processes, including:

The LambdaMART model is trained using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.

In one embodiment, the processor 901 generating an outcome label for each piece of sample data according to the indicators in each piece of sample data includes: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data.

In one embodiment, the processor 901 obtaining the importance of each indicator in each piece of sample data includes: training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator; during training, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \dots, \beta_n$ and thereby obtain the importance of each indicator.
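The logistic regression step described above can be sketched in plain Python as follows. This is a minimal illustration, not the applicant's implementation: the indicator names, the toy values, the binary target `y`, the learning rate, and the number of epochs are all assumptions, and reading importance as the absolute value of each coefficient presumes the indicators have been put on a comparable scale.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Estimate [b0, b1, ..., bn] by batch gradient descent on the mean log loss."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)  # beta[0] is the intercept
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            pred = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = pred - yi  # derivative of the log loss with respect to the linear term
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / m for b, g in zip(beta, grad)]
    return beta


# Hypothetical standardized indicators per follow-up record: [hba1c, sbp].
names = ["hba1c", "sbp"]
X = [[1.0, 0.2], [0.8, -0.3], [0.9, 0.1], [-1.0, 0.3], [-0.7, -0.2], [-0.9, -0.1]]
y = [1, 1, 1, 0, 0, 0]  # hypothetical binary target

beta = fit_logistic(X, y)
importance = {name: abs(b) for name, b in zip(names, beta[1:])}
```

In this toy data the first indicator alone separates the two classes, so its coefficient dominates and it receives the larger importance.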

In one embodiment, the processor 901 generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data includes: using a preset formula, effect(i)=absolute(i)*relative(i), to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
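The formula effect(i)=absolute(i)*relative(i) can be illustrated as below. The original text leaves the definitions of absolute(i) and relative(i) to the implementer, so both definitions here are assumptions chosen only to make the sketch concrete: absolute(i) is taken as the importance-weighted share of indicators that are within target, and relative(i) as the fraction of records whose absolute outcome does not exceed that of record i.

```python
def absolute_outcome(controlled, importance):
    """Importance-weighted share of indicators within target (assumed definition)."""
    total = sum(importance.values())
    return sum(importance[name] for name, ok in controlled.items() if ok) / total


def relative_outcome(value, all_values):
    """Fraction of records whose absolute outcome is <= `value` (assumed definition)."""
    return sum(1 for v in all_values if v <= value) / len(all_values)


def outcome_labels(records, importance):
    """effect(i) = absolute(i) * relative(i) for every record."""
    absolutes = [absolute_outcome(r, importance) for r in records]
    return [a * relative_outcome(a, absolutes) for a in absolutes]


# Hypothetical indicator importances and per-record "within target" flags.
importance = {"hba1c": 2.0, "sbp": 1.0}
records = [
    {"hba1c": True, "sbp": True},
    {"hba1c": True, "sbp": False},
    {"hba1c": False, "sbp": False},
]
labels = outcome_labels(records, importance)
```

Under these assumed definitions a record with every indicator within target receives the highest label, and a record with none within target receives a label of zero.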

In one embodiment, the processor 901 training the LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model includes:

A: Calculate the lambda value of the first candidate joint grouping scheme;

B: Train a regression tree with the lambda value as a label, and calculate the final output score based on the predicted regression result at each leaf node of the regression tree;

C: Predict the score of each piece of sample data with an outcome label through steps A and B, and sort the first candidate joint grouping schemes corresponding to the pieces of sample data according to those scores;

D: Repeat steps A to C to form a random forest, and stop training when one of the preset convergence conditions is met to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest no longer continues to improve on the validation set.
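Step A can be illustrated with the pairwise lambda computation commonly used in LambdaRank-style training, where each pair's contribution is weighted by the change in NDCG that swapping the two items would cause. The sketch below is an assumption about how the lambda values might be computed, not the applicant's implementation; the sign convention (a positive lambda means the item's score should be raised), the gain function, and the toy scores and labels are all illustrative.

```python
import math


def dcg_gain(label):
    return 2.0 ** label - 1.0


def lambdas_for_query(scores, labels):
    """Return one lambda per item; positive means 'raise this item's score'."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {item: pos for pos, item in enumerate(order)}  # 0-based position
    ideal = sorted(labels, reverse=True)
    idcg = sum(dcg_gain(l) / math.log2(pos + 2) for pos, l in enumerate(ideal))
    lambdas = [0.0] * n
    if idcg == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where item i should outrank item j
            # Absolute change in NDCG if items i and j swapped positions.
            delta = abs(
                (dcg_gain(labels[i]) - dcg_gain(labels[j]))
                * (1 / math.log2(rank[i] + 2) - 1 / math.log2(rank[j] + 2))
            ) / idcg
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            lambdas[i] += rho * delta
            lambdas[j] -= rho * delta
    return lambdas


# Hypothetical current scores and outcome labels for three candidate schemes;
# the scheme with the best label currently has the lowest score.
lam = lambdas_for_query(scores=[0.1, 0.5, 0.9], labels=[1.0, 0.4, 0.0])
```

In step B a regression tree would then be fitted with these lambda values as targets, and steps A to C would be repeated until one of the convergence conditions in step D holds.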

It should be noted that, since the processor 901 of the electronic device executes the computer program to implement the steps of the method for constructing a patient grouping model, all embodiments of the method for constructing a patient grouping model are applicable to the electronic device and can achieve the same or similar beneficial effects.

Please refer to FIG. 10, which is a schematic structural diagram of another electronic device provided by an embodiment of the present application. As shown in FIG. 10, the electronic device includes at least a processor 1001, an input device 1002, an output device 1003, and a computer-readable storage medium 1004. The processor 1001, the input device 1002, the output device 1003, and the computer-readable storage medium 1004 in the electronic device may be connected by a bus or in other ways.

In an embodiment, the processor 1001 of the electronic device provided in the embodiment of the present application may be used to perform a series of patient grouping processing, including:

It should be noted that, since the processor 1001 of the electronic device executes the computer program to implement the steps of the above patient grouping method, all embodiments of the above patient grouping method are applicable to the electronic device and can achieve the same or similar beneficial effects. In addition, the above patient grouping method and patient grouping model construction method may be executed by the same electronic device or by different electronic devices, which is not limited in the embodiments of the present application.

The embodiment of the present application also provides a computer-readable storage medium (Memory). The computer-readable storage medium is a memory device in an electronic device and is used to store programs and data. It can be understood that the computer-readable storage medium here may include a built-in storage medium in the terminal and may also include an extended storage medium supported by the terminal. The computer-readable storage medium provides storage space, and the storage space stores the operating system of the terminal. In addition, one or more instructions suitable for being loaded and executed by the processor 901 are stored in the storage space, and these instructions may be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one computer-readable storage medium located remotely from the aforementioned processor 901. In an embodiment, the processor 901 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:

In an example, when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are performed: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data.

In an example, when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are also performed: training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator; during training, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \dots, \beta_n$ and thereby obtain the importance of each indicator.
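For reference, the gradient descent estimation of the regression coefficients can be written out as follows. The symbols $m$ (number of pieces of sample data), $t_i$ (the observed binary target of the i-th piece of sample data), $y_i$ (the model output for it), $X_{ij}$ (the j-th indicator of the i-th piece of sample data), and $\eta$ (the learning rate) are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
L(\beta) &= -\frac{1}{m}\sum_{i=1}^{m}\Bigl[t_i \log y_i + (1 - t_i)\log(1 - y_i)\Bigr],
\qquad y_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_{i1} + \dots + \beta_n X_{in})}} \\
\frac{\partial L}{\partial \beta_j} &= \frac{1}{m}\sum_{i=1}^{m}(y_i - t_i)\,X_{ij}, \qquad X_{i0} = 1 \\
\beta_j &\leftarrow \beta_j - \eta\,\frac{\partial L}{\partial \beta_j}, \qquad j = 0, 1, \dots, n
\end{aligned}
```

Each update moves every coefficient against the gradient of the logarithmic loss, and the coefficients reached at the end of training are read as the indicator importances.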

In an example, when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are also executed: using a preset formula, effect(i)=absolute(i)*relative(i), to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator; and relative(i) represents the relative outcome of the i-th piece of sample data and is defined according to absolute(i).

In an example, when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are also executed:

A: Calculate the lambda value of the first candidate joint grouping scheme;

The embodiment of the present application also provides a computer-readable storage medium (Memory). In one embodiment, the processor 1001 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:

Exemplarily, the computer program of the computer-readable storage medium includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form, and the computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.

What is disclosed above is only some of the embodiments of the present application and of course cannot be used to limit the scope of rights of the present application. Those of ordinary skill in the art will understand that implementations of all or part of the procedures of the above embodiments, and equivalent changes made in accordance with the claims of the present application, still fall within the scope of the present application.

Claims

1. A method for constructing a patient grouping model, wherein the method includes:

Acquiring a preset disease prevention and control guideline, performing keyword recognition on the disease prevention and control guideline to obtain a division attribute set of each disease in a combined disease, calculating the information gain rate of each division attribute in the division attribute set to generate a first knowledge grouping decision tree of each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes of patients suffering from the combined disease;

Acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in each piece of sample data; the sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

Training a LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
2. The method according to claim 1, wherein the generating an outcome label for each piece of sample data according to the indicators in each piece of sample data comprises:

Obtain the importance of each indicator in each piece of sample data;

An outcome label is generated for each piece of sample data based on the importance of each indicator in each piece of sample data.
3. The method according to claim 2, wherein said obtaining the importance of each indicator in each piece of sample data comprises:

Training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator;

In the training process, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients β0, β1, β2...βn, and obtain the importance of each index.
4. The method according to claim 2, wherein the generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data comprises:

Using a preset formula, effect(i)=absolute(i)*relative(i), to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
5. The method according to any one of claims 1 to 4, wherein the training a LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model includes:

A: Calculate the lambda value of the first candidate joint grouping scheme;

B: Train a regression tree with the lambda value as a label, and calculate the final output score based on the predicted regression result at each leaf node of the regression tree;

C: Predict the score of each piece of sample data with an outcome label through steps A and B, and sort the first candidate joint grouping schemes corresponding to the pieces of sample data according to those scores;

D: Repeat steps A to C to form a random forest, and stop training when one of the preset convergence conditions is met to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest no longer continues to improve on the validation set.
6. A patient grouping method using the patient grouping model constructed by the method of any one of claims 1 to 5, wherein the method comprises:

Receiving a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

Acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining a second candidate joint grouping plan of the patient to be grouped according to the second knowledge grouping decision tree;

Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;

According to the ranking result of the second candidate joint grouping scheme, a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
7. A device for constructing a patient grouping model, wherein the device comprises:

The first grouping scheme acquisition module, configured to acquire a preset disease prevention and control guideline, perform keyword recognition on the disease prevention and control guideline to obtain the division attribute set of each disease in the combined disease, calculate the information gain rate of each division attribute in the division attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes of patients suffering from the combined disease;

The outcome label generation module, configured to obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in each piece of sample data; the sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

The grouping model training module, configured to train the LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
8. A device for grouping patients, wherein the device comprises:

The grouping request acquisition module, configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

The second grouping scheme acquisition module, configured to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain the second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision tree;

The grouping scheme ranking module, configured to input the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, and obtain a ranking result of the second candidate joint grouping schemes;

The grouping result output module, configured to select, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and return the grouping result to the user terminal.
9. An electronic device, including an input device and an output device, and further including:

A processor, adapted to implement one or more instructions; and,

A computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute:

Acquiring a preset disease prevention and control guideline, performing keyword recognition on the disease prevention and control guideline to obtain a division attribute set of each disease in a combined disease, calculating the information gain rate of each division attribute in the division attribute set to generate a first knowledge grouping decision tree of each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes of patients suffering from the combined disease;

Acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in each piece of sample data; the sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

Training a LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
10. The electronic device according to claim 9, wherein the processor generating an outcome label for each piece of sample data according to the indicators in each piece of sample data comprises:

Obtain the importance of each indicator in each piece of sample data;

An outcome label is generated for each piece of sample data based on the importance of each indicator in each piece of sample data.
11. The electronic device according to claim 10, wherein said acquiring, by said processor, the importance of each index in each piece of sample data comprises:

Training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator;

In the training process, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients β0, β1, β2...βn, and obtain the importance of each index.
12. The electronic device according to claim 10, wherein the processor generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data comprises:

Using a preset formula, effect(i)=absolute(i)*relative(i), to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
13. The electronic device according to any one of claims 9-12, wherein the processor training the LambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model includes:

A: Calculate the lambda value of the first candidate joint grouping scheme;

B: Train a regression tree with the lambda value as a label, and calculate the final output score based on the predicted regression result at each leaf node of the regression tree;

C: Predict the score of each piece of sample data with an outcome label through steps A and B, and sort the first candidate joint grouping scheme corresponding to each piece of sample data according to the score of that piece of sample data;

D: Repeat steps A to C to form a random forest, and stop training when one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
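Steps A to D can be sketched in a simplified form as follows. Several points are assumptions of the sketch and not of the application: the lambda values weight each pair by the NDCG change from swapping the two items, each regression tree is reduced to a depth-1 stump whose leaf output is the mean lambda (the usual LambdaMART description uses a Newton step instead), the trees are accumulated additively although the claim calls the ensemble a random forest, the second convergence condition is read as "validation NDCG has not improved for `patience` rounds", and all names and toy data are hypothetical.

```python
import math

def dcg(labels):
    return sum((2 ** l - 1) / math.log2(r + 2) for r, l in enumerate(labels))

def ndcg(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0

def compute_lambdas(scores, labels):
    # Step A: pairwise gradients weighted by the NDCG change of swapping a pair.
    n = len(scores)
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {item: r for r, item in enumerate(order)}
    ideal = dcg(sorted(labels, reverse=True)) or 1.0
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                gain = 2 ** labels[i] - 2 ** labels[j]
                disc = 1 / math.log2(rank[i] + 2) - 1 / math.log2(rank[j] + 2)
                step = rho * abs(gain * disc) / ideal
                lambdas[i] += step
                lambdas[j] -= step
    return lambdas

def fit_stump(X, targets):
    # Step B: depth-1 regression tree chosen by squared error on the lambdas.
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [y for x, y in zip(X, targets) if x[f] <= t]
            right = [y for x, y in zip(X, targets) if x[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant tree
        m = sum(targets) / len(targets)
        return (0, float("inf"), m, m)
    return best[1:]

def predict(trees, x, lr):
    # Final score: shrunken sum of the leaf outputs of all trees.
    return sum(lr * (lm if x[f] <= t else rm) for f, t, lm, rm in trees)

def train(X, labels, X_val, labels_val, max_trees=50, lr=0.5, patience=5):
    # Step D: repeat A to C until a tree budget or a validation plateau is hit.
    trees, best_val, stale = [], -1.0, 0
    while len(trees) < max_trees and stale < patience:
        scores = [predict(trees, x, lr) for x in X]
        trees.append(fit_stump(X, compute_lambdas(scores, labels)))
        val = ndcg([predict(trees, x, lr) for x in X_val], labels_val)  # step C
        if val > best_val + 1e-12:
            best_val, stale = val, 0
        else:
            stale += 1
    return trees

# Hypothetical toy data: one feature per candidate scheme, integer outcome grades.
X = [[0.1], [0.4], [0.35], [0.8], [0.6]]
labels = [0, 1, 1, 3, 2]
X_val, labels_val = [[0.2], [0.7], [0.5]], [0, 3, 1]

trees = train(X, labels, X_val, labels_val)
final_scores = [predict(trees, x, 0.5) for x in X]
ranking = sorted(range(len(X)), key=lambda i: -final_scores[i])
```

The stump keeps the example short; a production ranker would normally rely on an existing gradient-boosting library with a LambdaMART-style ranking objective.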
14. An electronic device, comprising an input device and an output device, and further comprising:

A processor, adapted to implement one or more instructions; and,

A computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute:

Receiving a patient grouping request submitted by a user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

Acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision tree;

Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;

According to the ranking result of the second candidate joint grouping scheme, a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
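The grouping flow of this claim (build candidates, score, sort, keep a preset number) can be sketched as below. The sketch assumes that candidate joint schemes are formed as the Cartesian product of the groups each disease's decision tree allows, which the claim does not state, and it replaces the trained patient grouping model with a hypothetical lookup table of scores. The disease and group names are placeholders.

```python
from itertools import product

def candidate_joint_schemes(groups_per_disease):
    # ASSUMPTION: one candidate per combination of per-disease groups.
    diseases = sorted(groups_per_disease)
    combos = product(*(groups_per_disease[d] for d in diseases))
    return [dict(zip(diseases, combo)) for combo in combos]

def select_top_schemes(candidates, score_fn, preset_number):
    # Sort by model score, highest first, and keep the preset number of schemes.
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:preset_number]

# Hypothetical groups reachable in each disease's knowledge grouping decision tree.
groups_per_disease = {"disease_a": ["A1", "A2"], "disease_b": ["B1", "B2", "B3"]}

# Hypothetical stand-in for the scores a trained ranking model would output.
toy_scores = {
    ("A1", "B1"): 0.2, ("A1", "B2"): 0.9, ("A1", "B3"): 0.1,
    ("A2", "B1"): 0.4, ("A2", "B2"): 0.3, ("A2", "B3"): 0.7,
}

def toy_model_score(scheme):
    return toy_scores[(scheme["disease_a"], scheme["disease_b"])]

candidates = candidate_joint_schemes(groups_per_disease)
result = select_top_schemes(candidates, toy_model_score, preset_number=2)
```

Here `result` holds the two highest-scoring schemes, which is what would be returned to the user terminal as the grouping result.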
15. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded and executed by a processor:

Obtain a preset disease prevention and control guide, identify keywords in the disease prevention and control guide to obtain the division attribute set of each disease in the combined disease, calculate the information gain rate of each division attribute in the division attribute set to generate the first knowledge grouping decision tree of each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;

Obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; there is a one-to-one correspondence between the sample data and the first candidate joint grouping schemes, the outcome label of each piece of sample data indicates the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;

Train a LambdaMART model using the sample data with the outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
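The decision-tree step of this claim selects division attributes by their information gain rate. The sketch below assumes this term means the gain ratio familiar from C4.5-style trees, that is, information gain divided by the split information of the attribute. The attribute names, group labels and rows are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(rows, attribute, target):
    # Information gain of splitting on `attribute`, divided by its split information.
    base = entropy([r[target] for r in rows])
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[target])
    conditional = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    split_info = entropy([r[attribute] for r in rows])
    return (base - conditional) / split_info if split_info > 0 else 0.0

# Hypothetical division attributes recognized from a guide, with a known group per row.
rows = [
    {"age_band": "<60", "smoker": "yes", "group": "A"},
    {"age_band": "<60", "smoker": "no", "group": "A"},
    {"age_band": ">=60", "smoker": "yes", "group": "B"},
    {"age_band": ">=60", "smoker": "no", "group": "B"},
]

ratios = {a: gain_ratio(rows, a, "group") for a in ("age_band", "smoker")}
root_attribute = max(ratios, key=ratios.get)
```

In this toy table `age_band` determines the group completely while `smoker` carries no information, so `age_band` would be chosen as the root split.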
16. The computer-readable storage medium according to claim 15, wherein the one or more instructions, when loaded by the processor, further execute:

Obtain the importance of each indicator in each piece of sample data; and generate an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data.
17. The computer-readable storage medium according to claim 16, wherein the one or more instructions, when loaded by the processor, further execute:

Train a logistic regression model using the indicators in each piece of sample data: $y = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}$, where $y$ represents the output of the regression model, $X_1, X_2, \dots, X_n$ represent the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \dots, \beta_n$ represent the importance of each indicator; during training, the logarithmic loss is reduced by the gradient descent method to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \dots, \beta_n$, thereby obtaining the importance of each indicator.
18. The computer-readable storage medium according to claim 16, wherein the one or more instructions, when loaded by the processor, further execute:

Use the preset formula effect(i) = absolute(i) * relative(i) to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data and is defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
19. The computer-readable storage medium according to any one of claims 15-18, wherein the one or more instructions, when loaded by the processor, further execute:

A: Calculate the lambda value of the first candidate joint grouping scheme;

B: Train a regression tree with the lambda value as a label, and calculate the final output score based on the predicted regression result at each leaf node of the regression tree;

C: Predict the score of each piece of sample data with an outcome label through steps A and B, and sort the first candidate joint grouping scheme corresponding to each piece of sample data according to the score of that piece of sample data;

D: Repeat steps A to C to form a random forest, and stop training when one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
20. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded and executed by a processor:

Receiving a patient grouping request submitted by a user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;

Acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision tree;

Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;

According to the ranking result of the second candidate joint grouping scheme, a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.