WO2021114635A1 - Patient grouping model constructing method, patient grouping method, and related device - Google Patents

Info

Publication number
WO2021114635A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample data
grouping
patient
piece
outcome
Prior art date
Application number
PCT/CN2020/099530
Other languages
French (fr)
Chinese (zh)
Inventor
徐卓扬
孙行智
赵惟
左磊
胡岗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021114635A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • This application relates to the field of machine learning, and in particular to a method for constructing a patient grouping model, a patient grouping method, and related devices.
  • The present application provides a method for constructing a patient grouping model, a patient grouping method, and related devices, which help improve the joint grouping of patients with multiple diseases.
  • An embodiment of the present application provides a method for constructing a patient grouping model, the method including:
  • Obtaining preset disease prevention and control guidelines, identifying keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • Training a LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • an embodiment of the present application provides a method for grouping patients, which includes:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • An embodiment of the present application provides an apparatus for constructing a patient grouping model, the apparatus including:
  • A first grouping scheme acquisition module, used to obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • An outcome label generation module, used to obtain n pieces of sample data of patients suffering from the combined disease and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the sample data and the first candidate joint grouping schemes correspond one-to-one, the outcome label of each piece of sample data indicates the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
  • A grouping model training module, used to train a LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • An embodiment of the present application provides a patient grouping device, which includes:
  • A grouping request acquisition module, configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;
  • A second grouping scheme acquisition module, used to obtain a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and to obtain second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
  • A grouping scheme ranking module, configured to input the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, and obtain a ranking result of the second candidate joint grouping schemes;
  • A grouping result output module, configured to select, according to the ranking result, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped and return them to the user terminal.
  • An embodiment of the present application provides an electronic device that includes an input device, an output device, a processor adapted to implement one or more instructions, and a computer-readable storage medium. The computer-readable storage medium stores one or more instructions, which are suitable for being loaded by the processor to execute the following steps:
  • Obtaining preset disease prevention and control guidelines, identifying keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • Training a LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • An embodiment of the present application provides an electronic device that includes an input device, an output device, a processor adapted to implement one or more instructions, and a computer-readable storage medium. The computer-readable storage medium stores one or more instructions, which are suitable for being loaded by the processor to execute the following steps:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • An embodiment of the present application provides a computer-readable storage medium that stores one or more instructions, which are suitable for being loaded by a processor to execute the following steps:
  • Obtaining preset disease prevention and control guidelines, identifying keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • Training a LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • An embodiment of the present application provides a computer-readable storage medium that stores one or more instructions, which are suitable for being loaded by a processor to execute the following steps:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • Instead of considering the grouping scheme of a single disease, the embodiments work out multi-disease joint grouping schemes, which takes into account the interactions between different grouping decisions. The outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent eliminates the problem that biased samples are difficult to learn from when only the absolute outcome is used.
  • Because the LambdaMART model is used for training, the resulting patient grouping model attends not only to each first candidate joint grouping scheme itself but also to the priority order among the first candidate joint grouping schemes, which improves the grouping of patients with multiple diseases.
  • FIG. 1 is a diagram of a network system architecture provided by an embodiment of this application.
  • FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 4 is an example diagram of constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 5 is a schematic flowchart of a method for grouping patients according to an embodiment of the application.
  • FIG. 6 is an example diagram of a patient grouping provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of an apparatus for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a patient grouping device provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
  • The embodiments of the application provide a solution for constructing a patient grouping model suitable for patients with multiple diseases.
  • The knowledge grouping decision tree of each disease in the combined disease is used to obtain candidate joint grouping schemes for patients suffering from the combined disease, which fully takes into account the interactions between the single-disease grouping schemes.
  • The patient's follow-up data is used as the sample data, and an outcome label is generated for each piece of sample data according to the importance of the indicators in it, such as demographic information, medication history, laboratory tests, and vital signs.
  • Besides the absolute outcome, this application also considers the relative outcome, which is more objective and reasonable.
  • The patient grouping model is based on the LambdaMART model, which during learning pays more attention to the order of the top-ranked candidate joint grouping schemes. As a result, when the trained patient grouping model is applied to grouping patients with multiple diseases, it produces better grouping results and is better suited to precision medicine.
  • The patient grouping model construction solution can be implemented on the network system architecture shown in FIG. 1.
  • The network system architecture includes at least a user terminal, a server, and a database, which communicate over a wired or wireless network; the specific communication protocol is not limited.
  • The user terminal can be used to submit disease prevention and control guidelines, follow-up data of patients with the combined disease, and so on to the server through program code or touch signals, so as to request the server to execute the steps of constructing the patient grouping model.
  • The database can be used to store disease prevention and control guidelines and a large number of patients' demographic information, medical treatment data, follow-up data, and so on. Developers can enter conditional query statements on the user terminal to extract the required information from the database, for example extracting the follow-up data of patients with hypertension and diabetes as sample data. The database can be a database in the server, a database independent of the server, or a cloud database.
  • The user terminal in this application can be a desktop computer, a tablet computer, a supercomputer, and so on, and the server can be a local server, a cloud server, a server cluster, and so on.
  • FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 2, the method includes steps S21-S23:
  • S21: Obtain preset disease prevention and control guidelines, perform keyword recognition on the guidelines to obtain the partition attribute set of each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease.
  • The combined disease refers to a combination of at least two diseases, such as diabetes + hypertension or diabetes + hypertension + heart disease.
  • The disease prevention and control guidelines may be the guidelines corresponding to each disease in the combined disease, for example diabetes, hypertension, and heart disease prevention and control guidelines. They can be stored in the database and obtained from it by the server, or sent to the server by a developer through the user terminal.
  • The partition attribute set can be extracted from the disease prevention and control guidelines by keyword recognition, text processing, and similar techniques. For example, the partition attribute set for hypertension can be {age, blood pressure, glucose tolerance, ..., high salt, ankle/brachial blood pressure index}. The first knowledge grouping decision tree is the knowledge grouping decision tree worked out for each disease in the combined disease during the model training stage; it can be constructed by using the C4.5 algorithm to calculate the information gain ratio of each partition attribute in the partition attribute set.
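  • The information gain ratio that C4.5 uses to choose a split can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the row layout, and the toy attribute values (`bp`, `age`, `group`) are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """C4.5 gain ratio of `attribute` with respect to `target`.

    rows is a list of dicts; both keys are hypothetical column names.
    """
    total = len(rows)
    base = entropy([r[target] for r in rows])
    # Split the target values by the value of the candidate attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[target])
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    # Split information penalises attributes with many distinct values.
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (base - conditional) / split_info


# Hypothetical toy rows: blood-pressure level fully determines the group,
# age band carries no information about it.
rows = [
    {"bp": "high", "age": "old", "group": "B1"},
    {"bp": "high", "age": "young", "group": "B1"},
    {"bp": "normal", "age": "old", "group": "B2"},
    {"bp": "normal", "age": "young", "group": "B2"},
]
best = max(["bp", "age"], key=lambda a: gain_ratio(rows, a, "group"))
print(best)
```

  • At each node, the attribute with the highest gain ratio would be chosen as the split, and the procedure would recurse on each partition.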
  • The first candidate joint grouping schemes are obtained when, in the model training stage, the server combines the grouping schemes under the first knowledge grouping decision trees.
  • The disease prevention and control guidelines contain treatment decision knowledge for the related diseases, such as treatment and drug recommendations. The guidelines related to each disease in the combined disease are worked through to obtain the first knowledge grouping decision tree corresponding to that disease, and the first knowledge grouping decision trees are independent of each other.
  • For example, if the selectable grouping schemes under the first knowledge grouping decision tree corresponding to one disease are {A1, A2} and those under the first knowledge grouping decision tree corresponding to hypertension are {B1, B2}, the patient's possible candidate joint grouping schemes are {A1+B1, A2+B1, A1+B2, A2+B2}. Combining the schemes in this way yields the n (multiple) first candidate joint grouping schemes for the patient.
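  • The combination step above amounts to a Cartesian product over the per-disease scheme lists. A minimal sketch, assuming each scheme is represented by a string label and joined with "+" (the function name is hypothetical):

```python
from itertools import product


def candidate_joint_schemes(per_disease_schemes):
    """Combine one scheme per disease into every possible joint scheme."""
    return ["+".join(combo) for combo in product(*per_disease_schemes)]


schemes = candidate_joint_schemes([["A1", "A2"], ["B1", "B2"]])
print(schemes)  # ['A1+B1', 'A1+B2', 'A2+B1', 'A2+B2']
```

  • The same function handles three or more diseases by passing one list per disease.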
  • S22: Obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data.
  • The sample data refers to the follow-up data of patients suffering from the combined disease. Follow-up is an observation method in which the hospital, through communication or other means, regularly tracks changes in the condition of patients who have been treated there and guides their recovery.
  • A patient usually has multiple follow-up visits. The data from one follow-up visit can be used as one piece of sample data, and each piece of sample data has a corresponding first candidate joint grouping scheme.
  • Each piece of sample data includes the patient's demographic information, medication history for all diseases, laboratory test and examination indicators, doctor's prescriptions, and vital signs. For example, the medication history may contain multiple indicators.
  • HbA1c: glycosylated hemoglobin.
  • BP: blood pressure.
  • In the regression model, $y$ represents the output, that is, whether complications will have increased or the patient will have died by the next follow-up; $X$ represents the input, that is, the indicators in the sample data, with $X_n$ being the $n$-th input indicator; and $\beta$ represents the regression coefficients of the indicators, so that $\beta_1$ represents the importance of indicator $X_1$.
  • The regression coefficients are taken as the importance of the corresponding indicators.
  • The outcome label is $\text{effect}(i) = \text{absolute}(i) \cdot \text{relative}(i)$, where $\text{effect}(i)$ represents the outcome label of the $i$-th piece of sample data; $\text{absolute}(i)$ represents its absolute outcome, defined according to the importance of each indicator in the $i$-th piece of sample data; and $\text{relative}(i)$ represents its relative outcome, defined in terms of $\text{absolute}(i)$.
  • With HbA1c denoting glycosylated hemoglobin, the test indicator for diabetes, and BP denoting blood pressure, the test indicator for hypertension, the absolute outcome is:
  • $\text{absolute}(i) = \beta_{\text{HbA1c}} \cdot (\text{HbA1c}(i) - \text{HbA1c}(i+1)) + \beta_{\text{BP}} \cdot (\text{BP}(i) - \text{BP}(i+1))$
  • Here $\beta_{\text{HbA1c}}$ represents the importance of glycosylated hemoglobin and $\beta_{\text{BP}}$ the importance of blood pressure, both taken from the regression coefficients estimated in the regression model above; $\text{HbA1c}(i)$ and $\text{BP}(i)$ are the glycosylated hemoglobin and blood pressure in the $i$-th piece of sample data; and $\text{HbA1c}(i+1)$ and $\text{BP}(i+1)$ are the glycosylated hemoglobin and blood pressure in the next piece of sample data.
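  • The absolute outcome is a weighted sum of the drop in each indicator between two consecutive follow-ups. A minimal sketch, in which the function name, the dictionary layout, and all numeric values (including the importance weights) are hypothetical:

```python
def absolute_outcome(current, following, importance):
    """Weighted improvement between follow-up i and follow-up i+1.

    current / following map indicator name -> measured value;
    importance maps indicator name -> regression-coefficient weight.
    """
    return sum(
        weight * (current[name] - following[name])
        for name, weight in importance.items()
    )


# Hypothetical weights and measurements, chosen only for illustration.
importance = {"HbA1c": 0.5, "BP": 0.25}
visit_i = {"HbA1c": 8.0, "BP": 150.0}
visit_next = {"HbA1c": 7.0, "BP": 140.0}

print(absolute_outcome(visit_i, visit_next, importance))  # 0.5*1.0 + 0.25*10.0 = 3.0
```

  • With positive weights, a positive value means the indicators fell between the two visits, and a negative value means they rose.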
  • $\text{relative}(i) = \dfrac{\sum_{k \in N(p_i, d_i)} \text{absolute}(k)}{\sum_{j \in N(p_i)} \text{absolute}(j)}$, where $N(p_i)$ is the set of samples that the first knowledge grouping decision trees assign to the same leaf node as sample $i$, and $N(p_i, d_i)$ is the subset of $N(p_i)$ whose actually adopted grouping scheme is the same as that of sample $i$.
  • Since each piece of sample data has a corresponding first candidate joint grouping scheme, the outcome label of each piece of sample data can be used to indicate the score of that sample's candidate joint grouping scheme.
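  • The relative outcome can be computed by accumulating absolute outcomes per leaf node and per (leaf node, adopted scheme) pair. The sketch below assumes each sample is a dict with hypothetical keys `leaf`, `scheme`, and `absolute`; returning 0.0 when a leaf's total is zero is also an assumption, since the source does not say how that case is handled.

```python
from collections import defaultdict


def relative_outcomes(samples):
    """relative(i) = sum of absolute over N(p_i, d_i) / sum of absolute over N(p_i)."""
    leaf_total = defaultdict(float)    # sum over N(p_i): same leaf node
    scheme_total = defaultdict(float)  # sum over N(p_i, d_i): same leaf and same scheme
    for s in samples:
        leaf_total[s["leaf"]] += s["absolute"]
        scheme_total[(s["leaf"], s["scheme"])] += s["absolute"]
    result = []
    for s in samples:
        denominator = leaf_total[s["leaf"]]
        numerator = scheme_total[(s["leaf"], s["scheme"])]
        # Assumption: a zero denominator yields 0.0 instead of raising.
        result.append(numerator / denominator if denominator else 0.0)
    return result


# Hypothetical samples that all fall into the same leaf node.
samples = [
    {"leaf": "L1", "scheme": "A1+B1", "absolute": 2.0},
    {"leaf": "L1", "scheme": "A1+B1", "absolute": 1.0},
    {"leaf": "L1", "scheme": "A2+B1", "absolute": 1.0},
]
print(relative_outcomes(samples))  # [0.75, 0.75, 0.25]
```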
  • The LambdaMART model was originally a method for ranking documents in information retrieval: when the user issues a query, the candidate documents are ranked.
  • Here, the demographic information, laboratory test and examination indicators, and medication history in each piece of sample data are used as the query, and the first candidate joint grouping schemes are used as the documents. Each query-document pair has an outcome label.
  • For each document, the lambda value is calculated first, a regression tree is trained with the lambda values as labels, and the final output score (the predicted score) is calculated from the regression result predicted at each leaf node of the regression tree. In this way a score is predicted for each piece of outcome-labeled sample data, and the first candidate joint grouping schemes corresponding to the pieces of sample data are sorted by score. The process then returns to the lambda calculation step and repeats the steps of training a regression tree, predicting scores, and sorting, so that the regression trees accumulate into a forest. Training stops once one of the preset convergence conditions is met, which yields the required patient grouping model.
  • The convergence conditions are: the number of regression trees reaches the preset parameter value, or the forest no longer improves on the validation set.
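  • The lambda calculation that starts each boosting round can be sketched as below for the documents of a single query. This is a simplified LambdaRank-style gradient, not the patent's exact formulation: it assumes non-negative labels used directly as NDCG gains, a sigmoid steepness of 1, and hypothetical function names. Regression-tree fitting on the returned lambdas is omitted.

```python
import math


def dcg_discount(rank):
    """Position discount for a 0-based rank."""
    return 1.0 / math.log2(rank + 2)


def lambda_gradients(scores, labels):
    """Per-document lambdas for one query; positive means 'push the score up'."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: -scores[i])
    rank_of = {doc: rank for rank, doc in enumerate(order)}
    ideal_dcg = sum(
        gain * dcg_discount(rank)
        for rank, gain in enumerate(sorted(labels, reverse=True))
    )
    lambdas = [0.0] * n
    if ideal_dcg <= 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where document i should outrank document j
            # Change in NDCG if the two documents swapped positions.
            delta_ndcg = abs(
                (labels[i] - labels[j])
                * (dcg_discount(rank_of[i]) - dcg_discount(rank_of[j]))
            ) / ideal_dcg
            # Pairwise logistic term: large when the pair is currently mis-ordered.
            rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
            lambdas[i] += rho * delta_ndcg
            lambdas[j] -= rho * delta_ndcg
    return lambdas


# Two candidate schemes with equal current scores; the first has the better label.
example = lambda_gradients(scores=[0.0, 0.0], labels=[1.0, 0.0])
print(example)
```

  • In each round a regression tree would then be fitted with these lambdas as targets, the scores updated, and the lambdas recomputed, until one of the convergence conditions holds.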
  • The embodiment of the application obtains preset disease prevention and control guidelines, performs keyword recognition on them to obtain the partition attribute set of each disease in the combined disease, and calculates the information gain ratio of each partition attribute in the partition attribute set to generate the first knowledge grouping decision tree of each disease; obtains, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients with the combined disease; obtains n pieces of sample data of patients with the combined disease and generates an outcome label for each piece according to its indicators; and trains the LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • Instead of considering the grouping scheme of a single disease, the embodiment works out multi-disease joint grouping schemes, which takes into account the interactions between different grouping decisions.
  • The outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent eliminates the problem that biased samples are difficult to learn from when only the absolute outcome is used.
  • Because the LambdaMART model is used for training, the resulting patient grouping model attends both to each first candidate joint grouping scheme itself and to the priority order among the first candidate joint grouping schemes, which helps improve the grouping of patients with multiple diseases.
  • FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 3, it includes steps S31-S35:
  • the foregoing obtaining the importance of each indicator in each piece of sample data includes:
  • The logarithmic loss is reduced by gradient descent to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$, which gives the importance of each indicator.
  • Using the regression coefficients $\beta$ as the importance of the indicators in the sample data facilitates the subsequent definition of the absolute and relative outcomes.
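  • Estimating the coefficients by gradient descent on the logarithmic loss can be sketched as follows. The source does not give the model's exact form, so this assumes a standard logistic regression with a binary outcome; the function name, learning rate, epoch count, and toy data are all hypothetical.

```python
import math


def fit_logistic_regression(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean log loss; returns [beta_0, beta_1, ..., beta_n]."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)  # beta[0] is the intercept
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for features, target in zip(X, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], features))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - target  # derivative of the log loss with respect to z
            grad[0] += error
            for k, value in enumerate(features):
                grad[k + 1] += error * value
        beta = [b - learning_rate * g / m for b, g in zip(beta, grad)]
    return beta


# Hypothetical data: one indicator, with the adverse outcome (1) at higher values.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
beta = fit_logistic_regression(X, y)
print(beta)
```

  • For the coefficients to be comparable as importances across indicators, the indicators would normally be put on a common scale first; that preprocessing step is an assumption here and is not described in the source.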
  • The foregoing generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data includes:
  • Using the formula $\text{effect}(i) = \text{absolute}(i) \cdot \text{relative}(i)$ to generate an outcome label for each piece of sample data, where $\text{effect}(i)$ represents the outcome label of the $i$-th piece of sample data; $\text{absolute}(i)$ represents the absolute outcome of the $i$-th piece of sample data, defined according to the importance of each indicator in it; and $\text{relative}(i)$ represents the relative outcome of the $i$-th piece of sample data, defined in terms of $\text{absolute}(i)$.
  • The outcome label generated for each piece of sample data considers both the absolute outcome and the relative outcome, which avoids the lack of objectivity caused by considering only the absolute outcome and helps reduce the learning difficulty of the patient grouping model.
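  • Given the absolute and relative outcomes, the label is an element-wise product. A minimal sketch with a hypothetical function name and hypothetical values:

```python
def outcome_labels(absolute, relative):
    """effect(i) = absolute(i) * relative(i) for every piece of sample data."""
    if len(absolute) != len(relative):
        raise ValueError("absolute and relative outcomes must have the same length")
    return [a * r for a, r in zip(absolute, relative)]


# Hypothetical values for three pieces of sample data.
labels = outcome_labels([2.0, 1.0, 1.0], [0.75, 0.75, 0.25])
print(labels)  # [1.5, 0.75, 0.25]
```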
  • steps S31-S35 have been described in detail in the embodiment shown in FIG. 2, and in order to avoid repetition, they will not be repeated here.
  • FIG. 5 is a schematic flowchart of a patient grouping method provided by an embodiment of the application. The patient grouping method can also be implemented on the network system architecture described above. As shown in FIG. 5, it includes steps S51-S54:
  • S51: Receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped.
  • The patient grouping request is used to request the server to group the patient to be grouped.
  • The patient to be grouped is a patient with the same combined disease as the sample patients in the model training stage, such as a patient with diabetes and hypertension.
  • The patient grouping request can include the combined disease that the patient to be grouped suffers from. It can also include the prevention and control guidelines for each disease in the combined disease, basic information about the patient to be grouped, diagnosis information, and so on.
  • The user terminal can be a terminal used by medical staff, a terminal in a medical research laboratory, a terminal used by staff of a medical and health enterprise, and so on. For example, after diagnosing the patient to be grouped, medical staff can send the patient grouping request to the server through the user terminal.
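  • One way to represent the request on the server side is a small validated record. This is only a sketch: the class name, field names, and the choice to reject requests with fewer than two distinct diseases at construction time are assumptions, not part of the source.

```python
from dataclasses import dataclass, field


@dataclass
class PatientGroupingRequest:
    """Hypothetical payload sent by the user terminal to the server."""

    patient_id: str
    diseases: list
    basic_info: dict = field(default_factory=dict)  # optional patient details

    def __post_init__(self):
        # The method applies to combined diseases, so require two distinct ones.
        if len(set(self.diseases)) < 2:
            raise ValueError("a patient grouping request needs at least two diseases")


request = PatientGroupingRequest("p-001", ["diabetes", "hypertension"])
print(request.diseases)
```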
  • S52: Obtain a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees.
  • The second knowledge grouping decision tree is the knowledge grouping decision tree generated, during the use stage of the patient grouping model, by working through the disease prevention and control guidelines with keyword recognition and information gain ratio calculation. The grouping schemes under the second knowledge grouping decision trees are combined to obtain the second candidate joint grouping schemes.
  • S53: Input the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, and obtain a ranking result of the second candidate joint grouping schemes.
  • The patient grouping model uses its trained regression trees to predict a score for each second candidate joint grouping scheme and ranks the schemes by this score: a scheme with a higher score is ranked higher, and a scheme with a lower score is ranked lower.
  • The preset number of second candidate joint grouping schemes can be set according to actual conditions. It may be the top-ranked scheme or the top three schemes, and is not specifically limited.
  • For example, the second candidate joint grouping schemes of the patient to be grouped are A1+B1, A2+B1, A1+B2, and A2+B2, and their ranking result is: A2+B1, A2+B2, A1+B1, A1+B2.
  • In a specific implementation, as shown in FIG. 6, the diabetes knowledge grouping decision tree and the hypertension knowledge grouping decision tree are derived from the diabetes prevention and control guidelines and the hypertension prevention and control guidelines, respectively. Multiple second candidate joint grouping schemes are obtained from the two knowledge grouping decision trees and input into the patient grouping model for score prediction and ranking, and finally the top-k best second candidate joint grouping schemes are output.
  • Because the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3 is used for prediction and ranking, the grouping of patients with multiple combined diseases is improved and is better suited to precision medicine.
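  • Selecting the preset number of schemes from the model's scores is a sort followed by a slice. In the sketch below the function name is hypothetical and the scores are made-up values chosen so that the ranking matches the example above.

```python
def top_k_schemes(scored, k):
    """Return the k highest-scoring candidate joint grouping schemes."""
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical predicted scores for the four candidate schemes.
scores = {"A1+B1": 0.4, "A2+B1": 0.9, "A1+B2": 0.1, "A2+B2": 0.7}

print(top_k_schemes(scores, 4))  # ['A2+B1', 'A2+B2', 'A1+B1', 'A1+B2']
print(top_k_schemes(scores, 1))  # ['A2+B1']
```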
  • An embodiment of the present application also provides a device for constructing a patient grouping model.
  • The device for constructing a patient grouping model may be a computer program (including program code) running in a terminal, and it can execute the method shown in FIG. 2 or FIG. 3. Referring to FIG. 7, the device includes:
  • The first grouping scheme acquisition module 71, used to obtain preset disease prevention and control guidelines, perform keyword recognition on the guidelines to obtain the partition attribute set of each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • The outcome label generation module 72, configured to obtain n pieces of sample data of patients suffering from the combined disease and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the sample data and the first candidate joint grouping schemes correspond one-to-one, the outcome label of each piece of sample data indicates the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
  • The grouping model training module 73, configured to train the LambdaMART model with the outcome-labeled sample data and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • the patient grouping device includes:
  • the grouping request obtaining module 81 is configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;
  • the second clustering scheme acquisition module 82 is configured to acquire a second knowledge clustering decision tree of each disease that the patient to be clustered suffers from, and obtain the second candidate joint grouping scheme of the patient to be clustered according to the second knowledge clustering decision tree;
  • the clustering scheme ranking module 83 is configured to input the second candidate joint clustering plan into a pre-trained patient clustering model for ranking, and obtain a ranking result of the second candidate joint clustering plan;
  • the grouping result output module 84 is configured to select a preset number of the second candidate joint grouping solutions as the grouping results of the patients to be grouped and return to the user terminal according to the sorting result of the second candidate joint grouping solutions.
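The selection step performed by modules 83 and 84 can be sketched as follows. This is only an illustration under assumptions: the scores stand in for the output of the trained patient clustering model, and the function name `select_top_schemes` and the scheme identifiers are hypothetical.

```python
def select_top_schemes(schemes, scores, k):
    """Return the k schemes with the highest model scores, best first."""
    if len(schemes) != len(scores):
        raise ValueError("each scheme needs exactly one score")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort indices by descending score; sorted() is stable, so ties keep input order.
    order = sorted(range(len(schemes)), key=lambda i: scores[i], reverse=True)
    return [schemes[i] for i in order[:k]]


# Hypothetical second candidate joint grouping schemes and model scores.
schemes = ["H1+D1", "H1+D2", "H2+D1", "H2+D2"]
scores = [0.2, 0.9, 0.5, 0.7]
top_two = select_top_schemes(schemes, scores, k=2)
```

Here `k` plays the role of the "preset number" of schemes returned to the user terminal.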
  • the units in the patient grouping model construction device and the patient grouping device shown in FIG. 7 and FIG. 8 can be separately or completely combined into one or several other units, or one or more of the units can be further divided into multiple functionally smaller units; either arrangement can realize the same operation without affecting the technical effects of the embodiments of the present application.
  • the above-mentioned units are divided based on logical functions.
  • the function of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit.
  • the patient grouping model construction device and the patient grouping device may also include other units. In practical applications, these functions can also be implemented with the assistance of other units, and can be implemented by multiple units in cooperation.
  • a computer program (including program code) capable of executing the steps involved in the corresponding method shown in FIG. 2, FIG. 3, or FIG. 5 can be run on a general-purpose computing device, such as a computer including a central processing unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and other processing elements and storage elements, to construct the patient grouping model construction device shown in FIG. 7 or the patient grouping device shown in FIG. 8.
  • the computer program may be recorded on, for example, a computer-readable recording medium, and loaded into the above-mentioned computing device through the computer-readable recording medium, and run in it.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • the electronic device includes at least a processor 901, an input device 902, an output device 903, and a computer-readable storage medium 904.
  • the processor 901, the input device 902, the output device 903, and the computer-readable storage medium 904 in the electronic device may be connected by a bus or other methods.
  • the computer-readable storage medium 904 may be stored in the memory of the electronic device.
  • the computer-readable storage medium 904 is used to store a computer program.
  • the computer program includes program instructions.
  • the processor 901 is used to execute the program instructions stored in the computer-readable storage medium 904.
  • the processor 901 (or CPU, central processing unit) is the computing core and control core of the electronic device. It is suitable for implementing one or more instructions, and specifically suitable for loading and executing one or more instructions to achieve the corresponding method flow or function.
  • the processor 901 of the electronic device provided in the embodiment of the present application may be used to perform a series of patient grouping model construction processing, including:
  • obtaining preset disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate the first knowledge grouping decision tree of each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes of patients suffering from the combined disease;
  • the lambdaMART model is trained using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • the processor 901 generating an outcome label for each piece of sample data according to the indicators in each piece of sample data includes: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
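One way to turn indicator importance into an outcome label is an importance-weighted average. The sketch below is an assumption made for illustration only: this passage does not state the formula, and the indicator names, the normalization of indicator values to the range 0 to 1, and the function `outcome_label` are all hypothetical.

```python
def outcome_label(indicators, importance):
    """Importance-weighted average of indicator values (assumed normalized to [0, 1])."""
    total_weight = sum(importance[name] for name in indicators)
    if total_weight == 0:
        raise ValueError("importance weights must not all be zero")
    weighted = sum(importance[name] * value for name, value in indicators.items())
    return weighted / total_weight


# Hypothetical follow-up indicators for one piece of sample data.
sample = {"blood_pressure_control": 1.0, "hba1c_control": 0.5, "ldl_control": 0.0}
# Hypothetical importance of each indicator.
weights = {"blood_pressure_control": 2.0, "hba1c_control": 1.0, "ldl_control": 1.0}
label = outcome_label(sample, weights)
```

Under these assumptions a more important indicator moves the label more than a less important one, which is the behavior the passage describes.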
  • the processor 901 executes the training of the lambdaMART model using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model, including:
  • the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
  • the processor 901 of the electronic device executes the computer program to implement the steps in the method for constructing a patient grouping model
  • the embodiments of the method for constructing a patient grouping model are all applicable to the electronic device, and can achieve the same or similar beneficial effects.
  • FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
  • the electronic device includes at least a processor 1001, an input device 1002, an output device 1003, and a computer-readable storage medium 1004.
  • the processor 1001, the input device 1002, the output device 1003, and the computer-readable storage medium 1004 in the electronic device may be connected by a bus or other methods.
  • the processor 1001 of the electronic device provided in the embodiment of the present application may be used to perform a series of patient grouping processing, including:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • the processor 1001 of the electronic device executes the computer program to implement the steps in the above-mentioned patient grouping method
  • the embodiments of the above-mentioned patient grouping method are all applicable to the electronic device, and can achieve the same or similar beneficial effects.
  • the above-mentioned patient grouping method and patient grouping model construction method can be executed by the same electronic device, or can be executed by different electronic devices, which is not limited in the embodiment of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium (Memory).
  • the computer-readable storage medium is a memory device in an electronic device for storing programs and data. It can be understood that the computer-readable storage medium herein may include a built-in storage medium in the terminal, and of course, may also include an extended storage medium supported by the terminal.
  • the computer-readable storage medium provides storage space, and the storage space stores the operating system of the terminal.
  • one or more instructions suitable for being loaded and executed by the processor 901 are stored in the storage space, and these instructions may be one or more computer programs (including program codes).
  • the computer-readable storage medium here can be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory; optionally, it can also be at least one computer-readable storage medium located far away from the aforementioned processor 901.
  • the processor 901 can load and execute one or more instructions stored in a computer-readable storage medium to implement the following steps:
  • obtaining preset disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate the first knowledge grouping decision tree of each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes of patients suffering from the combined disease;
  • the lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are performed: acquiring the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
  • the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
  • the embodiment of the present application also provides a computer-readable storage medium (Memory).
  • the processor 1001 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • the computer program of the computer-readable storage medium includes computer program code
  • the computer program code may be in the form of source code, object code, executable file, or some intermediate form, etc.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a mobile hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, software distribution media, etc.

Abstract

A patient grouping model constructing method, a patient grouping method, and a related device. The patient grouping model constructing method comprises: acquiring disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to produce a partition attribute set for each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease, and producing, on the basis of the first knowledge grouping decision tree, n first candidate combined grouping solutions for patients suffering from the combined disease (S21); acquiring n pieces of sample data of the patients suffering from the combined disease, and generating an outcome label for each piece of sample data on the basis of the indicators in that piece of sample data (S22); and using the sample data having the outcome labels and the first candidate combined grouping solutions to train a LambdaMART model to produce a constructed patient grouping model (S23). Using the patient grouping model provided helps improve the grouping effect when grouping patients suffering from multiple diseases.

Description

Method for constructing patient grouping model, patient grouping method, and related device
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on May 13, 2020, with application number 202010404637.2 and the invention title "Patient Grouping Model Construction Method, Patient Grouping Method and Related Equipment", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of machine learning technology, and in particular to a method for constructing a patient grouping model, a patient grouping method, and related devices.
Background
The development of artificial intelligence is inseparable from advances in machine learning. As the core of artificial intelligence, machine learning studies how computers simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. In the medical field, machine learning has been widely applied to patient grouping, which is extremely important in precision medicine. Current patient grouping methods give either a single grouping result for a patient or several different grouping results. The inventors realized that these grouping results are all obtained by grouping the patient with respect to a single disease, and that existing grouping methods perform poorly when comprehensive multi-disease grouping is performed for patients suffering from multiple diseases.
Summary of the invention
To solve the above problems, the present application provides a method for constructing a patient grouping model, a patient grouping method, and related devices, which help improve the grouping effect of comprehensive grouping for patients suffering from multiple diseases.
In a first aspect, an embodiment of the present application provides a method for constructing a patient grouping model, the method including:
acquiring preset disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
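To make the notion of a candidate joint grouping scheme concrete, the sketch below enumerates joint schemes as every combination of one group per disease. This is an assumption for illustration: the passage does not specify how per-disease decision-tree groups are combined, and the disease names, group identifiers, and the function `candidate_joint_schemes` are hypothetical.

```python
from itertools import product


def candidate_joint_schemes(per_disease_groups):
    """Enumerate joint schemes as one group choice per disease (Cartesian product)."""
    diseases = sorted(per_disease_groups)  # fixed order keeps the output deterministic
    combos = product(*(per_disease_groups[d] for d in diseases))
    return [dict(zip(diseases, combo)) for combo in combos]


# Hypothetical leaf groups from two per-disease knowledge grouping decision trees.
groups = {
    "hypertension": ["H1", "H2"],
    "diabetes": ["D1", "D2", "D3"],
}
schemes = candidate_joint_schemes(groups)
```

With two hypertension groups and three diabetes groups this yields six candidate joint schemes, each of which would then be paired with one piece of sample data and its outcome label.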
In a second aspect, an embodiment of the present application provides a patient grouping method, the method including:
receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases that the patient to be grouped suffers from;
acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision tree;
inputting the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
In a third aspect, an embodiment of the present application provides an apparatus for constructing a patient grouping model, the apparatus including:
a first grouping scheme acquisition module, configured to acquire preset disease prevention and control guidelines, perform keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in a combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
an outcome label generation module, configured to acquire n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
a grouping model training module, configured to train a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
In a fourth aspect, an embodiment of the present application provides a patient grouping apparatus, the apparatus including:
a grouping request acquisition module, configured to receive a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases that the patient to be grouped suffers from;
a second grouping scheme acquisition module, configured to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision tree;
a grouping scheme ranking module, configured to input the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
a grouping result output module, configured to select, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and return the grouping result to the user terminal.
In a fifth aspect, an embodiment of the present application provides an electronic device. The electronic device includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute the following steps:
acquiring preset disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
In a sixth aspect, an embodiment of the present application provides an electronic device. The electronic device includes an input device and an output device, and further includes a processor adapted to implement one or more instructions, and a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute the following steps:
receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases that the patient to be grouped suffers from;
acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision tree;
inputting the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute the following steps:
acquiring preset disease prevention and control guidelines, performing keyword recognition on the disease prevention and control guidelines to obtain the partition attribute set of each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data is used to indicate the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain a constructed patient grouping model.
In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute the following steps:
receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases that the patient to be grouped suffers from;
acquiring a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision tree;
inputting the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
In the embodiments of the present application, the patient grouping model training stage no longer considers grouping schemes for a single disease; instead, it works out multi-disease joint grouping schemes, which takes the correlation effects between different grouping decisions into account. In addition, the outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent eliminates the problem that biased samples are difficult to learn when only the absolute outcome is used. Moreover, because the lambdaMART model is used for training, the resulting patient grouping model attends both to the first candidate joint grouping schemes themselves and to the priority order among them, which helps improve the grouping effect for patients suffering from multiple diseases.
Description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a network system architecture diagram provided by an embodiment of the application;
FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application;
FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application;
FIG. 4 is an example diagram of constructing a patient grouping model provided by an embodiment of the application;
FIG. 5 is a schematic flowchart of a patient grouping method provided by an embodiment of the application;
FIG. 6 is an example diagram of patient grouping provided by an embodiment of the application;
FIG. 7 is a schematic structural diagram of an apparatus for constructing a patient grouping model provided by an embodiment of the application;
FIG. 8 is a schematic structural diagram of a patient grouping apparatus provided by an embodiment of the application;
FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application;
FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
Detailed Description
The embodiments of this application provide a scheme for constructing a patient grouping model that is suitable for patients with a combination of multiple diseases. In the model training stage, candidate joint grouping schemes for patients suffering from a combined disease are obtained from the knowledge grouping decision tree of each disease in the combined disease, so that the correlated effects between the single-disease grouping schemes are fully considered. Patient follow-up data serve as the sample data, and an outcome label is generated for each piece of sample data from the importance of the indicators in the sample data, such as the patient's demographic information, medication history, laboratory tests and examinations, and vital signs. Compared with the prior art, in which considering only the absolute outcome leads to poor model learning, this application also considers the relative outcome, which is more objective and reasonable. In addition, the patient grouping model is based on the lambdaMART model, so that during learning the model pays more attention to the order among the top-ranked candidate joint grouping schemes. As a result, when the trained patient grouping model is applied to grouping patients with multiple diseases, it can produce better grouping results and is better suited to precision medicine.
Specifically, the scheme for constructing the patient grouping model can be implemented on the network system architecture shown in FIG. 1. As shown in FIG. 1, the network system architecture includes at least a user terminal, a server, and a database, which communicate over a wired or wireless network connection; the specific communication protocol is not limited. The user terminal can be used to submit disease prevention and treatment guidelines, follow-up data of patients with the combined disease, and the like to the server through program code or touch signals, thereby requesting the server to perform the steps for constructing the patient grouping model. The server is the executing entity: its processor executes program code to carry out a series of model construction steps, for example deriving the knowledge grouping decision trees, generating outcome labels, and calculating lambda values. On the basis of the lambdaMART model, the server trains the patient grouping model using the sample data with outcome labels and the candidate joint grouping schemes as the training set. The database can be used to store the disease prevention and treatment guidelines together with the demographic information, medical visit data, follow-up data, and the like of a large number of patients. A developer can enter a conditional query statement at the user terminal to extract the required data from the database, for example extracting the follow-up data of patients who have both hypertension and diabetes as sample data. The database may be a database within the server, a database independent of the server, or a cloud database. It can be understood that in this application the user terminal may be a desktop computer, a tablet computer, a supercomputer, or a similar device, and the server may be a local server, a cloud server, a server cluster, and so on.
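The conditional query described above can be sketched as follows. This is only an illustrative sketch: the table names `diagnosis` and `follow_up`, their columns, and the helper `fetch_follow_ups` are hypothetical and are not specified by this application; an in-memory SQLite database stands in for whichever database is actually used.

```python
import sqlite3

# Hypothetical schema (an assumption, not from the application):
# one row per diagnosis and one row per follow-up visit.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE diagnosis (patient_id INTEGER, disease TEXT);
CREATE TABLE follow_up (patient_id INTEGER, visit_no INTEGER,
                        hba1c REAL, systolic_bp REAL);
INSERT INTO diagnosis VALUES (1, 'diabetes'), (1, 'hypertension'), (2, 'diabetes');
INSERT INTO follow_up VALUES (1, 1, 8.1, 150), (1, 2, 7.4, 142), (2, 1, 7.9, 120);
""")

def fetch_follow_ups(conn, diseases):
    """Return follow-up rows of patients diagnosed with every disease in `diseases`."""
    placeholders = ",".join("?" for _ in diseases)
    query = f"""
        SELECT f.patient_id, f.visit_no, f.hba1c, f.systolic_bp
        FROM follow_up AS f
        WHERE f.patient_id IN (
            SELECT patient_id FROM diagnosis
            WHERE disease IN ({placeholders})
            GROUP BY patient_id
            HAVING COUNT(DISTINCT disease) = ?
        )
        ORDER BY f.patient_id, f.visit_no
    """
    return conn.execute(query, (*diseases, len(diseases))).fetchall()

# Each returned row is one follow-up visit, i.e. one candidate piece of sample data.
samples = fetch_follow_ups(conn, ["diabetes", "hypertension"])
```

In this sketch only patient 1 has both diagnoses, so only that patient's follow-up visits are returned as sample data.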
Based on the network system architecture shown in FIG. 1, the method for constructing a patient grouping model proposed in the embodiments of this application is described in detail below with reference to the relevant drawings. Refer to FIG. 2, which is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of this application. As shown in FIG. 2, the method includes steps S21-S23:
S21: Obtain preset disease prevention and treatment guidelines, perform keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease.
In the specific embodiments of this application, a combined disease is a combination of at least two diseases, for example diabetes + hypertension, or diabetes + hypertension + heart disease. The disease prevention and treatment guidelines may be the guidelines corresponding to each disease in the combined disease, for example a diabetes guideline, a hypertension guideline, and a heart disease guideline. They may be stored in the database and retrieved by the server, or sent to the server by a developer through the user terminal. The partition attribute set can be extracted from the guidelines through techniques such as keyword recognition and text processing; for example, the partition attribute set for hypertension may be {age, blood pressure, glucose tolerance, …, high salt intake, ankle-brachial index}. A first knowledge grouping decision tree is the knowledge grouping decision tree of one disease in the combined disease that is derived in the model training stage; it can be constructed with the C4.5 algorithm by calculating the information gain ratio of each partition attribute in the partition attribute set. A first candidate joint grouping scheme is a scheme that the server obtains in the model training stage by combining the grouping schemes under the first knowledge grouping decision trees. The guidelines contain treatment decision knowledge for the relevant diseases, such as treatment recommendations and medication recommendations. The guidelines related to each disease in the combined disease are analyzed to obtain the first knowledge grouping decision tree corresponding to each disease. The first knowledge grouping decision trees are independent of one another, and each of them contains the grouping schemes of its disease. For example, the grouping schemes under the first knowledge grouping decision tree for diabetes are A={A1,A2,…An}, where each Ai is a grouping scheme indicating that the patient may be assigned to patient group Ai; the grouping schemes under the first knowledge grouping decision tree for hypertension are B={B1,B2,…Bm}, where each Bj is a grouping scheme.
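The information gain ratio that C4.5 uses to choose a partition attribute can be illustrated with a minimal sketch. The records, the attribute names `bp` and `age`, and the target `group` below are hypothetical toy data, and the sketch handles only categorical attributes; it is not the full tree construction used in this application.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def gain_ratio(rows, attribute, target):
    """C4.5 gain ratio of a categorical `attribute` with respect to `target`."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy of the target after splitting on the attribute.
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy([row[target] for row in rows]) - conditional
    # Split information penalizes attributes with many small partitions.
    split_info = -sum(len(p) / total * math.log2(len(p) / total)
                      for p in partitions.values())
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical guideline-derived records for hypertension.
rows = [
    {"bp": "high", "age": "old", "group": "B1"},
    {"bp": "high", "age": "young", "group": "B1"},
    {"bp": "normal", "age": "old", "group": "B2"},
    {"bp": "normal", "age": "young", "group": "B2"},
]
best_attribute = max(["bp", "age"], key=lambda a: gain_ratio(rows, a, "group"))
```

In this toy data `bp` separates the groups perfectly while `age` carries no information, so `bp` would be chosen as the root partition attribute.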
In addition, suppose the diseases in the combined disease are diabetes and hypertension, and step S21 yields the grouping schemes A={A1,A2,…An} under the first knowledge grouping decision tree for diabetes and the grouping schemes B={B1,B2,…Bm} under the first knowledge grouping decision tree for hypertension. Each Ai+Bj is then a first candidate joint grouping scheme. For example, if a patient with hypertension and diabetes has the available grouping schemes {A1,A2} under the diabetes tree and {B1,B2} under the hypertension tree, the patient's possible first candidate joint grouping schemes are {A1+B1, A2+B1, A1+B2, A2+B2}. Combining the schemes in this way yields the n (multiple) first candidate joint grouping schemes for the patient.
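The combination of per-disease grouping schemes into candidate joint schemes is a Cartesian product, which can be sketched as follows. The function name `candidate_joint_schemes` is hypothetical; the scheme labels are taken from the example above.

```python
from itertools import product

def candidate_joint_schemes(per_disease_options):
    """Combine one grouping scheme per disease into every possible joint scheme."""
    return ["+".join(combo) for combo in product(*per_disease_options)]

# Options for diabetes ({A1, A2}) and hypertension ({B1, B2}) from the example.
candidates = candidate_joint_schemes([["A1", "A2"], ["B1", "B2"]])
```

The same function extends to three or more diseases by passing one option list per disease.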
S22: Obtain n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data.
In the specific embodiments of this application, the sample data are the follow-up data of patients suffering from the combined disease. Follow-up is an observation method in which a hospital, by correspondence or other means, regularly tracks changes in the condition of patients who were treated at the hospital and guides their recovery. A patient usually has multiple follow-up visits, and the data from one follow-up visit can serve as one piece of sample data. Each piece of sample data has a corresponding first candidate joint grouping scheme. Optionally, each piece of sample data includes multiple indicators from five categories: the patient's demographic information, medication history for all diseases, test and examination indicators, the doctor's prescriptions, and the patient's vital signs. For example, the medication history may contain multiple indicators, and the test and examination indicators may contain multiple indicators (for example, glycated hemoglobin (HbA1c) among the indicators for diabetes and blood pressure (BP) among the indicators for hypertension). The importance of each indicator in each piece of sample data is obtained by training a regression model, and the importance of the indicators is then used to generate an outcome label for each piece of sample data.
Specifically, a logistic regression model is trained on the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}}$. During training, the log loss is reduced by gradient descent in order to estimate the regression coefficients β0, β1, β2…βn; the regression model is considered to have converged when the difference in log loss between two successive iterations is smaller than a preset threshold. Here y is the output of the regression model, namely whether the patient has a new complication or has died by the next follow-up visit, which is a binary classification; X is the input of the regression model, namely the indicators in the sample data, with Xn being the n-th input indicator; and β is the regression coefficient of each indicator, so that β1, for example, represents the importance of indicator X1. The regression coefficients are taken as the importance of the corresponding indicators.
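The training procedure described above can be sketched in plain Python as follows. The learning rate, the threshold value, the iteration cap, and the toy data are assumptions chosen for illustration; the application does not specify them, and a production system would typically standardize the indicators and use an established library.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(beta, x):
    """y = 1 / (1 + exp(-(beta0 + beta1*x1 + ... + betan*xn)))."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def log_loss(X, y, beta, eps=1e-12):
    """Mean log loss over all samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(beta, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)

def fit_logistic(X, y, lr=0.1, tol=1e-6, max_iter=10000):
    """Gradient descent; stops when the log loss changes by less than `tol`."""
    beta = [0.0] * (len(X[0]) + 1)
    prev = log_loss(X, y, beta)
    for _ in range(max_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = predict(beta, xi) - yi
            grad[0] += err
            for k, v in enumerate(xi, start=1):
                grad[k] += err * v
        beta = [b - lr * g / len(X) for b, g in zip(beta, grad)]
        cur = log_loss(X, y, beta)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return beta

# Toy, already standardized indicators (e.g. HbA1c and BP); y = 1 means a new
# complication or death by the next follow-up visit. All values are made up.
X = [[-1.0, -0.5], [-0.5, 0.2], [0.3, -0.8], [0.8, 0.6], [1.2, 1.0], [-0.2, 0.9]]
y = [0, 0, 0, 1, 1, 1]
beta = fit_logistic(X, y)
importance = beta[1:]  # beta_1..beta_n serve as the indicator importances
```

Because raw coefficients depend on the scale of each indicator, comparing them as importances is only meaningful when the indicators are on comparable scales, which is why the toy inputs are assumed to be standardized.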
When machine learning methods are used to group patients, outcome labels must be generated for the sample data in order to assess the effect of a specific grouping under a specific patient condition, so that grouping schemes with good outcomes can be learned. Existing methods, however, consider only the absolute outcome, which leads to poor learning. In this scheme, both the absolute outcome and the relative outcome are considered when generating the outcome label for each piece of sample data, using the formula effect(i)=absolute(i)*relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, defined in a customized way according to the importance of the indicators in the i-th piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined in terms of absolute(i).
For example, for patients with diabetes and hypertension, with glycated hemoglobin (HbA1c) as the test indicator for diabetes and blood pressure (BP) as the test indicator for hypertension, define: $\mathrm{absolute}(i) = \beta_{HbA1c}\,(\mathrm{HbA1c}(i) - \mathrm{HbA1c}(i+1)) + \beta_{BP}\,(\mathrm{BP}(i) - \mathrm{BP}(i+1))$, where $\beta_{HbA1c}$ is the importance of glycated hemoglobin, taken from the regression coefficients estimated in the regression model above; $\beta_{BP}$ is the importance of blood pressure; HbA1c(i) is the glycated hemoglobin in the i-th piece of sample data; BP(i) is the blood pressure in the i-th piece of sample data; HbA1c(i+1) is the glycated hemoglobin in the next piece of sample data; and BP(i+1) is the blood pressure in the next piece of sample data. Define: $\mathrm{relative}(i) = \sum_{k \in N(p_i, d_i)} \mathrm{absolute}(k) \,/\, \sum_{j \in N(p_i)} \mathrm{absolute}(j)$, where $N(p_i)$ is the set of samples that every first knowledge grouping decision tree assigns to the same leaf node as sample i, and $N(p_i, d_i)$ is the subset of $N(p_i)$ whose actually adopted grouping scheme is the same as that of sample i. Since each piece of sample data has a corresponding first candidate joint grouping scheme, the outcome label of each piece of sample data can be used to represent the score of that sample's candidate joint grouping scheme.
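The two definitions above can be combined into a small labeling routine, sketched below. The dictionary keys (`leaf`, `scheme`, `hba1c`, `next_hba1c`, `bp`, `next_bp`), the importance values, and the toy samples are hypothetical. The application does not say how to handle a zero denominator in relative(i); returning 0 in that case is an assumption of this sketch.

```python
from collections import defaultdict

def absolute_outcome(sample, beta_hba1c, beta_bp):
    """absolute(i): importance-weighted improvement until the next follow-up."""
    return (beta_hba1c * (sample["hba1c"] - sample["next_hba1c"])
            + beta_bp * (sample["bp"] - sample["next_bp"]))

def outcome_labels(samples, beta_hba1c, beta_bp):
    """effect(i) = absolute(i) * relative(i) for every sample."""
    absolute = [absolute_outcome(s, beta_hba1c, beta_bp) for s in samples]
    sum_leaf = defaultdict(float)         # sum of absolute over N(p_i)
    sum_leaf_scheme = defaultdict(float)  # sum of absolute over N(p_i, d_i)
    for s, a in zip(samples, absolute):
        sum_leaf[s["leaf"]] += a
        sum_leaf_scheme[(s["leaf"], s["scheme"])] += a
    labels = []
    for s, a in zip(samples, absolute):
        denom = sum_leaf[s["leaf"]]
        # Assumption: relative(i) is taken as 0 when the denominator is 0.
        relative = sum_leaf_scheme[(s["leaf"], s["scheme"])] / denom if denom else 0.0
        labels.append(a * relative)
    return labels

# Three toy samples that all fall into the same leaf nodes of both trees.
leaf = ("diabetes_leaf_1", "hypertension_leaf_1")
samples = [
    {"leaf": leaf, "scheme": "A1+B1", "hba1c": 8.0, "next_hba1c": 7.0, "bp": 150.0, "next_bp": 146.0},
    {"leaf": leaf, "scheme": "A1+B1", "hba1c": 9.0, "next_hba1c": 8.0, "bp": 160.0, "next_bp": 158.0},
    {"leaf": leaf, "scheme": "A2+B1", "hba1c": 8.0, "next_hba1c": 7.0, "bp": 140.0, "next_bp": 132.0},
]
labels = outcome_labels(samples, beta_hba1c=1.0, beta_bp=0.5)
```

With these toy numbers the absolute outcomes are 3.0, 2.0, and 5.0; each scheme accounts for half of the leaf's total, so every relative outcome is 0.5 and the labels are 1.5, 1.0, and 2.5.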
S23: Train a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
In the specific embodiments of this application, the lambdaMART model is originally a method for ranking documents in information retrieval: when a user issues a Query, the candidate documents are ranked. In this scheme, the demographic information, test and examination indicators, and medication history in each piece of sample data serve as the Query, and the first candidate joint grouping schemes serve as the documents; each Query-documents pair carries an outcome label. For each document, a lambda value is first calculated, and a regression tree is trained with the lambda values as labels. At each leaf node of the regression tree, the final output score (the predicted score) is calculated from the predicted regression result. In this way a score is predicted for each piece of sample data that carries an outcome label, and the first candidate joint grouping schemes corresponding to the sample data are ranked by score. The procedure then returns to the lambda calculation step and repeats the steps of training a regression tree, predicting scores, and ranking, so that the trees together form a forest of regression trees (referred to in this application as a random forest). Training stops as soon as one of the preset convergence conditions is met, which yields the required patient grouping model. The convergence conditions are: the number of regression trees reaches the preset parameter setting; or the random forest no longer improves on the validation set.
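The lambda calculation step can be illustrated with a LambdaRank-style sketch for the candidate schemes of a single Query. The application does not give the lambda formula, so several choices here are assumptions: NDCG with a linear gain (the outcome label itself) as the ranking metric, the parameter `sigma`, the sign convention in which a positive lambda means "raise this scheme's score", and returning all zeros when the ideal DCG is not positive.

```python
import math

def dcg_discount(rank):
    """Position discount for a 0-based rank."""
    return 1.0 / math.log2(rank + 2)

def compute_lambdas(scores, labels, sigma=1.0):
    """LambdaRank-style lambdas for the documents (candidate schemes) of one query."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank = {doc: pos for pos, doc in enumerate(order)}
    ideal = sum(g * dcg_discount(p)
                for p, g in enumerate(sorted(labels, reverse=True)))
    lambdas = [0.0] * n
    if ideal <= 0:  # assumption: no usable ranking signal in this query
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where i has the better outcome label
            # |delta NDCG| if documents i and j swapped rank positions.
            delta = abs((labels[i] - labels[j])
                        * (dcg_discount(rank[i]) - dcg_discount(rank[j]))) / ideal
            # Weight is large when the current scores order the pair wrongly.
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * rho * delta
            lambdas[j] -= sigma * rho * delta
    return lambdas

# Toy query: outcome labels and current model scores of four candidate schemes.
labels = [2.5, 1.0, 1.5, 0.2]
scores = [0.1, 0.4, 0.3, 0.2]
lambdas = compute_lambdas(scores, labels)
```

In each boosting round a regression tree would be fit to these lambdas and the scores updated, which is the loop the paragraph above describes; that tree-fitting step is omitted from this sketch.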
It can be seen that, in the embodiments of this application, preset disease prevention and treatment guidelines are obtained and keyword recognition is performed on them to obtain a partition attribute set for each disease in the combined disease; the information gain ratio of each partition attribute in the partition attribute set is calculated to generate a first knowledge grouping decision tree for each disease; and n first candidate joint grouping schemes for patients suffering from the combined disease are obtained according to the first knowledge grouping decision trees. Then n pieces of sample data of patients suffering from the combined disease are obtained, and an outcome label is generated for each piece of sample data according to its indicators. Finally, a lambdaMART model is trained using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model. In this way, the training stage of the patient grouping model no longer considers the grouping scheme of a single disease in isolation but derives joint multi-disease grouping schemes, which takes the correlated effects between different grouping decisions into account. In addition, the outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent mitigates the problem that biased samples are hard to learn from when only the absolute outcome is used. Furthermore, because the lambdaMART model is used for training, the resulting patient grouping model attends both to the first candidate joint grouping schemes themselves and to the priority order among them, which helps improve the grouping of patients who suffer from multiple diseases.
Refer to FIG. 3, which is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of this application. As shown in FIG. 3, the method includes steps S31-S35:
S31: Obtain preset disease prevention and treatment guidelines, perform keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease;
S32: Obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
S33: Obtain n pieces of sample data of patients suffering from the combined disease, and obtain the importance of each indicator in each piece of sample data.
In a possible implementation, obtaining the importance of each indicator in each piece of sample data includes:
training a logistic regression model on the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}}$, where y is the output of the regression model, X1, X2…Xn are the indicators in each piece of sample data, and the coefficients β1, β2…βn represent the importance of the indicators;
reducing the log loss by gradient descent during training in order to estimate the regression coefficients β0, β1, β2…βn, thereby obtaining the importance of each indicator.
In this implementation, the regression coefficients β are taken as the importance of the indicators in the sample data, which facilitates the subsequent definition of the absolute outcome and the relative outcome.
S34: Generate an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data;
In a possible implementation, generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data includes:
generating the outcome label for each piece of sample data with the preset formula effect(i)=absolute(i)*relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, defined in a customized way according to the importance of the indicators in the i-th piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined in terms of absolute(i).
In this implementation, an outcome label is generated for each piece of sample data on the basis of the indicator importance obtained in step S33. The outcome label considers not only the absolute outcome but also the relative outcome, which addresses the lack of objectivity that results from considering only the absolute outcome and helps reduce the learning difficulty of the patient grouping model.
S35: Train a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
The specific implementations of steps S31-S35 have been described in detail in the embodiment shown in FIG. 2 and, to avoid repetition, are not described again here.
To aid understanding of the scheme for constructing a patient grouping model proposed in the embodiments of this application, a brief description is given below using diabetes and hypertension as the combined disease. As shown in FIG. 4, the knowledge grouping decision tree for diabetes is derived from the diabetes prevention guideline (Guideline 1), and the knowledge grouping decision tree for hypertension is derived from the hypertension prevention guideline (Guideline 2). The grouping schemes under the diabetes tree and the grouping schemes under the hypertension tree are combined to obtain the candidate joint grouping schemes for diabetes and hypertension. Multiple pieces of follow-up data of patients with both diabetes and hypertension are retrieved from the database, and indicators such as glycated hemoglobin and blood pressure in each piece of follow-up data are used to train a logistic regression model. The regression coefficients of the model are estimated, and their values are taken as the importance of the indicators. The absolute outcome (absolute) is defined from the importance of the indicators, and the relative outcome (relative) is defined from the absolute outcome. A formula that considers both the absolute and the relative outcome is used to label each piece of follow-up data with an outcome, yielding sample data with outcome labels. Finally, lambdaMART training is performed with the labeled sample data and the candidate joint grouping schemes for diabetes and hypertension; training stops when a preset convergence condition is met, yielding a usable patient grouping model.
Based on the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3, refer to FIG. 5, which is a schematic flowchart of a patient grouping method provided by an embodiment of this application. The patient grouping method can likewise be implemented on the network system architecture shown in FIG. 1. As shown in FIG. 5, it specifically includes steps S51-S54:
S51: Receive a patient grouping request submitted by a user terminal, where the patient grouping request includes at least two diseases that the patient to be grouped suffers from;
In the specific embodiments of this application, the patient grouping request is used to request the server to group the patient to be grouped. The patient to be grouped is a patient who suffers from the same combined disease as the sample patients in the model training stage, for example a patient with diabetes and hypertension. The patient grouping request may include the combined disease that the patient to be grouped suffers from and may, of course, also include the prevention and treatment guidelines for the diseases in the combined disease, the basic information of the patient to be grouped, diagnostic information, and so on. In this case the user terminal may be a terminal used by medical staff, a terminal in a medical research laboratory, a terminal of staff at a healthcare company, and so on. For example, after diagnosing the patient to be grouped, medical staff can send the patient grouping request to the server through the user terminal.
S52: Obtain a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
In the specific embodiments of this application, a second knowledge grouping decision tree is the knowledge grouping decision tree generated, in the stage where the patient grouping model is used, by analyzing the disease prevention and treatment guidelines with techniques such as keyword recognition and information gain ratio calculation. The grouping schemes under the second knowledge grouping decision trees are combined to obtain the second candidate joint grouping schemes.
S53: Input the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
In the specific embodiments of this application, the patient grouping model predicts the score of each second candidate joint grouping scheme using the regression-tree method and ranks the second candidate joint grouping schemes by this score: the higher the score of a second candidate joint grouping scheme, the higher it is ranked, and the lower the score, the lower it is ranked.
S54: According to the ranking result of the second candidate joint grouping schemes, select a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and return the grouping result to the user terminal.
In the specific embodiments of this application, the preset number of second candidate joint grouping schemes can be set according to the actual situation. It may be the single top-ranked second candidate joint grouping scheme, or the top three second candidate joint grouping schemes; this is not specifically limited. For example, suppose the second candidate joint grouping schemes of the patient to be grouped are A1+B1, A2+B1, A1+B2, and A2+B2, and their ranking result is A2+B1, A2+B2, A1+B1, A1+B2. If the top 2 second candidate joint grouping schemes are selected as the final joint grouping schemes of the patient to be grouped, the result returned to the user terminal is A2+B1, A2+B2.
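The selection in step S54 can be sketched as a sort followed by a slice. The function name `top_k_schemes` and the score values are hypothetical; the scores are chosen only so that the ranking matches the example above.

```python
def top_k_schemes(scored_schemes, k):
    """Return the k candidate joint grouping schemes with the highest scores."""
    ranked = sorted(scored_schemes.items(), key=lambda item: item[1], reverse=True)
    return [scheme for scheme, _ in ranked[:k]]

# Hypothetical predicted scores for the four second candidate joint schemes.
scores = {"A1+B1": 0.42, "A2+B1": 0.91, "A1+B2": 0.17, "A2+B2": 0.66}
result = top_k_schemes(scores, k=2)
full_ranking = top_k_schemes(scores, k=len(scores))
```

Setting k to 1 or 3 gives the other two options mentioned above (the single top-ranked scheme or the top three).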
With the patient grouping method provided by the embodiments of this application, if the patient to be grouped has diabetes and hypertension, the method can be carried out as shown in FIG. 6 once a patient grouping request sent by the user terminal is received. The diabetes knowledge grouping decision tree and the hypertension knowledge grouping decision tree are derived from the diabetes guideline and the hypertension guideline respectively, multiple second candidate joint grouping schemes are obtained from the two knowledge grouping decision trees, and these schemes are input into the patient grouping model for score prediction and ranking. Finally, the top-k best second candidate joint grouping schemes are output. Because the prediction and ranking are performed by the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3, this helps improve the grouping of patients with multiple combined diseases and is better suited to precision medicine.
Based on the description of the embodiment of the patient grouping model construction method shown in FIG. 2, an embodiment of the present application further provides a patient grouping model construction device, which may be a computer program (including program code) running in a terminal. The patient grouping model construction device can execute the method shown in FIG. 2 or FIG. 3. Referring to FIG. 7, the device includes:
a first grouping scheme acquisition module 71, configured to acquire preset disease prevention and treatment guidelines, perform keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
an outcome label generation module 72, configured to acquire n pieces of sample data of patients suffering from the combined disease, and generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
a grouping model training module 73, configured to train a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
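The information gain ratio computed by module 71 can be sketched as follows. This is a minimal C4.5-style formulation under assumed inputs: `values` holds one partition attribute's value per record and `labels` holds a class label per record. What the application actually uses as the class label is not stated in this section, so the toy data are purely illustrative.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain ratio of one partition attribute (C4.5 style)."""
    total = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - conditional
    split_info = entropy(values)  # penalizes attributes with many values
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: the attribute separates the two classes perfectly.
ratio = gain_ratio(["high", "high", "low", "low"], ["x", "x", "y", "y"])
print(ratio)
```

The attribute with the highest ratio would be chosen as the split at each node when growing the knowledge grouping decision tree.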
Based on the description of the embodiment of the patient grouping method shown in FIG. 5, an embodiment of the present application further provides a patient grouping device. Referring to FIG. 8, the device includes:
a grouping request acquisition module 81, configured to receive a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by the patient to be grouped;
a second grouping scheme acquisition module 82, configured to acquire a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtain second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision trees;
a grouping scheme ranking module 83, configured to input the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
a grouping result output module 84, configured to select, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and return the grouping result to the user terminal.
According to one embodiment of the present application, the units of the patient grouping model construction device and the patient grouping device shown in FIG. 7 and FIG. 8 may be separately or wholly combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units; either arrangement achieves the same operations without affecting the technical effects of the embodiments of the present application. The above units are divided by logical function. In practice, the function of one unit may be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of the present application, the patient grouping model construction device and the patient grouping device may also include other units; in practice, these functions may also be implemented with the assistance of other units, and may be implemented by multiple units in cooperation.
According to another embodiment of the present application, the patient grouping model construction device or patient grouping device shown in FIG. 7 or FIG. 8 can be constructed, and the patient grouping model construction method and patient grouping method of the embodiments of the present application can be implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2, FIG. 3, or FIG. 5 on a general-purpose computing device, such as a computer, that includes processing and storage elements such as a central processing unit (CPU), random access memory (RAM), and read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the computing device through that medium, and run there.
Referring to FIG. 9, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application, the electronic device includes at least a processor 901, an input device 902, an output device 903, and a computer-readable storage medium 904. The processor 901, the input device 902, the output device 903, and the computer-readable storage medium 904 in the electronic device may be connected by a bus or by other means.
The computer-readable storage medium 904 may be stored in the memory of the electronic device. The computer-readable storage medium 904 is configured to store a computer program that includes program instructions, and the processor 901 is configured to execute the program instructions stored in the computer-readable storage medium 904. The processor 901 (also called a CPU, Central Processing Unit) is the computing core and control core of the electronic device. It is adapted to implement one or more instructions, and specifically to load and execute one or more instructions so as to implement the corresponding method flow or function.
In one embodiment, the processor 901 of the electronic device provided by the embodiment of the present application may be used to perform a series of patient grouping model construction operations, including:
acquiring preset disease prevention and treatment guidelines, performing keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
In one embodiment, when the processor 901 generates an outcome label for each piece of sample data according to the indicators in that piece of sample data, this includes: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
In one embodiment, when the processor 901 obtains the importance of each indicator in each piece of sample data, this includes: training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}}$, where $y$ is the output of the regression model, $X_1, X_2, \ldots, X_n$ are the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \ldots, \beta_n$ represent the importance of the respective indicators; during training, the log loss is reduced by gradient descent to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$, thereby obtaining the importance of each indicator.
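The coefficient estimation described above can be sketched in plain Python as batch gradient descent on the mean log loss. This is an illustrative sketch only: the learning rate, epoch count, toy data, and the use of a binary target `y` are assumptions, and the function name `fit_logistic` is hypothetical.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Estimate [beta0, beta1..betan] by batch gradient descent on mean log loss."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = p - yi  # derivative of the log loss w.r.t. the linear term
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / m for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy data: indicator 1 tracks the outcome, indicator 2 is noise.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [3, 0], [3, 1]]
y = [0, 0, 0, 1, 1, 0, 1, 1]
beta = fit_logistic(X, y)
importance = [abs(b) for b in beta[1:]]  # |beta_j| read as indicator importance
print(importance)
```

Reading importance from coefficient magnitude is only meaningful when the indicators are on comparable scales, so in practice they would likely be standardized first.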
In one embodiment, when the processor 901 generates an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data, this includes: generating the outcome label for each piece of sample data using the preset formula effect(i) = absolute(i) * relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined according to absolute(i).
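The formula effect(i) = absolute(i) * relative(i) leaves both factors to be defined by the implementer. The sketch below fills them in with purely hypothetical choices: absolute(i) as an importance-weighted sum of indicator values scaled to [0, 1], and relative(i) as the fraction of samples whose absolute outcome does not exceed that of sample i. Neither definition comes from the application.

```python
def absolute_outcome(indicators, importances):
    """ASSUMPTION: importance-weighted sum of indicator values in [0, 1]."""
    return sum(w * x for w, x in zip(importances, indicators))


def relative_outcome(abs_scores, i):
    """ASSUMPTION: share of samples whose absolute outcome is <= sample i's."""
    return sum(1 for s in abs_scores if s <= abs_scores[i]) / len(abs_scores)


def effect_labels(samples, importances):
    """effect(i) = absolute(i) * relative(i) for every sample."""
    abs_scores = [absolute_outcome(s, importances) for s in samples]
    return [a * relative_outcome(abs_scores, i) for i, a in enumerate(abs_scores)]


# Hypothetical samples with two indicators and assumed importances.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = effect_labels(samples, [0.7, 0.3])
print(labels)
```

Under these assumptions, a sample scores highly only if its absolute outcome is good and it also compares well against the other samples.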
In one embodiment, when the processor 901 trains the lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain the constructed patient grouping model, this includes:
A: calculating the lambda values of the first candidate joint grouping schemes;
B: training a regression tree with the lambda values as labels, and calculating the final output score at each leaf node of the regression tree from the predicted regression result;
C: predicting, through steps A and B, the score of each piece of sample data with an outcome label, and ranking the first candidate joint grouping schemes corresponding to the pieces of sample data according to those scores;
D: repeating steps A to C to form a random forest, and stopping training as soon as one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, and the random forest no longer continues to update on the validation set.
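Step A above can be illustrated with the commonly used LambdaRank gradient weighted by the change in NDCG. The sketch is an assumption about how the lambda values might be computed; the application does not spell out the formula in this section. The sign convention here is that a positive lambda means the item's score should be raised. Note also that LambdaMART is conventionally described as gradient-boosted regression trees, with each new tree in steps B to D fitted to the current lambda values.

```python
from math import exp, log2


def lambda_values(scores, labels, sigma=1.0):
    """Per-item LambdaRank gradients; positive means 'raise this score'."""
    n = len(scores)
    # Current ranking induced by the model scores (best first).
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {item: pos for pos, item in enumerate(order)}
    gain = [2.0 ** lab - 1.0 for lab in labels]
    disc = [1.0 / log2(rank[i] + 2) for i in range(n)]
    ideal = sum(g / log2(p + 2) for p, g in enumerate(sorted(gain, reverse=True)))
    lambdas = [0.0] * n
    if ideal == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                # |change in NDCG| if items i and j swapped rank positions.
                delta = abs((gain[i] - gain[j]) * (disc[i] - disc[j])) / ideal
                push = sigma / (1.0 + exp(sigma * (scores[i] - scores[j]))) * delta
                lambdas[i] += push
                lambdas[j] -= push
    return lambdas


# Hypothetical outcome labels for four candidate schemes, all scored 0 initially.
lams = lambda_values([0.0, 0.0, 0.0, 0.0], [1.0, 3.0, 0.0, 2.0])
print(lams)
```

In a full training loop, a regression tree would be fitted to these values, its leaf outputs added to the scores, and the process repeated until a stopping condition is met.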
It should be noted that, because the processor 901 of the electronic device implements the steps of the above patient grouping model construction method when executing the computer program, all embodiments of the patient grouping model construction method apply to this electronic device and achieve the same or similar beneficial effects.
Referring to FIG. 10, which is a schematic structural diagram of another electronic device provided by an embodiment of the present application, the electronic device includes at least a processor 1001, an input device 1002, an output device 1003, and a computer-readable storage medium 1004. The processor 1001, the input device 1002, the output device 1003, and the computer-readable storage medium 1004 in the electronic device may be connected by a bus or by other means.
In one embodiment, the processor 1001 of the electronic device provided by the embodiment of the present application may be used to perform a series of patient grouping operations, including:
receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by the patient to be grouped;
acquiring a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtaining second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision trees;
inputting the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
It should be noted that, because the processor 1001 of the electronic device implements the steps of the above patient grouping method when executing the computer program, all embodiments of the patient grouping method apply to this electronic device and achieve the same or similar beneficial effects. In addition, the patient grouping method and the patient grouping model construction method may be executed by the same electronic device or by different electronic devices, which is not limited in the embodiments of the present application.
An embodiment of the present application further provides a computer-readable storage medium (memory), which is a memory device in an electronic device used to store programs and data. It can be understood that the computer-readable storage medium here may include a storage medium built into the terminal, and may also include an extended storage medium supported by the terminal. The computer-readable storage medium provides storage space that stores the operating system of the terminal. The storage space also stores one or more instructions suitable for being loaded and executed by the processor 901; these instructions may be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here may be a high-speed RAM or a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one computer-readable storage medium located remotely from the aforementioned processor 901. In one embodiment, the processor 901 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:
acquiring preset disease prevention and treatment guidelines, performing keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
In one example, when loaded by the processor 901, the one or more instructions in the computer-readable storage medium further perform the following steps: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
In one example, when loaded by the processor 901, the one or more instructions in the computer-readable storage medium further perform the following steps: training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}}$, where $y$ is the output of the regression model, $X_1, X_2, \ldots, X_n$ are the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \ldots, \beta_n$ represent the importance of the respective indicators; during training, the log loss is reduced by gradient descent to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$, thereby obtaining the importance of each indicator.
In one example, when loaded by the processor 901, the one or more instructions in the computer-readable storage medium further perform the following steps: generating the outcome label for each piece of sample data using the preset formula effect(i) = absolute(i) * relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined according to absolute(i).
In one example, when loaded by the processor 901, the one or more instructions in the computer-readable storage medium further perform the following steps:
A: calculating the lambda values of the first candidate joint grouping schemes;
B: training a regression tree with the lambda values as labels, and calculating the final output score at each leaf node of the regression tree from the predicted regression result;
C: predicting, through steps A and B, the score of each piece of sample data with an outcome label, and ranking the first candidate joint grouping schemes corresponding to the pieces of sample data according to those scores;
D: repeating steps A to C to form a random forest, and stopping training as soon as one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, and the random forest no longer continues to update on the validation set.
An embodiment of the present application further provides a computer-readable storage medium (memory). In one embodiment, the processor 1001 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:
receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by the patient to be grouped;
acquiring a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtaining second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision trees;
inputting the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
By way of example, the computer program of the computer-readable storage medium includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
The above discloses only some embodiments of the present application and does not limit the scope of rights of the present application. A person of ordinary skill in the art can understand all or part of the processes for implementing the above embodiments, and equivalent changes made in accordance with the claims of the present application still fall within the scope covered by the present application.

Claims (20)

  1. A method for constructing a patient grouping model, wherein the method comprises:
    acquiring preset disease prevention and treatment guidelines, performing keyword recognition on the guidelines to obtain a partition attribute set for each disease in the combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
    acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
    training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain the constructed patient grouping model.
  2. The method according to claim 1, wherein generating an outcome label for each piece of sample data according to the indicators in that piece of sample data comprises:
    obtaining the importance of each indicator in each piece of sample data;
    generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
  3. The method according to claim 2, wherein obtaining the importance of each indicator in each piece of sample data comprises:
    training a logistic regression model using the indicators in each piece of sample data: $y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}}$, where $y$ is the output of the regression model, $X_1, X_2, \ldots, X_n$ are the indicators in each piece of sample data, and the coefficients $\beta_1, \beta_2, \ldots, \beta_n$ represent the importance of the respective indicators;
    during training, reducing the log loss by gradient descent to estimate the regression coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_n$, thereby obtaining the importance of each indicator.
  4. The method according to claim 2, wherein generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data comprises:
    generating the outcome label for each piece of sample data using the preset formula effect(i) = absolute(i) * relative(i), where effect(i) is the outcome label of the i-th piece of sample data; absolute(i) is the absolute outcome of the i-th piece of sample data, custom-defined according to the importance of each indicator in the i-th piece of sample data; and relative(i) is the relative outcome of the i-th piece of sample data, defined according to absolute(i).
  5. The method according to any one of claims 1 to 4, wherein training the lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain the constructed patient grouping model comprises:
    A: calculating the lambda values of the first candidate joint grouping schemes;
    B: training a regression tree with the lambda values as labels, and calculating the final output score at each leaf node of the regression tree from the predicted regression result;
    C: predicting, through steps A and B, the score of each piece of sample data with an outcome label, and ranking the first candidate joint grouping schemes corresponding to the pieces of sample data according to those scores;
    D: repeating steps A to C to form a random forest, and stopping training as soon as one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, and the random forest no longer continues to update on the validation set.
  6. A patient grouping method using a patient grouping model constructed by the method according to any one of claims 1 to 5, wherein the method comprises:
    receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by a patient to be grouped;
    acquiring a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
    inputting the second candidate joint grouping schemes into the pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
    selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
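The inference flow of the grouping method can be sketched as follows. The claim does not state how candidate joint grouping schemes are derived from the per-disease decision trees; the sketch assumes, for illustration only, that each tree yields a list of groups for its disease and that a joint scheme is one group per disease (a Cartesian product). The scoring function stands in for the trained patient grouping model, and all names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Scheme = Tuple[str, ...]


def candidate_schemes(groups_per_disease: Dict[str, Sequence[str]]) -> List[Scheme]:
    """Assumed combination rule: pick one group per disease, in sorted disease order."""
    diseases = sorted(groups_per_disease)
    return list(product(*(groups_per_disease[d] for d in diseases)))


def top_k_schemes(candidates: Sequence[Scheme], score_fn: Callable[[Scheme], float], k: int) -> List[Scheme]:
    """Rank the candidates by model score (highest first) and keep the preset number k."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical request: a patient with two diseases, two groups per disease.
    groups = {"diabetes": ["D-low", "D-high"], "hypertension": ["H-low", "H-high"]}
    # Hypothetical scores standing in for the trained ranking model.
    scores = {("D-low", "H-low"): 0.2, ("D-low", "H-high"): 0.9,
              ("D-high", "H-low"): 0.4, ("D-high", "H-high"): 0.7}
    print(top_k_schemes(candidate_schemes(groups), scores.__getitem__, k=2))
```

Returning only the top k schemes keeps the response to the user terminal short while still exposing alternatives to the best-ranked grouping.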
  7. A patient grouping model construction device, wherein the device comprises:
    a first grouping scheme acquisition module, configured to acquire a preset disease prevention and treatment guideline, perform keyword recognition on the guideline to obtain a partition attribute set for each disease in a combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
    an outcome label generation module, configured to acquire n pieces of sample data of patients suffering from the combined disease, and to generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
    a grouping model training module, configured to train a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain a constructed patient grouping model.
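The information gain ratio used to choose partition attributes when building a knowledge grouping decision tree can be computed as in the sketch below. It follows the standard C4.5 definition (information gain divided by the split information of the attribute). The record layout and the target column name are hypothetical, since the claim does not specify what the decision tree predicts.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(rows: Sequence[Dict[str, Hashable]], attr: str, target: str) -> float:
    """Information gain of `attr` with respect to `target`, divided by the split information."""
    base = entropy([r[target] for r in rows])
    parts: Dict[Hashable, List[Hashable]] = {}
    for r in rows:
        parts.setdefault(r[attr], []).append(r[target])
    conditional = sum(len(p) / len(rows) * entropy(p) for p in parts.values())
    split_info = entropy([r[attr] for r in rows])
    # An attribute with a single value cannot split the data, so its ratio is 0.
    return (base - conditional) / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical records; "risk_group" is an assumed target column.
    rows = [
        {"hba1c_high": "yes", "smoker": "yes", "risk_group": "high"},
        {"hba1c_high": "yes", "smoker": "no", "risk_group": "high"},
        {"hba1c_high": "no", "smoker": "yes", "risk_group": "low"},
        {"hba1c_high": "no", "smoker": "no", "risk_group": "low"},
    ]
    for attr in ("hba1c_high", "smoker"):
        print(attr, gain_ratio(rows, attr, "risk_group"))
```

At each node the attribute with the highest gain ratio would be chosen as the split, which penalises attributes that merely have many distinct values.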
  8. A patient grouping device, wherein the device comprises:
    a grouping request acquisition module, configured to receive a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by a patient to be grouped;
    a second grouping scheme acquisition module, configured to acquire a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and to obtain second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
    a grouping scheme ranking module, configured to input the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
    a grouping result output module, configured to select, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and to return the grouping result to the user terminal.
  9. An electronic device, comprising an input device and an output device, and further comprising:
    a processor, adapted to implement one or more instructions; and
    a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute:
    acquiring a preset disease prevention and treatment guideline, performing keyword recognition on the guideline to obtain a partition attribute set for each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
    acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
    training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain a constructed patient grouping model.
  10. The electronic device according to claim 9, wherein the processor generating an outcome label for each piece of sample data according to the indicators in that piece of sample data comprises:
    acquiring the importance of each indicator in each piece of sample data;
    generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
  11. The electronic device according to claim 10, wherein the processor acquiring the importance of each indicator in each piece of sample data comprises:
    training a logistic regression model using the indicators in each piece of sample data: y = 1/(1 + e^(-(β0 + β1X1 + β2X2 + ... + βnXn))), where y denotes the output of the regression model, X1, X2, ..., Xn denote the indicators in each piece of sample data, and the coefficients β1, β2, ..., βn denote the importance of the respective indicators;
    during training, reducing the log loss by gradient descent to estimate the regression coefficients β0, β1, β2, ..., βn, thereby obtaining the importance of each indicator.
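The indicator-importance step can be sketched as a logistic regression fitted by batch gradient descent on the log loss. The binary target `y`, the learning rate, and the number of epochs are assumptions for illustration; the claim does not state what outcome the regression is trained against. Reading the fitted coefficients as importance also presumes the indicators are on comparable scales.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Return [β0, β1, ..., βn] estimated by batch gradient descent on the mean log loss."""
    n = len(X[0])
    beta = [0.0] * (n + 1)  # beta[0] is the intercept β0
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for row, target in zip(X, y):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = p - target  # derivative of the log loss with respect to the linear term
            grad[0] += err
            for k, x in enumerate(row):
                grad[k + 1] += err * x
        beta = [b - lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


if __name__ == "__main__":
    # Hypothetical data: the first indicator determines the outcome, the second does not.
    X = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
    y = [0, 1, 0, 1]
    beta = fit_logistic(X, y)
    print("importance of each indicator:", beta[1:])
```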
  12. The electronic device according to claim 10, wherein the processor generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data comprises:
    generating an outcome label for each piece of sample data using the preset formula effect(i) = absolute(i) * relative(i), where effect(i) denotes the outcome label of the i-th piece of sample data; absolute(i) denotes the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator in that piece of sample data; and relative(i) denotes the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
  13. The electronic device according to any one of claims 9 to 12, wherein the processor training the lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain the constructed patient grouping model comprises:
    A: calculating the lambda values of the first candidate joint grouping schemes;
    B: training a regression tree with the lambda values as labels, and calculating the final output score at each leaf node of the regression tree from the predicted regression result;
    C: predicting, through steps A and B, a score for each piece of sample data with an outcome label, and ranking the first candidate joint grouping scheme corresponding to each piece of sample data according to that score;
    D: repeating steps A to C to build a random forest, and stopping training as soon as one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaching a preset parameter value, and the random forest no longer continuing to update on the validation set.
  14. An electronic device, comprising an input device and an output device, and further comprising:
    a processor, adapted to implement one or more instructions; and
    a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute:
    receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by a patient to be grouped;
    acquiring a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
    inputting the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
    selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute:
    acquiring a preset disease prevention and treatment guideline, performing keyword recognition on the guideline to obtain a partition attribute set for each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtaining, according to the first knowledge grouping decision trees, n first candidate joint grouping schemes for patients suffering from the combined disease;
    acquiring n pieces of sample data of patients suffering from the combined disease, and generating an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one to the first candidate joint grouping schemes, the outcome label of each piece of sample data represents the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
    training a lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes, to obtain a constructed patient grouping model.
  16. The computer-readable storage medium according to claim 15, wherein the one or more instructions, when loaded by the processor, further execute:
    acquiring the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
  17. The computer-readable storage medium according to claim 16, wherein the one or more instructions, when loaded by the processor, further execute:
    training a logistic regression model using the indicators in each piece of sample data: y = 1/(1 + e^(-(β0 + β1X1 + β2X2 + ... + βnXn))), where y denotes the output of the regression model, X1, X2, ..., Xn denote the indicators in each piece of sample data, and the coefficients β1, β2, ..., βn denote the importance of the respective indicators; and, during training, reducing the log loss by gradient descent to estimate the regression coefficients β0, β1, β2, ..., βn, thereby obtaining the importance of each indicator.
  18. The computer-readable storage medium according to claim 16, wherein the one or more instructions, when loaded by the processor, further execute:
    generating an outcome label for each piece of sample data using the preset formula effect(i) = absolute(i) * relative(i), where effect(i) denotes the outcome label of the i-th piece of sample data; absolute(i) denotes the absolute outcome of the i-th piece of sample data and is custom-defined according to the importance of each indicator in that piece of sample data; and relative(i) denotes the relative outcome of the i-th piece of sample data and is defined according to absolute(i).
  19. The computer-readable storage medium according to any one of claims 15 to 18, wherein the one or more instructions, when loaded by the processor, further execute:
    A: calculating the lambda values of the first candidate joint grouping schemes;
    B: training a regression tree with the lambda values as labels, and calculating the final output score at each leaf node of the regression tree from the predicted regression result;
    C: predicting, through steps A and B, a score for each piece of sample data with an outcome label, and ranking the first candidate joint grouping scheme corresponding to each piece of sample data according to that score;
    D: repeating steps A to C to build a random forest, and stopping training as soon as one of the preset convergence conditions is met, to obtain the patient grouping model; the preset convergence conditions include: the number of regression trees reaching a preset parameter value, and the random forest no longer continuing to update on the validation set.
  20. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute:
    receiving a patient grouping request submitted by a user terminal, the patient grouping request including at least two diseases suffered by a patient to be grouped;
    acquiring a second knowledge grouping decision tree for each disease suffered by the patient to be grouped, and obtaining second candidate joint grouping schemes for the patient to be grouped according to the second knowledge grouping decision trees;
    inputting the second candidate joint grouping schemes into a pre-trained patient grouping model for ranking, to obtain a ranking result of the second candidate joint grouping schemes;
    selecting, according to the ranking result of the second candidate joint grouping schemes, a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped, and returning the grouping result to the user terminal.
PCT/CN2020/099530 2020-05-13 2020-06-30 Patient grouping model constructing method, patient grouping method, and related device WO2021114635A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010404637.2A CN111696661B (en) 2020-05-13 2020-05-13 Patient grouping model construction method, patient grouping method and related equipment
CN202010404637.2 2020-05-13

Publications (1)

Publication Number Publication Date
WO2021114635A1 true WO2021114635A1 (en) 2021-06-17

Family

ID=72477306

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099530 WO2021114635A1 (en) 2020-05-13 2020-06-30 Patient grouping model constructing method, patient grouping method, and related device

Country Status (2)

Country Link
CN (1) CN111696661B (en)
WO (1) WO2021114635A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116759042A (en) * 2023-08-22 2023-09-15 之江实验室 System and method for generating anti-facts medical data based on annular consistency

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819527B (en) * 2021-01-29 2024-05-24 百果园技术(新加坡)有限公司 User grouping processing method and device
CN112883654B (en) * 2021-03-24 2023-01-31 国家超级计算天津中心 Model training system based on data driving
CN113724061A (en) * 2021-08-18 2021-11-30 杭州信雅达泛泰科技有限公司 Consumer financial product credit scoring method and device based on customer grouping
CN113724815B (en) * 2021-08-30 2024-06-21 深圳平安智慧医健科技有限公司 Information pushing method and device based on decision grouping model
CN113782192A (en) * 2021-09-30 2021-12-10 平安科技(深圳)有限公司 Grouping model construction method based on causal inference and medical data processing method
CN118507030A (en) * 2024-05-31 2024-08-16 山东纬横医疗科技有限公司 Medical prevention decision-making system based on informatization

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180322660A1 (en) * 2017-05-02 2018-11-08 Techcyte, Inc. Machine learning classification and training for digital microscopy images
CN109243618A (en) * 2018-09-12 2019-01-18 腾讯科技(深圳)有限公司 Construction method, disease label construction method and the smart machine of medical model
CN109801705A (en) * 2018-12-12 2019-05-24 平安科技(深圳)有限公司 Treat recommended method, system, device and storage medium
CN110164519A (en) * 2019-05-06 2019-08-23 北京工业大学 A kind of classification method for being used to handle electronic health record blended data based on many intelligence networks
CN110363226A (en) * 2019-06-21 2019-10-22 平安科技(深圳)有限公司 Ophthalmology disease classifying identification method, device and medium based on random forest
CN110929752A (en) * 2019-10-18 2020-03-27 平安科技(深圳)有限公司 Knowledge-driven and data-driven clustering method and related equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8706521B2 (en) * 2010-07-16 2014-04-22 Naresh Ramarajan Treatment related quantitative decision engine
CN108109692A (en) * 2017-11-08 2018-06-01 北京无极慧通科技有限公司 The selection method and system of a kind of therapeutic scheme

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116759042A (en) * 2023-08-22 2023-09-15 之江实验室 System and method for generating anti-facts medical data based on annular consistency
CN116759042B (en) * 2023-08-22 2023-12-22 之江实验室 System and method for generating anti-facts medical data based on annular consistency

Also Published As

Publication number Publication date
CN111696661B (en) 2024-09-24
CN111696661A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2021114635A1 (en) Patient grouping model constructing method, patient grouping method, and related device
US11232365B2 (en) Digital assistant platform
US20200265931A1 (en) Systems and methods for coding health records using weighted belief networks
US20200311610A1 (en) Rule-based feature engineering, model creation and hosting
CN112528660A (en) Method, apparatus, device, storage medium and program product for processing text
Bardak et al. Improving clinical outcome predictions using convolution over medical entities with multimodal learning
US20240347202A1 (en) Cross Care Matrix Based Care Giving Intelligence
Rabie et al. A decision support system for diagnosing diabetes using deep neural network
Khilji et al. Healfavor: Dataset and a prototype system for healthcare chatbot
Jia et al. DKDR: An approach of knowledge graph and deep reinforcement learning for disease diagnosis
US12087442B2 (en) Methods and systems for confirming an advisory interaction with an artificial intelligence platform
Zaghir et al. Real-world patient trajectory prediction from clinical notes using artificial neural networks and UMLS-based extraction of concepts
Ren et al. Mortality prediction in ICU using a stacked ensemble model
Malgieri Ontologies, Machine Learning and Deep Learning in Obstetrics
Razmi AI Doctor: The Rise of Artificial Intelligence in Healthcare-A Guide for Users, Buyers, Builders, and Investors
Zhu et al. Is larger always better? Evaluating and prompting large language models for non-generative medical tasks
WO2021120528A1 (en) Automatic report interpretation method and system
Mohan Predicting post-procedural complications using neural networks on MIMIC-III data
Yan et al. Generating Synthetic Electronic Health Record Data Using Generative Adversarial Networks: Tutorial
US12087443B2 (en) System and method for transmitting a severity vector
Sousa et al. An architecture based on fuzzy systems for personalized medicine in ICUs
US12094582B1 (en) Intelligent healthcare data fabric system
US12124966B1 (en) Apparatus and method for generating a text output
Fazlinovic et al. Patient outcome prediction using knowledge graph representation learning
US11561938B1 (en) Closed-loop intelligence

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20899600

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20899600

Country of ref document: EP

Kind code of ref document: A1