WO2021114635A1 - Method for constructing a patient grouping model, patient grouping method, and related device

Info

Publication number
WO2021114635A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample data
grouping
patient
piece
outcome
Prior art date
Application number
PCT/CN2020/099530
Other languages
English (en)
Chinese (zh)
Inventor
徐卓扬
孙行智
赵惟
左磊
胡岗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021114635A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • This application relates to the field of machine learning technology, and in particular to a method for constructing a patient clustering model, a patient clustering method, and related equipment.
  • The present application provides a method for constructing a patient grouping model, a method for grouping patients, and related equipment, which help improve the effect of jointly grouping patients with multiple diseases.
  • an embodiment of the present application provides a method for constructing a patient grouping model, the method including:
  • Obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease; then, according to the first knowledge grouping decision trees, obtain n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • an embodiment of the present application provides a method for grouping patients, which includes:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • an embodiment of the present application provides an apparatus for constructing a patient grouping model, the apparatus including:
  • the first clustering scheme acquisition module is used to acquire preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and, according to the first knowledge grouping decision trees, to obtain n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the outcome label generation module is used to obtain n pieces of sample data of patients suffering from the combined disease, and to generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data indicates the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
  • the clustering model training module is used to train the lambdaMART model by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • an embodiment of the present application provides a patient grouping device, which includes:
  • the grouping request acquisition module is configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;
  • the second clustering scheme acquisition module is used to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and to obtain the second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision trees;
  • the clustering scheme ranking module is configured to input the second candidate joint clustering plan into a pre-trained patient clustering model for ranking, and obtain a ranking result of the second candidate joint clustering plan;
  • the grouping result output module is configured to select a preset number of the second candidate joint grouping schemes as the grouping result of the patient to be grouped and return to the user terminal according to the sorting result of the second candidate joint grouping scheme.
  • an embodiment of the present application provides an electronic device that includes an input device and an output device, and also includes a processor, which is adapted to implement one or more instructions; and, a computer-readable storage medium.
  • the readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by the processor and executing the following steps:
  • Obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease; then, according to the first knowledge grouping decision trees, obtain n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • an embodiment of the present application provides an electronic device, which includes an input device and an output device, and also includes a processor, adapted to implement one or more instructions; and, a computer-readable storage medium.
  • the readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by the processor and executing the following steps:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by a processor and executing the following steps :
  • Obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease; then, according to the first knowledge grouping decision trees, obtain n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores one or more instructions, and the one or more instructions are suitable for being loaded by a processor and executing the following steps :
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • Instead of considering the grouping scheme of a single disease, the schemes for multi-disease joint grouping are worked out, taking into account the interactions between different grouping decisions.
  • The outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent eliminates the problem that biased samples are difficult to learn when only the absolute outcome is used.
  • The lambdaMART model is used for training, so the resulting patient grouping model attends not only to each first candidate joint grouping scheme itself but also to the priority order among the first candidate joint grouping schemes, which improves the effect of grouping patients with multiple diseases.
  • FIG. 1 is a diagram of a network system architecture provided by an embodiment of this application.
  • FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 4 is an example diagram of constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 5 is a schematic flowchart of a method for grouping patients according to an embodiment of the application.
  • FIG. 6 is an example diagram of a patient grouping provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of an apparatus for constructing a patient grouping model provided by an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of a patient grouping device provided by an embodiment of the application.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
  • the embodiment of the application provides a solution for constructing a patient grouping model to construct a patient grouping model suitable for patients with multiple diseases.
  • The knowledge grouping decision tree of each disease in the combined disease is used to obtain the candidate joint grouping schemes for patients suffering from the combined disease, which fully considers the interactions between the single-disease grouping schemes.
  • The patient's follow-up data is used as the sample data, and the importance of the indicators in the sample data (demographic information, medication history, laboratory examinations, vital signs, and so on) is used to generate an outcome label for each piece of sample data.
  • Besides the absolute outcome, this application also considers the relative outcome, which is more objective and reasonable.
  • The patient grouping model is based on the lambdaMART model, which during learning pays more attention to the order of the top-ranked candidate joint grouping schemes, so that when the trained patient grouping model is applied to grouping patients with multiple diseases it yields better grouping results and is better suited to precision medicine.
  • the patient grouping model construction solution can be implemented based on the network system architecture shown in Figure 1.
  • The network system architecture includes at least a user terminal, a server, and a database, which communicate through a wired or wireless network; the specific communication protocol is not limited.
  • the user terminal can be used to submit disease prevention and control guidelines, follow-up data of patients with joint diseases, etc. to the server through program codes or touch signals, so as to request the server to execute the relevant steps of constructing the patient grouping model.
  • The database can be used to store disease prevention and control guidelines and a large number of patients' demographic information, medical treatment data, follow-up data, and so on. Developers can use the user terminal to enter conditional query statements that extract the required information from the database, for example extracting the follow-up data of patients with hypertension and diabetes as sample data.
  • The database can be a database in the server, a database independent of the server, or a cloud database. The user terminal in this application can be a desktop computer, a tablet computer, a supercomputer, and so on.
  • the server can be a local server, a cloud server, or a server cluster, and so on.
  • FIG. 2 is a schematic flowchart of a method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 2, the method includes steps S21-S23:
  • S21 Obtain preset disease prevention and control guidelines, perform keyword recognition on the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease; then, according to the first knowledge grouping decision trees, obtain n first candidate joint grouping schemes for patients suffering from the combined disease.
  • the combined disease refers to a combination of at least two diseases, such as: diabetes + hypertension, diabetes + hypertension + heart disease, etc.
  • The disease prevention and control guidelines may be the guidelines corresponding to each disease in the combined disease, for example diabetes prevention guidelines, hypertension prevention guidelines, and heart disease prevention guidelines. They can be stored in the database and fetched by the server, or sent to the server by the developer through the user terminal.
  • The partition attribute set can be extracted from the disease prevention and control guidelines by keyword recognition, text processing, and similar techniques. For example, the partition attribute set for hypertension can be {age, blood pressure, glucose tolerance, ..., high salt, ankle/brachial blood pressure index}.
  • The first knowledge grouping decision tree is the knowledge grouping decision tree worked out for each disease in the combined disease during the model training stage; it can be constructed with the C4.5 algorithm by calculating the information gain ratio of each partition attribute in the partition attribute set.
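  • As an illustration of the information gain ratio that C4.5 uses to choose a partition attribute, the following is a minimal sketch. The toy attributes (`blood_pressure`, `age_band`), the group labels, and the function names are hypothetical and are not taken from the application; only discrete attributes are handled.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(attribute_values, labels):
    """C4.5 information gain ratio of one discrete attribute."""
    total = len(labels)
    # Group the labels by attribute value.
    partitions = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    # Information gain = entropy before the split - weighted entropy after it.
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(attribute_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two candidate partition attributes and the assigned group.
blood_pressure = ["high", "high", "normal", "normal"]
age_band = ["old", "young", "old", "young"]
group = ["B1", "B1", "B2", "B2"]

ratios = {
    "blood_pressure": gain_ratio(blood_pressure, group),
    "age_band": gain_ratio(age_band, group),
}
# The attribute with the highest gain ratio would be chosen for the split.
best_attribute = max(ratios, key=ratios.get)
```

  • In this toy data, `blood_pressure` separates the groups perfectly while `age_band` carries no information, so the former would be selected as the splitting attribute.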
  • The first candidate joint grouping schemes are obtained by the server, in the model training stage, by combining the grouping schemes under the first knowledge grouping decision trees.
  • The disease prevention and control guidelines contain treatment decision-making knowledge for the related diseases, such as treatment suggestions and drug suggestions. Working through the guidelines for each disease in the combined disease yields the first knowledge grouping decision tree corresponding to each disease; the first knowledge grouping decision trees are independent of each other.
  • For example, for a patient with diabetes and hypertension, if the selectable grouping schemes under the first knowledge grouping decision tree corresponding to diabetes are {A1, A2} and those under the first knowledge grouping decision tree corresponding to hypertension are {B1, B2}, then the patient's candidate joint grouping schemes are {A1+B1, A2+B1, A1+B2, A2+B2}. Combining in this way yields the n (multiple) first candidate joint grouping schemes for the patient.
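  • The combination step can be sketched as a Cartesian product over the per-disease grouping schemes. This is a minimal illustration; the function name `joint_schemes` and the dictionary layout are assumptions, not part of the application.

```python
from itertools import product


def joint_schemes(per_disease_schemes):
    """Combine one scheme per disease into every possible joint scheme."""
    return ["+".join(combo) for combo in product(*per_disease_schemes.values())]


# Selectable schemes under each disease's knowledge grouping decision tree.
schemes = {"diabetes": ["A1", "A2"], "hypertension": ["B1", "B2"]}
candidates = joint_schemes(schemes)
```

  • With two schemes per disease this produces four candidates; the count grows multiplicatively with each additional disease, which is why the candidates are ranked rather than enumerated by hand.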
  • S22 Obtain n pieces of sample data of the patient suffering from the combined disease, and generate an outcome label for each piece of sample data according to each index in each piece of sample data.
  • the sample data refers to the follow-up data of patients suffering from the combined disease.
  • Follow-up is an observation method in which the hospital regularly contacts patients it has treated, by correspondence or other means, to track changes in their condition and guide their recovery.
  • Usually, a patient has multiple follow-up visits.
  • The data from one follow-up visit can be used as a piece of sample data, and each piece of sample data has a corresponding first candidate joint grouping scheme.
  • Each piece of sample data includes the patient's demographic information, medication history for all diseases, test and examination indicators, doctor's prescriptions, and vital signs. Each category may contain multiple indicators, for example HbA1c (glycosylated hemoglobin) and BP (blood pressure) among the test indicators.
  • In the regression model, y represents the output, that is, whether complications have increased or the patient has died by the next follow-up.
  • X represents the input of the regression model, that is, the indicators in the sample data; $X_n$ represents the n-th input indicator.
  • $\theta$ represents the regression coefficients of the indicators; for example, $\theta_1$ represents the importance of indicator $X_1$.
  • That is, each regression coefficient is regarded as the importance of the corresponding indicator.
  • The outcome label is $\mathrm{effect}(i) = \mathrm{absolute}(i) \cdot \mathrm{relative}(i)$, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data, defined according to the importance of each indicator in that piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data, defined according to absolute(i).
  • For example, with HbA1c (glycosylated hemoglobin, the test indicator for diabetes) and BP (blood pressure, the test indicator for hypertension): $\mathrm{absolute}(i) = \theta_{\mathrm{HbA1c}} \cdot (\mathrm{HbA1c}(i) - \mathrm{HbA1c}(i+1)) + \theta_{\mathrm{BP}} \cdot (\mathrm{BP}(i) - \mathrm{BP}(i+1))$
  • Here $\theta_{\mathrm{HbA1c}}$ represents the importance of glycosylated hemoglobin, derived from the regression coefficient estimated in the above regression model, and $\theta_{\mathrm{BP}}$ represents the importance of blood pressure; HbA1c(i) and BP(i) are the glycosylated hemoglobin and blood pressure in the i-th piece of sample data, and HbA1c(i+1) and BP(i+1) are the glycosylated hemoglobin and blood pressure in the next piece of sample data.
  • $\mathrm{relative}(i) = \sum_{k \in N(p_i, d_i)} \mathrm{absolute}(k) \,/\, \sum_{j \in N(p_i)} \mathrm{absolute}(j)$, where $N(p_i)$ is the set of samples that the first knowledge grouping decision trees place in the same leaf node as sample i, and $N(p_i, d_i)$ is the subset of $N(p_i)$ whose actually adopted grouping scheme is the same as that of sample i.
  • Since each piece of sample data has a corresponding first candidate joint grouping scheme, the outcome label of each piece of sample data can be used to indicate the score of that sample's candidate joint grouping scheme.
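  • The outcome label computation can be sketched as follows, under stated assumptions: the importance weights in `IMPORTANCE`, the field names (`leaf`, `scheme`, `hba1c`, `next_hba1c`, and so on), and the sample values are all hypothetical, and a zero denominator is mapped to a relative outcome of 0, a choice the application does not specify.

```python
from collections import defaultdict

# Hypothetical importance weights; in the text they come from regression coefficients.
IMPORTANCE = {"hba1c": 0.5, "bp": 0.5}


def absolute_outcome(sample):
    """Importance-weighted improvement between this follow-up and the next one."""
    return sum(w * (sample[k] - sample["next_" + k]) for k, w in IMPORTANCE.items())


def outcome_labels(samples):
    """Return effect(i) = absolute(i) * relative(i) for every sample."""
    absolute = [absolute_outcome(s) for s in samples]
    leaf_total = defaultdict(float)         # sum of absolute over N(p_i)
    leaf_scheme_total = defaultdict(float)  # sum of absolute over N(p_i, d_i)
    for s, a in zip(samples, absolute):
        leaf_total[s["leaf"]] += a
        leaf_scheme_total[(s["leaf"], s["scheme"])] += a
    labels = []
    for s, a in zip(samples, absolute):
        denom = leaf_total[s["leaf"]]
        # Assumption: treat an empty or zero denominator as relative = 0.
        relative = leaf_scheme_total[(s["leaf"], s["scheme"])] / denom if denom else 0.0
        labels.append(a * relative)
    return labels


# Three hypothetical follow-up samples that fall in the same decision-tree leaf.
samples = [
    {"leaf": "L1", "scheme": "A1+B1", "hba1c": 8.0, "next_hba1c": 7.0, "bp": 150.0, "next_bp": 140.0},
    {"leaf": "L1", "scheme": "A1+B1", "hba1c": 8.0, "next_hba1c": 8.0, "bp": 150.0, "next_bp": 145.0},
    {"leaf": "L1", "scheme": "A2+B1", "hba1c": 8.0, "next_hba1c": 7.0, "bp": 150.0, "next_bp": 147.0},
]
effects = outcome_labels(samples)
```

  • In this toy leaf, the samples that adopted A1+B1 account for most of the total improvement, so their absolute outcomes are scaled by a larger relative outcome than the sample that adopted A2+B1.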
  • The lambdaMART model was originally a method for ranking documents in information retrieval: when the user issues a query, the candidate documents are ranked.
  • Here, the demographic information, test and examination indicators, and medication history in each piece of sample data are used as the query, and the first candidate joint grouping schemes are used as the documents.
  • Each query-document pair has an outcome label.
  • For each document, first calculate the lambda value, train a regression tree with the lambda values as labels, and calculate the output score (the predicted score) from the regression result predicted at each leaf node of the tree. Use this to predict the score of each piece of sample data with an outcome label, and sort the first candidate joint grouping schemes corresponding to the sample data by score. Then return to the step of calculating the lambda value and repeat the steps of training a regression tree, predicting scores, and sorting, so that the regression trees accumulate into an ensemble (a forest of trees). Training stops once any one of the preset convergence conditions is met, which yields the required patient grouping model.
  • The convergence conditions are: the number of regression trees reaches the preset limit, or the ensemble's result on the validation set no longer improves.
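  • The lambda value for each document can be sketched in the usual LambdaRank form, where each pair of documents contributes a gradient weighted by the change in NDCG that swapping them would cause. This is a minimal sketch of that general technique, not the application's exact computation; the integer labels, the scores, the exponential gain, and the `sigma` parameter are assumptions made for illustration.

```python
import math


def dcg_discount(rank):
    """Positional discount for a 0-based rank."""
    return 1.0 / math.log2(rank + 2)


def lambda_gradients(scores, labels, sigma=1.0):
    """LambdaRank-style lambdas for the documents of one query."""
    n = len(scores)
    # Current rank of each document when sorted by descending score.
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {doc: pos for pos, doc in enumerate(order)}
    gains = [2.0 ** label - 1.0 for label in labels]
    ideal = sum(g * dcg_discount(r) for r, g in enumerate(sorted(gains, reverse=True)))
    lambdas = [0.0] * n
    if ideal == 0:
        return lambdas
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where i should rank above j
            # |delta NDCG| if documents i and j swapped positions.
            delta = abs((gains[i] - gains[j])
                        * (dcg_discount(rank[i]) - dcg_discount(rank[j]))) / ideal
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * rho * delta  # push the better document up
            lambdas[j] -= sigma * rho * delta  # push the worse document down
    return lambdas


# Hypothetical query with four candidate joint schemes.
labels = [0, 2, 1, 3]           # outcome grades for A1+B1, A2+B1, A1+B2, A2+B2
scores = [0.9, 0.1, 0.5, 0.2]   # current model scores (poorly ordered)
lams = lambda_gradients(scores, labels)
```

  • Each boosting round would then fit a regression tree to these lambdas and add it to the ensemble. Because swaps near the top of the list change NDCG the most, the top-ranked schemes receive the largest corrections.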
  • In summary, the embodiment of the application obtains preset disease prevention and control guidelines, performs keyword recognition on the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculates the information gain ratio of each partition attribute in the partition attribute set to generate the first knowledge grouping decision tree of each disease; according to the first knowledge grouping decision trees, it obtains n first candidate joint grouping schemes for patients with the combined disease; it obtains n pieces of sample data of patients with the combined disease and generates an outcome label for each piece of sample data according to the indicators in it; and it trains the lambdaMART model using the sample data with outcome labels and the first candidate joint grouping schemes to obtain the constructed patient grouping model.
  • Instead of considering the grouping scheme of a single disease, the schemes for multi-disease joint grouping are worked out, taking into account the interactions between different grouping decisions.
  • The outcome label of the sample data considers not only the absolute outcome but also the relative outcome, which to some extent eliminates the problem that biased samples are difficult to learn when only the absolute outcome is used.
  • The lambdaMART model is used for training, so the resulting patient grouping model attends both to each first candidate joint grouping scheme itself and to the priority order among the first candidate joint grouping schemes, which helps improve the effect of grouping patients with multiple diseases.
  • FIG. 3 is a schematic flowchart of another method for constructing a patient grouping model provided by an embodiment of the application. As shown in FIG. 3, it includes steps S31-S35:
  • the foregoing obtaining the importance of each indicator in each piece of sample data includes:
  • The logarithmic loss is minimized by gradient descent to estimate the regression coefficients $\theta_0, \theta_1, \theta_2, \ldots, \theta_n$, which gives the importance of each indicator.
  • The regression coefficients $\theta$ are used as the importance of each indicator in the sample data, which facilitates the subsequent definition of the absolute and relative outcomes.
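  • Estimating the coefficients by gradient descent on the logarithmic loss can be sketched as a plain logistic regression. This is a minimal sketch under assumptions: the application does not spell out the model form, so the sigmoid link, the learning rate, the epoch count, and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Estimate [theta_0, theta_1, ..., theta_n] by batch gradient descent on log loss."""
    theta = [0.0] * (len(X[0]) + 1)  # theta[0] is the intercept
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for row, target in zip(X, y):
            pred = sigmoid(theta[0] + sum(t * x for t, x in zip(theta[1:], row)))
            err = pred - target  # derivative of log loss w.r.t. the linear term
            grad[0] += err
            for k, x in enumerate(row):
                grad[k + 1] += err * x
        theta = [t - lr * g / m for t, g in zip(theta, grad)]
    return theta


# Hypothetical standardized indicators: the first drives the outcome, the second
# is uninformative by construction (each value of it appears with both signs).
X = [[-1.0, 0.5], [-1.0, -0.5], [-0.5, 0.5], [-0.5, -0.5],
     [0.5, 0.5], [0.5, -0.5], [1.0, 0.5], [1.0, -0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = complication or death by the next follow-up
theta = fit_logistic(X, y)
importance = [abs(t) for t in theta[1:]]
```

  • Comparing coefficients as importances is only meaningful when the indicators are on a common scale, which is why the toy inputs are standardized.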
  • the foregoing generating an outcome label for each piece of sample data based on the importance of each indicator in each piece of sample data includes:
  • Use the formula $\mathrm{effect}(i) = \mathrm{absolute}(i) \cdot \mathrm{relative}(i)$ to generate an outcome label for each piece of sample data, where effect(i) represents the outcome label of the i-th piece of sample data; absolute(i) represents the absolute outcome of the i-th piece of sample data, defined according to the importance of each indicator in that piece of sample data; and relative(i) represents the relative outcome of the i-th piece of sample data, defined according to absolute(i).
  • In this way, an outcome label is generated for each piece of sample data.
  • The outcome label considers not only the absolute outcome but also the relative outcome, which addresses the lack of objectivity caused by considering only the absolute outcome and helps reduce the learning difficulty of the patient grouping model.
  • steps S31-S35 have been described in detail in the embodiment shown in FIG. 2, and in order to avoid repetition, they will not be repeated here.
  • FIG. 5 is a schematic flowchart of a patient grouping method provided by an embodiment of the application.
  • The patient grouping method can also be implemented based on the network system architecture shown in FIG. 1. As shown in FIG. 5, it specifically includes steps S51-S54:
  • S51 Receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;
  • The patient grouping request is used to request the server to group the patient to be grouped.
  • The patient to be grouped has the same combined disease as the sample patients in the model training stage, for example a patient with diabetes and hypertension.
  • The patient grouping request can include the combined disease that the patient to be grouped suffers from. It can also include the prevention and treatment guidelines for each disease in the combined disease, basic information about the patient to be grouped, diagnosis information, and so on.
  • The user terminal can be a terminal used by medical staff, a terminal in a medical research laboratory, a terminal of staff at a medical and health enterprise, and so on. For example, medical staff can send the patient grouping request to the server through the user terminal after diagnosing the patient to be grouped.
  • S52 Obtain a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and obtain a second candidate joint grouping solution for the patient to be grouped according to the second knowledge grouping decision tree;
  • The second knowledge grouping decision tree is the knowledge grouping decision tree generated, during the use stage of the patient grouping model, by working through the disease prevention and control guidelines with keyword recognition and information gain ratio calculation; the grouping schemes under the second knowledge grouping decision trees are combined to obtain the second candidate joint grouping schemes.
  • S53 Input the second candidate joint grouping scheme into a pre-trained patient grouping model for sorting, and obtain a sorting result of the second candidate joint grouping scheme;
  • The patient grouping model uses the trained regression trees to predict the score of each second candidate joint grouping scheme and ranks the schemes by this score: a second candidate joint grouping scheme with a higher score is ranked higher, and one with a lower score is ranked lower.
  • The preset number of second candidate joint grouping schemes can be set according to actual conditions; it may be the top-ranked second candidate joint grouping scheme or the top three, and is not specifically limited.
  • For example, the second candidate joint grouping schemes of the patient to be grouped are A1+B1, A2+B1, A1+B2, A2+B2, and their ranking result is: A2+B1, A2+B2, A1+B1, A1+B2
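  • Selecting the preset number of schemes from the ranking can be sketched as a sort followed by a slice. The scores below are hypothetical values chosen only to reproduce the ranking in the example above; the function name `top_k_schemes` is likewise an assumption.

```python
def top_k_schemes(scored_schemes, k):
    """Sort candidate schemes by descending predicted score and keep the first k."""
    ranked = sorted(scored_schemes.items(), key=lambda item: item[1], reverse=True)
    return [scheme for scheme, _ in ranked[:k]]


# Hypothetical predicted scores from the patient grouping model.
predicted = {"A1+B1": 0.4, "A2+B1": 0.9, "A1+B2": 0.1, "A2+B2": 0.7}
full_ranking = top_k_schemes(predicted, k=len(predicted))
best = top_k_schemes(predicted, k=1)
```

  • With a preset number of 1 only A2+B1 would be returned to the user terminal; with 3, the first three schemes of the ranking would be returned.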
  • This can be implemented as shown in FIG. 6: the diabetes prevention guidelines and the hypertension prevention and control guidelines are worked through to produce the diabetes knowledge grouping decision tree and the hypertension knowledge grouping decision tree respectively; multiple second candidate joint grouping schemes are obtained from the two knowledge grouping decision trees and input into the patient grouping model for score prediction and ranking; and finally the top-k second candidate joint grouping schemes are output.
  • Because the patient grouping model constructed in the embodiment shown in FIG. 2 or FIG. 3 is used for prediction and ranking, this helps improve the grouping effect for patients with multiple combined diseases and is better suited to precision medicine.
  • an embodiment of the present application also provides a device for constructing a patient grouping model.
  • The device for constructing a patient grouping model may be a computer program (including program code) running in a terminal.
  • the device for constructing a patient grouping model can execute the method shown in FIG. 2 or FIG. 3. Please refer to Figure 7, the device includes:
  • the first clustering scheme acquisition module 71 is used to acquire preset disease prevention and control guidelines, perform keyword recognition on the guidelines to obtain the partition attribute set of each disease in the combined disease, and calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and, according to the first knowledge grouping decision trees, to obtain n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the outcome label generation module 72 is configured to obtain n pieces of sample data of patients suffering from the combined disease, and to generate an outcome label for each piece of sample data according to the indicators in that piece of sample data; the pieces of sample data correspond one-to-one with the first candidate joint grouping schemes, the outcome label of each piece of sample data indicates the score of the corresponding first candidate joint grouping scheme, and the outcome label includes an absolute outcome and a relative outcome;
  • the clustering model training module 73 is configured to train the lambdaMART model by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • the device includes:
  • the grouping request obtaining module 81 is configured to receive a patient grouping request submitted by the user terminal; the patient grouping request includes at least two diseases of the patient to be grouped;
  • the second clustering scheme acquisition module 82 is configured to acquire a second knowledge grouping decision tree for each disease that the patient to be grouped suffers from, and to obtain the second candidate joint grouping schemes of the patient to be grouped according to the second knowledge grouping decision trees;
  • the clustering scheme ranking module 83 is configured to input the second candidate joint clustering plan into a pre-trained patient clustering model for ranking, and obtain a ranking result of the second candidate joint clustering plan;
  • the grouping result output module 84 is configured to select a preset number of the second candidate joint grouping solutions as the grouping results of the patients to be grouped and return to the user terminal according to the sorting result of the second candidate joint grouping solutions.
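
The flow through modules 82 to 84 can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: it assumes a joint scheme is one group per disease, that candidates are formed as the Cartesian product of per-disease groups, and it replaces the trained ranking model with a lookup of made-up scores. All names and values are hypothetical.

```python
from itertools import product


def candidate_joint_schemes(per_disease_groups):
    """Combine one group per disease into candidate joint grouping schemes."""
    diseases = sorted(per_disease_groups)
    return [
        dict(zip(diseases, combo))
        for combo in product(*(per_disease_groups[d] for d in diseases))
    ]


def top_schemes(candidates, score_fn, preset_number):
    """Rank candidates by descending model score and keep the first few."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:preset_number]


# Hypothetical groups that each disease's decision tree might offer for a patient.
groups = {
    "diabetes": ["D1", "D2"],
    "hypertension": ["H1", "H2"],
}

# Stand-in for the pre-trained ranking model: a fixed table of made-up scores.
toy_scores = {("D1", "H1"): 0.2, ("D1", "H2"): 0.9, ("D2", "H1"): 0.5, ("D2", "H2"): 0.1}


def score(scheme):
    return toy_scores[(scheme["diabetes"], scheme["hypertension"])]


candidates = candidate_joint_schemes(groups)
result = top_schemes(candidates, score, preset_number=2)
```

Here `result` holds the two highest-scoring schemes, which is what module 84 would return to the user terminal when the preset number is 2.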
  • each unit in the patient grouping model construction device and the patient grouping device shown in FIG. 7 and FIG. 8 can be separately or entirely combined into one or several other units, or one or more of the units can be further divided into multiple functionally smaller units; either arrangement can realize the same operations without affecting the technical effects of the embodiments of the present application.
  • the above-mentioned units are divided based on logical functions.
  • the function of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit.
  • the patient grouping model construction device and the patient grouping device may also include other units. In practical applications, these functions can also be implemented with the assistance of other units, and can be implemented by multiple units in cooperation.
  • a computer program (including program code) capable of executing the steps of the corresponding method shown in FIG. 2, FIG. 3, or FIG. 5 can be run on a general-purpose computing device, such as a computer that includes a central processing unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and other processing and storage elements, to construct the patient grouping model construction device shown in FIG. 7 or the patient grouping device shown in FIG. 8.
  • the computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above-mentioned computing device through that medium, and run there.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • the electronic device includes at least a processor 901, an input device 902, an output device 903, and a computer-readable storage medium 904.
  • the processor 901, the input device 902, the output device 903, and the computer-readable storage medium 904 in the electronic device may be connected by a bus or other methods.
  • the computer-readable storage medium 904 may be stored in the memory of the electronic device.
  • the computer-readable storage medium 904 is used to store a computer program.
  • the computer program includes program instructions.
  • the processor 901 is used to execute the program instructions stored in the computer-readable storage medium 904.
  • the processor 901 (or CPU, central processing unit) is the computing core and control core of the electronic device. It is suitable for implementing one or more instructions, and specifically for loading and executing one or more instructions to realize the corresponding method flow or function.
  • the processor 901 of the electronic device provided in the embodiment of the present application may be used to construct a series of patient grouping models, including:
  • obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the lambdaMART model is trained using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
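
LambdaMART is a listwise learning-to-rank method whose gradients are weighted by how much a ranking metric changes when two items swap places. The sketch below computes NDCG and that swap delta for one list of candidates. It assumes NDCG is the metric and that outcome labels are small non-negative integers; the application does not state either, and the function names are hypothetical.

```python
import math


def dcg(labels_in_order):
    """Discounted cumulative gain with the common (2^label - 1) gain."""
    return sum(
        (2 ** label - 1) / math.log2(rank + 2)
        for rank, label in enumerate(labels_in_order)
    )


def ndcg(labels_in_order):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(labels_in_order, reverse=True))
    return dcg(labels_in_order) / ideal if ideal > 0 else 0.0


def swap_delta(labels_in_order, i, j):
    """Absolute NDCG change if the items at positions i and j trade places."""
    swapped = list(labels_in_order)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(ndcg(swapped) - ndcg(labels_in_order))


# Outcome labels of three candidate schemes, listed in the order the model ranked them.
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
delta_top_bottom = swap_delta([1, 2, 3], 0, 2)
```

A larger `swap_delta` means that correcting the relative order of that pair matters more to the metric, so the pair contributes a larger gradient during boosting.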
  • the processor 901 generating an outcome label for each piece of sample data according to the indicators in each piece of sample data includes: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
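
One simple way to turn indicator importance into an outcome label is an importance-weighted average. The sketch below is an assumption for illustration only: the application does not give the formula, and the indicator names, the [0, 1] scaling, and the weights are all hypothetical.

```python
def outcome_score(indicators, importance):
    """Importance-weighted average of indicator values scaled to [0, 1]."""
    total_weight = sum(importance[name] for name in indicators)
    if total_weight == 0:
        raise ValueError("importance weights must not all be zero")
    weighted = sum(importance[name] * value for name, value in indicators.items())
    return weighted / total_weight


# Hypothetical indicators for one piece of sample data (1.0 = best possible control).
sample = {"blood_glucose_control": 0.8, "blood_pressure_control": 0.5}
# Hypothetical importance of each indicator.
weights = {"blood_glucose_control": 3.0, "blood_pressure_control": 1.0}

label = outcome_score(sample, weights)
```

With these made-up numbers the more important glucose indicator dominates, so the label lands closer to 0.8 than to 0.5.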
  • the processor 901 executes the training of the lambdaMART model using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model, including:
  • the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
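
The two convergence conditions can be read as a tree budget plus early stopping on a validation set. The loop below is a minimal sketch of that reading; `add_tree`, `validation_score`, and `patience` are hypothetical names, and treating "no longer updated" as "validation score stops improving" is an assumption.

```python
def train_until_converged(add_tree, validation_score, max_trees, patience=1):
    """Add regression trees until the preset tree count is reached or the
    validation score has failed to improve for `patience` consecutive rounds."""
    best, stale, n_trees = float("-inf"), 0, 0
    while n_trees < max_trees:
        add_tree()                      # fit and append one more regression tree
        n_trees += 1
        score = validation_score()      # evaluate the ensemble on held-out data
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return n_trees, best


# Toy run: made-up validation scores stand in for a real model.
scores = iter([0.50, 0.60, 0.65, 0.65, 0.70])
trees = []
n_trees, best_score = train_until_converged(
    lambda: trees.append("tree"), lambda: next(scores), max_trees=10
)
```

In the toy run training stops at the fourth tree, because the validation score repeats 0.65 and the default patience is one round; a larger `patience` would tolerate short plateaus.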
  • the processor 901 of the electronic device executes the computer program to implement the steps in the method for constructing a patient grouping model; the embodiments of the method for constructing a patient grouping model are therefore all applicable to the electronic device and can achieve the same or similar beneficial effects.
  • FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the application.
  • the electronic device includes at least a processor 1001, an input device 1002, an output device 1003, and a computer-readable storage medium 1004.
  • the processor 1001, the input device 1002, the output device 1003, and the computer-readable storage medium 1004 in the electronic device may be connected by a bus or other methods.
  • the processor 1001 of the electronic device provided in the embodiment of the present application may be used to perform a series of patient grouping processing, including:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • the processor 1001 of the electronic device executes the computer program to implement the steps in the above-mentioned patient grouping method; the embodiments of the above-mentioned patient grouping method are therefore all applicable to the electronic device and can achieve the same or similar beneficial effects.
  • the above-mentioned patient grouping method and patient grouping model construction method can be executed by the same electronic device, or can be executed by different electronic devices, which is not limited in the embodiment of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium (Memory).
  • the computer-readable storage medium is a memory device in an electronic device for storing programs and data. It can be understood that the computer-readable storage medium herein may include a built-in storage medium in the terminal, and of course, may also include an extended storage medium supported by the terminal.
  • the computer-readable storage medium provides storage space, and the storage space stores the operating system of the terminal.
  • one or more instructions suitable for being loaded and executed by the processor 901 are stored in the storage space, and these instructions may be one or more computer programs (including program codes).
  • the computer-readable storage medium here can be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory; optionally, it can also be at least one computer-readable storage medium located far away from the aforementioned processor 901.
  • the processor 901 can load and execute one or more instructions stored in a computer-readable storage medium to implement the following steps:
  • obtain preset disease prevention and control guidelines, identify keywords in the guidelines to obtain the partition attribute set of each disease in the combined disease, calculate the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease in the combined disease, and obtain, according to the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease;
  • the lambdaMART model is trained by using the sample data with the outcome label and the first candidate joint clustering scheme to obtain a constructed patient clustering model.
  • when one or more instructions in the computer-readable storage medium are loaded by the processor 901, the following steps are performed: obtaining the importance of each indicator in each piece of sample data; and generating an outcome label for each piece of sample data based on the importance of each indicator in that piece of sample data.
  • the preset convergence conditions include: the number of regression trees reaches the preset parameter setting, or the random forest is no longer continuously updated on the validation set.
  • the embodiment of the present application also provides a computer-readable storage medium (Memory).
  • the processor 1001 can load and execute one or more instructions stored in the computer-readable storage medium to implement the following steps:
  • the patient grouping request includes at least two diseases of the patient to be grouped;
  • a preset number of the second candidate joint grouping schemes are selected as the grouping result of the patient to be grouped and returned to the user terminal.
  • the computer program of the computer-readable storage medium includes computer program code; the computer program code may be in the form of source code, object code, an executable file, or some intermediate form; the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.

Abstract

Disclosed are a patient grouping model construction method, a patient grouping method, and a related device. The patient grouping model construction method comprises: acquiring a disease prevention and control guideline, performing keyword recognition on the guideline to obtain a partition attribute set for each disease in a combined disease, calculating the information gain ratio of each partition attribute in the partition attribute set to generate a first knowledge grouping decision tree for each disease, and obtaining, on the basis of the first knowledge grouping decision tree, n first candidate joint grouping schemes for patients suffering from the combined disease (S21); acquiring n pieces of sample data of the patients suffering from the combined disease, and generating an outcome label for each piece of sample data on the basis of the indicators in that piece of sample data (S22); and using the sample data with the outcome labels and the first candidate joint grouping schemes to train a LambdaMART model, yielding a constructed patient grouping model (S23). Using the patient grouping model helps improve the grouping of patients with multiple diseases.
PCT/CN2020/099530 2020-05-13 2020-06-30 Patient grouping model construction method, patient grouping method, and related device WO2021114635A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010404637.2 2020-05-13
CN202010404637.2A CN111696661A (zh) 2020-05-13 2020-05-13 Patient grouping model construction method, patient grouping method, and related device

Publications (1)

Publication Number Publication Date
WO2021114635A1 (fr) 2021-06-17

Family

ID=72477306

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/099530 WO2021114635A1 (fr) 2020-05-13 2020-06-30 Patient grouping model construction method, patient grouping method, and related device

Country Status (2)

Country Link
CN (1) CN111696661A (fr)
WO (1) WO2021114635A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
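CN112819527B (zh) * 2021-01-29 2024-05-24 百果园技术(新加坡)有限公司 User grouping processing method and device
CN112883654B (zh) * 2021-03-24 2023-01-31 国家超级计算天津中心 Data-driven model training system
CN113724061A (zh) * 2021-08-18 2021-11-30 杭州信雅达泛泰科技有限公司 Consumer finance product credit scoring method and device based on customer grouping
CN113724815B (zh) * 2021-08-30 2024-06-21 深圳平安智慧医健科技有限公司 Information push method and device based on a decision grouping model
CN113782192A (zh) * 2021-09-30 2021-12-10 平安科技(深圳)有限公司 Grouping model construction method based on causal inference and medical data processing method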

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180322660A1 (en) * 2017-05-02 2018-11-08 Techcyte, Inc. Machine learning classification and training for digital microscopy images
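CN109243618A (zh) * 2018-09-12 2019-01-18 腾讯科技(深圳)有限公司 Medical model construction method, disease label construction method, and smart device
CN109801705A (zh) * 2018-12-12 2019-05-24 平安科技(深圳)有限公司 Treatment recommendation method, system, device, and storage medium
CN110164519A (zh) * 2019-05-06 2019-08-23 北京工业大学 Classification method for processing mixed electronic medical record data based on a crowd intelligence network
CN110363226A (zh) * 2019-06-21 2019-10-22 平安科技(深圳)有限公司 Random-forest-based ophthalmic disease classification and recognition method, device, and medium
CN110929752A (zh) * 2019-10-18 2020-03-27 平安科技(深圳)有限公司 Knowledge-driven and data-driven grouping method and related device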

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
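CN116759042A (zh) * 2023-08-22 2023-09-15 之江实验室 Counterfactual medical data generation system and method based on cycle consistency
CN116759042B (zh) * 2023-08-22 2023-12-22 之江实验室 Counterfactual medical data generation system and method based on cycle consistency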

Also Published As

Publication number Publication date
CN111696661A (zh) 2020-09-22

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20899600; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20899600; Country of ref document: EP; Kind code of ref document: A1)