WO2021180244A1 - Disease risk prediction system, method and apparatus, device and medium - Google Patents

Publication number
WO2021180244A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
risk
diagnosis
treatment data
risk factor
Prior art date
Application number
PCT/CN2021/084030
Other languages
French (fr)
Chinese (zh)
Inventor
陈天歌
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021180244A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a disease risk prediction system, method, device, equipment, and medium.
  • the embodiments of the present application provide a disease risk prediction system, method, device, equipment, and medium, which help to improve the disease risk prediction effect.
  • an embodiment of the present application provides a disease risk prediction system, including: a risk prediction device and a storage device; wherein the storage device is used to store diagnosis and treatment data of a user;
  • the risk prediction device is used to perform the following steps:
  • a plurality of second risk factors are screened out from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the target diagnosis and treatment data of the target user is obtained, the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • the embodiments of the present application provide a disease risk prediction method, including:
  • a plurality of second risk factors are screened out from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the target diagnosis and treatment data of the target user is obtained, the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • an embodiment of the present application provides a disease risk prediction device, including:
  • the acquisition module is used to acquire the diagnosis and treatment data corresponding to the target diseases of multiple users;
  • a determining module configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data
  • the processing module is configured to screen out a plurality of second risk factors from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and to determine the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the acquisition module is also used to acquire target diagnosis and treatment data of the target user
  • the processing module is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
  • an embodiment of the present application provides a risk prediction device.
  • the risk prediction device may include a processor and a memory, and the processor and the memory are connected to each other.
  • the memory is used to store a computer program that supports the terminal device to execute the above-mentioned methods or steps, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the following methods:
  • a plurality of second risk factors are screened out from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the target diagnosis and treatment data of the target user is obtained, the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • embodiments of the present application provide a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to execute the following method:
  • a plurality of second risk factors are screened out from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the target diagnosis and treatment data of the target user is obtained, the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • the embodiments of the present application can apply integer optimization algorithms to achieve control of the number of risk factors and optimize prediction results by setting integer constraint conditions and a two-norm-based objective function, thereby helping to improve the prediction effect of disease risk.
  • Figure 1 is a schematic structural diagram of a disease risk prediction system provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a disease risk prediction method provided by an embodiment of the present application
  • Fig. 3 is a schematic structural diagram of a disease risk prediction device provided by an embodiment of the present application.
  • Fig. 4 is a schematic structural diagram of a risk prediction device provided by an embodiment of the present application.
  • the technical solution of the present application can be applied to a disease risk prediction system, and can be specifically applied to a risk prediction device (risk prediction device) to realize disease risk prediction.
  • the risk prediction device may be a terminal, a server, or a data platform or other devices.
  • the terminal may include a mobile phone, a tablet computer, a computer, etc., which is not limited in this application. It can be understood that, in other embodiments, the terminal may also be called other names, such as terminal equipment, smart terminal, user equipment, user terminal, etc., which are not listed here.
  • this application can determine multiple risk factors corresponding to the target disease according to the diagnosis and treatment data of multiple users, then screen multiple risk factors out of them and determine the coefficient of each screened risk factor, where each coefficient is an integer determined from a set of integers. Then, by obtaining the target diagnosis and treatment data of the target user, the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor can be determined from the coefficients determined above, and the target user's risk prediction result for the target disease can be determined based on the target risk factors and their coefficients. In this way, the prediction requirements of risk prediction can be met by setting integer constraints, and the number of risk factors can be controlled.
  • this application can combine model algorithms and apply integer optimization algorithms, setting integer constraints to meet the prediction requirements of risk prediction, and solving the model and controlling the number of risk factors by optimizing a two-norm-based objective function, so as to achieve reliable prediction of disease risk and improve the effect of disease risk prediction.
  • the technical solution of this application can be applied to the fields of artificial intelligence, smart city, blockchain and/or big data technology.
  • it can be realized through a data platform or other equipment.
  • the data involved can be stored through blockchain nodes or in a database, which is not limited in this application.
  • the embodiments of the present application provide a disease risk prediction system, method, device, equipment, medium, etc., so as to help improve the disease risk prediction effect. Detailed descriptions are given below.
  • FIG. 1 is a schematic structural diagram of a disease risk prediction system provided by an embodiment of the present application.
  • the disease risk prediction system may include a risk prediction device 101 and a storage device 102, wherein:
  • the storage device 102 can be used to store the user's diagnosis and treatment data
  • the risk prediction device 101 can be used to perform the following steps:
  • according to a two-norm objective function including the plurality of first risk factors, a plurality of second risk factors are screened out from the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine the risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the target diagnosis and treatment data of the target user is obtained, the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • the storage device 102 may also be used to store other data related to the present application, such as various risk factors and coefficients of risk factors, and so on.
  • the storage device and the risk prediction device may be independent devices, that is, independently deployed, or the storage device and the risk prediction device may also be deployed in the same device, which is not limited in this application, and FIG. 1 only shows Independent deployment scenario.
  • the storage device and the risk prediction device may be deployed in a server, or in other words, the storage device may be deployed in a risk prediction device.
  • the diagnosis and treatment data may include physical sign data, examination data, and so on.
  • the user may be a patient suffering from a target disease, for example, it may be referred to as a target patient.
  • the diagnosis and treatment data corresponding to different diseases can be different.
  • the collected diagnosis and treatment data can be determined according to the target disease, and then the diagnosis and treatment data corresponding to the target disease can be obtained; or, in some embodiments, the diagnosis and treatment data corresponding to different diseases can be the same, for example, all the diagnosis and treatment data of a user (such as the target patient) within a preset time range can be collected as the diagnosis and treatment data corresponding to the target disease.
  • the collected diagnosis and treatment data may be determined according to the target disease type, or the diagnosis and treatment data may be all the diagnosis and treatment data of the target user, or may be the diagnosis and treatment data within a preset time period (such as within the last year).
  • the data can be extracted from the monitoring system, and the storage device is a storage device in the monitoring system, or the data can be stored in the storage device after being extracted by the monitoring system, which is not limited in this application.
  • the diagnosis and treatment data may be obtained by processing the collected original medical (diagnosis and treatment) data, and the processing includes sampling, filling in missing values, and so on.
  • the patient's original medical data can be obtained, including the patient's historical baseline data.
  • the historical baseline data can include multiple visit records, and each visit record may include various diagnoses, tests, examinations, medications, and surgical items.
  • the historical baseline data can be preprocessed.
  • the physical sign data can be obtained by sampling the collected original physical sign data in a preset time unit (for example, in units of 1 h), and the original physical sign data can be continuous data.
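The sampling and missing-value filling described above can be sketched as follows. This is a minimal illustration only: the function name `sample_hourly`, the use of the hourly mean, and the carry-forward fill rule are assumptions and are not specified in the application.

```python
from statistics import mean


def sample_hourly(readings, hours):
    """Resample continuous readings into 1 h units.

    readings: list of (seconds_since_start, value) pairs (hypothetical layout).
    Returns one value per hour: the mean of that hour's readings, with empty
    hours filled by carrying the last observed value forward (None if no
    value has been observed yet).
    """
    buckets = [[] for _ in range(hours)]
    for seconds, value in readings:
        index = int(seconds // 3600)
        if 0 <= index < hours:
            buckets[index].append(value)

    sampled, last = [], None
    for bucket in buckets:
        if bucket:
            last = mean(bucket)
        sampled.append(last)  # missing hours reuse the previous value
    return sampled


# Hypothetical heart-rate readings: two in hour 0, none in hours 1-2, one in hour 3.
heart_rate = [(0, 80), (1800, 84), (3 * 3600 + 10, 90)]
hourly = sample_hourly(heart_rate, 4)
```

Other fill strategies (for example, a population median) would fit the same interface; the application does not state which one is used.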
  • the diagnosis and treatment data can be text data, or vectors, such as binary features, or called two-dimensional feature vectors, and so on.
  • the outcome data corresponding to the diagnosis and treatment data may also be obtained, and the outcome data may be used to indicate the health status of the user.
  • the outcome data can also be called outcome, clinical outcome or other names, and this application does not make any restrictions.
  • the outcome data may be the discharge diagnosis data corresponding to the patient's record of each visit, such as death, aggravation of the disease, occurrence of complications, diagnosis of the target disease, and so on.
  • the processing of the outcome data can be similar to that of the diagnosis and treatment data, which will not be repeated here.
  • the outcome data may be text data, or a vector, such as a binary feature, or a two-dimensional feature vector, and so on.
  • the diagnosis and treatment data may include age, systolic blood pressure, and cardiac function classification Killip.
  • the outcome data can be death or other outcomes.
  • the risk prediction result may include a prediction score of the target user for the target disease, and the prediction score may be the sum of the coefficients (weights) of the target risk factors.
  • the target risk factor corresponding to the target diagnosis and treatment data may be a part or all of the multiple second risk factors that are screened out.
  • the prediction score may also be obtained after processing the coefficients of the target risk factor.
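As a minimal sketch of the scoring rule above, the prediction score can be computed as the sum of the integer coefficients of the risk factors that are present for the target user. The factor names and coefficient values below are hypothetical and chosen only for illustration.

```python
def prediction_score(binary_features, coefficients):
    """Sum the integer coefficients of the risk factors whose feature is 1."""
    return sum(
        coefficients[name]
        for name, present in binary_features.items()
        if present and name in coefficients
    )


# Hypothetical screened risk factors and their integer coefficients.
coefficients = {"age_ge_75": 3, "sbp_lt_100": 2, "killip_ge_2": 4}

# Hypothetical target user: two of the three risk factors are present.
patient = {"age_ge_75": 1, "sbp_lt_100": 0, "killip_ge_2": 1}
score = prediction_score(patient, coefficients)
```

Because the coefficients are integers, a score of this kind can also be added up by hand from a printed table of risk factors.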
  • the target disease when the target disease is an infectious disease, the target disease may be determined according to the incidence of the target disease in the area where the target user is located. The coefficient of the risk factor is weighted to obtain the prediction score, etc., which is not limited in this application. As a result, the accuracy and reliability of the determined disease risk prediction score can be improved.
  • the higher the incidence of the target disease in the area where the target user is located the larger the weighting coefficient can be set; on the contrary, the lower the incidence of the target disease in the area where the target user is located, the smaller the weighting coefficient can be set.
  • for example, when the incidence of the target disease in the area where the target user is located is higher than the average incidence of each area, a preset first weighting coefficient is used for weighting; when the incidence of the target disease in the area where the target user is located is lower than the average incidence of each area, a preset second weighting coefficient is used for weighting, and the first weighting coefficient is greater than the second weighting coefficient.
  • the risk prediction device can obtain the target incidence of the target disease in the target area where the target user is located, and compare the target incidence with the average incidence of the target disease. If the target incidence is higher than the average incidence and the difference between the two exceeds a threshold, the coefficients of one or more target risk factors can be weighted (for example, multiplied by a coefficient greater than 1), or the sum of the coefficients of the target risk factors can be weighted, or a risk score can be added to the original prediction score, to obtain the prediction score. This helps to further improve the reliability of disease risk prediction.
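The incidence-based adjustment can be sketched as below. The gap threshold and the two weighting coefficients are hypothetical values; the application only requires that the first weighting coefficient be greater than the second.

```python
def adjust_score(base_score, target_incidence, average_incidence,
                 gap_threshold=0.01, first_weight=1.2, second_weight=1.0):
    """Weight the summed coefficients according to regional incidence.

    The larger first_weight applies when the regional incidence exceeds the
    average by more than gap_threshold; otherwise second_weight applies.
    All default values are assumptions for illustration.
    """
    if target_incidence - average_incidence > gap_threshold:
        return base_score * first_weight
    return base_score * second_weight


high_region = adjust_score(10, target_incidence=0.05, average_incidence=0.02)
low_region = adjust_score(10, target_incidence=0.02, average_incidence=0.02)
```

This sketch weights the sum of the coefficients; weighting individual coefficients or adding a fixed risk score, as the text also allows, would be small variations of the same function.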
  • the risk factor may be a binary feature.
  • when acquiring the target diagnosis and treatment data of the target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients, the risk prediction device 101 can be specifically used to: obtain the target diagnosis and treatment data of the target user and convert it into binary features; input the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor; and determine the risk prediction result based on the target risk factors and their coefficients.
  • the diagnosis and treatment data may include multiple risk factor data.
  • when the risk prediction device 101 determines multiple risk factors of the target disease according to the diagnosis and treatment data, it can obtain the multiple risk factor data included in the diagnosis and treatment data and, according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, convert the risk factor data into binary features to obtain the multiple risk factors. The risk factor data can be variables that affect the clinical outcome of the target disease.
  • the risk prediction device 101 is also used to obtain outcome data corresponding to the diagnosis and treatment data. Further, when screening out a plurality of second risk factors from the plurality of first risk factors according to a two-norm objective function including the plurality of first risk factors, and determining the coefficient of each screened second risk factor so as to determine the risk prediction model based on the coefficients of the second risk factors, the risk prediction device 101 can be specifically used to: according to the multiple first risk factors and the outcome data corresponding to the diagnosis and treatment data, determine the multiple second risk factors and the coefficient of each second risk factor that minimize the objective function, so that the risk prediction model is obtained by training.
  • the two norm can be used to control the number of selected risk factors, that is, the number of second risk factors.
  • the objective function may be determined according to the logistic loss function and the two-norm.
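A toy sketch of such an objective is given below, assuming it is the mean logistic loss plus a penalty weight `lam` times the squared two-norm of the coefficients. The exact form of the objective, the integer set, the penalty weight, and the data are all assumptions; the exhaustive search stands in for a real integer optimization solver and is only feasible for a handful of factors.

```python
import itertools
import math


def objective(weights, X, y, lam):
    """Mean logistic loss plus lam times the squared two-norm of the weights."""
    total = 0.0
    for row, label in zip(X, y):  # labels are -1 or +1
        margin = label * sum(w * x for w, x in zip(weights, row))
        total += math.log1p(math.exp(-margin))
    return total / len(X) + lam * sum(w * w for w in weights)


def fit_integer_model(X, y, integer_set=(-3, -2, -1, 0, 1, 2, 3), lam=0.05):
    """Try every integer coefficient vector and keep the one with the lowest objective."""
    candidates = itertools.product(integer_set, repeat=len(X[0]))
    return min(candidates, key=lambda w: objective(w, X, y, lam))


# Hypothetical binary first risk factors (columns) and outcomes (+1 = event).
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 0]]
y = [1, 1, -1, -1, 1, -1]

weights = fit_integer_model(X, y)
# Factors with a non-zero coefficient play the role of the screened second risk factors.
selected = [j for j, w in enumerate(weights) if w != 0]
```

In this sketch, a first risk factor whose coefficient is driven to zero is simply dropped from the model, which is one way the penalty can limit how many factors are kept.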
  • the risk prediction device 101 can also be used to receive a disease risk prediction request sent by a terminal, and the disease risk prediction request carries the target user's identifier;
  • the risk prediction device 101 may be specifically used to obtain the target diagnosis and treatment data according to the target user's identifier
  • the risk prediction device 101 may also be used to determine a target score threshold according to the type of the target disease, and send warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold.
  • the terminal that sends the disease risk prediction request can be any terminal, for example, it can be a terminal of a doctor, a patient, or other users, which is not limited in this application.
  • the terminal may be a specific terminal, such as a legitimate terminal that has passed the verification, or it may obtain the target diagnosis and treatment data after receiving the disease risk prediction request and verifying the terminal successfully.
  • the risk prediction device may receive the disease risk prediction request, and after receiving the request, verify the identity of the terminal; if the verification passes, it triggers the acquisition of the target diagnosis and treatment data.
  • the verification method can be multiple.
  • the disease risk prediction request may also carry the identity of the terminal, and it can be verified whether the identity is in the white list of the terminal.
  • the disease risk prediction request can also be encrypted with a preset public key. If the request is successfully decrypted based on the private key, the verification is determined to pass; otherwise, the verification fails; and so on, which are not enumerated one by one here.
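The whitelist variant of terminal verification can be sketched as below; the public-key variant is omitted. The request layout, field names, and identifiers are hypothetical.

```python
def verify_terminal(request, whitelist):
    """Pass verification only if the terminal identity is on the whitelist."""
    return request.get("terminal_id") in whitelist


def handle_prediction_request(request, whitelist, records):
    """Fetch the target user's diagnosis and treatment data only after verification."""
    if not verify_terminal(request, whitelist):
        return None  # verification failed: data acquisition is not triggered
    return records.get(request["user_id"])


# Hypothetical whitelist and stored diagnosis and treatment data.
whitelist = {"terminal-001"}
records = {"user-42": {"age": 78, "systolic_bp": 95, "killip": 2}}

accepted = handle_prediction_request(
    {"terminal_id": "terminal-001", "user_id": "user-42"}, whitelist, records)
rejected = handle_prediction_request(
    {"terminal_id": "terminal-999", "user_id": "user-42"}, whitelist, records)
```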
  • the disease risk prediction request may also carry the identifier of the target disease and/or the type of the target disease.
  • the early warning information includes a risk item corresponding to the target risk factor, the prediction score, and a treatment plan.
  • the treatment plan may be a treatment plan corresponding to the user group to which the target user belongs.
  • the score thresholds corresponding to different disease types (or diseases) can be different, and can be specifically determined according to the risk levels corresponding to the disease types (or diseases). For example, the higher the risk level corresponding to the disease type (or disease), the lower the score threshold corresponding to the disease type (or disease); on the contrary, the lower the risk level corresponding to the disease type (or disease), the higher the score threshold corresponding to the disease type (or disease) can be. This helps to improve the flexibility of information early warning operations.
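A minimal sketch of the early warning rule follows: the score threshold is looked up from the risk level of the disease type, and warning information is built only when the prediction score exceeds it. The risk levels, threshold values, and message fields are hypothetical.

```python
# Hypothetical thresholds: the higher the risk level, the lower the threshold.
THRESHOLD_BY_RISK_LEVEL = {"high": 5, "medium": 8, "low": 12}


def build_warning(score, risk_level, risk_items, treatment_plan):
    """Return warning information if the score exceeds the threshold, else None."""
    threshold = THRESHOLD_BY_RISK_LEVEL[risk_level]
    if score <= threshold:
        return None
    return {
        "risk_items": risk_items,
        "prediction_score": score,
        "treatment_plan": treatment_plan,
    }


# The same score triggers a warning for a high-risk disease type but not a low-risk one.
warning = build_warning(7, "high", ["age_ge_75", "killip_ge_2"], "plan-A")
no_warning = build_warning(7, "low", ["age_ge_75", "killip_ge_2"], "plan-A")
```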
  • the storage device 102 may be a blockchain node, and the diagnosis and treatment data may be obtained from the blockchain. That is, the diagnosis and treatment data of each patient can be stored in the blockchain in advance. By obtaining the user's diagnosis and treatment data from the blockchain node, the reliability of the obtained diagnosis and treatment data can be improved, which in turn helps to improve the reliability of the disease risk determined based on the diagnosis and treatment data.
  • the risk prediction device 101 may also be used to send a diagnosis and treatment data acquisition request to the storage device 102, and the diagnosis and treatment data acquisition request carries the identification of the target user;
  • the storage device 102 can also be used to receive the diagnosis and treatment data acquisition request, and verify the identity of the risk prediction device; if the verification is passed, query the target diagnosis and treatment data of the target user according to the target user's identifier, and send the target diagnosis and treatment data to the risk prediction device;
  • the risk prediction device 101 may be specifically configured to receive the target diagnosis and treatment data sent by the storage device to obtain the target diagnosis and treatment data.
  • the risk prediction device 101 can determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data of multiple users, such as multiple target patients, acquired from the storage device 102, screen multiple second risk factors from the multiple first risk factors according to a two-norm objective function including the multiple first risk factors, and determine the coefficient of each screened second risk factor from the integer set, so as to train the risk prediction model. It can then call the risk prediction model to determine the target risk factors corresponding to the acquired target diagnosis and treatment data (for example, the target diagnosis and treatment data can be obtained from the storage device 102) and the coefficient of each target risk factor, to obtain the target user's risk prediction result for the target disease.
  • the embodiments of the present application can apply integer optimization algorithms to achieve control of the number of risk factors and optimize prediction results by setting integer constraint conditions and a two-norm-based objective function, thereby helping to improve the prediction effect of disease risk.
  • FIG. 2 is a schematic flowchart of a disease risk prediction method provided by an embodiment of the present application. The method may be executed by the above-mentioned risk prediction device. As shown in FIG. 2, the disease risk prediction method may include the following steps:
  • the user may be a patient suffering from the target disease.
  • the diagnosis and treatment data may include physical sign data, inspection data, etc., which will not be repeated here.
  • the diagnosis and treatment data may include age, systolic blood pressure, and cardiac function classification Killip.
  • the diagnosis and treatment data can be obtained from the blockchain, that is, the diagnosis and treatment data of each patient can be stored in the blockchain in advance, which will not be repeated here.
  • the risk factor may be a feature vector, such as a binary feature.
  • the risk prediction device may also be used to obtain outcome data corresponding to the diagnosis and treatment data.
  • after obtaining the diagnosis and treatment data of the target disease, the risk factors of the target disease, that is, the first risk factors, can be determined.
  • a risk factor can be a vector built from part of the user's diagnosis and treatment data, or a vector built from the diagnosis and treatment data after processing, or the diagnosis and treatment data may itself consist of multiple risk factors; this application does not limit this.
  • the obtained diagnosis and treatment data, which includes multiple risk factor data, may be converted into binary features.
  • feature engineering can be performed on risk factor data, and the risk factor data can be converted into binary features suitable for integer optimization algorithms.
  • the diagnosis and treatment data may include multiple risk factor data, and each risk factor involved in the present application may be a binary feature.
  • multiple risk factor data included in the diagnosis and treatment data can be obtained, and the risk factor data can then be converted into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, to obtain multiple risk factors. That is to say, when determining the risk factors based on the diagnosis and treatment data, the diagnosis and treatment data can be converted into binary features according to the relationship between the risk factor data (that is, the variables that affect the clinical outcome) and the outcome.
  • converting the diagnosis and treatment data into binary features based on the relationship between the risk factor data and the outcome may refer to performing the conversion based on the parameter values of the risk factor data and the critical value corresponding to the outcome data, where the critical value can be used to indicate the risk information of the outcome.
  • for example, risk factor data below the critical value may share one binary feature, and risk factor data at or above the critical value may share another; this application does not limit this.
  • there may be one or more critical values. If there are multiple, the risk factor data within the interval delimited by each critical value may share the same binary feature.
  • risk factor data x, such as the Killip cardiac function class
  • outcome y, such as whether a myocardial infarction occurs
  • cut-off value c: for example, if the Killip class is higher than class II, the risk of myocardial infarction increases significantly
  • the risk factor data x can be segmented according to the critical value c, and the original continuous variable can be converted into a binary coded variable.
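The segmentation described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function name `binarize`, the one-hot layout, and the example cut-offs are assumptions. Values at or above a critical value fall into the upper interval, matching the description.

```python
from bisect import bisect_right

def binarize(value, cutoffs):
    """One-hot encode `value` by the interval it falls into.

    `cutoffs` is a sorted list of critical values; k cut-offs give k + 1
    intervals, and a value equal to a cut-off goes to the upper interval.
    """
    idx = bisect_right(cutoffs, value)
    return [1 if i == idx else 0 for i in range(len(cutoffs) + 1)]

# Hypothetical single cut-off at Killip class III, i.e. "higher than class II".
killip_low = binarize(2, [3])    # class II  -> lower interval
killip_high = binarize(3, [3])   # class III -> upper interval

# Hypothetical age cut-offs, illustrating several critical values.
age_feature = binarize(72, [60, 75])
```

With several critical values, every value inside the same interval receives the same binary feature, which is the property the description relies on.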
  • the selected risk factors, that is, the second risk factors, may be a part of the multiple first risk factors.
  • this part of the risk factors can be used to characterize the key variables of the outcome of the target disease.
  • the present application can train a risk prediction model based on binary features, filter out risk factors from the multiple risk factors by optimizing the objective function based on the two norm, and determine the coefficient of each risk factor.
  • the training method of the risk prediction model may be as follows: according to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, determine the multiple second risk factors that are selected when the objective function is minimized, together with the coefficient of each selected second risk factor, so as to train the risk prediction model.
  • the objective function may be determined according to the logistic loss function and the two-norm, and the two-norm is used to control the number of risk factors selected.
  • the risk prediction model is obtained by optimizing the objective function under the set integer constraints.
  • the objective function can be as follows:
  • $\min_{\lambda} \; l(\lambda) + C\|\lambda\|_2 \quad \text{s.t.} \quad \lambda \in \mathcal{L}$
  • where $\lambda$ can represent the coefficient vector of the risk factors, that is, the coefficient (score) corresponding to each second risk factor; $l(\lambda)$ can be the logistic loss function; $C\|\lambda\|_2$ can represent the two-norm term corresponding to the risk factor set; and $\mathcal{L}$ can be a set of integers.
  • the loss function can be expressed as follows:
  • $n$ can represent the number of samples in the data
  • $y_i$ is the clinical outcome (outcome data) corresponding to sample $i$
  • $x_i$ is the feature vector of sample $i$, such as the first risk factors.
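One standard form of the logistic loss that is consistent with the symbols defined above is given here as a sketch. It assumes that the outcome is coded as $y_i \in \{-1, +1\}$ and that $\lambda$ denotes the coefficient vector; both the coding and the symbol name are assumptions rather than statements of the application.

```latex
l(\lambda) = \frac{1}{n} \sum_{i=1}^{n} \log\bigl(1 + \exp(-y_i \, \lambda^{\top} x_i)\bigr)
```

Under this coding, a sample whose weighted feature sum $\lambda^{\top} x_i$ has the same sign as its outcome contributes a small loss, and a sample on the wrong side contributes a large one.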
  • the objective function of this application can be based on the traditional logistic loss function, with the two-norm term of the risk factors, $C\|\lambda\|_2$, added.
  • the purpose is, while obtaining the optimal solution, to control the number of risk factors selected by the model by adjusting the parameter C, thereby automatically selecting the optimal subset of risk factors.
  • the integer constraint restricts the parameter $\lambda$ to values from the set integer set $\mathcal{L}$, so that the solution of the model meets the practical needs of disease risk scoring.
  • the optimal combination of risk factors and the coefficient $\lambda$ corresponding to each risk factor can be solved by minimizing the objective function subject to the constraints, which completes the model training.
  • the coefficient $\lambda$ can be expressed as the risk score corresponding to each risk factor. For example, a positive score represents an increase in the risk of the clinical outcome, and a negative score represents a decrease in the risk.
  • the user's diagnosis and treatment data can be subsequently obtained, and the user's disease risk score can be determined based on the risk factors and coefficients corresponding to the diagnosis and treatment data.
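The training step described above can be illustrated with a toy sketch. The application does not name a solver, so exhaustive enumeration over a small integer set stands in here for an integer optimization algorithm; it is only feasible for a handful of features. The function names, the outcome coding of -1/+1, the toy data, and the value of C are all assumptions.

```python
import itertools
import math

def logistic_loss(coefs, X, y):
    """Mean logistic loss; outcomes y are coded as -1 / +1 (an assumption)."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(c * v for c, v in zip(coefs, xi))
        total += math.log1p(math.exp(-margin))
    return total / len(X)

def objective(coefs, X, y, C):
    """Logistic loss plus C times the two-norm of the coefficient vector."""
    return logistic_loss(coefs, X, y) + C * math.sqrt(sum(c * c for c in coefs))

def fit_integer_scores(X, y, integer_set, C):
    """Exhaustively search integer coefficient vectors (toy-sized problems only)."""
    n_features = len(X[0])
    best = min(itertools.product(integer_set, repeat=n_features),
               key=lambda coefs: objective(coefs, X, y, C))
    # Factors with a non-zero coefficient are the ones kept by the model.
    selected = [j for j, c in enumerate(best) if c != 0]
    return list(best), selected

# Toy binary features (rows are samples) and outcomes; purely illustrative.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, -1, -1]
coefs, selected = fit_integer_scores(X, y, integer_set=range(-2, 3), C=0.05)
```

Raising C makes non-zero coefficients more expensive; in this toy setup any C above log 2 drives every coefficient to zero, which shows how the parameter trades fit against the size of the score.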
  • multiple risk factors can also be screened from the multiple risk factors in other ways, and the coefficient of each screened risk factor determined. For example, the screening can be based on the probability (such as the percentage of all samples in which a certain risk factor occurs) or the number (such as the number of samples in which a certain risk factor occurs) of each risk factor involved in the diagnosis and treatment data of multiple users, such as multiple target patients.
  • the N risk factors with the highest probability or the largest number can be used as the selected risk factors, that is, the second risk factors; the greater the probability or number of a risk factor, the larger the coefficient corresponding to that risk factor (for example, probability intervals or quantity intervals can be set, and risk factors falling in the same interval can share the same coefficient); the coefficient is an integer, and N is an integer greater than 2.
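A minimal sketch of this frequency-based alternative follows. The function name `select_by_frequency`, the factor labels, and the interval boundaries are assumptions chosen for illustration.

```python
from collections import Counter

def select_by_frequency(patient_factors, top_n, interval_scores):
    """Pick the top_n most frequent risk factors and assign integer coefficients.

    `interval_scores` is a list of (min_fraction, coefficient) pairs sorted by
    descending min_fraction; a factor gets the coefficient of the first
    interval whose lower bound it reaches.
    """
    # Count each factor at most once per patient.
    counts = Counter(f for factors in patient_factors for f in set(factors))
    total = len(patient_factors)
    selected = {}
    for factor, count in counts.most_common(top_n):
        fraction = count / total
        for min_fraction, coefficient in interval_scores:
            if fraction >= min_fraction:
                selected[factor] = coefficient
                break
    return selected

# Hypothetical risk factors observed in four patients.
patients = [
    ["cardiac_arrest", "age_over_75"],
    ["age_over_75"],
    ["age_over_75", "killip_gt_2"],
    ["killip_gt_2", "cardiac_arrest", "low_sbp"],
]
scores = select_by_frequency(patients, top_n=3,
                             interval_scores=[(0.7, 2), (0.4, 1), (0.0, 0)])
```

Factors in the same probability interval share a coefficient, and more frequent factors never receive a smaller coefficient than less frequent ones.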
  • the target diagnosis and treatment data may include physical sign data, test and examination data, etc.; for example, it may include various diagnoses, tests, examinations, medications, and surgical items.
  • the target diagnosis and treatment data may be obtained by processing collected original diagnosis and treatment data.
  • the original diagnosis and treatment data of the target user may be obtained, which may include records of multiple visits of the target user, and each visit record may include data such as various diagnoses, tests, examinations, medications, and surgical items.
  • the original diagnosis and treatment data can be preprocessed to obtain the preprocessed diagnosis and treatment data, which will not be repeated here.
  • when the risk prediction device performs disease risk prediction for a user, for example when obtaining the target diagnosis and treatment data, the prediction can be triggered by the user's request, triggered actively for a specific user, or triggered in other ways; this application does not limit this.
  • the risk prediction device may also receive a disease risk prediction request sent by the terminal, and the disease risk prediction request carries the identifier of the target user. Furthermore, the target diagnosis and treatment data can be obtained according to the target user's identification.
  • the risk prediction device may also send a diagnosis and treatment data acquisition request to the storage device, and the diagnosis and treatment data acquisition request carries the identification of the target user.
  • the storage device can receive the diagnosis and treatment data acquisition request and, after receiving it, verify the identity of the risk prediction device; if the verification is passed, the storage device can query the target diagnosis and treatment data of the target user according to the target user's identifier and send that data to the risk prediction device. The risk prediction device can thus obtain the target diagnosis and treatment data by receiving it from the storage device.
  • the storage device may be a blockchain node, or a server or other storage device.
  • the verification method can be one or more.
  • the diagnosis and treatment data acquisition request can also carry the identity of the risk prediction device, and the storage device can verify whether that identity exists in a preset whitelist; if it exists in the whitelist, the verification is passed; otherwise, the verification fails.
  • a preset public key can be used to encrypt the diagnosis and treatment data acquisition request. If the storage device successfully decrypts the request with the private key, the verification is determined to pass; otherwise, the verification fails. The storage device can also perform the verification in other ways, which are not listed here.
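The whitelist variant of the verification described above can be sketched as follows. The class `StorageDevice`, the request field names, and the identifiers are hypothetical; the public-key variant is not shown.

```python
class StorageDevice:
    """Toy stand-in for the storage device (e.g. a server or blockchain node)."""

    def __init__(self, whitelist, records):
        self._whitelist = set(whitelist)   # device identities allowed to read data
        self._records = dict(records)      # user identifier -> diagnosis and treatment data

    def handle_request(self, request):
        """Verify the requesting device, then look up the target user's data."""
        if request.get("device_id") not in self._whitelist:
            raise PermissionError("verification failed: device not in whitelist")
        return self._records.get(request["user_id"])

storage = StorageDevice(
    whitelist=["risk-device-1"],
    records={"user-42": {"age": 78, "killip": 3, "systolic_bp": 95}},
)
data = storage.handle_request({"device_id": "risk-device-1", "user_id": "user-42"})
```

A request from a device that is not in the whitelist raises `PermissionError` instead of returning data, which corresponds to a failed verification.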
  • the risk prediction result may include a prediction score of the target user for the target disease; the prediction score may be the sum of the coefficients of the target risk factors, or may be obtained by otherwise processing those coefficients, which will not be repeated here.
  • the risk prediction device may convert the acquired target diagnosis and treatment data into binary features and input those binary features into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor; the risk prediction result is then determined based on the target risk factors and their coefficients.
  • that is, the target diagnosis and treatment data can be converted into binary features, and the trained risk prediction model can be called to process the binary features to obtain the target user's risk score for the target disease. For example, the coefficients of the risk factors determined by the risk prediction model are added to obtain the final risk score; in other words, the scores corresponding to the true values of the risk factors finally selected by the algorithm are summed to obtain the final risk score of the target user.
  • This application can convert the original data into binary features that can be input directly to the integer optimization algorithm, apply the integer optimization algorithm, satisfy the scoring requirements of the risk score by setting integer constraints, and solve the model and control the number of risk factors by optimizing the objective function based on the two-norm, so as to achieve reliable prediction of disease risk.
  • for example, suppose the risk factors screened by the risk prediction model include cardiac arrest, age, Killip class, and systolic blood pressure, with corresponding coefficients (scores) of 2, 1, 1, and 1. For a patient in whom all four risk factors are present, the risk score is 5 points. That is, if a patient has a history of cardiac arrest, two points are added to the total risk score, and so on. In this way the user's risk score can be determined quickly.
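The scoring in this example amounts to summing the integer coefficients of the risk factors that are present. A minimal sketch, with hypothetical factor names and a hypothetical helper `risk_score`:

```python
def risk_score(binary_features, coefficients):
    """Sum the integer coefficients of the risk factors that are present (value 1)."""
    return sum(coef for name, coef in coefficients.items()
               if binary_features.get(name, 0) == 1)

# Coefficients from the example above: cardiac arrest 2, the others 1 each.
coefficients = {"cardiac_arrest": 2, "age": 1, "killip": 1, "systolic_bp": 1}

# A patient for whom all four binary risk factors are present.
patient = {"cardiac_arrest": 1, "age": 1, "killip": 1, "systolic_bp": 1}
score = risk_score(patient, coefficients)  # 2 + 1 + 1 + 1 = 5
```

Because the coefficients are small integers, the same calculation can be done by hand at the bedside, which is the practical motivation for the integer constraint.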
  • the risk prediction device may also determine a target score threshold according to the type of the target disease, and send warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold.
  • the prediction score is the sum of the coefficients of each target risk factor
  • the early warning information may include information such as the risk item corresponding to the target risk factor, the prediction score, and treatment plan.
  • the treatment plan may be a treatment plan corresponding to the user group to which the target user belongs.
  • the user group to which the target user belongs may be the group with the largest net benefit under the treatment plan.
  • users can be grouped based on the net benefits of treatment plans, and the user groups with the largest net benefits under each treatment plan can be obtained. Therefore, when recommending a treatment plan to a user, it can be pushed in conjunction with the net benefit, for example, recommending to the target user the treatment plan with the largest net benefit corresponding to the user group to which the target user belongs.
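The early-warning step can be sketched as a simple threshold check. The function `build_warning`, the dictionary layout, the threshold value, and the group and plan names are assumptions; how user groups and net benefits are computed is outside this sketch.

```python
def build_warning(disease_type, score, present_factors, thresholds,
                  plans_by_group, user_group):
    """Return early-warning information if the score exceeds the disease's threshold."""
    threshold = thresholds[disease_type]   # target score threshold for this disease type
    if score <= threshold:
        return None                        # no warning is sent
    return {
        "risk_items": present_factors,
        "prediction_score": score,
        # Plan previously associated with the user's group (largest net benefit).
        "treatment_plan": plans_by_group.get(user_group),
    }

thresholds = {"myocardial_infarction": 3}
plans_by_group = {"group_a": "plan_a"}
warning = build_warning("myocardial_infarction", 5,
                        ["cardiac_arrest", "age", "killip", "systolic_bp"],
                        thresholds, plans_by_group, "group_a")
```

Keeping the threshold per disease type in a lookup table mirrors the description, where the target score threshold is determined according to the type of the target disease.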
  • the risk prediction device may determine multiple risk factors corresponding to the target disease from the acquired diagnosis and treatment data of multiple users, screen out multiple risk factors from them according to an objective function that includes the two-norm of the multiple risk factors, and determine the coefficients of the selected risk factors from the integer set, so as to train the risk prediction model. The risk prediction model can then be called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, to obtain the risk prediction result of the target user for the target disease.
  • the embodiments of this application can apply integer optimization algorithms to integrate feature selection, model parameter learning, and risk factor scoring into one process, avoiding the subjectivity of traditional methods. They can also be combined with automated data preprocessing methods, such as data filling and feature engineering, which automatically convert the original data into a form that can be input directly to the integer optimization algorithm. The process is convenient, fast, and highly automated, and its results meet the clinical needs of disease risk scoring, so it can be used by clinicians without algorithm or development experience.
  • the embodiment of the present application also provides a disease risk prediction device.
  • the device may include modules for performing the method of FIG. 2 described above.
  • FIG. 3 is a schematic structural diagram of a disease risk prediction device provided by an embodiment of the present application.
  • the disease risk prediction device described in this embodiment may be configured in a risk prediction device.
  • the disease risk prediction device 300 of this embodiment may include:
  • the obtaining module 301 is used to obtain diagnosis and treatment data corresponding to the target diseases of multiple users;
  • the determining module 302 is configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;
  • the processing module 303 is configured to screen out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and to determine the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the obtaining module 301 is also used to obtain target diagnosis and treatment data of the target user;
  • the processing module 303 is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.
  • when the processing module 303 calls the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determines the risk prediction result based on the target risk factors and their coefficients, the following steps can be specifically performed:
  • the binary features corresponding to the target diagnosis and treatment data are input into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and the risk prediction result is determined based on the target risk factors and their coefficients;
  • the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.
  • the acquiring module 301 may also be used to acquire outcome data corresponding to the diagnosis and treatment data, and the outcome data is used to indicate the health status of the user;
  • the diagnosis and treatment data includes multiple risk factor data; the determining module 302 may specifically perform the following steps when determining multiple risk factors of the target disease according to the diagnosis and treatment data:
  • the risk factor data is converted into a binary feature to obtain multiple risk factors.
  • the obtaining module 301 is also used to obtain outcome data corresponding to the diagnosis and treatment data;
  • when the processing module 303 screens out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and determines the coefficient of each screened second risk factor, the risk prediction model is obtained by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.
  • the acquiring module 301 is further configured to receive a disease risk prediction request sent by the terminal, and the disease risk prediction request carries the target user's identifier;
  • the acquiring module 301 is further configured to acquire the target diagnosis and treatment data according to the identifier of the target user;
  • the determining module 302 is further configured to determine a target score threshold according to the type of the target disease, and send early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;
  • the prediction score is the sum of the coefficients of each of the target risk factors
  • the early warning information includes the risk item corresponding to the target risk factor, the prediction score, and a treatment plan
  • the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
  • the storage device is a blockchain node
  • the acquisition module 301 is further configured to send a diagnosis and treatment data acquisition request to the storage device, the request carrying the identifier of the target user, so that the storage device verifies the identity of the risk prediction device and, if the verification is passed, queries the target diagnosis and treatment data of the target user according to the target user's identifier and sends that data to the risk prediction device;
  • the acquiring module 301 is specifically configured to receive the target diagnosis and treatment data sent by the storage device.
  • FIG. 4 is a schematic structural diagram of a risk prediction device provided by an embodiment of the present application.
  • the risk prediction device may include: a processor 401 and a memory 402.
  • the risk prediction device may further include a communication interface 403.
  • the processor 401, the memory 402, and the communication interface 403 may be connected by a bus or in other ways.
  • the connection by a bus is taken as an example.
  • the communication interface 403 may be controlled by the processor to send and receive messages, the memory 402 may be used to store a computer program, the computer program includes program instructions, and the processor 401 is used to execute the program instructions stored in the memory 402.
  • the processor 401 is configured to call the program instructions to execute the following steps:
  • a plurality of second risk factors are screened from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and the coefficient of each screened second risk factor is determined, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
  • the risk prediction model is called to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the risk prediction result of the target user for the target disease is determined based on the target risk factors and their coefficients.
  • when the processor 401 calls the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determines the risk prediction result based on the target risk factors and their coefficients, the following steps can be specifically performed:
  • the binary features corresponding to the target diagnosis and treatment data are input into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and the risk prediction result is determined based on the target risk factors and their coefficients;
  • the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.
  • the processor 401 may also execute:
  • the diagnosis and treatment data includes multiple risk factor data; the processor 401 may specifically perform the following steps when determining multiple first risk factors of the target disease according to the diagnosis and treatment data:
  • the risk factor data is converted into a binary feature to obtain a plurality of first risk factors.
  • the processor 401 may further execute the following steps:
  • when the processor 401 screens out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and determines the coefficient of each screened second risk factor, the risk prediction model is obtained by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.
  • the processor 401 may further execute the following steps:
  • the prediction score is the sum of the coefficients of each of the target risk factors
  • the early warning information includes the risk item corresponding to the target risk factor, the prediction score, and a treatment plan
  • the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
  • the storage device is a blockchain node; the processor 401 may also perform the following steps:
  • the diagnosis and treatment data acquisition request carries the identification of the target user so that the storage device can verify the identity of the risk prediction device. If the verification is passed, the storage device obtains the target diagnosis and treatment data of the target user according to the target user's identification query, and sends the diagnosis and treatment data to the risk prediction device;
  • the target diagnosis and treatment data sent by the storage device is received through the communication interface 403.
  • the processor 401 may be a central processing unit (Central Processing Unit, CPU), and the processor 401 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include a non-volatile random access memory. For example, the memory 402 may also store diagnosis and treatment data of the user.
  • the communication interface 403 may include an input device and/or an output device.
  • the input device may be a control panel, a microphone, a receiver, etc.
  • the output device may be a display screen, a transmitter, etc., which are not listed here.
  • the processor 401, memory 402, and communication interface 403 described in the embodiment of this application can perform the implementation described in the method embodiment shown in FIG. 2, and can also perform the implementation of the disease risk prediction device described in the embodiment of this application, which will not be repeated here.
  • An embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, the computer program includes program instructions, and when the program instructions are executed by a processor, part or all of the steps performed in the above disease risk prediction method embodiment, such as part or all of the steps performed by the risk prediction device, can be executed.
  • the storage medium involved in this application such as a computer-readable storage medium, may be non-volatile or volatile.
  • the embodiment of the present application also provides a computer program product. The computer program product includes computer program code, and when the computer program code runs on a computer, the computer executes the steps performed in the above disease risk prediction method embodiment.
  • the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, etc., and the data storage area may store data created based on the use of blockchain nodes, etc.
  • Blockchain, essentially a decentralized database, is a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
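The idea of data blocks linked by cryptographic methods can be illustrated with a minimal hash chain. This is a teaching sketch only, not the blockchain platform referred to in the application; the block layout and the function names `make_block` and `chain_is_valid` are assumptions.

```python
import hashlib
import json

def make_block(data, previous_hash):
    """Create a block whose hash covers its data and the previous block's hash."""
    payload = json.dumps({"data": data, "previous_hash": previous_hash}, sort_keys=True)
    return {
        "data": data,
        "previous_hash": previous_hash,
        "hash": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }

def chain_is_valid(chain):
    """Recompute every hash and check that each block points at its predecessor."""
    for i, block in enumerate(chain):
        expected = make_block(block["data"], block["previous_hash"])["hash"]
        if block["hash"] != expected:
            return False
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Two hypothetical visit records stored as consecutive blocks.
genesis = make_block({"record": "visit-1"}, previous_hash="0" * 64)
second = make_block({"record": "visit-2"}, previous_hash=genesis["hash"])
chain = [genesis, second]
```

Changing the data in an earlier block changes its recomputed hash, so the stored hash no longer matches and the tampering is detected during validation.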
  • the program can be stored in a computer readable storage medium. During execution, it may include the procedures of the above-mentioned method embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

A disease risk prediction system, method and apparatus, a device and a storage medium, which are applied to the field of medical technology. The disease risk prediction system comprises a risk prediction device (101) and a storage device (102), the storage device being used for storing diagnosis and treatment data of a user, and the risk prediction device being used for executing the following steps: acquiring diagnosis and treatment data corresponding to a target disease of a plurality of users (201); determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease (202); screening a plurality of second risk factors from the plurality of first risk factors according to a target function including the 2-norm of the plurality of first risk factors, and determining coefficients of the screened second risk factors, so as to determine a risk prediction model, the coefficients being integers determined from an integer set (203); and acquiring target diagnosis and treatment data of a target user (204). The risk prediction model is invoked to determine a risk prediction result of the target user for the target disease, which helps to improve the prediction effect of disease risks.

Description

一种疾病风险预测系统、方法、装置、设备及介质Disease risk prediction system, method, device, equipment and medium
本申请要求于2020年11月2日提交中国专利局、申请号为202011200812.2,发明名称为“一种疾病风险预测系统、方法、装置、设备及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on November 2, 2020, the application number is 202011200812.2, and the invention title is "a disease risk prediction system, method, device, equipment and medium", and its entire content Incorporated in this application by reference.
技术领域Technical field
本申请涉及人工智能技术领域,尤其涉及一种疾病风险预测系统、方法、装置、设备及介质。This application relates to the field of artificial intelligence technology, and in particular to a disease risk prediction system, method, device, equipment, and medium.
背景技术Background technique
在医疗技术领域中,对用户发生某种疾病的风险进行预测具有重要意义,比如精准的风险预测有助于为患者制定诊疗方案、提升患者预后水平等等。因此,如何实现疾病风险的预测并提升预测效果成为亟需解决的问题。In the field of medical technology, it is of great significance to predict the risk of a user's occurrence of a certain disease. For example, accurate risk prediction helps to formulate diagnosis and treatment plans for patients, improve patient prognosis, and so on. Therefore, how to realize disease risk prediction and improve the prediction effect has become an urgent problem to be solved.
Summary of the invention
The embodiments of this application provide a disease risk prediction system, method, apparatus, device, and medium, which help to improve the prediction of disease risk.
In a first aspect, an embodiment of this application provides a disease risk prediction system, including a risk prediction device and a storage device, wherein the storage device is configured to store diagnosis and treatment data of users;
the risk prediction device is configured to perform the following steps:
acquiring, from the storage device, diagnosis and treatment data of multiple users corresponding to a target disease;
determining, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
screening out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
In a second aspect, an embodiment of this application provides a disease risk prediction method, including:
acquiring diagnosis and treatment data of multiple users corresponding to a target disease;
determining, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
screening out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
In a third aspect, an embodiment of this application provides a disease risk prediction apparatus, including:
an acquisition module, configured to acquire diagnosis and treatment data of multiple users corresponding to a target disease;
a determining module, configured to determine, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
a processing module, configured to screen out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and to determine the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
the acquisition module is further configured to acquire target diagnosis and treatment data of a target user;
the processing module is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
In a fourth aspect, an embodiment of this application provides a risk prediction device, which may include a processor and a memory that are connected to each other. The memory is configured to store a computer program that supports the terminal device in executing the above method or steps, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the following method:
acquiring diagnosis and treatment data of multiple users corresponding to a target disease;
determining, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
screening out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to execute the following method:
acquiring diagnosis and treatment data of multiple users corresponding to a target disease;
determining, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
screening out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
The embodiments of this application can apply an integer optimization algorithm: by setting integer constraints and a two-norm-based objective function, they control the number of risk factors and optimize the prediction result, which helps to improve the prediction of disease risk.
Description of the drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a disease risk prediction system provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of a disease risk prediction method provided by an embodiment of this application;
FIG. 3 is a schematic structural diagram of a disease risk prediction apparatus provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of a risk prediction device provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, rather than all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
The technical solution of this application can be applied to a disease risk prediction system, and specifically to a risk prediction device (risk prediction apparatus), to predict disease risk. Optionally, the risk prediction device may be a terminal, a server, a data platform, or another device. The terminal may include a mobile phone, a tablet computer, a computer, and so on, which is not limited in this application. In other embodiments the terminal may also go by other names, such as terminal device, smart terminal, user equipment, or user terminal, which are not listed exhaustively here.
In the field of medical technology, predicting the risk that a user will develop a certain disease is of great significance. For example, accurate risk prediction helps to formulate diagnosis and treatment plans for patients and to improve patient prognosis. However, the inventor realized that current disease risk prediction is mainly based on direct scoring by experts or on scoring derived from certain algorithms, and each of these approaches has limitations. Expert scoring is highly subjective, so its prediction results are unreliable. Algorithm-based approaches directly round the model parameters, so the final result is likely to lose its optimal performance, which reduces predictive ability and yields poor predictions. Neither approach can meet the need for automated development of disease risk scores.
By contrast, this application can determine multiple risk factors corresponding to a target disease according to the diagnosis and treatment data of multiple users, screen out multiple risk factors from them, and determine the coefficient of each screened risk factor, where each coefficient is an integer determined from a set of integers. The target diagnosis and treatment data of a target user can then be acquired and, based on the coefficients determined above, the target risk factors corresponding to that data and the coefficient of each target risk factor can be determined. The risk prediction result of the target user for the target disease can be determined from the target risk factors and their coefficients. Setting integer constraints thus meets the requirements of risk prediction while controlling the number of risk factors, which achieves reliable prediction of disease risk and improves the prediction effect. Optionally, this application can combine a model algorithm with an integer optimization algorithm: integer constraints meet the requirements of risk prediction, and optimizing a two-norm-based objective function solves the model and controls the number of risk factors.
The technical solution of this application can be applied to the fields of artificial intelligence, smart cities, blockchain, and/or big data. For example, it can be implemented by a data platform or another device, and the data involved can be stored in blockchain nodes or in a database, which is not limited in this application.
The embodiments of this application provide a disease risk prediction system, method, apparatus, device, and medium that help to improve the prediction of disease risk. Each is described in detail below.
FIG. 1 is a schematic structural diagram of a disease risk prediction system provided by an embodiment of this application. As shown in FIG. 1, the disease risk prediction system may include a risk prediction device (risk prediction apparatus) 101 and a storage device (storage apparatus) 102, where:
the storage device 102 can be used to store diagnosis and treatment data of users;
the risk prediction device 101 can be used to perform the following steps:
acquiring diagnosis and treatment data of multiple users from the storage device 102, such as diagnosis and treatment data corresponding to a target disease;
determining, according to the diagnosis and treatment data, multiple first risk factors corresponding to the target disease;
screening out multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining a risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.
Optionally, the storage device 102 may also be used to store other data involved in this application, such as the risk factors and their coefficients.
The storage device and the risk prediction device may be separate, independently deployed devices, or they may be deployed in the same device, which is not limited in this application; FIG. 1 shows only the independently deployed scenario. For example, in some embodiments the storage device and the risk prediction device may both be deployed in a server, or the storage device may be deployed in the risk prediction device.
Optionally, the diagnosis and treatment data may include physical sign data, examination and laboratory test data, and so on. Further optionally, the user may be a patient suffering from the target disease, who may be referred to as a target patient. In some embodiments, the diagnosis and treatment data corresponding to different diseases may differ: the data to be collected is determined according to the target disease, and the diagnosis and treatment data corresponding to that disease is then acquired. In other embodiments, the diagnosis and treatment data corresponding to different diseases may be the same: for example, all diagnosis and treatment data of the user (such as the target patient) within a preset time range may be collected as the data corresponding to the target disease. For example, the collected diagnosis and treatment data may be determined according to the type of the target disease, may be all diagnosis and treatment data of the target user, or may be the diagnosis and treatment data within a preset time period (such as the most recent year). Further optionally, the data may be extracted from a monitoring system, in which case the storage device is a storage device in the monitoring system, or the data may be extracted by the monitoring system and then stored in the storage device, which is not limited in this application.
Optionally, the diagnosis and treatment data may be obtained by processing collected raw medical (diagnosis and treatment) data, where the processing includes sampling, filling in missing values, and so on. For example, a patient's raw medical data may be acquired, including the patient's historical baseline data. The historical baseline data may include multiple visit records, and each visit record may include various diagnoses, laboratory tests, examinations, medications, surgical procedures, and so on. The historical baseline data can then be preprocessed. For example, the physical sign data may be obtained by sampling the collected raw physical sign data, which may be continuous, at a preset time unit (such as 1 h). As another example, missing values in the examination and laboratory test data may be filled in using multiple imputation. This yields the preprocessed diagnosis and treatment data. Further optionally, the diagnosis and treatment data may be text data or a vector, such as binary features, also referred to as a two-dimensional feature vector.
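The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function names `sample_hourly` and `fill_missing_with_mean` are hypothetical, readings are assumed to be `(timestamp_in_seconds, value)` pairs, and simple mean imputation is swapped in for the multiple imputation named in the text to keep the example short.

```python
from statistics import mean


def sample_hourly(readings, unit_seconds=3600):
    """Keep the earliest reading in each time bucket of `unit_seconds`.

    `readings` is a list of (timestamp_in_seconds, value) pairs (assumed format).
    """
    sampled = {}
    for timestamp, value in sorted(readings):
        bucket = int(timestamp // unit_seconds)
        # setdefault keeps the first (earliest) value seen for the bucket.
        sampled.setdefault(bucket, value)
    return [sampled[bucket] for bucket in sorted(sampled)]


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values.

    Stand-in for multiple imputation; returns the input unchanged
    when no value was observed at all.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical systolic blood pressure readings at 0 s, 1800 s, 3600 s, 7300 s.
hourly = sample_hourly([(0, 120), (1800, 125), (3600, 130), (7300, 128)])
filled = fill_missing_with_mean([1.0, None, 3.0])
```

A production pipeline would more likely use a dedicated imputation library so that the uncertainty of the filled values is preserved across several imputed data sets.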
In some embodiments, outcome data corresponding to the diagnosis and treatment data may also be acquired. The outcome data can be used to indicate the health status of the user, and may also be called the outcome, the clinical outcome, or other names, which is not limited in this application. For example, the outcome data may be the discharge diagnosis data corresponding to each visit record of the patient, such as death, aggravation of the disease, occurrence of complications, or confirmed diagnosis of the target disease. Optionally, the outcome data can be processed in a manner similar to the diagnosis and treatment data, which is not repeated here. Model training can then be performed on the patient's diagnosis and treatment data and outcome data to obtain the risk prediction model. Further optionally, the outcome data may be text data or a vector, such as binary features, also referred to as a two-dimensional feature vector.
For example, where the user is a patient with myocardial infarction and the target disease is myocardial infarction, the diagnosis and treatment data may include age, systolic blood pressure, Killip cardiac function class, and so on. Correspondingly, the outcome data may be death or another outcome.
Optionally, the risk prediction result may include a prediction score of the target user for the target disease, and the prediction score may be the sum of the coefficients (weights) of the target risk factors. The target risk factors corresponding to the target diagnosis and treatment data may be some or all of the screened second risk factors.
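The scoring rule above amounts to summing the integer coefficients of the risk factors that apply to the target user. A minimal sketch, in which the factor names and coefficient values are purely hypothetical and not taken from the application:

```python
def prediction_score(coefficients, present_factors):
    """Sum the integer coefficients of the risk factors present for a user.

    Factors that were not screened into the model (absent from
    `coefficients`) contribute nothing to the score.
    """
    return sum(coefficients[name] for name in present_factors if name in coefficients)


# Hypothetical screened second risk factors and their integer coefficients.
COEFFICIENTS = {"age_ge_75": 3, "sbp_lt_100": 2, "killip_ge_2": 2}

# Target risk factors found in the target user's data (assumed).
score = prediction_score(COEFFICIENTS, ["age_ge_75", "killip_ge_2"])
```

Because every coefficient is an integer, the score can be computed by hand at the bedside, which is the usual motivation for integer-valued risk scores.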
In some optional embodiments, the prediction score may also be obtained by further processing the coefficients of the target risk factors. For example, when the target disease is an infectious disease, the coefficients of the target risk factors may be weighted according to the incidence of the target disease in the area where the target user is located to obtain the prediction score, which is not limited in this application. This can improve the accuracy and reliability of the resulting disease risk prediction score.
For example, the higher the incidence of the target disease in the area where the target user is located, the larger the weighting coefficient can be set; conversely, the lower that incidence, the smaller the weighting coefficient can be set.
As another example, when the incidence of the target disease in the area where the target user is located is higher than the average incidence across areas, a preset first weighting coefficient is used; when it is lower than the average incidence across areas, a preset second weighting coefficient is used, and the first weighting coefficient is greater than the second.
As a further example, where the target disease is an infectious disease, the risk prediction device can obtain the target incidence of the target disease in the target area where the target user is located and compare it with the average incidence of the target disease. If the target incidence is higher than the average incidence and the difference between the two exceeds a threshold, the coefficients of one or more target risk factors can be weighted (for example, multiplied by a coefficient greater than 1), the sum of the coefficients of the target risk factors can be weighted, or an additional score item can be added, so that a risk score is added on top of the original prediction score to obtain the final prediction score. This helps to further improve the reliability of disease risk prediction.
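One of the variants above, weighting the summed score when the local incidence exceeds the average by more than a threshold, can be sketched as follows. The function name, the default gap threshold, and the weight of 1.2 are hypothetical values chosen only for illustration.

```python
def adjust_score(base_score, regional_rate, average_rate,
                 gap_threshold=0.01, weight=1.2):
    """Scale the base score when the regional incidence is clearly above average.

    `gap_threshold` and `weight` are illustrative assumptions; the text
    only requires a threshold on the difference and a weight greater than 1.
    """
    if regional_rate - average_rate > gap_threshold:
        return base_score * weight
    return base_score


# Regional incidence 5% versus an average of 2%: the gap exceeds the threshold.
raised = adjust_score(5, regional_rate=0.05, average_rate=0.02)
# Regional incidence equal to the average: the score is left unchanged.
unchanged = adjust_score(5, regional_rate=0.02, average_rate=0.02)
```

Note that multiplying by a non-integer weight makes the final score non-integer; adding an integer score item instead, as the text also allows, would keep the score integral.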
In some embodiments, the risk factors may be binary features. Optionally, when acquiring the target diagnosis and treatment data of the target user, calling the risk prediction model to determine the corresponding target risk factors and their coefficients, and determining the risk prediction result of the target user for the target disease, the risk prediction device 101 may specifically: acquire the target diagnosis and treatment data of the target user and convert it into binary features; input the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor; and determine the risk prediction result based on the target risk factors and their coefficients.
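The conversion of target diagnosis and treatment data into binary features can be sketched as below. The field names and cut-off values are hypothetical assumptions built on the myocardial infarction example (age, systolic blood pressure, Killip class); the application does not specify them in this passage.

```python
def to_binary_features(record, rules):
    """Map a raw diagnosis-and-treatment record to 0/1 risk-factor features."""
    return {name: int(rule(record)) for name, rule in rules.items()}


# Hypothetical binarization rules; the thresholds are illustrative only.
RULES = {
    "age_ge_75": lambda r: r["age"] >= 75,
    "sbp_lt_100": lambda r: r["systolic_bp"] < 100,
    "killip_ge_2": lambda r: r["killip"] >= 2,
}

target_record = {"age": 80, "systolic_bp": 110, "killip": 3}
features = to_binary_features(target_record, RULES)

# The target risk factors are the features that evaluate to 1.
target_factors = [name for name, flag in features.items() if flag == 1]
```

The resulting list of active factors is what a model lookup would use to retrieve the corresponding integer coefficients.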
In some embodiments, the diagnosis and treatment data may include multiple items of risk factor data. When determining the multiple risk factors of the target disease according to the diagnosis and treatment data, the risk prediction device 101 may obtain the items of risk factor data included in the diagnosis and treatment data and, according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, convert the risk factor data into binary features to obtain the multiple risk factors. The risk factor data may be variables that affect the clinical outcome of the target disease.
In some embodiments, the risk prediction device 101 is further configured to acquire the outcome data corresponding to the diagnosis and treatment data. When screening out multiple second risk factors from the multiple first risk factors according to the objective function that includes the two-norm of the multiple first risk factors, and determining the coefficient of each screened second risk factor so as to determine the risk prediction model, the risk prediction device 101 may specifically: according to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, determine the screened second risk factors and their coefficients that minimize the objective function, thereby training the risk prediction model. The two-norm can be used to control the number of screened risk factors, that is, the number of second risk factors. Optionally, the objective function may be determined according to a logistic loss function and the two-norm.
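The training step above can be illustrated with a small sketch. Several assumptions are made here, since this passage does not give the exact formula: the objective is taken to be the average logistic loss plus a penalty weight times the two-norm of the coefficient vector, the integer set is a small hypothetical range, and an exhaustive search is swapped in for a real integer optimization solver. Features whose coefficient comes out as zero are treated as screened out.

```python
import itertools
import math


def logistic_loss(coeffs, intercept, X, y):
    """Average logistic loss for binary labels y in {0, 1}."""
    total = 0.0
    for row, label in zip(X, y):
        z = intercept + sum(c * x for c, x in zip(coeffs, row))
        margin = -z if label == 1 else z
        # Numerically stable evaluation of log(1 + exp(margin)).
        total += max(margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(X)


def objective(coeffs, intercept, X, y, penalty):
    """Assumed objective: logistic loss plus penalty times the two-norm."""
    two_norm = math.sqrt(sum(c * c for c in coeffs))
    return logistic_loss(coeffs, intercept, X, y) + penalty * two_norm


def fit_integer_model(X, y, coeff_values=range(-3, 4),
                      intercepts=range(-3, 4), penalty=0.05):
    """Exhaustively search integer coefficients (illustrative, small inputs only)."""
    best = None
    for coeffs in itertools.product(coeff_values, repeat=len(X[0])):
        for intercept in intercepts:
            value = objective(coeffs, intercept, X, y, penalty)
            if best is None or value < best[0]:
                best = (value, coeffs, intercept)
    _, coeffs, intercept = best
    # Keep only features with a nonzero coefficient (the screened factors).
    selected = {index: c for index, c in enumerate(coeffs) if c != 0}
    return selected, intercept


# Toy binary first risk factors (columns) and outcomes; feature 0 tracks the outcome.
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 1, 0]
selected, intercept = fit_integer_model(X, y)
```

Exhaustive search grows exponentially with the number of candidate risk factors, so a practical system would hand the same objective and integer constraints to a mixed-integer solver.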
In some embodiments, the risk prediction device 101 can also be used to receive a disease risk prediction request sent by a terminal, where the request carries the identifier of the target user;
the risk prediction device 101 may specifically acquire the target diagnosis and treatment data according to the identifier of the target user;
the risk prediction device 101 can also be used to determine a target score threshold according to the type of the target disease, and to send warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold.
The terminal that sends the disease risk prediction request may be any terminal, such as the terminal of a doctor, a patient, or another user, which is not limited in this application. In some embodiments, the terminal may be a specific terminal, such as a legitimate terminal that has passed verification; in other words, the target diagnosis and treatment data may be acquired only after the disease risk prediction request has been received and the terminal has been verified successfully. For example, the risk prediction device may receive the disease risk prediction request and then verify the identity of the terminal; if the verification passes, acquisition of the target diagnosis and treatment data is triggered. The verification may be performed in various ways. For example, the disease risk prediction request may also carry the identity of the terminal, and the device can check whether that identity is on a terminal whitelist: if it is, the verification passes; otherwise, the verification fails. As another example, the disease risk prediction request may be encrypted with a preset public key: if the request is successfully decrypted with the private key, the verification passes; otherwise, it fails. Other ways are not listed exhaustively here. Optionally, the disease risk prediction request may also carry the identifier of the target disease and/or the type of the target disease.
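The whitelist variant of the verification above can be sketched as follows. The request layout (a dictionary with `terminal_id` and `user_id` keys) and the `load_data` callback are hypothetical; the text only states that the request carries the terminal identity and the target user identifier.

```python
def verify_terminal(request, whitelist):
    """Return True when the requesting terminal's identity is on the whitelist."""
    return request.get("terminal_id") in whitelist


def handle_prediction_request(request, whitelist, load_data):
    """Fetch the target user's data only after the terminal passes verification.

    `load_data` is an assumed callback that looks up diagnosis and treatment
    data by user identifier; None is returned when verification fails.
    """
    if not verify_terminal(request, whitelist):
        return None
    return load_data(request["user_id"])


WHITELIST = {"terminal-001"}


def fake_store(user_id):
    # Stand-in for a storage device lookup.
    return {"user": user_id, "age": 80}


accepted = handle_prediction_request(
    {"terminal_id": "terminal-001", "user_id": "u1"}, WHITELIST, fake_store)
rejected = handle_prediction_request(
    {"terminal_id": "terminal-999", "user_id": "u1"}, WHITELIST, fake_store)
```

Checking the terminal before touching the storage device keeps unverified requests from triggering any read of patient data.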
Optionally, the warning information includes the risk item corresponding to the target risk factor, the prediction score, and a treatment plan. Further optionally, the treatment plan may be the treatment plan corresponding to the user group to which the target user belongs.
Further optionally, different disease types (or diseases) may correspond to different score thresholds, which may be determined according to the danger level of the disease type (or disease). For example, the higher the danger level of a disease type (or disease), the lower its score threshold may be; conversely, the lower the danger level, the higher its score threshold may be. This helps make the warning operation more flexible.
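The threshold rule above can be sketched as follows. This is a minimal illustration, not the application's implementation: the danger levels, the threshold values, and the names `THRESHOLD_BY_DANGER_LEVEL` and `should_warn` are all hypothetical.

```python
# Hypothetical mapping from danger level (1 = least dangerous) to score threshold.
# Higher danger levels get lower thresholds, so warnings are triggered earlier.
THRESHOLD_BY_DANGER_LEVEL = {1: 8, 2: 6, 3: 4}


def should_warn(predicted_score: int, danger_level: int) -> bool:
    """Return True when the score is strictly greater than the level's threshold."""
    threshold = THRESHOLD_BY_DANGER_LEVEL[danger_level]
    return predicted_score > threshold


if __name__ == "__main__":
    # The same score triggers a warning for a high-danger disease only.
    print(should_warn(5, danger_level=3))
    print(should_warn(5, danger_level=1))
```

The comparison is strict because the text sends a warning only when the score is greater than the threshold.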
Optionally, the storage device 102 may be a blockchain node, and the diagnosis and treatment data may be acquired from the blockchain. That is, the diagnosis and treatment data of each patient may be stored in the blockchain in advance. Acquiring a user's diagnosis and treatment data from a blockchain node can improve the reliability of the acquired data, which in turn helps improve the reliability of the disease risk determined from that data.
For example, in some embodiments, the risk prediction device 101 may further be configured to send a diagnosis and treatment data acquisition request to the storage device 102, the request carrying the identifier of the target user;
the storage device 102 may further be configured to receive the diagnosis and treatment data acquisition request and verify the identity of the risk prediction device; if the verification passes, to query the target diagnosis and treatment data of the target user according to the identifier of the target user, and to send the target diagnosis and treatment data to the risk prediction device;
the risk prediction device 101 may be specifically configured to receive the target diagnosis and treatment data sent by the storage device, thereby acquiring the target diagnosis and treatment data.
In the embodiments of the present application, the risk prediction device 101 may determine multiple first risk factors corresponding to the target disease from the diagnosis and treatment data of multiple users (for example, multiple target patients) acquired from the storage device 102. It may then screen multiple second risk factors out of the first risk factors according to an objective function that includes the two-norm of the risk factors, and determine the coefficient of each screened second risk factor from an integer set, thereby training a risk prediction model. The risk prediction model can subsequently be called to determine the target risk factors corresponding to acquired target diagnosis and treatment data (which may, for example, be acquired from the storage device 102) and the coefficient of each target risk factor, so as to obtain the target user's risk prediction result for the target disease. The embodiments of the present application can apply an integer optimization algorithm, using integer constraints and a two-norm-based objective function to control the number of risk factors and optimize the prediction result, which helps improve the prediction of disease risk.
Refer to FIG. 2, which is a schematic flowchart of a disease risk prediction method provided by an embodiment of the present application. The method may be executed by the risk prediction device described above. As shown in FIG. 2, the disease risk prediction method may include the following steps:
201. Acquire diagnosis and treatment data corresponding to the target disease for multiple users.
The users may be patients suffering from the target disease. The diagnosis and treatment data (samples) may include physical sign data, examination and test data, and so on, which is not detailed here.
For example, where the users are myocardial infarction patients and the target disease is myocardial infarction, the diagnosis and treatment data may include age, systolic blood pressure, Killip cardiac function class, and the like.
Optionally, the diagnosis and treatment data may be acquired from a blockchain; that is, the diagnosis and treatment data of each patient may be stored in the blockchain in advance, which is not detailed here. Acquiring the users' diagnosis and treatment data from the blockchain can improve the reliability of the disease risk predicted from that data.
202. Determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data.
Optionally, a risk factor may be a feature vector, such as a binary feature.
In some embodiments, the risk prediction device may also be used to acquire outcome data corresponding to the diagnosis and treatment data.
Once the diagnosis and treatment data of the target disease has been acquired, the risk factors of the target disease, namely the first risk factors, can be determined. A risk factor may be a vector of part of the user's diagnosis and treatment data, or a vector of data obtained by processing the diagnosis and treatment data; alternatively, the diagnosis and treatment data may directly consist of multiple risk factors, and so on, which is not limited in this application.
In some embodiments, determining the multiple risk factors corresponding to the target disease according to the diagnosis and treatment data may consist of converting the acquired diagnosis and treatment data, which includes multiple items of risk factor data, into binary features. For example, feature engineering may be performed on the risk factor data to convert it into binary features suitable for an integer optimization algorithm.
In some embodiments, the diagnosis and treatment data may include multiple items of risk factor data, and each risk factor involved in this application may be a binary feature. Optionally, when determining the multiple risk factors corresponding to the target disease, the risk factor data included in the diagnosis and treatment data may be acquired, and then converted into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain multiple risk factors. In other words, when determining risk factors from diagnosis and treatment data, the data can be converted into binary features according to the relationship between the risk factor data (that is, the variables that affect the clinical outcome) and the outcome.
Optionally, converting the diagnosis and treatment data into binary features according to the relationship between the risk factor data and the outcome may mean performing the conversion according to the parameter value of the risk factor data and a cut-off value associated with the outcome data, where the cut-off value indicates risk information for the outcome. For example, risk factor data below the cut-off value may share one binary feature value, and risk factor data at or above the cut-off value may share another, and so on, which is not limited in this application. Optionally, there may be one or more cut-off values; if there are several, the risk factor data within the interval defined by each cut-off value may share the same binary feature.
For example, if the relationship between risk factor data x (such as Killip cardiac function class) and outcome y (such as whether myocardial infarction occurs) is stratified by a cut-off value c (for example, a Killip class above II significantly raises the risk of myocardial infarction), then x can be segmented at the cut-off value c, converting the original continuous variable into a binary-coded variable.
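The cut-off based conversion can be sketched as below. This is an illustrative assumption rather than the application's encoding: the function name `binarize`, the strict "greater than" comparison (chosen to match "above class II"), and the age cut-offs in the usage lines are hypothetical.

```python
def binarize(value: float, cutoffs: list[float]) -> list[int]:
    """Return one 0/1 indicator per cut-off: 1 if value is above the cut-off.

    With several cut-offs, every value inside the same interval receives
    the same binary code.
    """
    return [1 if value > c else 0 for c in cutoffs]


if __name__ == "__main__":
    # Killip class with a single cut-off at class II.
    print(binarize(3, [2]))   # class III is above the cut-off
    print(binarize(2, [2]))   # class II is not
    # Hypothetical age cut-offs producing a multi-bit code.
    print(binarize(70, [50, 65, 80]))
```

Whether the boundary value itself counts as "high risk" is a design choice; the preceding paragraph also allows an "at or above" convention, which would only change `>` to `>=`.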
203. Screen multiple second risk factors out of the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and determine the coefficient of each screened second risk factor, so as to determine a risk prediction model based on those coefficients; the coefficient of each second risk factor is an integer determined from an integer set.
The screened risk factors, namely the second risk factors, are a subset of the multiple first risk factors corresponding to the target disease, and this subset can be used to characterize the key variables of the outcome of the target disease.
In other words, this application can train the risk prediction model on binary features, screening risk factors out of the multiple risk factors and determining the coefficient of each one by optimizing the two-norm-based objective function.
In some embodiments, the risk prediction model may be trained as follows: according to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, determine the screened second risk factors and their coefficients that minimize the objective function, thereby obtaining the trained risk prediction model. Optionally, the objective function may be determined from a logistic loss function and a two-norm, where the two-norm is used to control the number of screened risk factors.
The risk prediction model optimizes the objective function subject to the specified integer constraints. For example, the objective function may be as follows:
$$\min_{\theta}\; l(\theta) + C\lVert\theta\rVert_2$$
s.t. θ ∈ Ψ,
Figure PCTCN2021084030-appb-000002
Here, θ may denote the coefficient vector of the risk factors, that is, the coefficient (score) corresponding to each second risk factor; l(θ) may be the logistic loss function; C‖θ‖_2 may denote the two-norm term for the risk factor set; and Ψ may be an integer set. For example, the loss function may be expressed as follows:
Figure PCTCN2021084030-appb-000003
Here, n may denote the number of samples in the data, y_i is the clinical outcome (outcome data) corresponding to sample i, and x_i is the feature vector of sample i, such as the first risk factors.
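One commonly used form of the logistic loss that is consistent with the symbols defined above is given below. The label coding y_i ∈ {−1, +1} and the 1/n averaging are assumptions made for illustration; they are not stated in the application.

$$l(\theta) = \frac{1}{n}\sum_{i=1}^{n}\log\left(1 + \exp\left(-y_i\,\theta^{\top}x_i\right)\right)$$

Under this coding, a sample contributes little loss when the sign of θᵀx_i agrees with y_i and the margin is large, and a large loss when they disagree.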
The objective function of this application may add a two-norm term on the risk factors (C‖θ‖_2) to the conventional logistic loss function. The purpose is to control the number of risk factors selected by the model by adjusting the parameter C while obtaining the optimal solution, so that the optimal subset of risk factors is screened automatically. In addition, an integer constraint restricts the parameter θ to values from the predefined integer set Ψ, so that the model solution meets the practical requirements of a disease risk score.
Thus, given a data set containing the feature vectors and outcomes of n samples, the optimal combination of risk factors and the coefficient θ of each risk factor can be obtained by minimizing the objective function subject to the constraints, at which point model training is complete. The coefficient θ can be read as the risk score of each risk factor: a positive value indicates an increased risk of the clinical outcome, and a negative value indicates a reduced risk.
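The training step can be sketched as an exhaustive search over the integer set. The application does not name a solver, so brute force is swapped in here purely for illustration and is only practical for a handful of features. The label coding (−1/+1), the averaged logistic loss, the absence of an intercept, and all function names are assumptions.

```python
import itertools
import math


def logistic_loss(theta, X, y):
    """Mean logistic loss; outcomes y are assumed to be coded as -1 or +1."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(t * v for t, v in zip(theta, xi))
        total += math.log1p(math.exp(-margin))
    return total / len(X)


def objective(theta, X, y, C):
    """Logistic loss plus C times the two-norm of the coefficient vector."""
    norm2 = math.sqrt(sum(t * t for t in theta))
    return logistic_loss(theta, X, y) + C * norm2


def fit_integer_scores(X, y, integer_set, C):
    """Try every coefficient vector in integer_set^d and keep the minimizer.

    Returns the best coefficients and the indices of features whose
    coefficient is non-zero (the screened risk factors).
    """
    d = len(X[0])
    best = min(
        itertools.product(integer_set, repeat=d),
        key=lambda theta: objective(theta, X, y, C),
    )
    selected = [j for j, t in enumerate(best) if t != 0]
    return list(best), selected


if __name__ == "__main__":
    # Toy data: feature 0 tracks the outcome, feature 1 is always 0.
    X = [[1, 0], [1, 0], [0, 0], [0, 0]]
    y = [1, 1, -1, -1]
    print(fit_integer_scores(X, y, integer_set=range(-2, 3), C=0.05))
```

The search space grows as |Ψ|^d, so a real system would need a dedicated integer programming method; the sketch only shows how the loss, the penalty, and the integer constraint fit together.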
Subsequently, a user's disease risk score can be determined by acquiring the user's diagnosis and treatment data and using the risk factors corresponding to that data together with their coefficients.
In other optional embodiments, the multiple risk factors may be screened, and the coefficient of each screened risk factor determined, in other ways. For example, screening and coefficient determination may be based on the probability of each risk factor in the diagnosis and treatment data of multiple users such as multiple target patients (for example, the percentage of all samples in which a given risk factor appears) or on its count (for example, the number of samples in which it appears). The N risk factors with the highest probability or count may be taken as the screened risk factors, namely the second risk factors, and the higher the probability or count of a risk factor, the larger its coefficient (for example, probability intervals or count intervals may be defined, with all risk factors in the same interval receiving the same coefficient). The coefficient is an integer, and N is an integer greater than 2.
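A minimal sketch of this frequency-based alternative follows. The representation of a sample as a set of risk factor names, the interval boundaries, and the name `select_by_frequency` are hypothetical.

```python
from collections import Counter


def select_by_frequency(samples, top_n, interval_scores):
    """Pick the top_n most frequent risk factors and assign integer scores.

    samples: list of sets, each holding the risk factors present in one sample.
    interval_scores: (lower_bound, integer_score) pairs sorted by ascending
    lower_bound, with the first lower_bound equal to 0.0.
    """
    counts = Counter(factor for sample in samples for factor in sample)
    n = len(samples)
    result = {}
    for factor, count in counts.most_common(top_n):
        share = count / n
        # Use the score of the highest interval whose lower bound is reached.
        result[factor] = [s for lb, s in interval_scores if share >= lb][-1]
    return result


if __name__ == "__main__":
    samples = [{"a", "b", "c"}, {"a", "b"}, {"a", "c", "d"}, {"a", "b"}]
    intervals = [(0.0, 1), (0.5, 2), (0.9, 3)]
    print(select_by_frequency(samples, top_n=3, interval_scores=intervals))
```

Unlike the optimization-based approach, this variant ignores the outcome data, so it ranks factors by prevalence rather than by predictive value.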
204. Acquire target diagnosis and treatment data of a target user.
The target diagnosis and treatment data may include physical sign data, examination and test data, and so on; for example, it may include various diagnoses, tests, examinations, medications, and surgical items.
Optionally, the target diagnosis and treatment data may be obtained by processing collected raw diagnosis and treatment data. For example, the raw diagnosis and treatment data of the target user may be acquired, which may include records of multiple visits by the target user, each visit record including data such as diagnoses, tests, examinations, medications, and surgical items. The raw data may then be preprocessed to obtain preprocessed diagnosis and treatment data, which is not detailed here.
Optionally, the risk prediction device's disease risk prediction for a user, including acquisition of the target diagnosis and treatment data, may be triggered by a user request, triggered proactively for specific users, or triggered in other ways, which is not limited in this application.
For example, in some embodiments, the risk prediction device may also receive a disease risk prediction request sent by a terminal, the request carrying the identifier of the target user. The target diagnosis and treatment data can then be acquired according to the identifier of the target user.
In some embodiments, the risk prediction device may also send a diagnosis and treatment data acquisition request to the storage device, the request carrying the identifier of the target user. The storage device may receive the request and then verify the identity of the risk prediction device; if the verification passes, it queries the target diagnosis and treatment data of the target user according to the identifier of the target user and sends that data to the risk prediction device. The risk prediction device can thus receive the target diagnosis and treatment data sent by the storage device. Optionally, the storage device may be a blockchain node, or it may be a server or another storage device.
Optionally, one or more verification methods may be used. For example, the diagnosis and treatment data acquisition request may also carry the identity of the risk prediction device, and the storage device may check whether that identity is present in a preset whitelist: if it is, the verification passes; otherwise, the verification fails. As another example, the diagnosis and treatment data acquisition request may be encrypted with a preset public key; if the storage device successfully decrypts the request with the corresponding private key, the verification passes; otherwise, the verification fails. The storage device may also verify the request in other ways, which are not enumerated here.
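The whitelist variant of this check can be sketched as follows. The request layout (`device_id`, `user_id`), the in-memory record store, and the function name `handle_data_request` are hypothetical; a real storage device or blockchain node would sit behind a network interface.

```python
from typing import Optional


def handle_data_request(request: dict, whitelist: set, records: dict) -> Optional[dict]:
    """Return the target user's data only if the requesting device is whitelisted.

    Returns None when verification fails or when no record exists for the user.
    """
    if request.get("device_id") not in whitelist:
        return None  # verification failed: do not query any data
    return records.get(request.get("user_id"))


if __name__ == "__main__":
    whitelist = {"risk-predictor-101"}
    records = {"user-1": {"age": 67, "killip": 3, "systolic_bp": 95}}
    ok = {"device_id": "risk-predictor-101", "user_id": "user-1"}
    bad = {"device_id": "unknown-device", "user_id": "user-1"}
    print(handle_data_request(ok, whitelist, records))
    print(handle_data_request(bad, whitelist, records))
```

The same shape applies to the terminal-side check described earlier, with the terminal's identity in place of the device's.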
205. Call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determine the target user's risk prediction result for the target disease based on the target risk factors and their coefficients.
Optionally, the risk prediction result may include the target user's prediction score for the target disease. The prediction score may be the sum of the coefficients of the target risk factors, or may be obtained by otherwise processing those coefficients, which is not detailed here.
In some embodiments, when determining the risk prediction result for the target disease, the risk prediction device may convert the acquired target diagnosis and treatment data into binary features, input those binary features into the risk prediction model to obtain the corresponding target risk factors and the coefficient of each target risk factor, and determine the risk prediction result based on the target risk factors and their coefficients.
In other words, to obtain the disease risk prediction result for the target user, the target diagnosis and treatment data can be converted into binary features, and the trained risk prediction model can be called to process those features and produce the target user's risk score for the target disease. For example, the coefficients of the risk factors determined by the risk prediction model are summed to give the final risk score; that is, the scores corresponding to the actual values of the risk factors finally selected by the algorithm are added up to obtain the target user's final risk score.
This application can convert raw data into binary features that can be fed directly into an integer optimization algorithm, apply the integer optimization algorithm, satisfy the scoring requirements of a risk score by setting integer constraints, and solve the model while controlling the number of risk factors by optimizing the two-norm-based objective function, so as to achieve reliable prediction of disease risk.
For example, consider predicting the risk of in-hospital death for myocardial infarction patients. Suppose the risk factors screened by the risk prediction model are cardiac arrest, age, Killip class, and systolic blood pressure, with coefficients (scores) of 2, 1, 1, and 1 respectively. A patient for whom all four risk factors are present then has a risk score of 5. That is, if a patient has a history of cardiac arrest, two points are added to the total risk score, and so on. The user's risk score can thus be determined quickly.
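The arithmetic of this example can be sketched as a sum over the risk factors that are present. The coefficient values come from the example above; the dictionary keys and the function name `risk_score` are hypothetical.

```python
# Coefficients from the myocardial infarction example above.
COEFFICIENTS = {"cardiac_arrest": 2, "age": 1, "killip": 1, "systolic_bp": 1}


def risk_score(binary_features: dict, coefficients: dict) -> int:
    """Sum the coefficients of the risk factors whose binary feature is 1."""
    return sum(
        coef
        for name, coef in coefficients.items()
        if binary_features.get(name, 0) == 1
    )


if __name__ == "__main__":
    all_present = {"cardiac_arrest": 1, "age": 1, "killip": 1, "systolic_bp": 1}
    only_arrest = {"cardiac_arrest": 1}
    print(risk_score(all_present, COEFFICIENTS))  # 2 + 1 + 1 + 1 = 5
    print(risk_score(only_arrest, COEFFICIENTS))  # 2
```

Because the coefficients are small integers, the score can also be tallied by hand at the bedside, which is the practical reason for the integer constraint.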
In some embodiments, the risk prediction device may also determine a target score threshold according to the type of the target disease, and send warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold. The prediction score is the sum of the coefficients of the target risk factors, and the warning information may include the risk items corresponding to the target risk factors, the prediction score, a treatment plan, and similar information.
Optionally, the treatment plan may be the treatment plan corresponding to the user group to which the target user belongs. Further optionally, that user group may be the group with the largest net benefit under the treatment plan. For example, users can be grouped according to the net benefit of each treatment plan, yielding for each plan the user group with the largest net benefit. Treatment plans can then be recommended with net benefit taken into account, for example by recommending to the target user the plan for which the target user's group has the largest net benefit. Users thus receive recommendations that are cost-effective from a health economics perspective: provided that the treatment is effective, the most cost-effective option is chosen for the patient, which helps reduce the patient's financial burden and the burden on medical insurance.
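The grouping-based recommendation can be sketched as below. The application does not define how net benefit is computed, so the numbers, the nested-dictionary layout, and the function names are hypothetical placeholders.

```python
def best_group_per_plan(net_benefit: dict) -> dict:
    """For each treatment plan, find the user group with the largest net benefit.

    net_benefit maps plan name -> {group name: net benefit value}.
    """
    return {plan: max(groups, key=groups.get) for plan, groups in net_benefit.items()}


def recommend_plans(user_group: str, net_benefit: dict) -> list:
    """Return the plans for which the user's group is the best-benefiting group."""
    best = best_group_per_plan(net_benefit)
    return [plan for plan, group in best.items() if group == user_group]


if __name__ == "__main__":
    # Hypothetical net benefit values per plan and user group.
    net_benefit = {
        "plan_a": {"group_1": 5.0, "group_2": 3.0},
        "plan_b": {"group_1": 1.0, "group_2": 4.0},
    }
    print(recommend_plans("group_2", net_benefit))
```

A group that is not the top beneficiary of any plan gets an empty list here; a fuller system would need a fallback, such as the plan with the highest net benefit for that group.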
In the embodiments of the present application, the risk prediction device may determine multiple risk factors corresponding to the target disease from the acquired diagnosis and treatment data of multiple users, screen a subset of those risk factors according to an objective function that includes the two-norm of the risk factors, and determine the coefficient of each screened risk factor from an integer set, thereby training a risk prediction model. The model can then be called to determine the target risk factors corresponding to acquired target diagnosis and treatment data and the coefficient of each target risk factor, so as to obtain the target user's risk prediction result for the target disease. The embodiments can apply an integer optimization algorithm that unifies feature selection, model parameter learning, and risk factor scoring in one process, avoiding the subjectivity of traditional methods. They can also be combined with automated data preprocessing, such as data imputation and feature engineering, which converts raw data into a form that can be fed directly into the integer optimization algorithm. The process is convenient, fast, and highly automated, and its results meet the clinical requirements of disease risk scoring, so it can be used by clinicians with no algorithm or development experience.
It can be understood that the above method embodiments are all examples of the disease risk prediction method or system of this application, and each embodiment is described with its own emphasis. For parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
An embodiment of the present application also provides a disease risk prediction apparatus. The apparatus may include modules for executing the method of FIG. 2 described above. Refer to FIG. 3, which is a schematic structural diagram of a disease risk prediction apparatus provided by an embodiment of the present application. The disease risk prediction apparatus described in this embodiment may be configured in a risk prediction device. As shown in FIG. 3, the disease risk prediction apparatus 300 of this embodiment may include:
an acquisition module 301, configured to acquire diagnosis and treatment data corresponding to the target disease for multiple users;
a determination module 302, configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;
a processing module 303, configured to screen multiple second risk factors out of the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors, and to determine the coefficient of each screened second risk factor, so as to determine a risk prediction model based on those coefficients, wherein the coefficient of each second risk factor is an integer determined from an integer set;
the acquisition module 301 being further configured to acquire target diagnosis and treatment data of a target user;
the processing module 303 being further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine the target user's risk prediction result for the target disease based on the target risk factors and their coefficients.
In some embodiments, when calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the target user's risk prediction result for the target disease based on the target risk factors and their coefficients, the processing module 303 may specifically perform the following steps:
converting the target diagnosis and treatment data into binary features;
inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and their coefficients;
wherein the risk prediction result includes the target user's prediction score for the target disease, and the prediction score is the sum of the coefficients of the target risk factors.
In some embodiments, the acquiring module 301 may further be configured to acquire outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
the diagnosis and treatment data includes a plurality of risk factor data; when determining the plurality of risk factors of the target disease according to the diagnosis and treatment data, the determining module 302 may specifically perform the following steps:
acquiring the plurality of risk factor data included in the diagnosis and treatment data;
converting the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain the plurality of risk factors.
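As one possible reading of the binarization step above, the following Python sketch picks a cut-off for a continuous risk factor by comparing outcome rates on either side of each candidate threshold. The selection rule (largest gap in outcome rate) and the function names are assumptions made for illustration; the application does not state how the relationship with the outcome data is evaluated.

```python
from typing import List


def best_threshold(values: List[float], outcomes: List[int]) -> float:
    """Pick the cut-off whose two sides differ most in outcome rate (assumed rule)."""
    best_t, best_gap = None, -1.0
    for t in sorted(set(values)):
        high = [o for v, o in zip(values, outcomes) if v >= t]
        low = [o for v, o in zip(values, outcomes) if v < t]
        if not high or not low:
            continue  # a usable cut-off must leave samples on both sides
        gap = abs(sum(high) / len(high) - sum(low) / len(low))
        if gap > best_gap:
            best_t, best_gap = t, gap
    if best_t is None:
        raise ValueError("need at least two distinct values to choose a cut-off")
    return best_t


def binarize(values: List[float], threshold: float) -> List[int]:
    """Turn the raw risk factor data into a 0/1 feature using the chosen cut-off."""
    return [int(v >= threshold) for v in values]
```

For example, with systolic blood pressure readings and a 0/1 outcome flag, `best_threshold` returns the reading that best separates users with and without the outcome, and `binarize` produces the corresponding binary risk factor.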
In some embodiments, the acquiring module 301 is further configured to acquire the outcome data corresponding to the diagnosis and treatment data;
when screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor so as to determine the risk prediction model based on those coefficients, the processing module 303 may specifically perform the following steps:
determining, according to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, the plurality of screened-out second risk factors and the coefficient of each screened-out second risk factor that minimize the objective function, so as to obtain the risk prediction model through training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of screened-out second risk factors.
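The text above does not give the objective function explicitly. One form consistent with the description (a logistic loss plus a two-norm term, minimized over integer coefficients) might be written as follows. The symbols $N$, $x_i$, $y_i$, $\lambda$, $C$ and $\mathcal{L}$ are notation assumed here for illustration and do not appear in the application.

```latex
% x_i \in \{0,1\}^d : binary first risk factors of user i
% y_i \in \{-1,+1\} : outcome data of user i
% \lambda           : coefficient vector, restricted to a finite integer set \mathcal{L}
% C > 0             : assumed weight of the two-norm term
\min_{\lambda \in \mathcal{L} \subset \mathbb{Z}^{d}}
  \; \frac{1}{N} \sum_{i=1}^{N} \log\!\bigl(1 + \exp(-y_i \, \lambda^{\top} x_i)\bigr)
  \; + \; C \, \lVert \lambda \rVert_{2}
```

Under this reading, the first risk factors whose coefficient in the minimizer is nonzero are the screened-out second risk factors, and a larger $C$ pushes more coefficients to zero.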
In some embodiments, the acquiring module 301 is further configured to receive a disease risk prediction request sent by a terminal, the disease risk prediction request carrying an identifier of the target user;
the acquiring module 301 is further configured to acquire the target diagnosis and treatment data according to the identifier of the target user;
the determining module 302 is further configured to determine a target score threshold according to the type of the target disease, and to send early warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold;
wherein the prediction score is the sum of the coefficients of the target risk factors; the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan; and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
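The threshold check and the content of the early warning information can be sketched as below. The disease types, threshold values, user groups, and treatment plan texts are hypothetical placeholders, and `EarlyWarning` and `build_warning` are names introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical per-disease score thresholds and per-group treatment plans.
SCORE_THRESHOLDS: Dict[str, int] = {"hypertension": 3, "diabetes": 4}
TREATMENT_PLANS: Dict[str, str] = {
    "elderly": "placeholder plan for the elderly user group",
    "adult": "placeholder plan for the adult user group",
}


@dataclass
class EarlyWarning:
    risk_items: List[str]   # risk items corresponding to the target risk factors
    score: int              # prediction score
    treatment_plan: str     # plan of the user group the target user belongs to


def build_warning(disease_type: str,
                  target_factors: Dict[str, int],
                  user_group: str) -> Optional[EarlyWarning]:
    """Return early warning information only when the score exceeds the threshold."""
    score = sum(target_factors.values())
    threshold = SCORE_THRESHOLDS[disease_type]  # threshold depends on disease type
    if score <= threshold:
        return None  # no warning is sent to the terminal
    return EarlyWarning(
        risk_items=sorted(target_factors),
        score=score,
        treatment_plan=TREATMENT_PLANS[user_group],
    )
```

Note that the comparison is strict, matching the wording "greater than the target score threshold": a score equal to the threshold does not trigger a warning in this sketch.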
In some embodiments, the storage device is a blockchain node;
the acquiring module 301 is further configured to send a diagnosis and treatment data acquisition request to the storage device, the request carrying the identifier of the target user, so that the storage device verifies the identity of the risk prediction device and, if the verification passes, queries the target diagnosis and treatment data of the target user according to the identifier of the target user and sends the target diagnosis and treatment data to the risk prediction device;
the acquiring module 301 is specifically configured to receive the target diagnosis and treatment data sent by the storage device.
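The request and verification exchange above can be sketched as follows. The application does not say how the storage device verifies the identity of the risk prediction device; this sketch assumes a pre-shared key and an HMAC-SHA256 signature over the user identifier, purely for illustration. `StorageNode`, `sign_request`, and the in-memory dictionaries are hypothetical stand-ins for a blockchain node and its ledger.

```python
import hashlib
import hmac
from typing import Dict, Optional


def sign_request(key: bytes, user_id: str) -> str:
    """Signature the risk prediction device attaches to its request (assumed scheme)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


class StorageNode:
    """In-memory stand-in for the storage device (a blockchain node in the text)."""

    def __init__(self, device_keys: Dict[str, bytes], records: Dict[str, dict]):
        self._device_keys = device_keys  # registered risk prediction devices
        self._records = records          # diagnosis and treatment data by user id

    def handle_request(self, device_id: str, user_id: str,
                       signature: str) -> Optional[dict]:
        key = self._device_keys.get(device_id)
        if key is None:
            return None  # unknown device: verification fails
        expected = sign_request(key, user_id)
        if not hmac.compare_digest(expected, signature):
            return None  # bad signature: verification fails
        # Verification passed: look up the target user's data by identifier.
        return self._records.get(user_id)
```

A risk prediction device holding the shared key would call `sign_request` and pass the result to `handle_request`; a request signed with any other key receives no data.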
It can be understood that the functional modules of the disease risk prediction apparatus of this embodiment may be implemented according to the method of FIG. 2 in the foregoing method embodiment; for the specific implementation process, refer to the related description of FIG. 2 in the foregoing method embodiment, which is not repeated here.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a risk prediction device provided by an embodiment of the present application. As shown in FIG. 4, the risk prediction device may include a processor 401 and a memory 402. Optionally, the risk prediction device may further include a communication interface 403. The processor 401, the memory 402, and the communication interface 403 may be connected by a bus or in other ways; FIG. 4 of this embodiment takes connection by a bus as an example. The communication interface 403 may be controlled by the processor to send and receive messages, the memory 402 may be used to store a computer program that includes program instructions, and the processor 401 is used to execute the program instructions stored in the memory 402. The processor 401 is configured to call the program instructions to perform the following steps:
acquiring diagnosis and treatment data corresponding to a target disease of a plurality of users;
determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease;
screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
acquiring target diagnosis and treatment data of a target user;
calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease.
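The training steps listed above (screening risk factors and choosing integer coefficients by minimizing an objective) can be illustrated with a brute-force Python sketch. It assumes the objective is the mean logistic loss plus `c` times the two-norm of the coefficient vector, that outcomes are coded as -1/+1, and that the integer set is small enough to enumerate; none of these details is fixed by the application, and a practical system would need a more scalable solver.

```python
import itertools
import math
from typing import Dict, Sequence, Tuple


def objective(coefs: Sequence[int], X: Sequence[Sequence[int]],
              y: Sequence[int], c: float) -> float:
    """Mean logistic loss plus c times the two-norm of the coefficients (assumed form)."""
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(w * x for w, x in zip(coefs, xi))
        loss += math.log1p(math.exp(-margin))
    return loss / len(X) + c * math.sqrt(sum(w * w for w in coefs))


def fit_integer_model(X: Sequence[Sequence[int]], y: Sequence[int],
                      integer_set: Sequence[int] = (-2, -1, 0, 1, 2),
                      c: float = 0.05) -> Tuple[Tuple[int, ...], Dict[int, int]]:
    """Enumerate every integer coefficient vector and keep the minimizer."""
    n_features = len(X[0])
    best = min(
        itertools.product(integer_set, repeat=n_features),
        key=lambda coefs: objective(coefs, X, y, c),
    )
    # Second risk factors: first risk factors whose coefficient is nonzero.
    selected = {j: w for j, w in enumerate(best) if w != 0}
    return best, selected
```

The returned `selected` mapping plays the role of the risk prediction model: it lists the indices of the retained risk factors together with their integer coefficients, which can then be summed for a target user as described in the last step.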
In some embodiments, when calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease, the processor 401 may specifically perform the following steps:
converting the target diagnosis and treatment data into binary features;
inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and their coefficients;
wherein the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of the target risk factors.
In some embodiments, the processor 401 may further perform:
acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
the diagnosis and treatment data includes a plurality of risk factor data; when determining the plurality of first risk factors of the target disease according to the diagnosis and treatment data, the processor 401 may specifically perform the following steps:
acquiring the plurality of risk factor data included in the diagnosis and treatment data;
converting the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain the plurality of first risk factors.
In some embodiments, the processor 401 may further perform the following steps:
acquiring the outcome data corresponding to the diagnosis and treatment data;
when screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor so as to determine the risk prediction model based on those coefficients, the processor 401 may specifically perform the following steps:
determining, according to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, the plurality of screened-out second risk factors and the coefficient of each screened-out second risk factor that minimize the objective function, so as to obtain the risk prediction model through training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of screened-out second risk factors.
In some embodiments, the processor 401 may further perform the following steps:
receiving, through the communication interface 403, a disease risk prediction request sent by a terminal, the disease risk prediction request carrying an identifier of the target user;
acquiring the target diagnosis and treatment data according to the identifier of the target user;
determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold;
wherein the prediction score is the sum of the coefficients of the target risk factors; the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan; and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
In some embodiments, the storage device is a blockchain node; the processor 401 may further perform the following steps:
sending, through the communication interface 403, a diagnosis and treatment data acquisition request to the storage device, the request carrying the identifier of the target user, so that the storage device verifies the identity of the risk prediction device and, if the verification passes, queries the target diagnosis and treatment data of the target user according to the identifier of the target user and sends the target diagnosis and treatment data to the risk prediction device;
receiving, through the communication interface 403, the target diagnosis and treatment data sent by the storage device.
It should be understood that, in the embodiments of the present application, the processor 401 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include a non-volatile random access memory. For example, the memory 402 may also store the diagnosis and treatment data of users.
The communication interface 403 may include an input device and/or an output device. For example, the input device may be a control panel, a microphone, a receiver, or the like, and the output device may be a display screen, a transmitter, or the like; these are not listed exhaustively here.
In a specific implementation, the processor 401, the memory 402, and the communication interface 403 described in the embodiments of the present application may perform the implementation described in the method embodiment of FIG. 2 provided by the embodiments of the present application, and may also perform the implementation of the disease risk prediction apparatus described in the embodiments of the present application, which is not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. The computer program includes program instructions which, when executed by a processor, can perform some or all of the steps of the foregoing disease risk prediction method embodiments, such as some or all of the steps performed by the risk prediction device.
Optionally, the storage medium involved in the present application, such as the computer-readable storage medium, may be non-volatile or volatile.
An embodiment of the present application further provides a computer program product that includes computer program code. When the computer program code runs on a computer, the computer is caused to perform the steps performed in the foregoing disease risk prediction method embodiments.
In some embodiments, the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created through the use of the blockchain node, and the like.
The blockchain referred to in the present application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods. Each data block contains information on a batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
A person of ordinary skill in the art can understand that all or part of the processes of the foregoing method embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present application and certainly cannot be used to limit the scope of rights of the present application. A person of ordinary skill in the art can understand that implementations of all or part of the processes of the foregoing embodiments, and equivalent changes made in accordance with the claims of the present application, still fall within the scope covered by the invention.

Claims (20)

  1. A disease risk prediction method, comprising:
    acquiring diagnosis and treatment data corresponding to a target disease of a plurality of users;
    determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease;
    screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
    acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease.
  2. The method according to claim 1, wherein the method further comprises:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    the diagnosis and treatment data includes a plurality of risk factor data; the determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease comprises:
    acquiring the plurality of risk factor data included in the diagnosis and treatment data;
    converting the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain the plurality of first risk factors.
  3. The method according to claim 1, wherein the acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease comprises:
    acquiring the target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;
    inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and their coefficients;
    wherein the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of the target risk factors.
  4. The method according to claim 1, wherein the method further comprises:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    the screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors comprises:
    determining, according to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, the plurality of screened-out second risk factors and the coefficient of each screened-out second risk factor that minimize the objective function, so as to obtain the risk prediction model through training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of screened-out second risk factors.
  5. The method according to any one of claims 1-4, wherein the method further comprises:
    receiving a disease risk prediction request sent by a terminal, the disease risk prediction request carrying an identifier of the target user; the target diagnosis and treatment data being acquired according to the identifier of the target user;
    determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold;
    wherein the prediction score is the sum of the coefficients of the target risk factors; the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan; and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
  6. The method according to any one of claims 1-4, wherein the method further comprises:
    sending a diagnosis and treatment data acquisition request to a storage device, the request carrying an identifier of the target user; the storage device being a blockchain node;
    the acquiring target diagnosis and treatment data of a target user comprises:
    receiving the target diagnosis and treatment data sent by the storage device.
  7. A disease risk prediction apparatus, comprising:
    an acquiring module, configured to acquire diagnosis and treatment data corresponding to a target disease of a plurality of users;
    a determining module, configured to determine, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease;
    a processing module, configured to screen out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and to determine the coefficient of each screened-out second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
    the acquiring module being further configured to acquire target diagnosis and treatment data of a target user;
    the processing module being further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease.
  8. A risk prediction device, comprising a processor and a memory connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the following method:
    acquiring diagnosis and treatment data corresponding to a target disease of a plurality of users;
    determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease;
    screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened-out second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
    acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease.
  9. The risk prediction device according to claim 8, wherein the processor is further configured to perform:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    the diagnosis and treatment data includes a plurality of risk factor data; performing the determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease comprises:
    acquiring the plurality of risk factor data included in the diagnosis and treatment data;
    converting the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain the plurality of first risk factors.
  10. The risk prediction device according to claim 8, wherein performing the acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, a risk prediction result of the target user for the target disease comprises:
    acquiring the target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;
    inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and their coefficients;
    wherein the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of the target risk factors.
  11. The risk prediction device according to claim 8, wherein the processor is further configured to execute:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    wherein the screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, comprises:
    determining, according to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, the plurality of screened second risk factors and the coefficient of each screened second risk factor that minimize the objective function, so as to train the risk prediction model; wherein the objective function is determined from a logistic loss function and a two-norm, and the two-norm is used to control the number of screened second risk factors.
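One plausible way to write down an objective of the kind described above is given below. The claims only state that the objective function is determined from a logistic loss function and a two-norm, and that the coefficients are integers drawn from a set of integers; the exact form, the symbols, and the trade-off weight are assumptions. Here x_i is the binary feature vector of user i over the d first risk factors, y_i in {-1, +1} encodes the outcome data, lambda is the coefficient vector, C > 0 is a hypothetical trade-off weight, and L is the allowed integer set.

```latex
\min_{\lambda \in \mathcal{L}} \;
  \frac{1}{n} \sum_{i=1}^{n}
    \log\!\left(1 + \exp\!\left(-y_i \, \lambda^{\top} x_i\right)\right)
  + C \,\lVert \lambda \rVert_2 ,
\qquad \mathcal{L} \subseteq \mathbb{Z}^{d}
```

Under this reading, the second risk factors are the first risk factors whose coefficient is nonzero in the minimizer, and a larger C pushes more coefficients toward zero.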
  12. The risk prediction device according to any one of claims 8-11, wherein the processor is further configured to execute:
    receiving a disease risk prediction request sent by a terminal, the disease risk prediction request carrying an identifier of the target user, wherein the target diagnosis and treatment data is acquired according to the identifier of the target user;
    determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold;
    wherein the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
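The early warning logic in the claim above might look like the following Python sketch. The threshold table, the disease type keys, the message fields, and the function name `build_early_warning` are hypothetical; the claim only requires a threshold chosen per disease type and a warning that carries the risk items, the prediction score, and a treatment plan.

```python
from typing import Dict, List, Optional

# Hypothetical per-disease score thresholds.
SCORE_THRESHOLDS: Dict[str, int] = {"diabetes": 4, "stroke": 6}


def build_early_warning(disease_type: str,
                        score: int,
                        risk_items: List[str],
                        treatment_plan: str) -> Optional[dict]:
    """Return early warning information when the score exceeds the
    threshold for this disease type, otherwise None (no warning sent)."""
    threshold = SCORE_THRESHOLDS[disease_type]
    if score > threshold:  # strictly greater, as in the claim
        return {
            "risk_items": risk_items,
            "prediction_score": score,
            "treatment_plan": treatment_plan,
        }
    return None


if __name__ == "__main__":
    warning = build_early_warning(
        "diabetes", 5, ["hypertension", "age_ge_70"],
        "plan for the user's group")
    print(warning)
```

A score equal to the threshold does not trigger a warning in this sketch, which follows the "greater than" wording of the claim.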
  13. The risk prediction device according to any one of claims 8-11, wherein the processor is further configured to execute:
    sending a diagnosis and treatment data acquisition request to a storage device, the diagnosis and treatment data acquisition request carrying the identifier of the target user, wherein the storage device is a blockchain node;
    wherein the acquiring target diagnosis and treatment data of the target user comprises:
    receiving the target diagnosis and treatment data sent by the storage device.
  14. A computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the following method:
    acquiring diagnosis and treatment data corresponding to a target disease for multiple users;
    determining a plurality of first risk factors corresponding to the target disease according to the diagnosis and treatment data;
    screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
    acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
  15. The computer-readable storage medium according to claim 14, wherein the program instructions, when executed by the processor, further cause the processor to execute:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    wherein the diagnosis and treatment data includes multiple risk factor data, and the determining a plurality of first risk factors corresponding to the target disease according to the diagnosis and treatment data comprises:
    acquiring the multiple risk factor data included in the diagnosis and treatment data;
    converting the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, to obtain a plurality of first risk factors.
  16. The computer-readable storage medium according to claim 14, wherein the acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors comprises:
    acquiring the target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;
    inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and the coefficients of the target risk factors;
    wherein the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of the target risk factors.
  17. The computer-readable storage medium according to claim 14, wherein the program instructions, when executed by the processor, further cause the processor to execute:
    acquiring outcome data corresponding to the diagnosis and treatment data, the outcome data being used to indicate the health status of a user;
    wherein the screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors, comprises:
    determining, according to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, the plurality of screened second risk factors and the coefficient of each screened second risk factor that minimize the objective function, so as to train the risk prediction model; wherein the objective function is determined from a logistic loss function and a two-norm, and the two-norm is used to control the number of screened second risk factors.
  18. The computer-readable storage medium according to any one of claims 14-17, wherein the program instructions, when executed by the processor, further cause the processor to execute:
    receiving a disease risk prediction request sent by a terminal, the disease risk prediction request carrying an identifier of the target user, wherein the target diagnosis and treatment data is acquired according to the identifier of the target user;
    determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the prediction score included in the risk prediction result is greater than the target score threshold;
    wherein the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
  19. The computer-readable storage medium according to any one of claims 14-17, wherein the program instructions, when executed by the processor, further cause the processor to execute:
    sending a diagnosis and treatment data acquisition request to a storage device, the diagnosis and treatment data acquisition request carrying the identifier of the target user, wherein the storage device is a blockchain node;
    wherein the acquiring target diagnosis and treatment data of the target user comprises:
    receiving the target diagnosis and treatment data sent by the storage device.
  20. A disease risk prediction system, comprising a risk prediction device and a storage device, wherein the storage device is configured to store diagnosis and treatment data of users;
    the risk prediction device is configured to perform the following steps:
    acquiring, from the storage device, diagnosis and treatment data corresponding to a target disease for multiple users;
    determining a plurality of first risk factors corresponding to the target disease according to the diagnosis and treatment data;
    screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes a two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;
    acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
PCT/CN2021/084030 2020-11-02 2021-03-30 Disease risk prediction system, method and apparatus, device and medium WO2021180244A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011200812.2A CN112017785B (en) 2020-11-02 2020-11-02 Disease risk prediction system, method, device, equipment and medium
CN202011200812.2 2020-11-02

Publications (1)

Publication Number Publication Date
WO2021180244A1 (en) 2021-09-16

Family

ID=73527729

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084030 WO2021180244A1 (en) 2020-11-02 2021-03-30 Disease risk prediction system, method and apparatus, device and medium

Country Status (2)

Country Link
CN (1) CN112017785B (en)
WO (1) WO2021180244A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116741384A (en) * 2023-08-14 2023-09-12 惠民县人民医院 Bedside care-based severe acute pancreatitis clinical data management method

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112017785B (en) * 2020-11-02 2021-02-05 平安科技(深圳)有限公司 Disease risk prediction system, method, device, equipment and medium
TWI764478B (en) * 2020-12-28 2022-05-11 中華電信股份有限公司 Health risk grading apparatus, system and method thereof
CN112712435A (en) * 2020-12-28 2021-04-27 天津幸福生命科技有限公司 Service management system, computer-readable storage medium, and electronic device
CN113782216B (en) * 2021-09-15 2023-10-24 平安科技(深圳)有限公司 Disabling weight determining method and device, electronic equipment and storage medium
CN114242244A (en) * 2021-12-21 2022-03-25 广州海思医疗科技有限公司 Personalized medicine-taking risk prediction and risk stratification system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170242972A1 (en) * 2016-02-19 2017-08-24 International Business Machines Corporation Method for proactive comprehensive geriatric risk screening
CN109859846A (en) * 2019-01-08 2019-06-07 重庆邮电大学 A kind of personal health archives storage method based on privately owned chain
CN111243736A (en) * 2019-10-24 2020-06-05 中国人民解放军海军军医大学第三附属医院 Survival risk assessment method and system
CN111554401A (en) * 2020-03-26 2020-08-18 肾泰网健康科技(南京)有限公司 Method for constructing AI (artificial intelligence) chronic kidney disease screening model, and chronic kidney disease screening method and system
CN112017785A (en) * 2020-11-02 2020-12-01 平安科技(深圳)有限公司 Disease risk prediction system, method, device, equipment and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180322956A1 (en) * 2017-05-05 2018-11-08 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Time Window-Based Platform for the Rapid Stratification of Blunt Trauma Patients into Distinct Outcome Cohorts
CN108198621B (en) * 2018-01-18 2022-03-08 中山大学 Database data comprehensive diagnosis and treatment decision method based on neural network
CN111178449B (en) * 2019-12-31 2021-11-05 浙江大学 Liver cancer image classification method combining computer vision characteristics and imaging omics characteristics

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116741384A (en) * 2023-08-14 2023-09-12 惠民县人民医院 Bedside care-based severe acute pancreatitis clinical data management method
CN116741384B (en) * 2023-08-14 2023-11-21 惠民县人民医院 Bedside care-based severe acute pancreatitis clinical data management method

Also Published As

Publication number Publication date
CN112017785A (en) 2020-12-01
CN112017785B (en) 2021-02-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21768550; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 27/06/2023))
122 Ep: pct application non-entry in european phase (Ref document number: 21768550; Country of ref document: EP; Kind code of ref document: A1)