WO2021180244A1

WO2021180244A1 - Disease risk prediction system, method and apparatus, device and medium

Info

Publication number: WO2021180244A1
Application number: PCT/CN2021/084030
Authority: WO
Inventors: 陈天歌
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-11-02
Filing date: 2021-03-30
Publication date: 2021-09-16
Also published as: CN112017785A; CN112017785B

Abstract

A disease risk prediction system, method and apparatus, a device and a storage medium, which are applied to the field of medical technology. The disease risk prediction system comprises a risk prediction device (101) and a storage device (102), the storage device being used for storing diagnosis and treatment data of a user, and the risk prediction device being used for executing the following steps: acquiring diagnosis and treatment data corresponding to a target disease of a plurality of users (201); determining, according to the diagnosis and treatment data, a plurality of first risk factors corresponding to the target disease (202); screening a plurality of second risk factors from the plurality of first risk factors according to a target function including the 2-norm of the plurality of first risk factors, and determining coefficients of the screened second risk factors, so as to determine a risk prediction model, the coefficients being integers determined from an integer set (203); and acquiring target diagnosis and treatment data of a target user (204). The risk prediction model is invoked to determine a risk prediction result of the target user for the target disease, which helps to improve the prediction effect of disease risks.

Description

Disease risk prediction system, method, device, equipment and medium

This application claims priority to Chinese patent application No. 202011200812.2, filed with the Chinese Patent Office on November 2, 2020 and entitled "a disease risk prediction system, method, device, equipment and medium", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a disease risk prediction system, method, device, equipment, and medium.

Background art

In the field of medical technology, it is of great significance to predict the risk that a user will develop a certain disease. For example, accurate risk prediction helps to formulate diagnosis and treatment plans for patients, improve patient prognosis, and so on. Therefore, how to realize disease risk prediction and improve the prediction effect has become an urgent problem to be solved.

Summary of the invention

The embodiments of the present application provide a disease risk prediction system, method, device, equipment, and medium, which help to improve the disease risk prediction effect.

In the first aspect, an embodiment of the present application provides a disease risk prediction system, including: a risk prediction device and a storage device; wherein the storage device is used to store diagnosis and treatment data of a user;

The risk prediction device is used to perform the following steps:

Acquiring the diagnosis and treatment data corresponding to the target diseases of multiple users from the storage device;

Determining multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;

Acquiring target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.

In the second aspect, the embodiments of the present application provide a disease risk prediction method, including:

Obtain the diagnosis and treatment data corresponding to the target disease of multiple users;

In the third aspect, an embodiment of the present application provides a disease risk prediction device, including:

The acquisition module is used to acquire the diagnosis and treatment data corresponding to the target diseases of multiple users;

A determining module, configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

The processing module is configured to screen a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and to determine the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;

The acquisition module is also used to acquire target diagnosis and treatment data of the target user;

The processing module is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and to determine, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.

In a fourth aspect, an embodiment of the present application provides a risk prediction device. The risk prediction device may include a processor and a memory, and the processor and the memory are connected to each other. Wherein, the memory is used to store a computer program that supports the terminal device to execute the above-mentioned methods or steps, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the following methods:

In a fifth aspect, embodiments of the present application provide a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to execute the following methods:

The embodiments of the present application can apply integer optimization algorithms to achieve control of the number of risk factors and optimize prediction results by setting integer constraint conditions and a two-norm-based objective function, thereby helping to improve the prediction effect of disease risk.

Description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.

Figure 1 is a schematic structural diagram of a disease risk prediction system provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a disease risk prediction method provided by an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a disease risk prediction device provided by an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a risk prediction device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

The technical solution of the present application can be applied to a disease risk prediction system, and specifically to a risk prediction device, to realize disease risk prediction. Optionally, the risk prediction device may be a terminal, a server, a data platform, or another device. The terminal may include a mobile phone, a tablet computer, a computer, etc., which is not limited in this application. It can be understood that, in other embodiments, the terminal may also be called by other names, such as terminal equipment, smart terminal, user equipment, user terminal, etc., which are not listed here.

In the field of medical technology, it is of great significance to predict the risk that a user will develop a certain disease. For example, accurate risk prediction helps to formulate diagnosis and treatment plans for patients, improve patient prognosis, and so on. However, the inventor has realized that current disease risk prediction relies mainly on direct scoring by experts or on scoring derived from certain algorithms, and both approaches have limitations. Expert scoring is highly subjective, so its predictions are unreliable. Algorithmic scoring rounds the model parameters directly, so the final result is likely to lose optimality, which reduces predictive ability and leads to a poor prediction effect; it therefore cannot meet the need for automatic development of disease risk scores. By contrast, this application can determine multiple risk factors corresponding to the target disease according to the diagnosis and treatment data of multiple users, screen multiple risk factors from them, and determine the coefficient of each screened risk factor, where each coefficient is an integer determined from a set of integers. Then, after the target diagnosis and treatment data of the target user is obtained, the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor can be determined based on the coefficients determined above, and the risk prediction result of the target user for the target disease can be determined based on the target risk factors and their coefficients. In this way, the requirements of risk prediction can be met by setting integer constraints, and the number of risk factors can be controlled, so as to achieve reliable prediction of disease risk and improve the disease risk prediction effect.

Optionally, this application can combine model algorithms with integer optimization algorithms: integer constraints are set to meet the requirements of risk prediction, and an objective function based on the two-norm is optimized to solve the model and control the number of risk factors, so as to achieve reliable prediction of disease risk and improve the disease risk prediction effect.

The technical solution of this application can be applied to the fields of artificial intelligence, smart city, blockchain and/or big data technology. For example, it can be realized through a data platform or other equipment. The data involved can be stored through blockchain nodes or in a database, which is not limited in this application.

The embodiments of the present application provide a disease risk prediction system, method, device, equipment, medium, etc., so as to help improve the disease risk prediction effect. Detailed descriptions are given below.

Please refer to FIG. 1, which is a schematic structural diagram of a disease risk prediction system provided by an embodiment of the present application. As shown in FIG. 1, the disease risk prediction system may include a risk prediction device 101 and a storage device 102, wherein:

The storage device 102 can be used to store the user's diagnosis and treatment data;

The risk prediction device 101 can be used to perform the following steps:

Obtain the diagnosis and treatment data of multiple users from the storage device 102, such as the diagnosis and treatment data corresponding to the target disease;

Determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screen a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, and determine the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;

Obtain the target diagnosis and treatment data of the target user, call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determine, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.

Optionally, the storage device 102 may also be used to store other data related to the present application, such as various risk factors and coefficients of risk factors, and so on.

It can be understood that the storage device and the risk prediction device may be independent devices, that is, independently deployed, or they may be deployed in the same device, which is not limited in this application; FIG. 1 shows only the independent deployment scenario. For example, in some embodiments, the storage device and the risk prediction device may both be deployed in a server, or in other words, the storage device may be deployed in the risk prediction device.

Optionally, the diagnosis and treatment data may include physical sign data, examination data, and so on. Further optionally, the user may be a patient suffering from the target disease, who may be referred to as a target patient. In some embodiments, the diagnosis and treatment data corresponding to different diseases can be different; for example, the data to be collected can be determined according to the target disease (or the target disease type), and the diagnosis and treatment data corresponding to the target disease is then obtained. Alternatively, in some embodiments, the diagnosis and treatment data corresponding to different diseases can be the same; for example, all of the diagnosis and treatment data of a user such as the target patient, or all such data within a preset time period (such as within the last year), can be collected as the diagnosis and treatment data corresponding to the target disease. Further optionally, the data can be extracted from a monitoring system, in which case the storage device is a storage device in the monitoring system, or the data can be stored in the storage device after being extracted from the monitoring system, which is not limited in this application.

Optionally, the diagnosis and treatment data may be obtained by processing the collected original medical (diagnosis and treatment) data, where the processing includes sampling, filling in missing values, and so on. For example, the patient's original medical data can be obtained, including the patient's historical baseline data. The historical baseline data can include multiple visit records, and each visit record may include various diagnoses, tests, examinations, medications, and surgical items. Further, the historical baseline data can be preprocessed to obtain the preprocessed diagnosis and treatment data. For example, the physical sign data can be obtained by sampling the collected original physical sign data, which can be continuous data, in a preset time unit (for example, in units of 1 h); for another example, multiple imputation can be used to fill in missing values in the examination data. Further optionally, the diagnosis and treatment data can be text data, or vectors such as binary features (also called two-dimensional feature vectors), and so on.
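
The two preprocessing steps described above can be illustrated with a minimal sketch. It assumes that readings arrive as time-sorted (timestamp, value) pairs and that the first reading in each hour is kept; the function names are hypothetical. Note that the second function uses simple mean imputation as a stand-in, which is not the multiple imputation named in the text.

```python
from datetime import datetime

def sample_hourly(readings):
    """Keep the first reading in each one-hour bucket.

    `readings` is a list of (timestamp, value) pairs sorted by time.
    """
    sampled = {}
    for ts, value in readings:
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        # setdefault keeps the earliest reading already stored for this hour
        sampled.setdefault(bucket, value)
    return sorted(sampled.items())

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values.

    Assumption: a single mean imputation used only for illustration,
    not the multiple imputation mentioned in the description.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to impute from
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

For instance, systolic blood pressure readings taken at 08:05, 08:40 and 09:10 would be reduced to one value for the 08:00 bucket and one for the 09:00 bucket.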

In some embodiments, the outcome data corresponding to the diagnosis and treatment data may also be obtained, and the outcome data may be used to indicate the health status of the user. The outcome data can also be called the outcome, the clinical outcome, or other names, which is not limited in this application. For example, the outcome data may be the discharge diagnosis data corresponding to each visit record of the patient, such as death, aggravation of the disease, occurrence of complications, diagnosis of the target disease, and so on. Optionally, the outcome data can be processed in a manner similar to the diagnosis and treatment data, which will not be repeated here. Model training can then be conducted based on the patient's diagnosis and treatment data and outcome data to obtain the risk prediction model. Further optionally, the outcome data may be text data, or a vector such as a binary feature (also called a two-dimensional feature vector), and so on.

For example, if the user is a patient with myocardial infarction and the target disease is myocardial infarction, the diagnosis and treatment data may include age, systolic blood pressure, and the Killip cardiac function classification. Correspondingly, the outcome data can be death or other outcomes.

Optionally, the risk prediction result may include a prediction score of the target user for the target disease, and the prediction score may be the sum of the coefficients (weights) of the target risk factors. Wherein, the target risk factor corresponding to the target diagnosis and treatment data may be a part or all of the multiple second risk factors that are screened out.
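
As a minimal sketch of the scoring rule just described, the prediction score can be computed by summing the integer coefficients of the risk factors that are present for the target user. The factor names and integer weights below are hypothetical and chosen only for illustration; they are not taken from the application.

```python
def prediction_score(binary_features, coefficients):
    """Sum the integer coefficients of the risk factors whose binary value is 1."""
    return sum(
        coefficients[name]
        for name, present in binary_features.items()
        if present and name in coefficients
    )

# Hypothetical model: names and weights are illustrative assumptions.
coefficients = {"age_ge_75": 2, "sbp_lt_100": 3, "killip_gt_2": 4}
patient = {"age_ge_75": 1, "sbp_lt_100": 0, "killip_gt_2": 1}
score = prediction_score(patient, coefficients)  # 2 + 4 = 6
```

Because every coefficient is an integer, the score can be computed by hand at the bedside, which is one practical motivation for integer-valued risk scores.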

In some optional embodiments, the prediction score may also be obtained by processing the coefficients of the target risk factors. For example, when the target disease is an infectious disease, the coefficients of the target risk factors may be weighted according to the incidence of the target disease in the area where the target user is located to obtain the prediction score, which is not limited in this application. As a result, the accuracy and reliability of the determined disease risk prediction score can be improved.

For example, the higher the incidence of the target disease in the area where the target user is located, the larger the weighting coefficient can be set; on the contrary, the lower the incidence of the target disease in the area where the target user is located, the smaller the weighting coefficient can be set.

For another example, when the incidence of the target disease in the area where the target user is located is higher than the average incidence across areas, a preset first weighting coefficient is used for weighting; when that incidence is lower than the average incidence across areas, a preset second weighting coefficient is used for weighting, where the first weighting coefficient is greater than the second weighting coefficient.

For another example, if the target disease is an infectious disease, the risk prediction device can obtain the target incidence of the target disease in the target area where the target user is located, and compare the target incidence with the average incidence of the target disease. If the target incidence is higher than the average incidence and the difference between the two exceeds a threshold, the coefficients of one or more target risk factors can be weighted (for example, multiplied by a coefficient greater than 1), or the sum of the coefficients of the target risk factors can be weighted, or a risk score can be added to the original prediction score, to obtain the prediction score. This helps to further improve the reliability of disease risk prediction.
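
The incidence-based adjustment can be sketched as follows. This is only one of the variants described above (weighting the summed score), and the default weights and threshold are hypothetical values chosen for illustration.

```python
def adjust_score(base_score, local_incidence, average_incidence,
                 threshold=0.0, first_weight=1.5, second_weight=1.0):
    """Weight the summed score by regional incidence.

    Assumptions: `first_weight` > `second_weight`, and the first weight is
    applied only when the local incidence exceeds the average by more than
    `threshold`. All default values are illustrative.
    """
    if local_incidence - average_incidence > threshold:
        return base_score * first_weight
    return base_score * second_weight
```

With these defaults, a base score of 10 in an area whose incidence is well above average becomes 15, while the same score in an average or below-average area stays at 10.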

In some embodiments, the risk factor may be a binary feature. Optionally, when acquiring the target diagnosis and treatment data of the target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients, the risk prediction device 101 can be specifically used to: obtain the target diagnosis and treatment data of the target user and convert it into binary features; input the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor; and determine the risk prediction result based on the target risk factors and their coefficients.

In some embodiments, the diagnosis and treatment data may include multiple risk factor data. Further, when determining multiple risk factors of the target disease according to the diagnosis and treatment data, the risk prediction device 101 can obtain the multiple risk factor data included in the diagnosis and treatment data, and convert the risk factor data into binary features according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, so as to obtain the multiple risk factors. The risk factor data can be variables that affect the clinical outcome of the target disease.

In some embodiments, the risk prediction device 101 is also used to obtain outcome data corresponding to the diagnosis and treatment data. Further, when screening a plurality of second risk factors from the plurality of first risk factors according to an objective function that includes the two-norm of the plurality of first risk factors, determining the coefficient of each screened second risk factor, and determining the risk prediction model based on those coefficients, the risk prediction device 101 can be specifically used to: determine, according to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, the multiple second risk factors and the coefficient of each second risk factor that minimize the objective function, so as to obtain the risk prediction model by training. The two-norm can be used to control the number of screened risk factors, that is, the number of second risk factors. Optionally, the objective function may be determined according to a logistic loss function and the two-norm.
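
The application does not give the exact form of the objective function, so the following is only a sketch under stated assumptions: the objective is taken to be the mean logistic loss plus a penalty proportional to the two-norm of the coefficient vector, labels are encoded as +1 and -1, and the integer constraint is enforced by exhaustively searching a small integer set. Exhaustive search is a stand-in for a real integer optimization solver and is feasible only for toy problems; the function names and the `penalty` parameter are hypothetical.

```python
import math
from itertools import product

def objective(coefs, X, y, penalty):
    """Mean logistic loss plus a two-norm penalty on the coefficient vector."""
    loss = 0.0
    for features, label in zip(X, y):
        margin = label * sum(c * f for c, f in zip(coefs, features))
        loss += math.log1p(math.exp(-margin))
    two_norm = math.sqrt(sum(c * c for c in coefs))
    return loss / len(X) + penalty * two_norm

def fit_integer_model(X, y, integer_set, penalty=0.05):
    """Search every integer coefficient vector and return the minimizer.

    Assumption: brute force replaces an integer optimization algorithm.
    """
    n_features = len(X[0])
    return min(
        product(integer_set, repeat=n_features),
        key=lambda coefs: objective(coefs, X, y, penalty),
    )

# Toy data: rows are binary first risk factors, labels are +1 / -1 outcomes.
X = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
y = [1, 1, -1, -1]
best = fit_integer_model(X, y, integer_set=range(-3, 4))

# Factors with a nonzero coefficient play the role of the second risk factors.
selected = [i for i, c in enumerate(best) if c != 0]
```

In this sketch the first risk factors whose coefficient is driven to zero are the ones screened out, and the remaining integer coefficients define the scoring model.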

In some embodiments, the risk prediction device 101 can also be used to receive a disease risk prediction request sent by a terminal, and the disease risk prediction request carries the target user's identifier;

The risk prediction device 101 may be specifically used to obtain the target diagnosis and treatment data according to the target user's identifier;

The risk prediction device 101 may also be used to determine a target score threshold according to the type of the target disease, and send warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold.

The terminal that sends the disease risk prediction request can be any terminal, for example, a terminal of a doctor, a patient, or another user, which is not limited in this application. In some embodiments, the terminal may be a specific terminal, such as a legitimate terminal that has passed verification; that is, the target diagnosis and treatment data may be obtained only after the disease risk prediction request is received and the terminal is successfully verified. For example, the risk prediction device may receive the disease risk prediction request and then verify the identity of the terminal; if the verification passes, the acquisition of the target diagnosis and treatment data is triggered. Optionally, there can be multiple verification methods. For example, the disease risk prediction request may also carry the identity of the terminal, and it can be verified whether the identity is in a whitelist of terminals: if it is in the whitelist, the verification passes; otherwise, the verification fails. For another example, the disease risk prediction request can be encrypted with a preset public key: if the request is successfully decrypted with the private key, the verification passes; otherwise, the verification fails. Other methods are not enumerated here one by one. Optionally, the disease risk prediction request may also carry the identifier of the target disease and/or the type of the target disease.
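
The whitelist variant of terminal verification can be sketched as below. The request is assumed to be a dictionary with hypothetical keys `terminal_id` and `user_id`, and `fetch_target_data` is a hypothetical callable standing in for the lookup of the target diagnosis and treatment data; none of these names come from the application.

```python
def handle_prediction_request(request, whitelist, fetch_target_data):
    """Fetch target data only for terminals whose identity is whitelisted.

    Returns None when verification fails, so that no data is retrieved
    for an unverified terminal.
    """
    if request.get("terminal_id") not in whitelist:
        return None  # verification failed
    return fetch_target_data(request["user_id"])
```

Placing the check before the data lookup means an unverified request never triggers a query against the storage device.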

Optionally, the early warning information includes a risk item corresponding to the target risk factor, the prediction score, and a treatment plan. Further optionally, the treatment plan may be a treatment plan corresponding to the user group to which the target user belongs.

Further optionally, the score thresholds corresponding to different disease types (or diseases) can be different, and can be specifically determined according to the risk levels corresponding to the disease types (or diseases). For example, the higher the risk level corresponding to a disease type (or disease), the lower the score threshold corresponding to that disease type (or disease) can be; conversely, the lower the risk level, the higher the score threshold can be. This helps to improve the flexibility of information early warning operations.
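
A minimal sketch of the threshold-based early warning follows. The risk-level names, threshold values, and dictionary keys are hypothetical; the only properties taken from the text are that higher-risk disease types get lower thresholds and that the warning carries the risk items, the prediction score, and a treatment plan.

```python
# Hypothetical thresholds: a higher risk level maps to a lower threshold.
SCORE_THRESHOLDS = {"high": 5, "medium": 8, "low": 12}

def build_warning(score, risk_level, risk_items, treatment_plan):
    """Return early warning information if the score exceeds the threshold."""
    threshold = SCORE_THRESHOLDS[risk_level]
    if score <= threshold:
        return None  # no warning needed
    return {
        "risk_items": risk_items,
        "prediction_score": score,
        "treatment_plan": treatment_plan,
    }
```

With these values, a score of 6 triggers a warning for a high-risk disease type but not for a low-risk one.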

Optionally, the storage device 102 may be a blockchain node, and the diagnosis and treatment data may be obtained from the blockchain. That is, the diagnosis and treatment data of each patient can be stored in the blockchain in advance. By obtaining the user's diagnosis and treatment data from the blockchain node, the reliability of the obtained diagnosis and treatment data can be improved, which in turn helps to improve the reliability of the disease risk determined based on the diagnosis and treatment data.

For example, in some embodiments, the risk prediction device 101 may also be used to send a diagnosis and treatment data acquisition request to the storage device 102, and the diagnosis and treatment data acquisition request carries the identification of the target user;

The storage device 102 can also be used to receive the diagnosis and treatment data acquisition request and verify the identity of the risk prediction device; if the verification passes, it queries the target diagnosis and treatment data of the target user according to the target user's identifier and sends the target diagnosis and treatment data to the risk prediction device;

The risk prediction device 101 may be specifically configured to receive the target diagnosis and treatment data sent by the storage device to obtain the target diagnosis and treatment data.

In the embodiment of the present application, the risk prediction device 101 can determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data of multiple users, such as multiple target patients, acquired from the storage device 102; screen multiple second risk factors from the multiple first risk factors according to an objective function that includes the two-norm of the multiple first risk factors; and determine the coefficient of each screened second risk factor from the integer set, so as to train the risk prediction model. The risk prediction model can then be called to determine the target risk factors corresponding to the acquired target diagnosis and treatment data (for example, the target diagnosis and treatment data can be obtained from the storage device 102) and the coefficient of each target risk factor, so as to obtain the target user's risk prediction result for the target disease. The embodiments of the present application can apply integer optimization algorithms to control the number of risk factors and optimize prediction results by setting integer constraints and a two-norm-based objective function, thereby helping to improve the prediction effect of disease risk.

Refer to FIG. 2, which is a schematic flowchart of a disease risk prediction method provided by an embodiment of the present application. The method may be executed by the above-mentioned risk prediction device. As shown in FIG. 2, the disease risk prediction method may include the following steps:

201. Obtain diagnosis and treatment data corresponding to target diseases of multiple users.

Here, the user may be a patient suffering from the target disease. The diagnosis and treatment data (sample) may include physical sign data, examination data, etc., which will not be repeated here.

For example, if the user is a patient with myocardial infarction and the target disease is myocardial infarction, the diagnosis and treatment data may include age, systolic blood pressure, and the Killip cardiac function classification.

Optionally, the diagnosis and treatment data can be obtained from the blockchain, that is, the diagnosis and treatment data of each patient can be stored in the blockchain in advance, which will not be repeated here. By obtaining the user's diagnosis and treatment data from the blockchain, the reliability of the disease risk predicted based on the diagnosis and treatment data can be improved.

202. Determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data.

Optionally, the risk factor may be a feature vector, such as a binary feature.

In some embodiments, the risk prediction device may also be used to obtain outcome data corresponding to the diagnosis and treatment data.

After the diagnosis and treatment data of the target disease is obtained, the risk factors of the target disease, that is, the first risk factors, can be determined. A risk factor can be a vector formed from part of the user's diagnosis and treatment data, or a vector formed from the processed diagnosis and treatment data, or the diagnosis and treatment data can directly consist of multiple risk factors, and so on, which is not limited in this application.

In some embodiments, when determining multiple risk factors corresponding to the target disease based on the diagnosis and treatment data, the obtained diagnosis and treatment data, which includes multiple risk factor data, may be converted into binary features. For example, feature engineering can be performed on the risk factor data to convert it into binary features suitable for integer optimization algorithms.

In some embodiments, the diagnosis and treatment data may include multiple risk factor data, and each risk factor involved in the present application may be a binary feature. Optionally, when multiple risk factors corresponding to the target disease are determined, multiple risk factor data included in the diagnosis and treatment data can be obtained, and then the risk factor data can be obtained according to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data. Converted into dual characteristics to obtain multiple risk factors. That is to say, when determining the risk factors based on the diagnosis and treatment data, the diagnosis and treatment data can be converted into a binary feature according to the relationship between the risk factor data (that is, the variable that affects the clinical outcome) and the outcome.

Optionally, when converting the diagnosis and treatment data into binary features based on the relationship between the risk factor data and the outcome, it may refer to the conversion of the binary features based on the parameter values corresponding to the risk factor data and the critical value corresponding to the outcome data, where , The critical value can be used to indicate the risk information of the outcome. If the binary characteristics of the corresponding risk factor data below the critical value are the same, the binary characteristics of the risk factor data corresponding to the critical value and above are the same, etc., this application does not limit it. Optionally, the critical value may be one or more. If there are multiple, the binary characteristics of the risk factor data of the interval corresponding to each critical value may be the same.

For example, if the relationship between risk factor data x (such as the cardiac function grade Killip) and the outcome y (such as whether a myocardial infarction occurs) is stratified according to the cut-off value c (if the Killip grade is higher than grade II, the risk of myocardial infarction will be significantly increased) , Then the risk factor data x can be segmented according to the critical value c, and the original continuous variable can be converted into a binary coded variable.
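The cut-off based conversion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `binarize`, the one-indicator-per-interval encoding, and the convention that a value equal to a cut-off falls in the lower interval are all assumptions made for the example.

```python
from bisect import bisect_left
from typing import List


def binarize(value: float, cutoffs: List[float]) -> List[int]:
    """Return one binary indicator per interval split by the cut-off values.

    With cut-offs [c1, ..., ck] there are k + 1 intervals:
    (-inf, c1], (c1, c2], ..., (ck, +inf).
    Assumption: a value equal to a cut-off belongs to the lower interval,
    so "higher than c" maps to the upper interval.
    """
    ordered = sorted(cutoffs)
    # bisect_left counts the cut-offs that are strictly below `value`.
    index = bisect_left(ordered, value)
    return [int(i == index) for i in range(len(ordered) + 1)]


# Hypothetical usage: Killip class with a single cut-off at class II.
killip_features = binarize(3, [2])        # class III is above the cut-off
age_features = binarize(65, [50, 70])     # illustrative age cut-offs
```

With a single cut-off the result has two indicators, of which the second one marks the higher-risk interval.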

203. Screen a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, determine the coefficient of each of the screened second risk factors, and determine the risk prediction model based on the coefficient of each second risk factor; wherein the coefficient of each second risk factor is an integer determined from a set of integers.

Wherein, the selected multiple risk factors, that is, the second risk factor, are part of the risk factors of the multiple first risk factors corresponding to the target disease, and the part of the risk factors can be used to characterize the key variables of the outcome of the target disease.

That is to say, the present application can train a risk prediction model based on binary features, screen out risk factors from the multiple risk factors by optimizing the objective function based on the two-norm, and determine the coefficient of each screened risk factor.

In some embodiments, the risk prediction model may be trained as follows: according to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, determine the multiple screened second risk factors, and the coefficient of each screened second risk factor, that minimize the objective function, so as to obtain the risk prediction model by training. Optionally, the objective function may be determined according to the logistic loss function and the two-norm, and the two-norm is used to control the number of risk factors selected.

Among them, the risk prediction model is obtained by optimizing the objective function subject to integer constraints. For example, the objective function can be as follows:

$$\min_{\theta}\; l(\theta) + C\lVert\theta\rVert_2 \quad \text{s.t. } \theta \in \Psi,$$

Among them, θ can represent the coefficient vector of the risk factors, that is, the coefficient (such as the score) corresponding to each second risk factor; l(θ) can be the logistic loss function; C‖θ‖₂ can represent the two-norm term corresponding to the risk factor set; and Ψ can be a set of integers.

In the loss function, n can represent the number of samples in the data, y_i is the clinical outcome (outcome data) corresponding to sample i, and x_i is the feature vector of sample i, such as the first risk factors.
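As a hedged illustration, one standard form of the logistic loss that is consistent with the symbols n, y_i, x_i and θ defined above is given below. The ±1 coding of the outcome and the 1/n normalization are assumptions of this standard form, not details confirmed by this application.

```latex
% Assumed standard logistic loss, with outcomes coded y_i \in \{-1, +1\}
l(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + \exp\!\left(-y_i\, \theta^{\top} x_i\right)\right)

% Combined with the two-norm term and the integer constraint
\min_{\theta}\; \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + \exp\!\left(-y_i\, \theta^{\top} x_i\right)\right) + C \lVert \theta \rVert_2
\quad \text{s.t. } \theta \in \Psi
```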

The objective function of this application can be based on the traditional logistic loss function with the addition of the two-norm term of the risk factors (C‖θ‖₂). The purpose is to control the number of risk factors selected by the model by adjusting the parameter C while obtaining the optimal solution, so that the optimal subset of risk factors is selected automatically. In addition, an integer constraint is imposed so that the parameter θ is selected from the preset integer set Ψ, in order to make the solution of the model meet the practical requirements of a disease risk score.

Therefore, based on the feature vector and outcome data set containing n samples, the optimal risk factor combination and the coefficient θ corresponding to each risk factor can be solved by minimizing the objective function that satisfies the constraints, and the model training is completed. The coefficient θ can be expressed as the risk score corresponding to each risk factor. For example, a positive value represents an increase in the risk of clinical outcome, and a negative score represents a decrease in the risk.
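The training step described above can be sketched for a very small problem by enumerating every coefficient vector in the integer set. This is only an illustration under stated assumptions: exhaustive search stands in for whatever integer optimization algorithm is actually used, outcomes are assumed to be coded as +1 and -1, and the names `logistic_loss`, `objective` and `fit_integer_scores` are hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple


def logistic_loss(theta: Sequence[int], X: Sequence[Sequence[int]], y: Sequence[int]) -> float:
    """Mean logistic loss, assuming outcomes y_i are coded as +1 / -1."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(t * v for t, v in zip(theta, x_i))
        # log(1 + exp(-margin)), written in a numerically stable form.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(X)


def objective(theta: Sequence[int], X, y, C: float) -> float:
    """Logistic loss plus C times the two-norm of the coefficient vector."""
    norm2 = math.sqrt(sum(t * t for t in theta))
    return logistic_loss(theta, X, y) + C * norm2


def fit_integer_scores(X, y, C: float, psi: Sequence[int]) -> Tuple[Tuple[int, ...], List[int]]:
    """Brute-force search over psi^d; only practical for a handful of features."""
    best_theta, best_value = None, float("inf")
    for theta in itertools.product(psi, repeat=len(X[0])):
        value = objective(theta, X, y, C)
        if value < best_value:
            best_theta, best_value = theta, value
    # Risk factors with a non-zero coefficient are the ones that were selected.
    selected = [j for j, t in enumerate(best_theta) if t != 0]
    return best_theta, selected


# Hypothetical toy data: two binary risk factors, four samples.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, -1, -1]
theta, selected = fit_integer_scores(X, y, C=0.01, psi=range(-2, 3))
```

In this sketch a larger C makes non-zero coefficients more expensive, so fewer risk factors survive; with a sufficiently large C every coefficient is driven to zero.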

As a result, the user's diagnosis and treatment data can be subsequently obtained, and the user's disease risk score can be determined based on the risk factors and coefficients corresponding to the diagnosis and treatment data.

In other optional embodiments, the multiple second risk factors can be screened from the multiple first risk factors in other ways, and the coefficient of each screened risk factor can be determined accordingly. For example, the screening can be based on the probability (such as the proportion of all samples in which a given risk factor appears) or the count (such as the number of samples in which a given risk factor appears) of each risk factor in the diagnosis and treatment data of multiple users, such as multiple target patients. For example, the N risk factors with the highest probability or the largest count can be used as the selected risk factors, that is, the second risk factors; the greater the probability or count of a risk factor, the larger its coefficient (for example, probability intervals or count intervals can be set, and risk factors falling in the same interval can share the same coefficient). The coefficient is an integer, and N is an integer greater than 2.
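The frequency-based alternative above can be sketched as follows. The function name `select_by_frequency`, the interval boundaries, and the rule "coefficient = 1 + number of boundaries reached" are assumptions chosen for illustration; the text only requires that a larger probability gives a larger integer coefficient and that factors in the same interval share a coefficient.

```python
from typing import Dict, Sequence


def select_by_frequency(
    samples: Sequence[Sequence[int]],
    names: Sequence[str],
    top_n: int,
    boundaries: Sequence[float],
) -> Dict[str, int]:
    """Keep the top_n most frequent binary risk factors and score them by interval."""
    n = len(samples)
    # Proportion of samples in which each binary risk factor is present.
    freq = {name: sum(row[j] for row in samples) / n for j, name in enumerate(names)}
    ranked = sorted(names, key=lambda name: freq[name], reverse=True)[:top_n]
    # Factors whose frequency falls in the same interval get the same integer score.
    return {name: 1 + sum(freq[name] >= b for b in boundaries) for name in ranked}


# Hypothetical data: four samples, four binary risk factors a, b, c, d.
samples = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 0], [1, 0, 1, 1]]
scores = select_by_frequency(samples, ["a", "b", "c", "d"], top_n=3, boundaries=[0.5, 0.75])
```

Here factor c appears in only one of four samples and is dropped, while a and d fall in the highest interval and receive the same coefficient.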

204. Obtain target diagnosis and treatment data of the target user.

Among them, the target diagnosis and treatment data may include physical sign data, test and examination data, etc.; for example, it may include various diagnoses, tests, examinations, medications, and surgical items.

Optionally, the target diagnosis and treatment data may be obtained by processing collected original diagnosis and treatment data. For example, the original diagnosis and treatment data of the target user may be obtained, which may include records of multiple visits of the target user, and each visit record may include data such as various diagnoses, tests, examinations, medications, and surgical items. Further, the original diagnosis and treatment data can be preprocessed to obtain the preprocessed diagnosis and treatment data, which will not be repeated here.

Optionally, the disease risk prediction performed by the risk prediction device for a user, such as obtaining the target diagnosis and treatment data, can be triggered by a request from the user, triggered proactively for a specific user, or triggered in other ways, which is not limited in this application.

For example, in some embodiments, the risk prediction device may also receive a disease risk prediction request sent by the terminal, and the disease risk prediction request carries the identifier of the target user. Furthermore, the target diagnosis and treatment data can be obtained according to the target user's identification.

In some embodiments, the risk prediction device may also send a diagnosis and treatment data acquisition request to the storage device, and the diagnosis and treatment data acquisition request carries the identifier of the target user. After receiving the diagnosis and treatment data acquisition request, the storage device can verify the identity of the risk prediction device; if the verification passes, the storage device queries the target diagnosis and treatment data of the target user according to the target user's identifier and sends the target diagnosis and treatment data to the risk prediction device. The risk prediction device thereby obtains the target diagnosis and treatment data by receiving it from the storage device. Optionally, the storage device may be a blockchain node, a server, or another storage device.

Optionally, there can be one or more verification methods. For example, the diagnosis and treatment data acquisition request can also carry the identity of the risk prediction device, and the storage device can check whether that identity exists in a preset whitelist; if it exists in the whitelist, the verification passes; otherwise, the verification fails. For another example, a preset public key can be used to encrypt the diagnosis and treatment data acquisition request; if the storage device successfully decrypts the request with the private key, the verification is determined to pass; otherwise, the verification fails. The storage device can also perform verification in other ways, which are not listed here.
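The whitelist variant of the verification above can be sketched as a small in-memory stand-in for the storage device. The class name `StorageDevice`, the method `handle_request`, and the identifiers used in the example are hypothetical; the public-key variant is not shown.

```python
from typing import Dict, Optional, Set


class StorageDevice:
    """Illustrative stand-in for a storage device (server or blockchain node)."""

    def __init__(self, whitelist: Set[str], records: Dict[str, dict]) -> None:
        self._whitelist = whitelist   # identities of trusted risk prediction devices
        self._records = records       # target user identifier -> diagnosis and treatment data

    def handle_request(self, device_id: str, target_user_id: str) -> Optional[dict]:
        # Verification fails when the requesting device is not whitelisted.
        if device_id not in self._whitelist:
            return None
        # Verification passed: look up the data by the target user's identifier.
        return self._records.get(target_user_id)


# Hypothetical usage.
storage = StorageDevice(
    whitelist={"risk-device-01"},
    records={"user-42": {"age": 71, "killip": 3, "systolic_bp": 95}},
)
data = storage.handle_request("risk-device-01", "user-42")
```

Returning `None` for both a failed verification and an unknown user keeps the sketch short; a real system would likely distinguish the two cases.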

205. Invoke the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determine the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients.

Optionally, the risk prediction result may include a prediction score of the target user for the target disease, and the prediction score may be the sum of the coefficients of the target risk factors, or may be obtained by otherwise processing the coefficients of the target risk factors, which will not be repeated here.

In some embodiments, when determining the risk prediction result for the target disease, the risk prediction device may convert the acquired target diagnosis and treatment data into binary features, input the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determine the risk prediction result based on the target risk factors and their coefficients.

That is to say, when obtaining the disease risk prediction result of the target user, the target diagnosis and treatment data can be converted into binary features, and the trained risk prediction model can be called to process the binary features to obtain the target user's risk score for the target disease. For example, the coefficients of the risk factors determined by the risk prediction model are added together; that is, the scores of the risk factors finally selected by the algorithm whose binary feature is true for the target user are summed to obtain the final risk score of the target user.

This application can convert the original data into binary features that can be directly input to an integer optimization algorithm, apply the integer optimization algorithm, meet the requirements of a risk score by setting integer constraints, and solve the model and control the number of risk factors by optimizing the objective function based on the two-norm, so as to achieve reliable prediction of disease risk.

For example, taking the prediction of in-hospital death risk for myocardial infarction patients as an example, the risk factors screened by the risk prediction model include cardiac arrest, age, Killip class, and systolic blood pressure, with corresponding coefficients (scores) of 2, 1, 1, and 1. If all four risk factors are present, the risk score is 5 points. That is, if a patient has a history of cardiac arrest, two points are added to the total risk score, and so on. In this way, the user's risk score can be determined quickly.
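The scoring in this example can be sketched as a simple sum over the risk factors that are present. The coefficients come from the example above; the dictionary keys and the function name `risk_score` are hypothetical, and how each binary feature is derived (for instance, which age or blood pressure cut-off is used) is left out.

```python
from typing import Dict

# Coefficients from the myocardial infarction example; key names are illustrative.
MODEL_COEFFICIENTS: Dict[str, int] = {
    "cardiac_arrest": 2,
    "age": 1,
    "killip": 1,
    "systolic_bp": 1,
}


def risk_score(binary_features: Dict[str, int], coefficients: Dict[str, int]) -> int:
    """Sum the integer coefficients of the risk factors whose binary feature is 1."""
    return sum(
        coef for name, coef in coefficients.items() if binary_features.get(name, 0) == 1
    )


# A patient with all four risk factors present.
all_present = {"cardiac_arrest": 1, "age": 1, "killip": 1, "systolic_bp": 1}
total = risk_score(all_present, MODEL_COEFFICIENTS)

# A patient whose only risk factor is a history of cardiac arrest.
arrest_only = risk_score({"cardiac_arrest": 1}, MODEL_COEFFICIENTS)
```

Because the coefficients are small integers, the same calculation can be done by hand at the bedside, which is the practical reason for the integer constraint.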

In some embodiments, the risk prediction device may also determine a target score threshold according to the type of the target disease, and send warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold. Wherein, the prediction score is the sum of the coefficients of each target risk factor, and the early warning information may include information such as the risk item corresponding to the target risk factor, the prediction score, and treatment plan.
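The threshold check described above can be sketched as follows. The threshold table, its value, and the function name `early_warning` are assumptions for illustration; the application does not specify how thresholds are chosen for each disease type.

```python
from typing import Dict, List, Optional

# Hypothetical per-disease score thresholds.
SCORE_THRESHOLDS: Dict[str, int] = {"myocardial_infarction": 3}


def early_warning(
    disease_type: str,
    prediction_score: int,
    risk_items: List[str],
    treatment_plan: str,
) -> Optional[dict]:
    """Build the early warning payload only when the score exceeds the threshold."""
    threshold = SCORE_THRESHOLDS[disease_type]
    if prediction_score <= threshold:
        return None
    return {
        "risk_items": risk_items,
        "prediction_score": prediction_score,
        "treatment_plan": treatment_plan,
    }


# Hypothetical usage: a score of 5 exceeds the threshold of 3.
warning = early_warning(
    "myocardial_infarction", 5, ["cardiac_arrest", "killip"], "plan_a"
)
```

A score equal to the threshold does not trigger a warning here, matching the "greater than" wording of the text.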

Optionally, the treatment plan may be a treatment plan corresponding to the user group to which the target user belongs. Further optionally, the user group to which the target user belongs may be the group with the largest net benefit under the treatment plan. For example, users can be grouped based on the net benefits of treatment plans, and the user group with the largest net benefit under each treatment plan can be obtained. Therefore, when recommending a treatment plan to a user, the recommendation can take the net benefit into account, for example by recommending to the target user the treatment plan with the largest net benefit for the user group to which the target user belongs. In this way, users can be given the most cost-effective treatment plan recommendations in line with health economics, so that, on the premise of providing effective treatment, the most cost-effective treatment method is selected for the patient, which helps to reduce the patient's financial burden and the burden on medical insurance.
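The net-benefit based recommendation above can be sketched with a small lookup table. The table layout (plan, then user group, then net benefit), the numbers, and the function names are hypothetical; how net benefit is estimated is outside the scope of this sketch.

```python
from typing import Dict

NetBenefitTable = Dict[str, Dict[str, float]]  # plan -> user group -> net benefit


def best_group_per_plan(net_benefit: NetBenefitTable) -> Dict[str, str]:
    """For each treatment plan, find the user group with the largest net benefit."""
    return {plan: max(by_group, key=by_group.get) for plan, by_group in net_benefit.items()}


def recommend_plan(net_benefit: NetBenefitTable, user_group: str) -> str:
    """Recommend the plan with the largest net benefit for the given user group."""
    candidates = {
        plan: by_group[user_group]
        for plan, by_group in net_benefit.items()
        if user_group in by_group
    }
    return max(candidates, key=candidates.get)


# Hypothetical net benefit values.
NET_BENEFIT: NetBenefitTable = {
    "plan_a": {"group_1": 3.0, "group_2": 1.0},
    "plan_b": {"group_1": 2.0, "group_2": 4.0},
}
plan_for_group_1 = recommend_plan(NET_BENEFIT, "group_1")
```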

In the embodiment of the present application, the risk prediction device may determine multiple first risk factors corresponding to the target disease from the acquired diagnosis and treatment data of multiple users, screen out multiple second risk factors from them according to an objective function including the two-norm of the risk factors, and determine the coefficients of the screened risk factors from the integer set, so as to train the risk prediction model. By calling the risk prediction model, the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor can then be determined, so as to obtain the risk prediction result of the target user for the target disease. The embodiments of this application can apply an integer optimization algorithm to unify feature selection, model parameter learning, and risk factor scoring in a single process, avoiding the subjectivity of traditional methods. They can also be combined with automated data preprocessing methods, such as data imputation and feature engineering, which automatically convert the original data into a form that can be directly input to the integer optimization algorithm. The process is convenient, fast, and highly automated, and the results meet the clinical requirements of disease risk scores, so the solution can be used by clinicians without algorithm or development experience.

It is understandable that the above method embodiments are all examples of the disease risk prediction method or system of the present application, and the description of each embodiment has its own focus. For parts that are not described in detail in one embodiment, please refer to the related descriptions of other embodiments.

The embodiment of the present application also provides a disease risk prediction device. The device may include modules for performing the method of FIG. 2 described above. Please refer to FIG. 3, which is a schematic structural diagram of a disease risk prediction device provided by an embodiment of the present application. The disease risk prediction device described in this embodiment may be configured in a risk prediction device. As shown in FIG. 3, the disease risk prediction device 300 of this embodiment may include:

The obtaining module 301 is used to obtain diagnosis and treatment data corresponding to the target diseases of multiple users;

The determining module 302 is configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

The processing module 303 is configured to screen out a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determine the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors; wherein the coefficient of each second risk factor is an integer determined from a set of integers;

The obtaining module 301 is also used to obtain target diagnosis and treatment data of the target user;

The processing module 303 is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determine, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.

In some embodiments, when calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients, the processing module 303 can specifically perform the following steps:

Converting the target diagnosis and treatment data into binary features;

Inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and their coefficients;

Wherein, the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.

In some embodiments, the acquiring module 301 may also be used to acquire outcome data corresponding to the diagnosis and treatment data, and the outcome data is used to indicate the health status of the user;

The diagnosis and treatment data includes multiple risk factor data; the determining module 302 may specifically perform the following steps when determining multiple risk factors of the target disease according to the diagnosis and treatment data:

Acquiring multiple risk factor data included in the diagnosis and treatment data;

According to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, the risk factor data is converted into a binary feature to obtain multiple risk factors.

In some embodiments, the obtaining module 301 is also used to obtain outcome data corresponding to the diagnosis and treatment data;

When screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor so as to determine the risk prediction model based on the coefficients of the second risk factors, the processing module 303 can specifically perform the following steps:

According to the multiple first risk factors corresponding to the diagnosis and treatment data and the outcome data, determining the multiple screened second risk factors, and the coefficient of each screened second risk factor, that minimize the objective function, so as to obtain the risk prediction model by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.

In some embodiments, the acquiring module 301 is further configured to receive a disease risk prediction request sent by the terminal, and the disease risk prediction request carries the target user's identifier;

The acquiring module 301 is further configured to acquire the target diagnosis and treatment data according to the identifier of the target user;

The determining module 302 is further configured to determine a target score threshold according to the type of the target disease, and send early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;

Wherein, the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk item corresponding to each target risk factor, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.

In some embodiments, the storage device is a blockchain node;

The acquiring module 301 is further configured to send a diagnosis and treatment data acquisition request to the storage device, the diagnosis and treatment data acquisition request carrying the identifier of the target user, so that the storage device verifies the identity of the risk prediction device and, if the verification passes, queries the target diagnosis and treatment data of the target user according to the target user's identifier and sends the target diagnosis and treatment data to the risk prediction device;

The acquiring module 301 is specifically configured to receive the target diagnosis and treatment data sent by the storage device.

It is understandable that the functional modules of the disease risk prediction device of this embodiment can be specifically implemented according to the method in FIG. 2 of the foregoing method embodiment. For the specific implementation process, refer to the related description of FIG. 2 in the foregoing method embodiment, which will not be repeated here.

Please refer to FIG. 4, which is a schematic structural diagram of a risk prediction device provided by an embodiment of the present application. As shown in FIG. 4, the risk prediction device may include: a processor 401 and a memory 402. Optionally, the risk prediction device may further include a communication interface 403. The processor 401, the memory 402, and the communication interface 403 may be connected by a bus or in other ways. In FIG. 4 shown in the embodiment of the present application, the connection by a bus is taken as an example. The communication interface 403 may be controlled by the processor to send and receive messages, the memory 402 may be used to store a computer program, the computer program includes program instructions, and the processor 401 is used to execute the program instructions stored in the memory 402. Wherein, the processor 401 is configured to call the program instructions to execute the following steps:

Obtain target diagnosis and treatment data of target users;

Calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining, based on the target risk factors and their coefficients, the risk prediction result of the target user for the target disease.

In some embodiments, when calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and their coefficients, the processor 401 can specifically perform the following steps:

Converting the target diagnosis and treatment data into binary features;

In some embodiments, the processor 401 may also execute:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

The diagnosis and treatment data includes multiple risk factor data; the processor 401 may specifically perform the following steps when determining multiple first risk factors of the target disease according to the diagnosis and treatment data:

According to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, the risk factor data is converted into a binary feature to obtain a plurality of first risk factors.

In some embodiments, the processor 401 may further execute the following steps:

Obtaining outcome data corresponding to the diagnosis and treatment data;

When screening out a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor so as to determine the risk prediction model based on the coefficients of the second risk factors, the processor 401 can specifically perform the following steps:

In some embodiments, the processor 401 may further execute the following steps:

Receiving a disease risk prediction request sent by the terminal through the communication interface 403, where the disease risk prediction request carries the identifier of the target user;

Acquiring the target diagnosis and treatment data according to the identifier of the target user;

Determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;

In some embodiments, the storage device is a blockchain node; the processor 401 may also perform the following steps:

Sending a diagnosis and treatment data acquisition request to the storage device through the communication interface 403, the diagnosis and treatment data acquisition request carrying the identifier of the target user, so that the storage device verifies the identity of the risk prediction device and, if the verification passes, queries the target diagnosis and treatment data of the target user according to the target user's identifier and sends the target diagnosis and treatment data to the risk prediction device;

The target diagnosis and treatment data sent by the storage device is received through the communication interface 403.

It should be understood that in this embodiment of the application, the processor 401 may be a central processing unit (Central Processing Unit, CPU), and the processor 401 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include a non-volatile random access memory. For example, the memory 402 may also store diagnosis and treatment data of the user.

The communication interface 403 may include an input device and/or an output device. For example, the input device may be a control panel, a microphone, a receiver, etc., and the output device may be a display screen, a transmitter, etc., which are not listed here.

In specific implementation, the processor 401, memory 402, and communication interface 403 described in the embodiment of this application can carry out the implementation described in the method embodiment shown in FIG. 2, and can also carry out the implementation of the disease risk prediction device described in the embodiment of this application, which will not be repeated here.

An embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, the computer program includes program instructions, and when the program instructions are executed by a processor, part or all of the steps of the above-mentioned disease risk prediction method embodiment can be performed, such as part or all of the steps performed by the risk prediction device.

Optionally, the storage medium involved in this application, such as a computer-readable storage medium, may be non-volatile or volatile.

The embodiment of the present application also provides a computer program product. The computer program product includes computer program code, and when the computer program code runs on a computer, the computer executes the steps performed in the above-mentioned disease risk prediction method embodiment.

In some embodiments, the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store Data created based on the use of blockchain nodes, etc.

Among them, the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information for verification. The validity of the information (anti-counterfeiting) and the generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
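The idea of data blocks linked by cryptographic methods can be sketched with a hash chain. This is a generic illustration using SHA-256 from the Python standard library, not the blockchain platform used by this application; the block layout and the function names `make_block` and `chain_is_valid` are assumptions, and consensus and networking are omitted.

```python
import hashlib
import json
from typing import List

GENESIS_HASH = "0" * 64  # placeholder predecessor hash for the first block


def make_block(records: List[dict], prev_hash: str) -> dict:
    """Create a block whose hash covers its records and the previous block's hash."""
    body = json.dumps({"records": records, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"records": records, "prev_hash": prev_hash, "hash": digest}


def chain_is_valid(chain: List[dict]) -> bool:
    """Check that every block links to its predecessor and has not been altered."""
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = make_block(block["records"], block["prev_hash"])["hash"]
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical usage: two blocks of diagnosis and treatment records.
first = make_block([{"user": "user-42", "killip": 3}], GENESIS_HASH)
second = make_block([{"user": "user-43", "killip": 1}], first["hash"])
chain = [first, second]
```

Changing any stored record changes the recomputed hash of its block, so `chain_is_valid` returns `False` for a tampered chain; this is the property the text relies on when it says blockchain storage improves the reliability of the data.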

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through a computer program. The program can be stored in a computer readable storage medium. During execution, it may include the procedures of the above-mentioned method embodiments. Wherein, the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

What is disclosed above is only a preferred embodiment of this application and, of course, cannot be used to limit the scope of rights of this application. A person of ordinary skill in the art can understand all or part of the processes for implementing the above embodiments, and equivalent changes made in accordance with the claims of this application still fall within the scope of the invention.

Claims

A disease risk prediction method, including:

Obtain the diagnosis and treatment data corresponding to the target disease of multiple users;

Determining multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screening a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficients of the second risk factors are integers determined from a set of integers;

Obtaining target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
The method according to claim 1, wherein the method further comprises:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

The diagnosis and treatment data includes multiple risk factor data; the determination of multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data includes:

Acquiring multiple risk factor data included in the diagnosis and treatment data;

According to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, the risk factor data is converted into a binary feature to obtain a plurality of first risk factors.
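The conversion of risk factor data into binary features can be sketched as follows. The claim does not fix how the relationship with the outcome data is used, so this sketch assumes one simple rule: among hypothetical candidate cut-points, pick the one that maximizes the gap in outcome rate between the two resulting groups. The function names and the sample values are illustrative only.

```python
def best_threshold(values, outcomes, candidates):
    """Pick the candidate cut-point that best separates outcome rates.

    values: numeric risk factor data, one entry per user.
    outcomes: 0/1 outcome data for the same users.
    candidates: assumed candidate cut-points (not specified by the source).
    """
    def rate(group):
        return sum(group) / len(group) if group else 0.0

    best, best_gap = None, -1.0
    for t in candidates:
        above = [o for v, o in zip(values, outcomes) if v >= t]
        below = [o for v, o in zip(values, outcomes) if v < t]
        gap = abs(rate(above) - rate(below))
        if gap > best_gap:
            best, best_gap = t, gap
    return best


def binarize(values, threshold):
    """Convert numeric risk factor data into a 0/1 feature."""
    return [1 if v >= threshold else 0 for v in values]


ages = [50, 55, 62, 70, 75]
outcomes = [0, 0, 0, 1, 1]
cut = best_threshold(ages, outcomes, candidates=[55, 65, 72])
age_feature = binarize(ages, cut)
```

With these sample values the cut-point 65 separates the two outcome groups completely, so the resulting binary feature could be read as "age is at least 65".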
The method according to claim 1, wherein said acquiring target diagnosis and treatment data of a target user, calling said risk prediction model to determine the target risk factors corresponding to said target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on said target risk factors and the coefficients of the target risk factors, includes:

Acquiring target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;

Inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and the coefficients of the target risk factors;

Wherein, the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.
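The scoring rule in this claim, a prediction score equal to the sum of the coefficients of the target risk factors, can be sketched as below. The sketch assumes that a target risk factor is a model factor whose binary feature equals 1 for the target user; the factor names and coefficient values are hypothetical.

```python
def target_risk_factors(binary_features, coefficients):
    """Return the model's risk factors that are present (feature == 1)."""
    return [name for name in coefficients if binary_features.get(name, 0) == 1]


def prediction_score(binary_features, coefficients):
    """Sum the integer coefficients of the target risk factors."""
    return sum(
        coefficients[name]
        for name in target_risk_factors(binary_features, coefficients)
    )


# Hypothetical model: integer coefficients for the screened risk factors.
model = {"age_ge_65": 2, "smoker": 1, "hypertension": 3}
# Hypothetical binary features of one target user.
user = {"age_ge_65": 1, "smoker": 0, "hypertension": 1}

score = prediction_score(user, model)
```

Because the coefficients are integers, a score of this kind can be added up by hand from a short table of risk factors.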
The method according to claim 1, wherein the method further comprises:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

Said screening a plurality of second risk factors from the plurality of first risk factors according to the objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors, includes:

According to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, determining, with minimization of the objective function as the goal, the plurality of screened second risk factors and the coefficient of each screened second risk factor, so as to obtain the risk prediction model by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.
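A toy sketch of such a training step is given below. It assumes that the objective is the mean logistic loss plus a weight `lam` times the two-norm of the coefficient vector, that there is no intercept term, and that a first risk factor whose coefficient is zero is treated as screened out. The exhaustive search over the integer set only works for a handful of features and is a stand-in for whatever solver an actual implementation uses; all names and values are hypothetical.

```python
import itertools
import math


def objective(coefs, X, y, lam):
    """Mean logistic loss plus lam times the two-norm of the coefficients."""
    loss = 0.0
    for row, label in zip(X, y):
        z = sum(c * x for c, x in zip(coefs, row))
        signed = z if label == 1 else -z  # labels are 0/1
        loss += math.log1p(math.exp(-signed))
    loss /= len(X)
    return loss + lam * math.sqrt(sum(c * c for c in coefs))


def fit_integer_model(X, y, integer_set, lam):
    """Try every integer coefficient vector and keep the minimizer."""
    n_features = len(X[0])
    best = min(
        itertools.product(integer_set, repeat=n_features),
        key=lambda coefs: objective(coefs, X, y, lam),
    )
    return list(best)


# Rows are users, columns are binary first risk factors (toy data).
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
integer_set = [-2, -1, 0, 1, 2]

coefs = fit_integer_model(X, y, integer_set, lam=0.1)
selected = [i for i, c in enumerate(coefs) if c != 0]  # screened second risk factors
```

Raising `lam` pushes the minimizer toward smaller coefficients, and at a large enough value toward the all-zero vector, which is how the penalty weight influences which factors remain in this sketch.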
The method according to any one of claims 1-4, wherein the method further comprises:

Receiving a disease risk prediction request sent by a terminal, where the disease risk prediction request carries the target user's identification; the target diagnosis and treatment data is obtained according to the target user's identification;

Determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;

Wherein, the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
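The early warning step can be sketched as follows, assuming a per-disease threshold table and a per-user-group treatment plan table. The function name `build_warning`, the dictionary layout, and the sample values are hypothetical; how the warning is transmitted to the terminal is left out.

```python
def build_warning(score, risk_items, disease_type, thresholds, treatment_plans, user_group):
    """Return early warning information, or None if no warning is needed.

    A warning is produced only when the score is greater than the
    target score threshold for the given disease type.
    """
    threshold = thresholds[disease_type]
    if score <= threshold:
        return None
    return {
        "risk_items": list(risk_items),
        "prediction_score": score,
        "treatment_plan": treatment_plans[user_group],
    }


# Hypothetical lookup tables.
thresholds = {"diabetes": 4}
treatment_plans = {"adults_over_60": "plan A"}

warning = build_warning(
    5, ["hypertension"], "diabetes", thresholds, treatment_plans, "adults_over_60"
)
no_warning = build_warning(
    4, ["hypertension"], "diabetes", thresholds, treatment_plans, "adults_over_60"
)
```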
The method according to any one of claims 1-4, wherein the method further comprises:

Sending a diagnosis and treatment data acquisition request to a storage device, where the diagnosis and treatment data acquisition request carries the identification of the target user; the storage device is a blockchain node;

The acquiring target diagnosis and treatment data of the target user includes:

Receiving the target diagnosis and treatment data sent by the storage device.
A disease risk prediction device, including:

The acquisition module is used to acquire the diagnosis and treatment data corresponding to the target diseases of multiple users;

A determining module, configured to determine multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

A processing module, configured to screen a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determine the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficients of the second risk factors are integers determined from a set of integers;

The acquisition module is also used to acquire target diagnosis and treatment data of the target user;

The processing module is further configured to call the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determine the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
A risk prediction device, including a processor and a memory that are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the following method:

Obtain the diagnosis and treatment data corresponding to the target disease of multiple users;

Determining multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screening a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficients of the second risk factors are integers determined from a set of integers;

Obtaining target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
The risk prediction device according to claim 8, wherein the processor is further configured to execute:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

The diagnosis and treatment data includes multiple risk factor data; performing the determining of multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data includes:

Acquiring multiple risk factor data included in the diagnosis and treatment data;

According to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, the risk factor data is converted into a binary feature to obtain a plurality of first risk factors.
The risk prediction device according to claim 8, wherein performing the acquiring of target diagnosis and treatment data of the target user, the calling of the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the determining of the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors includes:

Acquiring target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;

Inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and the coefficients of the target risk factors;

Wherein, the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.
The risk prediction device according to claim 8, wherein the processor is further configured to execute:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

Performing the screening of a plurality of second risk factors from the plurality of first risk factors according to the objective function including the two-norm of the plurality of first risk factors, and the determining of the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors, includes:

According to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, determining, with minimization of the objective function as the goal, the plurality of screened second risk factors and the coefficient of each screened second risk factor, so as to obtain the risk prediction model by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.
The risk prediction device according to any one of claims 8-11, wherein the processor is further configured to execute:

Receiving a disease risk prediction request sent by a terminal, where the disease risk prediction request carries the target user's identification; the target diagnosis and treatment data is obtained according to the target user's identification;

Determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;

Wherein, the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
The risk prediction device according to any one of claims 8-11, wherein the processor is further configured to execute:

Sending a diagnosis and treatment data acquisition request to a storage device, where the diagnosis and treatment data acquisition request carries the identification of the target user; the storage device is a blockchain node;

Executing the acquisition of the target diagnosis and treatment data of the target user includes:

Receiving the target diagnosis and treatment data sent by the storage device.
A computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to perform the following method:

Obtain the diagnosis and treatment data corresponding to the target disease of multiple users;

Determining multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screening a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficients of the second risk factors are integers determined from a set of integers;

Obtaining target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.
The computer-readable storage medium according to claim 14, wherein, when the program instructions are executed by the processor, they also cause the processor to execute:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

The diagnosis and treatment data includes multiple risk factor data; performing the determining of multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data includes:

Acquiring multiple risk factor data included in the diagnosis and treatment data;

According to the relationship between the risk factor data and the outcome data corresponding to the diagnosis and treatment data, the risk factor data is converted into a binary feature to obtain a plurality of first risk factors.
The computer-readable storage medium according to claim 14, wherein performing the acquiring of target diagnosis and treatment data of the target user, the calling of the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and the determining of the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors includes:

Acquiring target diagnosis and treatment data of the target user, and converting the target diagnosis and treatment data into binary features;

Inputting the binary features corresponding to the target diagnosis and treatment data into the risk prediction model to obtain the target risk factors corresponding to the binary features and the coefficient of each target risk factor, and determining the risk prediction result based on the target risk factors and the coefficients of the target risk factors;

Wherein, the risk prediction result includes a prediction score of the target user for the target disease, and the prediction score is the sum of the coefficients of each of the target risk factors.
The computer-readable storage medium according to claim 14, wherein, when the program instructions are executed by the processor, they also cause the processor to execute:

Obtaining outcome data corresponding to the diagnosis and treatment data, where the outcome data is used to indicate the health status of the user;

Performing the screening of a plurality of second risk factors from the plurality of first risk factors according to the objective function including the two-norm of the plurality of first risk factors, and the determining of the coefficient of each screened second risk factor, so as to determine the risk prediction model based on the coefficients of the second risk factors, includes:

According to the plurality of first risk factors corresponding to the diagnosis and treatment data and the outcome data, determining, with minimization of the objective function as the goal, the plurality of screened second risk factors and the coefficient of each screened second risk factor, so as to obtain the risk prediction model by training; wherein the objective function is determined according to a logistic loss function and a two-norm, and the two-norm is used to control the number of second risk factors that are screened out.
18. The computer-readable storage medium according to any one of claims 14-17, wherein the program instructions, when executed by the processor, also cause the processor to execute:

Receiving a disease risk prediction request sent by a terminal, where the disease risk prediction request carries the target user's identification; the target diagnosis and treatment data is obtained according to the target user's identification;

Determining a target score threshold according to the type of the target disease, and sending early warning information to the terminal when the predicted score included in the risk prediction result is greater than the target score threshold;

Wherein, the prediction score is the sum of the coefficients of the target risk factors, the early warning information includes the risk items corresponding to the target risk factors, the prediction score, and a treatment plan, and the treatment plan is the treatment plan corresponding to the user group to which the target user belongs.
19. The computer-readable storage medium according to any one of claims 14-17, wherein the program instructions, when executed by the processor, also cause the processor to execute:

Sending a diagnosis and treatment data acquisition request to a storage device, where the diagnosis and treatment data acquisition request carries the identification of the target user; the storage device is a blockchain node;

Executing the acquisition of the target diagnosis and treatment data of the target user includes:

Receiving the target diagnosis and treatment data sent by the storage device.
A disease risk prediction system, including: a risk prediction device and a storage device; wherein the storage device is used to store diagnosis and treatment data of users;

The risk prediction device is used to perform the following steps:

Acquiring the diagnosis and treatment data corresponding to the target diseases of multiple users from the storage device;

Determining multiple first risk factors corresponding to the target disease according to the diagnosis and treatment data;

Screening a plurality of second risk factors from the plurality of first risk factors according to an objective function including the two-norm of the plurality of first risk factors, and determining the coefficient of each screened second risk factor, so as to determine a risk prediction model based on the coefficients of the second risk factors; wherein the coefficients of the second risk factors are integers determined from a set of integers;

Obtaining target diagnosis and treatment data of a target user, calling the risk prediction model to determine the target risk factors corresponding to the target diagnosis and treatment data and the coefficient of each target risk factor, and determining the risk prediction result of the target user for the target disease based on the target risk factors and the coefficients of the target risk factors.