CN115148375B - High-throughput real world drug effectiveness and safety evaluation method and system - Google Patents

Publication number
CN115148375B
Authority
CN
China
Prior art keywords: drug, target, event, effectiveness, calculating
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211051408.2A
Other languages
Chinese (zh)
Other versions
CN115148375A (en
Inventor
李劲松
王昱
马爽
王嘉琪
田雨
周天舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Application filed by Zhejiang Lab
Priority to CN202211051408.2A
Publication of CN115148375A
Application granted
Publication of CN115148375B
Priority to JP2023095912A (JP7433503B1)

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00: ICT specially adapted for the handling or processing of medical references
    • G16H70/40: ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Toxicology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a high-throughput real-world drug effectiveness and safety evaluation method and system, comprising the following steps: step S1: constructing a data set; step S2: constructing a drug effectiveness/safety index phenotype library; step S3: selecting a target drug, defining a target event, an outcome event, a target date and an outcome date, and extracting the patient ID, the target date and the outcome date; step S4: selecting indexes from the drug effectiveness/safety index phenotype library to obtain target drug-effectiveness/safety index pairs, and performing high-throughput signal screening on the target drug-effectiveness/safety index pairs by event sequence symmetry analysis to obtain preliminary-screening positive signals; step S5: performing causal evaluation on the preliminary-screening positive signals to determine the effectiveness and safety of the target drug. Through high-throughput signal screening and a causal evaluation algorithm based on simulated clinical trials, the invention completes the causal evaluation of the effects of the target drug in a high-throughput manner.

Description

High-throughput real-world drug effectiveness and safety evaluation method and system
Technical Field
The invention relates to the technical field of drug management, in particular to a high-throughput real-world drug effectiveness and safety evaluation method and system.
Background
Drug evaluation is an important means of promoting rational clinical medication and effectively controlling medical expenses. Medical research aims to use the highest-quality evidence to determine the efficacy and safety of new therapeutic substances while minimizing a drug's time to market. Randomized controlled trials (RCTs) revolutionized medicine in the 20th century and provide the gold standard for evaluating drug efficacy. However, randomized controlled trials are not always feasible, particularly for severe or rare diseases, where subject enrollment is difficult and slow. The quantity and availability of routine medical data acquired electronically during real diagnosis and treatment, also called real-world data (RWD), have increased rapidly in recent years, and such data offer a new way to evaluate the safety and effectiveness of drugs after they are marketed. Because RWD often include patients with widely differing characteristics, they reflect actual treatment conditions more faithfully than clinical trials, and studies using RWD can better represent the real population requiring treatment in clinical practice. On the one hand, evaluating drugs with RWD yields evidence on how a drug is actually used clinically after marketing, which promotes safe and effective medication for patients; on the other hand, it can inform the research and development of new drugs and the design of new clinical trials for new uses of existing drugs.
Although countries around the world have in recent years issued policies and regulations to promote the development and marketing of new drugs, there is no general method or framework for evaluating drugs with RWD. Regarding effectiveness, the system of the invention patent with publication No. CN114025253A, a drug efficacy evaluation system based on real-world studies, can be applied to evaluating lesion changes by realizing VNA wide-area barrier-free data acquisition and acquiring real-world medical data from multiple directions. Regarding safety, the invention patent with grant No. CN111402971B, a method and system for rapidly identifying adverse drug reactions based on big data, and the invention patent with grant No. CN104765947B, a big-data-oriented method for mining potential adverse drug reactions, discover or evaluate adverse drug reactions from reported adverse drug reaction information by combining domain knowledge with the degree of association between drugs and adverse reactions in real-world data. The invention patent CN202111628626.3 discloses a method and device for identifying adverse drug reactions by prescription sequence symmetry analysis; its basic principle is to use a marker drug as a proxy event for the adverse drug reaction and to identify adverse drug reactions by analyzing the temporal order in which the index drug and the marker drug are used.
The above approaches have the following limitations: 1) signals of drug-adverse reaction relationships come from adverse drug reaction reporting systems, so many signals that truly occur in the real world but go unnoticed may be missed; 2) the association between a drug and an adverse reaction is derived by combining domain knowledge with real-world data and cannot represent a causal relationship between the two; 3) prescription sequence symmetry analysis is mostly applied to insurance databases, and the choice of marker drug can seriously affect the reliability of the algorithm's results.
Therefore, a high-throughput real-world drug effectiveness and safety evaluation method and system are proposed.
Disclosure of Invention
In order to solve the technical problems, the invention provides a high-throughput real-world drug effectiveness and safety evaluation method and system.
The technical scheme adopted by the invention is as follows:
a high-throughput real-world drug effectiveness and safety evaluation method comprises the following steps:
step S1: acquiring real-world data, applying unified data coding to the data, extracting the minimum necessary data from the uniformly coded data, and completing construction of a data set;
step S2: constructing a "concept set + behavior + standard" triple pattern, and constructing a drug effectiveness/safety index phenotype library from the triple pattern, a drug effectiveness/safety term set and clinical indexes;
step S3: selecting a target drug, and selecting indexes from the drug effectiveness/safety index phenotype library to obtain target drug-effectiveness/safety index pairs; defining a target event, an outcome event, a target date and an outcome date, extracting the patient ID, the target date and the outcome date, and forming a data set for high-throughput signal preliminary screening;
step S4: for the data set for high-throughput signal preliminary screening, performing high-throughput signal screening on the target drug-effectiveness/safety index pairs by event sequence symmetry analysis to obtain preliminary-screening positive signals;
step S5: performing causal evaluation on the preliminary-screening positive signals to determine the effectiveness and safety of the target drug.
Further, step S3 specifically comprises the following sub-steps:
step S31: taking the first use of the target drug as the target event, taking one or more indexes selected by the user from the drug effectiveness/safety index phenotype library as outcome indexes, and taking the first occurrence of each outcome index as an outcome event;
step S32: taking the date of the target event as the target date, and the date of the outcome event as the outcome date;
step S33: for the selected target event and outcome event, screening the data set for patients in whom both the target event and the outcome event occur, extracting the patient ID, the target date and the outcome date, and forming the data set for high-throughput signal preliminary screening.
Further, step S4 specifically comprises the following sub-steps:
step S41: calculating a sequence ratio as the ratio of the number of patients in whom the outcome event occurs after the target event to the number of patients in whom the outcome event occurs before the target event;
step S42: calculating a null-effect sequence ratio, and adjusting the sequence ratio by the null-effect sequence ratio to obtain an adjusted sequence ratio;
step S43: calculating a confidence interval of the adjusted sequence ratio from a confidence interval of the probability that the target event occurs before the outcome event;
step S44: taking signals for which the lower limit of the confidence interval of the adjusted sequence ratio is greater than 1 as preliminary-screening positive signals.
Further, step S42 specifically comprises the following sub-steps:
step S421: for the data set for high-throughput signal preliminary screening, calculating the overall average probability that the outcome event occurs within a preset observation window after the target event;
step S422: calculating the null-effect sequence ratio from the overall average probability;
step S423: calculating the ratio of the sequence ratio to the null-effect sequence ratio to obtain the adjusted sequence ratio.
Further, step S5 specifically comprises the following sub-steps:
step S51: for the target drug-effectiveness/safety index pair corresponding to a preliminary-screening positive signal, screening the data set for patients in whom the target event occurs, and assigning the patients to a user cohort and an initial non-user cohort;
step S52: randomly selecting one of the drugs used by the patients in the initial non-user cohort as a substitute drug, and constructing a non-user cohort;
step S53: taking the time of first use of the target drug or the substitute drug as the enrollment time, the target drug or the substitute drug as the treatment, and whether the outcome event occurs as the outcome; extracting the patients' baseline data before enrollment, forming a covariate set from the demographic information in the data set and the baseline data, and performing feature extraction and screening to obtain the variables used for calculating the average treatment effect;
step S54: calculating the average treatment effect of the target drug using the variables used for calculating the average treatment effect;
step S55: repeating steps S52-S54 until the maximum number of trials is reached, calculating the mean of all the average treatment effects, and determining the effectiveness and safety of the target drug by calculating the confidence interval of the mean effect.
Further, step S51 specifically comprises the following sub-steps:
step S511: for the target drug-effectiveness/safety index pair corresponding to the preliminary-screening positive signal, screening the data set for patients in whom the target event occurs, acquiring the primary diagnoses of these patients, counting the patients by ICD-10 code, and taking the primary diagnosis with the largest number of patients as the base diagnosis;
step S512: assigning the patients to a user cohort and an initial non-user cohort.
Further, step S53 specifically comprises the following sub-steps:
step S531: selecting one covariate from the covariate set and denoting it the important covariate, and denoting the other variables in the covariate set the remaining variables;
step S532: for the conditional expectation of the outcome given the important covariate and the remaining variables, constructing an outcome event prediction model with all covariates as input variables, to obtain an estimate of the conditional expectation of the outcome;
step S533: estimating the probability that the important covariate takes the value 1 given the remaining variables, to obtain an estimate of that probability;
step S534: constructing a variable-importance perturbation indicator variable using the estimate of the probability that the important covariate takes the value 1;
step S535: taking the estimate of the conditional expectation of the outcome as the intercept term and the variable-importance perturbation indicator variable as the input variable, constructing an outcome event regression model to obtain a variable-importance perturbation parameter;
step S536: updating the estimate of the conditional expectation of the outcome using the variable-importance perturbation parameter and the variable-importance perturbation indicator variable, and calculating the variable importance;
step S537: calculating the standard deviation and confidence interval of the variable importance;
step S538: repeating steps S531-S537 until every covariate in the covariate set has been traversed, and taking the variables for which the confidence interval of the variable importance, between its left and right limits, does not contain zero as the variables used for calculating the average treatment effect.
Further, step S54 specifically comprises the following sub-steps:
step S541: constructing an outcome event prediction model with the treatment and the variables used for calculating the average treatment effect as input variables, to obtain an estimate of the conditional expectation of the outcome event;
step S542: estimating the probability of treatment assignment to obtain an estimate of the treatment assignment;
step S543: constructing a perturbation indicator variable using the estimate of the treatment assignment;
step S544: taking the estimate of the conditional expectation of the outcome event, computed with the treatment and the variables used for calculating the average treatment effect as input, as the intercept term, constructing a regression model of the outcome event on the perturbation indicator variable, and obtaining an estimated perturbation parameter;
step S545: updating the estimate of the conditional expectation of the outcome event using the estimated perturbation parameter and the perturbation indicator variable, and calculating the average treatment effect.
Further, step S55 specifically comprises the following sub-steps:
step S551: repeating steps S52-S54 until the maximum number of trials is reached, and calculating the mean of all the average treatment effects;
step S552: calculating the confidence interval of the mean effect, taking signals whose lower confidence limit is greater than zero or whose upper confidence limit is less than zero as final positive signals, and determining the effectiveness and safety of the target drug.
The invention also provides a high-throughput real-world drug effectiveness and safety evaluation system, comprising:
a data acquisition module, configured to acquire data from existing data and clean the data to complete construction of a data set;
a high-throughput signal screening module, configured to convert the data in the data set into clinical events and perform high-throughput signal screening by event sequence symmetry analysis to obtain preliminary-screening positive signals;
a causal evaluation module, configured to perform causal evaluation on the preliminary-screening positive signals and determine the effectiveness and safety of the target drug;
a result display module, configured to display the effectiveness and safety results of the target drug.
The beneficial effects of the invention are as follows:
1. The invention is based on large real-world data sets, is not limited to specific drugs or specific effectiveness/safety indexes, and completes the causal evaluation of the effects of the target drug in a high-throughput manner through high-throughput signal screening and a causal evaluation algorithm based on simulated clinical trials.
2. The invention constructs an extensible drug effectiveness/safety phenotype library based on the "concept set + behavior + standard" triple pattern for high-throughput real-world drug evaluation.
3. In evaluating the causal relationship between the target drug and the outcome event, two-stage targeted maximum likelihood estimation is adopted: first, the importance of all real-world data features of the patient cohort is evaluated and, without relying on human expert knowledge, the covariates most relevant to the outcome event and the treatment assignment are selected in a data-driven manner; then the average treatment effect is calculated while controlling for these covariates. Reliable average treatment effect results are obtained by repeatedly simulating clinical trials with substitute drugs randomly selected with replacement, so as to generate a quantitative causal evaluation for each target drug-outcome index pair.
Drawings
FIG. 1 is a schematic flow chart of a high throughput real-world drug effectiveness and safety evaluation method according to the present invention;
FIG. 2 is a schematic diagram of a high-throughput real-world drug efficacy and safety evaluation system according to the present invention.
Detailed Description
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a high-throughput real-world drug effectiveness and safety evaluation method comprises the following steps:
Step S1: acquiring real-world data, applying unified data coding to the data, extracting the minimum necessary data from the uniformly coded data, and completing construction of a data set;
Step S2: constructing a "concept set + behavior + standard" triple pattern, and constructing a drug effectiveness/safety index phenotype library from the triple pattern, a drug effectiveness/safety term set and clinical indexes;
Step S3: selecting a target drug, and selecting indexes from the drug effectiveness/safety index phenotype library to obtain target drug-effectiveness/safety index pairs; defining a target event, an outcome event, a target date and an outcome date, extracting the patient ID, the target date and the outcome date, and forming a data set for high-throughput signal preliminary screening;
step S31: taking the first use of the target drug as the target event, taking one or more indexes selected by the user from the drug effectiveness/safety index phenotype library as outcome indexes, and taking the first occurrence of each outcome index as an outcome event;
step S32: taking the date of the target event as the target date, and the date of the outcome event as the outcome date;
step S33: for the selected target event and outcome event, screening the data set for patients in whom both the target event and the outcome event occur, extracting the patient ID, the target date and the outcome date, and forming the data set for high-throughput signal preliminary screening.
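Steps S31-S33 can be illustrated with a minimal sketch. It assumes that medication and outcome records are available as `(patient_id, code, date)` tuples; this record layout, the function names `first_dates` and `build_prescreen_dataset`, and the example codes are hypothetical and are not specified in the source.

```python
from datetime import date

def first_dates(records, codes):
    """Return {patient_id: earliest date} over records whose code is in `codes`."""
    first = {}
    for patient_id, code, when in records:
        if code in codes and (patient_id not in first or when < first[patient_id]):
            first[patient_id] = when
    return first

def build_prescreen_dataset(medications, outcomes, target_drug_codes, outcome_codes):
    """Rows of (patient_id, target_date, outcome_date) for patients with both events."""
    # S31/S32: target event = first use of the target drug
    target = first_dates(medications, target_drug_codes)
    # S31/S32: outcome event = first occurrence of the outcome index
    outcome = first_dates(outcomes, outcome_codes)
    # S33: keep only patients in whom both events occur
    both = sorted(target.keys() & outcome.keys())
    return [(pid, target[pid], outcome[pid]) for pid in both]

# Hypothetical example records
medications = [
    ("p1", "DRUG_A", date(2020, 3, 1)),
    ("p1", "DRUG_A", date(2020, 1, 5)),
    ("p2", "DRUG_A", date(2020, 2, 1)),
    ("p3", "DRUG_B", date(2020, 2, 2)),
]
outcomes = [
    ("p1", "OUT_X", date(2020, 2, 10)),
    ("p3", "OUT_X", date(2020, 4, 1)),
]
rows = build_prescreen_dataset(medications, outcomes, {"DRUG_A"}, {"OUT_X"})
```

In this example only patient `p1` has both a target event and an outcome event, so only that patient enters the preliminary screening data set.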
Step S4: for the data set for high-throughput signal preliminary screening, performing high-throughput signal screening on the target drug-effectiveness/safety index pairs by event sequence symmetry analysis to obtain preliminary-screening positive signals;
step S41: calculating a sequence ratio as the ratio of the number of patients in whom the outcome event occurs after the target event to the number of patients in whom the outcome event occurs before the target event;
step S42: calculating a null-effect sequence ratio, and adjusting the sequence ratio by the null-effect sequence ratio to obtain an adjusted sequence ratio;
step S421: for the data set for high-throughput signal preliminary screening, calculating the overall average probability that the outcome event occurs within a preset observation window after the target event;
step S422: calculating the null-effect sequence ratio from the overall average probability;
step S423: calculating the ratio of the sequence ratio to the null-effect sequence ratio to obtain the adjusted sequence ratio.
Step S43: calculating a confidence interval of the adjusted sequence ratio from a confidence interval of the probability that the target event occurs before the outcome event;
step S44: taking signals for which the lower limit of the confidence interval of the adjusted sequence ratio is greater than 1 as preliminary-screening positive signals.
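The following sketch shows one way steps S41-S44 could be computed. The text above does not give the formulas, so two assumptions are made here: the null-effect sequence ratio is taken as a / (1 - a), where a is the day-weighted average probability used in conventional sequence symmetry analysis, and the confidence interval for the order probability is a Wilson score interval. The function name, the 365-day window and the example data are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from math import sqrt

def sequence_symmetry(pairs, window_days=365, z=1.96):
    """pairs: one (target_date, outcome_date) per patient."""
    # Keep pairs inside the observation window; same-day pairs carry no order.
    pairs = [(t, o) for t, o in pairs if 0 < abs((o - t).days) <= window_days]
    n_after = sum(1 for t, o in pairs if o > t)   # outcome after target
    n_before = len(pairs) - n_after               # outcome before target
    if n_after == 0 or n_before == 0:
        raise ValueError("both event orders must be observed")
    crude_sr = n_after / n_before                 # S41

    # S421-S422 (assumed formula): probability `a` that an outcome follows a
    # target event within the window if the two were unrelated.
    targets = Counter(t.toordinal() for t, _ in pairs)
    outcomes = Counter(o.toordinal() for _, o in pairs)
    num = den = 0
    for day, n_target in targets.items():
        after = sum(c for d, c in outcomes.items() if 0 < d - day <= window_days)
        before = sum(c for d, c in outcomes.items() if 0 < day - d <= window_days)
        num += n_target * after
        den += n_target * (after + before)
    a = num / den
    if not 0.0 < a < 1.0:
        raise ValueError("null-effect sequence ratio is undefined")
    null_sr = a / (1.0 - a)
    adjusted_sr = crude_sr / null_sr              # S423

    # S43 (assumed method): Wilson interval for p = P(target before outcome),
    # mapped to the ratio scale via p / (1 - p) and divided by null_sr.
    n = n_after + n_before
    p = n_after / n
    centre = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    lo, hi = centre - half, centre + half
    ci = (lo / (1 - lo) / null_sr, hi / (1 - hi) / null_sr)
    return {"crude_sr": crude_sr, "null_sr": null_sr, "adjusted_sr": adjusted_sr,
            "ci": ci, "positive": ci[0] > 1}      # S44

# Hypothetical data: 30 patients with the outcome after the target event,
# 10 patients with the outcome before it.
base = date(2020, 1, 1)
pairs = [(base + timedelta(days=i), base + timedelta(days=i + 10)) for i in range(30)]
pairs += [(base + timedelta(days=100 + j), base + timedelta(days=90 + j)) for j in range(10)]
res = sequence_symmetry(pairs)
```

Dividing by the null-effect ratio is what separates a real order asymmetry from one caused only by trends in how often the drug and the outcome are recorded over time.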
Step S5: performing causal evaluation on the preliminary-screening positive signals to determine the effectiveness and safety of the target drug.
Step S51: for the target drug-effectiveness/safety index pair corresponding to a preliminary-screening positive signal, screening the data set for patients in whom the target event occurs, and assigning the patients to a user cohort and an initial non-user cohort;
step S511: for the target drug-effectiveness/safety index pair corresponding to the preliminary-screening positive signal, screening the data set for patients in whom the target event occurs, acquiring the primary diagnoses of these patients, counting the patients by ICD-10 code, and taking the primary diagnosis with the largest number of patients as the base diagnosis;
step S512: assigning the patients to a user cohort and an initial non-user cohort.
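The base-diagnosis selection in step S511 amounts to a frequency count over ICD-10 codes. A minimal sketch follows, assuming each patient has exactly one primary diagnosis stored in a dictionary; the function name and this layout are hypothetical, and how ties are broken is not stated in the source (here the code counted first wins).

```python
from collections import Counter

def base_diagnosis(primary_diagnoses):
    """primary_diagnoses: {patient_id: ICD-10 code of the primary diagnosis}."""
    if not primary_diagnoses:
        raise ValueError("no patients with the target event")
    # Count patients per ICD-10 code and take the most frequent code.
    counts = Counter(primary_diagnoses.values())
    code, _ = counts.most_common(1)[0]
    return code

# Hypothetical primary diagnoses of patients who used the target drug
diagnoses = {"p1": "E11", "p2": "E11", "p3": "I10"}
selected = base_diagnosis(diagnoses)
```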
Step S52: randomly selecting one of the drugs used by the patients in the initial non-user cohort as a substitute drug, and constructing a non-user cohort;
step S53: taking the time of first use of the target drug or the substitute drug as the enrollment time, the target drug or the substitute drug as the treatment, and whether the outcome event occurs as the outcome; extracting the patients' baseline data before enrollment, forming a covariate set from the demographic information in the data set and the baseline data, and performing feature extraction and screening to obtain the variables used for calculating the average treatment effect;
step S531: selecting one covariate from the covariate set and denoting it the important covariate, and denoting the other variables in the covariate set the remaining variables;
step S532: for the conditional expectation of the outcome given the important covariate and the remaining variables, constructing an outcome event prediction model with all covariates as input variables, to obtain an estimate of the conditional expectation of the outcome;
step S533: estimating the probability that the important covariate takes the value 1 given the remaining variables, to obtain an estimate of that probability;
step S534: constructing a variable-importance perturbation indicator variable using the estimate of the probability that the important covariate takes the value 1;
step S535: taking the estimate of the conditional expectation of the outcome as the intercept term and the variable-importance perturbation indicator variable as the input variable, constructing an outcome event regression model to obtain a variable-importance perturbation parameter;
step S536: updating the estimate of the conditional expectation of the outcome using the variable-importance perturbation parameter and the variable-importance perturbation indicator variable, and calculating the variable importance;
step S537: calculating the standard deviation and confidence interval of the variable importance;
step S538: repeating steps S531-S537 until every covariate in the covariate set has been traversed, and taking the variables for which the confidence interval of the variable importance, between its left and right limits, does not contain zero as the variables used for calculating the average treatment effect.
Step S54: calculating the average treatment effect of the target drug using the variables used for calculating the average treatment effect;
step S541: constructing an outcome event prediction model with the treatment and the variables used for calculating the average treatment effect as input variables, to obtain an estimate of the conditional expectation of the outcome event;
step S542: estimating the probability of treatment assignment to obtain an estimate of the treatment assignment;
step S543: constructing a perturbation indicator variable using the estimate of the treatment assignment;
step S544: taking the estimate of the conditional expectation of the outcome event, computed with the treatment and the variables used for calculating the average treatment effect as input, as the intercept term, constructing a regression model of the outcome event on the perturbation indicator variable, and obtaining an estimated perturbation parameter;
step S545: updating the estimate of the conditional expectation of the outcome event using the estimated perturbation parameter and the perturbation indicator variable, and calculating the average treatment effect.
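Steps S543-S545 resemble the targeting step of targeted maximum likelihood estimation, in which the perturbation indicator variable is usually called the clever covariate and the perturbation parameter the fluctuation parameter. The sketch below covers only that step under stated assumptions: the outcome is binary, the initial outcome predictions (S541) and treatment probabilities (S542) are supplied by any model, the "intercept term" is treated as a fixed offset on the logit scale, and the parameter is fitted by Newton's method. The function names, the truncation bound and the example data are hypothetical.

```python
from math import exp, log

def expit(x):
    return 1.0 / (1.0 + exp(-x))

def logit(p):
    return log(p / (1.0 - p))

def tmle_ate(y, a, q1, q0, g, bound=1e-3, tol=1e-10, max_iter=100):
    """y, a: binary outcome and treatment per patient.
    q1, q0: initial estimates of P(Y=1 | A=1, W) and P(Y=1 | A=0, W)  (S541).
    g: estimated treatment probability P(A=1 | W)                     (S542)."""
    clip = lambda p: min(max(p, bound), 1.0 - bound)  # keep logits finite
    g = [clip(p) for p in g]
    off1 = [logit(clip(q)) for q in q1]
    off0 = [logit(clip(q)) for q in q0]
    offset = [o1 if ai == 1 else o0 for ai, o1, o0 in zip(a, off1, off0)]
    # S543: perturbation indicator (clever covariate) from the treatment model
    h = [ai / gi - (1 - ai) / (1 - gi) for ai, gi in zip(a, g)]
    # S544: logistic regression of y on h with a fixed offset, fitted by
    # Newton's method for the single perturbation parameter eps
    eps = 0.0
    for _ in range(max_iter):
        p = [expit(o + eps * hi) for o, hi in zip(offset, h)]
        score = sum(hi * (yi - pi) for hi, yi, pi in zip(h, y, p))
        info = sum(hi * hi * pi * (1 - pi) for hi, pi in zip(h, p))
        if info == 0:
            break
        step = score / info
        eps += step
        if abs(step) < tol:
            break
    # S545: update both counterfactual predictions and average their difference
    q1_star = [expit(o + eps / gi) for o, gi in zip(off1, g)]
    q0_star = [expit(o - eps / (1 - gi)) for o, gi in zip(off0, g)]
    ate = sum(u - v for u, v in zip(q1_star, q0_star)) / len(y)
    return ate, eps

# Hypothetical data: uninformative initial estimates, balanced treatment
y = [1, 1, 1, 0, 0, 0, 0, 1]
a = [1, 1, 1, 1, 0, 0, 0, 0]
ate, eps = tmle_ate(y, a, q1=[0.5] * 8, q0=[0.5] * 8, g=[0.5] * 8)
```

With these uninformative inputs the update recovers the observed risk difference between the two groups (3/4 versus 1/4). The same targeting pattern would apply to the variable-importance estimate of steps S534-S536, with the important covariate in place of the treatment.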
Step S55: repeating steps S52-S54 until the maximum number of trials is reached, calculating the mean of all the average treatment effects, and determining the effectiveness and safety of the target drug by calculating the confidence interval of the mean effect.
Step S551: repeating steps S52-S54 until the maximum number of trials is reached, and calculating the mean of all the average treatment effects;
step S552: calculating the confidence interval of the mean effect, taking signals whose lower confidence limit is greater than zero or whose upper confidence limit is less than zero as final positive signals, and determining the effectiveness and safety of the target drug.
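The repetition in steps S52 and S551-S552 can be sketched as a loop that draws a substitute drug, obtains one average treatment effect per simulated trial, and aggregates the results. The source does not state how the confidence interval is computed; a normal approximation based on the standard error of the mean is assumed here. The function name `evaluate_signal`, the callback `estimate_ate` (standing in for steps S53-S54) and the example values are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def evaluate_signal(candidate_drugs, estimate_ate, max_trials=100, z=1.96, seed=0):
    """candidate_drugs: drugs used by the initial non-user cohort.
    estimate_ate: callable mapping a substitute drug to one average treatment effect."""
    if max_trials < 2:
        raise ValueError("at least two trials are needed for a confidence interval")
    rng = random.Random(seed)
    ates = []
    for _ in range(max_trials):
        substitute = rng.choice(candidate_drugs)   # S52: the same drug may be drawn again
        ates.append(estimate_ate(substitute))      # S53-S54, delegated to the caller
    avg = mean(ates)                               # S551
    se = stdev(ates) / sqrt(len(ates))
    lower, upper = avg - z * se, avg + z * se      # S552 (assumed normal interval)
    return {"mean_ate": avg, "ci": (lower, upper),
            "positive": lower > 0 or upper < 0}

# Hypothetical per-drug effects standing in for a real estimator
effects = {"DRUG_B": 0.10, "DRUG_C": 0.12}
result = evaluate_signal(list(effects), lambda drug: effects[drug], max_trials=50)
```

A signal counts as finally positive when the interval lies entirely on one side of zero, in either direction.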
Referring to FIG. 2, a high-throughput real-world drug effectiveness and safety evaluation system comprises:
a data acquisition module, configured to acquire data from existing data and clean the data to complete construction of a data set;
a high-throughput signal screening module, configured to convert the data in the data set into clinical events and perform high-throughput signal screening by event sequence symmetry analysis to obtain preliminary-screening positive signals;
a causal evaluation module, configured to perform causal evaluation on the preliminary-screening positive signals and determine the effectiveness and safety of the target drug;
a result display module, configured to display the effectiveness and safety results of the target drug.
Embodiment: a high-throughput real-world drug effectiveness and safety evaluation method comprises the following steps:
Step S1: acquiring real-world data, applying unified data coding to the data, extracting the minimum necessary data from the uniformly coded data, and completing construction of a data set;
Real-world data are acquired: medication data, diagnostic data, surgical data, laboratory test results and the like are obtained from electronic medical record data; the times at which the data occur are not processed, and the original dates and times are retained. The acquired information specifically includes: (1) demographic information: year and month of birth, sex, ethnicity; (2) basic medical information: allergy history, family history, blood type; (3) diagnosis and treatment information: diagnosis records, test results, medication records, surgical records and imaging examination reports.
First, unified data coding is applied: sex, age, ethnicity, blood type, test items and medication information use self-defined codes, diagnoses and family medical history use ICD-10 codes, and surgical information uses ICD-9-CM codes.
The data set in step S1 includes demographic information, test results, medication information, diagnostic information, surgical records and clinical manifestations.
After unified data coding, the minimum necessary data for clinical event conversion are first extracted from the structured data. The minimum necessary data for the structured data include:
demographic information: patient identification, year and month of birth, ethnicity, blood type;
test results: patient identification, test item code, test time, test result;
medication information: patient identification, drug code, medication start time;
diagnostic information: patient identification, diagnosis code, diagnosis time;
surgical records: patient identification, surgery code, surgery time;
For the natural language text in imaging examination reports, body parts and clinical manifestations are first extracted by named entity recognition and, after mapping to self-defined codes, converted into minimum necessary data:
clinical manifestations: patient identification, clinical manifestation code, clinical manifestation time.
Step S2: constructing a concept set + behavior + standard three-tuple mode, and constructing a drug effectiveness/safety index phenotype library through the three-tuple mode, a drug effectiveness/safety term set and clinical indexes;
Drug effectiveness/safety index phenotypes vary widely across drugs and evaluation dimensions, so this embodiment provides a concept set + behavior + standard triple pattern. The concept set is composed of one or more data codes (such as diagnostic codes, test codes and the like); the behavior is one of a series of behaviors or states, including occurrence, greater than, less than, equal to, increase and decrease; and the standard represents the magnitude of the specific effectiveness/safety index and may be omitted. The following are two examples:
"tumor size" + "decrease" + "0.5cm";
"serum glutamate pyruvate transaminase" + "greater than" + "3 times the upper limit of normal";
Using the triple pattern, a drug effectiveness/safety index phenotype library can be constructed from existing drug effectiveness/safety term sets and from indexes that are widely accepted in clinical guidelines. Users may also predefine specific effectiveness/safety indexes according to their needs to extend the library.
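The triple pattern maps naturally onto a small record type. The following is a minimal sketch, assuming the six behaviors listed in the embodiment; the class name `IndexPhenotype` and the codes `IMG-TUMOR-SIZE` and `LAB-ALT` are hypothetical and only illustrate the two examples given.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Behaviors listed in the embodiment.
BEHAVIORS = {"occurrence", "greater than", "less than", "equal to", "increase", "decrease"}


@dataclass(frozen=True)
class IndexPhenotype:
    concept_set: Tuple[str, ...]    # one or more data codes
    behavior: str                   # one of BEHAVIORS
    standard: Optional[str] = None  # magnitude of the index; may be omitted

    def __post_init__(self) -> None:
        if not self.concept_set:
            raise ValueError("concept set must contain at least one data code")
        if self.behavior not in BEHAVIORS:
            raise ValueError(f"unknown behavior: {self.behavior!r}")


# Hypothetical codes; a real library would use the unified codes from step S1.
phenotype_library = {
    "tumor shrinkage": IndexPhenotype(("IMG-TUMOR-SIZE",), "decrease", "0.5cm"),
    "ALT elevation": IndexPhenotype(("LAB-ALT",), "greater than", "3x upper limit of normal"),
}
```

Keeping the standard optional mirrors the statement that it may be omitted, for example for a plain "occurrence" of a diagnosis code.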
Step S3: selecting a target drug, and selecting an index from the drug effectiveness/safety index phenotype library to obtain a target drug-effectiveness/safety index pair; defining a target event, an ending event, a target date and an ending date, extracting the patient ID, the target date and the ending date, and forming a data set for high-throughput signal prescreening;
Step S31: taking the first use of the target drug as the target event, taking the one or more indexes selected by the user from the drug effectiveness/safety index phenotype library as outcome indexes, and taking the first occurrence of each outcome index as an ending event;
The target drug may be a single drug or a class of drugs with the same therapeutic effect or the same properties; when a class of drugs is selected as the target drug, all the selected drugs are treated as the same drug.
Step S32: taking the date of the target event as the target date, and taking the date of the ending event as the ending date;
Step S33: after the target event and the ending event are selected, screening the data set for patients who have both the target event and the ending event, extracting the patient ID, the target date and the ending date of each such patient, and forming the data set for high-throughput signal prescreening.
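Steps S31 to S33 can be sketched as follows. This is an illustration only: the input layout, a list of `(patient_id, date)` tuples per event type, and the function names are assumptions, not part of the embodiment.

```python
from datetime import date


def first_dates(records):
    """Map each patient ID to the earliest date among (patient_id, date) records."""
    earliest = {}
    for patient_id, when in records:
        if patient_id not in earliest or when < earliest[patient_id]:
            earliest[patient_id] = when
    return earliest


def build_prescreen_set(drug_records, outcome_records):
    """Keep patients who have both events; return {patient_id: (target_date, ending_date)}."""
    target = first_dates(drug_records)     # first use of the target drug
    ending = first_dates(outcome_records)  # first occurrence of the outcome index
    return {pid: (target[pid], ending[pid]) for pid in target if pid in ending}


# Hypothetical records for three patients.
drug_records = [("p1", date(2021, 3, 1)), ("p1", date(2021, 4, 1)), ("p2", date(2021, 5, 10))]
outcome_records = [("p1", date(2021, 6, 15)), ("p3", date(2021, 1, 2))]
prescreen = build_prescreen_set(drug_records, outcome_records)
```

Only `p1` has both a target event and an ending event, so only `p1` enters the prescreening data set.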
Step S4: for the data set for high-throughput signal prescreening, carrying out high-throughput signal screening on the target drug-effectiveness/safety index pair by event sequence symmetry analysis to obtain a primary screening positive signal;
Step S41: calculating a sequence ratio sr as the ratio of the number of patients in whom the ending event occurs after the target event to the number of patients in whom the ending event occurs before the target event;
Event sequence symmetry analysis may be affected by trends in medical practice over time. For example, a change in medical insurance policy may cause a certain type of test or drug to increase or decrease sharply within a certain period, which biases the effect estimate. To correct this bias, a null-effect sequence ratio $sr_{null}$ may be calculated. It adjusts the sequence ratio sr according to the probability that, in the absence of any causal relationship, the ending event occurs after the target event.
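The sequence ratio of step S41 is a simple count ratio. A minimal sketch, assuming each patient is represented by a `(target_day, ending_day)` pair of comparable values (day numbers or dates); skipping same-day pairs is an assumption, since the text does not say how ties are handled.

```python
def sequence_ratio(pairs):
    """Ratio of patients with the ending event after the target event to those with it before."""
    after = before = 0
    for target_day, ending_day in pairs:
        if ending_day > target_day:
            after += 1
        elif ending_day < target_day:
            before += 1
        # Same-day pairs carry no order information and are skipped (assumption).
    if before == 0:
        raise ValueError("no patient has the ending event before the target event")
    return after / before
```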
Step S42: calculating a null-effect sequence ratio, and adjusting the sequence ratio by the null-effect sequence ratio to obtain a corrected sequence ratio;
Step S421: for the data set for high-throughput signal prescreening, calculating the overall average probability $\bar{p}$ that the ending event occurs within a preset observation window after the target event occurs;
The null-effect sequence ratio $sr_{null}$ is derived from the probability p that a patient who has the target event has the ending event within the specified observation window after the target event:
$$p = \frac{\sum_{t=x+1}^{x+d} O_t}{\sum_{t=x-d}^{x-1} O_t + \sum_{t=x+1}^{x+d} O_t}$$
where d is the length of the observation window, x is the day on which the target event occurs, and $O_t$ is the number of patients who have the ending event on day t.
For a patient population, the overall average probability $\bar{p}$ that the ending event occurs after the target event is obtained by weighting by the number of patients who have the target event on each day m of the observation period and then averaging over all days:
$$\bar{p} = \frac{\sum_{m=1}^{u} T_m \sum_{t=m+1}^{m+d} O_t}{\sum_{m=1}^{u} T_m \left( \sum_{t=m-d}^{m-1} O_t + \sum_{t=m+1}^{m+d} O_t \right)}$$
where u is the last day of the observation period, $O_t$ is the number of patients who have the ending event on day t, $T_m$ is the number of patients who have the target event on day m, and d is the length of the observation window.
Step S422: calculating the null-effect sequence ratio $sr_{null}$ from the overall average probability:
$$sr_{null} = \frac{\bar{p}}{1 - \bar{p}}$$
Step S423: calculating the ratio of the sequence ratio to the null-effect sequence ratio to obtain the corrected sequence ratio asr:
$$asr = \frac{sr}{sr_{null}}$$
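The weighting described in steps S421 to S423 can be sketched as below. The sketch follows the usual null-effect adjustment used in sequence symmetry analysis and is an interpretation, not the embodiment's code: the daily count lists, the function names, and the truncation of the window at the edges of the observation period are all assumptions.

```python
def null_effect_sequence_ratio(target_counts, ending_counts, window):
    """target_counts[m], ending_counts[t]: patients with the target / ending event on that day."""
    if len(target_counts) != len(ending_counts):
        raise ValueError("both series must cover the same observation period")
    last = len(target_counts)
    weighted_after = weighted_total = 0.0
    for day, n_target in enumerate(target_counts):
        # Ending events inside the window after and before this day (truncated at the edges).
        after = sum(ending_counts[day + 1 : min(last, day + window + 1)])
        before = sum(ending_counts[max(0, day - window) : day])
        weighted_after += n_target * after
        weighted_total += n_target * (after + before)
    if weighted_total == 0 or weighted_after == weighted_total:
        raise ValueError("null-effect sequence ratio is undefined for these counts")
    p_bar = weighted_after / weighted_total  # overall average probability
    return p_bar / (1.0 - p_bar)


def corrected_sequence_ratio(sr, sr_null):
    """Sequence ratio divided by the null-effect sequence ratio (step S423)."""
    return sr / sr_null
```

With a flat ending-event series the null-effect ratio is 1 and the correction leaves sr unchanged; a rising series gives a ratio above 1, which shrinks sr accordingly.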
Step S43: calculating a confidence interval of the corrected sequence ratio asr by calculating a confidence interval of the probability $\hat{p}$ that the target event occurs first and the ending event occurs later;
Let $\hat{p}$ be the probability that the target event occurs first and the ending event occurs later, that is,
$$\hat{p} = \frac{sr}{sr + 1}$$
Then
$$sr = \frac{\hat{p}}{1 - \hat{p}}$$
The 95% confidence interval of the corrected sequence ratio asr is obtained by calculating the confidence interval of $\hat{p}$. In this embodiment, the confidence interval of $\hat{p}$ is calculated by the following formula:
[equation image in the original; not reproduced]
where n is the total number of patients involved in the calculation, and the critical value z is taken as 1.96.
Step S44: taking as a primary screening positive signal any signal for which the lower limit of the confidence interval of the corrected sequence ratio is greater than 1.
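Steps S43 and S44 can be illustrated as follows. The interval formula itself is not recoverable from this text, so the sketch assumes a Wald (normal-approximation) interval for the probability, converts its limits back to ratios, and divides by the null-effect sequence ratio. That choice, and both function names, are assumptions made only for illustration.

```python
import math


def corrected_sr_interval(n_after, n_before, sr_null=1.0, z=1.96):
    """Approximate interval for the corrected sequence ratio from ordered-pair counts."""
    n = n_after + n_before
    if n == 0:
        raise ValueError("no patients with an ordered event pair")
    p = n_after / n  # probability that the target event comes first
    half_width = z * math.sqrt(p * (1.0 - p) / n)  # Wald interval: an assumption
    lower_p = max(p - half_width, 0.0)
    upper_p = min(p + half_width, 1.0)

    def to_ratio(q):
        return math.inf if q >= 1.0 else q / (1.0 - q)

    return to_ratio(lower_p) / sr_null, to_ratio(upper_p) / sr_null


def is_primary_positive(n_after, n_before, sr_null=1.0):
    """Step S44: the lower limit of the interval must exceed 1."""
    lower, _ = corrected_sr_interval(n_after, n_before, sr_null)
    return lower > 1.0
```

A pair with 80 patients after and 20 before passes the screen when no trend correction is needed, but fails it when the null-effect sequence ratio is as large as 3.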
Step S5: carrying out causal evaluation on the primary screening positive signal to determine the effectiveness and safety of the target drug.
Step S51: for the target drug-effectiveness/safety index pair corresponding to the primary screening positive signal, screening the data set for patients who have the target event, and assigning the patients to a user cohort and an initial non-user cohort;
Step S511: for the target drug-effectiveness/safety index pair corresponding to the primary screening positive signal, screening the data set for patients who have the target event, obtaining the primary diagnosis of each patient, counting the patients by ICD-10 code, and taking the primary diagnosis with the largest number of patients as the basic diagnosis, recorded as D;
Step S512: assigning the patients to the user cohort and the initial non-user cohort.
Patients whose primary diagnosis is the basic diagnosis D are screened in the data set and assigned to the user cohort if they meet the following inclusion criteria: (1) the patient takes the target drug continuously (for example, with less than 30 days between doses); (2) the first prescription of the target drug is issued after the basic diagnosis D has been recorded; (3) the patient has at least one year (365 days) of medical records in the data set before the first prescription. Patients whose primary diagnosis is the basic diagnosis D and who have not used the target drug are assigned to the initial non-user cohort.
Step S52: randomly selecting one drug from the drugs used by the patients of the initial non-user cohort as a substitute drug, and constructing a non-user cohort;
To estimate the effect of a drug, the user cohort must be compared with a non-user cohort assigned a substitute drug. Once the substitute drug is determined, the non-user cohort is defined by the same inclusion criteria as above, applied to the substitute drug. To avoid overlap between the user cohort and the non-user cohort, this embodiment further excludes from the non-user cohort any patient who uses the target drug. In this embodiment, the substitute drug is randomly selected from the drugs used by the patients of the initial non-user cohort, excluding the target drug itself. Such a non-user cohort compares the target drug directly with drugs that have the same therapeutic indication, thereby reducing confounding by indication.
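The inclusion criteria and the random choice of a substitute drug can be sketched as below. The function names and the flat date inputs are hypothetical. Two details are assumptions: a gap of exactly 30 days counts as a break, and a first prescription on the same day as the basic diagnosis D is accepted.

```python
import random
from datetime import date


def meets_inclusion(prescription_dates, basic_diagnosis_date, first_record_date,
                    max_gap_days=30, min_history_days=365):
    """Check the three inclusion criteria for one patient and one drug."""
    dates = sorted(prescription_dates)
    if not dates:
        return False
    first_prescription = dates[0]
    # (1) continuous use: every gap between consecutive prescriptions is under the limit
    continuous = all((later - earlier).days < max_gap_days
                     for earlier, later in zip(dates, dates[1:]))
    # (2) the basic diagnosis D is already recorded at the first prescription
    after_diagnosis = basic_diagnosis_date <= first_prescription
    # (3) at least one year of records before the first prescription
    enough_history = (first_prescription - first_record_date).days >= min_history_days
    return continuous and after_diagnosis and enough_history


def pick_substitute(drugs_used, target_drug, rng):
    """Randomly choose one substitute drug, never the target drug itself."""
    candidates = sorted(set(drugs_used) - {target_drug})
    if not candidates:
        raise ValueError("no substitute drug available")
    return rng.choice(candidates)
```

The same `meets_inclusion` check serves both cohorts: it is applied to the target drug for the user cohort and to the selected substitute drug for the non-user cohort.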
Step S53: taking the time of using the target drug or the substitute drug for the first time as group entering time, taking the target drug or the substitute drug as a treatment mode, taking whether the ending event occurs or not as a result, extracting baseline data of the patient before group entering, forming a covariate set by demographic information in the data set and the baseline data, extracting characteristics and screening to obtain a variable for calculating an average treatment effect;
Diagnostic data are extracted from the year before the group entering time; test results, surgical records and clinical presentations are extracted from the month before the group entering time; and medication information is extracted from the 7 days before the group entering time. The first stage of targeted maximum likelihood estimation is then carried out to evaluate the importance of each baseline variable and thereby screen the confounding factors used for evaluating the average treatment effect.
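The look-back windows can be expressed as a small lookup table. In this sketch the category labels, the record layout `(category, code, record_date)`, and the reading of "one month" as 30 days are assumptions; records dated on the group entering day itself are treated as not being baseline data.

```python
from datetime import date, timedelta

# Look-back window in days per record category (30 days for "one month" is an assumption).
LOOKBACK_DAYS = {
    "diagnosis": 365,
    "test": 30,
    "surgery": 30,
    "clinical_presentation": 30,
    "medication": 7,
}


def baseline_records(records, entry_date):
    """Keep (category, code) for records that fall inside their category's look-back window."""
    kept = []
    for category, code, record_date in records:
        window = LOOKBACK_DAYS.get(category)
        if window is None:
            continue  # unknown category: not part of the baseline data
        if timedelta(0) < entry_date - record_date <= timedelta(days=window):
            kept.append((category, code))
    return kept
```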
Step S531: selecting one covariate in the covariate set and recording it as the importance covariate, and recording the other variables in the covariate set as the remaining variables;
In the data set, the data structure is recorded as $O = (W, A, Y)$, where $W$ is the covariate set, which in this embodiment consists of the demographic information and the extracted baseline data; A is the treatment mode, namely use of the target drug or of the substitute drug; and Y is the result, namely whether the ending event occurs. Let $W_j$ be the importance covariate whose importance is to be evaluated, and let $W_{-j}$ be the remaining variables.
Step S532: given the importance covariate and the remaining variables, constructing an ending-event prediction model that takes all covariates as input variables for the conditional expectation E of the result, to obtain an estimate of the conditional expectation of the result;
Given the importance covariate $W_j$ and the remaining variables $W_{-j}$, the conditional expectation $E(Y \mid W_j, W_{-j})$ of the result Y is estimated by fitting, that is, an ending-event prediction model $\bar{Q}(W_j, W_{-j})$ is constructed with the whole covariate set W as input variables. For each importance covariate $W_j$ whose importance is to be evaluated, let $\bar{Q}^0(W_j, W_{-j})$ be the estimate of the conditional expectation $E(Y \mid W_j, W_{-j})$ of the result Y. The estimate may be calculated by any machine learning classification method, such as logistic regression. The estimate of the conditional expectation of the result Y is calculated separately for $W_j = 0$, giving $\bar{Q}^0(0, W_{-j})$, and for $W_j = 1$, giving $\bar{Q}^0(1, W_{-j})$.
Step S533: given the remaining variables, estimating the probability that the importance covariate takes the value 1 to obtain an estimate;
Given the remaining variables $W_{-j}$, the probability $P(W_j = 1 \mid W_{-j})$ that the importance covariate $W_j$ equals 1 is estimated and recorded as $\hat{g}(W_j \mid W_{-j})$. Any machine learning classification method, such as logistic regression, may be used. After model fitting, the estimated probability $\hat{g}(1 \mid W_{-j})$ that $W_j$ equals 1 given the remaining variables $W_{-j}$ is calculated, together with the estimated probability $\hat{g}(0 \mid W_{-j})$ that $W_j$ equals 0 given the remaining variables $W_{-j}$;
Step S534: constructing a variable importance indication disturbance variable from the estimate of the probability that the importance covariate is 1;
The variable importance indication disturbance variable $H_j$ is constructed from $\hat{g}$:
$$H_j = \frac{I(W_j = 1)}{\hat{g}(1 \mid W_{-j})} - \frac{I(W_j = 0)}{\hat{g}(0 \mid W_{-j})}$$
where I is the indicator function: when $W_j = 1$, $I(W_j = 1) = 1$.
Step S535: taking the estimate of the conditional expectation of the result as an intercept term and the variable importance indication disturbance variable as the input variable, and constructing an ending-event regression model to obtain a variable importance disturbance parameter;
At this point there is a deviation between the estimate $\bar{Q}^0(W_j, W_{-j})$ and the conditional expectation $E(Y \mid W_j, W_{-j})$ of the result Y, so a variable importance disturbance parameter $\varepsilon_j$ is needed to correct $\bar{Q}^0(W_j, W_{-j})$. The method is to take $\bar{Q}^0(W_j, W_{-j})$ as the intercept term and construct a regression model of the ending event on the variable importance indication disturbance variable, that is,
[equation image in the original; not reproduced]
After the model is fitted, the coefficient of the variable importance indication disturbance variable $H_j$ is the variable importance disturbance parameter $\varepsilon_j$.
Step S536: updating the estimate of the conditional expectation of the result using the variable importance disturbance parameter and the variable importance indication disturbance variable, and calculating the variable importance;
The update of $\bar{Q}^0(W_j, W_{-j})$ to $\bar{Q}^1(W_j, W_{-j})$ is completed as follows:
[equation image in the original; not reproduced]
The variable importance $\psi_j$ is calculated by the following formula:
$$\psi_j = \frac{1}{n} \sum_{i=1}^{n} \left[ \bar{Q}^1(1, W_{-j,i}) - \bar{Q}^1(0, W_{-j,i}) \right]$$
where n is the number of patients in the data set and i denotes the i-th record in the data set.
Step S537: calculating the standard deviation and confidence interval of the variable importance;
To complete the variable screening, the standard deviation $\sigma_j$ of the variable importance is calculated by the following formula:
$$\sigma_j = \sqrt{\frac{Var(IC_j)}{n}}$$
where $IC_j$ is defined by
[equation image in the original; not reproduced]
and Var denotes the sample variance. Therefore, the 95% confidence interval of $\psi_j$ is
$$\left( \psi_j - 1.96\,\sigma_j,\ \psi_j + 1.96\,\sigma_j \right)$$
Step S538: repeating steps S531-S537 until every covariate in the covariate set has been traversed, and taking the variables whose variable-importance confidence interval does not contain zero as the variables $W^*$ used for calculating the average treatment effect;
That is, the variables whose confidence interval lies entirely to the left of zero, $\psi_j + 1.96\,\sigma_j < 0$, or entirely to the right of zero, $\psi_j - 1.96\,\sigma_j > 0$, are taken as the variables $W^*$ for the subsequent calculation of the average treatment effect.
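The screening rule of step S538 reduces to checking whether each confidence interval excludes zero. A minimal sketch, assuming each covariate already has an importance estimate and a standard deviation from steps S531 to S537; the function name, the dictionary layout and the covariate names in the usage note are hypothetical.

```python
def select_variables(importance, z=1.96):
    """importance: {covariate_name: (estimate, standard_deviation)}.

    Returns the names whose confidence interval does not contain zero.
    """
    selected = []
    for name, (estimate, sd) in importance.items():
        lower, upper = estimate - z * sd, estimate + z * sd
        if lower > 0 or upper < 0:
            selected.append(name)
    return selected
```

For example, a covariate with estimate 0.08 and standard deviation 0.02 is kept, while one with estimate 0.01 and standard deviation 0.03 is dropped because its interval straddles zero.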
Step S54: calculating the average treatment effect of the target drug using the variables for calculating the average treatment effect;
The framework proposed in this embodiment allows the effect of a target drug on multiple drug effectiveness/safety clinical outcomes to be evaluated. The average treatment effect (ATE) of the target drug on the potential outcome Y is defined as
$$ATE = E(Y_1) - E(Y_0)$$
where $Y_a$ denotes the probability that the ending event occurs in a patient during the complete follow-up period if all patients in a simulated clinical trial were assigned treatment a. The potential outcomes are called counterfactuals, because only one of them is observed for any given individual. In a randomized controlled trial, treatment is assigned at random, so the incidence of the ending event can be compared directly between the user cohort and the non-user cohort: $E(Y_1)$ can be estimated directly as $E(Y \mid A = 1)$, and $E(Y_0)$ can be estimated as $E(Y \mid A = 0)$. In real-world observational data, however, treatment assignment is usually not random, so this embodiment adopts a doubly robust method to eliminate confounding as far as possible. In this embodiment, substitute drugs are repeatedly selected at random, with replacement, from the drugs used by the patients of the non-user cohort, and the occurrence of ending events is compared with that in the user cohort to calculate the average treatment effect (ATE). The ATE calculation formula is defined as follows:
[equation image in the original; not reproduced]
Step S541: constructing an ending-event prediction model with the treatment mode and the variables for calculating the average treatment effect as input variables, to obtain an estimate of the conditional expectation of the ending event;
Given the treatment mode A and the important variables $W^*$, the conditional expectation $E(Y \mid A, W^*)$ of the result Y is estimated by fitting: using the covariates $W^*$ screened out in the previous step as input variables, an ending-event prediction model $\bar{Q}(A, W^*)$ is constructed, and $\bar{Q}^0(A, W^*)$ is the estimate of the conditional expectation $E(Y \mid A, W^*)$ of the result Y. The model may be constructed with any machine learning classification method, such as logistic regression. The estimate of the conditional expectation of the result Y is calculated for A = 0, giving $\bar{Q}^0(0, W^*)$, and for A = 1, giving $\bar{Q}^0(1, W^*)$.
Step S542: estimating the probability of treatment assignment to obtain an estimate of treatment assignment;
The probability of treatment assignment $P(A \mid W^*)$ is estimated and recorded as $\hat{g}(A \mid W^*)$. The model may use any machine learning classification method, such as logistic regression. After model fitting, the estimated probability $\hat{g}(1 \mid W^*)$ of treatment assignment A = 1 and the estimated probability $\hat{g}(0 \mid W^*)$ of treatment assignment A = 0 are calculated.
Step S543: constructing an indication disturbance variable from the estimate of treatment assignment;
The indication disturbance variable $H(A, W^*)$ is constructed from the estimate of treatment assignment $\hat{g}(A \mid W^*)$:
$$H(A, W^*) = \frac{I(A = 1)}{\hat{g}(1 \mid W^*)} - \frac{I(A = 0)}{\hat{g}(0 \mid W^*)}$$
where I is the indicator function: when A = 1, $I(A = 1) = 1$.
Step S544: taking as an intercept term the estimate of the conditional expectation of the ending event obtained with the treatment mode and the variables for calculating the average treatment effect as inputs, and constructing a regression model of the ending event on the indication disturbance variable to obtain an estimated disturbance parameter;
At this point there is a deviation between $\bar{Q}^0(A, W^*)$ and $E(Y \mid A, W^*)$, so an estimated disturbance parameter $\varepsilon$ is needed to correct $\bar{Q}^0(A, W^*)$. The method is to take $\bar{Q}^0(A, W^*)$ as the intercept term and construct a regression model of the ending event on the indication disturbance variable, that is,
[equation image in the original; not reproduced]
After the model is fitted, the coefficient of the indication disturbance variable $H(A, W^*)$ is the estimated disturbance parameter $\varepsilon$.
Step S545: updating the estimate of the conditional expectation of the ending event using the estimated disturbance parameter and the indication disturbance variable, and calculating the average treatment effect;
The update of $\bar{Q}^0(A, W^*)$ to $\bar{Q}^1(A, W^*)$ is completed as follows:
[equation image in the original; not reproduced]
The ATE can be calculated by:
$$ATE = \frac{1}{n} \sum_{i=1}^{n} \left[ \bar{Q}^1(1, W^*_i) - \bar{Q}^1(0, W^*_i) \right]$$
where $n = n_1 + n_0$, $n_1$ is the number of patients in the user cohort, $n_0$ is the number of patients in the non-user cohort who use the currently selected substitute drug, and i denotes the i-th record in the data set.
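Steps S541 to S545 follow the general shape of targeted maximum likelihood estimation. The sketch below is one possible reading, not the embodiment's implementation. It assumes a single binary confounder `w`, a deliberately crude initial outcome model (event rate per treatment arm), a stratified treatment model, and a logistic fluctuation in which the initial estimate enters as a fixed offset. All names are hypothetical, and the disturbance parameter is found by bisection on the score equation rather than with a regression library.

```python
import math


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logit(p):
    return math.log(p / (1.0 - p))


def clip(p, eps=1e-6):
    """Keep probabilities strictly inside (0, 1)."""
    return min(max(p, eps), 1.0 - eps)


def tmle_ate(records):
    """records: list of (w, a, y) with binary confounder w, treatment a, outcome y.

    Every (w, a) combination is assumed to occur at least once.
    Returns (ate, disturbance_parameter, residual_score).
    """
    n = len(records)
    # Initial outcome model: event rate per treatment arm, ignoring w on purpose.
    q0 = {}
    for arm in (0, 1):
        outcomes = [y for _, a, y in records if a == arm]
        q0[arm] = clip(sum(outcomes) / len(outcomes))
    # Treatment model: share of treated patients within each level of w.
    g1 = {}
    for level in (0, 1):
        arms = [a for w, a, _ in records if w == level]
        g1[level] = clip(sum(arms) / len(arms))

    def h(a, w):
        # Indication disturbance variable for one record.
        return 1.0 / g1[w] if a == 1 else -1.0 / (1.0 - g1[w])

    def q1(a, w, eps):
        # Updated outcome estimate: initial estimate as offset plus eps * h.
        return expit(logit(q0[a]) + eps * h(a, w))

    def score(eps):
        return sum(h(a, w) * (y - q1(a, w, eps)) for w, a, y in records)

    # score() decreases in eps, so its root can be bracketed and bisected.
    low, high = -10.0, 10.0
    if not score(low) > 0 > score(high):
        raise ValueError("disturbance parameter not bracketed by [-10, 10]")
    for _ in range(80):
        mid = (low + high) / 2.0
        if score(mid) > 0:
            low = mid
        else:
            high = mid
    eps = (low + high) / 2.0
    ate = sum(q1(1, w, eps) - q1(0, w, eps) for w, _, _ in records) / n
    return ate, eps, score(eps)


# Hypothetical cohort: treatment is more common when w = 1, and so is the outcome.
example = ([(0, 1, 1), (0, 1, 0), (0, 0, 1)] + [(0, 0, 0)] * 5
           + [(1, 1, 1)] * 4 + [(1, 1, 0)] * 2 + [(1, 0, 1), (1, 0, 0)])
example_ate, example_eps, example_score = tmle_ate(example)
```

In this example the crude difference in event rates between the arms is 0.375. Because treatment is confounded with `w`, the fitted disturbance parameter is negative and the updated estimate falls below the crude difference.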
Step S55: repeating steps S52-S54 until the maximum number of trials is reached, calculating the average effect of all the average treatment effects, and determining the effectiveness and safety of the target drug by calculating the confidence interval of the average effect.
Step S551: repeating steps S52-S54 until the maximum number of trials is reached, and calculating the average effect of all the average treatment effects;
The maximum number of trials r is determined from the number $n_d$ of types of drugs used by the patients in the initial non-user cohort:
[equation image in the original; not reproduced]
After the random selection of a substitute drug in the non-user cohort has been repeated r times, the final average effect $\overline{ATE}$ of the target drug on the outcome index is calculated as:
$$\overline{ATE} = \frac{1}{r} \sum_{i=1}^{r} ATE_i$$
where $ATE_i$ is the average treatment effect of the target drug relative to the i-th randomly selected substitute drug.
Step S552: calculating the confidence interval of the average effect, taking as a final positive signal a signal whose lower confidence limit is greater than zero or whose upper confidence limit is less than zero, and determining the effectiveness and safety of the target drug.
To complete the final determination of the causal effect between the drug and the outcome index, the 95% confidence interval of $\overline{ATE}$ is
[equation image in the original; not reproduced]
where $\sigma$ is the standard deviation of the ATE over the r random simulations. When the outcome index is an effectiveness index, a signal satisfying
[condition on the confidence interval; equation image in the original, not reproduced]
is taken as the final positive signal, that is, the drug is considered to produce the corresponding positive outcome; when the outcome index is a safety index, a signal satisfying
[condition on the confidence interval; equation image in the original, not reproduced]
is taken as the final positive signal, that is, the drug is considered to produce the corresponding safety outcome.
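Steps S551 and S552 can be sketched as follows. The interval formula is not recoverable from this text, so the sketch assumes the mean of the per-trial ATE values plus or minus 1.96 sample standard deviations. That form and the function name are assumptions. The decision rule follows the wording of step S552: the interval must lie entirely above or entirely below zero.

```python
import statistics


def final_signal(ate_values, z=1.96):
    """Average the per-trial ATE values and test whether the interval excludes zero."""
    if len(ate_values) < 2:
        raise ValueError("need at least two trials to estimate a standard deviation")
    mean_ate = statistics.mean(ate_values)
    sd = statistics.stdev(ate_values)  # sample standard deviation over the trials
    lower, upper = mean_ate - z * sd, mean_ate + z * sd  # interval form is an assumption
    return {
        "mean_ate": mean_ate,
        "interval": (lower, upper),
        "positive": lower > 0 or upper < 0,
    }
```

Which side of zero counts as a positive signal for an effectiveness index and which for a safety index is not specified in this sketch; it only reports whether the interval excludes zero.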
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A high-throughput real-world drug effectiveness and safety evaluation method is characterized by comprising the following steps:
step S1: acquiring real world data, carrying out unified data coding on the data, extracting necessary minimum data of the data after the unified data coding, and completing construction of a data set;
step S2: constructing a concept set + behavior + standard triple pattern, wherein the concept set is composed of one or more data codes, the behavior is composed of a series of behaviors or states, the standard represents the magnitude meaning of the effectiveness/safety index, and a medicament effectiveness/safety index phenotype library is constructed through the triple pattern, the medicament effectiveness/safety term set and the clinical index;
step S3: selecting a target drug, and selecting an index from the drug effectiveness/safety index phenotype library to obtain a target drug-effectiveness/safety index pair; defining a target event, an ending event, a target date and an ending date, extracting a patient ID, the target date and the ending date, and forming a data set for high-throughput signal prescreening;
step S31: taking the first use of the target drug as a target event, forming one or a plurality of indexes selected by a user in the drug effectiveness/safety index phenotype library into outcome indexes, and taking the first occurrence of each outcome index as an outcome event;
step S32: taking the date of the target event as a target date, and taking the date of the ending event as an ending date;
step S33: selecting the target event and the ending event, screening patients with the target event and the ending event in the data set, extracting patient IDs, target dates and ending dates, and forming a data set for high-throughput signal prescreening;
step S4: for the data set for high-throughput signal prescreening, carrying out high-throughput signal screening on the target drug-effectiveness/safety index pair by event sequence symmetry analysis to obtain a primary screening positive signal;
step S41: calculating a sequence ratio by utilizing the ratio of the number of people who have the ending event after the target event occurs to the number of people who have the ending event before the target event occurs;
step S42: calculating a null effect sequence ratio, and adjusting the sequence ratio in the step S41 by using the null effect sequence ratio to obtain a corrected sequence ratio;
step S43: calculating a confidence interval of the corrected sequence ratio by calculating a confidence interval of the probability of occurrence of the target event before occurrence of the ending event;
step S44: taking the signal with the lower limit of the confidence interval of the corrected sequence ratio larger than 1 as a primary screening positive signal;
step S5: carrying out causal evaluation on the primary screening positive signal to determine the effectiveness and safety of the target drug;
step S51: for the target drug-effectiveness/safety index pair corresponding to the primary screening positive signal, screening the data set for patients who have the target event, and assigning the patients to a user cohort and an initial non-user cohort;
step S52: randomly selecting one drug from the drugs used by the initial non-user cohort of patients as a substitute drug, and constructing a non-user cohort;
step S53: taking the time of using the target drug or the substitute drug for the first time as the group entering time, taking the target drug or the substitute drug as a treatment mode, taking whether the ending event occurs as a result, extracting baseline data of the patients before group entering, forming a covariate set by demographic information in the data set and the baseline data, extracting characteristics and screening to obtain a variable for calculating the average treatment effect;
step S54: calculating a mean therapeutic effect of the drug of interest using the variables used to calculate the mean therapeutic effect;
step S55: repeating steps S52-S54 until reaching the maximum test times, calculating the average effect of all the average treatment effects, and determining the effectiveness and safety of the target drug by calculating the confidence interval of the average effect.
2. The method for high-throughput real-world drug efficacy and safety evaluation according to claim 1, wherein the step S42 specifically comprises the following sub-steps:
step S421: for the data set for high-throughput signal prescreening, calculating the overall average probability that the ending event occurs within a preset observation window after the target event occurs;
step S422: calculating to obtain an empty effect sequence ratio by using the overall average probability;
step S423: and calculating the ratio of the sequence ratio to the null effect sequence ratio in the step S41 to obtain a corrected sequence ratio.
3. The method for high-throughput real-world drug efficacy and safety evaluation according to claim 1, wherein the step S51 specifically comprises the following sub-steps:
step S511: aiming at the target drug-effectiveness/safety index pair corresponding to the primary screening positive signal, screening the patient with the target event in the data set, acquiring a main diagnosis of the patient, counting the patient according to an ICD-10 code, and taking the main diagnosis with the largest number of patients as a basic diagnosis;
step S512: assigning the patients to the user cohort and the initial non-user cohort.
4. The method for evaluating the effectiveness and safety of a high-throughput real-world drug according to claim 1, wherein the step S53 comprises the following steps:
step S531: selecting a covariate in the covariate set, recording the covariate as an important covariate, and recording the rest variables in the covariate set as rest variables;
step S532: under the condition given by the important covariates and the rest variables, constructing an ending event prediction model taking all covariates as input variables for the condition expectation of the result to obtain an estimation value of the condition expectation of the result;
step S533: under the condition that the other variables are given, estimating the probability that the value of the important covariate is 1 to obtain an estimated value;
step S534: constructing a variable importance indication disturbance variable by using an estimated value of the probability that the importance covariate is 1;
step S535: taking the estimated value of the condition expectation of the result as an intercept term, taking the variable importance indication disturbance variable as an input variable, and constructing an ending event regression model to obtain a variable importance disturbance parameter;
step S536: updating the estimated value of the condition expectation of the result by using the variable importance disturbance parameter and the variable importance indication disturbance variable, and calculating to obtain the variable importance;
step S537: calculating the standard deviation and confidence interval of the variable importance;
step S538: repeating steps S531-S537 until each covariate of the set of covariates is traversed, and taking as the variables used for calculating the average therapeutic effect the variables for which the confidence interval of the variable importance does not contain zero.
5. The high-throughput real-world drug effectiveness and safety evaluation method according to claim 1, wherein step S54 specifically comprises the following sub-steps:
step S541: constructing an ending event prediction model with the treatment mode and the variables used for calculating the average treatment effect as input variables, to obtain an estimate of the conditional expectation of the ending event;
step S542: estimating the probability of treatment allocation, to obtain an estimate of the treatment allocation;
step S543: constructing an indicator perturbation variable from the estimate of the treatment allocation;
step S544: constructing a regression model of the ending event on the indicator perturbation variable, with the estimate of the conditional expectation of the ending event obtained in step S541 as the intercept term, to obtain an estimated perturbation parameter;
step S545: updating the estimate of the conditional expectation of the ending event using the estimated perturbation parameter and the indicator perturbation variable, and calculating the average treatment effect.
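The sub-steps S541-S545 follow the general shape of a targeted update of an outcome model. The Python sketch below is one possible reading under strong simplifying assumptions, not the patented implementation: a binary outcome, a binary treatment, a single binary covariate, and saturated (stratum-mean) models in place of whatever prediction models are actually used. All function names and the data are hypothetical.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def clip(p, low=0.01, high=0.99):
    """Keep probabilities away from 0 and 1 so logit and 1/p stay finite."""
    return min(max(p, low), high)


def stratum_means(values, keys):
    """Empirical mean of `values` within each key (a saturated model)."""
    sums, counts = {}, {}
    for value, key in zip(values, keys):
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def fit_perturbation(y, offsets, h, iterations=25):
    """Logistic regression of y on h with a fixed offset and no free intercept,
    solved by Newton's method for the single coefficient epsilon."""
    eps = 0.0
    for _ in range(iterations):
        p = [expit(o + eps * hi) for o, hi in zip(offsets, h)]
        score = sum(hi * (yi - pi) for hi, yi, pi in zip(h, y, p))
        info = sum(hi * hi * pi * (1.0 - pi) for hi, pi in zip(h, p))
        if info < 1e-12:
            break
        eps += score / info
    return eps


def tmle_ate(y, a, w):
    """Average treatment effect of binary treatment a on binary outcome y.

    Assumes every covariate stratum contains treated and untreated patients.
    """
    # S541: initial estimate of E[Y | A, W]
    q = {k: clip(v) for k, v in stratum_means(y, list(zip(a, w))).items()}
    # S542: estimate of the treatment allocation probability P(A=1 | W)
    g = {k: clip(v) for k, v in stratum_means(a, w).items()}
    # S543: indicator perturbation variable
    h = [ai / g[wi] - (1 - ai) / (1.0 - g[wi]) for ai, wi in zip(a, w)]
    # S544: perturbation parameter, with logit of the initial estimate as fixed offset
    offsets = [logit(q[(ai, wi)]) for ai, wi in zip(a, w)]
    eps = fit_perturbation(y, offsets, h)
    # S545: update predictions under treatment and under no treatment, then average
    q1 = [expit(logit(q[(1, wi)]) + eps / g[wi]) for wi in w]
    q0 = [expit(logit(q[(0, wi)]) - eps / (1.0 - g[wi])) for wi in w]
    return sum(u - v for u, v in zip(q1, q0)) / len(y)


def make_rows(w, a, n, events):
    """n patients in stratum (w, a), the first `events` of whom have the event."""
    return [(1 if i < events else 0, a, w) for i in range(n)]


# Hypothetical confounded data: treatment is more common when w == 1.
rows = (make_rows(0, 1, 20, 10) + make_rows(0, 0, 80, 16)
        + make_rows(1, 1, 80, 64) + make_rows(1, 0, 20, 10))
y, a, w = (list(col) for col in zip(*rows))
print(round(tmle_ate(y, a, w), 4))
```

For this constructed data the crude difference in event proportions between treated and untreated patients is 0.48, while the sketch returns approximately 0.30, the covariate-standardised difference. Because the initial model here is saturated, the estimated perturbation parameter is close to zero; with regression or machine-learning models on richer covariates the update step would generally change the predictions.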
6. The high-throughput real-world drug effectiveness and safety evaluation method according to claim 1, wherein step S55 specifically comprises the following sub-steps:
step S551: repeating steps S52-S54 until the maximum number of trials is reached, and calculating the average effect of all the average treatment effects;
step S552: calculating the confidence interval of the average effect, taking as final positive signals those signals whose left confidence bound is greater than zero or whose right confidence bound is less than zero, and determining the effectiveness and safety of the target drug.
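The decision rule in steps S551 and S552 can be sketched as follows. The claim does not say how the confidence interval is formed; the sketch assumes a normal-approximation interval around the mean of the per-trial estimates. The function name and the numbers are hypothetical.

```python
import math
import statistics


def final_signal(ate_estimates, z=1.96):
    """Average the per-trial estimates and check whether the interval excludes zero.

    Assumption: interval = mean +/- z * sample_sd / sqrt(number_of_trials).
    """
    k = len(ate_estimates)
    mean_effect = statistics.mean(ate_estimates)
    std_error = statistics.stdev(ate_estimates) / math.sqrt(k)
    lower = mean_effect - z * std_error
    upper = mean_effect + z * std_error
    return {
        "mean_effect": mean_effect,
        "lower": lower,
        "upper": upper,
        # S552: positive if the left bound is above zero or the right bound below zero
        "positive": lower > 0 or upper < 0,
    }


# Hypothetical average treatment effects from five repeated trials.
consistent = final_signal([0.10, 0.12, 0.08, 0.11, 0.09])
mixed = final_signal([0.05, -0.04, 0.02, -0.03, 0.00])
print(consistent["positive"], mixed["positive"])
```

The first list yields an interval entirely above zero and would be kept as a final positive signal; the second straddles zero and would be discarded.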
7. A system for implementing the high-throughput real-world drug effectiveness and safety evaluation method according to any one of claims 1-6, comprising:
a data acquisition module, configured to acquire data from existing data and clean the data to complete construction of a data set;
a high-throughput signal screening module, configured to convert the data in the data set into clinical events and to perform high-throughput signal screening using event sequence symmetry analysis to obtain primary screening positive signals;
a cause-and-effect evaluation module, configured to perform cause-and-effect evaluation on the primary screening positive signals and to determine the effectiveness and safety of the target drug;
and a result display module, configured to display the effectiveness and safety results for the target drug.
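The screening module relies on event sequence symmetry analysis, which this part of the claims names without detailing. As a rough illustration only, the Python sketch below computes the crude sequence ratio in its commonly described form: the number of patients whose first drug exposure precedes the first event, divided by the number with the reverse order. The observation window, the function name and the data are assumptions, and the null-effect adjustment and confidence interval used in practice are omitted.

```python
def crude_sequence_ratio(first_dates, window_days=365):
    """Crude sequence ratio for one drug-event pair.

    `first_dates` holds (first_drug_day, first_event_day) per patient, as day
    numbers. Patients whose two dates coincide or lie more than `window_days`
    apart are excluded.
    """
    drug_first = event_first = 0
    for drug_day, event_day in first_dates:
        gap = event_day - drug_day
        if gap == 0 or abs(gap) > window_days:
            continue
        if gap > 0:
            drug_first += 1
        else:
            event_first += 1
    # The ratio is undefined when nobody has the event before the drug.
    ratio = drug_first / event_first if event_first else None
    return drug_first, event_first, ratio


# Hypothetical patients: three drug-first, one event-first, one same-day,
# one outside the window.
pairs = [(1, 5), (2, 9), (3, 4), (10, 2), (7, 7), (0, 500)]
print(crude_sequence_ratio(pairs))
```

A ratio well above 1 suggests that the event tends to follow the drug, which is the kind of asymmetry a screening step could flag for the later cause-and-effect evaluation.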
CN202211051408.2A 2022-08-31 2022-08-31 High-throughput real world drug effectiveness and safety evaluation method and system Active CN115148375B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211051408.2A CN115148375B (en) 2022-08-31 2022-08-31 High-throughput real world drug effectiveness and safety evaluation method and system
JP2023095912A JP7433503B1 (en) 2022-08-31 2023-06-09 High-throughput real-world drug efficacy and safety evaluation methods and systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211051408.2A CN115148375B (en) 2022-08-31 2022-08-31 High-throughput real world drug effectiveness and safety evaluation method and system

Publications (2)

Publication Number Publication Date
CN115148375A CN115148375A (en) 2022-10-04
CN115148375B CN115148375B (en) 2022-11-15

Family

ID=83415505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211051408.2A Active CN115148375B (en) 2022-08-31 2022-08-31 High-throughput real world drug effectiveness and safety evaluation method and system

Country Status (2)

Country Link
JP (1) JP7433503B1 (en)
CN (1) CN115148375B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424741B (en) * 2022-11-02 2023-03-24 之江实验室 Adverse drug reaction signal discovery method and system based on cause and effect discovery
CN116504423B (en) * 2023-06-26 2023-09-26 北京大学 Drug effectiveness evaluation method
CN117334347B (en) * 2023-12-01 2024-03-22 北京大学 Method, device, equipment and storage medium for evaluating treatment effect
CN117690547A (en) * 2023-12-19 2024-03-12 北京遥领医疗科技有限公司 Method for multidimensional reverse mining of data based on medicine real world curative effect

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145845A (en) * 2019-12-20 2020-05-12 四川大学华西第二医院 Block chain based anti-tumor drug grading management and tracking medication compliance system
CN114300158A (en) * 2021-12-28 2022-04-08 刘玉强 Method and device for identifying adverse drug reactions

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101603088A (en) * 2009-07-16 2009-12-16 张俊 The method and system of the dependency of assessment gene order and pharmacological reaction of medicament
KR20120124234A (en) * 2011-05-03 2012-11-13 (주)제이브이엠 Automatic reinspection system and the method of prescription drugs
US10614196B2 (en) * 2014-08-14 2020-04-07 Accenture Global Services Limited System for automated analysis of clinical text for pharmacovigilance
CN105139083A (en) * 2015-08-10 2015-12-09 石庆平 Method and system for reevaluating safety of drug after appearance on market
EP3622423A1 (en) * 2017-05-12 2020-03-18 The Regents of The University of Michigan Individual and cohort pharmacological phenotype prediction platform
CN107480443A (en) * 2017-08-08 2017-12-15 复旦大学附属儿科医院 A kind of clinical drug integrated evaluating method based on real world
CA3094294A1 (en) * 2018-03-23 2019-09-26 F. Hoffmann-La Roche Ag Methods for screening a subject for the risk of chronic kidney disease and computer-implemented method
CN111599480B (en) * 2020-04-20 2022-12-02 国家药品监督管理局药品评价中心(国家药品不良反应监测中心) Method, device, terminal and readable medium for evaluating adverse drug reactions
CN111933242B (en) 2020-08-12 2024-08-06 陈灿 Medicine usage coding information processing method, system, storage medium and terminal
CN113035369B (en) * 2021-03-10 2021-12-03 浙江大学 Construction method of kidney transplantation anti-infective drug dosage prediction model
CN114025253A (en) * 2021-11-05 2022-02-08 杭州联众医疗科技股份有限公司 Drug efficacy evaluation system based on real world research

Also Published As

Publication number Publication date
CN115148375A (en) 2022-10-04
JP2024035072A (en) 2024-03-13
JP7433503B1 (en) 2024-02-19

Similar Documents

Publication Publication Date Title
CN115148375B (en) High-throughput real world drug effectiveness and safety evaluation method and system
Hayat et al. Understanding Poisson regression
Dolan et al. An evaluation of clinicians' subjective prior probability estimates
Lix et al. Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases
WO2009114795A2 (en) Non-natural pattern identification for cognitive assessment
WO2010005656A2 (en) Brain condition assessment
Gibson et al. A copayment increase for prescription drugs: the long-term and short-term effects on use and expenditures
Du et al. Digitally generated trail making test data: analysis using hidden Markov modeling
CN114300158A (en) Method and device for identifying adverse drug reactions
Thornley et al. Estimating diabetes prevalence in South Auckland: how accurate is a method that combines lists of linked health datasets?
Williamson et al. A classification statistic for GEE categorical response models
Aaskoven et al. Subjective well-being and chronic illnesses: A combined survey and register study
KR102650936B1 (en) Mental health risk signal detection system, and mental health risk signal detection method using thereof
Schneider et al. Traumatic brain injury and cognitive change over 30 years among community‐dwelling older adults
Wardenaar et al. International journal of methods in psychiatric research 2015, 24: 130-142.
Varewyck On quantifying quality of care
Bradley Impact of Gender Bias in Training Data for Machine Learning Models predicting Myocardial Infarction
Bhattacharyay From big data to personal narratives: a supervised learning framework for decoding the course of traumatic brain injury in intensive care
Suzen et al. What is Hiding in Medicine’s Dark Matter? Learning with Missing Data in Medical Practices
Chi Policy Evaluation with Nonorthogonal Instruments: Evidence from China's Family Planning
Kuhlmey et al. Estimating Survival Times Using Swiss Hospital Data
Henriques Risk factors for Amyotrophic Lateral Sclerosis (ALS): the effect of high-intensity sport
Kim Clinical and Economic Impact of High Short-Acting β-Agonist Use in Patients with Persistent Asthma
Al-Ghraiybah et al. Effects of the nursing practice environment, nurse staffing, patient surveillance and escalation of care on patient mortality: A multi-source quantitative study
윤성욱 The Effect of General Health Checks on Healthcare Utilization: Accounting for Self-Selection Bias

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant