CN117546250A - Systems and methods for estimating treatment efficacy using covariate adjustment stratification and pseudo-value regression in randomized trials

Info

Publication number: CN117546250A
Application number: CN202280044658.5A
Authority: CN
Inventors: A. Schuler da Costa Ferro; D. P. Miller; Yunfan Li; A. Vanderbeek
Original assignee: Unlearn.AI, Inc.
Current assignee: Unlearn.AI, Inc.
Priority date: 2021-06-24
Filing date: 2022-06-24
Publication date: 2024-02-09

Abstract

Systems and methods for estimating treatment effects using covariate-adjusted stratification and pseudo-value regression in randomized controlled trials according to embodiments of the present invention are described. One embodiment includes a method for estimating a treatment effect in a randomized controlled trial, wherein the method includes receiving external data from a previous randomized clinical trial. The method further includes generating a set of one or more subject characteristics for a plurality of test subjects, estimating a binary outcome for the test subjects using a stratification process, and estimating a time-to-event (TTE) treatment effect for the test subjects using pseudo-value regression.

Description

Systems and methods for estimating treatment efficacy using covariate adjustment stratification and pseudo-value regression in randomized trials

Cross reference to related applications

Pursuant to 35 U.S.C. § 119(e), U.S. Provisional Patent Application No. 63/214,643, entitled "Systems and Methods for Randomized Trials via Prognostic Score Stratification", filed June 24, 2021, and U.S. Provisional Patent Application No. 63/363,796, entitled "RMST Pseudovalue Regression Variance", filed [month] 28, 2022, are hereby incorporated by reference in their entireties for all purposes.

Technical Field

The present invention relates generally to clinical trial design, and more particularly to improving the statistical power to detect treatment effects by using stratification and/or pseudo-value regression with covariates derived from generative models.

Background

Clinical studies and clinical trials aim to investigate the safety and effectiveness of biomedical or behavioral interventions on humans. When new drugs and medical devices are invented, they must undergo rigorous experimentation to generate data regarding their effectiveness and safety in order to be approved by the relevant authorities for clinical use. Test products that do not yield satisfactory levels of safety or efficacy are not approved for large-scale commercial use.

A randomized controlled trial (RCT) is one approach to conducting clinical trials. An RCT generally has two groups: a treatment group (treatment arm) and a control group (control arm). Enrolled subjects are randomly assigned to one of the groups, and the efficacy of the proposed new therapy is determined by comparing the outcomes of subjects in the treatment group, who receive the new therapy, with those of subjects in the control group, who receive the existing therapy. Although outcomes are affected by the individual characteristics of the participants, which differ subtly from one subject to another, an RCT allows trialists to control for these influencing factors. Well-designed RCTs can provide not only reliable evidence about trial outcomes but also information about adverse effects that the tested intervention may produce.

Covariate adjustment refers to controlling for baseline characteristics of the test subjects when estimating the treatment effect. In most cases, trial outcomes are correlated with baseline characteristics of the test subjects. In the RCT setting, covariate adjustment is an effective tool for estimating treatment effects. Because baseline characteristics are collected and measured before random assignment, trialists retain the ability to test the treatment effect across the randomized trial while adjusting for known covariates.

Disclosure of Invention

Systems and methods for estimating treatment effects using covariate-adjusted stratification and pseudo-value regression in randomized controlled trials according to embodiments of the present invention are described. One embodiment includes a method for estimating a treatment effect in a randomized controlled trial, wherein the method includes receiving external data from a previous randomized clinical trial. The method further includes generating a set of one or more subject characteristics for a plurality of test subjects, estimating a binary outcome for the test subjects using a stratification process, and estimating a time-to-event (TTE) treatment effect for the test subjects using pseudo-value regression.

In another embodiment, estimating the binary outcome of the test subjects using a stratification process includes training a prognostic model using the received external data, generating outcome predictions for the test subjects using the prognostic model, defining a variable for stratification of the test subjects based on the outcome predictions, stratifying all of the test subjects into multiple strata by the variable, and estimating the treatment outcome of the test subjects across all strata.

In a further embodiment, the method further comprises the step of estimating the TTE treatment effect of the test subject using pseudo-value regression, wherein the method comprises training a prognosis model using the received external data, generating a prognosis score for the test subject using the prognosis model and the generated subject characteristics of the test subject, and estimating the TTE treatment effect of the test subject using the pseudo-value regression model and the prognosis score.

In yet another embodiment, the set of one or more characteristics of the plurality of test subjects includes baseline covariates of the test subjects and treatment assignments of the test subjects.

In still further embodiments, the prognostic model is a generative model.

In yet another embodiment, the prognostic model is a generalized linear model.

In yet a further embodiment, the prognostic model is a simple rule-based model.

In another additional embodiment, the prognostic model is a model-based generative machine learning model.

In a further additional embodiment, estimating the TTE treatment effect comprises estimating a restricted mean survival time (RMST) of the test subjects.

In yet another embodiment, the method further comprises designing a clinical study based on the estimated therapeutic effect.

One embodiment includes a non-transitory machine-readable medium comprising processor instructions for estimating a therapeutic effect in a randomized controlled trial using covariate adjustment stratification and pseudo-value regression, wherein execution of the instructions by the processor causes the processor to perform a process comprising receiving external data of a previous randomized clinical trial. The method further includes generating a set of one or more subject characteristics for the plurality of test subjects, estimating a binary result for the test subjects using a stratification process, and estimating a Time To Event (TTE) treatment effect for the test subjects using pseudo-value regression.

Drawings

The description and claims of the present invention will be more fully understood with reference to the following drawings and data diagrams, which are presented as exemplary embodiments of the invention, and should not be construed as a complete description of the scope of the invention.

Fig. 1 is a flow chart of a process for estimating the effect of a treatment in a randomized controlled trial.

FIG. 2 is a flow chart of a process for incorporating generative model-based stratification in the design of a randomized controlled trial in accordance with an embodiment of the present invention.

Fig. 3 is a flow chart of a process for estimating the treatment effect for TTE outcomes according to an embodiment of the invention.

Fig. 4 is a diagram of a network on which a process for estimating a treatment effect may be implemented according to an embodiment of the invention.

Fig. 5 is a high-level block diagram of a system on which a process for estimating a treatment effect may be implemented in accordance with an embodiment of the present invention.

Fig. 6 is a high-level block diagram of an application executing a process of estimating a therapeutic effect in accordance with an embodiment of the present invention.

Detailed Description

Systems and methods according to some embodiments of the invention may estimate treatment effects in a randomized controlled trial (RCT). In several embodiments, the treatment effect can be estimated from the outcomes of enrolled subjects under control and treatment conditions. Systems and methods according to various embodiments of the invention may use covariate-adjusted stratification to estimate treatment outcomes. In many embodiments, the treatment effect for an event outcome can be evaluated based on the difference in the time at which the event occurs under control and treatment conditions. Systems and methods according to many embodiments of the invention may use covariate-adjusted pseudo-value regression to estimate time-to-event (TTE) treatment effects.

Processes according to certain embodiments of the present invention may improve RCT design by reducing the sample size required for the trial. In many embodiments, the process may reduce the variance of the resulting estimates, which may improve their precision.

An RCT generally requires a sample size large enough for its results to be representative. However, a large required sample size also makes it harder to enroll a sufficient number of participants, which can make it challenging to complete a study or to achieve adequate power to estimate the treatment effect. Embodiments of the present invention may address this problem through stratification. In many embodiments, test subjects may be divided into non-overlapping groups according to their specific characteristics. In several embodiments, stratification of test subjects may be performed multiple times based on multiple subject characteristics. Machine learning models according to various embodiments of the present invention can be used to estimate outcomes under control conditions, which can in turn be used to identify optimal groupings for stratifying the test subjects.

In an RCT, time-to-event (TTE) analysis is important because it establishes a time frame in which important clinical events may occur in the trial. However, in clinical studies and trials, there are always subjects who exit the trial before the clinical event of interest occurs. Even in well-conducted RCTs, about 10% to 20% of the test subjects typically leave the study before the predetermined follow-up time. For trial purposes, subjects lost to follow-up are treated as censored data from their last known visit onward. The accumulated censored data can affect the established time frame of major clinical events in the trial, which in turn affects the estimate of the treatment effect. Embodiments of the present invention can address this problem by analyzing the TTE treatment effect on test subjects using pseudo-value regression. In some embodiments, pseudo-value regression is applied to the censored data to estimate the TTE treatment effect.

An example process for estimating the effect of a treatment in RCT according to many embodiments of the invention is illustrated in fig. 1. In many embodiments, the process 100 obtains external data for the test subject from a prior randomized clinical trial (110). In some embodiments, the external data may be from a high quality observational study. According to several embodiments of the invention, the external data may include subject characteristics of the test subject, and/or its final test results from a previous randomized clinical trial. In many embodiments, the prognostic model is trained with the obtained external data, and the model can be used to estimate the outcome of the patient under control conditions. Embodiments of the present invention may utilize these estimations to improve the accuracy of the estimated therapeutic effect, as will be described in further detail below.

The process 100 generates a set of one or more subject characteristics for the test subjects of a target trial (120). In particular embodiments, the subject characteristics include baseline covariates for each test subject and the subject's treatment group assignment. The subject characteristics may be used alone or in combinations of two or more in the treatment effect estimation discussed in detail below.

The process 100 estimates the treatment effect for the test subjects (130). In many embodiments, the estimated treatment effect includes a treatment outcome and a TTE treatment effect. In several embodiments, the treatment outcome may be binary, i.e., it indicates whether the test subject achieved the desired treatment outcome. Stratified analysis can be used to estimate binary treatment outcomes: all test subjects are divided into non-overlapping groups, called strata, according to a specific subject characteristic that all test subjects have, allowing researchers to observe the correlation between that characteristic and the binary trial outcome. In many embodiments, treatment assignment may be independent of a subject's stratum, because test subjects may be randomly assigned to the control or treatment group of the trial before stratification.

Time-to-event (TTE) analysis establishes a time frame in which significant clinical events may occur in the trial and may be another indicator of the efficacy of the new therapy under test. In various embodiments, the event of interest may be whether the test subject obtained the desired treatment outcome. In many embodiments, the treatment effect may include a TTE treatment effect. According to embodiments of the invention, the TTE treatment effect may allow a researcher to observe how the TTE for certain events varies between test subjects. However, TTE treatment effects may be affected by test subjects who withdraw from the trial before experiencing the event of interest. Thus, in many embodiments, the TTE treatment effect can be estimated for all test subjects, including censored subjects, so that the estimate accurately reflects trial outcomes based on the original enrollment. In several embodiments, a parametric regression model (including pseudo-value regression) is used to estimate TTE treatment effects, as will be discussed in further detail below.

In several embodiments, a clinical study may be designed based on the estimated treatment effect. In many embodiments, clinical studies designed based on estimated treatment effects may maintain the desired level of study power while keeping the sample size small to save costs. According to embodiments of the present invention, the variance of the estimate may also be reduced to achieve the greatest possible precision.

Although specific procedures for estimating treatment effects in an RCT are described above, any of a variety of procedures may be utilized to estimate treatment effects in an RCT as appropriate to the requirements of a particular application. In certain embodiments, the steps may be performed or practiced in any order or sequence and are not limited to the order or sequence shown and described. In various embodiments, some of the above steps may be performed substantially simultaneously, where appropriate, or in parallel to reduce latency and processing time. In some embodiments, one or more of the above steps may be omitted.

Estimating therapeutic effect of binary outcome

Estimating the treatment effect for a binary outcome using stratification is a multi-step process. A conceptual illustration of the stratification and estimation process is shown in fig. 2. Process 200 trains a prognostic model using external data obtained from previous trials (210). In some embodiments, the external data may be from a high quality observational study. According to several embodiments of the invention, the external data may include subject characteristics of the test subjects and/or their final trial outcomes from a previous randomized clinical trial. In particular embodiments, the prognostic model can be a generative model. In various embodiments, the prognostic model can have binary, categorical, continuous, and time-to-event outputs that are then used to derive a binary outcome probability for each trial participant.

Process 200 generates outcome predictions for the test subjects under control conditions using the trained prognostic model (220). In several embodiments, the prognostic model generates an outcome prediction using the complete set of one or more subject characteristics. Since the outcomes of interest in an RCT are typically binary, the outcome predictions generated in many embodiments of the invention may also be binary in nature, in that the score predicts the probability of one of two possible outcomes. If the binary outcome is defined by some underlying continuous variable, then in certain embodiments of the invention the prediction of the continuous variable itself may be used as the stratification variable. In several embodiments, the choice of stratification variable may be determined jointly by the definition and expected variance of the outcome and the possible sample size reduction.

In many embodiments, the stratification process uses the framework of the conventional stratified chi-square (Cochran-Mantel-Haenszel, CMH) test. The CMH method uses the stratification variable to separate the test subjects into a series of 2x2 tables, exemplified as follows:

Table 1: 2x2 table of binary outcomes for test subjects in the treatment and control groups when all trial outcomes are observed. Cell A represents the number of subjects assigned to the treatment group who achieved the desired outcome. Cell B represents the number of subjects assigned to the treatment group who did not achieve the desired outcome. Cells C and D are interpreted the same way for the control group.

                    Desired outcome    No desired outcome
Treatment group     A                  B
Control group       C                  D
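As an illustration of how a series of 2x2 tables of this kind can be combined, the following Python sketch computes the textbook Cochran-Mantel-Haenszel chi-square statistic (without continuity correction). The function name `cmh_statistic` and the tuple layout `(a, b, c, d)` are assumptions made for this example; the source does not specify an implementation.

```python
from math import erfc, sqrt

def cmh_statistic(tables):
    """Textbook CMH chi-square statistic (1 degree of freedom, no continuity correction).

    tables: one (a, b, c, d) tuple per stratum, following Table 1:
      a = treated with desired outcome, b = treated without,
      c = control with desired outcome, d = control without.
    Returns (statistic, p_value).
    """
    diff_sum = 0.0  # sum over strata of a_j - E[a_j]
    var_sum = 0.0   # sum over strata of Var(a_j) under the null hypothesis
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 subjects carries no information
        expected_a = (a + b) * (a + c) / n
        var_a = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        diff_sum += a - expected_a
        var_sum += var_a
    if var_sum == 0.0:
        raise ValueError("no informative strata")
    stat = diff_sum ** 2 / var_sum
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value

# Two hypothetical strata (illustrative counts only).
stat, p_value = cmh_statistic([(10, 10, 5, 15), (8, 12, 4, 16)])
```

Strata in which the treatment and control rows have identical outcome proportions contribute nothing to the numerator, so the statistic reflects only within-stratum differences between the arms.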

Process 200 defines a variable X based on the outcome predictions for stratification of the test subjects (230). In several embodiments, X may be defined based on the probability $p_j$ of observing the outcome Y, and may be ordered. In particular embodiments, process 200 may define the variable X by pooling all outcome predictions $a_i$ and splitting all $a_i$ into several strata labeled $j$. For trials that use outcome prediction in conjunction with the CMH method, processes according to particular embodiments of the invention may separate test subjects into strata based on the probability of the binary outcome occurring during the study. In several embodiments, this may allow more flexibility in applying the prognostic information in a set of baseline variables to create strata, where the strata are based on outcome predictions under control conditions. For trials that stratify under the CMH method without outcome prediction, the existing stratification method may be replaced with strata defined by the outcome predictions, because strata defined by outcome predictions incorporate the complete set of one or more subject characteristics.
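One simple way to turn pooled outcome predictions into an ordered stratification variable is to cut them into roughly equal-sized groups by rank. The Python sketch below shows this under that assumption; the equal-size rule and the name `assign_strata` are choices made for this example, not details given in the source.

```python
def assign_strata(predictions, n_strata):
    """Assign each subject a stratum index in 0..n_strata-1.

    Subjects are ranked by predicted outcome probability and cut into
    n_strata groups of (nearly) equal size, so stratum 0 holds the lowest
    predictions. Tied predictions may fall into adjacent strata.
    """
    n = len(predictions)
    order = sorted(range(n), key=lambda i: predictions[i])
    strata = [0] * n
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // n
    return strata

# Hypothetical predicted probabilities of the binary outcome under control.
preds = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
strata = assign_strata(preds, 3)
```

Other cut rules (fixed probability thresholds, for example) would fit the same description; the choice affects how evenly subjects are spread across strata.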

In several embodiments, process 200 may define the stratification variable using a generalized linear model (GLM) and perform the proposed covariate-adjusted analysis (230). In addition to the proposed stratification variable, a GLM may also allow a number of additional covariates to be included in the stratified analysis. Let $Y_i \in \{0, 1\}$ denote the outcome of subject $i$, and let $X_i$ be the covariate vector for subject $i$. In many embodiments, the GLM may be defined as $g(E[Y_i \mid X_i]) = X_i'\beta$. According to some embodiments of the invention, $g$ may be a link function including, but not limited to, the logit, Poisson, and log-binomial link functions.
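To make the GLM step concrete, here is a minimal Python sketch of a logit-link GLM with an intercept and a single covariate, fitted by Newton-Raphson. The single-covariate restriction, the iteration count, and the name `fit_logistic` are simplifications assumed for this example; it does not handle perfectly separated data.

```python
from math import exp, log

def fit_logistic(x, y, n_iter=25):
    """Fit logit(P(Y=1 | x)) = b0 + b1 * x by Newton-Raphson.

    x: list of covariate values, y: list of 0/1 outcomes.
    Returns (b0, b1). Assumes the data are not perfectly separated.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # entries of the (negated) Hessian
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 system.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: outcome rate 0.25 when x = 0 and 0.75 when x = 1.
b0, b1 = fit_logistic([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 0, 1, 1, 1])
```

The fitted probabilities from such a model could then serve as the ordered quantity that is cut into strata.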

Process 200 stratifies the test subjects into J strata by the variable X, where $j = 1, 2, \ldots, J$. In many embodiments, $p_{0j}$ and $p_{1j}$ denote the probabilities of the desired outcome in stratum $x_j$ under the control and treatment groups, respectively, and $n_{0j}$ and $n_{1j}$ denote the subject counts observed in the control and treatment groups of each stratum, respectively. Process 200 estimates the outcome distribution across all strata under control conditions (250). In several embodiments, process 200 tests a null hypothesis against an alternative hypothesis [hypotheses not reproduced], where $\hat{\psi}$ is an estimate of the marginal treatment effect. According to many embodiments of the invention, the sampling distributions of $\hat{\psi}$ under the null hypothesis and the alternative hypothesis may be given by [distributions not reproduced], where $\hat{\sigma}^2$ denotes the variance of the marginal treatment effect estimate. In particular embodiments, the process may estimate the marginal treatment effect and the variance of the estimate based on the number of strata and the treatment outcome prediction for each stratum. The estimated marginal treatment effect and its variance under the alternative hypothesis may both be weighted sums of the J stratum-level values, where the weights $w_j$ can be defined by the observed counts $n_{0j}$ and $n_{1j}$. Furthermore, a confidence interval at level $\alpha$ for the marginal treatment effect can also be estimated from the sampling distribution under the alternative hypothesis.
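The weighted-sum structure described above can be sketched in Python as follows. This example assumes the treatment effect is measured as a risk difference and that the stratum weights are $w_j = (n_{0j} + n_{1j}) / N$; both are assumptions made for illustration, since the text only says that the weights are defined by the observed counts. The function names are hypothetical.

```python
from statistics import NormalDist

def stratified_risk_difference(strata):
    """Weighted stratum-level risk difference and its variance.

    strata: list of (events_control, n_control, events_treated, n_treated).
    Assumed weights: w_j = (n_0j + n_1j) / N.
    Returns (estimate, variance).
    """
    total = sum(n0 + n1 for _, n0, _, n1 in strata)
    estimate = 0.0
    variance = 0.0
    for e0, n0, e1, n1 in strata:
        w = (n0 + n1) / total
        p0 = e0 / n0  # observed outcome rate, control arm of this stratum
        p1 = e1 / n1  # observed outcome rate, treatment arm of this stratum
        estimate += w * (p1 - p0)
        variance += w * w * (p0 * (1 - p0) / n0 + p1 * (1 - p1) / n1)
    return estimate, variance

def confidence_interval(estimate, variance, alpha=0.05):
    """Normal-approximation confidence interval with coverage 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * variance ** 0.5
    return estimate - half_width, estimate + half_width

# Two hypothetical strata with 20 subjects per arm.
est, var = stratified_risk_difference([(5, 20, 10, 20), (4, 20, 8, 20)])
low, high = confidence_interval(est, var)
```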

Embodiments of the invention can control the type I error associated with the estimated treatment effect and keep the treatment effect estimate unbiased. As described above, in several embodiments of the invention, treatment assignment may be independent of stratum. In some embodiments, $w_j \xrightarrow{P} P(X = j)$, and $\hat{p}_{0j}$ and $\hat{p}_{1j}$ may be consistent estimates of the true probabilities for all $j$. It follows that $\hat{\psi}$ is a consistent estimator, and $\hat{\sigma}^2$ may also be consistent for the true sampling variance of $\hat{\psi}$.

Process 200 estimates the power of the study based on the estimated outcome distribution, assuming a stratified primary analysis (260). In many embodiments, as $N \to \infty$, [expression not reproduced], where $V$ is the expected variance of the CMH estimate under certain assumptions about the probabilities and stratum weights. In particular embodiments, it may be assumed that $w_j = P(X = x_j)$. In several embodiments, as the sample size of the trial increases such that $N \to \infty$, the power of the study approaches:

[equation not reproduced]
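For orientation, a generic large-sample power calculation for a two-sided Wald-type test can be sketched as below. This is a standard normal approximation and is an assumption of this example, not a reconstruction of the expression used in the source. The names `approximate_power` and `unit_variance` (the variance $V$ of the estimate scaled to a single observation, so that the standard error is $\sqrt{V/N}$) are hypothetical.

```python
from statistics import NormalDist

def approximate_power(effect, unit_variance, n, alpha=0.05):
    """Normal-approximation power of a two-sided level-alpha test.

    effect: assumed true marginal treatment effect.
    unit_variance: assumed variance V such that Var(estimate) = V / n.
    The contribution of the opposite rejection tail is ignored.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    standard_error = (unit_variance / n) ** 0.5
    return nd.cdf(abs(effect) / standard_error - z_crit)

# Hypothetical design values.
power_small = approximate_power(0.5, 1.0, 32)
power_large = approximate_power(0.5, 1.0, 64)
```

A smaller variance $V$, such as one obtained through prognostic stratification, raises the power at a fixed sample size or lowers the sample size needed for a target power.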

The reduction in estimated variance obtained by using the CMH model with binary outcome prediction, compared with the estimated variance without binary outcome prediction, can be expressed as:

[Equation (2) not reproduced]

In practice, approximating Equation (2) a priori may require that some variables have expected values that can be estimated from a historical dataset.

In particular embodiments, Equation (2) may be approximated by $R^2$, i.e., the square of the correlation $r_{XY}$ between X and the outcome Y under control. In some embodiments, since X may be defined as an ordered categorical covariate and Y may be defined as a binary categorical outcome, the Spearman correlation may be used to determine the association between X and Y. In several embodiments, other meaningful measures, such as Kendall's tau or the area under the curve (AUC), may be used to determine the degree of association.
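The Spearman correlation between an ordered stratification variable and a binary outcome can be computed as the Pearson correlation of their average ranks. The following Python sketch shows this on made-up data; the helper names are hypothetical, and the squared value is shown only as the $R^2$-style quantity mentioned above.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(u, v):
    """Pearson correlation; assumes neither input is constant."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    sd_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return cov / (sd_u * sd_v)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical stratum labels X and binary control outcomes Y.
r_xy = spearman([1, 2, 3, 4], [0, 0, 1, 1])
r_squared = r_xy ** 2
```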

In many embodiments, the variance of the treatment effect estimated by the CMH test is also a function of the stratum-level outcomes. When the value of $J$ and the $p_{0j}$ values of all strata are known, $E(\gamma)$ can be calculated as the expected value. When the values of the design parameters are limited, another a priori procedure may be required to estimate the likely strata. In several embodiments, the process requires a simulation based on the parameters $J$, the sample size $N$, and $r_{XY}$. Subjects in the simulated data may be assigned outcomes $(x_i, y_i)$, where $p_{0j}$ can be taken as the mean. Under certain assumptions, the variance reduction can be approximated by:

[equation not reproduced]

where $V(x_j)$ is the expected variance of stratum $x_j$ based on the estimated $p_{0j}$. In practice, formal estimates of $\sigma^2$ for both the CMH test and the unadjusted test are computed using the expected parameter values described above.

Embodiments of the present invention can reduce the control-group sample size required for an RCT while maintaining the desired power and type I error control. Consider the control-group sample size under the CMH test and $n_0$, the control-group sample size under the unadjusted test. In several embodiments, the process reduces the sample size by approximately:

[equation not reproduced]

where the subscript 1 denotes a value under the alternative hypothesis given above.

Although specific procedures for using stratification in an RCT to estimate the treatment effect for a binary outcome are described above, any of a variety of procedures may be utilized to estimate the treatment effect for a binary outcome using stratification in an RCT as appropriate to the needs of a specific application. In particular embodiments, steps may be performed or practiced in any order or sequence and are not limited to the order or sequence shown and described. In various embodiments, some of the above steps may be performed substantially simultaneously, or in parallel, where appropriate, to reduce latency and processing time. In some embodiments, one or more of the above steps may be omitted.

Estimating TTE treatment effect

A TTE endpoint refers to the point in time at which a particular event occurs in a trial. The treatment effect detected from the TTE endpoint may be used as another indicator of the efficacy of the new treatment. Different test subjects may progress differently, and differences detected in the TTE of subjects under treatment and control conditions may help researchers improve the drug. Fig. 3 is a conceptual illustration of estimating TTE treatment effects using pseudo-value regression with covariates obtained from a generative model. In many embodiments, to estimate TTE treatment effects, the event of interest is whether the test subject has a favorable or unfavorable outcome with respect to the study, while concurrent events are also considered. Process 300 trains a prognostic model using the obtained external data (310). In some embodiments, the external data may be from the control group of a clinical trial, a high quality observational study, or any other data source that may approximate a high quality data set. According to several embodiments of the invention, the external data may include subject characteristics of the test subjects and their final trial outcomes from previous randomized clinical trials. In several embodiments, the prognostic model can be a simple rule-based model. In some embodiments, the prognostic model can be a model-based generative machine learning model.

Process 300 generates a prognostic score for each test subject using the trained prognostic model and the subject's characteristics (320). In particular embodiments, the prognostic score can be the expected value of the outcome predicted by the prognostic model. The prognostic score can be defined by [equation not reproduced], where $X_i$ represents the i-th potentially prognostic baseline characteristic. In various embodiments, the process may calculate the expected value of the outcome prediction by drawing samples from the prognostic model and applying a Monte Carlo method to the drawn samples.
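The Monte Carlo step can be sketched in Python as a plain average over draws from a generative model. In the example below, `toy_generative_model` is a hypothetical stand-in for a trained prognostic model, and the field name `age_scaled` is invented for illustration; neither comes from the source.

```python
import random

def prognostic_score(sample_outcome, baseline, n_samples=1000, seed=0):
    """Monte Carlo estimate of the expected control outcome for one subject.

    sample_outcome(baseline, rng) must return one simulated outcome drawn
    from the prognostic model given the subject's baseline characteristics.
    """
    rng = random.Random(seed)
    draws = [sample_outcome(baseline, rng) for _ in range(n_samples)]
    return sum(draws) / n_samples

def toy_generative_model(baseline, rng):
    """Hypothetical stand-in model: outcome = 2 * age_scaled + unit Gaussian noise."""
    return 2.0 * baseline["age_scaled"] + rng.gauss(0.0, 1.0)

score = prognostic_score(toy_generative_model, {"age_scaled": 0.5}, n_samples=5000)
```

The Monte Carlo error of the score shrinks in proportion to one over the square root of the number of samples, so the sample count trades accuracy against compute.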

Process 300 estimates the treatment effect for the TTE outcome using the pseudo-value regression model and the prognostic score (330). In certain embodiments, the process performs this estimation after the target trial is completed, when the available TTE data can be readily collected. In many embodiments, the time-to-event quantity of interest may be the restricted mean survival time (RMST). Processes according to several embodiments of the present invention fit a generalized linear model (GLM) to TTE data that include censored data. For some function $f$, let $\theta = E[f(X)]$, where $\theta$ denotes the RMST and $X_1, \ldots, X_n$ are independent and identically distributed quantities. Let $\theta_i = E[f(X_i) \mid z_i]$ be the conditional expectation of $f(X_i)$ given $z_i$, where $z_1, \ldots, z_n$ are independent and identically distributed covariate samples. In various embodiments, an unbiased estimate $\hat{\theta}$ of $\theta$ may be used to define the i-th pseudo-observation of $\theta$ as:

$$\hat{\theta}_i = n\hat{\theta} - (n - 1)\hat{\theta}^{-i}$$

where $\hat{\theta}^{-i}$ is the jackknife leave-one-out estimate of $\theta$ based on $\{X_j : j \neq i\}$. In several embodiments, a linear model $\theta_i = \beta_0 + \beta_1 1_T + \beta_2 c_i$ may be used, with $\beta = (\beta_0, \beta_1, \beta_2)$ solved from the following estimating equation:

[equation not reproduced]
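The jackknife construction of pseudo-observations for the RMST can be illustrated with the Python sketch below. It assumes the estimator of $\theta$ is the area under the Kaplan-Meier curve up to a truncation time `tau`; that choice of estimator and the function names are assumptions of this example.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier survival curve on [0, tau].

    times: follow-up times; events: 1 if the event was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    last_t = 0.0
    area = 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        n_events = 0
        n_removed = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        area += surv * (t - last_t)  # survival is constant between event times
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
        n_at_risk -= n_removed
        last_t = t
    area += surv * (tau - last_t)
    return area

def rmst_pseudo_values(times, events, tau):
    """Pseudo-observation i equals n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(times)
    full = km_rmst(times, events, tau)
    pseudo = []
    for i in range(n):
        loo = km_rmst(times[:i] + times[i + 1:], events[:i] + events[i + 1:], tau)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo

# With no censoring, each pseudo-value reduces to min(T_i, tau).
pseudo = rmst_pseudo_values([1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], tau=3.5)
```

With censored subjects the pseudo-values no longer equal any directly observed quantity, but every subject, censored or not, still receives one.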

According to an embodiment of the present invention, the coefficient $\beta_2$ may be estimated and the null hypothesis evaluated by calculating a two-sided p-value based on the t-distribution. The pseudo-values $\hat{\theta}_i$ replace the observed data $X$ in the model. This can be a workable approach because it models censored data in the same way as uncensored data. Including the prognostic score $c$ in the covariate-adjusted pseudo-value regression provides a more precise coefficient estimate. In many embodiments, the gain in precision may be greater as the correlation between the covariates and the pseudo-values increases. In some embodiments, the increased precision may be used to increase efficiency and/or reduce the sample size.
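A minimal Python sketch of the regression step is given below. It assumes an identity link with an independence working covariance, in which case the estimating equation reduces to ordinary least squares of the pseudo-values on an intercept, a treatment indicator, and the prognostic score. That reduction, and the function names, are assumptions of this example; standard errors and the t-test are omitted.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_pseudo_value_regression(pseudo, treated, score):
    """Least-squares fit of theta_i = b0 + b1 * treated_i + b2 * score_i.

    pseudo: pseudo-values; treated: 0/1 treatment indicators;
    score: prognostic scores. Returns [b0, b1, b2].
    """
    rows = [[1.0, float(t), float(c)] for t, c in zip(treated, score)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, pseudo)) for i in range(3)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data generated from b = (2.0, 1.5, 0.5).
treated = [0, 1, 0, 1, 0, 1]
score = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0]
pseudo = [2.0 + 1.5 * t + 0.5 * c for t, c in zip(treated, score)]
beta = fit_pseudo_value_regression(pseudo, treated, score)
```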

In selected embodiments, the process may obtain the maximum gain in variance reduction by fitting a survival model P to provide an estimate of the conditional survival distribution for each test subject i. In several embodiments, the estimate of the conditional survival distribution may be expressed as:

[equation not reproduced]

In many embodiments, the process may reduce the sample size of the trial by estimating the correlation between $c_i$ and the pseudo-values $\hat{\theta}_i$. In various embodiments, the estimate of this correlation may be based on a test dataset in the external data and the expected treatment effect in the target trial, where the correlation may be estimated based on the similarity between the external data and the target trial. The estimated correlation can be reduced if the outcomes expected in the target trial differ from those in the external data. In some embodiments, the estimated correlation may be used for sample size calculation at the trial design stage. In many embodiments, the process maintains type I error control and generates an unbiased estimate of the treatment effect.

Fig. 4 illustrates an example of a network on which the above-described processes may be implemented in some embodiments of the invention. In many embodiments, the network 400 includes a communication network 460. The communication network 460 may be a network such as the internet that allows devices connected to the network 460 to communicate with other connected devices. In various embodiments, server systems 440 and 470 may be connected to network 460. According to various embodiments of the invention, each of server systems 440 and 470 may be a set of one or more servers communicatively connected to each other via an internal network, which perform a process of providing cloud services to users over network 460. For purposes of this discussion, a cloud service is one or more applications that are executed by one or more server systems to provide data and/or executable applications to devices over a network.

Server systems 440 and 470 are shown with three servers each in the internal network. However, server systems 440 and 470 may include any number of servers, and any additional number of server systems may be connected to network 460 to provide cloud services. In some embodiments, there may be only a single server 410 connected to the network 460 to provide services to users. According to various embodiments of the present invention, a computing system using a system and method for estimating therapeutic effects in a random control trial according to embodiments of the present invention may be provided by processes executing on a single server system and/or a set of server systems communicating over a network 460.

A user may use a personal device 480 connected to the network 460 to perform a process of estimating the effect of a treatment in a randomized controlled trial according to various embodiments of the invention. In the illustrated embodiment, personal device 480 is shown as a desktop computer connected to network 460 via a conventional "wired" connection. However, personal device 480 may be a desktop computer, a notebook computer, a smart television, an entertainment game console, or any other device that connects to network 460 via a "wired" connection. The mobile device 420 may connect to the network 460 using a wireless connection. The wireless connection may be a connection to the network 460 using Radio Frequency (RF) signals, infrared signals, or any other form of wireless signaling. In the illustrated example, mobile device 420 is a mobile phone. However, mobile device 420 may be a mobile handset, a Personal Digital Assistant (PDA), a tablet, a smart phone, or any other type of device connected to network 460 via a wireless connection without departing from this invention.

FIG. 5 illustrates a computing system on which the above-described processes may be implemented in some embodiments of the invention. The treatment effect estimation element 500 includes a network interface 530 that may receive external data, and a memory 540 in which the external data is stored as external data 544. Processor 510 may execute treatment effect estimation application 542 to estimate the treatment effect in a randomized controlled trial in accordance with several embodiments of the invention. Those skilled in the art will recognize that a computing system may exclude certain components and/or include other components omitted for brevity without departing from the invention.

In many embodiments, processor 510 may include a processor, microprocessor, controller, or combination of processors, microprocessors, and/or controllers that execute instructions stored in memory 540 to manipulate test data stored in memory. The processor instructions may configure the processor 510 to perform processes in accordance with certain embodiments of the invention. In various embodiments, processor instructions may be stored on a non-transitory machine-readable medium.

Although a specific example of a treatment effect estimation element 500 is illustrated in this figure, any of a variety of treatment effect estimation elements may be utilized to perform processes for estimating a treatment effect in an RCT similar to those described herein, as appropriate to the requirements of a specific application according to an embodiment of the invention.

An example of an estimation application that executes instructions to estimate the effect of a treatment in a randomized controlled trial according to an embodiment of the invention is illustrated in FIG. 6. In several embodiments, the estimation application 600 may include an estimator 602, a stratification engine 604, and a pseudo-value regression engine 606. According to various embodiments of the invention, the estimator 602 may be used to estimate the effect of treatment in a randomized controlled trial. In several embodiments, the stratification engine 604 may be used to stratify test subjects in order to estimate the therapeutic effect on a binary outcome. In some embodiments, the pseudo-value regression engine 606 may be used to estimate the TTE treatment effect for the test subjects.
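As a non-limiting illustration, the stratified estimation of a binary-outcome treatment effect might be sketched as follows. The sketch assumes that outcome predictions (prognostic scores) have already been produced by a prognostic model trained on external data, that strata are defined by quantile bins of the score, and that the per-stratum risk differences are combined with stratum-size weights. The function name, the quantile binning, and the weighting scheme are assumptions introduced for illustration and are not taken from the embodiments described herein.

```python
import numpy as np


def stratified_risk_difference(score, treated, outcome, n_strata=4):
    """Stratum-size-weighted risk difference across prognostic-score strata.

    score   : predicted outcome probability per subject (hypothetical input)
    treated : 1 for the treatment arm, 0 for the control arm
    outcome : observed binary outcome (0 or 1)
    """
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Stratification variable: quantile bin of the prognostic score.
    inner_edges = np.quantile(score, np.linspace(0.0, 1.0, n_strata + 1)[1:-1])
    stratum = np.searchsorted(inner_edges, score, side="right")

    weighted_sum, total_weight = 0.0, 0
    for s in np.unique(stratum):
        in_treated = (stratum == s) & treated
        in_control = (stratum == s) & ~treated
        n_t, n_c = int(in_treated.sum()), int(in_control.sum())
        if n_t == 0 or n_c == 0:
            continue  # a stratum with an empty arm offers no within-stratum contrast
        diff = outcome[in_treated].mean() - outcome[in_control].mean()
        weighted_sum += (n_t + n_c) * diff
        total_weight += n_t + n_c

    if total_weight == 0:
        raise ValueError("no stratum contains subjects from both arms")
    return weighted_sum / total_weight


# Toy data: two clearly separated prognostic groups, two subjects per arm in each.
example = stratified_risk_difference(
    score=[0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9],
    treated=[1, 1, 0, 0, 1, 1, 0, 0],
    outcome=[1, 0, 0, 0, 1, 1, 1, 0],
    n_strata=2,
)
```

Other combination rules, such as Mantel-Haenszel weights, could be substituted for the stratum-size weights without changing the overall structure of the sketch.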

Although specific examples of treatment effect estimation applications are illustrated in this figure, any of a variety of treatment effect estimation applications may be utilized to perform processes for estimating a treatment effect in an RCT similar to those described herein, as appropriate to the requirements of a specific application according to an embodiment of the invention.
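As a non-limiting illustration, pseudo-value regression for a time-to-event endpoint might be sketched as follows for the restricted mean survival time (RMST). The sketch computes the RMST as the area under a Kaplan-Meier curve up to a horizon `tau`, forms leave-one-out (jackknife) pseudo-values, and regresses them on the treatment indicator and a prognostic score by ordinary least squares. The function names, the identity link, and the use of ordinary least squares are assumptions introduced for illustration; variance estimation is omitted.

```python
import numpy as np


def km_rmst(time, event, tau):
    """Area under the Kaplan-Meier curve on [0, tau] (the RMST estimate)."""
    surv, prev_t, area = 1.0, 0.0, 0.0
    # Step through the distinct event times that fall before the horizon.
    for t in np.unique(time[(event == 1) & (time < tau)]):
        area += surv * (t - prev_t)
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & (event == 1))
        surv *= 1.0 - deaths / at_risk
        prev_t = t
    return area + surv * (tau - prev_t)


def rmst_pseudo_values(time, event, tau):
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    n = len(time)
    full = km_rmst(time, event, tau)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False
        pseudo[i] = n * full - (n - 1) * km_rmst(time[keep], event[keep], tau)
        keep[i] = True
    return pseudo


def pseudo_value_regression(time, event, treated, score, tau):
    """OLS fit of pseudo-values on treatment and prognostic score.

    Returns [intercept, treatment effect on RMST, prognostic-score slope].
    """
    pseudo = rmst_pseudo_values(time, event, tau)
    design = np.column_stack([
        np.ones(len(pseudo)),
        np.asarray(treated, dtype=float),
        np.asarray(score, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(design, pseudo, rcond=None)
    return coef


# Toy data without censoring; the prognostic score is balanced across arms.
coef = pseudo_value_regression(
    time=[1, 2, 3, 4, 5, 6],
    event=[1, 1, 1, 1, 1, 1],
    treated=[0, 0, 0, 1, 1, 1],
    score=[0.1, 0.2, 0.3, 0.1, 0.2, 0.3],
    tau=4.0,
)
```

When no observations are censored, each pseudo-value reduces to the subject's event time truncated at `tau`, which gives a simple check on the sketch; with censoring, the pseudo-values stand in for the unobserved truncated times.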

Although specific methods of estimating therapeutic effects in an RCT are discussed above, many different design methods may be implemented according to many different embodiments of the invention. It is therefore to be understood that the invention may be practiced otherwise than as specifically described without departing from the scope or spirit of the invention. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive. Thus, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their equivalents.

Claims

1. A method for estimating the effect of a treatment in a randomized controlled trial, the method comprising:

receiving external data from a prior randomized clinical trial;

generating a set of one or more subject characteristics for a plurality of test subjects;

estimating a binary outcome of the test subject using a stratification process; and

estimating a time-to-event (TTE) treatment effect for the test subjects using pseudo-value regression.

2. The method of claim 1, wherein estimating the binary outcome of the test subject using the stratification process comprises:

training a prognostic model using the received external data;

generating an outcome prediction for the test subject using the prognostic model;

defining variables for stratification of the test subject based on the outcome prediction;

stratifying all test subjects into a plurality of strata according to the variables; and

estimating treatment outcomes of the test subjects in all strata.

3. The method of claim 1, wherein estimating the TTE treatment effect of the test subject using pseudo-value regression comprises:

training a prognostic model using the received external data;

generating a prognostic score for the test subject using the prognostic model and the generated subject characteristics of the test subject; and

estimating the TTE treatment effect of the test subjects using a pseudo-value regression model and the prognostic score.

4. The method of claim 1, wherein the set of one or more characteristics of the plurality of test subjects comprises a baseline covariate of the test subjects and a treatment assignment of the test subjects.

5. The method of claim 2, wherein the prognostic model is a generative model.

6. The method of claim 2, wherein the prognostic model is a generalized linear model.

7. The method of claim 3, wherein the prognostic model is a simple rule-based model.

8. The method of claim 3, wherein the prognostic model is a model-based generative machine learning model.

9. The method of claim 3, wherein estimating the TTE treatment effect comprises estimating a restricted mean survival time (RMST) of the test subject.

10. The method of claim 1, further comprising designing a clinical study based on the estimated therapeutic effect.

11. A non-transitory machine-readable medium comprising processor instructions for estimating a therapeutic effect in a randomized controlled trial, wherein execution of the instructions by a processor causes the processor to perform a process comprising:

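receiving external data from a prior randomized clinical trial;

generating a set of one or more subject characteristics for a plurality of test subjects;

estimating a binary treatment outcome for the test subject using a stratification process; and

estimating a time-to-event (TTE) treatment effect for the test subjects using pseudo-value regression.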

12. The non-transitory machine readable medium of claim 11, wherein estimating the binary outcome of the test subject using the stratification process comprises:

training a prognostic model using the received external data;

generating an outcome prediction for the test subject using the prognostic model;

defining variables for stratification of the test subject based on the outcome prediction;

stratifying all test subjects into a plurality of strata according to the variables; and

estimating treatment outcomes of the test subjects in all strata.

13. The non-transitory machine readable medium of claim 11, wherein estimating the TTE treatment effect of the subject using pseudo-value regression comprises:

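training a prognostic model using the received external data;

generating a prognostic score for the test subject using the prognostic model and the generated subject characteristics of the test subject; and

estimating the TTE treatment effect of the test subjects using a pseudo-value regression model and the prognostic score.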

14. The non-transitory machine readable medium of claim 11, wherein the set of one or more characteristics of the plurality of test subjects comprises a baseline covariate of the test subjects and a treatment assignment of the test subjects.

15. The non-transitory machine readable medium of claim 12, wherein the prognostic model is a generative model.

16. The non-transitory machine readable medium of claim 12, wherein the prognostic model is a generalized linear model.

17. The non-transitory machine readable medium of claim 13, wherein the prognostic model is a simple rule-based model.

18. The non-transitory machine readable medium of claim 13, wherein the prognostic model is a model-based generative machine learning model.

19. The non-transitory machine readable medium of claim 13, wherein estimating the TTE treatment effect comprises estimating a restricted mean survival time (RMST) of the test subject.

20. The non-transitory machine-readable medium of claim 11, further comprising designing a clinical study based on the estimated therapeutic effect.