WO2017036482A1 - A cpn-based tool for the stratification of illness severity in patients suspected of sepsis - Google Patents

A CPN-based tool for the stratification of illness severity in patients suspected of sepsis

Info

Publication number
WO2017036482A1
WO2017036482A1 (PCT/DK2016/050288)
Authority
WO
WIPO (PCT)
Prior art keywords
sepsis
cpn
probability
patients
patient
Prior art date
Application number
PCT/DK2016/050288
Other languages
French (fr)
Inventor
Logan WARD
Steen Andreassen
Original Assignee
Aalborg Universitet
Priority date
Filing date
Publication date
Application filed by Aalborg Universitet filed Critical Aalborg Universitet
Publication of WO2017036482A1 publication Critical patent/WO2017036482A1/en


Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/04: Inference or reasoning models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 7/00: Computing arrangements based on specific mathematical models
    • G06N 7/01: Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present invention in some embodiments thereof, relates generally to the identification of patterns and markers associated with certain clinical outcomes in patients suspected of infection, and the methods of using such patterns in triage, diagnosis and prognosis of infection.
  • SOFA Sepsis-related Organ Failure Assessment
  • PBS Pitt Bacteraemia Score
  • the present inventors propose a method for combining a set of measurements and/or observations recorded for a patient suspected of infection, and evaluating the probability of a given clinical outcome.
  • Embodiments of the invention differentiate themselves from the references by a much smaller parameter set (e.g., on the order of 10x smaller), which may allow for easy retuning and recalibration of the underlying model. Additionally, the model is differentiated by its complex statistical model characterised by intermediate "unobservable" variables and tolerance of missing values.
  • the various embodiments of the invention address limitations of current diagnostic and triage practices by suggesting which diagnostic tests are cost-effective, which courses of treatment to take etc. Different aspects of the present invention may each be combined with any of the other aspects.
  • a CPN-based tool for the stratification of illness severity in patients suspected of sepsis.
  • FIG. 1 Schematic model of the Bayesian Network implemented in an embodiment of the invention.
  • FIG 2. Possible embodiment of the invention, patient information is fed from hospital information systems (HIS) (1) to the model (3) which then calculates the probability of clinical outcomes and writes this back into the HIS where it is accessible to the treating clinician (2).
  • FIG 3. Potential use of the Bayesian Network implemented in an embodiment of the invention.
  • FIG 4. ROC curve for the prediction of 30-day mortality according to an embodiment of the invention.
  • FIG 5. ROC curve for the prediction of positive blood culture according to an embodiment of the invention.
  • FIG 6. Flow-chart of the model-development process according to an embodiment of the invention
  • FIG 7. Schematic view of an embodiment of the invention identifying where automatic learning is to take place (1, 2)
  • FIG 8. An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in an embodiment of the invention.
  • FIG 9. ROC curves for the prediction of 30-day mortality according to embodiment of the invention, and a published precursor to the invention
  • FIG 10. Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality according to an embodiment of the invention.
  • FIG 11. Regression lines for the observed vs. predicted events according to an embodiment of the invention.
  • FIG 12. An example of an application of an embodiment of the invention.
  • FIG 13. Flow-chart for the calculation of incremental cost-effectiveness of an application of an embodiment of the invention.
  • FIG 14. ROC curve for the prediction of positive BC according to an embodiment of the invention.
  • FIG 15. Boxplot for the cost-effectiveness of an application of an embodiment of the invention.
  • FIG 16. Flow-chart of the model-development process according to an embodiment of the invention
  • FIG 17. Graph showing the CPN structure of an embodiment of the invention.
  • FIG 18. An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in an embodiment of the invention.
  • FIG 19. An example of an embodiment of the invention's ability to distinguish between infectious and non-infectious inflammation.
  • FIG 20 ROC curves for the prediction of 30-day mortality according to a published precursor to the invention, and two possible embodiments of the invention.
  • FIG 21 Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality according to an embodiment of the invention.
  • FIG 22 Regression lines for the observed vs. predicted events according to an embodiment of the invention.
  • FIG 23 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for an embodiment of the invention and the SIRS score.
  • FIG 24 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for an embodiment of the invention and the mREMS score.
  • FIG 25 ROC curves for the prediction of bacteraemia according to an embodiment of the invention, for three patient cohorts.
  • Embodiments of the invention use a so-called Bayesian Network, also referred to as a causal probabilistic network (CPN), to model the host response to infection.
  • a CPN comprises a set of parameters or nodes that may be observable or unobservable (latent nodes) and directed links between the nodes which encode the causal stochastic relationships between the nodes.
  • the CPN used in the invention is described in FIG 1.
  • Sepsis, or a non-infectious inflammatory condition, causes changes in a set of known inflammatory mediators, which then cause observable changes in a set of infection variables indicative of the patient's physical condition: modification of vital signs such as heart rate, respiratory rate, blood pressure etc., and changes in the blood chemistry: lactate, creatinine, C-Reactive Protein (CRP), procalcitonin (PCT), leukocytes, neutrophils etc.
  • CRP C-Reactive Protein
  • PCT procalcitonin
  • observing these infection variables allows inferences to be made about the cause (sepsis/other) and severity of the patient's immune response.
  • the resulting inference is then used to determine a composite risk profile: "Risks" in FIG 1.
  • the risk profile is used to determine the probability of a given clinical outcome, "Prognosis": this could be, for example, a positive blood culture or 30-day mortality.
  • the infection variables may be quantitative (numerical variables) or qualitative (categorical variables).
  • Quantitative variables may be used in their raw form, or may be transformed and/or combined using mathematical functions, e.g. natural log, ratios of variables etc.
  • Qualitative variables have a fixed number of categories or states.
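  • As an illustration of the structure just described, the following minimal Python sketch shows a single latent severity node, two conditionally independent observable infection variables, and posterior inference by enumeration. The node names, states and probabilities are invented for illustration only; this is not the Sepsis CPN itself. A missing observation is simply left out of the product, which is how such a model tolerates missing values.

    # Minimal illustrative CPN fragment (hypothetical numbers, not the Sepsis CPN).
    # One latent node "severity" with discrete states, two observable variables.

    PRIOR = {"no": 0.50, "mild": 0.30, "severe": 0.20}        # P(severity)

    # Conditional probability tables P(observation | severity).
    CPT = {
        "fever":    {"no": {"yes": 0.10, "no": 0.90},
                     "mild": {"yes": 0.50, "no": 0.50},
                     "severe": {"yes": 0.85, "no": 0.15}},
        "high_CRP": {"no": {"yes": 0.15, "no": 0.85},
                     "mild": {"yes": 0.60, "no": 0.40},
                     "severe": {"yes": 0.90, "no": 0.10}},
    }

    def posterior(evidence):
        """P(severity | evidence); unobserved variables are simply omitted."""
        joint = {}
        for state, prior in PRIOR.items():
            p = prior
            for var, value in evidence.items():
                p *= CPT[var][state][value]
            joint[state] = p
        z = sum(joint.values())
        return {state: p / z for state, p in joint.items()}

    if __name__ == "__main__":
        # CRP result missing: the model still produces a (less certain) posterior.
        print(posterior({"fever": "yes"}))
        print(posterior({"fever": "yes", "high_CRP": "yes"}))
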
  • the model is tuned through a machine learning process known as expectation maximisation (EM) learning.
  • EM expectation maximisation
  • Other machine learning techniques could also be used, for example: Dirichlet learning.
  • the model is tuned from >5000 patient cases. Patients were included in the learning dataset if they were suspected of infection.
  • FIG 2 describes a potential use-case of the invention. Coupling the model to a hospital information system (HIS) means that probability calculations for clinical outcomes are readily available to clinical staff as the infection variables are entered into the HIS by triage nurses, laboratory staff etc.
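  • A minimal, self-contained sketch of this loop is given below; the in-memory "HIS" and the trivial placeholder model are hypothetical stand-ins for a real HIS interface and the CPN evaluation.

    # Toy stand-in for the FIG 2 loop: infection variables are read from the HIS,
    # the model is evaluated, and the resulting probabilities are written back so
    # the treating clinician can see them.

    class InMemoryHIS:
        def __init__(self, records):
            self.records = records          # patient_id -> dict of infection variables
            self.results = {}               # patient_id -> dict of probabilities

        def fetch_new_observations(self):
            return self.records.items()

        def write_back(self, patient_id, probabilities):
            self.results[patient_id] = probabilities

    def toy_model(observations):
        # Placeholder: a real deployment would evaluate the CPN here.
        score = 0.05 + 0.1 * sum(1 for v in observations.values() if v is not None)
        return {"p_mortality_30d": min(score, 0.95)}

    def process_pending(his, model):
        for patient_id, observations in his.fetch_new_observations():
            his.write_back(patient_id, model(observations))

    if __name__ == "__main__":
        his = InMemoryHIS({"pt-1": {"heart_rate": 112, "lactate": 3.1, "CRP": None}})
        process_pending(his, toy_model)
        print(his.results)
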
  • FIG 3 illustrates potential use of the Bayesian Network implemented in an embodiment of the invention. The input parameters are entered into the model (1.), which then propagates the evidence throughout the model (2.), allowing the probability of the given outcome to be read off (3.). This describes what happens during step 2 in FIG 2.
  • EXAMPLES
  • Example A1 details the model development process and the results that can be achieved in using the model as a standalone entity for the prediction of 30-day mortality.
  • Example A2 includes a cost-benefit analysis for another embodiment of the invention where the model is used to predict the probability of a positive PCR test or positive blood culture. Examples A1 and A2 are included below.
  • Example A3 (representative of a paper) also describes an exemplary embodiment of the invention.
  • Example A3 builds on the model development process introduced in Example A1, explaining how to incorporate the influence of an external risk factor (in this case, patient age).
  • the results describe the performance of the model with respect to that described in Example A1 (FIG 20), and also in comparison to other existing clinical scoring algorithms (FIG. 23, FIG. 24).
  • FIG 4 and FIG 5 represent ROC curves for the prediction of 30-day mortality and bacteremia for an exemplary embodiment of the invention.
  • the area under the ROC curve represents an assessment of the ability of the model to discriminate between positive and negative cases; between patients alive or dead after 30 days (FIG 4), or patients with or without bacteremia (FIG 5).
  • Table I presents the area under the ROC curve for FIG 4, along with its standard error and a 95% confidence interval.
  • the asymptotic significance referred to in Table I is whether the area under the curve is significantly different from 0.5, which represents no discriminatory power. In this case, p < 0.0005, which means that there is a significant difference.
  • test result variable(s): LSepsis_site_age has at least one tie between the positive actual state group and the negative actual state group.
  • Table II presents the area under the ROC curve for FIG 5, along with its standard error and a 95% confidence interval.
  • the asymptotic significance referred to in Table II is whether the area under the curve is significantly different from 0.5, which represents no discriminatory power. In this case, p < 0.0005, which means that there is a significant difference.
  • test result variable(s): Pbact_SF21MAP has at least one tie between the positive actual state group and the negative actual state group.
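  • For readers wishing to reproduce the kind of statistics reported in Tables I and II, the sketch below (using numpy/scipy) computes the area under the ROC curve together with the Hanley-McNeil standard error, a normal-approximation 95% confidence interval, and the asymptotic p-value against AUC = 0.5. The data are simulated; the published tables were of course computed on the patient cohort.

    import numpy as np
    from scipy.stats import norm

    def auc_mann_whitney(scores_pos, scores_neg):
        """AUC as the Mann-Whitney statistic (ties counted as 0.5)."""
        sp = np.asarray(scores_pos)[:, None]
        sn = np.asarray(scores_neg)[None, :]
        return (sp > sn).mean() + 0.5 * (sp == sn).mean()

    def hanley_mcneil_se(auc, n_pos, n_neg):
        """Standard error of the AUC (Hanley & McNeil, 1982)."""
        q1 = auc / (2 - auc)
        q2 = 2 * auc ** 2 / (1 + auc)
        var = (auc * (1 - auc)
               + (n_pos - 1) * (q1 - auc ** 2)
               + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
        return np.sqrt(var)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        pos = rng.normal(1.0, 1.0, 300)    # model scores for e.g. non-survivors
        neg = rng.normal(0.0, 1.0, 2500)   # model scores for survivors
        auc = auc_mann_whitney(pos, neg)
        se = hanley_mcneil_se(auc, len(pos), len(neg))
        ci = (auc - 1.96 * se, auc + 1.96 * se)
        p = 2 * norm.sf(abs(auc - 0.5) / se)   # asymptotic test of AUC = 0.5
        print(f"AUC={auc:.3f}  SE={se:.3f}  95% CI=({ci[0]:.3f}, {ci[1]:.3f})  p={p:.2g}")
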
  • Bayesian networks are a set of probabilistic models and can be used to create diagnostic models for diseases (25-28). These models can also provide advice on treatment selection, provided they are accompanied by decision theory and utility functions (29-31).
  • a Bayesian network can be represented graphically by a set of nodes, linked together by arrows.
  • the nodes themselves represent stochastic variables.
  • the arrows represent causal relationships between the variables, a requirement for the network to provide plausible reasoning (32), and the reason they are also referred to as Causal Probabilistic Networks or CPNs (33).
  • CPN Causal Probabilistic Networks
  • a CPN consists of a set of conditional probability tables defining the relationships between a node and its parent(s).
  • the task of constructing a CPN therefore consists of specifying the graphical structure and the set of associated conditional probabilities.
  • Nodes are not limited to representing observable events such as blood pressure or temperature measurements, but can also represent latent concepts such as diagnoses or prognoses which are not observed, but still of interest.
  • the CPN is used to update the probability distributions for the unobserved variables when evidence is inserted into the CPN.
  • CPNs are ideal models for the fusion of data and knowledge, which may be represented by patient databases and the combination of expert opinion and reports in the scientific literature, respectively. Any or all of these sources of evidence may be used in the construction of a CPN. Throughout the construction process, the conditional probabilities themselves may be considered stochastic variables. The value of the semi-formal approach of using knowledge to assign a priori distributions has been demonstrated empirically through the success of the Treat decision support system (34,35). Treat aids in decision-making regarding diagnosis and optimal treatment of acute infections.
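  • One minimal way to treat a conditional probability as a stochastic variable during construction is to give it a Beta prior (a Dirichlet prior, for more than two states) expressing expert knowledge, and then update it with case counts from a patient database. The numbers in the sketch below are invented purely to show the mechanics of this knowledge/data fusion.

    # Knowledge/data fusion for one entry of a conditional probability table.
    # Expert prior: P(fever | sepsis) believed to be around 0.8, encoded as Beta(8, 2).
    # Data: 140 of 180 septic patients in a (hypothetical) database had fever.

    prior_a, prior_b = 8.0, 2.0           # expert knowledge as pseudo-counts
    fever_yes, fever_no = 140, 40         # observed counts

    post_a = prior_a + fever_yes
    post_b = prior_b + fever_no

    mean = post_a / (post_a + post_b)     # posterior point estimate used in the CPT
    print(f"P(fever | sepsis) = {mean:.3f}  (Beta({post_a:.0f}, {post_b:.0f}))")
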
  • the CPN model of Treat is large with close to 6000 nodes. The severity of a patient's illness is assessed by a small section of the model, approximately 40 nodes.
  • FIG 6 presents a framework for the development of this network, referred to as the "Sepsis CPN".
  • the individual phases are described in the literature; the initial specification of the model (FIG 6, phase I), where all observable nodes were discrete stochastic variables (31), known as the Discrete Sepsis CPN (D-Sepsis CPN), and the subsequent development of a model with continuous variables (FIG 6, phase II), the Continuous Sepsis CPN (C-Sepsis CPN) (36).
  • D-Sepsis CPN Discrete Sepsis CPN (discrete stochastic variables)
  • C-Sepsis CPN Continuous Sepsis CPN (FIG 6, phase II)
  • Although the conversion to continuous variables was able to solve some of the shortcomings of the discretization in the D-Sepsis model, the model still requires tuning.
  • the C-Sepsis CPN has been tuned manually, using a combination of knowledge gleaned from the literature and expert opinion, however this process is limited in terms of what can be reasonably achieved.
  • FIG. 6 Sepsis CPN development framework.
  • Phase I describes the development of the discrete sepsis CPN (D-Sepsis CPN), phase II the continuous sepsis CPN (C-Sepsis CPN) and phase III the development of the learned sepsis CPN (L-Sepsis CPN) through formal learning methods - the subject of this paper, such as this Example A1.
  • the C-Sepsis CPN can be further improved by supplementing the manual methods used in its development with machine learning from case databases. In this case, we take the sub-network of the C-Sepsis CPN that does not include respiratory parameters.
  • NSIRS and sepsis represent two syndromes, the severity of which we describe using five states: no, mild, moderate, severe and critical. These states can also be thought of as the degree of activation of the immune system. Each of these severities is associated with a mortality rate.
  • the NSIRS and sepsis nodes are linked to the infection variables, which we describe with individual parameter distributions, through a set of factor nodes. The specific structure of the sepsis CPN is described in the literature (31,34,36).
  • FIG. 7 Schematic view of the sepsis CPN identifying where automatic learning is to take place.
  • a 10-fold cross-validation is performed as an internal validation in order to ensure that the learning method is robust.
  • the learned network is assessed for its discriminative ability using the area under the receiver operating characteristic (ROC) curve.
  • the performance of the L-Sepsis CPN is compared to that for Treat with the C-Sepsis CPN.
  • Calibration of the full learned model is assessed using the Hosmer- Lemeshow statistic and calibration curve.
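  • The Hosmer-Lemeshow check can be computed as sketched below: cases are grouped into deciles of predicted risk and the chi-square statistic is evaluated with g-2 degrees of freedom. The data here are simulated; the published curves and statistics were computed on the patient cohort.

    import numpy as np
    from scipy.stats import chi2

    def hosmer_lemeshow(y_true, y_prob, groups=10):
        """Hosmer-Lemeshow goodness-of-fit statistic and p-value (g-2 df)."""
        order = np.argsort(y_prob)
        y_true = np.asarray(y_true)[order]
        y_prob = np.asarray(y_prob)[order]
        stat = 0.0
        for idx in np.array_split(np.arange(len(y_prob)), groups):
            obs = y_true[idx].sum()          # observed events in the risk decile
            exp = y_prob[idx].sum()          # expected events in the risk decile
            n = len(idx)
            pi = exp / n
            stat += (obs - exp) ** 2 / (n * pi * (1 - pi))
        return stat, chi2.sf(stat, groups - 2)

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        probs = rng.uniform(0.01, 0.6, 2855)
        deaths = rng.random(2855) < probs    # simulated, perfectly calibrated outcomes
        print(hosmer_lemeshow(deaths, probs))
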
  • FIG 7 presents a schematic for the sepsis CPN, removed from the Treat model for learning purposes.
  • the individual parameter distributions are the continuous probability distributions defined for the infection variables in the C-Sepsis CPN, with modifications made where necessary to accommodate the requirements of the chosen learning method. These accommodations include, for example, shifting distributions or introducing additional distributions to better describe the physiological range of a given parameter.
  • the learning process is divided into two stages. Stage 1 involves learning composite distributions for each of the variables for each sepsis severity state. Stage 2 involves learning the conditional probability tables for the factors linking the severity of illness for patients without infection to the intermediate factors (FIG 7). Table III presents descriptive statistics for the data used in the learning process, using the final diagnosis to split the dataset into patients with infection, and patients without infection. The final diagnosis was taken as that recorded in the patient file on discharge or death. Patient data were collected during trials and/or studies of the Treat system at Beilinson Hospital, Petah Tikva, Israel including 1695 patients from April - November 2004, and 1894 patients from December 2008 - April 2011. Of these 3589 patients, 2855 had a confirmed infectious or non-infectious diagnosis.
  • Table III Descriptive statistics for the data used in learning steps 1 and 2, groups defined according to final diagnosis
  • Creatinine [mg/dl] 96.4 0.9 [0.1-11.3] 97.0 0.9 [0.1-7.7] 0.72 Albumin [g/l] 32.9 35 [1-82] 56.3 35 [4-49] 0.94
  • the data used for stage 1 of the learning process are a set of patient cases where the patient had a final diagnosis of infection: we did not limit the data set to patients with sepsis as it is possible to have a patient with a local infection that does not have sepsis.
  • Each patient case included all available measurements/recordings of the individual parameters on hospital admission and whether the patient had recently undergone chemotherapy, in addition to information on the 30-day mortality.
  • the learning data were split into ten sets of data for cross-validation; data for stage 1 and stage 2 (infection and no infection) were assigned to the ten sets separately.
  • the ten learning sets for stage 1 totalled 2514 cases, while those for stage 2 totalled 371 patient cases where the final diagnosis was "no infection”.
  • For each cross-validation step one of the ten sets of patient data was set aside as a validation set while the other nine sets were used for learning. Following cross-validation, the L-Sepsis CPN was learned using the complete data set.
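  • The split described above (ten sets, with the infection and no-infection cases assigned separately so that each fold preserves the case mix) can be reproduced along the following lines; the case identifiers below are placeholders.

    import random

    def ten_fold_split(infected_cases, non_infected_cases, folds=10, seed=42):
        """Assign infection and no-infection cases to folds separately."""
        rng = random.Random(seed)
        assignment = {}
        for cases in (infected_cases, non_infected_cases):
            shuffled = list(cases)
            rng.shuffle(shuffled)
            for i, case_id in enumerate(shuffled):
                assignment[case_id] = i % folds
        return assignment

    if __name__ == "__main__":
        folds = ten_fold_split([f"inf_{i}" for i in range(2514)],
                               [f"noinf_{i}" for i in range(371)])
        # Fold 0 used for validation, folds 1-9 for learning, then rotate.
        validation = [c for c, f in folds.items() if f == 0]
        print(len(validation))
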
  • An example of the composite distributions learned in stage 1 is presented in FIG 8. The composite distribution for serum albumin concentration is shown along with the C-Sepsis CPN Gaussian distributions from which they were constructed.
  • FIG 8 An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in the sepsis CPN: albumin.
  • Each curve in panel B is constructed from a linear combination of the curves in panel A. Following EM-learning, a similar set of composite distributions could be drawn for each of the continuous variables.
  • 3.3 Predictive Performance
  • the final step in this phase of the development process was the internal validation of the L-Sepsis CPN's ability to predict 30-day mortality.
  • FIG 9 presents ROC curves for the L-Sepsis and C-Sepsis CPNs using the 2855 patient cases used in the learning step.
  • the reference line represents the line of no discrimination.
  • the AUC for the C-Sepsis CPN was 0.647 (95% CI 0.616-0.678) and the difference between the AUC for the L-Sepsis and C-Sepsis CPN was statistically significant (p < 0.001).
  • FIG 9 ROC curves for the prediction of 30-day mortality for the L-Sepsis and C-Sepsis CPNs
  • FIG 10 Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality using the L-Sepsis CPN.
  • FIG 11 presents regression lines drawn relating the predictions and observations for the two sub-groups along with the full dataset.
  • FIG 11 Regression lines for the observed vs. predicted events for all 2855 patients (solid), 697 patients with a lower respiratory tract infection (dotted) and 486 patients with a urinary tract infection (dashed)
  • the sub-group curves for LRT- and UTI infections show an interesting trend: for a given severity of the immune response, a lower respiratory tract infection is more likely to lead to death.
  • the L-Sepsis CPN is not well calibrated according to the Hosmer-Lemeshow statistic, while all of the models in the cross-validation were statistically well calibrated. However, the predictions and observations appear to match well visually for the L-Sepsis CPN.
  • the Hosmer-Lemeshow statistic is known to be very sensitive to sample size (44), which is one explanation of why the apparently better (visually) calibrated L-Sepsis CPN gives a significant test result, while the individual folds of the cross-validation give non-significant results with one tenth of the number of cases.
  • the model discriminates well between cases in terms of 30-day mortality, although we expect further improvements are possible.
  • the sub-group analysis of infection sites points towards the existence of confounding variables not accounted for by our model. This warrants further investigation into possible confounders, possibly including age, site of infection, and presence of other comorbidities, among others.
  • Re-integrating the CPN into Treat presents the opportunity to account for several such factors, which will be required to improve the discriminatory ability of the model to a point where it can be applied in clinical practice.
  • the goal of constructing the L-Sepsis CPN was to tune its predictive performance while retaining the advantages of the C-Sepsis CPN over the D-Sepsis CPN.
  • the retention of continuous distributions, even though we now use composite curves, has meant that we continue to avoid the "jumps" between diagnoses that we saw with the D-Sepsis CPN.
  • the overwhelmingly strong odds ratios seen occasionally in the C-Sepsis CPN are now also avoided.
  • the internal validation of the L-Sepsis CPN suggests that supplementing a manually constructed CPN with machine learning can improve predictive performance, with the L-Sepsis CPN showing a significant improvement in the discriminatory performance (p < 0.001) for 30-day mortality compared with the C-Sepsis CPN.
  • a cost-effectiveness analysis compared two diagnostic strategies for bacteremia: 1) direct multiplex real-time PCR for all patients presenting with suspected sepsis from whom blood cultures are drawn, and 2) stratifying the patients according to the risk of bacteremia by a previously developed computerized decision-support system, performing the PCR only for a high-risk group defined by a threshold.
  • the strategies were compared by calculating the incremental cost-effectiveness ratio (ICER) over standard care (blood cultures without PCR) in terms of euros (€) per life-year (LY) of the two strategies. Cost-effectiveness was explored for a range of thresholds. The sensitivity to additional parameters involved in the ICER calculation was assessed by Monte Carlo analysis.
  • the ICER of PCR when performed for all patients was 16,774 €/LY.
  • a threshold of 11.75% defined a low-risk group comprising 63.2% of the patients, where the ICER reached the NICE cost-effectiveness threshold of 35,000 €/LY.
  • Eliminating PCR for these patients, the ICER for 36.8% of patients in the high-risk group was 8,538 €/LY.
  • the ICER could be further reduced to 4281 €/LY by choosing a threshold of 25%. This limited testing to a high-risk group comprising 7.8% of all patients.
  • Rapid molecular diagnostics such as multiplex real-time polymerase chain reaction (PCR) provide a rapid alternative to blood culture (BC).
  • PCR can provide a positive test result in 6 hours as opposed to 24-48 hours (45,46).
  • Other new methods may be able to match or exceed the speed of PCR (47).
  • rapid identification of the causative pathogen is vital in ensuring that early, appropriate antimicrobial treatment can be given, saving lives, bed days and additional costs due to unnecessary treatment (48,49).
  • the target population consisted of all patients suspected of infection at Beilinson Hospital, a primary and tertiary care hospital in Israel. Both community acquired and health care associated infections were considered.
  • In the first diagnostic strategy, direct PCR testing of blood is performed for all patients with suspected sepsis from whom blood cultures were drawn.
  • risk-assessment is performed by using an algorithm (52), based on a previously developed computerized decision support system (TREAT) (12,31,34,35), to estimate the probability (pBC+) of bacteremia.
  • TREAT computerized decision support system
  • TBC+ threshold value
  • FIG 12 The two diagnostic strategies: 1) Direct PCR testing of blood is performed in addition to BC for all patients; 2) Risk-assessment strategy, where PCR is added as an adjunct test for high-risk patients.
  • Study perspective, time horizon, discount rate and health outcome
  • the PCR test considered was a commercially available multiplex real-time PCR test.
  • the cost of a PCR test includes reagent, equipment, and personnel costs as reported in the literature in Euro (€) in 2010 (53) and 2012 (54).
  • the cost of hospitalization was calculated from Length Of Stay (LOS) multiplied by an average cost per bed-day for Beilinson Hospital. Cost per bed-day was obtained from WHO-CHOICE (55) for 2008, and adjusted for Israeli inflation to give 2015 costs. Costs were obtained in New Israeli Shekel (NIS) and converted to € using an exchange rate of 4.3 NIS/€. Details of the calculation are included in the appendix.
  • the incremental cost-effectiveness ratio (ICER) of risk-assessment is calculated as the ratio between the incremental cost (Δcost) and the incremental number of life-years saved (ΔLY): ICER = Δcost / ΔLY.
  • cost_bed is the cost per bed-day in € (Appendix).
  • ΔLOS_a is the difference in the mean length of stay (in days) between patients receiving appropriate empirical treatment and those receiving inappropriate empirical treatment. This was calculated from the patient data.
  • cost_PCR is the cost of one PCR test and is set at €300 (53,54).
  • f_BC+ is the fraction of patients in the high-risk group with positive blood culture. The fraction was calculated from the patient data.
  • f_PCR+ is the fraction of patients in the high-risk group expected to return a positive PCR test.
  • the patient data included information on BCs only, so we used data from the literature to find the relationship between positivity rates for BCs and PCR (Appendix).
  • f_I is the fraction of patients in the high-risk group with MDI receiving inappropriate antimicrobial treatment. The fraction was calculated from the patient data.
  • f_sus is the fraction of the patients receiving inappropriate empirical treatment expected to receive appropriate treatment given the pathogen name.
  • Patient data were collected during trials and/or studies of the Treat (34,35) system at Beilinson Hospital, in the period from 2004-2011. Patients were included in the studies based on suspicion of infection: those for whom BCs were drawn, those receiving antimicrobials not for prophylaxis, those with SIRS, those with a clinically identified focus of infection (35). In total 3589 patients were included, 1695 of which were collected from April - November 2004 during a cluster-randomised controlled trial (35). Data from 1594 patients were collected during a prospective cohort study from May 2009 to April 2011, using the same inclusion and exclusion criteria as the cluster-randomised controlled trial. The data included information on the empirically prescribed antimicrobial treatment of these patients, BC results and other microbiology (urine, sputum, and other local samples), including results of in vitro antimicrobial susceptibility testing, and 30-day mortality.
  • MDI microbiologically documented infection
  • patients with at least one clinically significant isolate (blood or other culture).
  • Coagulase-negative staphylococci, Bacillus sp., Corynebacterium sp., Bacteroides sp. and anaerobic gram-positive rods were considered not clinically significant isolates.
  • Inappropriate treatment was defined by in vitro antimicrobial susceptibility testing showing intermediate or full resistance to the empirically given antimicrobial.
  • the risk-assessment algorithm (52) uses any or all of the following variables: temperature, chills, heart rate, mean arterial pressure, mental status, neutrophil fraction, creatinine, C-reactive protein (CRP), lactate, albumin, and platelet count. Although the model is tolerant to missing values, performance declines when many values are missing.
  • The calculation of the ICER was performed for different choices of the threshold TBC+ for the high-risk group.
  • the uncertainty in the predicted results was characterized through Monte Carlo simulation, by re-iterating the cost-effectiveness calculation while independently varying each input parameter according to their underlying statistical distribution (i.e. fractions of patient groups are binomially distributed, odds ratios follow a log-normal distribution).
  • Each simulation consisted of 10,000 iterations.
  • the results of the simulations are used to generate 95% confidence intervals (CI) for the ICER. Additional analysis of the ICER's sensitivity to PCR cost and the number of life-years gained by a survivor is included in the Appendix. All calculations were done with Matlab version R2015b (The Mathworks, Inc.).
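  • The original calculations were done in Matlab; the Python sketch below reproduces only the Monte Carlo mechanics described here (resampling each input from its stated distribution, recomputing the ICER many times, and reporting a percentile 95% CI). The cost and effect expressions are placeholders, not the published model from the Appendix.

    import numpy as np

    def monte_carlo_ci(compute_icer, sample_inputs, n_iter=10_000, seed=0):
        """Re-run a cost-effectiveness calculation with resampled inputs (95% percentile CI)."""
        rng = np.random.default_rng(seed)
        results = [compute_icer(**sample_inputs(rng)) for _ in range(n_iter)]
        lo, hi = np.percentile(results, [2.5, 97.5])
        return float(np.median(results)), (float(lo), float(hi))

    # Placeholder example: the numbers and the cost model are NOT those of the study.
    def compute_icer(delta_cost, delta_ly):
        return delta_cost / delta_ly

    def sample_inputs(rng):
        f_bc_pos = rng.binomial(400, 0.25) / 400          # binomially distributed fraction
        odds_ratio = rng.lognormal(np.log(2.0), 0.3)      # log-normally distributed odds ratio
        return {"delta_cost": 300.0 - 500.0 * f_bc_pos,            # placeholder cost difference
                "delta_ly": 0.005 + 0.02 * f_bc_pos * odds_ratio}  # placeholder life-years gained

    if __name__ == "__main__":
        print(monte_carlo_ci(compute_icer, sample_inputs))
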
  • the receiver operating characteristic (ROC) curve of the model-predicted probability (pBC+) of positive BC is shown in FIG 14.
  • the area under curve (AUC) was 0.75 (95% CI 0.72-0.77), meaning the prediction was fair.
  • High-risk patients were those with pBC+ higher than a threshold TBC+.
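  • The stratification itself is a simple thresholding of the model output. The sketch below shows how, for a grid of thresholds such as those in Table V, the fraction of patients in the high-risk group and the blood-culture positivity within that group could be tabulated; the probabilities and outcomes below are simulated, not the study data.

    import numpy as np

    def stratify(p_bc, bc_positive, thresholds):
        """For each threshold, report the high-risk fraction and BC positivity in that group."""
        rows = []
        for t in thresholds:
            high = p_bc > t
            frac_high = high.mean()
            pos_rate = bc_positive[high].mean() if high.any() else float("nan")
            rows.append((t, frac_high, pos_rate))
        return rows

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        p_bc = rng.beta(2, 12, 3589)              # simulated predicted probabilities
        bc_positive = rng.random(3589) < p_bc     # simulated blood-culture results
        for t, frac, pos in stratify(p_bc, bc_positive, np.arange(0.0, 0.351, 0.025)):
            print(f"T={t:6.1%}  high-risk fraction={frac:6.1%}  BC+ in group={pos:6.1%}")
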
  • FIG 13 provides a walkthrough of the process of determining the ICER for the high-risk group.
  • Table V ICERs for complementary low- and high-risk groups for thresholds, TBC+, ranging from 0% to 35% in steps of 2.5%.
  • FIG 15 plots the ICER distributions resulting from the Monte Carlo analysis in the form of a boxplot, grouped according to the TBC+ threshold.
  • Panel A shows the results for the high-risk group and
  • Panel B shows the results for the corresponding low-risk groups.
  • this paper represents the first quantitative analysis of the expected cost-effectiveness of direct PCR testing in blood for patients suspected of infection, outside the intensive care unit. If PCR were performed in all patients, PCR testing would have saved life-years at an ICER of 16,774 €/LY, which is below the NICE threshold for cost-effectiveness (35,000 €/LY). The cost-effectiveness can be improved by risk-assessment: in a low-risk group containing the 63.2% of the patients with pBC+ < 11.75%, the ICER was 35,000 €/LY, equal to the NICE threshold.
  • If PCR was not performed on these patients, then the ICER for the remaining high-risk group was reduced to 8,538 €/LY, making PCR a reasonably attractive diagnostic option. Restrictions on lab costs or on the capacity to run PCR tests may make it attractive to choose a higher threshold for the high-risk group. For example, if PCR is performed only for 19.5% of all patients using pBC+ > 20%, the ICER is reduced to 6,934 €/LY. For pBC+ > 25%, PCR testing would be restricted to 7.8% of the patients and have an ICER of 4,281 €/LY.
  • MALDI-TOF-MS of positive blood cultures can be added to classical methods of bacteremia diagnosis.
  • PCR for specific resistance genes can be applied (61).
  • New systems are in development to automate all these processes (62). Common to most is an increase in laboratory costs, especially when the new test does not replace the routine processes, as with MALDI-TOF and PCR, which do not perform well with polymicrobial growth in blood cultures. With improving technology, we expect more genotypic tests to become available, with increasing costs.
  • the use of real-time PCR in an inpatient population is cost-effective according to the NICE threshold for cost-effectiveness. However, passing under this threshold is only one criterion that should be considered. Whether a test can be applied must also consider the availability of resources. Risk-based stratification can be used to improve the cost- effectiveness of PCR by removing patients for whom PCR testing is not cost-effective.
  • EXAMPLE A3 Automatic Learning of mortality in a CPN model of the Systemic Inflammatory Response Syndrome
  • the aim of this paper is to apply machine learning as a method to refine a manually constructed CPN for the assessment of the severity of the systemic inflammatory response syndrome (SIRS).
  • SIRS systemic inflammatory response syndrome
  • the goal of tuning the CPN is to create a scoring system that uses only objective data, compares favourably with other severity-scoring systems and differentiates between sepsis and noninfectious SIRS.
  • the resulting model, the Learned-Age Sepsis CPN (LA-Sepsis CPN), has good discriminatory ability for the prediction of 30-day mortality, with an area under the ROC curve of 0.79. This result compares well to existing scoring systems.
  • the LA-Sepsis CPN also has a modest ability to discriminate between sepsis and non-infectious SIRS.
  • Sepsis is a major healthcare problem with high mortality: rates range from 15-60% and higher in cases of septic shock with multiple organ failure (41,63,64). Sepsis elicits an activation of the patient's immune system, a state referred to as the Systemic Inflammatory Response Syndrome (SIRS). A similar response can also be elicited by processes involving tissue damage such as trauma or surgery. Assessment of the individual patient's severity is an important factor in deciding on both the diagnostic work-up and the course of treatment. TREAT is a decision support system capable of assessing patients suspected of severe infection and recommending the optimal course of treatment (31,34,35,65-67).
  • SIRS Systemic Inflammatory Response Syndrome
  • the infection model behind Treat is a large Bayesian Network, also referred to as a Causal Probabilistic Network (CPN), consisting of approximately 6000 nodes.
  • CPN Causal Probabilistic Network
  • the stochastic model describes the interaction of signs, symptoms of infection, bacteria and antibiotics.
  • the part of the model that deals with SIRS response is the "Sepsis CPN”.
  • the Sepsis CPN assesses the severity of illness based on the degree of the systemic inflammatory response.
  • CPNs can be used to construct diagnostic models for diseases (25-28), and are ideal for this purpose due to their ability to combine knowledge, represented by patient databases, expert opinion and reports in the literature, with strict reasoning adhering to the axioms of probability theory. Construction of a CPN can be manual, or automatic where the latter refers to the use of machine learning techniques. The value of the manual approach has been demonstrated empirically through the success of Treat (34,35), which was constructed in this way.
  • the basic units of a CPN are nodes that represent stochastic variables and arrows which define the causal relationships between the nodes. Mathematically, the arrows represent conditional probability tables. Constructing a CPN therefore consists of specifying a set of stochastic variables and the causal probabilistic relationships between them.
  • Nodes may represent either observable events or concepts, for example, pathophysiological links or diagnoses, which may not be observable but still hold interest.
  • CPNs One advantage of CPNs is that following their construction, they can be used to update the probability distributions for such unobserved events based on the observed evidence.
  • FIG 16 presents a framework for the development of the Sepsis CPN, each stage of which has been described in the literature.
  • the original model is termed the Discrete Sepsis CPN (D-Sepsis CPN) where all nodes were discrete stochastic variables (31).
  • C-Sepsis CPN Continuous Sepsis CPN
  • continuous variables were introduced (36).
  • L-Sepsis CPN Learned Sepsis CPN
  • FIG 16 Sepsis CPN development framework.
  • Phase I describes the development of the discrete sepsis CPN (D-Sepsis CPN), phase II the continuous sepsis CPN (C-Sepsis CPN) and phase III the development of the learned sepsis CPNs (L-Sepsis CPN and the LA-Sepsis CPN, where age is included) through formal learning methods.
  • MEDS Mortality in the Emergency Department Sepsis
  • mREMS modified Rapid Emergency Medicine Score
  • the SIRS score is used in sepsis diagnosis (71).
  • the variables used by the MEDS score, mREMS, SIRS and the C-Sepsis CPN are shown in Table VI.
  • MEDS and mREMS are based on logistic regression models, and are designed for use outside the intensive care unit, in particular for the emergency department (ED).
  • ED emergency department
  • Logistic regression models have the disadvantage that they cannot handle missing values, unlike CPNs. These models do however represent a basis for comparison in that Treat and the related sepsis CPNs are intended for use in both ED and medical wards.
  • Prognostic scoring systems such as MEDS, mREMS and the sepsis CPN are useful tools for clinicians, aiding in the decision-making process when it must be decided whether the patient should receive antibiotics, be admitted to the hospital, or be admitted to the intensive care unit (ICU).
  • the purpose of this paper is to present an extension of the method for tuning the sepsis CPN to predict all-cause 30-day mortality using a database of real patient cases (68), where patients' age is added to the model to form the LA-Sepsis CPN. Five-fold cross-validation is used to test the LA-Sepsis CPN's ability to predict 30-day mortality.
  • the goal of tuning the LA-Sepsis CPN is to create a scoring system that uses only objective data, compares favourably with other severity-scoring systems and differentiates between sepsis and NSIRS.
  • FIG 17 shows the complete structure of the LA-Sepsis CPN.
  • Two syndromes are represented: the non-infectious systemic inflammatory response syndrome (NSIRS) and sepsis.
  • NSIRS non-infectious systemic inflammatory response syndrome
  • the Diagnosis node is used to toggle between these syndromes in the learning process (Section 2.3).
  • the severity of each syndrome is described using five states; no, mild, moderate, severe and critical, which can also be thought of as the degree of activation of the immune system.
  • Each severity state is associated with a mortality rate assigned in the node AliveSev.
  • the 30-day mortality is also linked to a patient's age, independent of illness severity (72). In FIG 17 this is implemented through the AgeRisk node.
  • FIG 17 The LA-Sepsis CPN structure. Nodes are represented by ovals, causal links by arrows. Causality is expressed through conditional probability tables. The nodes with double rings represent stochastic variables with continuous probability distributions. The remaining nodes have discrete probability distributions. NSIRS is the non-infectious systemic inflammatory response syndrome. Nodes that are not learned are designated 1.
  • BackgroundMort is a binary node and adds to the mortality assessment of the AliveSev node as an OR function: the patient can die due to the severity of their illness, due to background causes, or both. AgeRisk also adjusts the assessment of 30-day mortality by allowing for mortality rates to be increased for older patients and decreased for younger patients. The cut-off for the change between increase and decrease is approximately 80 years.
  • the infection variables are linked to the NSIRS and Sepsis nodes through a set of "factors", labelled as: Fact_leuko_creat, Fact_fever, Fact_alb and Fact_shock.
  • Each factor uses the same five severity states as the NSIRS and Sepsis nodes.
  • the original constructors carried out a factor analysis, leading to this statistical construct where the factors explained 80% of the variation in the data.
  • the sepsis factors retain their design from the C-Sepsis CPN, with their severity states defined as the most severe of Sepsis and NSIRS.
  • sepsis and non-infectious SIRS are reported to have similar crude mortality rates given comparable severity of illness (40,41). This is implemented in the Severity and AliveSev nodes.
  • the severity states of NSIRS and sepsis (no, mild, moderate, severe and critical) are assigned mortality rates of 0%, 1%, 8%, 45% and 75% respectively.
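  • Using the mortality rates quoted above, the mapping from a posterior severity distribution to a 30-day mortality estimate, including the OR-combination with a background mortality performed by the BackgroundMort node, can be sketched as follows; the severity distribution and background rate below are invented for illustration.

    # Mortality rates per severity state as given for the NSIRS/sepsis nodes.
    MORTALITY = {"no": 0.00, "mild": 0.01, "moderate": 0.08, "severe": 0.45, "critical": 0.75}

    def predicted_mortality(severity_dist, background=0.02):
        """Severity-weighted mortality, OR-combined with a background mortality."""
        p_sev = sum(severity_dist[s] * MORTALITY[s] for s in MORTALITY)
        return 1.0 - (1.0 - p_sev) * (1.0 - background)   # die from either cause (or both)

    if __name__ == "__main__":
        # Hypothetical posterior over severity after entering the evidence for a case.
        dist = {"no": 0.10, "mild": 0.35, "moderate": 0.35, "severe": 0.15, "critical": 0.05}
        print(f"P(death within 30 days) = {predicted_mortality(dist):.3f}")
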
  • the LA-Sepsis CPN makes use of a "semi-discrete" learning environment, where individual Gaussian probability distributions are not defined for each severity state of sepsis. Instead, Gaussian curves are defined that roughly cover the range of a given variable as seen in the database. Example curves are seen in FIG 18, panel A.
  • the semi-discrete environment is created by the introduction of "mapping" nodes for the continuous variables: the leukoMapping, plateletMap, creatMap etc. nodes (labelled 2.) shown in FIG 17. These mapping nodes define the pathophysiological states for which Gaussian distributions should be defined.
  • By defining Gaussian distributions for the infection variable and learning the "mapping" node, the resulting distribution for the variable given a sepsis severity will be a linear combination of the predefined Gaussian distributions: a composite distribution.
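  • The weights of such a composite distribution (the conditional probabilities of the "mapping" node) can be estimated by EM. A minimal, self-contained version of that weight-learning step for a single variable and a single severity state could look as follows; the component means and the data are simulated, and the real model uses Hugin's EM implementation rather than this hand-rolled one.

    import numpy as np
    from scipy.stats import norm

    def em_mixture_weights(x, means, sds, n_iter=200):
        """EM for the weights of a mixture of FIXED Gaussian components."""
        k = len(means)
        w = np.full(k, 1.0 / k)                   # start from uniform weights
        dens = np.stack([norm.pdf(x, m, s) for m, s in zip(means, sds)])   # shape (k, n)
        for _ in range(n_iter):
            resp = w[:, None] * dens              # E-step: unnormalised responsibilities
            resp /= resp.sum(axis=0, keepdims=True)
            w = resp.mean(axis=1)                 # M-step: new mixture weights
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        # Simulated systolic blood pressures for one severity state.
        x = np.concatenate([rng.normal(95, 10, 300), rng.normal(130, 12, 700)])
        means = [70, 95, 120, 145, 170]           # predefined Gaussians spanning the range
        sds = [12, 12, 12, 12, 12]
        print(np.round(em_mixture_weights(x, means, sds), 3))
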
  • Patient data were collected during trials and/or studies of the Treat system at Beilinson Hospital, Petah Tikva, Israel, including 1695 patients from April - November 2004, and 1894 patients from December 2008 - April 2011. Patients were included in the studies based on suspicion of infection: those for whom blood cultures were drawn, those receiving antimicrobials not for prophylaxis, those with SIRS, those with a clinically identified focus of infection (35). Each patient case included all available measurements/recordings of the individual parameters on hospital admission, in addition to information on the 30-day mortality.
  • EM learning is part of the family of maximum likelihood methods and is offered as a tool within the commercial CPN software Hugin.
  • the algorithm aims to maximize the collective likelihood of all cases used in learning by adjusting the conditional probability tables of a specified set of nodes that are to be learned.
  • a 5-fold cross-validation is performed as an internal validation in order to guard against overfitting of the model.
  • the complete dataset, called DataAll, for the 2885 included patients was divided into 5 datasets, called Data1 through Data5. Data were stratified by whether the patients had infection.
  • the node "AliveDay30" contains the model's predicted 30-day mortality. This probability can be read from the node after entering the evidence for each case.
  • the learned network is assessed for its discriminative ability using the AUROC. As a check, the performance of the LA-Sepsis CPN is compared to the cross-validation models for each fold of the cross-validation data, and also to the mean of these. Calibration of the individual cross-validation models is also assessed.
  • the performance of the L-Sepsis CPN (68) is compared to that for Treat with the C-Sepsis CPN, as age is not taken into account in either of these models. Subsequently the L-Sepsis CPN and LA-Sepsis CPN are compared.
  • the performance of the LA-Sepsis CPN is also compared to that of the SIRS and mREMS scores for prediction of both 30-day mortality and the presence of infection (defined by the final diagnosis).
  • the AUROC for the LA-Sepsis CPN is constructed by reading the model's predicted probability of infection from the "Diagnosis" node after entering the evidence for each case.
  • Table VIII presents descriptive statistics for the data used in the learning process, using the final diagnosis to split the dataset into patients with infection, and patients without infection.
  • 2885 had a confirmed infectious or non-infectious diagnosis, where 2514 had an infection (infected patients) and 371 did not (non-infected patients).
  • SIRS and mREMS scores were calculated for patients for whom the required data were recorded. No patients had sufficient data to allow the MEDS and the REMS scores to be calculated.
  • the infection variables used in the C-, L- and LA-Sepsis CPNs are chosen for their ability to a) determine the severity of a patient's illness and/or b) differentiate between sepsis and NSIRS.
  • Table VIII Descriptive statistics for the data used in learning. Groups are defined according to final diagnosis.
  • Creatinine [mg/dl] 96.4 0.9 [0.1-11.3] 0.59* 97.0 0.9 [0.1-7.7] 0.56 0.92
  • An example of the composite distributions learned is presented in FIG 18.
  • the composite distributions for systolic blood pressure (B) are shown along with the Gaussian distributions (A) from which they were constructed.
  • Each curve in panel B is constructed from a linear combination of the curves in panel A.
  • Following EM-learning, a similar set of composite distributions could be drawn for each of the continuous variables.
  • FIG 18 An example of initial specified Gaussian distributions (A) and learned composite distributions (B) for one of the variables in the sepsis CPN: systolic blood pressure.
  • the composite distributions in B are linear combinations of the distributions in A.
  • Each composite distribution in B is conditional on a given state of Fact_shock.
  • Patients with the infectious etiology (sepsis) present with a higher fever than those without infection (NSIRS) for a given severity of illness (FIG 19, panel A).
  • the final step in this phase of the development process was the validation of the L-Sepsis CPN's ability to predict 30-day mortality.
  • Table IX presents the area under curve (AUC) for both the cross-validation models, LA-Sepsis1 through LA-Sepsis5, and the final LA-Sepsis model with each of the validation datasets.
  • the mean AUC for the validation datasets was 0.78 (range: 0.74-0.81). Learning the LA-Sepsis CPN with all available data gave an AUC of 0.79 (95% CI 0.77-0.82) when the validation and training datasets were identical.
  • FIG. 20 presents ROC curves for the discriminatory ability of the 30-day mortality prediction as assessed by the L-Sepsis, LA-Sepsis and C-Sepsis CPNs for the 2855 patient cases that make up DataAll.
  • the reference line represents the line of no discrimination.
  • the C-Sepsis CPN does not account for age and is therefore compared to the L-Sepsis CPN without age.
  • the AUC for the LA-Sepsis CPN was 0.79 (0.77-0.82).
  • FIG 20 ROC curves for the prediction of 30-day mortality for the C-Sepsis, L-Sepsis and LA-Sepsis CPNs.
  • the calibration curve for the LA-Sepsis CPN is presented in FIG 21.
  • FIG 21 Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality using the LA-Sepsis CPN.
  • FIG 22 presents regression lines drawn relating the predictions and observations for the two sub-groups along with the full dataset.
  • FIG 22 Regression lines for the observed vs. predicted events for all 2855 patients (solid), 697 patients with a lower respiratory tract infection (dotted) and 486 patients with a urinary tract infection (dashed).
  • the sub-group curves for LRT- and UTI infections show an interesting trend: for a given severity of the inflammatory response, a lower respiratory tract infection is more likely to lead to death. This matches the trend shown for the L-Sepsis CPN, where the regression gradients were 0.93 and 0.47 for LRT infection and UTI respectively (68). Although the model is well-calibrated overall, it is visually poorly calibrated for the patients with UTI and LRT infections.
  • FIG 23 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for the patients for whom SIRS could be calculated.
  • A 30-day mortality
  • B presence of infection
  • the mortality prediction was read from the "AliveDay30" node and the probability of infection from the "Diagnosis” node.
  • FIG 23 shows ROC curves for the prediction of 30-day mortality (panel A) and the presence of infection (panel B) for the 46% of patients for whom a SIRS score could be calculated.
  • the AUROC for prediction of 30-day mortality was 0.81 for the LA-Sepsis CPN, significantly better (p < 0.0001) than the AUROC of 0.58 for SIRS.
  • FIG 24 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for the patients for whom mREMS could be calculated.
  • A 30-day mortality
  • B presence of infection
  • the mortality prediction was read from the "AliveDay30" node and the probability of infection from the "Diagnosis” node.
  • FIG 24 shows ROC curves for the prediction of 30-day mortality (panel A) and the presence of infection (panel B) for the 25% of patients for whom a mREMS score could be calculated.
  • the AUROC for the prediction of 30-day mortality was 0.81 for the LA-Sepsis CPN, significantly better (p < 0.0001) than the AUROC of 0.67 for mREMS.
  • The LA-Sepsis CPN uses 13 variables easily obtained at sepsis onset, including vital signs and basic laboratory measurements.
  • the LA-Sepsis CPN exhibited significantly better discrimination as assessed by the AUROC.
  • the L-Sepsis and LA-Sepsis CPNs had AUROCs of 0.74 and 0.79 respectively.
  • The strength of CPN models is the ability to mix knowledge and evidence: the models contain knowledge in the form of conditional probability tables, and can be used to perform reasoning following the axioms of probability theory. The inclusion of knowledge gives CPNs the ability to handle missing data.
  • the model uses all available variables and relies on the knowledge contained in the model for the other variables.
  • the effectiveness of manually constructed CPNs has been shown in the literature, for example by the C-Sepsis CPN (36) and through the success of the Treat CPN [11]. This study shows that applying machine learning techniques to parts of a manually constructed CPN may be an effective way to tune them and improve performance. Lack of data and incomplete cases are inherent problems with medical data, and it is therefore important that our modelling and learning methods allow the use of all available data, including those from incomplete cases.
  • The ability to deal with missing data is another advantage of CPNs and the EM learning algorithm.
  • Some of the variables used were recorded in fewer than 40% of cases, namely albumin, lactate and CRP. Two of these, albumin and lactate, are reported in the literature as being closely linked to mortality (42,43). Although not closely related to mortality, CRP has been investigated extensively as a biomarker for sepsis and has some utility in distinguishing between NSIRS and sepsis.
  • the model discriminates well between cases in terms of 30-day mortality, although as in our previous study, the sub-group analysis of infection sites points towards the existence of confounding variables not accounted for by our model (68). Possible confounders, including site of infection and presence of other comorbidities should be investigated further. Mortality is much higher in patients with an underlying comorbidity, and changes less throughout most of adulthood than in patients without comorbidity (63). For this model, including knowledge of the site of infection would be an artificial construct: at the suggested time of use of the model, a final diagnosis is not usually known. However, reintegrating the CPN into Treat presents the opportunity to account for several such effects: the wider infection model is able to calculate the probability of a given site of infection.
  • The LA-Sepsis CPN was also assessed as a scoring system against mREMS and SIRS.
  • The simplest score, SIRS, requires 4 variables. Only 46% of patients had all of these recorded, mainly due to the respiratory rate being recorded in only 47% of patients.
  • The mREMS score, requiring 7 variables, could only be calculated for 25% of the patients.
  • the LA-Sepsis CPN outperformed the mREMS and SIRS scores in predicting 30-day mortality and differentiating between infected and non-infected patients.
  • the number of patients for whom the SIRS and mREMS scores could be calculated further highlights the advantage of CPN-based models with their ability to handle missing data.
  • the LA-Sepsis CPN performs similarly, with an AUC of 0.79, and is well calibrated.
  • Mortality prediction for sepsis patients provides a risk-based stratification of patients which potentially impacts both their treatment and their diagnostic work-up. For example, treatment of a severely septic patient with broad-spectrum antibiotics may be justified, while the use of broad-spectrum antibiotics for a mild sepsis may not be cost-effective when the ecological cost of increasing bacterial resistance due to excessive use of broad spectrum antibiotics is considered (77).
  • the diagnostic work-up can be tailored to the severity of illness.
  • FIG 25 shows ROC curves for the prediction of bacteremia for three patient cohorts, two from Denmark (HvH, SLB) and one from Israel (Beilinson). The stable performance across these three cohorts, one of which (Beilinson) was used to tune the model, suggests that the model may be geographically invariant.
  • Embodiments of the invention include the use of the derived probabilities (including but not limited to the probability of positive blood culture, bacteraemia, PCR or mortality) calculated using an embodiment of the invention (for example one of those described in Example A1, Example A2, and/or Example A3) as an "early-warning" of infection in hospitalised patients.
  • This method could enable the identification of infection earlier than standard clinical methods, thereby allowing for earlier treatment.
  • the same derived probabilities could be used to provide an assessment of treatment efficacy, such as allowing recommendations of escalation or de-escalation of an antimicrobial treatment regime.
  • E1. A computer implemented method for calculating a probability of a clinical outcome for a patient, comprising:
  • providing a maximum of 150 input parameters, such as a maximum of 75 input parameters, such as a maximum of 50 input parameters, such as a maximum of 40 input parameters, such as a maximum of 30 input parameters, such as a maximum of 25 input parameters, such as a maximum of 20 input parameters, as input to a statistical model,
  • E5. The method according to any one of embodiments E1-E3 where the clinical outcome is a positive PCR test.
  • E6 The method according to any one of embodiments E1-E3 where the clinical outcome is
  • ICU intensive care unit
  • Laboratory parameters such as one or more or all of:
  • Leukocytes such as in units of [count/mm3]
  • Creatinine such as in units of [mg/dl]
  • Neutrophils such as in units of [count/mm3]
  • CRP such as in units of [mg/l]
  • PCT procalcitonin
  • BP Systolic blood pressure
  • MAP Mean arterial pressure
  • xiii. Respiratory rate (such as in units of [/min])
  • xiv. Mental status being one or both of:
  • Additional parameters such as one or more or all of:
  • xvii Disseminated Intravascular Coagulation (DIC), xviii. Use of supplementary oxygen,
  • ARDS Acute Respiratory Distress Syndrome
  • CRP C-Reactive Protein
  • E17. Computer program product having instructions which, when executed, cause a computing device or a computing system, such as the apparatus according to embodiment E16, to perform a method according to any one of embodiments E1-E15.
  • E18. A computer readable medium having stored thereon a computer program product according to embodiment E17.
  • E20. A data stream which is representative of a computer program product according to embodiment E17.
  • E21. A computer implemented method for calculating an output value representative of a probability of a clinical outcome for a patient suffering from an illness, wherein the method comprises:
  • a limited number of input parameters such as a maximum of 20 input parameters, such as input parameters indicative of a physiological and/or pathophysiological state of the patient

Abstract

There is presented a computer implemented method for calculating a probability of a clinical outcome for a patient, wherein the method comprises providing a maximum of 150 input parameters, such as a maximum of 20 input parameters, as input to a statistical model, and calculating the probability with the statistical model. The method in an embodiment further comprises performing with the statistical model an assessment of a severity of an illness of the patient, where said assessment is based partially or fully on the input parameters, and where said probability is based partially or fully on the severity of the patient's illness.

Description

A CPN-BASED TOOL FOR THE STRATIFICATION OF ILLNESS SEVERITY IN PATIENTS SUSPECTED OF SEPSIS
FIELD OF THE INVENTION
The present invention, in some embodiments thereof, relates generally to the identification of patterns and markers associated with certain clinical outcomes in patients suspected of infection, and the methods of using such patterns in triage, diagnosis and prognosis of infection.
BACKGROUND OF THE INVENTION
Patients presenting to the emergency department suspected of infection or other inflammatory conditions are often difficult to accurately classify; this includes determination of illness severity, need for blood cultures, and need for other tests/immediate treatment. Similar problems can be found for hospital inpatients: it is difficult to recognise the onset/progression of an infection.
The ability to predict outcomes including mortality and bacteraemia would give better opportunities for optimal use of hospital resources and timely, accurate treatment.
Currently, decisions are made by following clinically derived rules of best practice. Attempts have been made at creating prognostic and diagnostic models, predicting mortality and positive blood cultures. Most of these models are not in use in clinical practice, and when they are used it is mainly for research purposes.
For blood cultures, some of the techniques used include simple clinical rules and biomarkers (1,2), and models: recursive partitioning trees (3), logistic regression (4-9), naive Bayes (10), not-so-naive Bayes (11) and Bayesian networks (12). There is an even greater number of models attempting to predict mortality: four iterations of the Acute Physiology and Chronic Health Evaluation (APACHE I, II, III, IV) (13-16), three iterations of the Simplified Acute Physiology Score (SAPS I, II, III) (17-19), the Mortality Probability Model (MPM I, II0, II24) (20,21), the Mortality in the Emergency Department Score (MEDS) (22), the Sequential Organ Failure
Assessment (SOFA, also Sepsis-related Organ Failure Assessment) (23), the Pitt Bacteraemia Score (PBS) (24). These scoring systems are all based on logistic regression models.
The decision support system, Treat, is described in:
• "A Bayesian approach to model-development: design of continuous distributions for infection variables", Logan Morgan Ward et al., 19th World Congress of The International Federation of Automatic Control, IFAC, 24-29 August 2014, Cape Town, South Africa, pages 5653-5658. Treat is a large decision support system which is designed to improve antimicrobial treatment. It has also been used to predict bacteraemia. SUM MARY OF THE INVENTION
The present inventors propose a method for combining a set of measurements and/or observations recorded for a patient suspected of infection, and evaluating the probability of a given clinical outcome. Embodiments of the invention differentiate themselves from the references by a much smaller parameter set (e.g., on the order of 10x smaller) which may allow for easy retuning and recalibration of the underlying model. Additionally, the model is differentiated by its complex statistical model characterised by intermediate "unobservable" variables and tolerance of missing values. The various embodiments of the invention address limitations of current diagnostic and triage practices by suggesting which diagnostic tests are cost-effective, which courses of treatment to take etc. Different aspects of the present invention may each be combined with any of the other aspects. These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
According to an embodiment, there is presented a CPN-based tool for the stratification of illness severity in patients suspected of sepsis.
BRIEF DESCRIPTION OF THE DRAWINGS
The method according to the invention will now be described in more detail with regard to the accompanying figures. The figures show one way of implementing the present invention and are not to be construed as limiting other possible embodiments falling within the scope of the attached claim set.
FIG 1. Schematic model of the Bayesian Network implemented in an embodiment of the invention.
FIG 2. Possible embodiment of the invention: patient information is fed from hospital information systems (HIS) (1) to the model (3), which then calculates the probability of clinical outcomes and writes this back into the HIS, where it is accessible to the treating clinician (2).
FIG 3. Potential use of the Bayesian Network implemented in an embodiment of the invention.
FIG 4. ROC curve for the prediction of 30-day mortality according to an embodiment of the invention.
FIG 5. ROC curve for the prediction of positive blood culture according to an embodiment of the invention.
FIG 6. Flow-chart of the model-development process according to an embodiment of the invention.
FIG 7. Schematic view of an embodiment of the invention identifying where automatic learning is to take place (1, 2).
FIG 8. An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in an embodiment of the invention.
FIG 9. ROC curves for the prediction of 30-day mortality according to an embodiment of the invention, and a published precursor to the invention.
FIG 10. Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality according to an embodiment of the invention.
FIG 11. Regression lines for the observed vs. predicted events according to an embodiment of the invention.
FIG 12. An example of an application of an embodiment of the invention.
FIG 13. Flow-chart for the calculation of incremental cost-effectiveness of an application of an embodiment of the invention.
FIG 14. ROC curve for the prediction of positive BC according to an embodiment of the invention.
FIG 15. Boxplot for the cost-effectiveness of an application of an embodiment of the invention.
FIG 16. Flow-chart of the model-development process according to an embodiment of the invention.
FIG 17. Graph showing the CPN structure of an embodiment of the invention.
FIG 18. An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in an embodiment of the invention.
FIG 19. An example of an embodiment of the invention's ability to distinguish between infectious and non-infectious inflammation.
FIG 20. ROC curves for the prediction of 30-day mortality according to a published precursor to the invention, and two possible embodiments of the invention.
FIG 21. Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality according to an embodiment of the invention.
FIG 22. Regression lines for the observed vs. predicted events according to an embodiment of the invention.
FIG 23. ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for an embodiment of the invention and the SIRS score.
FIG 24. ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for an embodiment of the invention and the mREMS score.
FIG 25. ROC curves for the prediction of bacteraemia according to an embodiment of the invention, for three patient cohorts.
DETAILED DESCRIPTION OF THE INVENTION
The model
Embodiments of the invention use a so-called Bayesian Network, also referred to as a causal probabilistic network (CPN), to model the host response to infection. A CPN comprises a set of parameters or nodes that may be observable or unobservable (latent nodes) and directed links between the nodes which encode the causal stochastic relationships between the nodes.
The CPN used in the invention is described in FIG 1. Sepsis, or a non-infectious inflammatory condition, causes changes in a set of known inflammatory mediators, which then cause observable changes in a set of infection variables indicative of the patient's physical condition: modification of vital signs such as heart rate, respiratory rate, blood pressure etc., and changes in the blood chemistry: lactate, creatinine, C-Reactive Protein (CRP), procalcitonin (PCT), leukocytes, neutrophils etc. In practice, observing these infection variables allows inferences to be made about the cause (sepsis/other) and severity of the patient's immune response. The resulting inference is then used to determine a composite risk profile: "Risks" in FIG 1. The risk profile is used to determine the probability of a given clinical outcome, "Prognosis": this could be, for example, a positive blood culture or PCR test, or mortality.
The infection variables may be quantitative (numerical variables) or qualitative (categorical variables). Quantitative variables may be used in their raw form, or may be transformed and/or combined using mathematical functions, e.g. natural log, ratios of variables etc. Qualitative variables have a fixed number of categories or states.
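As an illustration of the reasoning pattern described above, the following is a minimal, hypothetical sketch of a CPN with one latent severity node, two observable infection variables and one outcome node. The structure, state names and probabilities are invented for illustration and are not those of the Sepsis CPN; the sketch only shows how inserting evidence updates the latent severity and hence the probability of a clinical outcome, and how a missing observation simply contributes no likelihood term.

```python
# Minimal, hypothetical CPN sketch: latent severity -> two binary observations,
# latent severity -> 30-day mortality. All numbers are illustrative only.
import numpy as np

severities = ["no", "mild", "moderate", "severe", "critical"]
prior = np.array([0.30, 0.30, 0.20, 0.15, 0.05])         # P(severity), hypothetical

p_temp_high = np.array([0.05, 0.20, 0.50, 0.70, 0.80])   # P(temperature > 38.5 | severity)
p_hr_high = np.array([0.10, 0.25, 0.45, 0.65, 0.85])     # P(heart rate > 100 | severity)
p_death = np.array([0.01, 0.03, 0.08, 0.20, 0.45])       # P(30-day mortality | severity)

def outcome_probability(temp_high=None, hr_high=None):
    """Posterior P(30-day mortality) given any subset of the two observations."""
    likelihood = np.ones_like(prior)
    if temp_high is not None:                             # missing values are simply skipped
        likelihood *= p_temp_high if temp_high else 1.0 - p_temp_high
    if hr_high is not None:
        likelihood *= p_hr_high if hr_high else 1.0 - p_hr_high
    posterior = prior * likelihood
    posterior /= posterior.sum()                          # posterior over latent severity
    return float(posterior @ p_death)

print("no evidence:         ", round(outcome_probability(), 3))
print("fever only:          ", round(outcome_probability(temp_high=True), 3))
print("fever + tachycardia: ", round(outcome_probability(temp_high=True, hr_high=True), 3))
```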
Tuning the model
The model is tuned through a machine learning process known as expectation maximisation (EM) learning. Other machine learning techniques could also be used, for example: Dirichlet learning.
The model is tuned from >5000 patient cases. Patients were included in the learning dataset if they were suspected of infection.
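The "Dirichlet learning" alternative mentioned above can be sketched in a few lines: a row of a conditional probability table is treated as a Dirichlet distribution, so tuning amounts to adding case counts to prior pseudo-counts. The states, pseudo-counts and case counts below are hypothetical, and the sketch assumes for simplicity that severity labels are observed in the cases, which in the real model they are not; that is one reason EM learning is used instead.

```python
# Sketch of Dirichlet-style updating of one CPT row, P(severity | sepsis present).
# Prior pseudo-counts and observed counts are hypothetical placeholders.
import numpy as np

states = ["no", "mild", "moderate", "severe", "critical"]
prior_counts = np.array([2.0, 4.0, 6.0, 4.0, 2.0])       # expert-knowledge pseudo-counts
observed_counts = np.array([120, 310, 260, 140, 45])     # counts from a hypothetical case database

posterior_counts = prior_counts + observed_counts
cpt_row = posterior_counts / posterior_counts.sum()      # posterior mean of the CPT row

for state, p in zip(states, cpt_row):
    print(f"P(severity = {state:8s} | sepsis) = {p:.3f}")
```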
Potential use
FIG 2 describes a potential use-case of the invention. Coupling the model to a hospital information system means that probability calculations for clinical outcomes are readily available to clinical staff as the infection variables are entered into the HIS by triage nurses/laboratory staff etc. FIG 3 illustrates the potential use of the Bayesian Network implemented in an embodiment of the invention. The input parameters are entered into the model (1.), which then propagates the evidence throughout the model (2.), allowing the probability of the given outcome to be read off (3.). This describes what happens during step 2. in FIG 2.
EXAMPLES
Two examples, A1 and A2 (each representative of a paper), describe exemplary embodiments of the invention:
A1. A Bayesian Approach to Model Development: Automatic Learning for Tuning Predictive Performance.
A2. PCR for sepsis in the Emergency Department: a cost-effective proposition
Example A1 details the model development process and the results that can be achieved when using the model as a standalone entity for the prediction of 30-day mortality.
Example A2 includes a cost-benefit analysis for another embodiment of the invention where the model is used to predict the probability of a positive PCR test or positive blood culture. Examples A1 and A2 are included below.
An additional example, Example A3 (representative of a paper) also describes an exemplary embodiment of the invention.
A3. Automatic Learning of mortality in a CPN model of the Systemic Inflammatory Response Syndrome
Example A3 builds on the model development process introduced in Example A1, explaining how to incorporate the influence of an external risk factor (in this case, patient age). The results describe the performance of the model with respect to that described in Example A1 (FIG 20), and also in comparison to other existing clinical scoring algorithms (FIG 23, FIG 24).
Example A3 is included below. FIG 4 and FIG 5 represent ROC curves for the prediction of 30-day mortality and bacteremia for an exemplary embodiment of the invention. The area under the ROC curve represents an assessment of the ability of the model to discriminate between positive and negative cases; between patients alive or dead after 30 days (FIG 4), or patients with or without bacteremia (FIG 5).
Table I presents the area under the ROC curve for FIG 4, along with its standard error and a 95% confidence interval. The asymptotic significance referred to in Table I is whether the area under curve is significantly different from 0.5, which represents no discriminatory power. In this case, p<0.0005, which means that there is a significant difference.
Table I Area under the curve for FIG 4
Area Under the Curve
Test Result Variable(s): LSepsis_site_age
The test result variable(s): LSepsis_site_age has at least one tie between the positive
actual state group and the negative actual state group. Statistics may be biased.
a. Under the nonparametric assumption
b. Null hypothesis: true area = 0.5
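For completeness, the sketch below shows one way of computing an area under the ROC curve and testing it against the no-discrimination value of 0.5, using the Mann-Whitney relationship and the Hanley-McNeil (1982) approximation to the standard error; statistical packages may use a slightly different standard error under the nonparametric assumption. The scores and labels are randomly generated dummy data, not the study data.

```python
# AUC via the Mann-Whitney relationship, with a Hanley-McNeil standard error and a
# two-sided z-test against AUC = 0.5. Dummy data only; not the study results.
import numpy as np
from scipy.stats import rankdata, norm

def auc_with_test(scores, labels):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    ranks = rankdata(scores)                              # average ranks handle ties
    auc = (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    q1, q2 = auc / (2 - auc), 2 * auc ** 2 / (1 + auc)
    se = np.sqrt((auc * (1 - auc) + (n_pos - 1) * (q1 - auc ** 2)
                  + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg))
    z = (auc - 0.5) / se
    return auc, se, 2 * norm.sf(abs(z))                   # two-sided p-value

rng = np.random.default_rng(0)
labels = rng.random(500) < 0.12                           # ~12% event rate, as in the mortality data
scores = rng.normal(loc=labels.astype(float), scale=1.5)  # dummy model output
print("AUC = %.3f, SE = %.3f, p = %.4f" % auc_with_test(scores, labels))
```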
Table II presents the area under the ROC curve for FIG 5, along with its standard error and a 95% confidence interval. The asymptotic significance referred to in Table II is whether the area under curve is significantly different from 0.5, which represents no discriminatory power. In this case, p<0.0005, which means that there is a significant difference.
Table II Area under the curve for FIG 5
Area Under the Curve
Test Result Variable(s): Pbact_SF21MAP
The test result variable(s): Pbact_SF21MAP has at least one tie between the positive
actual state group and the negative actual state group. Statistics may be biased.
a. Under the nonparametric assumption
b. Null hypothesis: true area = 0.5
EXAMPLE A1: A Bayesian Approach to Model-Development: Automatic Learning for Tuning Predictive Performance
Abstract: The value of manually constructed and tuned Bayesian networks has been demonstrated empirically; however, this informal process is limited in terms of what can be reasonably achieved. This paper, such as this Example A1, presents the application of a formal machine learning process, EM learning, to a manually constructed CPN for the assessment of the severity of sepsis. Through learning, the model is tuned to predict 30-day mortality, and displays a significant improvement in discriminatory ability assessed by area under the ROC curve (previous model AUC = 0.647, new model AUC = 0.739, p<0.001).
Key Words: Machine learning; Biological and medical system modelling; System identification and validation
1. INTRODUCTION
Bayesian networks are a set of probabilistic models and can be used to create diagnostic models for diseases (25-28). These models can also provide advice on treatment selection, provided they are accompanied by decision theory and utility functions (29-31).
A Bayesian network can be represented graphically by a set of nodes, linked together by arrows. The nodes themselves represent stochastic variables. The arrows represent causal relationships between the variables, a requirement for the network to provide plausible reasoning (32), and the reason they are also referred to as Causal Probabilistic Networks or CPNs (33). Numerically, a CPN consists of a set of conditional probability tables defining the relationships between a node and its parent(s). The task of constructing a CPN therefore consists of specifying the graphical structure and the set of associated conditional probabilities. Nodes are not limited to representing observable events such as blood pressure or temperature measurements, but can also represent latent concepts such as diagnoses or prognoses which are not observed, but still of interest. Once constructed, the CPN is used to update the probability distributions for the unobserved variables when evidence is inserted into the CPN.
CPNs are ideal models for the fusion of data and knowledge, which may be represented by patient databases and the combination of expert opinion and reports in the scientific literature, respectively. Any or all of these sources of evidence may be used in the construction of a CPN. Throughout the construction process, the conditional probabilities themselves may be considered stochastic variables. The value of the semi-formal approach of using knowledge to assign a priori distributions has been demonstrated empirically through the success of the Treat decision support system (34,35). Treat aids in decision-making regarding diagnosis and optimal treatment of acute infections. The CPN model of Treat is large with close to 6000 nodes. The severity of a patient's illness is assessed by a small section of the model, approximately 40 nodes. FIG 6 presents a framework for the development of this network, referred to as the "Sepsis CPN". The individual phases are described in the literature: the initial specification of the model (FIG 6, phase I) where all observable nodes were discrete stochastic variables (31), known as the Discrete Sepsis CPN (D-Sepsis CPN), and the subsequent development of the model with continuous variables (FIG 6, phase II), the Continuous Sepsis CPN (36). Although the conversion to continuous variables was able to solve some of the shortcomings of the discretization in the D-Sepsis model, the model requires tuning. The C-Sepsis CPN has been tuned manually, using a combination of knowledge gleaned from the literature and expert opinion; however, this process is limited in terms of what can be reasonably achieved.
FIG. 6: Sepsis CPN development framework. Phase I describes the development of the discrete sepsis CPN (D-Sepsis CPN), phase II the continuous sepsis CPN (C-Sepsis CPN) and phase III the development of the learned sepsis CPN (L-Sepsis CPN) through formal learning methods - the subject of this paper, such as this Example A1.
The C-Sepsis CPN can be further improved by supplementing the manual methods used in its development with machine learning from case databases. In this case, we take the sub-network of the C-Sepsis CPN that does not include respiratory parameters. We recognize in the Treat network that oxygen saturation, shortness of breath and respiratory rate are affected differently by lung- and other infections, and that without incorporating any knowledge of the site of infection, it does not make sense to include these parameters. The purpose of this paper, such as this Example A1, is to present a method for tuning the sepsis CPN to predict all-cause 30-day mortality using a database of real patient cases. The new model is internally validated by testing its ability to predict 30-day mortality.
2. METHODS
In this paper, such as this Example A1, we describe the modification of the C-Sepsis CPN into the Learned Sepsis CPN (L-Sepsis CPN) (FIG 6, Phase III of the development framework). This is the final step of our sepsis CPN development framework, with the result being the L-Sepsis CPN, or from the network constructor's perspective: the posterior distributions. For the purpose of this paper, such as this Example A1, the Continuous (C-) Sepsis CPN is regarded as the specification of a prior conditional probability distribution for the observable variables. FIG 7 shows an overview of the L-Sepsis CPN. The non-infectious systemic inflammatory response syndrome (NISIRS) and sepsis represent two syndromes, the severity of which we describe using five states: no, mild, moderate, severe and critical. These states can also be thought of as the degree of activation of the immune system. Each of these severities is associated with a mortality rate. The NISIRS and sepsis nodes are linked to the infection variables, which we describe with individual parameter distributions, through a set of factor nodes. The specific structure of the sepsis CPN is described in the literature (31,34,36).
FIG. 7: Schematic view of the sepsis CPN identifying where automatic learning is to take place. 1: Learning the weights for the composite parameter distributions. 2: Learning the weights of the NISIRS severities across the intermediate factors. * denotes conditional probability tables defined prior to learning based on the C-Sepsis CPN and/or the literature. NISIRS: Non-infectious Systemic Inflammatory Response Syndrome
To prepare the L-Sepsis CPN for learning, structural changes were made. One issue with the C-Sepsis CPN is that some of the literature-derived distributions overlap greatly, which means that they are difficult to use for classification. Additionally, the individual Gaussian distributions defined for each severity state meant that very large odds ratios were seen for outlying parameter values. Our previous attempts at learning have taught us that it is difficult to learn individual Gaussian distributions for each severity state of sepsis. Instead of doing this, we create a semi-discrete environment where a set of Gaussian curves roughly corresponding to pathophysiological states covers the region of interest for a given variable. Instead of learning the distributions themselves, we learn how each sepsis state spreads itself over the set of defined distributions, creating multi-modal or composite distributions.
Our learning process can be defined as partially supervised. We cannot observe the states of the NISIRS and sepsis nodes as such; however we can observe something to which the severity states of both are linked: 30-day mortality. The explicit definition and unobservable nature of the non-infectious SIRS also creates identifiability issues when it comes to learning. To overcome this issue, we choose to learn in a stepwise fashion, learning first the distributions for patients with infection, and then those without infection. Learning is carried out using the Expectation-Maximisation (EM) method (37).
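A deliberately simplified sketch of this idea is given below: the component Gaussians are held fixed and EM re-estimates only the mixture weights, i.e. how one severity state spreads itself over the components. The real learning problem is harder, since the severity states are latent and are tied to 30-day mortality through the CPN, and the learning itself was carried out in Hugin; the component parameters and data below are illustrative only.

```python
# EM re-estimation of mixture weights over fixed Gaussian components, as a toy stand-in
# for learning one composite distribution. Component parameters and data are illustrative.
import numpy as np
from scipy.stats import norm

means = np.array([20.0, 28.0, 35.0, 42.0])   # fixed components spanning a plausible albumin range [g/l]
sds = np.array([4.0, 4.0, 4.0, 4.0])

def em_mixture_weights(x, n_iter=50):
    weights = np.full(len(means), 1.0 / len(means))         # start from uniform weights
    densities = norm.pdf(x[:, None], loc=means, scale=sds)  # N x K component densities
    for _ in range(n_iter):
        resp = weights * densities                          # E-step: responsibilities
        resp /= resp.sum(axis=1, keepdims=True)
        weights = resp.mean(axis=0)                         # M-step: updated mixture weights
    return weights

rng = np.random.default_rng(1)
x = rng.normal(27.0, 6.0, size=400)                         # dummy albumin values for one severity state
print("learned mixture weights:", em_mixture_weights(x).round(3))
```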
A 10-fold cross-validation is performed as an internal validation in order to ensure that the learning method is robust. The learned network is assessed for its discriminative ability using the area under the receiver operating characteristic (ROC) curve. The performance of the L-Sepsis CPN is compared to that for Treat with the C-Sepsis CPN. Calibration of the full learned model is assessed using the Hosmer-Lemeshow statistic and calibration curve.
Descriptive statistics and significance testing for the data were carried out using SPSS (Version 22, IBM Corporation). Continuous variables were analysed using one-way ANOVA, and categorical variables with the Pearson Chi-squared statistic. Automatic learning was performed using the EM learning algorithm within Hugin (Version 7.6 (x64), Hugin Expert A/S). ROC-analysis was also undertaken in SPSS. The difference in the area under curve was assessed using the method of Hanley and McNeil (38). Calibration was assessed using the Hosmer-Lemeshow Statistic (39). Additional plotting and calibration analysis was completed in Excel (Version 14.0, Microsoft Corporation).
3. RESULTS
Construction of a CPN requires two main steps: specifying/learning the structure and specifying/learning the conditional probability tables. FIG 7 presents a schematic for the sepsis CPN, removed from the Treat model for learning purposes. The individual parameter distributions are the continuous probability distributions defined for the infection variables in the C-Sepsis CPN, with modifications made where necessary to accommodate the requirements of the chosen learning method. These accommodations include, for example, shifting distributions or introducing additional distributions to better describe the physiological range of a given parameter.
The intermediate factors retain their design from the C-Sepsis CPN, with their severity states defined as the most severe of Sepsis and NISIRS. In the literature, sepsis and non-infectious SIRS are reported to have similar crude mortality rates given comparable severity of illness (40,41). This is implemented in the Severity and 30-day Mortality sections of the model.
3.1 Data
The learning process is divided into two stages. Stage 1 involves learning composite distributions for each of the variables for each sepsis severity state. Stage 2 involves learning the conditional probability tables for the factors linking the severity of illness for patients without infection to the intermediate factors (FIG 7). Table III presents descriptive statistics for the data used in the learning process, using the final diagnosis to split the dataset into patients with infection, and patients without infection. The final diagnosis was taken as that recorded in the patient file on discharge or death. Patient data were collected during trials and/or studies of the Treat system at Beilinson Hospital, Petah Tikva, Israel including 1695 patients from April - November 2004, and 1894 patients from December 2008 - April 2011. Of these 3589 patients, 2855 had a confirmed infectious or non-infectious diagnosis.
Table III Descriptive statistics for the data used in learning steps 1 and 2, groups defined according to final diagnosis
                              Infected patients (N=2514)            Non-infected patients (N=371)
Parameter                     %recorded  Median [Min-Max]/N [%]     %recorded  Median [Min-Max]/N [%]     P value
Laboratory parameters
Leukocytes [count/mm3]        96.5       11.4 [0-110.6]             98.1       10.1 [0.6-116.7]           0.24
Creatinine [mg/dl]            96.4       0.9 [0.1-11.3]             97.0       0.9 [0.1-7.7]              0.72
Albumin [g/l]                 32.9       35 [1-82]                  56.3       35 [4-49]                  0.94
Lactate [mmol/l]              13.0       1.9 [0.4-14.9]             20.8       1.9 [0.7-8.8]              0.81
Platelets [count/mm3]         96.4       228 [3-1250]               98.7       238 [19-1204]              0.002
CRP [mg/l]                    6.3        102 [1-1230]               13.2       68 [1-285.9]               0.012
Vital parameters
Heart rate [/min]             99.5       90 [30-162]                99.5       90 [40-146]                0.13
Systolic BP [mmHg]            99.8       122 [40-242]               99.5       123 [59-200]               0.83
Temperature [°C]              99.9       38.4 [32-42]               99.7       38 [33-40.6]               <0.001
Mental status                 80.2                                  89.2                                  <0.001
 - confused                              215 [10.7%]                           36 [10.9%]
 - comatose                              38 [1.9%]                             1 [0.3%]
Additional
Chills                        81.4       460 [22.5%]                79.5       45 [15.3%]                 0.004
Chemotherapy                  100        166 [6.6%]                 100        37 [10%]                   0.17
DIC                           75.9       8 [0.4%]                   76.5       1 [0.4%]                   0.751
30-day mortality              99.8       314 [12.5%]                99.7       41 [11.1%]                 0.122
3.2 EM Learning
The data used for stage 1 of the learning process are a set of patient cases where the patient had a final diagnosis of infection: we did not limit the data set to patients with sepsis as it is possible to have a patient with a local infection that does not have sepsis. Each patient case included all available measurements/recordings of the individual parameters on hospital admission and whether the patient had recently undergone chemotherapy, in addition to information on the 30-day mortality.
The learning data were split into ten sets of data for cross-validation; data for stage 1 and stage 2 (infection and no infection) were assigned to the ten sets separately. The ten learning sets for stage 1 totalled 2514 cases, while those for stage 2 totalled 371 patient cases where the final diagnosis was "no infection". For each cross-validation step, one of the ten sets of patient data was set aside as a validation set while the other nine sets were used for learning. Following cross-validation, the L-Sepsis CPN was learned using the complete data set.
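For illustration, the fold bookkeeping described above can be sketched as follows; the case counts are those given in the text, while the random assignment and the commented-out learning step are only placeholders.

```python
# Sketch of the 10-fold assignment: infected (stage 1) and non-infected (stage 2) cases
# are assigned to folds separately, and each fold is held out once for validation.
import numpy as np

def kfold_assignment(n_cases, n_folds, rng):
    """Randomly assign n_cases to n_folds of (nearly) equal size."""
    return rng.permutation(np.arange(n_cases) % n_folds)

rng = np.random.default_rng(42)
folds_infected = kfold_assignment(2514, 10, rng)       # stage 1 cases
folds_non_infected = kfold_assignment(371, 10, rng)    # stage 2 cases

for k in range(10):
    held_out = (folds_infected == k).sum() + (folds_non_infected == k).sum()
    # learn the CPN on the other nine folds, predict 30-day mortality for the
    # held-out cases, and accumulate ROC and calibration statistics here
    print(f"fold {k}: {held_out} cases held out for validation")
```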
An example of the composite distributions learned in stage 1 is presented in FIG 8. The composite distribution for serum albumin concentration is shown along with the C-Sepsis CPN Gaussian distributions from which they were constructed.
FIG 8: An example of initial specified distributions (A) and learned composite distributions (B) for one of the parameters in the sepsis CPN: albumin.
Each curve in panel B is constructed from a linear combination of the curves in panel A. Following EM-learning, a similar set of composite distributions could be drawn for each of the continuous variables.
3.3 Predictive Performance
The final step in this phase of the development process (FIG 6) was the internal validation of the L-Sepsis CPN's ability to predict 30-day mortality. For each step of the 10-fold cross-validation, predictions can be made for those cases in the validation set. We assess the respective abilities of the CPNs to predict 30-day mortality using the area under the ROC curve. The mean area under curve (AUC) for the validation datasets was 0.720 (range: 0.665-0.777). Learning the L-Sepsis CPN with all available data gave an AUC of 0.739 (95% CI 0.709-0.768) when the validation and training data sets were identical. There is no significant difference between the AUC of the L-Sepsis CPN and that obtained through cross-validation (p=0.16), signifying that there is no significant overfitting occurring.
FIG 9 presents ROC curves for the L-Sepsis and C-Sepsis CPNs using the 2855 patient cases used in the learning step. The reference line represents the line of no discrimination. The AUC for the C-Sepsis CPN was 0.647 (95% CI 0.616-0.678) and the difference between AUC for the L-Sepsis and C-Sepsis CPN was statistically significant (p<0.001).
FIG 9: ROC curves for the prediction of 30-day mortality for the L-Sepsis and C-Sepsis CPNs
We also assessed the calibration of the L-Sepsis CPN using the Hosmer-Lemeshow statistic and calibration curve. Using the ten validation sets and their corresponding models, the mean Hosmer-Lemeshow statistic was 7.99 (8 degrees of freedom, p=0.43). The Hosmer-Lemeshow statistic ranged from 5.53-11.24 with all models statistically well calibrated (minimum p=0.19). For the L-Sepsis CPN the Hosmer-Lemeshow statistic was 15.9 (8 degrees of freedom, p=0.04), which suggests that the model is not well calibrated. The calibration curve for the L-Sepsis CPN is presented in FIG 10. For the calibration curve, the predictions are split into deciles based on risk, and the number of observed events is plotted against the predicted number of events for each decile - a perfectly calibrated model will appear as a straight x = y line. Despite the statistical significance of the Hosmer-Lemeshow statistic, the predictions and observations seem to match well visually.
FIG 10: Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality using the L-Sepsis CPN.
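The decile-based calibration assessment can be sketched as follows; the predicted risks and outcomes below are simulated dummy data, and the degrees of freedom follow the convention of groups minus two used above.

```python
# Hosmer-Lemeshow statistic over deciles of predicted risk, comparing observed and
# expected event counts per decile. Dummy data only; not the study predictions.
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(pred, outcome, groups=10):
    order = np.argsort(pred)
    pred, outcome = np.asarray(pred)[order], np.asarray(outcome)[order]
    h = 0.0
    for idx in np.array_split(np.arange(len(pred)), groups):   # deciles by predicted risk
        n_g = len(idx)
        expected = pred[idx].sum()
        observed = outcome[idx].sum()
        p_bar = expected / n_g
        h += (observed - expected) ** 2 / (n_g * p_bar * (1 - p_bar))
    return h, chi2.sf(h, groups - 2)

rng = np.random.default_rng(3)
pred = rng.beta(2, 14, size=2855)          # dummy predicted 30-day mortality risks
outcome = rng.random(2855) < pred          # outcomes drawn consistently with the predictions
h, p_value = hosmer_lemeshow(pred, outcome)
print(f"Hosmer-Lemeshow H = {h:.2f}, p = {p_value:.2f}")
```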
We also conducted a sub-group analysis of model calibration for the two most prevalent infection sites in our dataset; lower respiratory tract (LRT) infections and urinary tract infections (UTI). There were 697 LRT infections with 128 deaths, and 486 UTIs with 41 deaths. FIG 11 presents regression lines drawn relating the predictions and observations for the two sub-groups along with the full dataset.
FIG 11: Regression lines for the observed vs. predicted events for all 2855 patients (solid), 697 patients with a lower respiratory tract infection (dotted) and 486 patients with a urinary tract infection (dashed)
The sub-group curves for LRT- and UTI infections show an interesting trend: for a given severity of the immune response, a lower respiratory tract infection is more likely to lead to death.
4. DISCUSSION
In this paper, such as this Example A1, we have described the process of tuning a manually constructed CPN through machine learning. Manual construction, building the structure and numbers using a mix of expert knowledge and the scientific literature, is effective despite any flaws in its process, as demonstrated by the performance of the C-Sepsis CPN. Many such systems have been constructed in this way, including the complete Treat CPN which has been shown to outperform clinicians (35). One of the explanations suggested for this success is that through the use of all available information, CPNs act as updateable repositories of knowledge, which can be used to perform reasoning while adhering strictly to the axioms of probability theory. However, the degree to which the CPNs can be refined manually is limited in the same way clinicians are: it is very difficult for humans to perform calculations with and recognise patterns within large multidimensional probability matrices. This points towards the necessity of a formalised learning process, in our case the EM-learning algorithm. Seen alongside our previous effort, the manual construction of the C-Sepsis CPN, the learning of the L-Sepsis CPN suffers similar limitations. Data quality and quantity are important factors in the success of machine learning. Several of the parameters used were recorded in fewer than 40% of cases, namely albumin, lactate and CRP. Two of these, albumin and lactate, are reported in the literature as being closely linked to mortality (42,43). However, despite the large number of incomplete cases, the EM algorithm allows us to use all available data.
The L-Sepsis CPN is not well calibrated according to the Hosmer-Lemeshow statistic, while all of the models in the cross-validation were statistically well calibrated. However, the predictions and observations appear to match well visually for the L-Sepsis CPN. The Hosmer-Lemeshow statistic is known to be very sensitive to sample size (44), which is one explanation of why the apparently better (visually) calibrated L-Sepsis CPN gives a significant test result, while the individual folds of the cross-validation give non-significant results with one tenth of the number of cases.
The model discriminates well between cases in terms of 30-day mortality, although we expect further improvements are possible. The sub-group analysis of infection sites points towards the existence of confounding variables not accounted for by our model. This warrants further investigation into possible confounders, possibly including age, site of infection, and presence of other comorbidities, among others. Re-integrating the CPN into Treat presents the opportunity to account for several such factors, which will be required to improve the discriminatory ability of the model to a point where it can be applied in clinical practice.
The goal of constructing the L-Sepsis CPN was to tune its predictive performance while retaining the advantages of the C-Sepsis CPN over the D-Sepsis CPN. The retention of continuous distributions, even though we now use composite curves, has meant that we continue to avoid the "jumps" between diagnoses that we saw with the D-Sepsis CPN. The overwhelmingly strong odds ratios seen occasionally in the C-Sepsis CPN are now also avoided.
Barring the availability of additional, higher quality datasets or datasets with additional parameters we do not believe that further significant improvements can be made to the sepsis CPN in and of itself. However, we do believe that the next steps in improving the performance of Treat will involve extending the learning to include evidence from the wider Treat network, which for example will make it possible to take the dependence of mortality on the site of infection into account.
5. CONCLUSIONS
The internal validation of the L-Sepsis CPN suggests that supplementing a manually constructed CPN with machine learning can improve predictive performance, with the L-Sepsis CPN showing a significant improvement in the discriminatory performance (p<0.001) for 30-day mortality compared with the C-Sepsis CPN.
EXAMPLE A2: Risk-Assessment can improve cost-effectiveness of PCR testing for bacteremia
Abstract:
Objectives: To show that risk-based stratification can improve the cost-effectiveness of PCR testing for bacteremia.
Perspective: Hospital management
Setting: Primary and tertiary care hospital in Israel
Methods: A cost effectiveness analysis compared two diagnostic strategies for bacteremia: 1) direct multiplex real-time PCR for all patients presenting with suspected sepsis from whom blood cultures are drawn and 2) Stratifying the patients according to the risk of bacteremia by a previously developed computerized decision-support system, performing the PCR only for a high-risk group defined by a threshold.
The strategies were compared by calculating the incremental cost-effectiveness ratio (ICER) over standard care (blood cultures without PCR) in terms of euros (€) per life-year (LY) of the two strategies. Cost-effectiveness was explored for a range of thresholds. The sensitivity to additional parameters involved in the ICER calculation was assessed by Monte-Carlo analysis.
Results: The ICER of PCR when performed for all patients was 16,774€/LY. A threshold of 11.75% defined a low-risk group comprising 63.2% of the patients where the ICER reached the NICE cost-effectiveness threshold of 35,000€/LY. Eliminating PCR for these patients, the ICER for 36.8% of patients in the high-risk group was 8,538€/LY. The ICER could be further reduced to 4,281€/LY by choosing a threshold of 25%. This limited testing to a high-risk group comprising 7.8% of all patients.
Conclusions: Risk-based stratification can improve the cost-effectiveness of PCR.
Introduction
Rapid molecular diagnostics such as multiplex real time polymerase chain reaction (PCR) provides a rapid alternative to blood culture (BC). PCR can provide a positive test result in 6 hours as opposed to 24-48 hours (45,46). Other new methods may be able to match or exceed the speed of PCR (47). In sepsis, rapid identification of the causative pathogen is vital in ensuring that early, appropriate antimicrobial treatment can be given, saving lives, bed days and additional costs due to unnecessary treatment (48,49).
Sepsis caused by bacterial infections is difficult to diagnose, with criteria based on a set of non-specific inflammatory markers, which results in a low pre-test probability for PCR. Although PCR is faster than BC, it is also much more expensive (45). However, risk-based stratification could be applied to select patients for whom PCR testing is cost-effective. The National Institute for Health and Care Excellence (NICE) in the United Kingdom defines a cost-effectiveness threshold of £20,000 to £30,000 (about €35,000) per quality adjusted life year (QALY) gained due to a health care intervention (50).
The idea of risk-based stratification is established in the literature for the elimination of unnecessary BCs (9-12). The same concept can be applied to the ordering of PCR tests: given a suitable predictive model, those with a pre-test probability greater than a set cut-off should receive the test. Leli et al. (51) showed that the same factors can be used to predict both BC results and the results of a commercial direct-from-blood PCR test. Stratification should be integrated into the clinical workflow, providing the predicted probability of positive BC at the time when blood is drawn for BC, allowing a decision to be made on whether a concurrent PCR test is warranted. In this study, we describe the potential of a decision support system to stratify patients with suspected sepsis into low- and high-risk groups to direct PCR testing and attempt to improve the cost-effectiveness of such costly tests.
Materials and Methods
Target population, setting and location
The target population consisted of all patients suspected of infection at Beilinson Hospital, a primary and tertiary care hospital in Israel. Both community acquired and health care associated infections were considered.
Comparators and choice of decision-analytical model
Two diagnostic strategies were compared. In the first diagnostic strategy, direct PCR testing of blood is performed for all patients with suspected sepsis from whom blood cultures were drawn. In the second diagnostic strategy, risk-assessment is performed by using an algorithm (52), based on a previously developed computerized decision support system (TREAT) (12,31,34,35), to estimate the probability (pBC+) of bacteremia. For high-risk patients, where pBC+ is greater than a threshold value TBC+, a PCR test is performed in addition to the BC. The incremental cost-effectiveness of the risk-assessment strategy, relative to the standard clinical strategy - blood cultures only - was used to compare the two strategies. To provide additional justification for the risk-assessment strategy, we also calculated the cost-effectiveness of PCR in the low-risk group (pBC+ < TBC+).
FIG 12: The two diagnostic strategies: 1) Direct PCR testing of blood is performed in addition to BC for all patients; 2) Risk-assessment strategy, where PCR is added as an adjunct test for high-risk patients.
Study Perspective, time horizon, discount rate and health outcome
In the cost-effectiveness analysis costs were seen from the perspective of the hospital management, addressing total healthcare cost accrued during the infectious episode. Total costs were approximated by the two largest contributors: the cost of diagnostic testing and the cost of hospitalization (bed-days). The number of life-years (LY) gained by introduction of the risk-assessment strategy was used as the measure of effectiveness. The number of life-years gained was calculated from a reduction in mortality and an estimated number of life-years gained for survivors. No discount rates were applied, neither for costs nor for life-years.
Estimating resources and costs, currency, price date, and conversion
The PCR test considered was a commercially available multiplex real-time PCR test. The cost of a PCR test includes reagent, equipment, and personnel costs as reported in the literature in Euro (€) in 2010 (53) and 2012 (54). The cost of hospitalization was calculated from Length Of Stay (LOS) multiplied by an average cost per bed-day for Beilinson Hospital. Cost per bed-day was obtained from WHO-CHOICE (55) for 2008, and adjusted for Israeli inflation to give 2015 costs. Costs were obtained in New Israeli Shekel (NIS) and converted to € using an exchange rate of 4.3 NIS/€. Details of the calculation are included in the Appendix.
Measurement of cost-effectiveness
The incremental cost-effectiveness ratio (ICER) of risk-assessment is calculated as the ratio between the incremental cost (Δcost) and the incremental number of life-years saved (ΔLY):
ICER = Δcost / ΔLY     (1)
Since both costs and life-years remain unchanged except for the high-risk group, the increments can be calculated by considering the high-risk group only. The procedure followed to calculate the increments for the high-risk group is illustrated in FIG 13. As can be seen from FIG 13, it is required to estimate the 9 parameters listed below:
1) costbed is the cost per bed-day in € (Appendix).
2) ΔLOSa is the difference in the mean length of stay (in days) between patients receiving appropriate empirical treatment and those receiving inappropriate empirical treatment. This was calculated from the patient data.
3) costPCR is the cost of one PCR test and is set at €300 (53,54).
4) fBC+ is the fraction of patients in the high-risk group with positive blood culture. The fraction was calculated from the patient data.
5) fPCR+ is the fraction of patients in the high-risk group expected to return a positive PCR test. The patient data included information on BCs only, so we used data from the literature to find the relationship between positivity rates for BCs and PCR (Appendix).
6) fI is the fraction of patients in the high-risk group with MDI receiving inappropriate antimicrobial treatment. The fraction was calculated from the patient data.
7) fsus is the fraction of the patients receiving inappropriate empirical treatment expected to receive appropriate treatment given the pathogen name. fsus ranges from 0 to 1, where 0 implies that knowing the name of the pathogen does not increase the number of patients receiving appropriate treatment, and fsus = 1 indicates that all patients who have positive PCR will receive appropriate treatment (Appendix).
8) Δmort is the difference in mortality between patients receiving inappropriate empirical treatment (mortI) and those receiving appropriate empirical treatment (mortA): Δmort = mortI - mortA (Appendix).
9) LYsurv is the average number of life-years gained for a sepsis survivor. We assume that LYsurv = 5.43 years (53).
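One plausible reading of how these nine parameters combine into the increments of equation (1) is sketched below; it is not a reproduction of FIG 13. The values of n, fPCR+, costPCR and LYsurv are taken from the text, whereas fI, fsus, Δmort, ΔLOS and costbed are hypothetical placeholders inserted only to make the arithmetic concrete.

```python
# Sketch of one possible ICER calculation for the high-risk group (T_BC+ = 20%).
# This is an interpretation of the parameter list, not a reproduction of FIG 13.
n = 700            # high-risk patients at the 20% threshold (from the text)
f_pcr_pos = 0.294  # expected fraction of positive PCR tests (from the text)
cost_pcr = 300.0   # cost of one PCR test, EUR (from the text)
ly_surv = 5.43     # life-years gained per sepsis survivor (from the text)

f_i = 0.25         # hypothetical: fraction with an MDI on inappropriate empirical treatment
f_sus = 0.60       # hypothetical: fraction of those switched to appropriate treatment
delta_mort = 0.05  # hypothetical: mortality difference, inappropriate minus appropriate
delta_los = 2.0    # hypothetical: extra bed-days under inappropriate treatment
cost_bed = 400.0   # hypothetical: cost per bed-day, EUR

switched = n * f_pcr_pos * f_i * f_sus                       # patients whose treatment is corrected
delta_cost = n * cost_pcr - switched * delta_los * cost_bed  # extra test cost minus bed-days saved
delta_ly = switched * delta_mort * ly_surv                   # life-years gained
print(f"ICER = {delta_cost / delta_ly:,.0f} EUR per life-year (placeholder inputs)")
```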
These parameters are estimated from the case database whenever the data in the case database allows this. Otherwise the parameters are synthesized from the literature.
FIG 13: Calculation of the ICER from the six parameters using a threshold TBC+ = 20%.
The case database
Patient data were collected during trials and/or studies of the Treat (34,35) system at Beilinson Hospital, in the period from 2004-2011. Patients were included in the studies based on suspicion of infection: those for whom BCs were drawn, those receiving antimicrobials not for prophylaxis, those with SIRS, those with a clinically identified focus of infection (35). In total 3589 patients were included, 1695 of which were collected from April - November 2004 during a cluster-randomised controlled trial (35). Data from 1594 patients were collected during a prospective cohort study from May 2009 to April 2011, using the same inclusion and exclusion criteria as the cluster-randomised controlled trial. The data included information on the empirically prescribed antimicrobial treatment of these patients, BC results and other microbiology (urine, sputum, and other local samples), including results of in vitro antimicrobial susceptibility testing, and 30-day mortality.
Assumptions
We assumed that the rate of positive PCR tests is linked to the rate of positive BCs by a fixed odds ratio for any given subset of patients. The fraction of patients whose treatment may be influenced comprises the patients who would return a positive PCR test and were given inappropriate empirical treatment. To determine the fraction of these patients for whom appropriate treatment will be given, we used the decision support system TREAT (35) to determine the effect of knowing the pathogen name on the treatment given. We also assume that the mortality of patients receiving inappropriate and appropriate treatment is linked by a fixed odds ratio.
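The fixed odds-ratio links assumed here can be made concrete with a short sketch: given a baseline probability and an odds ratio, the linked probability follows by converting to odds and back. The odds ratio of 1.60 is the pooled value used in the Discussion; the baseline mortality of 10% is a hypothetical placeholder.

```python
# Converting a probability through a fixed odds ratio, as assumed for the PCR/BC
# positivity rates and for mortality under appropriate vs inappropriate treatment.
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Return the probability whose odds are odds_ratio times those of p_baseline."""
    odds = odds_ratio * p_baseline / (1.0 - p_baseline)
    return odds / (1.0 + odds)

mort_a = 0.10                             # hypothetical mortality on appropriate treatment
mort_i = apply_odds_ratio(mort_a, 1.60)   # implied mortality on inappropriate treatment
print(f"mortA = {mort_a:.3f}, mortI = {mort_i:.3f}, delta_mort = {mort_i - mort_a:.3f}")
```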
Assessment of inappropriate antimicrobial treatment
Assessment of inappropriate antimicrobial treatment was carried out for all patients that had a microbiologically documented infection (MDI): those patients with at least one clinically significant isolate (blood or other culture). Coagulase negative staphylococci, bacillus sp., corynebacteria sp., bacteroides sp. and anaerobic gram-positive rods were considered non clinically-significant isolates. Inappropriate treatment was defined by in vitro antimicrobial susceptibility testing showing intermediate or full resistance to the empirically given antimicrobial.
Analytical methods
Risk-assessment variables
The risk-assessment algorithm (52) uses any or all of the following variables: temperature, chills, heart rate, mean arterial pressure, mental status, neutrophil fraction, creatinine, C-reactive protein (CRP), lactate, albumin, and platelet count. Although the model is tolerant to missing values, performance declines when many values are missing.
Sensitivity Analysis
The calculation of ICER was performed for different choices of the threshold TBC+ for the high-risk group. For the range of thresholds, the uncertainty in the predicted results was characterized through Monte Carlo simulation, by re-iterating the cost-effectiveness calculation while independently varying each input parameter according to its underlying statistical distribution (i.e. fractions of patient groups are binomially distributed, odds ratios follow a log-normal distribution). Each simulation consisted of 10,000 iterations. The results of the simulations are used to generate 95% confidence intervals (CI) for the ICER. Additional analysis of the ICER's sensitivity to PCR cost and the number of life-years gained by a survivor is included in the Appendix. All calculations were done with Matlab version R2015b (The Mathworks, Inc.).
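The original analysis was performed in Matlab; purely for illustration, the resampling scheme can be sketched in Python as below, with binomially resampled fractions and a log-normally resampled mortality odds ratio. The parameter values and spreads are hypothetical placeholders rather than those of Table IV, and the ICER combination follows the earlier sketch rather than FIG 13 itself.

```python
# Sketch of the Monte Carlo sensitivity analysis: resample the input parameters from
# their assumed distributions and recompute the ICER. Placeholder values throughout.
import numpy as np

rng = np.random.default_rng(7)
iters, n = 10_000, 700

f_pcr_pos = rng.binomial(n, 0.294, size=iters) / n                 # fraction of positive PCR tests
f_i = rng.binomial(n, 0.25, size=iters) / n                        # hypothetical fraction on inappropriate treatment
f_sus = rng.binomial(n, 0.60, size=iters) / n                      # hypothetical fraction switched to appropriate
or_mort = rng.lognormal(mean=np.log(1.60), sigma=0.10, size=iters) # mortality odds ratio, hypothetical spread

mort_a = 0.10                                                      # hypothetical baseline mortality
odds_i = or_mort * mort_a / (1 - mort_a)
delta_mort = odds_i / (1 + odds_i) - mort_a                        # implied mortality difference

# cost_pcr and ly_surv are from the text; cost_bed and delta_los are hypothetical
cost_pcr, cost_bed, delta_los, ly_surv = 300.0, 400.0, 2.0, 5.43
switched = n * f_pcr_pos * f_i * f_sus
icer = (n * cost_pcr - switched * delta_los * cost_bed) / (switched * delta_mort * ly_surv)

low, median, high = np.percentile(icer, [2.5, 50, 97.5])
print(f"ICER median {median:,.0f} EUR/LY (95% CI {low:,.0f} - {high:,.0f})")
```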
Compliance with reporting guidelines
This economic evaluation followed the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement (56).
Results
Stratification of patients
Out of the 3589 patients in the case database, NBC+ = 377 had a positive BC, while BC was negative in the remaining NBC- = 3212. The receiver operating characteristic (ROC) of the model predicted probability (pBC+) of positive BC is shown in FIG 14. The area under curve (AUC) was 0.75 (95% CI 0.72-0.77), meaning the prediction was fair. High risk patients were those with pBC+ higher than a threshold TBC+. For TBC+ = 20% a total of n = 700 patients (19.5% of all patients) fell above this threshold. Out of these 700 patients nBC+ = 167 patients had positive BC and nBC- = 533 had negative BC (shaded area in FIG 14). From this it follows that the sensitivity (the true positive rate) at the threshold was: Sensitivity = nBC+/NBC+ = 0.44 and that the false positive rate, 1 - specificity = nBC-/NBC- = 0.17.
FIG 14: ROC for the prediction of positive BC, showing the high-risk group (shaded area) using a threshold TBC+= 20%.
Cost Effectiveness Calculation
FIG 13 provides a walkthrough of the process of determining the ICER for the high-risk group.
The numbered call-outs refer to the following steps in the calculation:
1. Stratification based on model predicted probability pBC+ of positive blood culture
2. Fraction fPCR+ of patients expected to return a positive PCR test
3. Fraction fI of patients receiving Inappropriate Treatment
4. Fraction fsus of patients receiving Inappropriate Treatment who will now receive Appropriate Treatment
5. Reduction in mortality expected due to change from inappropriate to appropriate treatment
6. Average number of life-years gained for a sepsis survivor
The fraction of positive BCs in the high risk group was fBC+ = nBC+/n = 23.9% (95% CI 20.7 - 27.0%). The expected fraction of positive PCR tests was fPCR+ = 29.4% (Appendix). It is expected that nPCR+ = n * fPCR+ = 206 patients would return a positive PCR test. The ICER was calculated to be ICER = 6,934€/LY. Table IV presents the parameters used in the model along with their distributions for TBC+ = 20%. The 10,000 trial Monte Carlo analysis yielded a 95% CI for the ICER of 3,669 to 12,484€/LY. The ICER for the complementary low-risk group was 24,056 (95% CI 15,936 - 38,959)€/LY.
Table IV Parameters and their distributions used in the Monte Carlo sensitivity analysis. Mean±SD is given for the high-risk group when TBC+ = 20%.
* Sensitivity to LYsurv, costPCR and costbed is presented separately in the Appendix.
The ICER calculations were repeated for a range of values for the threshold TBC+, from 0% to 35% in steps of 2.5%. For each threshold, distributions were recalculated for the model parameters based on our own data, while those synthesized from the literature remained the same. Individual distributions for each threshold can be found in the Appendix. Median ICERs for patients above the threshold (high-risk) and below the threshold (low-risk) are presented in Table V, calculated by the Monte Carlo analysis.
Table V: ICERs for complementary low- and high-risk groups for thresholds, TBC+, ranging from 0% to 35% in steps of 2.5%. The NICE threshold was met for the low-risk group for TBC+ = 11.75%*
FIG 15 plots the ICER distributions resulting from the Monte Carlo analysis in the form of a boxplot, grouped according to the TBC+ threshold. Panel A shows the results for the high-risk group and Panel B shows the results for the corresponding low-risk groups. The ICER for the high-risk group ranged from 1,353€/LY using a cut-off of TBC+ = 32.5% to 16,774€/LY with no cut-off. The ICER in the low risk group ranged from 16,924€/LY using a cut-off of TBC+ = 32.5% to 534,355€/LY when the cut-off was TBC+ = 2.5%. Choosing TBC+ so that the ICER for the low-risk group was equal to the NICE cost-effectiveness threshold (35,000€/LY) gave an ICER = 8,538€/LY for the remaining high-risk group.
FIG 15: Boxplot for the Monte-Carlo analysis repeated at cut-offs of TBC+ = 0% to 35% in steps of 2.5%. Panel A shows the results for the high-risk group and Panel B for the low-risk group.
Discussion
We believe that this paper, such as this Example A2, represents the first quantitative analysis of the expected cost-effectiveness of direct PCR testing in blood for patients suspected of infection, outside the intensive care unit. If PCR were performed in all patients, PCR testing would have saved life-years at an ICER of 16,774€/LY, which is below the NICE threshold for cost-effectiveness (35,000€/LY). The cost-effectiveness can be improved by risk-assessment: in a low-risk group containing the 63.2% of the patients with pBC+ < 11.75%, the ICER was 35,000€/LY, equal to the NICE threshold. If PCR was not performed on these patients then the ICER for the remaining high-risk group was reduced to 8,538€/LY, making PCR a reasonably attractive diagnostic option. Restriction on lab costs or on the capacity to run PCR tests may make it attractive to choose a higher threshold for the high-risk group. For example, if PCR is performed only for 19.5% of all patients using a pBC+ > 20%, ICER is reduced to 6,934€/LY. For pBC+ > 25%, PCR testing would be restricted to 7.8% of the patients and have an ICER of 4,281€/LY.
We assumed that the odds ratio of mortality between patients receiving appropriate empirical treatment and those receiving inappropriate empirical treatment was ORmort = 1.60. This is a pooled odds ratio from a meta-analysis of 26 studies, adjusted for background conditions and sepsis severity (48). There was considerable heterogeneity between the studies, with some reporting no effect and others odds ratios greater than 15. A recent multi-centre study of patients with gram-negative bacteraemia also found no effect of appropriate empirical treatment on survival [20]. However, in our own data, the observed ORmort was 2.26 for the high risk group where TBC+ = 20%. The decision to use the lower odds ratio ORmort = 1.60 was a conservative choice, taken to provide a more generalizable result, based on many studies.
We assumed that reduction in mortality due to immediate appropriate antibiotic treatment would be the same as the reduction due to appropriate antibiotic treatment initiated 6 hours later, corresponding to the time required for a PCR test. Some support can be found for this view despite the observation that in septic shock patients in the ICU, each hour delay until the initiation of appropriate antimicrobial treatment was associated with an 8% relative increase in mortality (57). However, for patients who received antibiotics within six hours of ED presentation, a reduction in the time to antibiotics was not found to be associated with an improvement in clinical outcomes (58). Another study also supported the assumption: No association between delays to antibiotics and mortality was found in a cohort of sepsis patients presenting to the ED (59) with the exception that severe sepsis patients showed increased mortality when time to treatment exceeded 6 hours from triage.
The main limitation of our results is that we did not have matched BC-PCR results, requiring us to make an assumption about the expected rate of positive PCR results. We were also limited by the assumption that the antimicrobial treatment given by a clinician following a positive PCR test would be the same as that advised by Treat given the same conditions. We expect that the degree to which knowing the name of the pathogen increases the rate of appropriate treatment will vary between geographical regions, corresponding to varying resistance patterns. The costs used for this analysis were those of a commercial multiplex real-time PCR test; however the same considerations can be applied for any diagnostic strategy for the rapid diagnosis of bacteremia to shorten the time to prescription of optimal antibiotic treatment and reduce the usage of unnecessary and superfluous treatments (60). Several diagnostic pathways are currently considered. MALDI-TOF-MS of positive blood cultures can be added to classical methods of bacteremia diagnosis. At the same time point, PCR for specific resistance genes can be applied (61). New systems are in development to automate all these processes (62). Common to most is an increase in laboratory costs, especially when the new test does not replace the routine processes, as with MALDI-TOF and PCR that do not perform well with polymicrobial growth in blood cultures. With improving technology, we expect more genotypic tests to be available with increasing costs.
In summary, when examining the patient group as a whole, the use of real-time PCR in an inpatient population is cost-effective according to the NICE threshold for cost-effectiveness. However, passing under this threshold is only one criterion that should be considered. Whether a test can be applied must also consider the availability of resources. Risk-based stratification can be used to improve the cost-effectiveness of PCR by removing patients for whom PCR testing is not cost-effective.
EXAMPLE A3: Automatic Learning of mortality in a CPN model of the Systemic Inflammatory Response Syndrome
Abstract:
The aim of this paper, such as this Example A3, is to apply machine learning as a method to refine a manually constructed CPN for the assessment of the severity of the systemic inflammatory response syndrome (SIRS). The goal of tuning the CPN is to create a scoring system that uses only objective data, compares favourably with other severity-scoring systems and differentiates between sepsis and non-infectious SIRS. The resulting model, the Learned-Age (LA) -Sepsis CPN, has good discriminatory ability for the prediction of 30-day mortality with an area under the ROC curve of 0.79. This result compares well to existing scoring systems. The LA-Sepsis CPN also has a modest ability to discriminate between sepsis and non-infectious SIRS.
Key Words: SIRS; Sepsis; Bayesian Network; EM learning; Prediction.
1. Introduction
Sepsis is a major healthcare problem with high mortality: reported rates range from 15% to 60%, and higher still in cases of septic shock with multiple organ failure (41,63,64). Sepsis elicits an activation of the patient's immune system, a state referred to as the Systemic Inflammatory Response Syndrome (SIRS). A similar response can also be elicited by processes involving tissue damage, such as trauma or surgery. Assessment of the individual patient's severity is an important factor in deciding on both the diagnostic work-up and the course of treatment. TREAT is a decision support system capable of assessing patients suspected of severe infection and recommending the optimal course of treatment (31,34,35,65-67). The infection model behind Treat is a large Bayesian Network, also referred to as a Causal Probabilistic Network (CPN), consisting of approximately 6000 nodes. At its core, the stochastic model describes the interaction of signs and symptoms of infection, bacteria and antibiotics. The part of the model that deals with the SIRS response is the "Sepsis CPN". The Sepsis CPN assesses the severity of illness based on the degree of the inflammatory response indicated by a set of "infection variables".
CPNs can be used to construct diagnostic models for diseases (25-28), and are ideal for this purpose due to their ability to combine knowledge, represented by patient databases, expert opinion and reports in the literature, with strict reasoning adhering to the axioms of probability theory. Construction of a CPN can be manual, or automatic, where the latter refers to the use of machine learning techniques. The value of the manual approach has been demonstrated empirically through the success of Treat (34,35), which was constructed in this way. The basic units of a CPN are nodes, which represent stochastic variables, and arrows, which define the causal relationships between the nodes. Mathematically, the arrows represent conditional probability tables. Constructing a CPN therefore consists of specifying a set of stochastic variables and the causal probabilistic relationships between them. Nodes may represent either observable events or concepts, for example pathophysiological links or diagnoses, which may not be observable but still hold interest. One advantage of CPNs is that, following their construction, they can be used to update the probability distributions for such unobserved events based on the observed evidence.
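As an illustration of this updating of unobserved variables, the following minimal sketch performs the Bayes-rule calculation for a toy two-node network. The node names, states and probabilities are illustrative assumptions and are not taken from the Treat or Sepsis CPNs.

```python
import numpy as np

# A minimal two-node CPN: a hidden severity node with five states and one
# observable binary finding (fever present/absent).  Numbers are illustrative.
states = ["no", "mild", "moderate", "severe", "critical"]
prior = np.array([0.40, 0.30, 0.15, 0.10, 0.05])               # P(Severity)
p_fever_given_sev = np.array([0.05, 0.30, 0.60, 0.80, 0.90])   # P(Fever=yes | Severity)

# Observing evidence (Fever = yes) updates the distribution of the
# unobserved severity node via Bayes' rule.
joint = prior * p_fever_given_sev          # P(Severity, Fever=yes)
posterior = joint / joint.sum()            # P(Severity | Fever=yes)

for s, p in zip(states, posterior):
    print(f"P(Severity={s} | Fever=yes) = {p:.3f}")
```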
The development of the Sepsis CPN also used a semi-formal manual approach to construction. FIG 16 presents a framework for the development of the Sepsis CPN, each stage of which has been described in the literature. The original model is termed the Discrete Sepsis CPN (D-Sepsis CPN), where all nodes were discrete stochastic variables (31). In the next version of the model, the Continuous Sepsis CPN (C-Sepsis CPN), continuous variables were introduced (36). To overcome some of the shortcomings of the D- and C-Sepsis CPNs, the previously manual construction was supplemented by tuning via machine learning. The result of this was the Learned Sepsis CPN (L-Sepsis CPN) (68).
FIG 16 Sepsis CPN development framework. Phase I describes the development of the discrete sepsis CPN (D-Sepsis CPN), phase II the continuous sepsis CPN (C-Sepsis CPN) and phase III the development of the learned sepsis CPNs (L-Sepsis CPN and the LA-Sepsis CPN where age is included) through formal learning methods.
Other prognostic models learned from data are described in the literature, notably the Mortality in the Emergency Department Sepsis (MEDS) score (22) and the modified Rapid Emergency Medicine Score (mREMS) (69,70). The SIRS score is used in sepsis diagnosis (71). The variables used by the MEDS score, mREMS, SIRS and the C-Sepsis CPN are shown in Table VI. MEDS and mREMS are based on logistic regression models, and are designed for use outside the intensive care unit, in particular for the emergency department (ED). Logistic regression models have the disadvantage that they cannot handle missing values, unlike CPNs. These models do however represent a basis for comparison, in that Treat and the related sepsis CPNs are intended for use in both the ED and medical wards. Prognostic scoring systems such as MEDS, mREMS and the sepsis CPN are useful tools for clinicians, aiding in the decision-making process when it must be decided whether the patient should receive antibiotics, be admitted to the hospital, or be admitted to the intensive care unit (ICU).
Table VI Variables used in the MEDS, mREMS and SIRS scores and the C-Sepsis CPN
MEDS Score Variables:
Clinical diagnoses: Terminal illness (<30 days); LRTI; Septic shock; Mental status
Objective/measured variables: Bands (immature neutrophils); Age; Tachypnea/hypoxia; Platelets; Nursing home resident

mREMS Variables (objective/measured variables): Temperature; Mean Arterial Pressure; Heart rate; Respiratory rate; SaO2; Age; Mental status: normal/confused*
*The original REMS instead requires the Glasgow Coma Score.

SIRS Variables (objective/measured variables): Temperature; Leukocytes; Heart rate; Respiratory rate

C-Sepsis CPN Variables (objective/measured variables): Temperature; Chills; Albumin; CRP; Leukocytes; Recent chemotherapy; Creatinine; Mental status; Heart rate; Systolic Blood Pressure; DIC; Dyspnea; Respiratory rate; Supplementary Oxygen; SaO2
The purpose of this paper, such as this Example A3, is to present an extension of the method for tuning the sepsis CPN to predict all-cause 30-day mortality using a database of real patient cases (68), where patients' age is added to the model to form the LA-Sepsis CPN. Five-fold cross-validation is used to test the LA-Sepsis CPN's ability to predict 30-day mortality. The goal of tuning the LA-Sepsis CPN is to create a scoring system that uses only objective data, compares favourably with other severity-scoring systems and differentiates between sepsis and NSIRS.
2. Materials and Methods
In this paper, such as this Example A3, we describe the modification of the C-Sepsis CPN into the age-adjusted Learned Sepsis CPN (LA-Sepsis CPN) (FIG 16, Phase III of the development framework). This model forms a separate branch to the L-Sepsis CPN, which has been described previously (68), although much of the methodology used is the same.
2.1 LA-Sepsis CPN Model and Adjustments
FIG 17 shows the complete structure of the LA-Sepsis CPN. Two syndromes are represented: the non-infectious systemic inflammatory response syndrome (NSIRS) and sepsis. The Diagnosis node is used to toggle between these syndromes in the learning process (Section 2.3). The severity of each syndrome is described using five states: no, mild, moderate, severe and critical, which can also be thought of as the degree of activation of the immune system. Each severity state is associated with a mortality rate assigned in the node AliveSev. The 30-day mortality is also linked to a patient's age, independent of illness severity (72). In FIG 17 this is implemented through the AgeRisk node. An additional background mortality is also included to account for mortality related to neither NSIRS nor sepsis: BackgroundMort.
FIG 17: The LA-Sepsis CPN structure. Nodes are represented by ovals, causal links by arrows. Causality is expressed through conditional probability tables. The nodes with double rings represent stochastic variables with continuous probability distributions. The remaining nodes have discrete probability distributions. NSIRS is the non-infectious systemic inflammatory response syndrome. Nodes that are not learned are designated 1.
BackgroundMort is a binary node and adds to the mortality assessment of the AliveSev node as an OR function: the patient can die due to the severity of their illness, due to background causes, or both. AgeRisk also adjusts the assessment of 30-day mortality by allowing mortality rates to be increased for older patients and decreased for younger patients. The cut-off for the change between increase and decrease is approximately 80 years.
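A minimal sketch of this OR combination, assuming the two causes of death act independently; the probabilities below are illustrative, with the 8% figure taken from the moderate-severity mortality rate given in Section 2.1:

```python
# Illustrative OR-combination of two independent causes of 30-day mortality:
# death due to illness severity and death due to background causes.
p_sev = 0.08   # P(death | moderate severity), cf. the 8% rate in Section 2.1
p_bg = 0.02    # assumed background 30-day mortality (illustrative)

# The patient survives only if neither cause leads to death.
p_death = 1.0 - (1.0 - p_sev) * (1.0 - p_bg)
print(f"Combined 30-day mortality: {p_death:.4f}")   # 0.0984
```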
The infection variables are linked to the NSIRS and Sepsis nodes through a set of "factors", labelled as: Fact_leuko_creat, Fact_fever, Fact_alb and Fact_shock. Each factor uses the same five severity states as the NSIRS and Sepsis nodes. These factors were introduced in the original D-Sepsis CPN as a means to account for correlation of the infection variables. The original constructors carried out a factor analysis, leading to this statistical construct where the factors explained 80% of the variation in the data. These factors may also have physical analogues: analysis of a dataset including inflammatory mediators found that interleukin-6 had a loading of 0.91 on Fever_factor and tumor necrosis factor alpha had a loading of 0.50 on Fact_shock (31). Each factor could then be thought of as representing a branch of the immune system. The correlation with biomarkers of activation of the immune system lends credence to the technique used.
The sepsis factors retain their design from the C-Sepsis CPN, with their severity states defined as the most severe of Sepsis and NSIRS. In the literature, sepsis and non-infectious SIRS are reported to have similar crude mortality rates given comparable severity of illness (40,41). This is implemented in the Severity and AliveSev nodes. The severity states of NSIRS and sepsis (no, mild, moderate, severe and critical) are assigned mortality rates of 0%, 1%, 8%, 45% and 75%, respectively. Our choice of mortality rates for no, mild and moderate is based loosely on those defined for the Pneumonia Severity Index (73), with mortality rates for severe and critical adjusted to account for the higher mortality of septic patients: mortality rates from 45-75% have been reported for patients with severe sepsis or septic shock with multiple organ failure (41,63,64). The infection variables were chosen based on their predictive abilities in determining either the severity or the etiology (infectious/non-infectious) of a patient's illness. The ability of each variable in determining the severity was assessed by the area under the ROC curve for the prediction of 30-day mortality. The LA-Sepsis CPN makes use of a "semi-discrete" learning environment, where individual Gaussian probability distributions are not defined for each severity state of sepsis. Instead, Gaussian curves are defined that roughly cover the range of a given variable as seen in the database. Example curves are seen in FIG 18, panel A. The semi-discrete environment is created by the introduction of "mapping" nodes for the continuous variables: the leukoMapping, plateletMap, creatMap etc. nodes (labelled 2.) shown in FIG 17. These mapping nodes define the pathophysiological states for which Gaussian distributions should be defined. By defining Gaussian distributions for the infection variable and learning the "mapping" node, the resulting distribution for the variable given a sepsis severity will be a linear combination of the predefined Gaussian distributions: a composite distribution.
2.2 Patient data
Patient data were collected during trials and/or studies of the Treat system at Beilinson Hospital, Petah Tikva, Israel including 1695 patients from April - November 2004, and 1894 patients from December 2008 - April 2011. Patients were included in the studies based on suspicion of infection: those for whom blood cultures were drawn, those receiving antimicrobials not for prophylaxis, those with SIRS, those with a clinically identified focus of infection (35). Each patient case included all available
measurements/recordings of the individual infection variables on hospital admission and whether the patient had recently undergone chemotherapy, in addition to information on the 30-day mortality. Each case was also annotated with the department's final diagnosis, which could be infectious: pneumonia, meningitis, pyelonephritis, etc. or non-infectious, noted simply as no infection, or unknown which denoted that a final diagnosis was unavailable. The final diagnosis was taken as that recorded in the patient file on discharge or death.
2.3 Learning
Learning is carried out using the Expectation-Maximization (EM) method (37). EM learning is part of the family of maximum likelihood methods and is offered as a tool within the commercial CPN software Hugin. The algorithm aims to maximize the collective likelihood of all cases used in learning by adjusting the conditional probability tables of a specified set of nodes that are to be learned.
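As an illustration of the principle, the following sketch runs EM on a toy model with one hidden binary node and two observed binary findings containing missing values. It is not the Hugin implementation used for the Sepsis CPN; all variables and numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data set: two binary findings (columns) for 200 cases, with np.nan
# marking missing values.  Entirely synthetic.
n = 200
h_true = rng.random(n) < 0.3
data = np.column_stack([
    np.where(h_true, rng.random(n) < 0.8, rng.random(n) < 0.2),
    np.where(h_true, rng.random(n) < 0.7, rng.random(n) < 0.1),
]).astype(float)
data[rng.random((n, 2)) < 0.25] = np.nan          # 25% missing at random

# Initial parameters: P(H=1) and P(finding_j = 1 | H), rows indexed by H.
p_h = 0.5
p_f_given_h = np.array([[0.4, 0.4],
                        [0.6, 0.6]])

for _ in range(50):
    # E-step: P(H=1 | observed findings) for each case; missing entries
    # are marginalised out (they contribute a likelihood factor of 1).
    log_lik = np.zeros((n, 2))
    for h in (0, 1):
        p = p_f_given_h[h]
        lik = np.where(np.isnan(data), 1.0, np.where(data == 1, p, 1 - p))
        log_lik[:, h] = np.log(lik).sum(axis=1) + np.log([1 - p_h, p_h][h])
    w = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
    resp = w[:, 1] / w.sum(axis=1)                # responsibility for H=1

    # M-step: maximise the expected complete-data likelihood by replacing
    # the conditional probability tables with expected relative frequencies.
    p_h = resp.mean()
    for h, wgt in ((1, resp), (0, 1 - resp)):
        for j in range(2):
            obs = ~np.isnan(data[:, j])
            p_f_given_h[h, j] = (wgt[obs] * data[obs, j]).sum() / wgt[obs].sum()

print("learned P(H=1):", round(float(p_h), 2))
print("learned P(finding=1 | H):")
print(np.round(p_f_given_h, 2))
```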
Hugin's implementation of EM learning allows "experience tables" to be added to nodes, which will then be modified during the learning process. These experience tables allow prior beliefs to be set on distributions or constraints to be imposed. One example of such a constraint is applied to the conditional probability table of "Tempmap", which defines the composite temperature distributions corresponding to each severity state in "Fact_fever". The column of this table corresponding to Fact_fever = no is set such that Tempmap = normal, that is, P(Tempmap=normal | Fact_fever=no) = 1. "Normal" refers to one of the Gaussian distributions specified for the "fever" node. This "Normal" Gaussian distribution is based on reference values from the literature. In the learning process, we add a degree of supervision by observing the "Diagnosis" node as either non-infectious or infectious, based on the department's final diagnosis. Experience tables were added to the conditional probability tables for the mapping nodes (nodes labelled 2. in FIG 17), the factor mapping nodes (labelled 3. in FIG 17), NSIRS, Sepsis, BackgroundMort, AgeRisk and AliveDay30. These nodes were learned simultaneously. The severity states of the NSIRS and Sepsis nodes are not observed; they are inferred through the observation of 30-day mortality.
Following this learning step, the experience tables are removed. We also wish to learn the a priori probabilities for the Diagnosis node; however we recognize that it is possible for a patient to simultaneously have SIRS of two different causes; NSIRS and Sepsis. This option is not represented in the department's final diagnoses. A "both" state is added to the Diagnosis node along with an experience table. The conditional probability table is then learned for Diagnosis using the same learning data, blinded to the diagnosis.
2.4 Validation
A 5-fold cross-validation is performed as an internal validation in order to guard against overfitting of the model. The complete dataset, called DataAll, for the 2885 included patients was divided into 5 datasets, called Data1 through Data5. Data was stratified by whether the patients had infection: patients with infection and no infection were assigned to the five sets separately to ensure an even division of each group across the datasets. For each cross-validation step, one of the five sets of patient data was set aside as a validation set while the other four sets were used for learning. Table VII describes the learning and validation data used for each cross-validation model, LA-Sepsis1 through LA-Sepsis5 (a minimal sketch of this stratified split is given after Table VII). The ability to predict 30-day mortality was tested by constructing a Receiver Operating Characteristic (ROC) curve where the threshold of the model's predicted 30-day mortality was varied. Performance is assessed from the Area Under the ROC curve (AUROC). The calibration of the model is assessed using the Hosmer-Lemeshow statistic (39).
Table VII Cross-validation schema for the 5-fold cross-validation
Learning Data | Resulting model | Validation Data
[Data2, Data3, Data4, Data5] | LA-Sepsis1 | Data1
[Data1, Data3, Data4, Data5] | LA-Sepsis2 | Data2
[Data1, Data2, Data4, Data5] | LA-Sepsis3 | Data3
[Data1, Data2, Data3, Data5] | LA-Sepsis4 | Data4
[Data1, Data2, Data3, Data4] | LA-Sepsis5 | Data5
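The stratified division into Data1 through Data5 can be sketched as follows, assuming scikit-learn's StratifiedKFold and synthetic infection labels; the actual assignment used in the study may differ.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the 2885 cases: 1 marks an infected patient,
# 0 a non-infected patient, in roughly the proportions of the study data.
rng = np.random.default_rng(1)
n_cases = 2885
infected = (rng.random(n_cases) < 2514 / 2885).astype(int)
dummy_features = np.zeros((n_cases, 1))        # placeholder feature matrix

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
for fold, (learn_idx, valid_idx) in enumerate(skf.split(dummy_features, infected), 1):
    # learn_idx would train model LA-Sepsis<fold>; valid_idx would validate it.
    print(f"LA-Sepsis{fold}: learn on {len(learn_idx)}, validate on {len(valid_idx)} "
          f"({infected[valid_idx].mean():.1%} infected)")
```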
2.5 Assessment of Performance
The node "AliveDay30" contains the model's predicted 30-day mortality. This probability can be read from the node after entering the evidence for each case. The learned network is assessed for its discriminative ability using the AUROC. As a check, the performance of the LA-Sepsis CPN is compared to the cross-validation models for each fold of the cross-validation data, and also to the mean of these. Calibration of the individual cross-validation models is also assessed.
The performance of the L-Sepsis CPN (68) is compared to that for Treat with the C-Sepsis CPN, as age is not taken into account in either of these models. Subsequently the L-Sepsis CPN and LA-Sepsis CPN are compared.
The performance of the LA-Sepsis CPN is also compared to that of the SIRS and mREMS scores for prediction of both 30-day mortality and the presence of infection (defined by the final diagnosis). The AUROC for the LA-Sepsis CPN is constructed by reading the model's predicted probability of infection from the "Diagnosis" node after entering the evidence for each case.
Descriptive statistics and significance testing for the data was carried out using SPSS (Version 22, IBM Corporation). Continuous variables were analyzed using the independent samples Mann-Whitney U Test, and categorical variables with the Pearson Chi-squared statistic. EM learning was performed using commercially available software: Hugin (Version 7.6 (x64), Hugin Expert A/S). ROC-analysis was undertaken in SPSS. The difference in the area under curve was assessed using the method of Hanley and McNeil (38). Additional plotting and calibration analysis was completed in Excel (Version 14.0, Microsoft Corporation).
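The Hanley and McNeil comparison can be sketched as follows, using their 1982 standard-error formula; the death and survivor counts are assumptions taken from Table VIII (355 deaths among 2885 patients), and the correlation correction of the 1983 paired method (which requires the tables in that paper) is deliberately omitted, so this is a simplified illustration only.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley & McNeil, 1982 formulation)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)

# Assumed counts for illustration: 355 deaths, 2530 survivors.
se1 = hanley_mcneil_se(0.74, 355, 2530)   # L-Sepsis CPN
se2 = hanley_mcneil_se(0.79, 355, 2530)   # LA-Sepsis CPN

# For two curves derived from the same cases, the 1983 method also subtracts
# a correlation term 2*r*se1*se2 from the variance of the difference; r is
# read from the tables in that paper and is omitted here for brevity, so the
# printed z understates the significance of a paired comparison.
z = (0.79 - 0.74) / math.sqrt(se1 ** 2 + se2 ** 2)
print(f"z = {z:.2f} (ignoring the correlation between the curves)")
```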
3. Results
3.1 Patient Data
Table VIII presents descriptive statistics for the data used in the learning process, using the final diagnosis to split the dataset into patients with infection and patients without infection. Of the 3589 patients for whom data were collected, 2885 had a confirmed infectious or non-infectious diagnosis, where 2514 had an infection (infected patients) and 371 did not (non-infected patients). SIRS and mREMS scores were calculated for patients for whom the required data were recorded. No patients had sufficient data to allow the MEDS and the REMS scores to be calculated.
For continuous parameters, distributions are given as median [range] while categorical variables are given as the number and proportion (%). Differences between the distributions for infected and non- infected patients were assessed using the Mann-Whitney U test for continuous variables and the differences between proportions were assessed using the Pearson Chi-squared statistic for categorical variables. The discriminatory ability in predicting 30-day mortality was assessed for each parameter for each patient group using the AUROC.
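These univariate comparisons can be sketched as follows; the continuous values are synthetic, and the contingency counts are approximate figures derived from Table VIII for illustration only.

```python
import numpy as np
from scipy.stats import mannwhitneyu, chi2_contingency

rng = np.random.default_rng(3)

# A continuous variable (temperature) compared with the Mann-Whitney U test;
# the values below are synthetic stand-ins for the recorded measurements.
temp_infected = rng.normal(38.4, 0.8, 2514)
temp_noninfected = rng.normal(38.0, 0.8, 371)
u_stat, p_cont = mannwhitneyu(temp_infected, temp_noninfected,
                              alternative="two-sided")

# A categorical variable (chills yes/no) compared with the Pearson
# chi-squared test; counts approximate the proportions in Table VIII.
#                        chills  no chills
contingency = np.array([[460, 1586],    # infected (approximate counts)
                        [45, 250]])     # non-infected (approximate counts)
chi2, p_cat, dof, expected = chi2_contingency(contingency)

print(f"Mann-Whitney U p = {p_cont:.3g}, chi-squared p = {p_cat:.3g}")
```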
The distributions for leukocytes, platelets, CRP, temperature and the proportion of patients with chills and recent chemotherapy were different for the infected and non-infected patients. The remaining variables did not show a significant difference between the two groups. The set of infection variables used in the C-, L- and LA-Sepsis CPNs was chosen for their ability to a) determine the severity of a patient's illness and/or b) differentiate between sepsis and NSIRS. An example of a chosen infection variable whose distribution is the same for both the infected and non-infected patients in Table VIII is albumin (p=0.96). Albumin is, however, useful in determining the severity of illness, with a meta-analysis noting significant increases in mortality for each decrease of 10 g/L (42), and this result is confirmed by our data: the AUROC for 30-day mortality prediction was 0.71 for infected patients and 0.72 for non-infected patients. In addition to the variables used in the C-Sepsis CPN, two more were included in both the L- and LA-Sepsis CPNs: platelets and lactate. Although lactate is not a significant predictor of mortality in the univariate analysis of Table VIII, this can be explained by the lack of lactate measurements for the majority of patients. We include lactate because high lactate is associated with mortality in critical illness (43) and is one of the criteria for severe sepsis (74). Age was also added to the LA-Sepsis CPN.
Table VIII Descriptive statistics for the data used in learning. Groups are defined according to final diagnosis.
Infection variable | Infected patients (N=2514): Recorded %, Median [Min-Max] or N [%], AUROC mortality | Non-infected patients (N=371): Recorded %, Median [Min-Max] or N [%], AUROC mortality | P value
Laboratory variables
Leukocytes [count/mm3] | 96.5, 11.4 [0-110.6], 0.59* | 98.1, 10.1 [0.6-116.7], 0.43 | 0.002†
Creatinine [mg/dl] | 96.4, 0.9 [0.1-11.3], 0.59* | 97.0, 0.9 [0.1-7.7], 0.56 | 0.92
Albumin [g/l] | 32.9, 35 [1-82], 0.71* | 56.3, 35 [4-49], 0.72* | 0.96
Lactate [mmol/l] | 13.0, 1.9 [0.4-14.9], 0.54 | 20.8, 1.9 [0.7-8.8], 0.52 | 0.90
Platelets [count/mm3] | 96.4, 228 [3-1250], 0.46* | 98.7, 238 [19-1204], 0.64* | 0.03†
CRP [mg/l] | 6.3, 102 [1-1230], 0.40 | 13.2, 68 [1-285.9], 0.45 | 0.01†
Vital signs
Heart rate [/min] | 99.5, 90 [30-162], 0.55* | 99.5, 90 [40-146], 0.52 | 0.09
Systolic BP [mmHg] | 99.8, 122 [40-242], 0.55* | 99.5, 123 [59-200], 0.64* | 0.72
Temperature [°C] | 99.9, 38.4 [32-42], 0.43 | 99.7, 38 [33-40.6], 0.48 | <0.001†
Respiratory rate [/min] | 47.3, 18 [8-60], 0.62* | 45.3, 18 [10-60], 0.62 | 0.09
Mental status | 80.2 | 89.2 |
- confused | 215 [10.7%], - | 36 [10.9%], - | 0.12
- comatose | 38 [1.9%], - | 1 [0.3%], - |
Additional
Chills | 81.4, 460 [22.5%], - | 79.5, 45 [15.3%], - | 0.005†
Chemotherapy | 100, 166 [6.6%], - | 100, 37 [10%], - | 0.02†
DIC | 75.9, 8 [0.4%], - | 76.5, 1 [0.4%], - | 1.00
Age | 99.6, 73 [18-108], 0.70* | 99.2, 72 [18-104], 0.63* | 0.21
30-day mortality | 99.8, 314 [12.5%], - | 99.7, 41 [11.1%], - | 0.45
SIRS | 46.4, 2 [0-4], 0.576* | 44.5, 2 [0-4], 0.593 | <0.001†
mREMS | 25.1, 6 [0-15], 0.671* | 24.5, 6 [0-16], 0.660 | 0.264
* The AUROC is significantly better (p<0.05) than the line of no discrimination (AUROC=0.5).
† There is a significant difference between infected and non-infected patients.
3.2 EM Learning
An example of the composite distributions learned is presented in FIG 18. The composite distributions for systolic blood pressure (B) are shown along with the Gaussian distributions (A) from which they were constructed. Each curve in panel B is constructed from a linear combination of the curves in panel A. Following EM-learning, a similar set of composite distributions could be drawn for each of the continuous variables.
FIG 18 An example of initial specified Gaussian distributions (A) and learned composite distributions (B) for one of the variables in the sepsis CPN: systolic blood pressure. The composite distributions in B are linear combinations of the distributions in A. Each composite distribution in B is conditional on a given state of Fact_shock.
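The construction in panel B can be sketched numerically as follows; the component means, standard deviations and mixing weights are illustrative assumptions, not the values learned for the LA-Sepsis CPN.

```python
import numpy as np
from scipy.stats import norm

# A "composite" distribution: the density of systolic blood pressure
# conditional on one severity state of Fact_shock is a linear combination of
# a few predefined Gaussian curves (the panel-A curves).
components = [norm(70, 10), norm(100, 12), norm(130, 15)]   # illustrative curves
weights_given_severe = np.array([0.55, 0.35, 0.10])         # one learned mapping column

x = np.linspace(40, 200, 321)
composite = sum(w * c.pdf(x) for w, c in zip(weights_given_severe, components))

# The weights form one column of the mapping node's conditional probability
# table; a different column (severity state) gives a different composite curve.
area = float((composite * (x[1] - x[0])).sum())
print("area under the composite density ~", round(area, 3))
```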
If each sepsis factor is considered to represent a branch of the immune system, then we learn whether these branches are activated differently in non-infectious cases than they are in infectious cases. This differential activation is expressed through the conditional probabilities learned in the factor mapping nodes (labelled 3. in FIG 17). FIG 19 shows an example of the difference in activation of the fever factor and the shock factor for severity = moderate. Patients with the infectious etiology (sepsis) present with a higher fever than those without infection (NSIRS) for a given severity of illness (FIG 19, panel A). For variables that do not differentiate between infection and no infection, such as systolic blood pressure, no such difference is seen (FIG 19, panel B).
FIG 19 Conditional probability distributions for temperature (panel A) and systolic blood pressure (panel B) for Sepsis and NSIRS patients with severity = moderate
3.3 Validation
The final step in this phase of the development process (FIG 16) was the validation of the LA-Sepsis CPN's ability to predict 30-day mortality. Table IX presents the area under curve (AUC) for both the cross-validation models, LA-Sepsis1 through LA-Sepsis5, and the final LA-Sepsis model with each of the validation datasets. The mean AUC for the validation datasets was 0.78 (range: 0.74-0.81). Learning the LA-Sepsis CPN with all available data gave an AUC of 0.79 (95% CI 0.77-0.82) when the validation and training data sets were identical. The Hosmer-Lemeshow statistic measures how well the observed mortality rate matches the predicted rate for risk groups separated into deciles; p>0.05 signifies that there are no significant differences, which means the model is well calibrated. Using the five validation sets and their corresponding models, the Hosmer-Lemeshow test showed that four of the models were well calibrated (p>0.14) and one was not (p=0.04).
The performance of one of the cross-validation models, LA-Sepsis3, was significantly worse than the LA-Sepsis CPN, which indicates that some overfitting may occur. However, the overall performance of the cross-validation models is similar to the LA-Sepsis CPN, which suggests that similar performance could be expected on an independent dataset.
Table IX Performance of cross-validation models vs. the LA-Sepsis CPN for mortality prediction for each cross-validation fold.
Test data | Mortality | Model | AUC | Model | AUC | P value
Data1 | 14.2% | LA-Sepsis1 | 0.74 (0.68-0.79) | LA-Sepsis | 0.74 (0.68-0.80) | 0.40
Data2 | 12.8% | LA-Sepsis2 | 0.79 (0.77-0.87) | LA-Sepsis | 0.81 (0.76-0.86) | 0.12
Data3 | 11.8% | LA-Sepsis3 | 0.76 (0.70-0.82) | LA-Sepsis | 0.81 (0.75-0.86) | 0.02
Data4 | 11.8% | LA-Sepsis4 | 0.78 (0.73-0.84) | LA-Sepsis | 0.79 (0.74-0.85) | 0.26
Data5 | 10.9% | LA-Sepsis5 | 0.81 (0.75-0.87) | LA-Sepsis | 0.82 (0.76-0.87) | 0.39
Mean | | | 0.78 | | |
DataAll | 12.3% | | | LA-Sepsis | 0.79 (0.77-0.82) |
3.4 Assessment of Performance
FIG. 20 presents ROC curves for the discriminatory ability of the 30-day mortality as assessed by the L-Sepsis, LA-Sepsis and C-Sepsis CPNs for the 2855 patient cases that make up DataAll. The reference line represents the line of no discrimination. The C-Sepsis CPN does not account for age and is therefore compared to the L-Sepsis CPN without age. The AUC for the C-Sepsis CPN was 0.65 (95% CI 0.62-0.68), the AUC for the L-Sepsis CPN was 0.74 (95% CI 0.71-0.77), and the difference between the two curves was statistically significant (p=10⁻⁸). The AUC for the LA-Sepsis CPN was 0.79 (0.77-0.82). The difference between the L-Sepsis and the LA-Sepsis CPN models was also statistically significant (p=0.0004).
FIG 20 ROC curves for the prediction of 30-day mortality for the C-Sepsis, L-Sepsis and LA-Sepsis CPNs.
For the LA-Sepsis CPN the Hosmer-Lemeshow statistic was 12.5 (8 degrees of freedom, p=0.13), which suggests that the model is well calibrated. The calibration curve for the LA-Sepsis CPN is presented in FIG 21. For the calibration curve, the predictions are split into deciles based on risk, and the number of observed events is plotted against the predicted number of events for each decile; a perfectly calibrated model will appear as a straight x=y line. The predictions and observations can also be seen to match well visually.
FIG 21 Hosmer-Lemeshow calibration curve for the prediction of 30-day mortality using the LA-Sepsis CPN.
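A sketch of how such a decile-based Hosmer-Lemeshow statistic can be computed; the data are synthetic and the implementation is a simplified illustration rather than the SPSS/Excel procedure used in the study.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y_true, p_pred, n_groups=10):
    """Hosmer-Lemeshow statistic over risk deciles (sketch implementation)."""
    order = np.argsort(p_pred)
    groups = np.array_split(order, n_groups)        # deciles by predicted risk
    h = 0.0
    for g in groups:
        obs = y_true[g].sum()                       # observed deaths in decile
        exp = p_pred[g].sum()                       # expected deaths in decile
        n = len(g)
        h += (obs - exp) ** 2 / (exp * (1 - exp / n))
    return h

# Synthetic, roughly calibrated predictions for illustration.
rng = np.random.default_rng(4)
p_pred = np.clip(rng.beta(1.2, 8.0, 2855), 0.01, 0.95)
y_true = (rng.random(2855) < p_pred).astype(float)

h = hosmer_lemeshow(y_true, p_pred)
dof = 10 - 2                                        # deciles minus 2, as in the text
print(f"H = {h:.1f}, p = {chi2.sf(h, dof):.2f}")
```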
We also conducted a sub-group analysis of LA-Sepsis CPN calibration for the two most prevalent infection sites in our dataset: lower respiratory tract (LRT) infections and urinary tract infections (UTI). There were 697 LRT infections with 128 deaths, and 486 UTIs with 41 deaths. FIG 22 presents regression lines relating the predictions and observations for the two sub-groups along with the full dataset.
FIG 22 Regression lines for the observed vs. predicted events for all 2855 patients (solid), 697 patients with a lower respiratory tract infection (dotted) and 486 patients with a urinary tract infection (dashed).
The sub-group curves for LRT and UTI infections show an interesting trend: for a given severity of the inflammatory response, a lower respiratory tract infection is more likely to lead to death. This matches the trend shown for the L-Sepsis CPN, where the regression gradients were 0.93 and 0.47 for LRT infection and UTI respectively (68). Although the model is well calibrated overall, it is visually poorly calibrated for the patients with UTI and LRT infections.
FIG 23 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for the patients for whom SIRS could be calculated. For the LA-Sepsis CPN, the mortality prediction was read from the "AliveDay30" node and the probability of infection from the "Diagnosis" node.
The LA-Sepsis CPN also compares favourably with the SIRS and mREMS scores. FIG 23 shows ROC curves for the prediction of 30-day mortality (panel A) and the presence of infection (panel B) for the 46% of patients for whom a SIRS score could be calculated. The AUROC for prediction of 30-day mortality was 0.81 for the LA-Sepsis CPN, significantly better (p<0.0001) than the AUROC of 0.58 for SIRS. For the presence of infection, the AUROC was 0.66 for the LA-Sepsis CPN, significantly better (p=0.02) than the AUROC of 0.60 for SIRS.
FIG 24 ROC curves for the prediction of 30-day mortality (A) and presence of infection (B) for the patients for whom mREMS could be calculated. For the LA-Sepsis CPN, the mortality prediction was read from the "AliveDay30" node and the probability of infection from the "Diagnosis" node.
FIG 24 shows ROC curves for the prediction of 30-day mortality (panel A) and the presence of infection (panel B) for the 25% of patients for whom a mREMS score could be calculated. The AUROC for the prediction of 30-day mortality was 0.81 for the LA-Sepsis CPN, significantly better (p<0.0001) than the AUROC of 0.67 for mREMS. For the presence of infection, the AUROC was 0.66 for the LA-Sepsis CPN, significantly better (p=0.001) than the AUROC of 0.54 for mREMS.
4. Discussion
We applied EM-learning to a manually constructed CPN in order to tune its prediction of 30-day mortality: the result is the LA-Sepsis CPN. This study extends our previous work (the L-Sepsis CPN) (68) by accounting for patient age as a confounding variable and introducing an explicit representation of background mortality. The LA-Sepsis CPN uses 13 variables easily obtained at sepsis onset, including vital signs and basic laboratory measurements. Following a similar methodology for tuning the model as the L-Sepsis CPN, the LA-Sepsis CPN exhibited significantly better discrimination as assessed by the AUROC. For the predicted probability of 30-day mortality, the L-Sepsis and LA-Sepsis CPNs had AUROCs of 0.74 and 0.79, respectively.
The strength of CPN models is the ability to mix knowledge and evidence: the models contain knowledge in the form of conditional probability tables, and can be used to perform reasoning following the axioms of probability theory. The inclusion of knowledge gives CPNs the ability to handle missing data. The model uses all available measurements and relies on the knowledge contained in the model for the remaining variables. The effectiveness of manually constructed CPNs has been shown in the literature, for example by the C-Sepsis CPN (36) and through the success of the Treat CPN [11]. This study shows that applying machine learning techniques to parts of a manually constructed CPN may be an effective way to tune it and improve its performance. Lack of data and incomplete cases are inherent problems with medical data, and it is therefore important that our modelling and learning methods allow the use of all available data, including those from incomplete cases. The ability to deal with missing data is another advantage of CPNs and the EM learning algorithm. Some of the variables used were recorded in fewer than 40% of cases, namely albumin, lactate and CRP. Two of these, albumin and lactate, are reported in the literature as being closely linked to mortality (42,43). Although not closely related to mortality, CRP has been investigated extensively as a biomarker for sepsis and has some utility in distinguishing between NSIRS and sepsis.
The model discriminates well between cases in terms of 30-day mortality, although as in our previous study, the sub-group analysis of infection sites points towards the existence of confounding variables not accounted for by our model (68). Possible confounders, including site of infection and presence of other comorbidities should be investigated further. Mortality is much higher in patients with an underlying comorbidity, and changes less throughout most of adulthood than in patients without comorbidity (63). For this model, including knowledge of the site of infection would be an artificial construct: at the suggested time of use of the model, a final diagnosis is not usually known. However, reintegrating the CPN into Treat presents the opportunity to account for several such effects: the wider infection model is able to calculate the probability of a given site of infection.
We have chosen not to apply automatic machine learning to refine the structure of the model. The original construction via factor analysis has proved effective, and is in reality its own form of "structure learning". Further refinement may be possible through the comparison of a set of different models; however, the number of possible permutations is too large. We are not aware of any methods by which sufficient limitations can be imposed on the structure learning.
In this study, we performed a 5-fold cross-validation. Testing the LA-Sepsis CPN on independent datasets would make it possible to verify if the model is robust enough to be used on data collected under different circumstances, such as clinical settings, or demographic differences.
We compared the LA-Sepsis CPN as a scoring system against mREMS and SIRS. The simplest score, SIRS, requires 4 variables. Only 46% of patients had all of these recorded, mainly due to the respiratory rate being recorded in only 47% of patients. The mREMS score, requiring 7 variables, could only be calculated for 25% of the patients. The LA-Sepsis CPN outperformed the mREMS and SIRS scores in predicting 30-day mortality and differentiating between infected and non-infected patients. The number of patients for whom the SIRS and mREMS scores could be calculated further highlights the advantage of CPN-based models with their ability to handle missing data.
Because our database lacks the data needed to calculate the MEDS score, we instead compare against performance reported in the literature. The MEDS score is probably the best known scoring system for patients with suspected sepsis, and has good discriminatory performance in ED patients with suspected sepsis or SIRS (AUC 0.75-0.88) (75), although it performed poorly in a cohort of patients with severe sepsis and septic shock with AUC = 0.61 (76). The LA-Sepsis CPN performs similarly with an AUC of 0.79, and is well calibrated.
Mortality prediction for sepsis patients provides a risk-based stratification of patients which potentially impacts both their treatment and their diagnostic work-up. For example, treatment of a severely septic patient with broad-spectrum antibiotics may be justified, while the use of broad-spectrum antibiotics for a mild sepsis may not be cost-effective when the ecological cost of increasing bacterial resistance due to excessive use of broad spectrum antibiotics is considered (77).
Similarly to treatment, the diagnostic work-up can be tailored to the severity of illness. Expensive, rapid diagnostic tests (for example PCR) can be made cost-effective by risk-based stratification, where the test is not offered to low-risk patients, for whom it is unlikely to yield a positive result (78).
The next challenge is the reintegration of the LA-Sepsis CPN with the wider Treat network. This allows the potentially confounding effect of diagnoses to be accounted for. We expect that accounting for these potential confounders will result in an improvement to the predictive performance and the calibration of the overall model, when diagnostic subgroups are considered.
Using the same method described in this example (Example A3), a model was tuned to predict bacteremia instead of 30-day mortality. FIG 25 shows ROC curves for the prediction of bacteremia for three patient cohorts, two from Denmark (HvH, SLB) and one from Israel (Beilinson). The stable performance across these three cohorts, one of which (Beilinson) was used to tune the model, suggests that the model may be geographically invariant.
Another possible application of embodiments of the invention is the use of the derived probabilities (including but not limited to the probability of positive blood culture, bacteraemia, PCR or mortality) calculated using an embodiment of the invention (for example one of those described in Example A1, Example A2, and/or Example A3) as an "early-warning" of infection in hospitalised patients. This method could enable the identification of infection before standard clinical methods, thereby allowing for earlier treatment. Similarly, the same derived probabilities could be used to provide an assessment of treatment efficacy, such as allowing recommendations of escalation or de-escalation of an antimicrobial treatment regime.
According to alternative embodiments E1-E21 of the invention, there is presented:
El. A computer implemented method for calculating a probability of a clinical outcome for a patient, wherein the method comprises:
- Providing a maximum of 150 input parameters, such as a maximum of 100 input
parameters, such as a maximum of 75 input parameters, such as a maximum of 50 input parameters, such as a maximum of 40 input parameters, such as a maximum of 30 input parameters, such as a maximum of 25 input parameters, such as a maximum of 20 input parameters, as input to a statistical model,
- Calculating the probability with the statistical model.
E2. The method according to embodiment El, wherein the method further comprises:
Performing with the statistical model an assessment of a severity of an illness of the patient, where said assessment is based partially or fully on the input parameters, and where said probability is based partially or fully on the severity of the patient's illness.
E3. The method according to any one of embodiments El and E2 wherein the statistical model is a Bayesian network, such as a not naive Bayesian network.
E4. The method according to any one of embodiments E1-E3 where the clinical outcome is a positive blood culture.
E5. The method according to any one of embodiments E1-E3 where the clinical outcome is a positive PCR test.
E6. The method according to any one of embodiments E1-E3 where the clinical outcome is
a. 30-day mortality, and/or
b. In-hospital mortality.
E7. The method according to any one of the preceding embodiments where the method further comprises:
a. recommending treatment options, and/or
b. ruling out treatment options, based on said probability.
E8. The method according to any one of embodiments E1-E6 where the method further comprises:
a. recommending further diagnostic tests, and/or
b. ruling out further diagnostic tests, based on said probability.
E9. The method according to any one of embodiments E1-E6 where the method further comprises:
a. recommending intensive care unit (ICU) admission, and/or b. ruling out ICU admission,
based on said probability.
E10. The method according to any one of the preceding claims, wherein the input parameters
comprise one or more or all of:
a. Laboratory parameters, such as one or more or all of:
i. Leukocytes (such as in units of [count/mm3]),
ii. Creatinine (such as in units of [mg/dl]),
iii. Albumin (such as in units of [g/l]),
iv. Lactate (such as in units of [mmol/l]),
v. Platelets (such as in units of [count/mm3]),
vi. Neutrophils (such as in units of [count/mm3]),
vii. CRP (such as in units of [mg/l]),
viii. procalcitonin (PCT),
b. Vital parameters, such as one or more or all of:
ix. Heart rate (such as in units of [/min]),
x. Systolic blood pressure (BP) (such as in units of [mmHg]),
xi. Mean arterial pressure (MAP) (such as in units of [mmHg]), such as wherein the mean arterial pressure is calculated from the systolic blood pressure and a diastolic blood pressure,
xii. Temperature (such as in units of [°C]),
xiii. Respiratory rate (such as in units of [/min]),
xiv. Mental status, being one or both of:
1. Confused,
2. Comatose,
c. Additional parameters, such as one or more or all of:
xv. Chills,
xvi. Chemotherapy,
xvii. Disseminated Intravascular Coagulation (DIC),
xviii. Use of supplementary oxygen,
xix. Fraction of inspired oxygen (FiO2),
xx. Oxygen saturation (SaO2),
xxi. Partial pressure of oxygen (PaO2),
xxii. Acute Respiratory Distress Syndrome (ARDS).
E11. The method according to any one of the preceding claims, wherein the input parameters comprise, such as consist of:
- Neutrophils,
- C-Reactive Protein (CRP),
- Lactate,
- Temperature,
- Albumin,
- Creatinine,
- Platelets,
- MAP,
- Heart Rate.
E12. The method according to any one of the preceding claims, wherein the input parameters comprise, such as consist of:
- procalcitonin (PCT),
- Neutrophils,
- C-Reactive Protein (CRP),
- Lactate,
- Temperature,
- Albumin,
- Creatinine,
- Platelets,
- MAP,
- Heart Rate.
E13. The method according to any one of the preceding claims, wherein the statistical model enables machine learning to be carried out within 1 hour or less, such as within 30 minutes or less, such as within 10 minutes or less.
E14. The method according to any one of embodiments E2-E12 wherein the method comprises first
Performing with the statistical model the assessment of a severity of the patient's illness, where said assessment is based partially or fully on the input parameters, and then subsequently
Calculating the probability with the statistical model, where said probability is based partially or fully on the severity of the patient's illness.
E15. The method according to any one of the preceding claims, wherein the method further comprises
a. displaying the probability, and/or
b. transmitting information representative of the probability, such as a digitized value representative of the probability, to an associated unit.
E16. An apparatus comprising a processor and a memory and being configured for carrying out the method according to any one of the preceding claims.
E17. Computer program product having instructions which, when executed, cause a computing device or a computing system, such as the apparatus according to embodiment E16, to perform a method according to any one of embodiments E1-E15.
E18. A computer readable medium having stored thereon a computer program product according to embodiment E17.
E19. A signal representative of a probability of a clinical outcome for a patient, said signal having been generated according to the method according to any one of embodiments E1-E15.
E20. A data stream which is representative of a computer program product according to embodiment E17.
E21. A computer implemented method for calculating an output value representative of a probability of a clinical outcome for a patient suffering from an illness, wherein the method comprises:
- receiving a limited number of input parameters, such as a maximum of 20 input parameters, such as input parameters indicative of a physiological and/or pathophysiological state of the patient,
- calculating a value representative of an assessed severity of the patient's illness by processing said input parameters according to a first predetermined algorithm, wherein said first algorithm comprises processing steps for processing the input parameters according to a statistical model, such as said value representative of an assessed severity of the patient's illness being determined at least partially on said input parameters, and
- generating said output value representative of the probability of the clinical outcome for the patient by performing a second algorithm, wherein said output value is generated at least partially in response to the value representative of the assessed severity of the patient's illness.
For the above embodiments E1-E21, it may be understood that reference to preceding 'embodiments' may refer to preceding embodiments within embodiments E1-E21.
Although the present invention has been described in connection with the specified embodiments, it should not be construed as being in any way limited to the presented examples. The scope of the present invention is set out by the accompanying claim set. In the context of the claims, the terms "comprising" or "comprises" do not exclude other possible elements or steps (similarly these expressions in the application, e.g., in the examples, do not exclude other possible elements or steps). Also, the mentioning of references such as "a" or "an" etc. should not be construed as excluding a plurality. The use of reference signs in the claims with respect to elements indicated in the figures shall also not be construed as limiting the scope of the invention. Furthermore, individual features mentioned in different claims may possibly be advantageously combined, and the mentioning of these features in different claims does not exclude that a combination of features is possible and advantageous.
References
(1) Wyllie DH, Bowler IC, Peto TE. Bacteraemia prediction in emergency medical admissions: role of C reactive protein. J Clin Pathol 2005 Apr;58(4):352-356.
(2) Peters RP, van Agtmael MA, Savelkoul PH, Vandenbroucke-Grauls CM, Groeneveld AJ. Clinical and laboratory markers for bedside prediction of bloodstream infection in adult patients: a comprehensive review. Innovations in the Diagnosis of Bloodstream Infection 2007:9.
(3) Tokuda Y, Miyasato H, Stein GH. A simple prediction algorithm for bacteraemia in patients with acute febrile illness. QJM 2005 Nov;98(11):813-820.
(4) Bates DW, Cook EF, Goldman L, Lee TH. Predicting bacteremia in hospitalized patients. A prospectively validated model. Ann Intern Med 1990;113(7):495-500.
(5) Bates DW, Sands K, Miller E, Lanken PN, Hibberd PL, Graman PS, et al. Predicting bacteremia in patients with sepsis syndrome. Academic Medical Center Consortium Sepsis Project Working Group. J Infect Dis 1997 Dec;176(6):1538-1551.
(6) Jaimes F, Arango C, Ruiz G, Cuervo J, Botero J, Velez G, et al. Predicting bacteremia at the bedside. Clin Infect Dis 2004 Feb 1;38(3):357-362.
(7) Leibovici L, Greenshtain S, Cohen O, Mor F, Wysenbeek AJ. Bacteremia in febrile patients: a clinical model for diagnosis. Arch Intern Med 1991;151(9):1801-1806.
(8) Peduzzi P, Shatney C, Sheagren J, Sprung C. Predictors of bacteremia and gram-negative bacteremia in patients with sepsis. Arch Intern Med 1992;152(3):529-535.
(9) Shapiro NI, Wolfe RE, Wright SB, Moore R, Bates DW. Who needs a blood culture? A prospectively derived and validated prediction rule. J Emerg Med 2008;35(3):255-264.
(10) Jin SJ, Kim M, Yoon JH, Song YG. A new statistical approach to predict bacteremia using electronic medical records. Scand J Infect Dis 2013;45(9):672-680.
(11) Ratzinger F, Dedeyan M, Rammerstorfer M, Perkmann T, Burgmann H, Makristathis A, et al. A Risk Prediction Model for Screening Bacteremic Patients: A Cross Sectional Study. PloS one
2014;9(9):e106765.
(12) Paul M, Andreassen S, Nielsen AD, Tacconelli E, Almanasreh N, Fraser A, et al. Prediction of
Bacteremia Using TREAT, a Computerized Decision-Support System. Clinical Infectious Diseases 2006 May 01;42(9):1274-1282.
(13) Knaus WA, Zimmerman JE, Wagner DP, Draper EA, Lawrence DE. APACHE-acute physiology and chronic health evaluation: a physiologically based classification system. Crit Care Med 1981;9(8):591-597.
(14) Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med 1985;13(10):818-829.
(15) Knaus WA, Wagner DP, Draper EA, Zimmerman JE, Bergner M, Bastos PG, et al. The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults. Chest Journal 1991;100(6):1619-1636.
(16) Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med 2006 May;34(5):1297-1310.
(17) Le Gall J, Loirat P, Alperovitch A, Glaser P, Granthil C, Mathieu D, et al. A simplified acute physiology score for ICU patients. Crit Care Med 1984;12(11):975-977.
(18) Le Gall J, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. JAMA 1993;270(24):2957-2963.
(19) Moreno RP, Metnitz PG, Almeida E, Jordan B, Bauer P, Campos RA, et al. SAPS 3— From evaluation of the patient to evaluation of the intensive care unit. Part 2: Development of a prognostic model for hospital mortality at ICU admission. Intensive Care Med 2005;31(10):1345-1355.
(20) Lemeshow S, Teres D, Pastides H, Avrunin JS, Steingrub JS. A method for predicting survival and mortality of ICU patients using objectively derived weights. Crit Care Med 1985;13(7):519-525.
(21) Lemeshow S, Teres D, Klar J, Avrunin JS, Gehlbach SH, Rapoport J. Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients. JAMA 1993;270(20):2478-2486.
(22) Shapiro NI, Wolfe RE, Moore RB, Smith E, Burdick E, Bates DW. Mortality in Emergency Department Sepsis (MEDS) score: a prospectively derived and validated clinical prediction rule. Crit Care Med 2003 Mar;31(3):670-675.
(23) Vincent J, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. Intensive Care Med 1996;22(7):707-710.
(24) Rhee JY, Kwon KT, Ki HK, Shin SY, Jung DS, Chung DR, et al. Scoring systems for prediction of mortality in patients with intensive care unit-acquired sepsis: a comparison of the Pitt bacteremia score and the Acute Physiology and Chronic Health Evaluation II scoring systems. Shock 2009 Feb;31(2):146-150.
(25) Andreassen S, Rosenfalck A, Falck B, Olesen KG, Andersen SK. Evaluation of the diagnostic performance of the expert EMG assistant MUNIN. Electroencephalography and Clinical
Neurophysiology/Electromyography and Motor Control 1996 /4;101(2):129-144.
(26) Sadeghi S, Barzi A, Sadeghi N, King B. A Bayesian model for triage decision support. Int J Med Inf 2006;75(5):403-411.
(27) Schurink CM, Visscher S, Lucas PF, Leeuwen H, Buskens E, Hoff R, et al. A Bayesian decision-support system for diagnosing ventilator-associated pneumonia. Intensive Care Med 2007;33(8):1379-1386.
(28) Kariv G, Shani V, Goldberg E, Leibovici L, Paul M. A model for diagnosis of pulmonary infections in solid-organ transplant recipients. Comput Methods Programs Biomed 2011;104(2):135-142.
(29) Hejlesen OK, Andreassen S, Hovorka R, Cavan DA. DIAS - the diabetes advisory system: an outline of the system and the evaluation results obtained so far. Comput Methods Programs Biomed 1997;54(1):49-58.
(30) Andreassen S, Riekehr C, Kristensen B, Schonheyder HC, Leibovici L. Using probabilistic and decision-theoretic methods in treatment and prognosis modeling. Artif Intell Med 1999;15(2):121-134.
(31) Leibovici L, Fishman M, Schonheyder HC, Riekehr C, Kristensen B, Shraga I, et al. A causal probabilistic network for optimal treatment of bacterial infections. Knowledge and Data Engineering, IEEE Transactions on 2000;12(4):517-528.
(32) Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. : Morgan Kaufmann; 1988.
(33) Andreassen S, Jensen FV, Olesen KG. Medical expert systems based on causal probabilistic networks. Int J Biomed Comput 1991;28(1):1-30.
(34) Andreassen S, Leibovici L, Paul M, Nielsen AD, Zalounina A, Kristensen LE, et al. A probabilistic network for fusion of data and knowledge in clinical microbiology. Probabilistic Modeling in
Bioinformatics and Medical Informatics: Springer; 2005. p. 451-472.
(35) Paul M, Andreassen S, Tacconelli E, Nielsen AD, Almanasreh N, Frank U, et al. Improving empirical antibiotic treatment using TREAT, a computerized decision support system: cluster randomized trial. Journal of Antimicrobial Chemotherapy 2006 December 01;58(6):1238-1245.
(36) A Bayesian approach to model-development: design of continuous distributions for infection variables. World Congress The International Federation of Automatic Control, IFAC; 2014.
(37) Lauritzen SL. The EM algorithm for graphical association models with missing data. Comput Stat Data Anal 1995;19(2):191-201.
(38) Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983 Sep;148(3):839-843.
(39) Hosmer Jr DW, Lemeshow S. Applied logistic regression. : John Wiley & Sons; 2004.
(40) Dulhunty JM, Lipman J, Finfer S, Sepsis Study Investigators for the ANZICS Clinical Trials Group. Does severe non-infectious SIRS differ from severe sepsis? Intensive Care Med 2008;34(9):1654-1661.
(41) Vincent J, Sakr Y, Sprung CL, Ranieri VM, Reinhart K, Gerlach H, et al. Sepsis in European intensive care units: Results of the SOAP study*. Crit Care Med 2006;34(2):344-353.
(42) Vincent J, Dubois M, Navickis RJ, Wilkes MM. Hypoalbuminemia in acute illness: Is there a rationale for intervention? A meta-analysis of cohort studies and controlled trials. Ann Surg 2003;237(3):319.
(43) Howell MD, Donnino M, Clardy P, Talmor D, Shapiro NI. Occult hypoperfusion and mortality in patients with suspected infection. Intensive Care Med 2007;33(11):1892-1899.
(44) Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited. Crit Care Med 2007 Sep;35(9):2052-2056.
(45) Liesenfeld O, Lehman L, Hunfeld K, Kost G. Molecular diagnosis of sepsis: new aspects and recent developments. European Journal of Microbiology and Immunology 2014;4(1):1-25.
(46) Mwaigwisya S, Assiri RAM, O'Grady J. Emerging commercial molecular tests for the diagnosis of bloodstream infection. Expert review of molecular diagnostics 2015(0):1-12.
(47) Kang D, Ali MM, Zhang K, Huang SS, Peterson E, Digman MA, et al. Rapid detection of single bacteria in unprocessed blood using Integrated Comprehensive Droplet Digital Detection. Nature
communications 2014;5.
(48) Paul M, Shani V, Muchtar E, Kariv G, Robenshtok E, Leibovici L. Systematic review and meta-analysis of the efficacy of appropriate empiric antibiotic therapy for sepsis. Antimicrob Agents Chemother 2010 Nov;54(11):4851-4863.
(49) Marquet K, Liesenborgs A, Bergs J, Vleugels A, Claes N. Incidence and outcome of inappropriate in-hospital empiric antibiotics for severe infection: a systematic review and meta-analysis. Critical Care 2015;19(1):63.
(50) Appleby J, Devlin N, Parkin D. NICE's cost effectiveness threshold. BMJ 2007 Aug 25;335(7616):358-359.
(51) Leli C, Cardaccia A, D'Alo F, Ferri C, Bistoni F, Mencacci A. A prediction model for real-time PCR results in blood samples from febrile patients with suspected sepsis. J Med Microbiol 2014 May;63(Pt 5):649-658.
(52) Ward L. Gradation of the severity of sepsis - Learning in a causal probabilistic network. 2016.
(53) Lehmann LE, Herpichboehm B, Kost GJ, Kollef MH, Stüber F. Cost and mortality prediction using polymerase chain reaction pathogen detection in sepsis: evidence from three observational trials. Crit Care 2010;14(5):R186.
(54) Alvarez J, Mar J, Varela-Ledo E, Garea M, Matinez-Lamas L, Rodriguez J, et al. Cost analysis of realtime polymerase chain reaction microbiological diagnosis in patients with septic shock. Anaesth Intensive Care 2012 Nov;40(6):958-963.
(55) World Health Organization. WHO-CHOICE. 2008; Available at:
http://www.who.int/choice/costs/en/, 2016.
(56) Husereau D, Drummond M, Petrou S, Carswell C, Moher D, Greenberg D, et al. Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement. BMJ 2013 BMJ Publishing Group Ltd;346.
(57) Kumar A, Roberts D, Wood KE, Light B, Parrillo JE, Sharma S, et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med 2006 Jun;34(6):1589-1596.
(58) de Groot B, Ansems A, Gerling DH, Rijpsma D, van Amstel P, Linzel D, et al. The association between time to antibiotics and relevant clinical outcomes in emergency department patients with various stages of sepsis: a prospective multi-center study. Critical Care 2015;19(1):194.
(59) Wisdom A, Eaton V, Gordon D, Daniel S, Woodman R, Phillips C. I NIT I AT - ED: Impact of timing of INITIation of Antibiotic Therapy on mortality of patients presenting to an Emergency Department with sepsis. Emergency Medicine Australasia 2015;27(3):196-201.
(60) Dauwalder 0, Landrieve L, Laurent F, de Montclos M, Vandenesch F, Una G. Does bacteriology laboratory automation reduce time to results and increase quality management? Clinical Microbiology and Infection 2016;22(3):236-243.
(61) Opota O, Jaton K, Greub G. Microbial diagnosis of bloodstream infection: towards molecular diagnosis directly from blood. Clinical Microbiology and Infection 2015;21(4):323-331.
(62) Una G, Greub G. Automation in bacteriology: a changing way to perform clinical diagnosis in infectious diseases. Clin Microbiol Infect 2016 Mar;22(3):215-216.
(63) Angus DC, Linde-Zwirble WT, Lidicker J, Clermont G, Carcillo J, Pinsky MR. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Crit Care Med
2001;29(7):1303-1310.
(64) Martin GS, Mannino DM, Eaton S, Moss M. The epidemiology of sepsis in the United States from 1979 through 2000. N Engl J Med 2003;348(16):1546-1554.
(65) Leibovici L, Paul M, Nielsen AD, Tacconelli E, Andreassen S. The TREAT project: decision support and prediction using causal probabilistic networks. Int J Antimicrob Agents 2007;30:93-102.
(66) Kofoed K, Zalounina A, Andersen O, Lisby G, Paul M, Leibovici L, et al. Performance of the TREAT decision support system in an environment with a low prevalence of resistant pathogens. J Antimicrob Chemother 2009 Feb;63(2):400-404.
(67) Leibovici L, Paul M, Andreassen S. Balancing the benefits and costs of antibiotic drugs: the TREAT model. Clinical Microbiology and Infection 2010;16(12):1736-1739.
(68) A Bayesian Approach to Model Development: Automatic Learning for Tuning Predictive
Performance. 9th IFAC Symposium on Biological and Medical Systems; 2015.
(69) Howell MD, Donnino MW, Talmor D, Clardy P, Ngo L, Shapiro Nl. Performance of severity of illness scoring systems in emergency department patients with infection. Acad Emerg Med 2007;14(8):709- 714.
(70) Olsson T, Terent A, Lind L. Rapid Emergency Medicine Score: a new prognostic tool for in - hospital mortality in nonsurgical emergency department patients. J Intern Med 2004;255(5):579-587.
(71) Bone RC, Balk RA, Cerra FB, Dellinger RP, Fein AM, Knaus WA, et al. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. The ACCP/SCCM Consensus Conference Committee. American College of Chest Physicians/Society of Critical Care Medicine. CHEST Journal 1992;101(6):1644-1655. (72) Nguyen HB, Rivers EP, Havstad S, Knoblich B, Ressler JA, Muzzin AM, et al. Critical care in the emergency department a physiologic assessment and outcome evaluation. Acad Emerg Med
2000;7(12):1354-1361.
(73) Fine MJ, Auble TE, Yealy DM, Hanusa BH, Weissfeld LA, Singer DE, et al. A prediction rule to identify low-risk patients with community-acquired pneumonia. N Engl J Med 1997;336(4):243-250.
(74) Dellinger RP, Levy MM, Rhodes A, Annane D, Gerlach H, Opal SM, et al. Surviving sepsis campaign: international guidelines for management of severe sepsis and septic shock, 2012. Intensive Care Med 2013;39(2):165-228.
(75) Carpenter CR, Keim SM, Upadhye S, Nguyen HB, Best Evidence in Emergency Medicine Investigator Group. Risk stratification of the potentially septic patient in the emergency department: the Mortality in the Emergency Department Sepsis (MEDS) score. J Emerg Med 2009;37(3):319-327.
(76) Jones AE, Saak K, Kline JA. Performance of the Mortality in Emergency Department Sepsis score for predicting hospital mortality among patients with severe sepsis and septic shock. Am J Emerg Med 2008;26(6):689-692.
(77) Leibovici L, Shraga I, Andreassen S, Pauker SG. How do you choose antibiotic
treatment?/Commentaries. Br Med J 1999;318(7198):1614.
(78) Ward L, Paul M, Leibovici L, Andreassen S. PCR for Sepsis in the Emergency Department: a Cost- Effective Proposition? In preparation.

Claims

1. A computer implemented method for calculating a probability of a clinical outcome for a patient, wherein the method comprises:
Providing a maximum of 150 input parameters, such as a maximum of 100 input parameters, such as a maximum of 75 input parameters, such as a maximum of 50 input parameters, such as a maximum of 40 input parameters, such as a maximum of 30 input parameters, such as a maximum of 25 input parameters, such as a maximum of 20 input parameters, as input to a statistical model,
Performing with the statistical model an assessment of a severity of an illness of the patient, where said assessment is based partially or fully on the input parameters, and
Calculating the probability with the statistical model, where said probability is based partially or fully on the severity of the patient's illness.
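By way of a non-limiting illustration only, the following Python sketch shows one possible realisation of the method of claim 1: a small set of findings is entered as evidence, an unobservable severity variable is assessed by Bayes' rule, and the outcome probability is obtained by marginalising over severity. The network structure, the discretisation and every probability value below are invented placeholders rather than parameters from the application; missing findings are simply left out of the evidence.

```python
# Minimal sketch (not the patented model): a two-layer discrete network with an
# unobservable "severity" node and findings treated as conditionally independent
# given severity. All probability tables are illustrative placeholders.

SEVERITY_STATES = ["mild", "moderate", "severe"]

# Hypothetical prior over severity.
PRIOR = {"mild": 0.6, "moderate": 0.3, "severe": 0.1}

# Hypothetical likelihoods P(finding_state | severity) for two discretised findings.
LIKELIHOOD = {
    "lactate": {            # discretised as "normal" / "high"
        "mild":     {"normal": 0.9, "high": 0.1},
        "moderate": {"normal": 0.6, "high": 0.4},
        "severe":   {"normal": 0.2, "high": 0.8},
    },
    "map": {                # mean arterial pressure, "normal" / "low"
        "mild":     {"normal": 0.95, "low": 0.05},
        "moderate": {"normal": 0.7,  "low": 0.3},
        "severe":   {"normal": 0.4,  "low": 0.6},
    },
}

# Hypothetical P(30-day mortality | severity).
P_OUTCOME_GIVEN_SEVERITY = {"mild": 0.02, "moderate": 0.10, "severe": 0.35}


def severity_posterior(observations):
    """Bayes' rule over the severity node; missing findings are simply skipped."""
    unnorm = {}
    for s in SEVERITY_STATES:
        p = PRIOR[s]
        for finding, state in observations.items():
            if state is not None:                      # tolerate missing values
                p *= LIKELIHOOD[finding][s][state]
        unnorm[s] = p
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}


def outcome_probability(observations):
    """Marginalise the clinical outcome over the severity posterior."""
    post = severity_posterior(observations)
    return sum(post[s] * P_OUTCOME_GIVEN_SEVERITY[s] for s in SEVERITY_STATES)


if __name__ == "__main__":
    obs = {"lactate": "high", "map": None}             # MAP not measured
    print(f"P(30-day mortality) = {outcome_probability(obs):.3f}")
```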
2. The method according to claim 1 wherein the statistical model is a Bayesian network, such as a non-naive Bayesian network.
3. The method according to any one of claims 1-2 where the clinical outcome is bacteraemia.
4. The method according to any one of claims 1-2 where the clinical outcome is a positive blood culture.
5. The method according to any one of claims 1-2 where the clinical outcome is a positive PCR test.
6. The method according to any one of claims 1-2 where the clinical outcome is
a. 30-day mortality, and/or
b. In-hospital mortality.
7. The method according to any one of the preceding claims where the method further comprises:
a. recommending treatment options, and/or
b. ruling out treatment options, based on said probability.
8. The method according to any one of claims 1-6 where the method further comprises:
a. recommending further diagnostic tests, and/or
b. ruling out further diagnostic tests, based on said probability.
9. The method according to any one of claims 1-6 where the method further comprises:
a. recommending intensive care unit (ICU) admission, and/or
b. ruling out ICU admission,
based on said probability.
10. The method according to any one of the preceding claims, wherein the input parameters comprise one or more or all of:
a. Laboratory parameters, such as one or more or all of:
i. Leukocytes (such as in units of [count/mm3]),
ii. Creatinine (such as in units of [mg/dl]),
iii. Albumin (such as in units of [g/l]),
iv. Lactate (such as in units of [mmol/l]),
v. Platelets (such as in units of [count/mm3]),
vi. Neutrophils (such as in units of [count/mm3]),
vii. CRP (such as in units of [mg/l]),
viii. Procalcitonin (PCT),
b. Vital parameters, such as one or more or all of:
ix. Heart rate (such as in units of [/min]),
x. Systolic blood pressure (BP) (such as in units of [mmHg]),
xi. Mean arterial pressure (MAP) (such as in units of [mmHg]), such as wherein the mean arterial pressure is calculated from the systolic blood pressure and a diastolic blood pressure,
xii. Temperature (such as in units of [°C]),
xiii. Respiratory rate (such as in units of [/min]),
xiv. Mental status, being one or both of:
1. Confused,
2. Comatose,
c. Additional parameters, such as one or more or all of:
xv. Chills,
xvi. Chemotherapy,
xvii. Disseminated Intravascular Coagulation (DIC),
xviii. Use of supplementary oxygen,
xix. Fraction of inspired oxygen (FiO2),
xx. Oxygen saturation (SaO2),
xxi. Partial pressure of oxygen (PaO2),
xxii. Acute Respiratory Distress Syndrome (ARDS).
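As a non-limiting illustration of how the inputs of claim 10 might be gathered in software, the sketch below defines a simple container in which every parameter is optional (so that missing values are allowed), and derives the mean arterial pressure from the systolic and diastolic blood pressures using the standard approximation MAP = DBP + (SBP - DBP)/3, as foreseen in item xi. The class name and field names are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientInputs:
    """Container for the optional input parameters of claim 10 (units in comments)."""
    leukocytes: Optional[float] = None        # count/mm3
    creatinine: Optional[float] = None        # mg/dl
    albumin: Optional[float] = None           # g/l
    lactate: Optional[float] = None           # mmol/l
    platelets: Optional[float] = None         # count/mm3
    neutrophils: Optional[float] = None       # count/mm3
    crp: Optional[float] = None               # mg/l
    pct: Optional[float] = None               # procalcitonin
    heart_rate: Optional[float] = None        # /min
    systolic_bp: Optional[float] = None       # mmHg
    diastolic_bp: Optional[float] = None      # mmHg
    temperature: Optional[float] = None       # °C
    respiratory_rate: Optional[float] = None  # /min
    confused: Optional[bool] = None
    comatose: Optional[bool] = None

    @property
    def mean_arterial_pressure(self) -> Optional[float]:
        # MAP derived from systolic and diastolic pressure (standard approximation),
        # returning None when either pressure is missing.
        if self.systolic_bp is None or self.diastolic_bp is None:
            return None
        return self.diastolic_bp + (self.systolic_bp - self.diastolic_bp) / 3.0
```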
11. The method according to any one of the preceding claims, wherein the input parameters comprise, such as consist of:
- Neutrophils,
- C-Reactive Protein (CRP),
- Lactate,
- Temperature,
- Albumin,
- Creatinine,
- Platelets,
- MAP,
- Heart Rate.
12. The method according to any one of the preceding claims, wherein the input parameters comprise, such as consist of:
- Procalcitonin (PCT),
- Neutrophils,
- C-Reactive Protein (CRP),
- Lactate,
- Temperature,
- Albumin,
- Creatinine,
- Platelets,
- MAP,
- Heart Rate.
13. The method according to any one of claims 1-10, wherein the input parameters comprise, such as consist of:
- Neutrophil fraction calculated as neutrophils divided by leucocytes,
- C-Reactive Protein (CRP),
- Platelets.
14. The method according to any one of the preceding claims, wherein the statistical model enables machine learning to be carried out within 1 hour or less, such as within 30 minutes or less, such as within 10 minutes or less.
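Claim 14 concerns how quickly the model can be retuned. As a non-limiting illustration only, the sketch below times a trivially fast stand-in for such learning: re-estimating an outcome table by counting labelled cases with Laplace smoothing. The data format and the counting procedure are assumptions for this example and are not the learning method of the application.

```python
import time
from collections import Counter


def refit_outcome_table(cases):
    """Re-estimate P(outcome | severity) by counting labelled cases.

    `cases` is assumed to be an iterable of (severity_state, outcome_bool) pairs;
    counting with Laplace smoothing is shown only to illustrate that a small
    parameter set can be retuned in far less than an hour.
    """
    totals, positives = Counter(), Counter()
    for severity, outcome in cases:
        totals[severity] += 1
        positives[severity] += int(outcome)
    # Laplace smoothing keeps estimates defined for sparsely observed states.
    return {s: (positives[s] + 1) / (totals[s] + 2) for s in totals}


if __name__ == "__main__":
    start = time.perf_counter()
    table = refit_outcome_table([("severe", True), ("mild", False), ("severe", False)])
    print(table, f"refit in {time.perf_counter() - start:.6f} s")
```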
15. The method according to any one of claims 1-14 wherein the method comprises first
Performing with the statistical model the assessment of a severity of the patient's illness, where said assessment is based partially or fully on the input parameters, and then subsequently
Calculating the probability with the statistical model, where said probability is based partially or fully on the severity of the patient's illness.
16. The method according to any one of the preceding claims, wherein the method further comprises:
a. displaying the probability, and/or
b. transmitting information representative of the probability, such as a digitized value representative of the probability, to an associated unit.
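As a non-limiting illustration of claim 16, the sketch below both displays the probability and transmits a digitised (JSON-encoded) representation of it to an associated unit over a TCP socket. The host, port and payload layout are assumptions made for this example; the claim itself does not prescribe any particular transport or format.

```python
import json
import socket


def report_probability(probability: float, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Display the probability and transmit a digitised representation of it."""
    # Claim 16a: display the probability.
    print(f"Probability of clinical outcome: {probability:.1%}")
    # Claim 16b: transmit a digitised value representative of the probability.
    payload = json.dumps({"outcome_probability": round(probability, 4)}).encode("utf-8")
    with socket.create_connection((host, port), timeout=2.0) as conn:
        conn.sendall(payload)
```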
17. The method according to any one of claims 3-6, wherein the method further comprises:
Determining, based on the probability of the clinical outcome, a probability of an infection.
18. The method according to any one of claims 3-6, wherein the method further comprises:
Determining, based on the probability of the clinical outcome, an efficacy of an antimicrobial treatment, such as an antimicrobial treatment of the patient.
19. The method according to claim 18, wherein the method further comprises:
- Based on the efficacy, suggesting a subsequent revision of the antimicrobial treatment, such as suggesting any one of:
i. Escalation of antimicrobial treatment, or
ii. De-escalation of antimicrobial treatment, or
iii. Discontinuation of antimicrobial treatment.
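As a non-limiting illustration of claim 19, the sketch below maps an estimated treatment efficacy to one of the suggested revisions. The two thresholds are invented placeholders and not values disclosed in the application.

```python
def suggest_revision(efficacy: float,
                     escalate_below: float = 0.5,
                     deescalate_above: float = 0.9) -> str:
    """Map an estimated probability that the current antimicrobial therapy is
    effective ("efficacy") to one of the revisions listed in claim 19.
    Threshold values are illustrative placeholders only."""
    if efficacy < escalate_below:
        return "escalate antimicrobial treatment"
    if efficacy > deescalate_above:
        return "de-escalate or discontinue antimicrobial treatment"
    return "continue current antimicrobial treatment"
```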
20. An apparatus comprising a processor and a memory and being configured for carrying out the method according to any one of the preceding claims.
21. A computer program product having instructions which, when executed, cause a computing device or a computing system, such as the apparatus according to claim 20, to perform a method according to any one of claims 1-19.
22. A computer readable medium having stored thereon a computer program product according to claim 21.
23. A signal representative of a probability of a clinical outcome for a patient, said signal having been generated according to the method according to any one of claims 1-19.
24. A data stream which is representative of a computer program product according to claim 21.
PCT/DK2016/050288 2015-08-28 2016-08-26 A cpn-based tool for the stratification of illness severity in patients suspected of sepsis WO2017036482A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DKPA201500514 2015-08-28
DKPA201500514 2015-08-28

Publications (1)

Publication Number Publication Date
WO2017036482A1 (en) 2017-03-09

Family

ID=56853438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2016/050288 WO2017036482A1 (en) 2015-08-28 2016-08-26 A cpn-based tool for the stratification of illness severity in patients suspected of sepsis

Country Status (1)

Country Link
WO (1) WO2017036482A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012063166A1 (en) * 2010-11-08 2012-05-18 Koninklijke Philips Electronics N.V. Method of continuous prediction of patient severity of illness, mortality, and length of stay

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LOGAN WARD ET AL: "A Bayesian Approach to Model-Development: Design of Continuous Distributions for Infection Variables", PROCEEDINGS OF THE 17TH WORLD CONGRESS THE INTERNATIONAL FEDERATION OF AUTOMATIC CONTROL; SEOUL, KOREA; JULY 6-11, 2008., vol. 47, no. 3, 1 January 2014 (2014-01-01), Red Hook, NY, pages 5653 - 5658, XP055306391, ISSN: 1474-6670, ISBN: 978-1-123-47890-7, DOI: 10.3182/20140824-6-ZA-1003.02235 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109830260A (en) * 2017-08-23 2019-05-31 长庚医疗财团法人林口长庚纪念医院 A method of examine numerical value to detect microorganism in body fluid with rote learning algorithm analysis
US11908581B2 (en) * 2018-04-10 2024-02-20 Hill-Rom Services, Inc. Patient risk assessment based on data from multiple sources in a healthcare facility
CN108804612A (en) * 2018-05-30 2018-11-13 武汉烽火普天信息技术有限公司 A kind of text sentiment classification method based on counter propagation neural network model
CN108804612B (en) * 2018-05-30 2021-11-02 武汉烽火普天信息技术有限公司 Text emotion classification method based on dual neural network model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16759976

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16759976

Country of ref document: EP

Kind code of ref document: A1