GB2454799A - Diagnosing infection, SIRS or sepsis using RT-PCR and biomarkers - Google Patents

Info

Publication number
GB2454799A
Authority
GB
United Kingdom
Prior art keywords
sepsis
data
biomarkers
clinical
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0820899A
Other versions
GB0820899D0 (en
Inventor
Timothy John Gilby Brooks
Matthew Christopher Jackson
Roman Antoni Lukaszewski
Martin Julian Pearce
Carrie Jane Turner
Amanda Marie Yates
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
UK Secretary of State for Defence
Original Assignee
UK Secretary of State for Defence
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • C12Q1/6844 Nucleic acid amplification reactions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G01N33/5091 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing the pathological state of an organism

Abstract

A method for screening a biological sample to detect early stages of infection, SIRS or sepsis, comprising the steps of: detecting expression of at least one biomarker from a first set and one from a second set of informative biomarkers by RT-PCR; analysing the results of detection; and classifying said sample according to the likelihood and/or timing of the development of overt infection. The first set of biomarkers consists of CD40, CD5, CD79A, CRX, CTNND1, CX3CL1, ENTPD2, ENTPD5, EPHA8, GPR44, HMMR, IL8, MAP1A, MAPK7, MEF2D, ODF1, SAA3P, SLC6A9, SPN, TDGF1, TSC22D1 and HDAC5; the second set consists of CD178, MCP-1, TNF-alpha, IL-1beta, IL-6, IL-10, INF-alpha and INF-gamma. A multilayered perceptron neural network analysis may be used for high accuracy prediction. A diagnostic means may be prepared from the method.
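The selection rule in the abstract (at least one biomarker from each of the two sets) can be sketched as a simple membership check. This is an illustrative sketch only: the names `FIRST_SET`, `SECOND_SET` and `panel_is_valid` are hypothetical, and the marker spellings are copied from the abstract as written.

```python
# Marker names are copied from the abstract; the identifiers below are hypothetical.
FIRST_SET = {
    "CD40", "CD5", "CD79A", "CRX", "CTNND1", "CX3CL1", "ENTPD2", "ENTPD5",
    "EPHA8", "GPR44", "HMMR", "IL8", "MAP1A", "MAPK7", "MEF2D", "ODF1",
    "SAA3P", "SLC6A9", "SPN", "TDGF1", "TSC22D1", "HDAC5",
}
SECOND_SET = {
    "CD178", "MCP-1", "TNF-alpha", "IL-1beta", "IL-6", "IL-10",
    "INF-alpha", "INF-gamma",
}


def panel_is_valid(markers):
    """Return True if the panel has at least one marker from each set."""
    chosen = set(markers)
    return bool(chosen & FIRST_SET) and bool(chosen & SECOND_SET)


print(panel_is_valid(["CD40", "IL-6"]))  # one marker from each set
print(panel_is_valid(["CD40", "CD5"]))   # both markers from the first set only
```

The check says nothing about how expression is measured or analysed; it only encodes the "one from each set" condition.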

Description

Early detection of sepsis
Background
Despite greatly improved diagnosis, treatment and support, serious infection and sepsis remain significant causes of death and often result in chronic ill-health or disability in those who survive acute episodes. Although sudden, overwhelming infection is comparatively rare amongst otherwise healthy adults, it constitutes an increased risk in immunocompromised individuals, seriously ill patients in intensive care, burns patients and young children. In a proportion of cases, an apparently treatable infection leads to the development of sepsis: a dysregulated, inappropriate response to infection characterised by progressive circulatory collapse leading to renal and respiratory failure, abnormalities in coagulation, profound and unresponsive hypotension and, in about 30% of cases, death. The incidence of sepsis in North America is about 0.3% of the population annually (about 750,000 cases), with mortality rising to 40% in the elderly and to 50% in cases of the most severe form, septic shock (Angus et al, 2001, Crit Care Med: 1303-1310).
Following infection with infectious micro-organisms, the body reacts with a classical inflammatory response and activation of, first, the innate, non-specific immune response, followed by a specific, acquired immune response. In the case of bacterial infections, bacteraemia leads to the rapid (within 30-90 minutes) onset of pyrexia and release of inflammatory cytokines such as interleukin-1 (IL-1) and tumour necrosis factor-α (TNF-α) triggered by the detection of bacterial toxins, long before the development of a specific, antigen-driven immune response.
In Gram-negative bacteraemia due to infections such as typhoid, plague, tularaemia and brucellosis, or peritonitis from Gram-negative gut organisms such as Escherichia coli, Klebsiella, Proteus or Pseudomonas, this is largely a response to lipopolysaccharide (LPS) and other components derived from bacterial cell walls. Circulating LPS and, in particular, its constituent lipid A, provokes a wide range of systemic reactions. It is probably contact with Kupffer cells in the liver that first leads to IL-1 release and the onset of pyrexia. Activation of circulating monocytes and macrophages leads to release of cytokines such as IL-6, IL-12, IL-15, IL-18, TNF-α, macrophage migration inhibitory factor (MIF), and cytokine-like molecules such as high mobility group B1 (HMGB1), which, in turn, activate neutrophils, lymphocytes and vascular endothelium, up-regulate cell adhesion molecules, and induce prostaglandins, nitric oxide synthase and acute-phase proteins. Release of platelet activating factor (PAF), prostaglandins, leukotrienes and thromboxane activates vascular endothelium, regulates vascular tone and activates the extrinsic coagulation cascade. Dysregulation of these responses results in the complications of sepsis and septic shock in terms of peripheral vasodilation leading to hypotension, and abnormal clotting and fibrinolysis producing thrombosis and intravascular coagulation (Cohen, 2002, Nature: 885-891).
LPS primarily acts on cells by binding to a serum LPS-binding protein (LBP) and CD14 expressed on monocytes and macrophages. On binding a complex of LPS and LBP, CD14 acts with a co-receptor, Toll-like receptor 4 (TLR-4) and a further component, MD-2, to form a signalling complex and initiate activation of macrophages and release of cytokines (Pålsson-McDermott & O'Neill, 2004, Immunology 113: 153-162). The Toll-like receptor family is a group of cell surface receptors involved in recognising a range of bacterial and fungal ligands that act as triggers for the innate immune system, including Gram-positive cell wall structures, flagellin, and CpG repeats characteristic of bacterial DNA.
In the case of infection with Gram-positive pathogens, septic shock is associated with the production of exotoxins. For instance, toxic shock syndrome, a particularly acute form of septic shock that often affects otherwise healthy individuals, is due to infection with a particular strain of Staphylococcus aureus, which produces an exotoxin known as toxic shock syndrome toxin-1 (TSST-1). A similar syndrome is caused by invasive infection with certain group A Streptococcus pyogenes strains, and is often associated with streptococcal pyrogenic exotoxin A (SPE-A). Some Gram-positive exotoxins (including TSST-1) are thought to exert their effects predominantly as a result of their superantigen properties.
Superantigens are able to non-specifically stimulate T lymphocytes by cross-linking MHC Class II molecules on antigen presenting cells to certain classes of T cell receptors. Usually, T cell receptor (TCR)-Major Histocompatibility Complex (MHC) interactions are highly specific, with only T cells carrying TCRs that specifically recognise short antigen-derived peptides presented by the MHC able to bind and be activated, ensuring an antigen-specific T cell response. Superantigens bypass this mechanism, resulting in massive and inappropriate activation of T cells. However, SPE-A is not an efficient superantigen and some further mechanism must be implicated.
It should be noted that clinical sepsis may also result from infection with some viruses (for example Venezuelan Equine Encephalitis Virus, VEEV) and fungi, and that other mechanisms are likely to be involved in such cases.
The ability to detect potentially serious infections as early as possible and, especially, to predict the onset of sepsis in susceptible individuals is clearly advantageous. A considerable effort has been expended over many years in attempts to establish clear criteria defining clinical entities such as shock, sepsis, septic shock, toxic shock and systemic inflammatory response syndrome (SIRS). Similarly, many attempts have been made to design robust predictive models based on measuring a range of clinical, chemical, biochemical, immunological and cytometric parameters, and a number of scoring systems, of varying prognostic success and sophistication, have been proposed.
According to the 1991 Consensus Conference of the American College of Chest Physicians (ACCP) and Society of Critical Care Medicine (SCCM), "SIRS" is considered to be present when patients have more than one of the following: a body temperature of greater than 38°C or less than 36°C, a heart rate of greater than 90/min, hyperventilation involving a respiratory rate higher than 20/min or PaCO2 lower than 32 mm Hg, a white blood cell count of greater than 12000 cells/µl or less than 4000 cells/µl (Bone et al, 1992, Crit Care Med 20: 864-874).
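The four criteria listed above can be expressed as a small check. The following is a minimal sketch for illustration, assuming the thresholds as read from the text (38°C and 36°C, 90/min, 20/min, 32 mm Hg, 12000 and 4000 cells/µl); the function and parameter names are hypothetical, and the code is not a clinical tool.

```python
def sirs_criteria_count(temp_c, heart_rate, resp_rate, paco2_mmhg, wbc_per_ul):
    """Count how many of the four 1991 ACCP/SCCM criteria are met."""
    criteria = [
        temp_c > 38.0 or temp_c < 36.0,            # body temperature
        heart_rate > 90,                           # beats per minute
        resp_rate > 20 or paco2_mmhg < 32,         # hyperventilation
        wbc_per_ul > 12000 or wbc_per_ul < 4000,   # white blood cell count
    ]
    return sum(criteria)


def has_sirs(temp_c, heart_rate, resp_rate, paco2_mmhg, wbc_per_ul):
    """SIRS is present when more than one criterion is met."""
    return sirs_criteria_count(
        temp_c, heart_rate, resp_rate, paco2_mmhg, wbc_per_ul
    ) > 1


# Fever plus tachycardia: two criteria met.
print(has_sirs(38.5, 95, 16, 40, 8000))
# Fever alone: only one criterion met.
print(has_sirs(38.5, 80, 16, 40, 8000))
```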
"Sepsis" has been defined as SIRS caused by infection. It is accepted that SIRS can occur in the absence of infection in, for example, burns, pancreatitis and other disease states.
"Infection" was defined as a pathological process caused by invasion of a normally sterile tissue, fluid or body cavity by pathogenic or potentially pathogenic micro-organisms.
"Severe sepsis" was defined as sepsis complicated by organ dysfunction, itself defined by Marshall et al (1995, Crit Care Med: 1638-1652) or the Sequential Organ Failure Assessment (SOFA) score (Ferreira et al, 2002, JAMA 286:1754-1758).
"Septic shock" refers (in adults) to sepsis plus a state of acute circulatory failure characterised by a persistent arterial hypotension unexplained by other causes.
In order to evaluate the seriousness of sepsis in intensive care patients and to allow rational treatment planning, a large number of clinical severity models have been developed for sepsis, or adapted from more general models. The first generally accepted system was the Acute Physiology and Chronic Health Evaluation score (APACHE, and its refinements APACHE II and III) (Knaus et al, 1985, Crit Care Med: 818-829; Knaus et al, 1991, Chest 100: 1619-1636), with the Mortality Prediction Model (MPM) (Lemeshow et al, 1993, JAMA: 2957-2963) and the Simplified Acute Physiology (SAPS) score (Le Gall et al, 1984, Crit Care Med 12: 975-977) also being widely used general predictive models. For more severe conditions, including sepsis, more specialised models such as the Multiple Organ Dysfunction Score (MODS) (Marshall et al, 1995, Crit Care Med 23: 1638-1652), the Sequential Organ Failure Assessment (SOFA) score (Ferreira et al, 2002, JAMA 286: 1754-1758) and the Logistic Organ Dysfunction System (LODS) (Le Gall et al, 1996, JAMA 276: 802-810) were developed. More recently, a specific model, PIRO (Levy et al, 2003, Intensive Care Med 29: 530-538), has been proposed. All of these models use a combination of a wide range of general and specific clinical measures to attempt to derive a useful score reflecting the seriousness of the patient's condition and likely outcome.
In addition to the standard predictive models described above, the correlation of sepsis and a number of specific serum markers has been extensively studied with a view to developing specific diagnostic and prognostic tests, amongst which are the following.
C-reactive protein (CRP) is a liver-derived serum acute phase protein that is well-known as a non-specific marker of inflammation. More recently (Toh et al, 2003, Intensive Care Med: 55-61) a calcium dependent complex of CRP and very low density lipoprotein (VLDL), known as lipoprotein complexed C-reactive protein (LCCRP), has been shown to be involved in affecting the coagulation mechanism during sepsis. In particular, a common test known as the activated partial thromboplastin time develops a particular profile in cases of sepsis, and this has been proposed as the basis for a rapid diagnostic test.
TNF-α and IL-1 are archetypal acute inflammatory cytokines long known to be elevated in sepsis (Damas et al, 1989, Critical Care Med: 975-978) and have been reported to be useful predictors of organ failure in adult respiratory distress syndrome, a serious complication of sepsis (Meduri et al, 1995, Chest 107: 1062-1073). Activated complement product C3 (C3a) and IL-6 have been proposed as useful indicators of host response to microbial invasion, and superior to pyrexia and white blood cell counts (Groeneveld et al, 2001, Clin Diagn Lab Immunol: 1189-1195). Secretory phospholipase A2 was found to be a less reliable marker in the same study.
Procalcitonin is the propeptide precursor of calcitonin, serum concentrations of which are known to rise in response to LPS and correlate with IL-6 and TNF-α levels. Its use as a predictor of sepsis has been evaluated (Al-Nawas et al, 1996, Eur J Med Res 1: 331-333).
Using a threshold of 0.1 ng/ml, it correctly identified 39% of sepsis patients. However, other reports suggest that it is less reliable than the use of serial CRP measurements (Neely et al, 2004, J Burn Care Rehab: 76-80), although superior to IL-6 or IL-8 (Harbarth et al, Am J Resp Crit Care Med: 396-402).
Changes in neutrophil surface expression of leukocyte activation markers (such as CD11b, CD31, CD35, L-selectin, CD16) have been used as a marker of SIRS and have been found to correlate with IL-6 and subsequent development of organ failure (Rosenbloom et al, 1995, JAMA 274: 58-65). Similarly, expression of platelet surface antigens such as CD63, CD62P, CD36 and CD31 has been examined, but no reliable predictive model has been constructed.
Finally, it has been shown that downregulation of monocyte HLA-DR expression is a predictor of a poor outcome in sepsis and may be an indication of monocyte deactivation, impairing TNF-α production. Treatment with IFN-γ has been shown to be beneficial in such cases (Docke et al, 1997, Nature Med: 678-681).
However, although many of these markers correlate with sepsis and some give an indication of the seriousness of the condition, no single marker or combination of markers has yet been shown to be a reliable diagnostic test, much less a predictor of the development of sepsis.
The 2001 International Sepsis Definition Conference concluded that "the use of biomarkers for diagnosing sepsis is premature" (Levy et al, 2003, Intensive Care Med: 530-538).
Extracting reliable diagnostic patterns and robust prognostic indications from changes over time in complex sets of variables including traditional clinical observations, clinical chemistry, biochemical, immunological and cytometric data requires sophisticated methods of analysis.
The use of expert systems and artificial intelligence, including neural networks, for medical diagnostic applications has been under development for some time (Place et al, 1995, Clinical Biochemistry: 373-389; Lisboa, 2002, Neural Networks: 11-39). Specific systems have been developed in attempts to predict survival of sepsis patients (Flanagan et al, 1996, Clinical Performance & Quality Health Care 4: 96-103) by use of multiple logistic regression and neural network models using APACHE scores and the 1991 ACCP/SCCM SIRS criteria described above (Bone et al, 1992, Crit Care Med 20: 864-874). Such studies suggest that, although both approaches can give good predictive results, neural network systems are less sensitive to preselected threshold values (results of a number of studies reviewed by Rosenberg, 2002, Curr Opin Crit Care: 321-330). Brause et al (2004, Journal für Anästhesie und Intensivbehandlung: 40-43) provides an example of a neural network model being used for sepsis prediction. This model (MEDAN) analysed a range of standard clinical measures and compared its results with those obtained by using the APACHE II, SOFA, SAPS II, and MODS models. The study concluded that, of the markers available, the most informative were systolic and diastolic blood pressure, and platelet count.
Neural networks are non-linear functions that are capable of identifying patterns in complex data systems. This is achieved by using a number of mathematical functions that make it possible for the network to identify structure within a noisy data set. This is because data from a system may produce patterns based upon the relationships between the variables within the data. If a neural network sees sufficient examples of such data points during a period known as "training", it is capable of "learning" this structure and then identifying these patterns in future data points or test data. In this way, neural networks are able to predict or classify future examples by modelling the patterns present within the data they have seen. The performance of the network is then assessed by its ability to correctly predict or classify test data, with high accuracy scores indicating that the network has successfully identified true patterns within the data. The parallel processing ability of a neural network is dependent on the architecture of its processing elements, which are arranged to interact according to the model of biological neurones. One or more inputs are regulated by the connection weights to change the stimulation level within the processing element. The output of the processing element is related to its activation level and this output may be non-linear or discontinuous.
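A single processing element of the kind described above can be sketched as follows. This is a minimal illustration, assuming a weighted sum for the activation level and two example transfer functions (a sigmoid for a non-linear output and a step for a discontinuous one); the function name, the `bias` term and the transfer choices are assumptions, not taken from the text.

```python
import math


def processing_element(inputs, weights, bias=0.0, transfer="sigmoid"):
    """Weight the inputs, sum them, and pass the activation through a transfer function."""
    # Connection weights regulate how much each input changes the activation level.
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    if transfer == "step":
        # Discontinuous output: the element either fires or it does not.
        return 1.0 if activation >= 0.0 else 0.0
    # Non-linear, smoothly varying output between 0 and 1.
    return 1.0 / (1.0 + math.exp(-activation))


print(processing_element([1.0, 1.0], [0.5, -0.5]))                          # sigmoid output
print(processing_element([1.0, 0.0], [1.0, 1.0], bias=-0.5, transfer="step"))  # step output
```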
Training of a neural network therefore comprises an adjustment of interconnected weights depending on the transfer function of the elements, the details of the interconnected structure and the rules of learning that the system follows (Place et al, 1995, Clinical Biochemistry: 373-389). Such systems have been applied to a number of clinical situations, including health outcomes models of trauma patients (Marble & Healy (1999) Art Intell Med: 299-307).
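Weight adjustment during training can be illustrated with the classic perceptron learning rule, one of the simplest learning rules. It is used here purely as an example and is not claimed to be the rule used in the cited work; the toy data (a logical AND), the learning rate and the function names are all assumptions.

```python
def predict(weights, bias, inputs):
    """Step transfer function: output 1 when the weighted sum is positive."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust weights after each example in proportion to the prediction error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Weights move only when the prediction was wrong (error is +1 or -1).
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Toy training set: logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES])
```

Because the toy data are linearly separable, the rule settles on weights that classify every training example correctly; real clinical data would need a multi-layer network and a held-out test set.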
Warner et al (1996, Ann Clin Lab Sci: 471-479) describe a multiparametric model for predicting the outcome of sepsis, using measures of 'septic shock factor' (which appears to be simply whether the patients have signs of septic shock on admission), IL-6, soluble IL-6 receptor (as measured by enzyme-linked immunosorbent assay) and the APACHE II score as components of a four-input algorithm in a multi-layer, feed-forward neural network model.
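The general shape of a four-input, multi-layer, feed-forward network can be sketched as below. This is not the model from the cited study: the layer sizes, weights and the pre-scaled input values are hypothetical placeholders chosen only to show how signals pass from inputs through a hidden layer to a single output.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def feed_forward(inputs, hidden_layer, output_unit):
    """hidden_layer is a list of (weights, bias); output_unit is one (weights, bias)."""
    # Each hidden element sees every input.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in hidden_layer
    ]
    # The output element sees every hidden element; nothing feeds backwards.
    out_weights, out_bias = output_unit
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


# Hypothetical, untrained weights for a 4-input, 2-hidden-unit, 1-output network.
HIDDEN = [([0.8, 0.5, -0.3, 0.6], -0.4), ([-0.2, 0.7, 0.4, 0.9], -0.6)]
OUTPUT = ([1.2, 1.0], -1.1)

# Four inputs scaled to the range 0..1 (illustrative values only).
patient = [1.0, 0.7, 0.3, 0.6]
print(feed_forward(patient, HIDDEN, OUTPUT))
```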
However, this system is not predictive for individuals who do not yet have clinical signs and, arguably, by the time serum levels of cytokines such as IL-6 are raised, the diagnosis, if not the outcome, is clinically obvious.
Dybowski et al (1996, Lancet 347: 1146-1150) use Classification and Regression Trees (CART) to select inputs from 157 possible sepsis prediction criteria and then use a neural network running a genetic algorithm to select the best combination of predictive markers.
These include many routine clinical values and proxy indicators rather than serum or cell surface biomarkers. However, the problem being addressed is the prognosis of patients who already have a clear diagnosis of sepsis and are already critically ill.
A further refinement of the genetic algorithm approach involves the use of Artificial Immune Systems, of which one version is the Artificial Immune Recognition System (AIRS) (Timmis et al, An overview of Artificial Immune Systems. In: Paton, Bolouri, Holcombe, Parish and Tateson (eds.) "Computation in Cells and Tissues: Perspectives and Tools for Thought", Natural Computation Series, pp 51-86, Springer, 2004; L.N. De Castro and J. Timmis, Artificial Immune Systems: A New Computational Intelligence Approach, Springer-Verlag, 2002), which are adaptive systems inspired by the clonal selection and affinity maturation processes of biological immune systems as applied to artificial intelligence.
Immunologically speaking, AIRS is inspired by the clonal selection theory of the immune system (F. Burnet. The Clonal Selection Theory of Acquired Immunity. Cambridge University Press, 1959). The clonal selection theory attempts to explain how, through a process of matching, cloning, mutation and selection, antibodies are created that are capable of identifying infectious agents. AIRS capitalises on this process and, through a process of matching, cloning and mutation, evolves a set of memory detectors that are capable of being used as classifiers for unseen data items. Unlike other immune inspired approaches, such as negative selection, AIRS is specifically designed for use in classification, more specifically one-shot supervised learning.
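The matching, cloning, mutation and selection cycle can be sketched in a heavily simplified form. The code below is not the published AIRS algorithm (it omits resource competition and keeps only one memory cell per class); every name, parameter and data point is a hypothetical choice made for illustration.

```python
import random


def distance(a, b):
    """Euclidean distance; a smaller distance means a better match."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train_memory_cells(samples, clones_per_item=10, mutation_scale=0.1, seed=0):
    """Evolve one memory cell per class from labelled training items."""
    rng = random.Random(seed)
    memory = []  # list of (vector, label)
    for vector, label in samples:
        same_class = [i for i, (_, lab) in enumerate(memory) if lab == label]
        if not same_class:
            memory.append((list(vector), label))  # seed the first cell of a class
            continue
        # Matching: find the memory cell closest to the training item.
        best = min(same_class, key=lambda i: distance(memory[i][0], vector))
        # Cloning and mutation: copy the cell with small random changes.
        clones = [
            [x + rng.gauss(0.0, mutation_scale) for x in memory[best][0]]
            for _ in range(clones_per_item)
        ]
        # Selection: keep the best clone only if it matches better than its parent.
        candidate = min(clones, key=lambda c: distance(c, vector))
        if distance(candidate, vector) < distance(memory[best][0], vector):
            memory[best] = (candidate, label)
    return memory


def classify(memory, vector):
    """Label an unseen item with the class of its nearest memory cell."""
    return min(memory, key=lambda cell: distance(cell[0], vector))[1]


SAMPLES = [
    ((0.0, 0.0), "A"), ((5.0, 5.0), "B"), ((0.4, 0.2), "A"),
    ((4.6, 5.2), "B"), ((0.1, 0.5), "A"), ((5.3, 4.9), "B"),
]
memory = train_memory_cells(SAMPLES)
print(classify(memory, (0.3, 0.3)), classify(memory, (4.9, 5.0)))
```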
US patent application 2002/0052557 describes a method of predicting the onset of a number of catastrophic illnesses based on the variability of the heart-rate of the patient. Again, a neural network is among the possible methods of modelling and analysing the data.
International patent application WO 00/52472 describes a rapid assay method for use in small children based on the serum or neutrophil surface levels of CD11b or 'CD11b complex' (Mac-1, CR3). The method uses only a single marker, and one which is, arguably, a well-known marker of neutrophil activation in response to inflammation.
The alternative approach to analysing such complex data sets where the data are often qualitative and discrete, rather than quantitative and continuous, is to use sophisticated statistical analysis techniques such as logistic regression. Where logistic regression using qualitative binary dependent variables is insufficiently discriminating in terms of selecting significant variables, multivariate techniques may be used. The outputs from both multiple logistic regression models and neural networks are continuously variable quantities but the likelihoods calculated by neural network models usually fall at one extreme or the other, with few values in the middle range. In a clinical situation this is often helpful and can give clearer decisions (Flanagan et al, 1996, Clinical Performance & Quality Health Care 4: 96-103).
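The contrast between spread-out and extreme likelihoods can be made concrete with a small sketch. The logistic function below is the standard form used in logistic regression; the coefficients, the 0.2 to 0.8 "middle range" and the use of a steep coefficient as a stand-in for a saturating network output are all assumptions made for illustration, not results from the cited study.

```python
import math


def logistic_probability(values, coefficients, intercept=0.0):
    """Standard logistic model: a continuous likelihood between 0 and 1."""
    z = intercept + sum(c * v for c, v in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))


def middle_range_fraction(probabilities, low=0.2, high=0.8):
    """Fraction of outputs that are neither clearly low nor clearly high."""
    inside = [p for p in probabilities if low <= p <= high]
    return len(inside) / len(probabilities)


scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
# Gentle slope: outputs spread across the whole range.
gentle = [logistic_probability([s], [1.0]) for s in scores]
# Steep slope (hypothetical stand-in for a saturating output): outputs pushed to the extremes.
steep = [logistic_probability([s], [10.0]) for s in scores]

print(middle_range_fraction(gentle), middle_range_fraction(steep))
```

Fewer mid-range values means fewer ambiguous cases once a decision threshold is applied, which is the practical benefit the paragraph above describes.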
The ability to detect the earliest signs of infection and/or sepsis has clear benefits in terms of allowing treatment as soon as possible. Indications of the severity of the condition and likely outcome if untreated inform decisions about treatment options. This is relevant both in vulnerable hospital populations, such as those in intensive care, or who are burned or immunocompromised, and in other groups in which there is an increased risk of serious infection and subsequent sepsis. The use or suspected use of biological weapons in both battlefield and civilian settings is an example where a rapid and reliable means of testing for the earliest signs of infection in individuals exposed would be advantageous.
The applicant's earlier International application, WO 2006/061644 (incorporated by reference), discloses a method for detecting early signs of infection based on measurement of expression levels of particular combinations of cytokines and/or cellular activation markers. Expression was measured by either cell surface expression as detected by FACS, or at a transcriptional level by RT-PCR, optionally combined with the use of predictive algorithms.
Despite the greater knowledge of both the molecular basis of, and physiological response to, sepsis, a need remains for a method of predicting sepsis as early as possible in the course of an infection, preferably during the therapeutic window of intervention, prior to the onset of clinical symptoms and disease. It is an object of the invention to identify novel markers and combinations of biomarkers, preferably useful for screening by means of micro-array technology. The approach of the prior art described above may be characterised as the selection of genes known to be in some way related to the processes of the inflammatory or immunological response to infection and testing their usefulness in various types of assay.
This is logical but presupposes that the processes involved in the earliest stages of infection are well-characterised and that the earliest genes to be activated are known. It also fails to consider the possibility of informative epiphenomena, that is, genes that are activated incidentally, or as part of a parallel or peripheral response. An alternative approach is to screen a wide range of potentially expressed (and, in some cases, apparently completely unconnected) gene sequences to identify those which, despite this, are nevertheless useful predictors of infection, either alone or in combination.
CD40 is a TNF-receptor superfamily member expressed on T and B lymphocytes, among other cells, and is required for a wide variety of immune and inflammatory responses, in particular B cell immunoglobulin production and isotype switching, and development of memory B cells (Grewal & Flavell, 1998, Annu Rev Immunol 16: 111). Its ligand is another leukocyte cell surface molecule, CD154. Two alternately spliced isoforms are known, the longer isoform (1) being encoded by transcript variant 1 (NCBI accession number NM_001250, SEQ ID NO:1).
CD5 is also a cell surface receptor expressed on T and B lymphocytes where it interacts with its ligand CD72 and has a role in modulating the immune response (Berland & Wortis, 2002, 20: 253). The cDNA sequence encoding human CD5 has the NCBI accession number NM_014207 (SEQ ID NO:2).
CD79A, previously known as MB-1 or Ig-α, is part of the B cell antigen receptor complex together with another similar molecule, CD79B (B29 or Ig-β), and the surface immunoglobulin chains. CD79A and B are involved in signal transduction and B cell surface immunoglobulin expression (Jumaa et al, 2005, Annu Rev Immunol: 415). There are two known transcript variants, and the longer transcript sequence is listed at NCBI accession number NM_001783 (SEQ ID NO:3).
CRX is the gene for cone-rod homeobox, a homeodomain transcription factor that controls differentiation in photoreceptor cells and is required for normal cone and rod cell function.
Mutations in this gene are associated with photoreceptor degeneration (Leber congenital amaurosis type III and autosomal dominant cone-rod dystrophy 2), but no immunological functions are known (Chen et al, 2002, Human Molecular Genetics, 11: 873). The cDNA sequence is available at NM_000554 (SEQ ID NO:4).
CTNND1 is the gene encoding catenin (cadherin-associated protein) delta-1, a member of the armadillo family of proteins (previously known as p120 cas and p120 catenin). It is one of a number of proteins (others being β-catenin and plakoglobin) that bind to the cytoplasmic region of cadherins, modulating cell adhesion and linking cadherins to the cytoskeleton (Franz & Ridley, 2004, J Biol Chem: 6588). Such molecules may also have a role in signal transduction through rho family GTPases. The cDNA sequence is available at NM_001331 (SEQ ID NO:5).
CX3CL1 encodes chemokine (C-X3-C motif) ligand 1, an unusual chemokine (previously known as fractalkine) characterised by the unique spacing of the first 2 cysteines in its chemokine cysteine motif and its dual role as a chemoattractant and cell adhesion molecule involved in the inflammatory response. It is expressed as a cell surface molecule but a soluble form is generated by juxtamembrane proteolytic cleavage (Umehara et al, 2004, Arterioscler Thromb Vasc Biol: 34). The cDNA sequence is available at NM_002996 (SEQ ID NO:6).
ENTPD2 is the gene for ectonucleoside triphosphate diphosphohydrolase 2 (otherwise known as CD39L or NTPDase-2). ENTPD5 is the related ectonucleoside triphosphate diphosphohydrolase 5 (CD39L4 or NTPDase-5). These molecules are cell surface ATP-hydrolyzing enzymes responsible for the breakdown of extracellular nucleotides, thus regulating a complex system of cell signalling via large families of purine and pyrimidine receptors. ENTPD2 exists in a number of splice variants, which may have distinct functions (Wang et al, 2005, Biochem J 385: 729). A long isoform is encoded by the cDNA sequence of NM_203468 (SEQ ID NO:7). NM_001246 encodes a shorter isoform with a truncated C-terminus. The ENTPD5 sequence is available at NM_001249 (SEQ ID NO:8).
EPHA8 is a gene encoding the ephrin A8 receptor, a member of the ephrin receptor subfamily of receptor tyrosine kinases. The ephrin A8 receptor functions as a receptor for ephrin A2, A3 and A5 and is involved in short-range contact-mediated axonal guidance during development of the nervous system (Gu et al, 2005, Oncogene: 4243). There is a splice variant shortened at the C-terminus (not yet detected at the protein level) but the longer isoform is encoded by the sequence of NM_020526 (SEQ ID NO:9).
GPR44 encodes G protein-coupled receptor 44, more widely known as chemoattractant receptor-homologous molecule expressed on Th2 cells (CRTH2). This is the prostaglandin D2 (PGD2) receptor responsible for mediating the inflammatory effect of PGD2 on a variety of leukocytes and other cells (Hata et al, 2005, J Biol Chem 280: 32442). It is implicated in the skewing of the T cell response to a Th2 pattern during sepsis, and low levels of expression of CRTH2 are associated with a poor outcome (Venet et al, 2004, Clin Immunol: 278). The sequence is available at NM_004778 (SEQ ID NO:10).
HDAC5 is histone deacetylase 5, a class II histone deacetylase that represses transcription when tethered to a promoter. Histone acetylation/deacetylation alters chromatin structure and is a major factor controlling gene expression. HDAC5 is thought to interact with MEF2 family proteins and may play a role in myogenesis (Zhang et al, 2002, Mol Cell Biol: 7302). There are two known isoforms encoded by two splice variants. NM_001015053 relates to the longer transcript (SEQ ID NO:11).
HMMR is the gene for hyaluronan-mediated motility receptor (RHAMM). RHAMM is thought to be involved in invasion and metastasis of tumour cells. Although widely expressed on tumour cells, in normal tissue its expression is limited to testis, placenta and thymus. There is a truncated splice variant lacking an internal segment. NM_012484 represents the longest transcript (SEQ ID NO:12).
IL-8 is very widely known as a member of the CXC family of chemokines and is a prime mediator of the inflammatory response, being a potent chemotactic and angiogenic factor. It has been reported to be a relatively poor predictor of sepsis (Harbarth et al, Am J Resp Crit Care Med j4: 396). The sequence is available at NM_000584 (SEQ ID NO:13).
MAP1A encodes microtubule-associated protein 1A, a member of a family of microtubule-associated proteins involved in microtubule assembly. MAP1A is expressed predominantly in the brain. The functional protein comprises light and heavy chains resulting from proteolytic processing of a single propeptide encoded by the sequence of NM_002373 (SEQ ID NO:14).
MAPK7 is the gene encoding mitogen-activated protein kinase 7 (MAP kinase 7 or ERK5). The MAP kinases occupy a central role in the intracellular signalling cascades from a number of receptor tyrosine kinases and G protein-coupled receptors, but MAPK7 differs from the others in that it not only has protein kinase activity but is also capable of translocating to the nucleus, where it appears to be able to phosphorylate and activate transcription factors directly (Buschbeck & Ullrich, 2005, J Biol Chem Q: 2659). Four alternative transcripts encoding two distinct isoforms have been reported. The longest transcript is represented by the sequence of NM_002749 (SEQ ID NO:15).
MEF2D is the gene for MADS box transcription enhancer factor 2, polypeptide D (myocyte enhancer factor 2D). Originally described as a muscle-specific transcription factor, MEF2 is now known to exist as four alternatively spliced isoforms (A-D) that are differentially expressed in a range of tissues (Zhu et al, J Biol Chem, 2005, Q: 28749). MEF2D appears to be involved in leukocyte activation, and chromosomal translocations resulting in MEF2D fusion proteins contribute to the development of some acute lymphoblastic leukaemias (Prima et al, 2005, Leukemia j: 806). The MEF2D sequence is available as NM_005920 (SEQ ID NO:16).
ODF1 is outer dense fibre of sperm tails 1 and encodes the major protein of the outer dense fibre layer surrounding the axoneme of sperm tails. Defects in the outer dense fibres lead to abnormal sperm morphology and infertility. There is no known connection with genes involved with the inflammatory response. The sequence is available as NM_022410 (SEQ ID NO:17).
SAA3P denotes the serum amyloid A3 pseudogene. The serum amyloid A (SAA) superfamily consists of two acute phase genes, SAA1 and SAA2, and a constitutively expressed gene, SAA4. SAA3P appears to be a non-expressed pseudogene. The predicted open reading frame contains an insertion causing a frameshift, which generates a premature stop codon. The resultant hypothetical protein has been expressed. The genomic sequence is available as NG_002634 (SEQ ID NO:18).
SLC6A9 is solute carrier family 6 (neurotransmitter transporter, glycine) member 9 (GLYT1). A member of a large superfamily of transporter proteins, SLC6A9 is a sodium:glycine symporter, which may be involved in inhibitory glycinergic neurotransmission. There are a number of splice variants encoding three known isoforms. The longest transcript (giving rise to isoform 2) is available as NM_201649 (SEQ ID NO:19).
SPN is the gene for CD43 (leukosialin, sialophorin). Leukosialin is a major sialoglycoprotein of most leukocytes. It appears to play a part in modulating cell-cell interactions, including T cell activation (Daniels et al, 2002, Nature Immunol: 903). The cDNA sequence is available at NM_003114 (SEQ ID NO:20).
TDGF1 is teratocarcinoma-derived growth factor 1 (previously known as Cripto). It is a cell surface, glycosyl phosphatidylinositol (GPI)-anchored molecule, a member of the EGF-CFC family of growth factor-like molecules (Shen, 2003, J Clin Invest jj.: 500). It is over-expressed in a wide range of carcinomas but is not known to have a role in inflammation or the immune response. The cDNA sequence is at NM_003212 (SEQ ID NO:21).
TSC22D1 is TSC22 domain family member 1. It is the founding member of the TSC22 family of early response gene transcription factors and is particularly involved in the TGF-β signalling pathway (and was formerly known as TGF-β1-induced transcript 4, TGFB1I4) (Gupta et al, 2003, J Biol Chem Z: 7331). The accession number is NM_006022 (SEQ ID NO:22).
There remains a need for further improvements in the early detection and diagnosis of sepsis, including consideration of genes not obviously connected with inflammation and immunity.
Statement of Invention
In a first aspect, the invention describes a system and methods of detecting early signs of infection, SIRS or sepsis several days before clinical signs become apparent. It also provides methods capable of predicting the timing of the clinical course of the condition. The system comprises analysing the results of one or more sets of tests based on biological samples, preferably blood samples. Optionally, other routine clinical measurements may be included for analysis.
The method comprises determining the level of expression of biomarkers shown herein to be positively correlated to developing sepsis.
First, there is a diverse group of genes, many of which have no established connection to the development of acute inflammation or the initiation of an immune response, including CRX, CTNND1, CX3CL1, ENTPD2, ENTPD5, EPHA8, GPR44, HDAC5, HMMR, MAP1A, MAPK7, MEF2D, ODF1, SAA3P, SLC6A9, TDGF1 and TSC22D1. Measuring the expression levels of these genes in combination surprisingly provides early predictive and prognostic information as to the likelihood of sepsis developing in an individual exposed to infective agents.
A further group of biomarkers are chemokines or cytokines expressed in blood leukocytes, or are leukocyte surface receptors with an established role in immune function. This group includes CD178 (FAS-L), MCP-1 (monocyte chemotactic protein-1), TNF-α, IL-1β, IL-6, IL-8, IL-10, INF-α, INF-γ, CD5 and CD79A. CD178 is encoded and expressed as a type-II membrane protein, but may be considered as a cytokine since it is cleaved by a metalloprotease to release a soluble homotrimer, soluble FasL or sFasL.
In a preferred embodiment, expression of the biomarkers is detected by specific amplification of mRNA by reverse transcription polymerase chain reaction (RT-PCR).
The method involves screening a biological sample to detect early stages of infection, SIRS or sepsis, comprising the steps of:
a. detecting expression of a first set of informative biomarkers by RT-PCR and detecting expression of a second set of informative biomarkers by RT-PCR;
b. analysing the results of detection; and
c. classifying said sample according to the likelihood and/or timing of the development of overt infection,
wherein the first set of informative biomarkers, the expression of which is detected by means of RT-PCR, consists of at least 1 selected from the list consisting of CD40, CD5, CD79A, CRX, CTNND1, CX3CL1, ENTPD2, ENTPD5, EPHA8, GPR44, HMMR, IL-8, MAP1A, MAPK7, MEF2D, ODF1, SAA3P, SLC6A9, SPN, TDGF1, TSC22D1 and HDAC5, and wherein the second set of informative biomarkers, the expression of which is detected by means of RT-PCR, consists of at least 1 selected from the list consisting of CD178, MCP-1, TNF-α, IL-1β, IL-6, IL-10, INF-α and INF-γ.
Preferably, the first set of informative biomarkers, the expression of which is detected by means of RT-PCR, consists of at least 2, more preferably 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or 22 selected from the list. It will be understood that, in general, the greater the number of markers, the greater the accuracy of the prediction. Set against this is the greater complexity and time taken for the analysis.
A preferred selection of markers is SAA3P, MAP1A, EPHA8, CD40, IL-8, CRX, SPN, MEF2D, MAPK7, HMMR, ENTPD2 and TSC22D1. Further selections of markers providing high levels of prediction of onset of sepsis are as follows:
a. SPN, CD40, SAA3P, IL-8, MEF2D, EPHA8, MAP1A and CRX
b. MEF2D, ENTPD2, CD40, EPHA8, SAA3P, HMMR, IL-8, MAPK7, SPN and TSC22D1
c. SPN, EPHA8, CD40, CRX, TSC22D1, MAPK7, HMMR, MEF2D and ENTPD2
d. HMMR, SAA3P, ENTPD2, MAP1A, MAPK7, TSC22D1 and IL-8
e. TDGF1, HDAC5, ENTPD5, ODF1, GPR44, CD5, CD79A, CTNND1, CX3CL1 and SLC6A9
f. IL-8, SAA3P, CD40, SPN, MEF2D, EPHA8, MAPK7, ENTPD2 and HMMR
g. MEF2D, ENTPD2, EPHA8, SAA3P, HMMR, MAPK7, SPN and TSC22D1
h. CD40, ENTPD2, EPHA8, IL-8, SAA3P, MEF2D and SPN
Preferably, at least two biomarkers from the second set of informative biomarkers are detected. Alternatively, 3, 4, 5, 6, 7 or 8 biomarkers from the second set of biomarkers are detected in combination with at least one biomarker from the list consisting of CD40, CD5, CD79A, CRX, CTNND1, CX3CL1, ENTPD2, ENTPD5, EPHA8, GPR44, HMMR, IL-8, MAP1A, MAPK7, MEF2D, ODF1, SAA3P, SLC6A9, SPN, TDGF1, TSC22D1 and HDAC5.
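By way of illustration only, the requirement that a panel contains at least one marker from each set can be checked programmatically. The following Python sketch copies the two marker lists from the method above; the function names `split_panel` and `panel_is_valid`, and the plain-ASCII spellings of the Greek-lettered markers, are assumptions made for this example and are not part of the described system.

```python
# Marker names are taken from the method statement; all function names
# and the ASCII spellings (e.g. "TNF-alpha") are illustrative assumptions.
FIRST_SET = frozenset({
    "CD40", "CD5", "CD79A", "CRX", "CTNND1", "CX3CL1", "ENTPD2", "ENTPD5",
    "EPHA8", "GPR44", "HMMR", "IL-8", "MAP1A", "MAPK7", "MEF2D", "ODF1",
    "SAA3P", "SLC6A9", "SPN", "TDGF1", "TSC22D1", "HDAC5",
})
SECOND_SET = frozenset({
    "CD178", "MCP-1", "TNF-alpha", "IL-1beta", "IL-6", "IL-10",
    "INF-alpha", "INF-gamma",
})


def split_panel(markers):
    """Split a proposed panel into its first-set and second-set members."""
    chosen = set(markers)
    unknown = chosen - FIRST_SET - SECOND_SET
    if unknown:
        raise ValueError(f"markers not in either set: {sorted(unknown)}")
    return chosen & FIRST_SET, chosen & SECOND_SET


def panel_is_valid(markers):
    """True if the panel has at least one marker from each set."""
    first, second = split_panel(markers)
    return len(first) >= 1 and len(second) >= 1
```

For example, `panel_is_valid(["SPN", "CD40", "IL-6"])` is true, whereas a panel drawn only from the first set is rejected.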
In a preferred embodiment, analysis of the results yields a prediction of a probability of clinical SIRS or sepsis developing. Alternatively the analysis may be expressed as a binary yes/no prediction of clinical SIRS or sepsis developing.
In one alternative, if the development of clinical SIRS or sepsis is predicted, the results are subjected to a second analysis to determine the likely timing and/or severity of the clinical disease.
The results of one or more sets of tests are analysed, preferably by means of a neural network program enabling a yes/no prediction of the patient from whom the sample was taken developing sepsis to be calculated.
In the case of a positive prediction, preferably a further analysis is then performed allowing an estimate to be made as to the time to onset of overt clinical signs and symptoms.
Alternatively, both analyses may be performed by multivariate logistic regression.
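As a minimal sketch of how a multivariate logistic regression converts a set of marker measurements into a probability, the following Python function applies the standard logistic formula. The coefficient and intercept values shown are hypothetical placeholders; fitted values are not given in this specification.

```python
import math


def logistic_probability(values, coefficients, intercept):
    """Return 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for one sample."""
    if len(values) != len(coefficients):
        raise ValueError("one coefficient is needed per measurement")
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    # Evaluate in a form that cannot overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients for three normalised expression values.
EXAMPLE_COEFFICIENTS = [1.2, -0.7, 0.4]
EXAMPLE_INTERCEPT = -0.3
p = logistic_probability([0.5, 0.1, 0.9], EXAMPLE_COEFFICIENTS, EXAMPLE_INTERCEPT)
```

The resulting probability can be reported directly or compared with a threshold to give a binary prediction.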
Analysis of the test groups can be performed individually or simultaneously. In one alternative method, further clinical data are entered into the neural net as supplementary data to the PCR data. At the same time flow cytometry data can be processed by the neural network. Only one set of data is required for processing through the neural net although there are advantages in inputting one, two or all three data sets as these additional examples help "train" the neural net and improve confidence in the output from the program.
In a further aspect the neural network is used to process pre-recorded clinical data or a database of such data may be used to train the neural network and improve its predictive power.
Suitable clinical data include at least one, preferably at least three, more preferably at least five selected from the list consisting of temperature, heart rate, total and differential white blood cell count (monocytes, lymphocytes, granulocytes, neutrophils), platelet count, serum creatinine, urea, lactate, base excess, PO2, HCO3, and C-reactive protein.
The method may be used as part of routine monitoring for intensive care patients, where regular blood samples are taken for other purposes. Other hospital patients who may be predisposed to infections and/or sepsis may also be monitored. Such predisposing conditions include inherited or acquired immunodeficiencies (including HIV/AIDS) or immunosuppression (such as general surgery patients, transplant recipients or patients receiving steroid treatment), diabetes, lymphoma, leukaemia or other malignancy, penetrating or contaminated trauma, burns or peritonitis. In another aspect, the method of the invention may be used to screen individuals during an outbreak of infectious disease or alternatively individuals who have been, or who are suspected of having been, exposed to infectious pathogens, whether accidentally or deliberately as the result of bioterrorism or of use of a biological weapon during an armed conflict.
In one alternative embodiment, the result of the analysis is expressed as a probability. In an alternative embodiment it is expressed as a binary yes/no result. Optionally, where the first analysis suggests that SIRS or sepsis is probable (defined as exceeding a predetermined arbitrary threshold probability, or as a "yes" prediction), said results are subjected to a second analysis to determine the likely time to development of overt clinical signs, or to give an indication of probable severity of the clinical disease.
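The two-stage arrangement described above may be sketched as follows. This is an illustrative outline only: `first_analysis` and `second_analysis` stand in for whichever trained models are used, and the function name, the returned dictionary keys and the default threshold of 0.5 are assumptions made for the example.

```python
def screen_sample(sample, first_analysis, second_analysis, threshold=0.5):
    """Run the first analysis and, only if positive, the second.

    first_analysis(sample)  -> probability of SIRS or sepsis developing
    second_analysis(sample) -> estimate of timing (or severity)
    """
    probability = first_analysis(sample)
    # "Positive" means the probability exceeds the predetermined threshold.
    positive = probability > threshold
    timing = second_analysis(sample) if positive else None
    return {
        "probability": probability,
        "prediction": "yes" if positive else "no",
        "days_to_onset": timing,
    }
```

The second analysis is skipped entirely for negative samples, so its cost is only incurred where a positive prediction has been made.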
In one highly favoured embodiment, said analysis is by means of a neural network. Most preferably it is a multilayered perceptron neural network. Preferably such a neural network is capable of correctly predicting SIRS or sepsis in greater than 70% of cases (determined in trials where such development is not prevented by prophylactic treatment in a control group), more preferably in at least 80% of cases, even more preferably in at least 85% of cases and most preferably in at least 95% of cases. It is preferred that SIRS or sepsis can be predicted at least one day before the onset of overt clinical signs, more preferably at least two days, still more preferably at least three days and most preferably more than three days before SIRS or sepsis is diagnosed.
In another favoured embodiment analysis is by means of multivariate statistical analysis, preferably comprising principal component analysis and/or discriminant function analysis. It is more preferred that the multivariate statistical analysis comprises discriminant function analysis.
In a further aspect, the invention provides a system for screening a biological sample to detect early stages of infection, SIRS or sepsis comprising: a means of extracting and purifying RNA from cells obtained from said sample, a thermal cycler or other means to amplify selected RNA sequences by means of reverse transcription polymerase chain reaction (RT-PCR), a means of detecting and quantifying the results of said RT-PCR, a computer-based neural network trained so as to be able to analyse such results and a display means whereby the conclusion of the neural network analysis may be communicated to an operator. Note: in this aspect the results of the RT-PCR may be analysed using discriminant function analysis, but the neural network is the preferred embodiment.
In a further aspect the invention provides analysis according to any embodiment of the method described above for the preparation of a diagnostic means for the diagnosis of SIRS, sepsis or infection, or the use of the system described above for the preparation of a diagnostic means for the diagnosis of infection.
Also provided is a method of early diagnosis of SIRS, infection or sepsis according to the method as described above.
Where standard clinical measurements are analysed, these consist of at least one, preferably at least three, more preferably at least five selected from the list consisting of temperature, heart rate, total and differential white blood cell count (monocytes, lymphocytes, granulocytes, neutrophils), platelet count, serum creatinine, urea, lactate, base excess, PO2, HCO3, and C-reactive protein.
Throughout the description and claims of this specification, the words "comprise" and "contain" and variations of the words, for example "comprising" and "comprises", mean "including but not limited to", and are not intended to (and do not) exclude other moieties, additives, components, integers or steps.
Throughout the description and claims of this specification, the singular encompasses the plural unless the context otherwise requires. In particular, where the indefinite article is used, the specification is to be understood as contemplating plurality as well as singularity, unless the context requires otherwise.
Features, integers, characteristics, compounds, chemical moieties or groups described in conjunction with a particular aspect, embodiment or example of the invention are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith.
Detailed description of the invention
The invention will be described in further detail with reference to the following Figures and Examples.
Figure 1: Following infection, cells of the immune system recognise and respond to a pathogen by becoming activated. This results in the production of different messenger proteins (e.g. cytokines and chemokines) and expression of activation markers and adhesion molecules on the cell surface. The production of these facilitates communication between cells and results in a co-ordinated immune response against a particular agent. Since this inflammatory immune response is relatively constant in response to infection, and occurs in the very earliest stages of the disease process, monitoring changes in the expression of such markers can be used to predict the early stages of sepsis development. Ideally this is done during the therapeutic window of intervention, prior to the onset of clinical symptoms and disease.
Figure 2: A plot of the CD31 expression measured on granulocytes by flow cytometry, from blood samples taken from patients three days before diagnosis of sepsis (n=6), and in ICU patients who did not go on to develop sepsis (n=24). Each symbol represents a measurement from one patient.
Figure 3: Design of neural network analysing clinical data according to Table 4, model 2. WCC= white cell count, CRP= C-reactive protein.
Figure 4: Change in cytokine profile obtained following in vitro blood infection with S. aureus. Data from blood taken from three volunteers as detailed in Example 8.
Figure 5: Results of neural network analysis of S. aureus in vitro sepsis model.
Example 1: Prediction of sepsis by neural network analysis of cytokine expression, cell surface markers and clinical measures.
Study design and patients
The study into the onset of sepsis from the ICU department of Queen Alexandra hospital resulted in a cohort of ninety-one patients (Dstl/CR08631). Blood samples were collected daily from these patients throughout their stay in the ICU and, in total, twenty-four patients were diagnosed as developing sepsis. Samples taken on the day clinical sepsis was diagnosed (Day 0), back through to six days prior to sepsis diagnosis (Day -6), were analysed by RT-PCR and flow cytometry for the expression of cytokine mRNA and activation markers respectively. In addition, standard hospital data and clinical observations were recorded.
Samples from control patients were also processed in the same manner to provide data for traditional statistical analysis.
RT-PCR was performed according to commonly-used laboratory techniques. Briefly, in the case of a blood sample, whole blood was taken and cells then lysed in the presence of an RNA stabilising reagent. RNA was separated by affinity binding of beads, which were isolated by centrifugation (or magnetically, as appropriate), contaminating DNA removed by DNase digestion and the RNA subjected to RT-PCR.
Fluorescence activated cell sorting (FACS) flow cytometry is very well-known in the art and any standard technique may be used.
Data analysis
The complexity of biological systems and intricate relationships between the markers used in this study caused standard linear techniques of data analysis to give inconclusive results. Consequently it was unclear whether any patterns existed in the data and a more powerful technique, capable of non-linear modelling, was sought to cope with the complexity of the data sets.
For analysis, data was collated from patients 1 to 4 days prior to the onset of sepsis and compared with an age/sex matched control group consisting of ICU patients who did not develop sepsis. Individual samples provided data measuring up to 56 different parameters and selective combinations of variables were fed into a multi-layered perceptron neural network (Proforma, Hanon Solutions, Glasgow, Scotland).
Each network was trained with a random 70% selection of balanced sepsis and control data using back propagation algorithms and then tested with the remaining 30% of the data. Five attempts were made at modelling the data within this network, each model differing in its ability to generalise to the data. The most successful model was the one most capable of correctly classifying previously unseen patients as being from either the sepsis or non-sepsis control group.
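The train/test procedure described above can be outlined in a few lines. The sketch below is illustrative only: it assumes that "balanced" means the 70/30 split is made separately within the sepsis and control groups, and `fit` is a hypothetical stand-in for whatever routine trains a network and returns a prediction function. It does not reproduce the commercial software used in the study.

```python
import random


def balanced_split(sepsis, control, train_fraction=0.7, rng=None):
    """Randomly split each group so both appear in training and test data."""
    rng = rng or random.Random()
    train, test = [], []
    for label, group in (("sepsis", sepsis), ("control", control)):
        items = list(group)
        rng.shuffle(items)
        cut = round(train_fraction * len(items))
        train += [(x, label) for x in items[:cut]]
        test += [(x, label) for x in items[cut:]]
    return train, test


def mean_accuracy(sepsis, control, fit, repeats=5, seed=0):
    """Average test accuracy over several independent random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train, test = balanced_split(sepsis, control, rng=rng)
        predict = fit(train)  # fit() returns a function: sample -> label
        hits = sum(predict(x) == label for x, label in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)
```

Repeating the split and averaging, as in `mean_accuracy`, reduces the dependence of the reported score on any one random partition of the patients.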
Results
Table 2 shows an example of a successful model that classified or "scored" 29/35 (or 82.9%) test patients correctly.
Table 2. Classification readout using cytokine mRNA variables (Days 1 to 4)
                    Score    % of sample correctly predicted
Total patients      29/35    82.9%
Control patients    16/20    80.0%
Sepsis patients     13/15    86.7%
To increase confidence in this model, this was carried out five times, each time using a different random selection of data with which to train and test the network. Once completed, the scores for the individual models were averaged to give an overall indication of the network's ability to classify patients into the correct sepsis or non-sepsis control group.
A series of 5 datasets gives a mean accuracy of prediction of approximately 80%, as shown by Table 3 below.
Table 3: neural network predicting sepsis using RT-PCR data only (Classification Performance Analysis of 5 projects)
Name         Hits/Occurred   %       Hits/Predicted   %       Chance %   Improvement %   Ratio
MODEL 1
Condition    23/27           85.2    N/A              N/A     50.0       35.2            1.7:1
Sepsis       9/13            69.2    9/9              100.0   48.1       51.9            2.1:1
Control      14/14           100.0   14/18            77.8    51.9       25.9            1.5:1
MODEL 2
Condition    29/35           82.9    N/A              N/A     50.0       32.9            1.7:1
Sepsis       13/15           86.7    13/17            76.5    42.9       33.6            1.8:1
Control      16/20           80.0    16/18            88.9    57.1       31.7            1.6:1
MODEL 3
Condition    32/40           80.0    N/A              N/A     50.0       30.0            1.6:1
Sepsis       20/22           90.9    20/26            76.9    55.0       21.9            1.4:1
Control      12/18           66.7    12/14            85.7    45.0       40.7            1.9:1
MODEL 4
Condition    21/26           80.8    N/A              N/A     50.0       30.8            1.6:1
Sepsis       9/12            75.0    9/11             81.8    46.2       35.7            1.8:1
Control      12/14           85.7    12/15            80.0    53.8       26.2            1.5:1
MODEL 5
Condition    23/29           79.3    N/A              N/A     50.0       29.3            1.6:1
Control      12/15           80.0    12/15            80.0    51.7       28.3            1.5:1
Sepsis       11/14           78.6    11/14            78.6    48.3       30.3            1.6:1
Pooled over all five models: 128/157 = 81.5%
Mean of the five models: (79.3 + 85.2 + 80.0 + 82.9 + 80.8) / 5 = 408.2 / 5 = 81.64%
Table 4 lists the averaged prediction accuracy values for a range of networks constructed using differing combinations of variables.
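The column definitions of Table 3 are not stated in the text, but the printed values are consistent with the following reading: "Hits/Occurred" is the fraction of true cases found, "Hits/Predicted" is the fraction of calls that were correct, "Chance" is the share of the class in the test set, "Improvement" is Hits/Predicted minus Chance, and "Ratio" is Hits/Predicted divided by Chance. The Python sketch below recomputes one row and the two summary figures under that assumption; the function and variable names are illustrative.

```python
def class_row(hits, occurred, predicted, total):
    """Recompute one Sepsis or Control row of Table 3 (values in percent)."""
    hits_occurred = 100.0 * hits / occurred    # true cases that were found
    hits_predicted = 100.0 * hits / predicted  # calls that were correct
    chance = 100.0 * occurred / total          # class share of the test set
    return {
        "hits_occurred": hits_occurred,
        "hits_predicted": hits_predicted,
        "chance": chance,
        "improvement": hits_predicted - chance,
        "ratio": hits_predicted / chance,
    }


# Model 1, Sepsis row: 9 hits, 13 occurred, 9 predicted, 27 test patients.
row = class_row(hits=9, occurred=13, predicted=9, total=27)

# (correct, tested) for the "Condition" row of Models 1 to 5.
MODELS = [(23, 27), (29, 35), (32, 40), (21, 26), (23, 29)]
pooled = 100.0 * sum(h for h, _ in MODELS) / sum(n for _, n in MODELS)
mean_of_models = sum(100.0 * h / n for h, n in MODELS) / len(MODELS)
```

The pooled figure weights every patient equally, whereas the mean of the models weights every model equally; the two differ slightly because the test sets are of different sizes.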
The most successful model was constructed using cytokine mRNA expression combined with CD31 % expression from the flow cytometry data (average 81.0% accuracy, Table 4, Model 1) with clinical data also scoring highly (80.4%, Table 4, Model 2).
Table 4. The results from the neural network analysis.
Model   Markers                                                                 Prediction Accuracy (%)
1       FasL, MCP-1, TNF-α, IL-1β, IL-6, IL-8, IL-10                            81.6
2       Creatinine, Monocytes, CRP, Lymphocytes, Temperature, Neutrophils,
        White Cell Count                                                        80.4
3       FasL, MCP-1, IL-8, White cell count, Temperature, Creatinine            79.0
4       FasL, MCP-1, TNF-α, IL-1β, IL-6, IL-8 & IL-10, %CD31 & Creatinine       78.7
5       FasL, TNF-α, IL-1β, IL-6, IL-8 & IL-10                                  78.1
6       FasL, MCP-1, TNF-α, IL-1β, IL-6, IL-8, IL-10, Creatinine, Monocytes,
        CRP, Lymphocytes, Temperature, Neutrophils, White Cell Count            76.0
7       MCP-1, TNF-α, IL-1β, IL-6, IL-8, IL-10                                  76.0
8       Platelets, HCO3, PO2, Urea, Creatinine, heart rate                      70.7
To further test our predictive model, we trained the network on up to 100% of the cytokine data obtained from 1 to 4 days prior to the onset of clinical symptoms. We then selected test data comprising "Day 0" sepsis patients and those from Day -5 and Day -6, and also selected 14 control patients from a separate volunteer study, 7 of whom developed symptoms of an Upper Respiratory Tract Infection (URTI) within 9 days of sampling (Dstl/CR08631). The results are shown below in Table 5.
Table 5. Performance of cytokine mRNA model (Days -4 to -1) in prediction of other groups
Test set         Score    % of sample correctly predicted
Day 0 sepsis     8/9      89%
Day -5 sepsis    7/9      78%
Day -6 sepsis    5/6      83%
This table shows that our model, built from patterns expressed by sepsis patients up to 4 days before the onset of clinical sepsis, correctly identified, or "scored", 89% of Day 0 sepsis patients, 78% of Day -5 sepsis patients and 83% of Day -6 sepsis patients. Overall, analysis using neural networks has led to the creation of a number of predictive models for sepsis. Models built using only cytokine data have proved consistently capable of successfully distinguishing individuals who will develop sepsis from those who will not.
Example 2: Lack of false positive results from non-sepsis volunteers using neural network model
Table 6 shows the results of testing a group of volunteers by cytokine RT-PCR, none of whom developed signs of SIRS or sepsis.
Table 6
Name       Hits/Occurred   %       Hits/Predicted   %       Chance    Improvement   Ratio
Total      13/13           100.0   N/A              N/A     50.0%     50.0%         2.0:1
Control    13/13           100.0   13/13            100.0   100.0%    0.0%          1.0:1
Sepsis     0/0             N/A     0/0              0.0     0.0%      0.0%          N/A
Example 3: Neural network sepsis prediction of more than 90% accuracy using Clinical Data.
Neural network model tested using clinical data set defined in Table 4 model 2, using the parameters as described in Table 7 below and further illustrated in Figure 3:
Table 7: Neural network parameters to analyse clinical data
Weights from input to hidden
Input   Unit 1        Unit 2       Unit 3      Unit 4      Unit 5
1       -3.04389      0.783085     -8.14579    -5.31918    -11.699
2       -2.40492      -3.28341     6.81119     -2.22889    10.3357
3       2 0.0835039   -2.25136     15.2575     -3.38677    11.4842
4       -16.2918      3.45666      -5.86258    8.93773     -1.57967
5       12.7828       4 5.2618     6.36791     7.45053     17.1156
6       34.3201       7.31471      -18.392     16.806      10.2057
7       -3.31337      -1.41443     -10.2639    -3.68324    1.4483
Weights from bias to hidden
1       -0.407035     -9.38879     4.06257     -12.6877    -10.7582
Weights from hidden to output
1       -8.79483      8.82026
2       8.79794       -8.38064
3       3.60783       -3.62177
4       3.00073       -3.83778
5       4.85609       -4.85559
Weights from bias to output
1       -2.64015      2.65373
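The layout of Table 7 (seven inputs, five hidden units, two outputs, with a bias weight for each hidden and output unit) corresponds to a single-hidden-layer perceptron. The Python sketch below shows a forward pass through a network of that shape. It is illustrative only: the logistic activation function is an assumption, since the activation used by the software is not stated, and the weights in the usage note are placeholders rather than the values of Table 7.

```python
import math

N_INPUTS, N_HIDDEN, N_OUTPUTS = 7, 5, 2  # shape taken from Table 7


def sigmoid(x):
    """Logistic activation (assumed), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(inputs, w_input_hidden, b_hidden, w_hidden_output, b_output):
    """Forward pass: inputs -> hidden layer -> output layer.

    w_input_hidden[i][j] is the weight from input i to hidden unit j;
    w_hidden_output[j][k] is the weight from hidden unit j to output k.
    """
    hidden = [
        sigmoid(b_hidden[j]
                + sum(inputs[i] * w_input_hidden[i][j]
                      for i in range(len(inputs))))
        for j in range(len(b_hidden))
    ]
    return [
        sigmoid(b_output[k]
                + sum(hidden[j] * w_hidden_output[j][k]
                      for j in range(len(hidden))))
        for k in range(len(b_output))
    ]
```

With all weights set to zero, for example, every unit outputs 0.5; a trained network would instead be loaded with one weight per cell of the table, and the larger of the two outputs taken as the predicted class.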
Table 8
Name         Hits/Occurred   %       Hits/Predicted   %       Chance    Improvement   Ratio
Condition    14/15           93.3    N/A              N/A     50.0%     43.3%         1.9:1
Sepsis       8/8             100.0   8/9              88.9    53.3%     35.6%         1.7:1
Control      6/7             85.7    6/6              100.0   46.7%     53.3%         2.1:1
Example 4: Use of Artificial Immune Recognition System
Representation
The initial AIRS system (A. Watkins. An Artificial Immune Recognition System. Mississippi State University: MSc Thesis, 2001) employed simple real-value shape space. Recently, other people have extended the representation to Hamming shape space (J. Hanamaker and L. Boggess. The effect of distance metrics on AIRS. In Proc. of Congress on Evolutionary Computation (CEC). IEEE, 2004) and natural language (D. Goodman, L. Boggess and A. Watkins. "An investigation into the source of power for AIRS, an artificial immune classification system". In Proc. Int Joint Conference on Neural Networks, pp 1678-1683. IEEE, 2003). AIRS maintains a set of Artificial Recognition Balls (ARBs) that contain a vector of the data being learnt, a stimulation level and a number of resources. During training, the stimulation level is calculated by assessing the affinity of the data vector in the ARB against a training item: the stronger the match, the greater the stimulation. This stimulation level is used to dictate how many clones the ARB will produce, and affects survival of the ARB.
Affinity Measure
This is dependent on the representation employed. A number of affinity measures for use in AIRS have been proposed, including Hamming distance, Euclidean distance and so on. In this study, both Euclidean and Hamming distance metrics were used, with Euclidean giving the best results.
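The two distance metrics named above, and one way of turning a distance into a stimulation level, may be sketched as follows. The conversion used here (one minus the Euclidean distance divided by its maximum possible value) assumes that every feature has been scaled to the range 0 to 1; that scaling, and the function names, are assumptions made for the example.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two real-valued vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def stimulation(arb_vector, training_item):
    """Stimulation in [0, 1]: 1 for a perfect match, 0 for the worst match.

    Assumes each feature lies in [0, 1], so the largest possible
    Euclidean distance is sqrt(number of features).
    """
    max_distance = math.sqrt(len(arb_vector))
    return 1.0 - euclidean_distance(arb_vector, training_item) / max_distance
```

Under this convention a smaller distance gives a higher stimulation, matching the statement that the stronger the match, the greater the stimulation.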
Immune Algorithm
Essentially, AIRS evolves with two populations, a memory pool M and an ARB pool C. It has a separate training and test phase, with the test phase being akin to a k-nearest neighbour classifier. During the training phase, a training data item is presented to M. This set can be seeded randomly, and experimental evidence would suggest that AIRS is insensitive to the initial starting point. The training item is matched against all memory cells in the set M, and a single cell is identified as the highest match, MCmatch. This MCmatch is then cloned and mutated. Cloning is performed in proportion to stimulation (the higher the stimulation, the higher the clonal rate), and mutation is inversely proportional (the higher the stimulation, the lower the mutation rate). These clones are inserted into the ARB pool, C. The training item is then presented to the members of the ARB pool, where an iterative procedure is adopted which allows for the cloning and mutation of new candidate memory cells. Through a process of population control, where survival is dictated by the number of resources an ARB can claim, a new candidate memory cell is created. This mechanism is based on the resource allocation algorithm proposed in J. Timmis and M. Neal. A Resource Limited Artificial Immune System. Knowledge Based Systems, 14(3/4): 121-130, 2001. This new candidate is compared against the MCmatch, with the training item. If the affinity between the candidate cell and MCmatch is higher, then the memory cell is replaced with the candidate cell.
This process is performed for each training item, whereupon the memory set will contain a number of cells capable of being used for classification. Classification of an unseen data item is performed in a k-nearest neighbour fashion.
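Two elements of the procedure above lend themselves to a short illustration: the dependence of cloning and mutation on stimulation, and the k-nearest neighbour classification of an unseen item against the memory set. The Python sketch below is a simplification and not the AIRS implementation used in this study; the linear forms chosen for `clone_count` and `mutation_rate`, and the default of k = 3, are assumptions.

```python
import math
from collections import Counter


def clone_count(stimulation, clonal_rate=10):
    """More stimulation gives more clones (linear form assumed)."""
    return round(stimulation * clonal_rate)


def mutation_rate(stimulation):
    """More stimulation gives less mutation (linear form assumed)."""
    return 1.0 - stimulation


def classify(item, memory_cells, k=3):
    """Majority vote among the k memory cells nearest to the item.

    memory_cells is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(memory_cells,
                     key=lambda cell: math.dist(item, cell[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Once training has produced the memory set, only `classify` is needed at test time, which is why the test phase is described as akin to a k-nearest neighbour classifier.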
Experimental Setup
An attempt was made to use an experimental procedure that was comparable to the application of neural networks to this data set. For all studies, the markers FasL, MCP-1, TNF-α, IL-1β, IL-6, IL-8 and IL-10 were used. However, it was not possible to reproduce the data set exactly, due to incomplete information regarding the pre-processing of the data during the neural network study.
Experiment One
In the first set of experiments, data collected from patients on days 1 through 4 prior to the onset of sepsis, along with data from a control set of patients, were used as training data. Specifically, the combined data from days 1, 2, 3 and 4 for patients who showed signs of sepsis, and a random collection of control patients, were used in order to train AIRS. In total, 59 training data items were used. To test AIRS, a random collection of patients from the control group and combined data from all days (excluding data that had been used in the training process) was used. In total, 34 test data points were used. The settings for AIRS are shown in Table 9:
Table 9: Parameter Settings for AIRS
Parameter                    Setting
Epochs
Clonal Rate                  10
Mutation Rate                0.8
Initial size of population   5
Affinity Threshold           0.2
Stimulation Threshold        0.8
Number of resources          200
Experiment Two
For our second set of experiments, patients who showed signs of sepsis were classified using data for days 0, 5 and 6, together with control patients. The AIRS system was trained using the same data as for Experiment One, whilst making use of the same parameters.
Results
The results are not directly comparable with the results obtained from the neural network analysis due to the fact that it was difficult to ascertain from the original report exactly how the data had been first combined over a period of days, and then divided into training and test sets. Therefore, the results obtained should be considered with this in mind.
Experiment One
Ten independent runs of the AIRS algorithm were performed, then the average and standard deviation calculated. It was found that AIRS was capable of achieving on average 73(2.96)% classification accuracy. This is approximately 10% lower than the neural network analysis (using the same markers). However, care has to be taken with a direct comparison.
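Figures of the form "73(2.96)%" report the mean accuracy over the independent runs with the standard deviation in parentheses. A minimal Python sketch of that summary is given below; the run values in the example are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics


def summarise_runs(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical accuracies (%) from four runs, for illustration only.
mean_acc, sd_acc = summarise_runs([70, 72, 74, 76])
summary = f"{mean_acc:.0f}({sd_acc:.2f})%"
```

A small standard deviation relative to the mean indicates that the result does not depend strongly on the random seeding of any single run.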
Experiment Two
Again, ten independent runs of the AIRS algorithm were undertaken, and the average and standard deviation were taken. This time, data for days 0, 5 and 6 before the onset of sepsis, and for the control group, were analysed. Again, AIRS was trained on data taken from days 1 through 4 and the control group. These results are presented in Table 10.
Table 10: Prediction in Other Groups (standard deviation in parentheses)
Test Set        Accuracy
Day 0 sepsis    83 (7.6) %
Day 5 sepsis    93 (6.1) %
Day 6 sepsis    91 (8.2) %
Control         70 (12.8) %
As can be seen from Table 10, AIRS identifies a high percentage of sepsis cases (outperforming the neural network on day 5 and day 6, but again with the caveat about comparability).
The control group did not fare as well: the result was lower than expected and significantly lower than that of the neural network approach. This may be because AIRS was biased towards the sepsis patients, as more training data were available for them than for non-sepsis patients.
Conclusions
AIRS appears capable of identifying potential cases of sepsis in advance, and comparable at a certain level to neural network approaches.
Example 5: Use of CD31 expression to predict sepsis
Study design and patients
See Example 1.
Flow Cytometry
Blood was collected into sodium heparin containers (HM&S, Chessington, Surrey) and transported to the laboratory at room temperature. 100 µl aliquots of blood were mixed with immunofluorescent stains using the volumes recommended by the manufacturer (Beckman Coulter Limited, High Wycombe, Buckinghamshire, and Becton Dickinson UK Limited, Cowley, Oxford). T helper cells were identified by co-staining for CD3 and CD4, and T cytotoxic cells were identified by staining for both CD3 and CD8. These cell populations were stained for HLA-DR, CD25, CD54 and CD69. B cells were identified by staining for CD19 and were interrogated with CD80, CD86, CD25 and CD54. Natural killer cells were distinguished by staining with CD56 and interrogated with CD11b, CD25, CD54 and CD69.
The monocyte population was selected by staining for CD14; these cells were probed with CD11b, CD54, CD80, CD86 and HLA-DR stains. Gating was used in order to identify the granulocyte population, which was stained for CD11b, CD69, CD31, CD54 and CD62L. The stains were incubated at room temperature for 20 minutes. 500 µl of Optilyse C (Beckman Coulter Limited) was added to each tube and vortex mixed immediately. The samples were incubated at room temperature for 10 minutes to lyse the red blood cells, and 500 µl of Isoton (Beckman Coulter Limited) was then added in order to fix the stains. The tubes were vortex mixed immediately and incubated at room temperature for 10 minutes. The cells were then counted on a Beckman Coulter Epics XL System 2 Flow Cytometer.
Statistics
Data were analysed using a binary logistic regression model in the SPSS software package, version 11.0. This analysis compared the control group means for immune modulator expression with the means obtained from the sepsis patients at seven time points: 6, 5, 4, 3, 2 and 1 days before diagnosis, and day 0, the day of diagnosis of sepsis. Where data points were missing, averaged values for the group were substituted in order to maintain acceptable n values. Results from the model were only reported if the substituted data points did not involve markers that were highlighted by the model as possible predictors.
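The mean-substitution step described above can be sketched as follows. This is an illustrative sketch only: the function name, the use of None to mark a missing value and the readings are assumptions made for the example, not details taken from the study. The indices of substituted points are returned so that a later check can establish whether a marker highlighted as a predictor relied on substituted data.

```python
from statistics import mean

def substitute_group_mean(values):
    """Replace None entries with the mean of the observed values in the group.

    Returns the filled list and the indices that were substituted.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    group_mean = mean(observed)
    filled = [group_mean if v is None else v for v in values]
    substituted = [i for i, v in enumerate(values) if v is None]
    return filled, substituted

# Hypothetical marker readings for one group; None marks a missing data point.
readings = [12.0, None, 8.0, 10.0]
filled, substituted = substitute_group_mean(readings)
print(filled, substituted)
```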
Results
Analysis of the data found weak evidence of a predictor effect (p = 0.114).
Decreased expression of CD31 was indicated to be a possible predictor of sepsis three days before diagnosis (p = 0.037, n = 6). The results obtained for 6 days before diagnosis were inconclusive because of the small sample size for this date (n = 4). No statistically significant predictors were found for 5, 4, 2 or 1 day before diagnosis, or for the day of diagnosis.
Conclusions
The flow cytometry data obtained from patients prior to the development of sepsis, and from patients who did not develop this disease, were collated. Groups were constructed using results from patients in the days before diagnosis of sepsis, with a control group consisting of measurements taken from age-matched patients who did not develop sepsis. Bar graphs displaying the medians and the 90th and 10th percentiles were difficult to interpret because of the spread of the data, and hence statistical analysis was performed.
When the raw data are plotted (see Figure 2), it can be seen that 4 out of 6 (66.6 %) of the sepsis patients had CD31 expression that was lower than that of the control group. The control group (non-sepsis) data points are distributed between 11.8 % and 100 %, while four of the six data points in the three days before diagnosis measurements were less than 9 %. It is therefore possible that CD31 may be used to predict the onset of sepsis three days prior to the appearance of clinical signs and symptoms. This suggests that CD31 could be a useful predictive marker, particularly in combination with other informative sepsis biomarkers.
Example 6: Multivariate statistical analysis to predict sepsis
Introduction
Multivariate data analysis procedures were applied to data collected from patients 1-6 days prior to development of symptoms of sepsis. Measurements included flow cytometry, PCR and classical clinical observations. Principal component analysis (PCA) was applied to the data matrix, considering each of the three classes of observations individually and combined as a complete data set. Discriminant Function Analysis (DFA) was used to determine whether groups differ with regard to the mean of a variable, and then to use that variable to predict group membership (e.g., of new cases). This was performed on the results from PCA and on the three classes of observations individually and combined as a complete data set.
Data description, manipulation and multivariate techniques
Prior to PCA, the data were summarised by producing probability density functions. As normality of distribution is required prior to PCA and DFA, non-normal data were transformed using the Johnson transformation algorithm.
PCA is a dimensionality reducing technique which endeavours to decompose a multivariate data matrix into a few latent variables, composed of linear combinations of the variables, which explain the bulk of the variance of the original matrix. In this way correlations (positive or negative) of parameters within the data set can be established.
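The decomposition described above can be illustrated with a small sketch that standardises the variables, forms their correlation matrix and extracts the first principal component by power iteration. This is not the software used in the study; the function names and the data are hypothetical, and power iteration is simply one convenient way of obtaining the leading eigenvalue without external libraries.

```python
import math
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length variable columns."""
    standardised = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        standardised.append([(x - m) / s for x in col])
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardised]
            for zi in standardised]

def first_principal_component(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(matrix)
    # Non-uniform start so the vector is unlikely to be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

# Hypothetical measurements of three variables on six patients (not study data).
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],
]
corr = correlation_matrix(columns)
eigenvalue, direction = first_principal_component(corr)
# For a correlation matrix the total variance equals the number of variables.
share = 100.0 * eigenvalue / len(columns)
print(f"PC1 eigenvalue {eigenvalue:.2f}, explains {share:.1f}% of the variance")
```

In this example the first two variables are almost perfectly correlated, so a single latent variable carries most of their joint variance, which is the sense in which PCA reduces dimensionality.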
Essentially, DFA is similar in approach to analysis of variance (ANOVA). The DFA problem can be rephrased as a one-way ANOVA problem. Specifically, one can ask whether two groups are significantly different from each other with respect to the mean of a particular variable. If the means for a variable are significantly different in different groups, then it may be concluded that this variable discriminates between the groups.
In the case of a single variable, the final significance test of whether or not a variable discriminates between groups is the F-test. F is essentially computed as the ratio of the between-groups variance in the data over the pooled (average) within-group variance. If the between-group variance is significantly larger, then there must be significant differences between means.
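The F ratio described above can be written out for the single-variable case. The sketch below computes the standard one-way statistic (between-group mean square divided by pooled within-group mean square); the function name and the two groups are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def f_ratio(groups):
    """One-way F statistic for a single variable measured in several groups."""
    k = len(groups)
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Pooled within-group sum of squares: observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (len(values) - k)
    return ms_between / ms_within

# Hypothetical values of one variable in two groups (not study data).
control = [1.0, 2.0, 3.0]
sepsis = [4.0, 5.0, 6.0]
f_value = f_ratio([control, sepsis])
print(f"F = {f_value:.1f}")
```

A large F indicates that the group means differ by more than the scatter within groups would explain, which is the criterion used to decide whether the variable discriminates between the groups.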
When considering multiple variables, it is possible to establish which of several variables contribute to the discrimination between groups. This results in a matrix of total variances and covariances and, likewise, a matrix of pooled within-group variances and covariances.
These matrices are then compared via multivariate F-tests in order to determine whether or not there are any significant differences (with regard to all variables) between groups. This procedure is identical to multivariate analysis of variance or MANOVA. As in MANOVA, the multivariate test is performed, and, if statistically significant, which of the variables have significantly different means across the groups is examined. Thus, even though the computations with multiple variables are more complex, the principal reasoning still applies, namely, that variables that discriminate between groups are sought, as evident in observed mean differences.
DFA was performed on clinical, flow cytometry and RT-PCR data using the complete data matrix (including substituted mean values) and by exclusion of data points for which one or more parameters contained substituted mean values. Analogous models were developed to allow analysis of PCA scores from models developed in Model 1. The purpose of the latter was to establish if transformed data matrices (PCA) could be used to classify observations.
Model 1. Principal Component Analysis (PCA) of observation data
i) PCA model based on clinical data
The number of PCs derived and used in a given model is usually defined as those having an Eigenvalue of >1. Six PCs meet this criterion for the clinical data and explain a total of 74.3% of the variance of the data set. Since each of the PCs is orthogonal (uncorrelated) with respect to the other PCs, the association of a clinical parameter with a particular PC defines the PC and illustrates how the parameter influences the variance of the data set.
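The Eigenvalue >1 criterion can be checked against the eigenvalues listed in Table 11. The sketch below is illustrative; the function name is an assumption, and the total variance is taken as the number of variables (15), as is the case for a correlation matrix. Because the tabulated eigenvalues are rounded to two decimals, the computed share differs slightly from the 74.3% quoted in the text.

```python
# Eigenvalues as listed in Table 11 (clinical observations, 15 variables).
eigenvalues = [3.73, 2.50, 1.53, 1.22, 1.10, 1.04, 0.87, 0.77,
               0.66, 0.54, 0.44, 0.29, 0.21, 0.02, 0.01]

def retained_components(values, threshold=1.0):
    """Count PCs with eigenvalue above the threshold and their share of variance."""
    kept = [v for v in values if v > threshold]
    # For a correlation matrix the total variance equals the number of variables.
    total_variance = len(values)
    explained = 100.0 * sum(kept) / total_variance
    return len(kept), explained

n_kept, explained = retained_components(eigenvalues)
print(f"{n_kept} PCs retained, explaining about {explained:.1f}% of the variance")
```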
Table 12 summarises the loadings of each parameter with the six derived PCs from the clinical data. Loading values of >0.5 indicate a strong contribution of a particular parameter to a given PC. The PCs derived from the data set may be interpreted as follows:
PC1: this is dominated by the strong correlation of WCC, monocytes, neutrophils and platelets. A strong correlation exists between creatinine and lactate. These two groups lie at opposite ends of the PC1 scale and are therefore negatively correlated. BXS and HCO3 are highly correlated and contribute to PC1 and PC2 equally. The latter parameters are contrasted by creatinine and lactate in PC1. These correlations are summarised in Table 2 and Figure 2.
PC2: shows a negative correlation between the group composed of WCC, monocytes, neutrophils and platelets and the group composed of BXS and HCO3.
PC3: this PC is characterised by the strong relationship between temp, HR and CRP, as shown in Figure 3.
PC4: although many parameters approach significance for this PC, only CRP is definitively associated with it, as demonstrated by Figure 4.
PC5: PO2 is contrasted with both urea and MAP in this component.
PC6: this PC exclusively explains the variance introduced into the data set by lymphocytes.
In interpreting the PC loadings, correlated clinical parameters suggest that levels of these species/physical parameters will be elevated or decreased in patients characterised as belonging to PC1, PC2, etc. This will be performed in the discriminant analysis section.
ii) PCA model based on flow cytometry data
The parameters were abbreviated for clarity and the abbreviations are listed in Table 13. Nine PCs account for 80.5% of the variance of the data set, as shown in the eigenvalue matrix in Table 14, and the associated loadings are summarised in Table 15. The correlations between the measured parameters are shown in Table 16, with strong correlations within the derived PCs between the following:
fc 1 and fc 3
fc 5 and fc 6
fc 7 and fc 8
fc 9 with fc 10 and fc 11
fc 12 and fc 13
fc 11 and fc 14
fc 17 with fc 20 and fc 23
fc 21 with fc 17, 20 and 22
fc 23 with fc 20 and fc 22
fc 28 and fc 29
The PC structure may be interpreted as follows:
PC1: correlates CD3 CD4 CD25 in CD3 CD4, CD3 CD4 HLA-DR in CD3 CD4, CD3 CD8 CD25 in CD3 CD8, CD19 CD80 in CD19, CD19 CD86 in CD19, CD14 CD80 in CD14, CD14 CD86 in CD14, CD19 CD54 in CD19, CD19 CD25 in CD19 and CD56 CD54 in CD56.
PC2: contrasts CD19 CD86 in CD19 with CD14 HLA-DR in CD14, CD14 HLA-DR CD11B in CD14 and CD14 HLA-DR CD11B CD54 in CD14.
PC3: CD3 CD4 CD54 in CD3 CD4, CD3 CD4 CD69 in CD3 CD4, CD3 CD8 CD54 in CD3 CD8, CD3 CD8 CD69 in CD3 CD8
PC4: CD56 CD69 in CD56, CD69 (%), CD11B CD69 (%)
PC5: CD3 CD8 CD54 in CD3 CD8, CD62L (%)
PC6: CD31 (%)
PC7: no significant components with EV>0.5
PC8: CD54 (%)
PC9: CD14 CD11B in CD14
By considering only one parameter of a pair or group, it would be possible to remove 9 parameters, thus increasing the y component of the data matrix. However, it was decided that only CD31 (%), CD54 (%), CD62L (%), CD11B (%), CD69 (%) and CD11B CD69 (%) be subjected to statistical analysis (fc 24-fc 29).
The eigenvalue matrix of the selected flow cytometry variables is shown in Table 17 and the loadings of the PCA model constructed are summarised in Table 18. The use of 3 PCs explains 76.6% of the variance of the data set. The PC model shows the following:
PC1: correlates CD69 (%) and CD11B CD69 (%)
PC2: correlates CD31 (%), CD62L (%) and CD11B (%)
PC3: is composed of the variance associated with CD54 (%)
iii) PCA model based on RT-PCR data
Table 19 indicates that 72.9% of the variance of the RT-PCR data is explained by only 3 PCs. The loadings for this model are shown in Table 20. The correlations of variables with each PC are shown in Figures 23 and 24 and reveal the following:
PC1: correlates Fas-L, MCP-1, TNF-alpha, IL-6 and IL-8
PC2: correlates IL-1 and IL-10
PC3: contrasts IL-1 and IL-10
The correlation of IL-1 and IL-10 in PC2 and the subsequent contrast of these variables in PC3 is interesting. It appears that in some patients these variables may be highly correlated or contrasted, possibly providing a powerful means of discriminating patients.
iv) PCA model based on combined clinical, flow cytometry and RT-PCR data
Table 21 summarises the parameters included in this final model and the associated Eigenvalues for the correlation matrix. Table 21 indicates that 9 PCs have an Eigenvalue greater than 1, which together explain 68.7% of the data variance. The loadings of the model are shown in Table 22. Analysis allows the following interpretation of the PCA model:
PC1: shows positive correlation between WCC, Neutrophils, Monocytes, APTR, HCO3-, BXS, Platelets, CD69 (%) and CD11B CD69 (%). This PC also contrasts the above with Lactate and Creatinine, which are correlated.
PC2: correlates CD69 (%), CD11B CD69 (%), WCC, Neutrophils, Monocytes and INR. These parameters are contrasted with TNF-alpha.
PC3: strongly correlates the PCR parameters Fas-L, MCP-1, TNF-alpha, IL-6 and IL-8.
PC4: contrasts CRP and IL-10.
PC5: correlates the flow cytometry parameters CD31 (%), CD54 (%), CD62L (%) and CD69 (%).
PC6: correlates CD62L (%) and HR.
PC7: is associated with Temp.
PC8: is associated with IL-1.
PC9: is associated with PO2.
Model 2: Discriminant Function Analysis (DFA) based on observations and PCA score data
The terminology common to all the DFA models developed is explained below and numerical values are shown in Table 23.
* Model: The object of the analysis is to build a "model" of how best to predict to which group a case belongs. In the following discussion the term "in the model" will be used to refer to variables that are included in the prediction of group membership, and "not in the model" to those that are not included.
* Forward stepwise analysis: In stepwise discriminant function analysis, a model of discrimination is constructed step-by-step. Specifically, at each step all variables are reviewed and evaluated to establish which one will contribute most to the discrimination between groups. That variable will then be included in the model.
* Backward stepwise analysis: It is possible to step backwards; in that case the programme first includes all variables in the model and then, at each step, eliminates the variable that contributes least to the prediction of group membership. Thus, as the result of a successful discriminant function analysis, one would only keep the "important" variables in the model, that is, those variables that contribute the most to the discrimination between groups.
* F to enter, F to remove: The stepwise procedure is "guided" by the respective F to enter and F to remove values. The F value for a variable indicates its statistical significance in the discrimination between groups, that is, it is a measure of the extent to which a variable makes a unique contribution to the prediction of group membership. In general, the programme continues to choose variables to be included in the model as long as the respective F values for those variables are larger than the user-specified F to enter, and excludes (removes) variables from the model if their significance is less than the user-specified F to remove.
* Tolerance: The tolerance value of a variable is computed as 1 - R² of the respective variable with all other variables in the model. Thus, the tolerance is a measure of the respective variable's redundancy. For example, a tolerance value of 0.10 means that the variable is 90% redundant with the other variables in the model.
* Wilks' λ: This parameter gives a measure of the discriminatory power of the model and can assume values in the range of 0 (perfect discrimination) to 1 (no discrimination).
* Partial λ: This is the Wilks' λ associated with the unique contribution (measured orthogonally) of the respective variable to the discriminatory power of the model.
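Two of the quantities defined in the list above can be illustrated numerically. The sketch below computes Wilks' λ for the simplest case of a single variable, where it reduces to the within-group sum of squares divided by the total sum of squares, and the tolerance as 1 - R². The function names and data are hypothetical; the multivariate statistic used in the analysis is the ratio of determinants of the within-group and total covariance matrices, which this single-variable sketch does not implement.

```python
from statistics import mean

def wilks_lambda_univariate(groups):
    """Wilks' lambda for one variable: within-group SS divided by total SS."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_within / ss_total

def tolerance(r_squared):
    """Tolerance of a variable given its R-squared with the other model variables."""
    return 1.0 - r_squared

# Hypothetical values of one variable in two groups (not study data).
lam = wilks_lambda_univariate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(f"Wilks' lambda = {lam:.3f}")

# A variable with R-squared of 0.90 against the others is 90% redundant.
print(f"tolerance = {tolerance(0.90):.2f}")
```

Values of λ close to 0 correspond to well separated groups, and values close to 1 to groups whose means coincide.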
As a point of note, a common misinterpretation of the results of stepwise discriminant analysis is to take statistical significance levels at face value. When the programme decides which variable to include or exclude in the next step of the analysis, it actually computes the significance of the contribution of each variable under consideration. Therefore, by nature, the stepwise procedures will capitalize on chance because they "pick and choose" the variables to be included in the model so as to yield maximum discrimination. Thus, when using the stepwise approach, it must be borne in mind that the significance levels do not reflect the true alpha error rate, that is, the probability of erroneously rejecting H0 (the null hypothesis that there is no discrimination between groups).
* Canonical Correlation Analysis (CCA): This is an additional procedure for assessing the relationship between variables. Specifically, this manipulation allows the elucidation of the relationship between two sets of variables.
Parameters which characterise this analysis are detailed below.
* Significance of roots (χ² test): The term root is used to describe the individual discriminant functions (DFs). The statistical significance of the derived DFs is tested by the χ² test of successive DFs. A report of the step-down test of all canonical roots is obtained: the first line contains the significance of all DFs, the second line reports the significance of the remaining roots after removing the first root, and so on. Thus the number of DFs to interpret is obtained.
* Discriminant function coefficients: Two outputs are produced, one for the Raw Coefficients and one for the Standardized Coefficients. Raw here means that the coefficients can be used in conjunction with the observed data to compute (raw) discriminant function scores. The standardized coefficients are the ones that are customarily used for interpretation, because they pertain to the standardized variables and therefore refer to comparable scales.
* Eigenvalues: An Eigenvalue for each DF and the cumulative proportion of explained variance accounted for by each function is obtained. This value is defined in an identical way in PCA and DFA. The larger the value, the greater the amount of variance explained by that DF.
* Factor structure coefficients: These coefficients represent the correlations between the variables and the DFs and are commonly used in order to interpret the "meaning" of discriminant functions. In an analogous way to PCA, the interpretation of factors should be based on the factor structure coefficients.
* Means of canonical variables: When knowledge of how the variables participate in the discrimination between different groups is obtained, the next logical step is to determine the nature of the discrimination for each DF. The first step to answer this question is to look at the canonical means. The larger the canonical mean for a given DF and group of observations, the greater the discriminatory power of that DF.
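The use of raw coefficients to compute discriminant function scores, and of group averages of those scores as canonical means, can be sketched as follows. The coefficients, constant, group labels and observations are all hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from any of the models described here.

```python
from statistics import mean

def discriminant_score(observation, raw_coefficients, constant=0.0):
    """Raw discriminant function score: constant plus weighted sum of variables."""
    return constant + sum(c * x for c, x in zip(raw_coefficients, observation))

def canonical_means(groups, raw_coefficients, constant=0.0):
    """Mean discriminant score of each group for one discriminant function."""
    return {
        label: mean([discriminant_score(obs, raw_coefficients, constant)
                     for obs in observations])
        for label, observations in groups.items()
    }

# Hypothetical raw coefficients for one DF over two variables (illustrative only).
coefficients = [0.5, -1.0]
constant = 1.0
groups = {
    "control": [[2.0, 1.0], [4.0, 1.0]],
    "day 1": [[2.0, 3.0], [4.0, 3.0]],
}
means = canonical_means(groups, coefficients, constant)
print(means)
```

Groups whose canonical means lie far apart on a given DF are the groups that DF separates.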
i) DFA model based on PCA scores of clinical data
a) Containing substituted mean values
Table 19 summarises the results of this analysis. The Wilks' λ value of 0.4 indicates a relatively inefficient classification model. The three derived DFs account for a total of 89.9% of the variance of the data set and the DFs are composed mainly of PCs 1, 3 and 4. The factor structure coefficients indicate that:
* DF1 is composed of the variance explained by PC1 and, to a lesser extent, PC4
* DF2 is exclusively composed of the variance explained by PC3
* DF3 is composed of the variance explained by PC4
These correlations are confirmed by the standardised coefficients. The means of canonical variables indicate that:
* DF1 negatively correlates days 1, 2 and 3 with days 5 and 6
* DF2 defines control group observations
* DF3 defines observations for day 2
A summary of the classification of this model and its discriminative nature in relation to the PCs is shown in Table 25. The classification matrix for the model is shown in Table 26. Table 26 suggests a good classification can be obtained for control and 6 day data, with 80 and 83 % respectively of observations being classified correctly. However, the overall classification power of the model is poor, with only 48 % of all observations being correctly classified.
b) Excluding substituted mean values
Table 24 summarises the results of this analysis. The Wilks' λ value of 0.45 indicates a relatively inefficient classification model. The three derived DFs account for a total of 95 % of the variance of the data set and the DFs are composed mainly of PCs 1, 3 and 5. The factor structure coefficients indicate that:
* DF1 is composed of the variance explained by the negative correlation between PC1 and PC5
* DF2 is composed of the variance explained by PC3
* DF3 is composed of the variance explained by PC3, but to a lesser degree than DF2
These correlations are confirmed by the standardised coefficients. The means of canonical variables indicate that:
* DF1 negatively correlates days 1, 2 and 3 with days 5 and 6
* DF2 negatively correlates days 1 and 6 with the control group
* DF3 defines observations for day 6
A summary of the classification of this model and its discriminative nature in relation to the PCs is shown in Table 25. The classification matrix for the model is shown in Table 26. Table 26 suggests a good classification can be obtained for control and 6 day data, with 83 and 67 % respectively of observations being classified correctly. However, the overall classification power of the model is poor, with only 44 % of all observations being correctly classified, less than that obtained using mean substituted variables.
The similar prediction efficiency with and without mean substituted values validated the subsequent approach to perform DFA with the inclusion of these values.
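The per-group and overall classification percentages quoted from the classification matrices can be derived as in the sketch below. The matrix shown is hypothetical and is not Table 26; the function name and the row/column convention (rows are true groups, columns are assigned groups) are assumptions made for the illustration.

```python
def classification_rates(matrix, labels):
    """Percent correctly classified per group and overall.

    matrix[i][j] is the number of observations of true group i assigned to group j.
    """
    per_group = {}
    for i, label in enumerate(labels):
        per_group[label] = 100.0 * matrix[i][i] / sum(matrix[i])
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    return per_group, 100.0 * correct / total

# Hypothetical classification matrix for three groups (not the study's data).
labels = ["control", "day 1", "day 6"]
matrix = [
    [8, 1, 1],
    [3, 4, 3],
    [1, 0, 5],
]
per_group, overall = classification_rates(matrix, labels)
print(per_group, round(overall, 1))
```

As the example shows, some groups can be classified well while the overall rate remains modest, which is the pattern reported for the PCA score models.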
ii) DFA model based on transformed values of clinical data
In an effort to improve the classification of observations, the transformed variable values from the original data matrix were subjected to DFA. The reasoning was that, since PCA is a dimensionality reducing technique, some data quality may be lost, so that performing DFA on PCA scores leads to a model with less predictive power.
Table 27 summarises the results of this analysis. The Wilks' λ value of 0.22 is an improvement on the PCA scores classification models. The five derived DFs account for a total of 99 % of the variance of the data set and the DFs are composed of BXS, CRP, lactate, urea, temperature, creatinine, neutrophils, PO2 and HCO3, with the other clinical variables having no influence on the classification of observations. The factor structure coefficients indicate that:
* DF1 classifies the correlation between BXS and HCO3, which are negatively correlated with lactate
* DF2 classifies observations showing a high degree of correlation between BXS, CRP and HCO3
* DF3 classifies samples with a negative correlation between temperature and creatinine
* DF4 classifies samples with a negative correlation between temperature and PO2
* DF5 classifies samples with a negative correlation between urea and neutrophils
A summary of the classification of this model and its discriminative nature in relation to the clinical variables is shown in Table 31. The classification matrix for the model is shown in Table 32. Table 32 suggests a good classification can be obtained for control and 6 day data, with 80 and 83 % respectively of observations being classified correctly. Days 1, 2 and 5 are greatly improved compared to the PCA scores models, but the overall classification power of the model is poor, with 55 % of all observations being correctly classified.
iii) DFA model based on PCA scores of flow cytometry
Table 33 summarises the results of this analysis. The Wilks' λ value of 0.39 indicates a relatively inefficient classification model. The two derived DFs account for a total of 71% of the variance of the data set and the DFs are composed mainly of PCs 1, 5 and 8. The factor structure coefficients indicate that:
* DF1 is composed of the variance explained by PC1 and PC8
* DF2 is exclusively composed of the variance explained by PC5
These correlations are confirmed by the standardised coefficients. The means of canonical variables indicate that:
* DF1 negatively correlates day 1 with days 5 and 6
* DF2 defines day 3 observations
A summary of the classification of this model and its discriminative nature in relation to the PCs is shown in Table 34. The classification matrix for the model is shown in Table 35. Table 35 suggests a reasonable classification can be obtained for control and 6 day data, with 66% of observations being classified correctly in both groups. However, the overall classification power of the model is poor, with only 44 % of all observations being correctly classified.
iv) DFA model based on flow cytometry data
A summary of the classification of this model and its discriminative nature in relation to the variables is shown in Table 36. The Wilks' λ value of 0.034 indicates an excellent classification model. The three derived DFs account for a total of 74% of the variance of the data set and the DFs are composed mainly of fc7-8, fc11, fc16, fc25, fc28 and fc29. The factor structure coefficients indicate that:
* DF1 correlates fc7, 16, 28 and 29 and contrasts these with fc8 and 25
* DF2 correlates fc7, 8, 16 and 25 and contrasts these with fc12
* DF3 is correlated with fc11
Table 37 summarises this information.
These correlations are confirmed by the standardised coefficients. The means of canonical variables indicate that:
* DF1 contrasts day 1 with days 5 and 6
* DF2 contrasts the control group with days 3, 4 and 5 and correlates the control with day 6
* DF3 contrasts days 2 and 3 with day 4
Table 38 suggests a good classification can be obtained for all groups. The overall classification power of the model is impressive, with 76.6% of all observations being correctly classified.
The DFA models for RT-PCR were so poor for both PCA scores and transformed data, with Wilks' λ values >0.8, that they were discarded and will not be considered further.
v) DFA model based on combined clinical, flow cytometry and RT-PCR data
A summary of the classification of this model and its discriminative nature in relation to the variables is shown in Table 39. The Wilks' λ value of 0.0087 indicates an excellent classification model. The four derived DFs account for a total of 89.2% of the variance of the data set and the DF factor structure indicates that BXS, fc 25, fc 22, fc 11, Temp, CRP, fc 18, fc 6, IL-6, INR, APTR, fc 16, Urea, Lactate, Fas-L, fc 13, fc 24, fc 1, fc 3, MCP-1, fc 28, IL-10, fc 27, fc 26, Neutrophils, fc 14, WCC, fc 29, Platelets and PO2 are included in the model. All other parameters fail to meet the stepwise criteria and hence were eliminated from the model.
The means of canonical variables indicate that:
* DF1 contrasts day 1 with days 5 and 6
* DF2 correlates the control group with day 6 and contrasts these with days 3, 4 and 5
* DF3 correlates the control group with day 5 and contrasts these with days 2, 3 and 6
* DF4 contrasts days 4 and 5
Table 40 shows an excellent classification can be obtained for all groups, with a minimum correct assignment rate of 76%. The overall classification power of the model is impressive, with 86.9% of all observations being correctly classified.
When each DF is applied to the data using this model, the groups of patients can clearly be seen to cluster and are spatially separated from the other groups.
Conclusions
PCA has highlighted correlations between measured variables for all classes of patients. Many of the correlations are expected from a molecular biology standpoint. Some of the PCA models greatly reduced the dimensionality of the data set but the resulting scores did not spatially separate the groups of patients.
DFA on scores obtained from PCA showed disappointing results. The discriminatory power of the models ranged from 44-56 % when PCA scores were used. The low discriminatory power of these models may be a result of the reduction in dimensionality of the data set during PCA, with significant detail being lost. Using transformed variables in DFA gave much improved models. The discriminatory power of the clinical and flow cytometry models was 55 and 76 % respectively. When DFA was performed on the complete data set (clinical, flow cytometry and RT-PCR variables) a prediction efficiency of 86.9% was observed. Therefore it is recommended that the variables included in this latter model (Table 36) be measured and used to classify new patients suspected of being susceptible to sepsis.
The most impressive feature of the model is its ability to assign patients correctly 6 days before the onset of symptoms. Key discriminatory variables could therefore be monitored and threshold levels established at which medical treatment must be administered. Using the parameters shown in Table 34, it is possible to acquire data from patients and, using transformation algorithms, input the data into the DFA model. The model is then capable of classifying patients into the appropriate groups with an efficiency approaching 90%. This could be of great value when used in clinical laboratories.
Table 11. Eigenvalues of correlation matrix, and related statistics for clinical observations
PC    Eigenvalue   % Total variance   Cumulative Eigenvalue   Cumulative %
1     3.73         24.87              3.73                    24.87
2     2.50         16.70              6.23                    41.58
3     1.53         10.23              7.77                    51.81
4     1.22         8.14               8.99                    59.95
5     1.10         7.39               10.10                   67.35
6     1.04         6.96               11.14                   74.31
7     0.87         5.80               12.01                   80.11
8     0.77         5.19               12.79                   85.31
9     0.66         4.42               13.46                   89.73
10    0.54         3.62               14.00                   93.36
11    0.44         2.99               14.45                   96.35
12    0.29         1.95               14.74                   98.30
13    0.21         1.40               14.95                   99.70
14    0.02         0.16               14.98                   99.87
15    0.01         0.12               15.00                   100.00
Table 12. Loadings of clinical measurements for each PC (associations with Eigenvalues >0.5 shown for 95% CL).
Clinical measurement   PC 1    PC 2    PC 3    PC 4    PC 5    PC 6
Temp                   0.14    0.08    0.60    -0.36   0.14    0.40
HR                     0.15    -0.06   0.58    -0.48   0.11    -0.37
MAP                    0.22    0.21    -0.29   -0.48   -0.54   -0.14
WCC                    0.63    -0.74   0.00    0.05    0.02    -0.01
Neuts                  0.58    -0.75   0.01    0.17    0.00    0.05
Lymphocytes            0.35    0.00    -0.14   -0.16   -0.19   -0.62
Monocytes              0.62    -0.66   0.08    -0.05   0.01    -0.01
Platelets              0.53    -0.09   -0.37   0.05    0.22    0.04
CRP                    -0.15   0.11    0.62    0.50    0.08    -0.36
PO2                    -0.26   0.07    -0.36   0.12    0.53    -0.39
HCO3-                  0.76    0.47    0.16    0.23    -0.13   -0.07
BXS                    0.78    0.49    0.15    0.20    -0.11   -0.05
Lactate                -0.55   -0.46   0.16    -0.12   -0.08   -0.22
Urea                   -0.30   -0.17   0.09    0.44    -0.58   0.04
Creatinine             -0.69   -0.44   0.05    0.02    -0.19   -0.06
Table 13. Abbreviations used in analysis of flow cytometry data
Abbreviation   Flow cytometry parameter
fc 1           CD3 CD4 CD25 in CD3 CD4
fc 2           CD3 CD4 HLA-DR in CD3 CD4
fc 3           CD3 CD8 CD25 in CD3 CD8
fc 4           CD3 CD8 HLA-DR in CD3 CD8
fc 5           CD3 CD4 CD54 in CD3 CD4
fc 6           CD3 CD4 CD69 in CD3 CD4
fc 7           CD3 CD8 CD54 in CD3 CD8
fc 8           CD3 CD8 CD69 in CD3 CD8
fc 9           CD19 CD80 in CD19
fc 10          CD19 CD86 in CD19
fc 11          CD14 CD80 in CD14
fc 12          CD14 CD86 in CD14
fc 13          CD19 CD54 in CD19
fc 14          CD19 CD25 in CD19
fc 15          CD56 CD54 in CD56
fc 16          CD56 CD69 in CD56
fc 17          CD14 CD54 in CD14
fc 18          CD14 HLA-DR in CD14
fc 19          CD14 CD11B in CD14
fc 20          CD14 CD54 HLA-DR in CD14
fc 21          CD14 HLA-DR CD11B in CD14
fc 22          CD14 HLA-DR CD11B CD54 in CD14
fc 23          CD14 CD11B CD54 in CD14
fc 24          CD31 (%)
fc 25          CD54 (%)
fc 26          CD62L (%)
fc 27          CD11B (%)
fc 28          CD69 (%)
fc 29          CD11B CD69 (%)
Table 14. Eigenvalues of correlation matrix, and related statistics for flow cytometry data
PC    Eigenvalue   % Total variance   Cumulative Eigenvalue   Cumulative %
1     7.16         24.68              7.16                    24.68
2     3.81         13.12              10.96                   37.80
3     2.63         9.07               13.59                   46.88
4     2.43         8.39               16.03                   55.27
5     1.90         6.54               17.93                   61.81
6     1.62         5.59               19.55                   67.41
7     1.46         5.05               21.01                   72.45
8     1.21         4.19               22.23                   76.64
9     1.10         3.78               23.32                   80.42
10    0.87         3.00               24.19                   83.42
11    0.85         2.93               25.04                   86.36
12    0.71         2.44               25.75                   88.80
13    0.63         2.17               26.38                   90.97
14    0.53         1.82               26.91                   92.78
15    0.39         1.36               27.30                   94.14
16    0.35         1.21               27.65                   95.35
17    0.34         1.16               27.99                   96.51
18    0.31         1.07               28.30                   97.58
19    0.20         0.68               28.50                   98.26
20    0.12         0.42               28.62                   98.68
21    0.10         0.34               28.72                   99.02
22    0.08         0.27               28.79                   99.29
23    0.07         0.23               28.86                   99.52
24    0.05         0.18               28.91                   99.70
25    0.04         0.13               28.95                   99.83
26    0.02         0.08               28.98                   99.91
27    0.01         0.05               28.99                   99.96
28    0.01         0.03               29.00                   99.99
29    0.00         0.01               29.00                   100.00
Table 15. Loadings for all flow cytometry data for each PC (associations with Eigenvalues >0.5 shown for 95% CL)
        PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8     PC9
fc1     0.59    -0.36   0.19    0.06    -0.13   -0.28   0.28    -0.11   -0.33
fc2     0.61    -0.07   -0.20   -0.21   0.11    0.41    0.34    -0.01   -0.17
fc3     0.62    -0.31   -0.07   0.14    0.14    -0.11   0.21    0.05    -0.31
fc4     0.24    -0.03   -0.26   -0.19   0.11    0.44    0.46    -0.24   -0.03
fc5     -0.06   -0.01   0.89    -0.24   -0.10   0.07    0.20    0.07    -0.10
fc6     -0.11   -0.11   0.62    -0.02   -0.45   0.06    0.42    0.15    -0.07
fc7     0.08    0.18    0.64    -0.39   0.50    0.05    -0.25   -0.09   -0.06
fc8     0.14    0.29    0.63    -0.41   0.43    0.03    -0.24   -0.12   -0.07
fc9     0.70    -0.52   0.17    0.05    -0.15   -0.01   -0.05   -0.03   0.18
fc10    0.58    -0.57   0.05    0.14    -0.01   -0.06   -0.25   -0.35   0.11
fc11    0.79    -0.35   -0.06   -0.09   0.02    0.30    0.04    0.03    0.14
fc12    0.53    -0.36   0.06    0.15    -0.22   -0.09   -0.33   -0.44   0.02
fc13    0.57    -0.27   0.23    -0.09   0.17    -0.06   0.01    -0.15   0.24
fc14    0.74    -0.44   -0.01   0.03    0.09    0.13    0.00    0.22    0.16
fc15    0.54    -0.19   0.11    0.18    0.08    0.13    -0.30   0.46    0.14
fc16    0.31    -0.09   -0.21   -0.63   -0.07   0.45    -0.07   -0.03   -0.10
fc17    0.73    0.26    0.02    0.26    0.03    -0.15   -0.13   0.19    -0.28
fc18    0.49    0.64    -0.02   -0.10   -0.39   0.23    -0.15   -0.06   -0.07
fc19    0.28    0.28    0.00    -0.27   -0.01   -0.32   0.40    0.01    0.60
fc20    0.64    0.64    -0.05   0.09    -0.18   0.07    -0.16   0.01    -0.19
fc21    0.52    0.65    0.01    -0.15   -0.38   0.08    -0.02   -0.07   0.16
fc22    0.66    0.66    -0.02   0.01    -0.19   -0.07   -0.03   -0.01   0.06
fc23    0.79    0.31    0.05    0.16    0.01    -0.29   0.02    0.15    0.05
fc24    0.45    0.19    -0.05   -0.21   0.23    -0.53   0.33    -0.02   -0.04
fc25    0.29    -0.15   -0.12   0.00    0.29    0.03    -0.01   0.54    -0.12
fc26    0.10    0.35    -0.24   -0.03   0.55    0.12    0.02    0.04    0.24
fc27    0.25    0.16    -0.37   -0.17   0.35    -0.29   0.03    -0.29   -0.27
fc28    -0.08   -0.30   -0.25   -0.77   -0.27   -0.26   -0.19   0.16    -0.05
fc29    -0.03   -0.29   -0.26   -0.77   -0.26   -0.29   -0.17   0.17    -0.04
Table 16. Correlation matrix of PCA model using flow cytometry variables
Columns: fc1 fc2 fc3 fc4 fc5 fc6 fc7 fc8 fc9 fc10 fc11 fc12 fc13 fc14 fc15
fc1   1.00 0.34 0.69 0.09 0.13 0.1 -0. 0.59 0.47 0.43 0.38 0.46 0.42 0.19
fc2   0.34 1.00 0.43 0.53 -0.06 -0.08 0. 0.33 0.26 0.57 0.17 0.22 0.54 0.24
fc3   0.69 0.43 1.00 0.14 -0.11 -0.1 -0. -0.46 0.41 0.54 0.29 0.31 0.58 0.36
fc4   0.09 0.53 0.14 1.00 -0.12 -0. -0.06 -0.10 0.06 0.32 0.06 0.1 0.13 0.00
fc5   0.13 -0.06 -0.11 -0.12 1.00 0.80 0.53 0.09 -0.07 -0.05 -0.04 0.1 -0.04 -0.01
fc6   0.1 -0.08 -0.1 -0. 0.80 1.00 -0.02 -0.10 -0.07 -0.06 -0.03 00 -0. -0.
7 -002 -0.03 -0. 0.53 -0. 1.00 -004 0.00 -0.03 0.1 0. 01 8 00 0 -0.0 -0. 0.51 -0 092 1 --004 -00 -002 0 -0 0 9 0. 0.4 0.1 0.09 0. -1 0.73 0.76 061 0. 0. 0.46 0. 0.26 0.41 0. -0.07 -0.07 -0 -.7 1 00 0.57 0. 0. 0. 0 11 0. 0. 0.54 0. -0.05 -0.06 -0. 7 057 1.00 0.4 0.51 0 04 12 0. 0.17 0 0.0 -0.04 -0.03 -0. -0. 2 0.82 0.45 1 0 041 0.30 13 0.46 0. 03 0 0.10 0. 03 0. 0. 0 051 0. 1. 0 0.2 14 0. 0. 0 0. -0.04 -0.0 0. -0. -0 0. 041 0. 1.00 058 0. 02 0.36 -0.0 -0.09 0.1 C). 0 0. 0. 0 0.58 1.00 16 0.06 0.47 0. --- 7 0.11 0.1 -0. 1 0. 0.0 0. 0.2 0.07 1c17 04 0.3 0.38 --0 0, 0. 0. 4 0. 0 0. 2 0. 0.38 044 c18 004 0 27 04 --0. 0. 0 -0 02 0 0.0 0.07 0 1 c19 006 0. 0. 04 -0. 0.0 0. -006 0. -0.07 0. 0. -fc2O 0. 0.32 0.1 7 -0 -.1 0. 0. 0 004 0.2 0. 0. 016 c21 0.06 02 0.0 1 -0 0. 02 0. -0.0 02 0 0.10 0.1 fc22 0.1 0.29 0.1 1 --11 0. 0.2 0 004 0.27 0 012 0.20 c23 04 029 0.4 -0 -0 0 0 0.4 02 044 03 033 044 044 24 0.36 0.26 0.30.1 -0.01 -0. 0. 0. 7 0 009 0.17 009 024 018 007 0.07 0.23 0.22 0. -0 -0. 000 00 0.2 012 028 0.04 0.14 037 0. 26 fc2O -0.24 0.15 -005 0 -0. -0. 0. 012 -017 -0.10 0.03 -0.23 0.09 0.01 0. 01 1c27 002 0.20 0.18 0. -0. -0. 0.05 0.03 -0.04 009 0.11 0.11 0.07 0.07 -0.16 fc 28 0.03 000 -007 -0.01 -0 -0.04 -0.0 -0.02 0.07 001 0.01 0.01 -0.02 0.02 -0.07 lc29 006 003 -0.04 -0.01 -0. -0.04 -0.04 -0.03 0.09 0.04 004 003 0.02 004 -0.04
Table 16 continued
-W Ic 19' Ic 20 fc2l E iT -i Id 0.06 0.41 0.04 0.06 0. 006 0.1 0.40 036 0.07 -02 0.02 0 Ic 0.47 0.31 027 0.13 0.32 0.2 02 0.29 0.26 023 0. 020 0 fc 0.12 0.38 0.04 0.04 0. 0.0 01 040 030 0.22 -0. 018 -0. - Ic 027 0. 0 0 0.07 0.1 0.1 0.0 0.12 0.08 0. 010 -0. -fc -0.01 -0. -0. 004 -0. -0 -0.0 -0.01 -0, -0, -025 -0. -0 Ic -0.07 -0. -0. 0.0 -O -0 -0. 1 -008 -008 -0. -0. -0.32 -004 - Ic 011 0. 0 007 003 007 0.06 012 0. 01 005 -00 - fc 8 0.10 0 0. 0.09 0.17 02 j4 017 0. 0 1 003 -002 -fc9 0.15 0. 06 0. 0. 12 11 0 0.4 0.15 0. -0.1 -004 7 Ic 10 0.11 0. *0.06 -0.06 0.04 -0 004 0.28 0.09 0 1 -0.1 009 1 fcll 0. 0. 7 0.2 0. 025 0. 0.2 0.44 017 0. 0. 011 1 0.04 1c12 0.07 0 0.13 -007 0.17 0.1 0. 9 0.31 0.09 0.04 -0. 011 1 003 Ic 13 0. 0.32 0.08 0.19 0 14 0 0 2 0.33 02 0.1 0. 0.07 2 0. 2 Ic 14 0 0.38 007 0.11 0 1 0 0 0.44 0 1 0. 0 1 007 2 004 1 0.0 04 0.1 -0.01 0.2 0. 0. 0.44 0. 0. 0 -016 -7 -0.
1.0 0.06 0.30 0.05 0.1 0. 0 -0. -0 0 0 016 0.41 Ic 0.0 1 00 0.4 0.04 0.77 0.33 0. 0. 0. 0. 0. 0.24 -0 -0.1 Ic 0.3 0.4 1.00 0.16 08 0.90 0.74 0.4 0. -0. 0.0 0.11 -0 -0 Ic 00 0.04 0.1 1.00 0. 04 ö. 04 04 -0.02 0 0.15 008 Ic 0 1 0.77 0.82 0.11 1 00 0.7 0. 0 0.2 0.03 02 0.20 -0 -0.
fc2 0.1 0.33 0.90 04 0.7 1. 0. 0. 02 0.0 006 0.09 -0. -0.0 fc 2 0.0 O 07 0. 0.87 0. 1 0.7 0.38 0.05 0.1 0.18 -0.1 -O Ic 2 -0.01 084 04 0. 4 067 0 07 1 0 048 0.20 0 14 024 -0.1 -0 1c24 -0.0 0.34 0.1 04 028 0.25 0. 0.4 1.00 0. 0.22 0.37 0.08 013 fc25 0.0 0.24 -0.0 -0.0 003 0.01 0.0 0.2 0.1 1.00 0.05 016 0.03 0.03 Ic26 0.11 0.15 0.07 0.12 0.22 0.06 0 0.14 02 005 1.00 0.20 -017 -0.14 fc27 01 0.24 0.11 0.15 0.20 0.09 0 024 037 0.1 0.20 100 0.05 0.09 1c28 0.3 -0.22 -0.10 005 -0.21 -0.06 -0. -0.18 0.08 0.03 -0.17 0.05 1.00 098 fc2Q 0.4 -018 -009 0.08 -0.18 -0.05 -0. -014 0. 13 0.03 -0.14 0.09 098 100 Table 17. Eigenvalues of correlation matrix, and related statistics for selected flow cytometry variables (fc 24-fc 29) PC Eigenvalue % Total Cumulative Cumulative _____________ variance Eigrenvalue variance % 1 2.06 34,26 2.06 34.26 2 1.60 26.64 3.65 60.90 3 094 15.71 4.60 76.60 4 0.76 12.70 5.36 89.31 0.63 1045 5.99 99.76 6 0.01 0.24 6.00 100.00 Table 18. Loadings for selected flow cytometry data for each PC. (associations with Eigenvalues >0.5 shown for 95% CL) _________ PCi PC2 PC3 fc24 -0.23 -0.72 -0.13 fc25 -0.10 -0.40 0.87 fc 26 0.20 -0.60 -039 fc27 -0.19 -0.72 -0.06 fc 28 -0.98 0.14 -0.05 fc 29 -0.98 009 -0.08 Table 19. Eigenvalues of correlation matrix, and related statistics for RT-PCR data PC Eigenvalue % Total variance Cumulative Cumulative _____________ Eigenvalue variance % 1 2.9 41.0 2.9 41.0 2 1.2 17.6 4.1 58.6 3 1.0 14.4 5.1 73.0 4 0.6 8.3 5.7 81.3 0.6 7.9 6.2 89.2 6 0.4 6.2 6.7 95.4 7 0.3 4.6 7.0 100.0 Table 20. Loadings for RT-PCR data for each PC. (associations with Eigenvalues >0.5 shown for 95% CL) _________ PCi PC2 PC3 Fas-L 0.68 -0.01 0.340 MCP-1 0.71 -0.38 -0.27 TNF-alpha 0.77 0.21 -0.25 IL-i 0.37 0.64 0.56 IL-6 0.72 -0.22 -0.20 11-8 0.70 -0.28 0.389 11-10 0.34 0.698.0.48* borderline significance at 95% Cl Table 21. 
Eigenvalues of correlation matrix, and related statistics for combined clinical, RT-PCR and flow cytometry variables

PC   Eigenvalue   % Total variance   Cumulative Eigenvalue   Cumulative %
1    4.550        15.167             4.550                   15.167
2    3.861        12.870             8.411                   28.037
3    3.295        10.982             11.706                  39.019
4    2.028        6.760              13.734                  45.779
5    1.816        6.052              15.549                  51.831
6    1.520        5.065              17.069                  56.896
7    1.234        4.113              18.303                  61.010
8    1.153        3.844              19.456                  64.854
9    1.139        3.797              20.595                  68.650
10   0.958        3.192              21.553                  71.842
11   0.931        3.104              22.484                  74.946
12   0.906        3.021              23.390                  77.968
13   0.825        2.748              24.215                  80.716
14   0.740        2.466              24.955                  83.182
15   0.685        2.284              25.640                  85.466
16   0.611        2.036              26.251                  87.502
17   0.559        1.863              26.810                  89.365
18   0.542        1.808              27.352                  91.173
19   0.480        1.601              27.832                  92.774
20   0.441        1.469              28.273                  94.244
21   0.371        1.235              28.644                  95.479
22   0.333        1.108              28.976                  96.588
23   0.299        0.998              29.276                  97.586
24   0.250        0.833              29.526                  98.419
25   0.175        0.585              29.701                  99.003
26   0.139        0.462              29.840                  99.465
27   0.114        0.381              29.954                  99.846
28   0.020        0.068              29.974                  99.914
29   0.018        0.059              29.992                  99.973
30   0.008        0.027              30.000                  100.000

Table 22. Loadings for combined clinical, RT-PCR and flow cytometry data for each PC.
(associations with Eigenvalues >0.5 shown for 95% CL) )arameter PCi PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 CD31 (%) -0.02 -0.03 0.42 0.18 0.52 -0.07 -0.40 -0.17 0.03 CD54 (%) -0.16 0.04 0.20 -0.09 0.48 0.48 0.06 0.18 -0.19 CD62L (%) -0.41 -0.09 0.20 0.28 0.16 -0.21 0.01 -0.01 0. 39 CO11B (%) -0.13 -0.16 0.22 0.19 0.54 -0.13 -0.09 -0.19 0.05 CD69 (%) 0.49 -0.55 0.23 -0.27 0.13 -0.16 0.00 0.40 -0.25 CD11BCD69(%) 0.48 -0.53 0.29 -0.27 0.14 -0.16 -0.03 0.40 -0.22 Temp 0.20 0.06 0.30 -0.35 -0.02 a37 0. 47 0.17 0.30 HR 0.15 -0.12 0.13 -0.35 0.20 0.65 -0.03 -0.32 0.00 MAP 0.19 0.19 -0,15 0.10 -0.44 0.31 -030 004 -035 WCC 0.57 -0.64 0.14 0.38 0.01 0.06 0.02 -0.08 0.08 Neuts 0.53 -0.64 0.21 0.38 -0.01 -0.05 0.00 -0.06 0.12 Lymphocytes 031 -010 -032 0.01 -002 0.09 -06 -00 -28 Monocytes 0.56 -0.61 0.04 0.25 -0.04 0.19 0.06 -0.04 0.09 Platelets 0.49 -0.02 -0.31 0. 29 0.01 -0.15 0.26 0.04 -0.01 NR -0.21 -0.71 0.09 0.10 -0.25 0.21 -0.24 -0. 12 -0.04 APTR -050 -049 -0.04 0.07 -0.33 0.11 -0.22 -0.08 -0.08 CRP -0.07 -0.09 0.26 -0.63 0.14 -0.32 -0.23 -0.18 0.02 P02 -023 0.13 -0.10 0.18 0. 45 -0.10 0.14 -0.12 -0.49 HCO3-0.79 0.32 -0.07 -0.25 -0.03 -0.12 -0.21 -0.06 0.01 BXS 0.80 0.33 -0.07 -0.24 -0.04 -0.11 -0.22 -0.08 0.03 Lactate -0.50 -0.41 -0.02 -0.15 0.03 0.06 0.22 0.08 -0.08 Urea -0.25 -0.i5 0.21 -0.12 -0.24 -0.32 0.16 -0.08 -0.19 Creatinine -0.68 -0.45 0.16 -0.06 -0.15 -0. 04 -0.03 0.10 -0.18 Fas-L -0.05 0.12 0.77 -0.08 0.01 0.11 -0.16 -0.05 -0.05 MCP-1 0.12 0.19 0.61 0.07 -0.36 -009 0.20 -0.32 -0.13 TNF-alpha 0.14 0.49 0.57 0.32 0.01 0.24 0.13 0.05 -0.12 IL-i -0.22 0.28 0.20 0.20 -0.12 0.11 -0.45 0.49 0.16 IL-6 0.07 0.19 0.66 0.15 -0.15 -0.20 0.20 -0.05 -0.25 IL- B 0.02 0.16 0.73 -0.05 -0.27 -0.04 -0.03 -0.01 0.16 IL-b 0.14 0.44 0.02 0. 51 0.09 0.03 -0.05 0.21 -0.19 Table 23. Model definition used in all DFA models parameter value
variable introduction forward stepwise
tolerance 0.010 terms included in as derived model ___________ Ftoenter 1.00 F to remove 0.00 number of steps as significant Table 24. Summary of DFA model based on PCA scores of clinical data containing parameter value/term Wilks' . <0.40 variables included in model PCi, P03, PC4 ___________________________________ 3 sig. DEs __________ _____________ cumulative proportion DF1 DF2 DF3 ___________________________________ 56.7% 79.3% 89.9% Eigenvalue ________ 0.60 0.24 0.11 factor structure PCi 0.63 0.02 0.15 PC3 0.09 -0.87 0.39 ________________________ P04 0.44 0.03 -0.51 standardised coefficients PCi 0.78 0.02 0.13 PC3 0.12 -0.89 0.35 _______________________ P04 0.61 0.04 -0.48 means of cannonical control -0.15 0.63 0.04 variables day 1 0.51 -0.39 0.02 day2 0.57 -0.17 -0.64 day 3 0.72 0.02 0.35 day4 0.21 0.02 0.43 day5 -1.80 -0.01 -0.28 ______________________ day6 -1.33 -1.14 0.37 Table 25. summary of variable association in the discriminative DFA model based on PCA scores of clinical data containing substituted mean values DF PC components of each PC 1 WCC, monocytes, neuts and platelets, BXS and HC03-ve corr. with -creatinine and lactate 4 CAP 2 3 HRandCRP 3 4 CAP Table 26. Classification matrix of DFA model based on PCA scores of clinical data group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p=.28 p=.23 p=.14 p=.08 p=.11 p=.09 p=05 control 80 24 4 0 1 0 0 1 1 day 36 12 9 3 0 0 0 1 2 days 40 5 2 6 0 1 0 _1_ 3days 11 1 5 1 1 1 0 0 4days 8 7 3 0 0 1 1 0 5days 50 2 2 0 0 0 5 1 6days 83 1 0 0 0 0 0 5 Total 48 52 25 10 2 3 6 9 Table 27. Summary of DFA model based on PCA scores of clinical data without substituted mean values parameter value/term Wilks' <0.45 variables included in model PCi, PC3, PC5 ___________________________________ 3 sig. DFs ___________ ___________ cumulative proportion DF1 DF2 DF3 _________________________________ 52% 89% 95% Eigenvalue _________ 0.49 0.35 0.05 factor structure PCi 0.65 0. 
09 0.08 PC3 -0.13 0.80 0.45 _______________________ PC5.0.61 -0.32 -0.07 standardised coefficients PCi 0.79 0.10 0.06 PC3 -0.15 0.87 0.38 ______________________ PC5 -0.73 -0.34 -0.06 means of cannonical control -0.18 0.56 -0.10 variables day 1 0.54 -0.59 0.07 day2 0.45 -0.14 0.36 day3 0.81 0.40 -0.19 day 4 0.22 -0.05 -0.27 day5 -1.61 -0.00 0.33 ________________________ day 6 -1.40 -1.90 -0.59 Table 28. summary of variable association in the discriminative DFA model based on PCA scores of clinical data without substituted mean values DF PC components of each PC 1 1 WCC, monocytes, neuts and platelets, BXS and HC03-ve corr. with -creatinine and lactate P02 is contrasted with both urea and MAP 2 3 HRandCRP 3 3 HRandCRP Table 29. Classification matrix of DFA model based on PCA scores of clinical data without group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p=.34 p=.21 p=.13 p=.06 p=.10 p=.09 p=.03 control 83 25 2 1 1 0 1 0 iday 47 9 9 1 0 0 0 0 2days 0 6 6 0 0 0 0 0 3days 0 3 3 0 0 0 0 0 4days 0 5 3 0 0 0 1 0 5days 25 4 2 0 0 0 2 0 6days 67 0 0 0 0 0 1 2 Total 44 52 25 2 1 0 5 2 Table 30. Summary of OFA model based on transformed clinical data with substituted mean values parameter value/term Wilks' 0.25 variables included in model BXS, CRP, lactate, urea, temperature, _____________________________________ creatinine, neuts, P02, HC03 __________________________________ 5 sig. DEs ______ ______ ______ cumulative proportion DF1 DF2 DF3 DF4 DF5 ________________________________ 46% 74% 85% 94% 99% Eigenvalue _________ 0.78 0.47 0.18 0.15 0.09 factor structure BXS -0.62 0.58 -0.09 0.21 0.22 CRP 0.12 0.63 0.16 -0.03 -0.31 temp 0.17 0.03 -0.56 0.62 0.11 lactate 0. 65 0.03 0.05 0.05 0.01 urea -0.19 -0.18 -0.19 0.17 -0.57 creatinine 0.42 -0.26 0.51 0.20 -0.19 HC03 -0.54 0.67 -0.13 0.19 0.16 P02 0.16 -0.05 -0. 
04 -0.67 -0.31 ______________________ neuts -0.14 0.05 -0.24 -0.18 0.70 standardised coefficients BXS -1.45 -0.78 0.84 0.09 1.00 CRP 0.13 0.65 0.07 -0.20 -0.20 temp 0.41 -0.02 -0.64 0.53 0.02 lactate 0.56 0.34 -0.22 -0.01 0.21 urea -0.52 -0.22 -0.51 0.09 -0.60 creatinine 0.13 -0.11 0.93 0.57 0.10 HC03 1.02 1.45 -0.53 0.32 -0.88 P02 0.19 0.07 -0.28 -0.61 -0.21 ______________________ neuts -0.10 0.01 -0.30 -0.32 0.67 means of canonical variables control -0.54 -0.90 -0.10 -0.14 -0.07 day 1 -0.22 0. 53 -0.35 -0.03 0.40 day2 -0.19 0.84 0.20 -0.58 -0.36 day 3 -0.42 0.64 0.41 0.62 -0.28 day4 -0.34 -0.09 0.43 0.61 0.06 day 5 2.05 -0.37 0.61 -0.23 0 27 ________________________ day 6 1.95 i 0.03 -1.05 0.47 -0.54 Table 31. Summary of variable association in the DFA model based on clinical data with DF I clinical variables defining DFs 1 BXS and HC03 which are negatively correlated with lactate correlation between BXS, CRP and HCO3 3 negative correlation between temperature and creatinine 4 negative correlation between temperature and P02 negative correlation between urea and neutrophilss Table 32. Classification matrix of OFA model based on clinical data with substituted mean values group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p=.28 p=.23 p=.14 p=.08 p=.11 p=.09 p=.05 control 80 24 5 1 0 0 0 0 1 day 56 4 14 5 0 1 0 1 2 days 60 2 2 9 _1 0 0 1 3 days 11 1 4 2 _1 1 0 0 4days 8 7 4 0 0 1 0 0 5days 50 1 2 1 0 0 5 1 6days 83 0 0 0 0 0 1 5 Total 55 39 31 18 2 3 6 8 Table 33. Summary of DFA model based on PCA scores of flow cytometry data containing parameter value/term Wilks' X <0.39 variables included in model pci, PC5, PC8 _____________________________ 2sig.DFs ______________ cumulative proportion DF1 DF2 __________________________ 46% 71% Eigenvalue 0.49 0. 
26 factor structure ci 0.30 -0.04 PC5 -0.19--0.59 ______________________ PC8 0.419 -0.14 standardised coefficients PCi 0.40 -0.04 PC5 -0.25 -0.67 _______________________ PC8 0.56 -0.16 means of canonical variables control -0.21 -0.35 day 1 -0.71 -0.12 day 2 -0.24 0.16 day3 0.14 1.44 day4 0.48 0.32 day 5 0.88 -0.28 ________________________ day 6 1.99 -0.43 Table 34. Summary of variable association in the DFA model based on PCA scores of flow cytometry data containing substituted mean values DF PC components of each DF 1 1 fc 1-3, 9-15, 17, 20-23 8 fcl9 2 5 fc7�26 Table 35 Classification matrix of DFA model based on PCA scores of flow cytometry data containing_substituted mean values _______ _______ group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p.28 p=.23 _p=.14 p=.08 p=.11 p=.09 p=.05 control 66.6 20 7 0 1 0 1 1 iday 36.0 12 9 2 0 0 2 0 2daysl3.3 4 8 2 1 0 0 0 3days44.4 2 2 0 4 0 1 0 4 days 33.3 3 2 1 0 4 1 _1 5days4o.0 4 1 0 0 0 4 _1 6 days 66.6 1 1 0 0 0 0 4 Total 43.9 46 30 5 6 4 9 7 Table 36 Summary of DFA model based on flow cytometry data containing substituted mean values
_________________________________
parameter value/term Wilks' . <0.034 variables included in model fc7-8, fcl 1, fcl 6, fc25, fc28, fc29 __________________________________ 3 sig. DFs _______ ________ cumulative proportion DF1 0F2 DF3 ________________________________ 46% 62% 74% Eigenvalue _________ 2.17 0.94 0.63 factor structure fc7 0.06 -0.16 -0.10 fc8 -0.06 -0.19 -0.07 fcll 0.02 0.13 -0.24 fcl6 0.05 -0.14 0.14 fc25 -0. 34 -0.23 -0.08 1c28 0.04 -0.02 0.02 _______________________ fc29 0.06 -0.03 0.01 standardised coefficients fc7 1.84 0.20 -1.03 fc8 -1.66 -1.00 0.32 fcll 0.87 1.89 -1.16 fcl6 -0.73 -1.33 0.80 fc25 -0.86 -0.29 0.02 fc28 -2.34 0.65 0.29 _______________________ fc29 2.64 0.12 -0.66 means of canonical variables control 0.50 -0.92 -0.40 day 1 1.07 -0.32 _0.12 day2 0.22 _0.07 _1.10 day3 0.20 _1.75 _1.26 day4 -0.39 _1.49 -1.34 day5 -1.05 0.73 -0.55 _____________________ day6 -5.27 -1.06 0.48 Table 37 Summary of variable association in the DFA model based on flow cytometry data DF clinical variables defining DFs 1 correlates fc7, 16, 28 and 29 and contrasts these to fc8 and 25 2 correlates fc7, 8, 16 and 25 and contrasts these with fcl 2 3 correlated with fcl 1 Table 38 Classification matrix of DFA model based on clinical data with substituted mean values _______ group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p=.28 p=.23 _p=.14 p=.08 p=.11 p=.09 p=.05 control 80,0 24.0 2.0 0.0 1.0 2.0 1.0 0.0 1 day 68.0 6.0 17.0 1.0 0.0 1.0 0.0 0.0 2days 73.3 3. 0 1.0 11.0 0.0 0.0 0.0 0.0 3days 66.7 2.0 0.0 0.0 6.0 1.0 0.0 0.0 4days 91. 7 1.0 0.0 0.0 0.0 11.0 0.0 0.0 5days 80.0 0.0 1.0 1.0 0.0 0.0 8.0 0.0 6days 83.3 0.0 0.0 0.0 0.0 1.0 0.0 5.0 Total 76.6 36.0 21.0 13.0 7.0 16.0 9.0 5.0 Table 39 Summary of DFA model based on combined clinical, RT-PCR and flow cytometry variables parameter value/term variables included in model ic7-8, fcl 1, tcl 6, fc25, fc28, fc29 _________________________________ 3 sig. 
DEs _______ ________ cumulative proportion DF1 DF2 DF3 DF4 ___________________________________ 44% 63% 79% 89% Eigenvalue ________ 3.79 1.60 1.34 0.86 factor structure BXS -0.30 0.08 -0.26 0.17 fc 25 0.22 -0.23 -0.08 0.13 fc 22 0.22 -0.03 0.08 0.28 fc 11 -0.02 0.12 0.05 0,23 Temp 0.09 -0.11 -0.13 0.20 CAP 0.01 0.16 -0.33 -0.16 fc 18 0.15 0.01 -0. 12 0.06 fc 6 -0.03 0.01 -0.02 0.12 lL-6 -0.11 -0.06 0.11 -0.07 INR 0.00 0.06 0.03 0.12 APTR 0.15 0.12 0.30 -0.03 fc 16 -0.07 -0.09 -0.11 -0.07 Urea -0.02 -0.10 0.01 0.17 Lactate 0.25 0.04 -0.05 -0.22 Fas-L 0.00 0.15 -0.03 0.04 fc 13 -0.04 -0.01 -0.07 0.11 fc 24 0.00 -0.15 0.11 0.15 fcl -0.01 0.12 -0.12 0.22 fc 3 -0.02 0.04 -0.02 0.15 MCP-1 -0.06 0.04 0.02 -0.02 ________________________ fc 28 -0.05 -0.03 -0.08 -0.01 111-10 I 0.01 I -0.10 I 0.09 I -0.07 Table 39 contd. Summary of DFA model based on combined clinical, RT-PCR and flow cytometry variables fc27 0.01 -0.12 0.08 -0.02 fc26 0.04 -0.10 0.19 -0.06 Neuts -0.11 -0.07 0.03 -0.04 fc 14 0.00 0.09 0.08 0.19 WCC -0.10 -0.08 0.03 -0.01 fc 29 -0.07 -0.04 -0.08 0.03 Platelets -0.11 0.04 0.12 -0.03 _______________________ P02 0.05 -0.07 0.03 -0.27 parameter _______ value/term ________ cumulative proportion DF1 DF2 DF3 DF4 _________________________________ 44% 63% 79% 89% Figenvalue ________ 3.79 1.60 1.34 0.86 standardised coefficients BXS -0.80 0.33 -0.21 -0.09 fc 25 0.81 -0.52 -0.29 0.22 fc 22 0.46 -0.58 0.05 0.68 fc 11 0.04 2.00 0.53 0.96 Temp 0.51 -0.41 -0.07 0.23 CRP 0.07 0.14 -0.58 -0.13 fc 18 0.64 0.72 -0.53 -0.11 fc6 0.02 -0.29 0.17 0.08 IL-6 -0.69 -0.36 0.28 -0.12 INR -0.90 0.03 -0.99 0.94 APTR 0.82 0.62 1.28 -0. 44 fc 16 -0.63 -1.23 -0.54 -0.69 Urea -0.13 -0.38 0.06 0.58 Lactate 0.14 0. 12 -0.06 -0.68 Fas-L 0.20 0.92 -0.04 0.05 -fc 13 -0.27 -0.82 -0.33 -0.07 fc24 -0.46 -0.72 0.11 0.15 fc 1 1.04 0.84 0.09 0.12 fc 3 -0.67 -0.46 -0.41 0. 
06 MCP-1 0.33 0.01 -0.21 -0.05 fc 28 3.20 0.89 -2.48 -0.21 11-10 022 -0.03 0.26 -0.16 fc27 0.12 0.14 0.23 -0.13 fc26 015 0.26 0.27 -0.01 Neuts 0.83 0.74 0.27 -1.34 fc 14 -0.43 -0.73 0.32 -0.88 WCC -0.70 -1.06 0.05 0.77 fc 29 -2.73 -0.34 2.71 0.56 Platelets -0.47 -0.06 0.33 0.27 ______________________ P02 -0.17 -0.12 0.20 -0.45 means of canonical variables control -0.39 -1.04 1.24 0.23 day 1 -1.80 -0.51 -0.50 -0.19 day2 -0.65 0.70 -1.16 -1.17 day 3 -0.40 1.12 -1.19 0.45 day4 0.38 2.02 0.20 1.77 day 5 3.33 1.36 1.41 -1.45 ________________________ day 6 5.37 -2A4 -2.21 0.76 Table 40. Classification matrix of DFA model based on combined clinical, RT-PCR and flow cytometry variables group percent control 1 day 2 days 3 days 4 days 5 days 6 days ____ correct p.28 p=.23 p=.14 p=.08 p=.ii p=.09 p=.05 control 90.0 27.0 0.0 1.0 1.0 1.0 0.0 0.0 1 day 76.0 3. 0 19.0 3.0 0.0 0.0 0.0 0.0 2days 93.3 0.0 1.0 14.0 0.0 0.0 0.0 0.0 3 days 77.8 1.0 0.0 1.0 7.0 0.0 0.0 0.0 4days 91.7 0.0 0.0 1.0 0.0 11.0 0.0 0.0 days 90.0 0.0 0.0 1.0 0.0 0.0 9.0 0.0 6 days 100.0 0.0 0.0 0.0 0.0 0.0 0. 0 6.0 Total 86.9 31.0 20.0 21.0 8.0 12.0 9.0 6.0 Example 7: Binary logistic regression analysis to predict sepsis A binary logistic regression model was used to analyse the RT-PCR, flow cytometry results and clinical data separately, from the ICU patients who went on to develop sepsis and presented positive microbiology results. This model used results gained from an age matched group of ICU patients who were not diagnosed with sepsis as the control group.
Although the model identified numerous possible predictors, some appeared to be of limited use since the values obtained for the pre-symptomatic sepsis patients were within the range of those obtained for the non-sepsis patients. The potential prediction markers that did yield some pre-sepsis data points that differed from the non-sepsis data are listed in Table 41.
However, when combined, these prediction markers could only have identified 8 out of the 24 pre-sepsis patients.
Table 41. Summary of potential prediction markers identified by binary logistic regression analysis.

Time point             Overall predictor effect   Predictor            Probability score
5 days pre-diagnosis   p=0.10                     base excess          p=0.047
4 days pre-diagnosis   p=0.042                    IL-10                p=0.058
3 days pre-diagnosis   p=0.095                    blood bicarbonate    p=0.021
                       p=0.114                    CD31                 p=0.037
                       p=0.055                    IL-10                p=0.052
2 days pre-diagnosis   p=0.10                     blood bicarbonate    p=0.017
                                                  C reactive protein   p=0.038
                       p=0.085                    TNF-α                p=0.022

The discovery of a combination of markers that could possibly predict sepsis in 8 out of 24 patients who later went on to develop SIRS with confirmed infection does not constitute a diagnostic test. Although the prediction capability for CD31 on granulocytes appeared promising (66%), this marker was only effective three days before the appearance of clinical symptoms. A test based on CD31 alone may not constitute a diagnostic test since, to be effective, there would need to be a larger diagnostic window. This could be achieved by the discovery of even more markers. This study may however have found some markers that could form part of a diagnostic test in the future, but caution must be exercised. In the mid 1980s HLA-DR was believed to be prognostic for the development of infections and sepsis (Spittler, A. & Roth, E. 2003, Intensive Care Med, vol. 29, pp. 1211-1213). More recent studies however have shown that post-operative levels of this marker did not predict the onset of SIRS, sepsis or infectious complications (Oczenski, W. et al. 2003, Intensive Care Med, vol. 29, pp. 1253-1257 and Perry, S. et al. 2003, Intensive Care Med, vol. 29, pp. 1245-1252). The conflicting reports in the case of HLA-DR illustrate why caution must be applied to the results of this study.
These findings could be due to regional factors such as antibiotic policy, diagnostic criteria, clinical practice, surgical procedures, treatment regimes, environmental factors and the patients' predisposing factors (Angus, D. & Wax, R. 2001, Critical Care Medicine, vol. 29, no. 7 (suppl), pp. 109-116). A larger study that involves more patients from several different hospitals, preferably spanning different health authorities, needs to be conducted to further assess the usefulness of the markers identified for the prediction of sepsis.
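The kind of single-marker binary logistic regression described in this example can be illustrated with a minimal sketch. The marker values, group labels, learning rate and function names below are hypothetical and chosen only for illustration; they are not patient data, and the fitting routine (plain gradient descent) is an assumption rather than the software used in the study.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by batch gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss w.r.t. the linear term
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def predict_proba(model, x):
    b0, b1 = model
    return sigmoid(b0 + b1 * x)

# Hypothetical standardised marker values: 0 = non-sepsis control, 1 = pre-sepsis.
marker = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.4, 0.9, 1.3]
group = [0, 0, 0, 1, 0, 1, 1, 1]
model = fit_logistic(marker, group)

# Proportion of pre-sepsis patients identified by the combined markers (8 of 24, from the text).
combined_sensitivity = 8 / 24
```

Because the two hypothetical groups overlap, the fitted probabilities stay well away from 0 and 1 in the overlap region, which mirrors the observation above that many pre-sepsis values fell within the non-sepsis range.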
Example 8: Sepsis as a model for response to biological weapons
Background and method
Since one of the applications for the claimed invention is the early detection of deliberate infection resulting from exposure to biological weapons, the applicability of sepsis as a model for such infection was examined. Presumptive biological weapons pathogens such as Burkholderia pseudomallei and Francisella tularensis are predicted to produce severe sepsis (see Table 42), which is difficult to model for obvious reasons.
However, in vitro infection of whole blood may be used as a model and the activation marker expression and cytokine response measured. To compare this in vitro infection model with the in vivo situation, Staphylococcus aureus infection was selected as a model infectious agent directly comparable with the in vivo hospital-acquired infection data.
Table 42

Infection                              Forms of "Negative" Outcome
Anthrax                                Sepsis, septic shock
Brucellosis                            Sepsis, septic shock or chronic form
Glanders and Melioidosis               Sepsis, septic shock or persistent form
Plague                                 Sepsis, septic shock
Tularemia                              Sepsis, septic shock
Epidemic Typhus                        Sepsis
Q fever                                Sepsis
Ebola and Marburg hemorrhagic fevers   Sepsis, septic (toxico-infectious, hypovolemic) shock
Japanese encephalitis                  Sepsis, septic shock
Smallpox                               Sepsis, septic shock
Yellow fever                           Sepsis

Blood from 25 healthy volunteers was infected in vitro with Staphylococcus aureus and the following activation markers and cytokine levels were measured at 24 and 48 hours post-infection, as previously described.
FACS
Dendritic cells: CD54, CD97, CCR6, CCR7
NK cells: CD25, CD44, CD62L, CD69, CD97
Monocytes: CD44, CD54, CD62L, CD69, CD97, CD107a
Neutrophils: CD44, CD62L, CD69, CD107a
Real time RT PCR
IL-1β, IL-6, IL-8, IL-10, MCP-1, TNF-α, sFasL
Each of these sets of input parameters (i.e. dendritic cell markers, NK cell markers, monocyte markers, neutrophil markers, RT PCR data at 24 h, RT PCR data at 48 h) was used to train its own neural network model. Random selections of infected or non-infected blood samples were used for training (70%) or subsequent testing (30%). The testing phase of the neural network analysis gave a predictive accuracy based on the percentage of times the network correctly predicted whether the test set of input parameters was from an infected or a non-infected sample.
This testing of each set of input parameters was repeated 5 times. Each time the test was conducted a new neural network was constructed using a newly randomised 70% of the infected and non-infected samples. An average predictive accuracy was derived for each set of input parameters by working out the mean from the 5 predictive accuracies calculated from the 5 neural networks constructed on the 5 randomised sets of input data. The methodology was similar to that used in the sepsis patient study.
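The repeated 70%/30% evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions: a simple nearest-centroid classifier stands in for the neural network, and the sample values, function names and random seed are hypothetical. Only the resampling and averaging scheme follows the text.

```python
import random

def nearest_centroid_fit(rows, labels):
    """Stand-in classifier (not a neural network): store each class's mean feature vector."""
    sums, counts = {}, {}
    for row, lab in zip(rows, labels):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))

def repeated_holdout(rows, labels, n_repeats=5, train_frac=0.7, seed=0):
    """Build a fresh model on a newly randomised 70% each time; test on the remaining 30%."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_train = int(round(train_frac * len(rows)))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        model = nearest_centroid_fit([rows[i] for i in train], [labels[i] for i in train])
        correct = sum(nearest_centroid_predict(model, rows[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    # The mean of the per-repeat accuracies is the reported predictive accuracy.
    return sum(accuracies) / n_repeats, accuracies

# Hypothetical two-marker measurements for 10 infected and 10 non-infected samples.
infected = [[2.0 + 0.1 * i, 3.0 - 0.1 * i] for i in range(10)]
non_infected = [[0.1 * i, 0.5 + 0.05 * i] for i in range(10)]
rows = infected + non_infected
labels = ["infected"] * 10 + ["non-infected"] * 10
mean_accuracy, per_repeat = repeated_holdout(rows, labels)
```

Averaging over several random splits reduces the dependence of the reported accuracy on any single choice of training samples, which matters with sample sets as small as those described here.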
Results
The most consistent results were obtained from the RT PCR data. Figure 4 shows the data obtained from three subjects and demonstrates the somewhat heterogeneous patterns of change in the profiles. However, when the data were subjected to the neural network analysis described above, the algorithm achieved a good level of discrimination of infected samples from uninfected controls (Figure 5).
Example 9:
Microarray design and fabrication
A custom human immune response array was designed, homologous to the DSTL-designed murine immune function array, with additional genes that had been identified from the previous sepsis study. A total of 1438 genes were each represented by a single 50-mer oligonucleotide designed by MWG Biotech. In addition, the array contained 768 oligonucleotides from the MWG Biotech commercially available 'diverse function' genes to act as an inter-microarray slide control. Printing of the oligonucleotides was performed by MWG according to their array layout plan, with the entire set of printed spots (2206) triplicated on each slide.
Blood samples for analysis
Blood samples were taken from intensive care unit (ICU) patients and mixed with blood/bone marrow RNA stabilisation reagent (Roche) in a 1:10 ratio as per the manufacturer's instructions. Stabilised samples were shipped to DSTL frozen (-20 °C) and subsequently stored at -70 °C prior to mRNA extraction.
RNA isolation
Messenger RNA (mRNA) was isolated from 27.5 ml blood lysate (corresponding to 2.5 ml of stabilised blood) using the mRNA Isolation Kit for Blood/Bone Marrow (Roche) following the manufacturer's guidelines with a few minor changes (volumes for the 55 ml lysate protocol were halved, centrifugation was for 3 minutes, washing of MGP beads was performed using 1 ml MGP washing buffer repeated 3 times and elution was into 20 µl of redistilled water). The entire mRNA preparation was treated with RNase-free DNase from the DNA-free kit (Ambion Inc.) following the manufacturer's guidelines. The final mRNA preparation was quantitated by A.
mRNA amplification and fluorescent dye labelling
All amplification and labelling steps were performed with the Amino Allyl MessageAmp aRNA kit (Ambion Inc.) following the manufacturer's instructions. Cy3 and Cy5 post-labelling reactive dyes used in the protocol were obtained from Amersham Bioscience. Amplification of mRNA was performed using 50 ng purified mRNA. A total of 3 µg of amplified mRNA was fluorescently labelled for hybridisation, 1.5 µg with Cy3 and 1.5 µg with Cy5. Following labelling, the Cy3-labelled and Cy5-labelled portions of the same sample were mixed together and purified using the MessageAmp™ kit. The volume of eluted sample was reduced to 9 µl by drying in a vacuum drier. Following this, the size of the labelled amplified mRNA was reduced for hybridisation using the Fragmentation kit (Ambion Inc.).
Microarray hybridisation
Microarray slides were prepared for hybridisation by attaching a GeneFrame (MWG) over the oligo printed area according to the manufacturer's instructions. Fragmented, labelled mRNA (11 µl) was denatured for 3 minutes at 95 °C, snap-cooled on ice for 3 minutes and briefly centrifuged. 240 µl MWG hybridisation solution was added to the sample and mixed before applying to the microarray slide.
The slide was covered with a plastic coverslip, which attaches to the GeneFrame, and placed within a HC2 hybridisation cassette (CamLab).
500 µl water was added to each well of the cassette to prevent drying. The closed cassette was placed in a 42 °C hybridisation oven for 16 hours. After hybridisation, slides were removed from the cassettes and the GeneFrame and coverslip removed. Slides were washed sequentially using three buffers (1x SSC, 0.2% SDS; 0.5x SSC; and 0.25x SSC). Each wash was for 5 minutes with agitation. Slides were centrifuged for 5 minutes at 1500 rpm and dried slides stored in the dark until scanning. Slides were scanned using a GenePix 4000B microarray scanner (Axon Instruments Inc.). PMT voltages for the 635 nm and 532 nm channels were adjusted to yield a total pixel intensity ratio of approximately 1:1. Images were saved as single image TIFF files.
Microarray gene expression analysis
TIFF files from the Axon scanners were loaded into BlueFuse software (BlueGnome Ltd) and processed to fused data following the manufacturer's instructions. The resultant data files were saved and subsequently analysed in GeneSpring software.
Neural network
For analysis, data were collated from patients 1 to 6 days prior to the onset of sepsis and compared with a control group consisting of ICU patients who did not develop sepsis.
Individual samples provided data measuring up to 22 different parameters and selective combinations of variables were fed into a multi-layered perceptron neural network (Proforma, Hanon Solutions, Glasgow, Scotland).
Each network was trained with a random 70% selection of balanced sepsis and control data using back propagation algorithms and then tested with the remaining 30% of the data. This process was then repeated, using a different 70% of random ised data, until a total of 5 repeats had been run. The predictive abilities of these 5 models were then averaged to give an overall predictive capability of the network. The most successful network was the one most capable of correctly classifying previously unseen patients as being from either the sepsis or non-sepsis control group.
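The evaluation scheme described above (a balanced random 70%/30% split, repeated 5 times, with the results averaged) can be sketched in Python as follows. This is an illustrative sketch only. The nearest-centroid classifier is a simple stand-in and is not the multi-layered perceptron used in the study. The function names, the balancing by subsampling, and the toy data are all assumptions.

```python
import random
from statistics import mean


def balanced_split(sepsis, control, train_frac=0.7, rng=random):
    """Split into (train, test) lists of (features, label), equal counts per class."""
    n = min(len(sepsis), len(control))      # balance classes by subsampling (assumption)
    s, c = rng.sample(sepsis, n), rng.sample(control, n)
    k = round(train_frac * n)
    train = [(x, 1) for x in s[:k]] + [(x, 0) for x in c[:k]]
    test = [(x, 1) for x in s[k:]] + [(x, 0) for x in c[k:]]
    return train, test


def fit_centroids(train):
    """Stand-in classifier: one mean feature vector per class (not an MLP)."""
    return {label: [mean(col) for col in zip(*[x for x, y in train if y == label])]
            for label in (0, 1)}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], x))


def percent_correct(centroids, test):
    """Return (% correct overall, % correct sepsis, % correct control)."""
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for x, y in test:
        totals[y] += 1
        hits[y] += predict(centroids, x) == y
    overall = 100 * sum(hits.values()) / sum(totals.values())
    return overall, 100 * hits[1] / totals[1], 100 * hits[0] / totals[0]


def repeated_holdout(sepsis, control, repeats=5, seed=0):
    """Average the three scores over `repeats` independent random 70/30 splits."""
    rng = random.Random(seed)
    runs = []
    for _ in range(repeats):
        train, test = balanced_split(sepsis, control, 0.7, rng)
        runs.append(percent_correct(fit_centroids(train), test))
    return tuple(mean(col) for col in zip(*runs))


# Toy, well-separated data standing in for per-patient expression values.
sepsis = [[5 + 0.01 * i, 5.0] for i in range(10)]
control = [[0.01 * i, 0.0] for i in range(10)]
print(repeated_holdout(sepsis, control))
```

The three averaged values correspond to the overall, sepsis and control percentages of correct classification reported for each gene set.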
Results
Table 43 shows various sets of genes selected from the 22 most informative genes based on their individual scores. The sets were assigned so as to attempt to establish the relative importance of combinations of genes based on such factors as their individual scores (sets B and G representing the top- and bottom-ranked genes of the 22), whether or not genes with known immunological or inflammatory functions were included (set E with CD40 and IL-8 excluded, for instance) and the effect of larger or smaller sets.
Table 43
A B C D E F G H I J MEF2 MEF2D CD40 IL-8 MEF2D SAA3P TDGF1 SPN SPN HMMR ENTPD2 ENTPD2 ENTPD2 SAA3P ENTPD2 MAP1A HDAC5 CD40 EPHA8 SAA3P CD40 CD40 EPHA8 CD40 EPHA8 EPHA8 ENTPD5 SAA3P CD40 ENTPD2 EPHA8 EPHA8 IL-8 SPN SAA3P CD40 ODF1 IL-8 CRX MAP1A SAA3P SAA3P SAA3P MEF2D HMMR IL-8 GPR44 MEF2D TSC22D1 MAPK7 HMMR HMMR MEF2D EPHA8 MAPK7 CRX CD5 EPHA8 MAPK7 TSC22D1 IL-8 IL-8 SPN MAPK7 SPN SPN CD79A MAP1A HMMR IL-8 MAPK7 MAPK7 ENTPD2 TSC22D1 MEF2D CTNND1 CRX MEF2D SPN SPN HMMR MAPK7 CX3CL1 ENTPD2 TSC22D1 TSC22D1 HMMR SLC6A9 MAP1A ENTPD2 CRX TSC22D1 SLC6A9 TDGF1 HDAC5 CTNND1 ENTPD5 CD79A CD5 GPR4 ODF1 CX3CL1
Table 44 shows the ranked scores obtained following the neural network analysis.
Table 44
Rank   Gene set   % correct (overall)   % correct (sepsis)   % correct (control)
1      F          97.5                  100                  90
2      H          87.3                  83.3                 94.7
3      B          86.3                  86.5                 85.7
4      I          84.4                  92.6                 72.2
5      J          87.3                  94.2                 75
6      G          87.2                  95                   60
7=     D          83.7                  92.9                 66.7
7=     E          83.7                  92.9                 66.7
9      C          79.5                  82.6                 75
10     A          70                    69.6                 71.4
Surprisingly, set B, comprising the ten top-scoring genes based on their individual scores, did not give the best overall predictive value. Even more surprisingly, the best predictive set, set F, comprised set B together with two genes not known to have any connection with the immune or inflammatory response, CRX and MAP1A. Overall, the values indicate that the inclusion of genes that could not have been predicted to be useful based on their known functions nevertheless resulted in improved predictive scores.
The reader's attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
The invention is not restricted to the details of any foregoing embodiments. The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
SEQUENCE LISTING
<110> Seceretary of State for Defence <120> Sepsis detection microarray <130> P119143W0 <160> 22 <170> Patentln version 3.4 <210> 1 <211> 1177 <212> DNA <213> Homo sapiens <400> 1 gccaaggctg gggcagggga gtcagcagag gcctcgctcg ggcgcccagt ggtcctgccg 60 cctggtctca cctcgctatg gttcgtctgc ctctgcagtg cgtcctctgg ggctgcttgc 120 tgaccgctgt ccatccagaa ccacccactg catgcagaga aaaacagtac ctaataaaca 180 gtcagtgctg ttctttgtgc cagccaggac agaaactggt gagtgactgc acagagttca 240 ctgaaacgga atgccttcct tgcggtgaaa gcgaattcct agacacctgg aacagagaga 300 cacactgcca ccagcacaaa tactgcgacc ccaacctagg gcttcgggtc cagcagaagg 360 gcacctcaga aacagacacc atctgcacct gtgaagaagg ctggcactgt acgagtgagg 420 cctgtgagag ctgtgtcctg caccgctcat gctcgcccgg ctttggggtc aagcagattg 480 ctacaggggt ttctgatacc atctgcgagc cctgcccagt cggcttcttc tccaatgtgt 540 catctgcttt cgaaaaatgt cacccttgga caagctgtga gaccaaagac ctggttgtgc 600 aacaggcagg cacaaacaag actgatgttg tctgtggtcc ccaggatcgg ctgagagccc 660 tggtggtgat ccccatcatc ttcgggatcc tgtttgccat cctcttggtg ctggtcttta 720 tcaaaaaggt ggccaagaag ccaaccaata aggcccccca ccccaagcag gaaccccagg 780 agatcaattt tcccgacgat cttcctggct ccaacactgc tgctccagtg caggagactt 840 tacatggatg ccaaccggtc acccaggagg atggcaaaga gagtcgcatc tcagtgcagg 900 agagacagtg aggctgcacc cacccaggag tgtggccacg tgggcaaaca ggcagttggc 960 cagagagcct ggtgctgctg ctgctgtggc gtgagggtga ggggctggca ctgactgggc 1020 atagctcccc gcttctgcct gcacccctgc agtttgagac aggagacctg gcactggatg 1080 cagaaacagt tcaccttgaa gaacctctca cttcaccctg gagcccatcc agtctcccaa 1140 cttgtattaa agacagaggc agaaaaaaaa aaaaaaa 1177 <210> 2 <211> 3151 <212> DNA <213> Homo sapiens <400> 2 cccggccaga caccctcacc tgcggtgccc agctgcccag gctgaggcaa gagaaggcca 60 gaaaccatgc ccatggggtc tctgcaaccg ctggccacct tgtacctgct ggggatgctg 120 gtcgcttcct gcctcggacg gctcagctgg tatgacccag atttccaggc aaggctcacc 180 cgttccaact cgaagtgcca gggccagctg gaggtctacc tcaaggacgg atggcacatg 240 gtttgcagcc agagctgggg ccggagctcc aagcagtggg aggaccccag tcaagcgtca 300 aaagtctgcc agcggctgaa ctgtggggtg 
cccttaagcc ttggcccctt ccttgtcacc 360 tacacacctc agagctcaat catctgctac ggacaactgg gctccttctc caactgcagc 420 cacagcagaa atgacatgtg tcactctctg ggcctgacct gcttagaacc ccagaagaca 480 acacctccaa cgacaaggcc cccgcccacc acaactccag agcccacagc tcctcccagg 540 ctgcagctgg tggcacagtc tggcggccag cactgtgccg gcgtggtgga gttctacagc 600 ggcagcctgg ggggtaccat cagctatgag gcccaggaca agacccagga cctggagaac 660 ttcctctgca acaacctcca gtgtggctcc ttcttgaagc atctgccaga gactgaggca 720 ggcagagccc aagacccagg ggagccacgg gaacaccagc ccttgccaat ccaatggaag 780 atccagaact caagctgtac ctccctggag cattgcttca ggaaaatcaa gccccagaaa 840 agtggccgag ttcttgccct cctttgctca ggtttccagc ccaaggtgca gagccgtctg 900 gtggggggca gcagcatctg tgaaggcacc gtggaggtgc gccagggggc tcagtgggca 960 gccctgtgtg acagctcttc agccaggagc tcgctgcggt gggaggaggt gtgccgggag 1020 cagcagtgtg gcagcgtcaa ctcctatcga gtgctggacg ctggtgaccc aacatcccgg 1080 gggctcttct gtccccatca gaagctgtcc cagtgccacg aactttggga gag aaattcc 1140 tactgcaaga aggtgtttgt cacatgccag gatccaaacc ccgcaggcct ggccgcaggc 1200 acggtggcaa gcatcatcct ggccctggtg ctcctggtgg tgctgctggt cgtgtgcggc 1260 ccccttgcct acaagaagct agtgaagaaa ttccgccaga agaagcagcg ccagtggatt 1320 ggcccaacgg gaatgaacca aaacatgtct ttccatcgca accacacggc aaccgtccga 1380 tcccatgctg agaaccccac agcctcccac gtggataacg aatacagcca acctcccagg 1440 aactcccgcc tgtcagctta tccagctctg gaaggggttc tgcatcgctc ctccatgcag 1500 cctgacaact cctccgacag tgactatgat ctgcatgggg ctcagaggct gtaaagaact 1560 gggatccatg agcaaaaagc cgagagccag acctgtttgt cctgagaaaa ctgtccgctc 1620 ttcacttgaa atcatgtccc tatttctacc ccggccagaa catggacaga ggccagaagc 1680 cttccggaca ggcgctgctg ccccgagtgg caggccagct cacactctgc tgcacaacag 1740 ctcggccgcc cctccacttg tggaagctgt ggtgggcaga gccccaaaac aagcagcctt 1800 ccaactagag actcgggggt gtctgaaggg ggcccccttt ccctgcccgc tggggagcgg 1860 cgtctcagtg aaatcggctt tctcctcaga ctctgtccct ggtaaggagt gacaaggaag 1920 ctcacagctg ggcgagtgca ttttgaatag ttttttgtaa gtagtgcttt tcctccttcc 1980 tgacaaatcg agcgctttgg cctcttctgt gcagcatcca cccctgcgga 
tccctctggg 2040 gaggacagga aggggactcc cggagacctc tgcagccgtg gtggtcagag gctgctcatc 2100 tgagcacaaa gacagctctg cacattcacc gcagctgcca gccaggggtc tgggtgggca 2160 ccaccctgac ccacagcgtc accccactcc ctctgtctta tgactcccct ccccaacccc 2220 ctcatctaaa gacaccttcc tttccactgg ctgtcaagcc cacagggcac cagtgccacc 2280 cagggccctg cacaaagggg cgcctagtaa accttaacca acttggtttt ttgcttcacc 2340 cagcaattaa aagtcccaag ctgaggtagt ttcagtccat cacagttcat cttctaaccc 2400 aagagtcaga gatggggctg gtcatgttcc tttggtttga ataactccct tgacgaaaac 2460 agactcctct agtacttgga gatcttggac gtacacctaa tcccatgggg cctcggcttc 2520 cttaactgca agtgagaaga ggaggtctac ccaggagcct cgggtctgat caagggagag 2580 gccaggcgca gctcactgcg gcctctaaga aggtgaagca acatgggaac acatcctaag 2640 acacatccta agacaggtcc tttctccacg ccatttgatg ctgtatctcc tgggagcaca 2700 ggcatcaatg gtccaagccg cataataagt ctggaagagc aaaagggagt tactaggata 2760 tggggtgggc tgctcccaga atctgctcag ctttctgccc ccaccaacac cctccaacca 2820 ggccttgcct tctgagagcc cccgtggcca agcccaggtc acagatcttc ccccgaccat 2880 gctgggaatc cagaaacagg gaccccattt gtcttcccat atctggtgga ggtgaggggg 2940 ctcctcaaaa gggaactgag aggctgctct tagggagggc aaaggttcgg gggcagccag 3000 tgtctcccat cagtgccttt tttaataaaa gctctttcat ctatagtttg gccaccatac 3060 agtggcctca aagcaaccat ggcctactta aaaaccaaac caaaaataaa gagtttagtt 3120 gaggagaaaa aaaaaaaaaa aaaaaaaaaa a 3151 <210> 3 <211> 1246 <212> DNA <213> Homo sapiens <400> 3 ctgaagcata cccggcaggg gctgtcccca ggcccaacaa gcaaagggcc cagtagcgag 60 ggccactgga gcccatctcc ggggggctgg gcaggaagta gggtggggtt tggggtaggg 120 atctggtacc ctgggactgc tgcaactcaa actaaccaac ccactgggag aagatgcctg 180 ggggtccagg agtcctccaa gctctgcctg ccaccatctt cctcctcttc ctgctgtctg 240 ctgtctacct gggccctggg tgccaggccc tgtggatgca caaggtccca geatcatlga 300 tggtgagcct gggggaagac gcccacttcc aatgcccgca caatagcagc aacaacgcca 360 acgtcacctg gtggcgcgtc ctccatggca actacacgtg gccccctgag ttcttgggcc 420 cgggcgagga ccccaatggt acgctgatca tccagaatgt gaacaagagc catgggggca 480 tatacgtgtg ccgggtccag gagggcaacg agtcatacca 
gcagtcctgc ggcacctacc 540 tccgcgtgcg ccagccgccc cccaggccct tcctggacat gggggagggc accaagaacc 600 gaatcatcac agccgagggg atcatcctcc tgttctgcgc ggtggtgcct gggacgctgc 660 tgctgttcag gaaacgatgg cagaacgaga agctcgggtt ggatgccggg gatgaatatg 720 aagatgaaaa cctttatgaa ggcctgaacc tggacgactg ctccatgtat gaggacatct 780 cccggggcct ccagggcacc taccaggatg tgggcagcct caacatagga gatgtccagc 840 tggagaagcc gtgacacccc tactcctgcc aggctgcccc cgcctgctgt gcacccagct 900 ccagtgtctc agctcacttc cctgggacat tctcctttca gcccttctgg gggcttcctt 960 agtcatattc ccccagtggg gggtgggagg gtaacctcac tcttctccag gccaggcctc 1020 cttggactcc cctgggggtg tcccactctt cttccctcta aactgcccca cctcctaacc 1080 taatcccccc gccccgctgc ctttcccagg ctcccctcac cccagcgggt aatgagccct 1140 taatcgctgc ctctagggga gctgattgta gcagcctcgt tagtgtcacc ccctcctccc 1200 tgatctgtca gggccactta gtgataataa attcttccca actgca 1246 <210> 4 <211> 2376 <212> DNA <213> Homo sapiens <400> 4 taaaatctcc ccatgtgagg ggatgtgttt ccttcagcct ctgctgtctg gccgctctgt 60 ctaggtcctg ggccacggga gagccccgtc cctcctttct gaaggccccc tgacttgggc 120 ctcagtgtcc ccgaagatca tgatggcgta tatgaacccg gggccccact attctgtcaa 180 cgccttggcc ctaagtggcc ccagtgtgga tctgatgcac caggctgtgc cctacccaag 240 cgcccccagg aagcagcggc gggagcgcac caccttcacc cggagccaac tggaggagct 300 ggaggcactg tttgccaaga cccagtaccc agacgtctat gcccgtgagg aggtggctct 360 gaagatcaat ctgcctgagt ccagggttca ggtttggttc aagaaccgga gggctaaatg 420 caggcagcag cgacagcagc agaaacagca gcagcagccc ccagggggcc aggccaaggc 480 ccggcctgcc aagaggaagg cgggcacgtc cccaagaccc tccacagatg tgtgtccaga 540 ccctctgggc atctcagatt cctacagtcc ccctctgccc ggcccctcag gctccccaac 600 cacggcagtg gccactgtgt ccatctggag cccagcctca gagtcccctt tgcctgaggc 660 gcagcgggct gggctggtgg cctcagggcc gtctctgacc tccgccccct atgccatgac 720 ctacgccccg gcctccgctt tctgctcttc cccctccgcc tatgggtctc cgagctccta 780 tttcagcggc ctagacccct acctttctcc catggtgccc cagctagggg gcccggctct 840 tagccccctc tctggcccct ccgtgggacc ttccctggcc cagtccccca cctccctatc 900 aggccagagc tatggcgcct acagccccgt 
ggatagcttg gaattcaagg accccacggg 960 cacctggaaa ttcacctaca atcccatgga ccctctggac tacaaggatc agagtgcctg 1020 gaagtttcag atcttgtaga ggacgcagtc tccatctctc tccatcgggc ctcgggaccc 1080 tttctcttct gaatctgctt ccctgcagtt tagatcccgg gatggcattc ctgagaaagc 1140 aacccgaacc agctgtcctt ctgacagctc ggtgttcagc ttacagagac cacccctttc 1200 ctccacaggg agaggctcct ccctctcctg ggacagctca caggtcctag tgattctctc 1260 aaccctaaca ccgtctggca cgattgtgac cgctgaagta caccacgagc tccaggcttc 1320 agaaagtggt gctgagaact tgctccaaga agaagtcaaa ccaaacttgc agttgatttg 1380 gggtcatgtt taggtcagaa tcaccgtgcc cttgaacaag caggtagggg ggcttgataa 1440 cttaactttc cacgtggaca gaattttttt ttttgttttg tttttgtttt gccactgtgc 1500 tctagcctgg gtcacaagag cgaaactccg tctcaataat aacaacaaca aaaataaccc 1560 aggagtggct caagagtcca gtgtgggatg aaaatataaa cagaggaaga caacatatgt 1620 cactggggag gctgggggag ctagacaaaa ttcacacggg aggggcaggg atgagacaat 1680 attgtgccct gggtctgtgc aaacgttggg accagaatcc actataagtt tcattcttcg 1740 gagacacgga agctctgcac tggaggccgg gactcaggcg tggaagagaa tcttctcctt 1800 attcaccggg gaggctgtgt cttgtgcaaa caagtcatag aaacttgatg ggagttgggg 1860 agggactgaa ggatgcatgc aaggttctgg gaaggagtga gaagtagtga aggccaaggg 1920 gcccccatca caggccgatg gggtaagact tcgagagagc ctgatcctgg ggtcttctga 1980 gactccacca ggagccggag agggcaggga gccaaatcca gctaggaggt tacagattgc 2040 ttttcctggg ctgggctcct gagtgttggt tcttccagct gtgcacaggg gatttcaaac 2100 gttctacgtc ctaaagccag gaagagtgac aaggcaggtg gggacagaag gaagaagcca 2160 ggagcctccc tgagaaggtg tcaccttatc tgtcctcctc ttccccacac actggcctct 2220 gggtccccct tctctgtcca cccacagagt gaaaccaatt aaaatgttgg atctcaaaaa 2280 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2340 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 2376 <210> 5 <211> 6232 <212> DNA <213> Homo sapiens <400> 5 ctgccagatc agtttgtcac cacccaggct cccttgcctt tggctgggtg caacttccat 60 tttaggtgtt ggatctgagg gggaaaaaaa agagagaggg agagagagag aaagaagagc 120 aggaaagatc ccgaaaggag gaagaggtgg cgaaaaatca actgccctgc tggatttgtc 180 tttctcagca 
ccttggcgaa gccttgggtt tctttcttaa aggactgatt tttagaactc 240 cacatttgag gtgtgtggct tttgaagaaa atgtatgtac tgacgggaaa aggaagataa 300 gcaagtcgaa tttttgtctt acgctctctc cttcctgctt cctccttgct gtggtggctg 360 ggatgctcct tccatgattt tttgaatcta gactgggctg ttctctgtgt taaaccaatc 420 agttgcgacc ttctcttaac agtgtgaagt gagggggtct ctctccctcc ttctccttcc 480 tctgtgattc accttccttt ttaccctgcc ctgcggcggc tccgcccctt accttcatgg 540 acgactcaga ggtggagtcg accgccagca tcttggcctc tgtgaaggaa caagaggccc 600 agtttgagaa gctgacccgg gcgctggagg aggaacggcg ccacgtctcg gcgcagctgg 660 aacgcgtccg ggtctcacca caagatgcca acccactcat ggccaacggc acactcaccc 720 gccggcatca gaacggccgg tttgtgggcg atgctgacct tgaaagacag aaattttcag 780 atttgaaact caacggaccc caggatcaca gtcaccttct atatagcacc atccccagga 840 tgcaggagcc ggggcagatt gtggagacct acacggagga ggatcctgag ggagccatgt 900 ctgtagtctc tgtggagacc tcagatgatg ggaccactcg gcgcacagag accacggtca 960 agaaagtagt gaagactgtg acaacacgga cagtacagcc agtcgctatg ggaccagacg 1020 ggttgcctgt ggatgcttca tcagtttcta acaactatat ccagactttg ggtcgtgatt 1080 tccgcaagaa tggcaatggg ggacctggtc cctatgtggg gcaagctggc actgctaccc 1140 ttcctaggaa cttccactac cctcctgatg gttatagtcg ccactatgaa gatggttatc 1200 caggtggcag tgataactat ggcagtctgt cccgggtgac ccgcattgag gagcggtata 1260 ggcccagcat ggaaggctac cgggcaccta gtagacagga tgtgtatggg ccccaacccc 1320 aggttcgggt aggtgggagc agcgtggatc tgcatcgctt tcatccagag ccttatgggc 1380 tagaggatga ccagcgtagt atgggctatg atgacctgga ttatggtatg atgtctgatt 1440 atggcactgc ccgtcggact gggacaccct ctgaccctcg tcggcgcctc aggagctatg 1500 aagacatgat tggtgaggag gtgccatcgg atcaatacta ctgggctcct ttggcccagc 1560 atgagcgagg aagtttagca agcttggata gcctgcgcaa aggagggcct ccacctccta 1620 attggagaca gccagagctg ccagaggtga tcgccatgct tggattccgc ttggatgctg 1680 tcaagtccaa tgcagctgca tacctgcaac acttatgcta ccgcaatgac aaggtgaaga 1740 ctgacgtgcg gaagctcaag ggcatcccag tactggtggg attgttagac catcccaaaa 1800 aggaagtgca ccttggagcc tgtggagctc tcaagaatat ctcttttgga cgtgaccagg 1860 ataacaagat tgccataaaa aactgtgatg 
gtgtgcctgc ccttgtgcga ttgcttcgaa 1920 aggctcgtga tatggacctt actgaagtta ttaccggaac cctgtggaat ctttcatccc 1980 atgactcaat caaaatggag attgtggacc atgcactgca tgccttgaca gatgaagtga 2040 tcattcctca ttctggttgg gagcgggaac ctaatgaaga ctgtaagcca cgccatattg 2100 agtgggaatc ggtgctcacc aacacagctg gctgccttag gaatgtaagc tcagagagga 2160 gtgaagctcg ccggaaactt cgggaatgtg atggtttagt tgatgccctc attttcattg 2220 ttcaggctga gattgggcag aaggattcag acagcaagct tgtagagaac tgtgtttgcc 2280 ttcttcggaa cttatcatat caagttcacc gggagatccc acaggeagag cgttaccaag 2340 aggcagctcc caatgttgcc aacaatactg ggccacatgc tgccagttgc tttggggcca 2400 agaagggcaa agggaaaaaa cctatagagg atccagcaaa cgatacagtg gatttcccta 2460 aaagaacgag tccagctcga ggctatgagc tcttatttca gccagaggtg gttcggatat 2520 acatctcact tcttaaggag agcaagactc ctgccatcct agaagcctca gctggagcta 2580 tccagaactt gtgtgctggg cgctggacgt atggtcgata catccgctct gctctgcgtc 2640 aagagaaggc tctttctgcc atagctgacc tcctgactaa tgaacatgaa cgggtggtga 2700 aagctgcatc tggagcactg agaaacctgg ctgtggatgc tcgcaacaaa gaattaattg 2760 gtaaacatgc tattcctaac ttggtaaaga atctgccagg aggacagcag aactcctctt 2820 ggaatttctc tgaggacact gtcatctcta ttttgaacac tatcaacgag gttatcgctg 2880 agaacttgga ggctgccaaa aagcttcgag agacacaggg tattgagaag ctggtgttga 2940 tcaacaaatc agggaaccgc tcagaaaaag aagttcgagc agcagcactt gtattacaga 3000 caatctgggg atataaggaa ctgcggaagc cactggaaaa agaaggatgg aagaaatcag 3060 actttcaggt gaatctaaac aatgcttccc gaagccagag cagtcattca tatgatgata 3120 gtactctccc tctcattgac cggaaccaaa aatcagataa caactattcc acaccaaatg 3180 agagaggaga ccacaataga acactggatc gatcggggga tctaggcgac atggagccat 3240 tgaagggaac aacacccttg atgcaggacg aggggcagga atctctggag gaagagttgg 3300 atgtgttggt tttggatgat gaggggggcc aagtgtctta cccctccatg cagaagattt 3360 agcaccacta tctccgttcc atctgggctt atatgtactt ttattttttg gtggtgaaat 3420 tgactgatga ttttcctttt tcttcgctgg actattgtgc caactgccag gctgcctcct 3480 gcccttacag ccctaagtgg ctgccttctt tccatcaact cccaacttct tcctgtgaag 3540 tttaattgtc tcaacgcctc cccctccccc attccctcca 
tttttctccc aagaaacctg 3600 actcaattat ttgcatattt tgagaaactg ctgcagatta gttctttttg ccagttttcc 3660 ctggaactcc tggccttttg tggaggggag ggatggagag aataggaatc ttcactagaa 3720 gccgtgggaa gaattggaag ttacatgctg tatatgcaat gtccagcagt ctgataaact 3780 gacgattctt aatcaagatt tttttcctga tggggaaggg acttttattt tcttttagag 3840 aggggaaagt gtgagctctt cccttattcc taatggctat ttttgaagca aagaaggcca 3900 gcaacattgg cacatgccac ctggcaaagg acccttgagt aagtgaaggt ctcctaaaac 3960 tgggattaag aaaccttgct ctcctcatct ccaaggcagg gaccatcaag aacctacaga 4020 ctccatctct tctgcaagcc tcatgccaac cctgggctat tgctgctgcc ccttaaacac 4080 aggctgtcct taacccacct ctcctgccct gtgatatgtc tgctgagttg gcctggccat 4140 ttccaagagg ctgtagaaag gggagaatgt caaggaagac ttttggtaga gaaggagcag 4200 aaagatgtgt ttttgggaag aagaagacct ctaggaggag ctagtaggaa tgtacatgaa 4260 gcaattagtc tgaaactggc ttccccactc ccccgtttct ccttttccta tccttatagg 4320 cctgtccctt gcctctgccc tggattggtt ggcaaactat aggacttgat gtacataact 4380 cctgtccctt ttcccttaca aggtggggat tgcccctggc tttgcctctt ctttgtgcct 4440 ttggcctggg gtgcatctcc tcccgccctt ccatgtgcct ttctttgcct ctgcagtctc 4500 atttctcata attttgcaaa ttatattttg ttgctttctt acctactatt ggccctaaat 4560 agcagaaaga agagaagtga ccgagagaac ctcagattct tcattgagga ttggtatagc 4620 catgatttca gtcatagcaa gcttttgctc aacagcatat gggtgggatt tggcaaaaat 4680 cctattctga tgaatctcaa agtaaggctg gtaagagaag tgagtggtgt gactcttact 4740 ccttaggtgc ccagaattta ccatcatctc tgaaggagtt acagggaagt ggtctcccca 4800 attctcccct ccctccagta ttgccccctc tcactttagc atatattaat tagcaggttg 4860 ggctagagaa atcagctgct atgcgggttg attattatta ttatttctaa tccttttcct 4920 tatttgcctt ctactcccct taatctaatc taaaagctct gttccatgca actggagttc 4980 cttatccctc tcttcccctt cccttatata ttgaggctat ggggtaggag aaaagtgcac 5040 aacccaccac cccctctact cgtgcattaa aatttcttat ttaccctttt cccccttccc 5100 atttcttccc actttcatct accttttctg gcaaaaagga gccttttgct ctctgtgacc 5160 ctaagagcac actgcacagg gaaaattgcc ccatccagac ctggctccac tcttgatctc 5220 tcttgtcctc nctgctctt ttcctggtgc tcttttttct cggtggggtg 
tgggtaatag 5280 aacagccgtg ggcttttggg gacctttaac ttttttttct ctcttttgtt tataaaaaac 5340 actaaacatt caattccaga gaacccaaaa tcccaccttc ccaccgaaca ctactaaggg 5400 gcttgtgttc tgctccatac cttttctctt ttctttctgt cttgttaatg cttttaaaaa 5460 caaatgagtt ttttatataa ataaagtttt taaagtgtgt atgtgggggg tctgtgtcat 5520 ttcttcactt caagctgtta tttcttccct gctttgcatc tttgttactt ccttatgtat 5580 cagtgtcctt tccagagcaa ccagaaggag gttataccag gatttatttt gagctcagcc 5640 ccaactcttt atcaagcaac attcttgtta actatatgtg aaacattttt tcttctgaag 5700 attcttaaaa attgaatgtg gctgaagttg aacatgggag cttattgcta atttagagat 5760 aggaaactga agcataaaga attaatgact tactttaatt actggaattc ttctgcaaca 5820 tttgacaaaa ctaaccttga ataaggccca ctgtaatacg tagctctctt aaatataaca 5880 cttaggacta gaagattaga aactaccaat cccaactacg taataggaaa atgtaggatc 5940 aaaaggccca tgtatataag tactgaccac tgggccataa tgttgcttct caggctatat 6000 gcagtccttt agtcagaagt caataggcct atttattaat attttacaga ccatattacc 6060 tggattacca gggactatct ttgctgcaga gatcaagggt taagatctat gggaagatac 6120 ttatttttct gaggtcctta tgtcctgtca tataattaaa gactcaagag aatttatgtg 6180 aaatgctttc tgtatgcccc aatctttaga ttaaaattat atacctgctc ct 6232 <210> 6 <211> 3304 <212> DNA <213> Homo sapiens <400> 6 ctgagctctg ccgcctggct ctagccgcct gcctggcccc cgccgggact cttgcccacc 60 ctcagccatg gctccgatat ctctgtcgtg gCtgctccgc ttggccacCt tctgCcatct 120 gactgtcctg ctggctggac agcaccacgg tgtgacgaaa tgcaacatca cgtgcagcaa 180 gatgacatca aagatacctg tagCtttgct catccaCtat caacagaacc aggcatcatg 240 cggcaaacgc gcaatcatct tggagacgag acagcacagg ctgttctgtg ccgacccgaa 300 ggagcaatgg gtcaaggacg cgatgcagca tctggaccgc caggctgctg ccctaactcg 360 aaatggcggc accttcgaga agcagatcgg cgaggtgaag cccaggacca cccctgccgc 420 cgggggaatg gacgagtctg tggtcctgga gcccgaagcc acaggcgaaa gcagtagcct 480 ggagccgact ccttcttccc aggaagcaca gagggccctg gggacctccc cagagCtgcc 540 gaCgggcgtg aCtggttCct Cagggaccag gctccccccg ac9ccaaagg ctCaggatgg 600 agggcctgtg ggcacggagc ttttccgagt gcctcccgtc tccactgCcg ccacgtggca 660 gagttctgct ccccaccaac 
ctgggcccag cctctgggct gaggcaaaga cctctgaggc 720 cccgtccacc caggacccct ccacccaggc ctccactgcg tcctccccag ccccagagga 780 gaatgctccg tctgaaggcc agcgtgtgtg gggtcaggga cagagcccca ggccagagaa 840 ctctctggag cgggaggaga tgggtcccgt gccagcgcac acggatgcct tccaggactg 900 ggggcctggc agcatggccc acgtctctgt ggtccctgtc tcctcagaag ggacccccag 960 cagggagcca gtggcttcag gcagctggac ccctaaggct gaggaaccca tccatgccac 1020 catggacccc cagaggctgg gcgtccttat cactcctgtc cctgacgccc aggctgccac 1080 ccggaggcag gcggtggggc tgctggcctt ccttggcctc ctcttctgcc tgggggtggc 1140 catgttcacc taccagagcc tccagggctg ccctcgaaag atggcag gag agatggcgga 1200 gggccttcgc tacatccccc ggagctgtgg tagtaattca tatgtcctgg tgcccgtgtg 1260 aactcctctg gcctgtgtct agttgtttga ttcagacagc tgcctgggat ccctcatcct 1320 catacccacc cccacccaag ggcctggcct gagctgggat gattggaggg gggaggtggg 1360 atcctccagg tgcacaagct ccaagctccc aggcattccc caggaggcca gccttgacca 1440 ttctccacct tccagggaca gagggggtgg cctcccaact caccccagcc ccaaaactct 1500 cctctgctgc tggctggtta gaggttccct ttgacgccat cccagcccca atgaacaatt 1560 atttattaaa tgcccagccc cttctgaccc atgctgccct gtgagtacta cagtcctccc 1620 atctcacaca tgagcatcag gccaggccct ctgcccactc cctgcaacct gattgtgtct 1680 cttggtcctg ctgcagttgc cagtcacccc ggccacctgc ggtgctatct cccccagccc 1740 catcctctgt acagagccca cgcccccact ggtgacatgt cttttcttgc atgaggctag 1800 tgtggtgttt cctggcactg cttccagtga ggctctgccc ttggttaggc attgtgggaa 1860 ggggagataa gggtatctgg tgactttcct ctttggtcta cactgtgctg agtctgaagg 1920 ctgggttctg atcctagttc caccatcaag ccaccaacat actcccatct gtgaaaggaa 1980 agagggaggt aaggaatacc tgtccccctg acaacactca ttgacctgag gcccttctct 2040 ccagcccctg gatgcagcct cacagtcctt accagcagag caccttagac agtccctgcc 2100 aatggactaa cttgtctttg gaccctgagg cccagagggc ctgcaaggga gtgagttgat 2160 agcacagacc ctgccctgtg ggcccccaaa tggaaatggg cagagcagag accatccctg 2220 aaggccccgc ccaggcttag tcactgagac agcccgggct ctgcctccca tcacccgcta 2280 agagggaggg agggctccag acacatgtcc aagaagccca ggaaaggctc caggagcagc 2340 cacattcctg atgcttcttc agagactcct 
gcaggcagcc aggccacaag acccttgtgg 2400 tcccacccca cacacgccag attctttcct gaggctgggc tcccttccca cctctctcac 2460 tccttgaaaa cactgttctc tgccctccaa gaccttctcc ttcacctttg tccccaccgc 2520 agacaggacc agggatttcc atgatgtttt ccatgagtcc cctgtttgtt tctgaaaggg 2580 acgctacccg ggaagggggc tgggacatgg gaaaggggaa gttgtaggca taaagtcagg 2640 ggttcccttt tttggctgct gaaggctcga gcatgcctgg atggggctgc accggctggc 2700 ctggcccctc agggtccctg gtggcagctc acctctccct tggattgtcc ccgacccttg 2760 ccgtctacct gaggggcctc ttatgggctg ggttctaccc aggtgctagg aacactcctt 2820 cacagatggg tgcttggagg aaggaaaccc agctctggtc catagagagc aagacgctgt 2880 gctgccctgc ccacctggcc tctgcactcc cctgctgggt gtggcgcagc atattcagga 2940 agctcagggc ctggctcagg tggggtcact ctggcagctc agagagggtg ggagtgggtc 3000 caatgcactt tgttctggct cttccaggct gggagagcct ttcaggggtg ggacaccctg 3060 tgatggggcc ctgcctcctt tgtgaggaag ccgctggggc cagttggtcc cccttccatg 3120 gactttgtta gtttctccaa gcaggacatg gacaaggatg atctaggaag actttggaaa 3180 gagtaggaag actttggaaa gacttttcca accctcatca ccaacgtctg tgccattttg 3240 tattttacta ataaaattta aaagtcttgt gaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3300 aaaa 3304 <210> 7 <211> 2100 <212> DNA <213> Homo sapiens <400> 7 ccggctcccc gcactctccg ggtccacgca tcgtcctccc gcgcgcccgc ccgcccatgg 60 ccgggaaggt gcggtcactg ctgccgccgc tgctgctggc cgccgcgggc ctcgccggcc 120 tcctactgct gtgcgtcccc acccgcgacg tccgggagcc gcccgccctc aagtatggca 180 tcgtcctgga cgctggttct tcacacacgt ccatgtttat ctacaagtgg ccggcagaca 240 aggagaacga cacaggcatt gtgggccagc acagctcctg tgatgttcca ggtgggggca 300 tctccagcta tgcagacaac ccttctgggg ccagccagag tcttgttgga tgcctcgaac 360 aggcgcttca ggatgtgccc aaagagagac acgcgggcac acccctctac ctgggagcca 420 cagcgggtat gcgcctgctc aacctgacca atccagaggc ctcgaccagt gtgctcatgg 480 cagtgactca cacactgacc cagtacccct ttgacttccg gggtgcacgc atcctctcgg 540 gccaggaaga gggggtgttt ggctgggtga ctgccaacta cctgctggag aacttcatca 600 agtacggctg ggtgggccgg tggttccggc cacggaaggg gacactgggg gccatggacc 660 tggggggtgc ctctacccag atcacttttg agacaaccag tccagctgag 
gacagagcca 720 gcgaggtcca gctgcatctc tacggccagc actaccgagt ctacacccac agcttcctct 780 gctatggccg tgaccaggtc ctccagaggc tgctggccag cgccctccag acccacggct 840 tccacccctg ctggccgagg ggcttttcca cccaagtgct gctcggggat gtgtaccagt 900 caccatgcac catggcccag cggccccaga acttcaacag cagtgccagg gtcagcctgt 960 cagggagcag tgacccccac ctctgccgag atctggtttc tgggctcttc agcttctcct 1020 cctgcccctt ctcccgatgc tctttcaatg gggtcttcca gcccccagtg gctgggaact 1080 ttgtggcctt ctctgccttc ttctacactg tggacttttt gcggacttcg atggggctgc 1140 ccgtggccac cctgcagcag ctggaggcag ccgcagtgaa tgtctgcaac cagacctggg 1200 ctcagctgca agctcgggtg ccagggcaac gggcccgcct ggccgactac tgcgccgggg 1260 ccatgttcgt gcagcagctg ctgagtcgcg gctacggctt cgacgagcgc gccttcggcg 1320 gcgtgatctt ccagaagaag gccgcggaca ctgcagtggg ctgggcgctc ggctacatgc 1380 tgaacctgac caacctgatc cccgccgacc cgccggggct gcgcaagggc acagacttca 1440 gctcctgggt cgtcctcctg ctgctcttcg cctccgcgct cctggctgcg cttgtcctgc 1500 tgctgcgtca ggtgcactcc gccaagctgc caagcaccat ttaggggccg acgggggcag 1560 ctgccccatc cctcccccaa cccctgtatc cccaccccgt actcccaccc ctcccacaac 1620 ccctgtacct cccacccctg tatccccacc cctccaccca cccctctccc aacctctctc 1680 cccgcccctg tatcctgcat tcctccaccc accctctatc ccccaccgct ccaccccacc 1740 actgtcttct ccatccttcc accccaccct cagcgtctct gcccctaagg cagcccagga 1800 aataggaact gagactctgg tacccacagg agcctgggtg ggcaaagagc gctcaatcca 1860 gctccttgaa cccctccagc ccgcttcagc ctgggcatca ctgcaggccc cgtgctcctc 1920 ctcctcctcc tcagggctgg gtctccagag agtggggcct tggtcctgag aatcagccct 1980 tagaggctcc ttctgtgtag tctgggtctg tactggggag ggtcacagcc cacgggctgg 2040 cagccagccc agcacctact tgtaaaaatt ttgtaataaa aagtttttcc tagagacgtg 2100 <210> 8 <211> 1998 <212> DNA <213> Homo sapiens <400> 8 gcgcgcgcgt tttccttgtt cctggtcaac aaagaaatgt ggagtgtctt ggctgaatcc 60 tcatacagac aagatcatta tggtgctgtt aggtaggact tgtatccaga tgtaaggttg 120 aaaaagtgat ataataaagg aaccaaggag aaaattcaga aggaaagaaa aaattgcctc 180 tgcaggtgtg cgagcaggat tgcttctgca acaaaagcct ccacccagcc acatcttggg 240 aaaagaatgg 
ccacttcttg gggcacagtc tttttcatgc tggtggtatc ctgtgtttgc 300 agcgctgtct cccacaggaa ccagcagact tggtttgagg gtatcttcct gtcttccatg 360 tgccccatca atgtcagcgc cagcaccttg tatggaatta tgtttgatgc agggagcact 420 ggaactcgaa ttcatgttta cacctttgtg cagaaaatgc caggacagct tccaattcta 480 gaaggggaag tttttgattc tgtgaagcca ggactttctg cttttgtaga tcaacctaag 540 cagggtgctg agaccgttca agggctctta gaggtggcca aagactcaat cccccgaagt 600 cactggaaaa agaccccagt ggtcctaaag gcaacagcag gactacgctt actgccagaa 660 cacaaagcca aggctctgct ctttgaggta aaggagatct tcaggaagtc acctttcctg 720 gtaccaaagg gcagtgttag catcatggat ggatccgacg aaggcatatt agcttgggtt 780 actgtgaatt ttctgacagg tcagctgcat ggccacagac aggagactgt ggggaccttg 840 gacctagggg gagcctccac ccaaatcacg ttcctgcccc agtttgagaa aactctggaa 900 caaactccta ggggctacct cacttccttt gagatgttta acagcactta taagctctat 960 acacatagtt acttgggatt tggattgaaa gctgcaagac tagcaaccct gggagccctg 1020 gagacagaag ggactgatgg gcacactttc cggagtgcct gtttaccgag atggttggaa 1080 gcagagtgga tctttggggg tgtgaaatac cagtatggtg gcaaccaaga aggggaggtg 1140 ggctttgagc cctgctatgc cgaagtgctg agggtggtac gaggaaaact tcaccagcca 1200 gaggaggtcc agagaggttc cttctatgct ttctcttact attatgaccg agctgttgac 1260 acagacatga ttgattatga aaaggggggt attttaaaag ttgaagattt tgaaagaaaa 1320 gccagggaag tgtgtgataa cttggaaaac ttcacctcag gcagtccttt cctgtgcatg 1380 gatctcagct acatcacagc cctgttaaag gatggctttg gctttgcaga cagcacagtc 1440 ttacagctca caaagaaagt gaacaacata gagacgggct gggccttggg ggccaccttt 1500 cacctgttgc agtctctggg catctcccat tgaggccacg tacttccttg gagacctgca 1560 tttgccaaca cctttttaag gggaggagag agcacttagt ttctgaacta gtctgggaca 1620 tcctggactt gagcctagag atttaggttt aattaatttt acacatctaa tgtgaactgc 1680 tgcctaacca ctcaagagta cacagctggc accagagcat cacagagagc cctgtgagcc 1740 aaaaagtata gttttggaac ttaaccttgg agtgagagcc cagggacagg tccctggaaa 1800 ccaaagaaaa atcgcatttc aaccctttga gtgcctcatt ccactgaata tttaaatttt 1860 cctcttaaat ggtaaactga cttattgcaa tcccaagacc catcaatatc agtatttttt 1920 tcctccctat acagtgccct gcccaccctt 
atctgcaccc acctcccctg aaaaagagag 1980 aaaaaaaaaa aaaaaaaa 1998 <210> 9 <211> 4996 <212> DNA <213> Homo sapiens <400> 9 gcgccggggt gtgcgcccgg ccgggtgtgc ggagagcgag ggagcgcgct ccctcccgac 60 gcgcgggccg cagcggccaa gcccgagggt gcgtggcgcc cccgcccgcc cggcccggcc 120 cggccatggc ccccgcccgg ggccgcctgc cccctgcgct ctgggtcgtc acggccgcgg 180 cggc9gcggc cacctgcgtg tccgcggcgc gcggcgaagt gaatttgctg gacacgtcga 240 ccatccac9g ggactggggc tggctcacgt atccggctca tgggtgggac tccatcaacg 300 aggtggacga gtccttccag cccatccaca cgtaccaggt ttgcaacgtc atgagcccca 360 accagaacaa ctggctgcgc acgagctggg tcccccgaga cggcgcccgg cgcgtctatg 420 ctgagatcaa gtttaccctg cgcgactgca acagcatgcc tggtgtgctg ggcacctgca 480 aggagacctt caacctctac tacctggagt cggaccgcga cctgggggcc agcacacaag 540 aaagccagtt cctcaaaatc gacaccattg cggccgacga gagcttcaca ggtgccgacc 600 ttggtgtgcg gcgtctcaag ctcaacacgg aggtgcgcag tgtgggtccc ctcagcaagc 660 gcggcttcta cctggccttc caggacatag gtgcctgcct ggccatcctc tctctccgca 720 tctactataa gaagtgccct gccatggtgc gcaatctggc tgccttctcg gaggcagtga 780 cgggggccga ctcgtcctca ctggtggagg tgaggggcca gtgcgtgcgg cactcagagg 840 agcgggacac acccaagatg tactgcagcg cggagggcga gtggctcgtg cccatcggca 900 aatgcgtgtg cagtgccggc tacgaggagc ggcgggatgc ctgtgtggcc tgtgagctgg 960 gcttctacaa gtcagcccct ggggaccagc tgtgtgcccg ctgccctccc cacagccact 1020 ccgcagctcc agccgcccaa gcctgccact gtgacctcag ctactaccgt gcagccctgg 1080 acccgccgtc ctcagcctgc acccggccac cctcggcacc agtgaacctg atctccagtg 1140 tgaatgggac atcagtgact ctggagtggg cccctcccct ggacccaggt ggccgcagtg 1200 acatcaccta caatgccgtg tgccgccgct gcccctgggc actgagccgc tgcgaggcat 1260 gtgggagcgg cacccgcttt gtgccccagc agacaagcct ggtgcaggcc agcctgctgg 1320 tggccaacct gctggcccac atgaactact ccttctggat cgaggccgtc aatggcgtgt 1380 ccgacctgag ccccgagccc cgccgggccg ctgtggtcaa catcaccacg aaccaggcag 1440 ccccgtccca ggtggtggtg atccgtcaag agcgggcggg gcagaccagc gtctcgctgc 1500 tgtggcagga gcccgagcag ccgaacggca tcatcctgga gtatgagatc aagtactacg 1560 agaaggacaa ggagatgcag agctactcca ccctcaaggc 
cgtcaccacc agagccaccg 1620 tctccggcct caagccgggc acccgctacg tgttccaggt ccgagcccgc acctcagcag 1680 gctgtggccg cttcagccag gccatggagg tggagaccgg gaaaccccgg ccccgctatg 1740 acaccaggac cattgtctgg atctgcctga cgctcatcac gggcctggtg gtgcttctgc 1800 tcctgctcat ctgcaagaag aggcactgtg gctacagcaa ggccttccag gactcggacg 1860 aggagaagat gcactatcag aatggacagg cacccccacc tgtcttcctg cctctgcatc 1920 accccccggg aaagctccca gagccccagt tctatgcgga accccacacc tacgaggagc 1980 caggccgggc gggccgcagt ttcactcggg agatcgaggc ctctaggatc cacatcgaga 2040 aaatcatcgg ctctggagac tccggggaag tctgctacgg gaggctgcgg gtgccagggc 2100 agcgggatgt gcccgtggcc atcaaggccc tcaaagccgg ctacacggag agacagaggc 2160 gggacttcct gagcgaggcg tccatcatgg ggcaattcga ccatcccaac atcatccgcc 2220 tcgagggtgt cgtcacccgt ggccgcctgg caatgattgt gactgagtac atggagaacg 2280 gctctctgga caccttcctg aggacccacg acgggcagtt caccatcatg cagctggtgg 2340 gcatgctgag aggagtgggt gccggcatgc gctacctctc agacctgggc tatgtccacc 2400 gagacctggc cgcccgcaac gtcctggttg acagcaacct ggtctgcaag gtgtctgact 2460 tcgggctctc acgggtgctg gaggacgacc cggatgctgc ctacaccacc acgggcggga 2520 agatccccat ccgctggacg gccccagagg ccatcgcctt ccgcaccttc tcctcggcca 2580 gcgacgtgtg gagcttcggc gtggtcatgt gggaggtgct ggcctatggg gagcggccct 2640 actggaacat gaccaaccgg gatgtcatca gctctgtgga ggaggggtac cgcctgcccg 2700 cacccatggg ctgcccccac gccctgcacc agctcatgct cgactgttgg cacaaggacc 2760 gggcgcagcg gcctcgcttc tcccagattg tcagtgtcct cgatgcgctc atccgcagcc 2820 ctgagagtct cagggccacc gccacagtca gcaggtgccc accccctgcc ttcgtccgga 2880 gctgctttga cctccgaggg ggcagcggtg gcggtggggg cctcaccgtg ggggactggc 2940 tggactccat ccgcatgggc cggtaccgag accacttcgc tgcgggcgga tactcctctc 3000 tgggcatggt gctacgcatg aacgcccagg acgtgcgcgc cctgggcatc accctcatgg 3060 gccaccagaa gaagatcctg ggcagcattc agaccatgcg ggcccagctg accagcaccc 3120 aggggccccg ccggcacctc tgatgtacag ccagcagggc ccaggcagcc accaagccca 3180 ccccaggtca tgccagcggc agaggacgtg aggggctggc agcaggcagg gcggccccag 3240 gcctctgccc tcctctcagg tgctggagga gctgaaggct tcgccacagg 
acctggagtt 3300 atcaggggtc aggcgcctgg gaaggggcct ttggtggcca ccctggtgag gacacctgtc 3360 ccccagggca ggcaccttct cttttccaga gcctggggcc tccacgtcac agagtccaac 3420 agggacatca ctcgcctgcc tctgtgtgcg tgcatgtgtg tgtgtggtgg ggggtgttct 3480 cacaaggtca tgggatctca tgtgaacagt gtgtgatcaa gtgtgtccac cccttcgggt 3540 ctcagcatgg acgtgtgcat gttatgagcg tgtgcttatc cgttaaggct ggaggcacat 3600 gtgggtgatg gtggatgatg tgtcatgaat gaggaggtgt gtgagcaggg aactcagtgt 3660 gacaccgcca ggtccagcac ccatgggggc gggggagagg ctcaccccta cgtcccccca 3720 cacctggagg ctggagccag gggccacttc tgaactgcac cagcaccagg cccaccctcg 3780 tctctgcctg ggtgagccca ccccggctgt atctcaggtc tgggtcctcc ctctagccga 3840 ggaggccacc tgcagcctcc acccggctct caccgctgct tcaacaggaa aacagggttc 3900 ccggtcagtc cggctggccg ccttcatgga ggcatcatgg cagagcacat gagatgtcct 3960 cagctgggct tggctgcctg gccagggccg ggggctcagc agcctcctct agcctctgat 4020 gccttccctc cacggcccag gtctcctcac tcaaagtccc ttcgccaacc tttcaatgcc 4080 cagccctgac acctgccctt tgtcccccag gcctaggatc agggaccaga ggatcctatc 4140 ttctcagcac ccagcccacc ccttcctgta gcatatgggg agactaaggc ctggagagag 4200 gggtgatgcc ccgtcccagg ttgcactgca accaagtgtc agagtcgggg ctccggcctc 4260 ctgccaaggc tcttgtcccc atacaccatc ccacaagggc gctgggggtg gaagtgccct 4320 ggaagcccct cccctctcac actgacctcc ccccttacgg cccaccaggg tatgtaaata 4380 tctcttttct accatgtcag aatatttttt cctcactcct gacaatgcaa aaatggtctt 4440 caaagcacat aaaaagcacc cagggtgaga aagccccatc ccgggggccg ttggcaggca 4500 gggaagcagg aaccccaccg tgtgccccct gccagcccca gagggagtgg cgagcccagc 4560 tgcccagccc tgccccccct ccccatagcc agcacagcta tcccgcgggg acaccagcac 4620 tgagccccct ctccctcctg caataattcg gggagtctca gccccatcca ggtgccgcgg 4680 ccagctctct acacctctat atattatatt actatatagc cgagctgttc ttccttccta 4740 tggaagtcgg aaacatggtc agaacacgat ctgggggggg gatcctgtct tcctccccac 4800 cccaccccac tcttacccaa tttctgggct ctggatcctc acagtcatgg aggcaccgtg 4860 ggcctggcac ttgcaaaagt gtggcccctc actctagtgt gtggtccctc tcagggtcct 4920 ggggatctgc ctctctgtgg tctccatcct gactcttgaa cttacccaca
ataagaataa 4980 attctgcctc atcttt 4996 <210> 10 <211> 2901 <212> DNA <213> Homo sapiens <400> 10 cagcctccct ctcccacctc tgtctgcccg ctgcctcttg tctagctgct gtcaggagct 60 gactgcctcc agggctggaa tcctgtgctc cctctgtgcc cagagcccca cgatgtcggc 120 caacgccaca ctgaagccac tctgccccat cctggagcag atgagccgtc tccagagcca 180 cagcaacacc agcatccgct acatcgacca cgcggccgtg ctgctgcacg ggctggcctc 240 gctgctgggc ctggtggaga atggagtcat cctcttcgtg gtgggctgcc gcatgcgcca 300 gaccgtggtc accacctggg tgctgcacct ggcgctgtcc gacctgttgg cctctgcttc 360 cctgcccttc ttcacctact tcttggccgt gggccactcg tgggagctgg gcaccacctt 420 ctgcaaactg cactcctcca tcttctttct caacatgttc gccagcggct tcctgctcag 480 cgccatcagc ctggaccgct gcctgcaggt ggtgcggccg gtgtgggcgc agaaccaccg 540 caccgtggcc gcggcgcaca aagtctgcct ggtgctttgg gcactagcgg tgctcaacac 600 ggtgccctat ttcgtgttcc gggacaccat ctcgcggctg gacgggcgca ttatgtgcta 660 ctacaatgtg ctgctcctga acccggggcc tgaccgcgat gccacgtgca actcgcgcca 720 ggcggccctg gccgtcagca agttcctgct ggccttcctg gtgccgctgg cgatcatcgc 780 ctcgagccac gcggccgtga gcctgcggtt gcagcaccgc ggccgccggc ggccaggccg 840 cttcgtgcgc ctggtggcag ccgtcgtggc cgccttcgcg ctctgctggg ggccctacca 900 cgtgttcagc ctgctggagg cgcgggcgca cgcaaacccg gggctgcggc cgctcgtgtg 960 gcgcgggctg cccttcgtca ccagcctggc cttcttcaac agcgtggcca acccggtgct 1020 ctacgtgctc acctgccccg acatgctgcg caagctgcgg cgctcgctgc gcacggtgct 1080 ggagagcgtg ctggtggacg acagcgagct gggtggcgcg ggaagcagcc gccgccgccg 1140 cacctcctcc accgcccgct cggcctcccc tttagctctc tgcagccgcc cggaggaacc 1200 gcggggcccc gcgcgtctcc tcggctggct gctgggcagc tgcgcagcgt ccccgcagac 1260 gggccccctg aaccgggcgc tgagcagcac ctcgagttag aacccggccc acgtagggcg 1320 gcactcacac gcgaaagtat caccagggtg ccgcggttca attcgatatc cggactcctg 1380 ccgcagtgat caaagtccga ggggcgggac ccaggcacct gcattttaaa gcgccccggg 1440 agactctgaa tctttttcag aaacagtgag ttaaagcagt gcttctcaaa ccttgatgtg 1500 cctgtgaatc acctaggggt cttgttaagt gcagtctgat ccaggaggcc ggggccgggt 1560 actgagagtc tgcacttaac aagctcccag gccgagaagc cagtgcggca ggttcacagg 1620 
cgaggcctgg agtaacacaa agtgaaactc gtaatagact tcccactcta gggcagtgga 1680 gtcggaaggg cacacggggt gcgtctcccc ggagttcagt tttaccagat gatgggggag 1740 gggggaagga gttttatgtt aaaccatcca tgtatttttg gagaagagag aggaaaggtt 1800 tgagaagcac tgttccagcc tgccctcttc atttagccaa tgcttactgc gctagacgct 1860 tcatcccaca atcttaaggg gcagcttcta ttagccagtc tttacagctg agcacattct 1920 ggctcaggga ggttaagtga cttgcccagt ttcagggcta acgaccacag ggtctgcact 1980 ctaaccctag gcatcacatg ctcaatgact ctctggtgag cgaggacatt ctctgaccta 2040 ctcgagggac ttaagatgct accttgtgac ccagcactgc ccaaagtgct tccaaggcag 2100 aagcagcagg ggatggcgtg gtcaagcact cgggaaacct ggggctaatc aaatccaatg 2160 ggggaaatga ctaaaagtct tcggtcgtta gaagttgaat gggcacagca actctaagac 2220 tacagcacac gtcatttctt agctaagcgg accagcctcc ctgtcggcct ggtgttctgt 2280 gggatccctc tgggcactgg taatcccaag atctgtgcag ccccgcctcc aggccacatg 2340 gggctgggca gctaccattt cccttttgcg gatgggaggg gtaacttgca cctctgacct 2400 atcacttcca ctgcaccccg tctcattcct ccacctgccg tggacttggg gtcagagact 2460 gctgtgtttg agctctgcag cccagggacc gaaaagttgg tgtcaatgaa ttttgcttgg 2520 tggatgaaat gtcagtggaa gaagcagatg agaaactctt gagatcttgg tcctgtgttt 2580 tttctgccac caaaggccag ggtcactgaa ggcctggccc acagcaggtg ctgagcaaag 2640 ggaacagtga ggtgcccagc tagctgcaga gccaccctgt gttgacacct cgcccctgct 2700 ccctcccatc ccttccccct ttactcatag cacttccccc attggacacg tggtgcattt 2760 tgcttgttta ttatgttttc tctccatcag aatgaaagct cctcgagggc agggactttg 2820 gtctattgtc tgtatttgcc ggtgcctagg attgtgcctg tatgcaacag gcactcaata 2880 aatatttttg ctgtagactg g 2901 <210> 11 <211> 5327 <212> DNA <213> Homo sapiens <400> 11 cgaatgttgt tgttggtggc ggcggcgagc ggagccggag gagccgccgc aaagatggag 60 gagccgtcga ggaggtgctg ccgccgctgc cgccgccgct gctgccgccg ccgcccgcga 120 agccggagct cgagccgcag cggggatgcc gttctgagtg cctgactgcc tcgccccgaa 180 ggatggcctc ggatgggcat tagaggcacg gcggccccgg gctcccgtcc cgtccgtctg 240 tctgttatcg tctgtctctc ttgacatcac cgcagctcca ccccctcccg tcccagcccc 300 caacgccagc ttcctgcagg cccagagccg gcatgaactc tcccaacgag tcggcagatg 360 
ggatgtcagg tcgggaacca tccttggaaa tcctgccgcg gacttctctg cacagcatcc 420 ctgtgacagt ggaggtgaag ccggtgctgc caagagccat gcccagttcc atggggggtg 480 ggggtggagg cagccccagc cctgtggagc tacggggggc tctggtgggc tctgtggacc 540 ccacactgcg ggagcagcaa ctgcagcagg agctcctggc gctcaagcag cagcagcagc 600 tgcagaagca gctcctgttc gctgagttcc agaaacagca tgaccacctg acaaggcagc 660 atgaggtcca gctgcagaag cacctcaagc agcagcagga gatgctggca gccaagcagc 720 agcaggagat gctggcagcc aagcggcagc aggagctgga gcagcagcgg cagcgggagc 780 agcagcggca ggaagagctg gagaagcagc ggctggagca gcagctgctc atcctgcgga 840 acaaggagaa gagcaaagag agtgccattg ccagcactga ggtaaagctg aggctccagg 900 aattcctctt gtcgaagtca aaggagccca caccaggcgg cctcaaccat tccctcccac 960 agcaccccaa atgctgggga gcccaccatg cttctttgga ccagagttcc cctccccaga 1020 gcggcccccc tgggacgcct ccctcctaca aactgccttt gcctgggccc tacgacagtc 1080 gagacgactt ccccctccgc aaaacagcct ctgaacccaa cttgaaagtg cgttcaaggc 1140 taaaacagaa ggtggctgag cggagaagca gtcccctcct gcgtcgcaag gatgggactg 1200 ttattagcac ctttaagaag agagctgttg agatcacagg tgccgggcct ggggcgtcgt 1260 ccgtgtgtaa cagcgcaccc ggctccggcc ccagctctcc caacagctcc
cacagcacca 1320 tcgctgagaa tggctttact ggctcagtcc ccaacatccc cactgagatg ctccctcagc 1380 accgagccct ccctctggac agctccccca accagttcag cctctacacg tctccttctc 1440 tgcccaacat ctccctaggg ctgcaggcca cggtcactgt caccaactca cacctcactg 1500 cctccccgaa gctgtcgaca cagcaggagg ccgagaggca ggccctccag tccctgcggc 1560 agggtggcac gctgaccggc aagttcatga gcacatcctc tattcctggc tgcctgctgg 1620 gcgtggcact ggagggcgac gggagccccc acgggcatgc ctccctgctg cagcatgtgc 1680 tgttgctgga gcaggcccgg cagcagagca ccctcattgc tgtgccactc cacgggcagt 1740 ccccactagt gacgggtgaa cgtgtggcca ccagcatgcg gacggtaggc aagctcccgc 1800 ggcatcggcc cctgagccgc actcagtcct caccgctgcc gcagagtccc caggccctgc 1860 agcagctggt catgcaacaa cagcaccagc agttcctgga gaagcagaag cagcagcagc 1920 tacagctggg caagatcctc accaagacag gggagctgcc caggcagccc accacccacc 1980 ctgaggagac agaggaggag ctgacggagc agcaggaggt cttgctgggg gagggagccc 2040 tgaccatgcc ccgggagggc tccacagaga gtgagagcac acaggaagac ctggaggagg 2100 aggacgagga agacgatggg gaggaggagg aggattgcat ccaggttaag gacgaggagg 2160 gcgagagtgg tgctgaggag gggcccgact tggaggagcc tggtgctgga tacaaaaaac 2220 tgttctcaga tgcccagccg ctgcagcctt tgcaggtgta ccaggcgccc ctcagcctgg 2280 ccactgtgcc ccaccaggcc ctgggccgta cccagtcctc ccctgctgcc cctgggggca 2340 tgaagagccc cccagaccag cccgtcaagc acctcttcac cacaggtgtg gtctacgaca 2400 cgttcatgct aaagcaccag tgcatgtgcg ggaacacaca cgtgcaccct gagcatgctg 2460 gccggatcca gagcatctgg tcccggctgc aggagacagg cctgcttagc aagtgcgagc 2520 ggatccgagg tcgcaaagcc acgctagatg agatccagac agtgcactct gaataccaca 2580 ccctgctcta tgggaccagt cccctcaacc ggcagaagct agacagcaag aagttgctcg 2640 gccccatcag ccagaagatg tatgctgtgc tgccttgtgg gggcatcggg gtggacagtg 2700 acaccgtgtg gaatgagatg cactcctcca gtgctgtgcg catggcagtg ggctgcctgc 2760 tggagctggc cttcaaggtg gctgcaggag agctcaagaa tggatttgcc atcatccggc 2820 ccccaggaca ccacgccgag gaatccacag ccatgggatt ctgcttcttc aactctgtag 2880 ccatcaccgc aaaactccta cagcagaagt tgaacgtggg caaggtcctc atcgtggact 2940 gggacattca ccatggcaat ggcacccagc aggcgttcta caatgacccc tctgtgctct 
3000 acatctctct gcatcgctat gacaacggga acttctttcc aggctctggg gctcctgaag 3060 aggttggtgg aggaccaggc gtggggtaca atgtgaacgt ggcatggaca ggaggtgtgg 3120 acccccccat tggagacgtg gagtacctta cagccttcag gacagtggtg atgcccattg 3180 cccacgagtt ctcacctgat gtggtcctag tctccgccgg gtttgatgct gttgaaggac 3240 atctgtctcc tctgggtggc tactctgtca ccgccagatg ttttggccac ttgaccaggc 3300 agctgatgac cctggcaggg ggccgggtgg tgctggccct ggagggaggc catgacttga 3360 ccgccatctg tgatgcctct gaggcttgtg tctcggctct gctcagtgta gagctgcagc 3420 ccttggatga ggcagtcttg cagcaaaagc ccaacatcaa cgcagtggcc acgctagaga 3480 aagtcatcga gatccagagc aaacactgga gctgtgtgca gaagttcgcc gctggtctgg 3540 gccggtccct gcgagaggcc caagcaggtg agaccgagga ggccgagact gtgagcgcca 3600 tggccttgct gtcggtgggg gccgagcagg cccaggctgc ggcagcccgg gaacacagcc 3660 ccaggccggc agaggagccc atggagcagg agcctgccct gtgacgcccc ggcccccatc 3720 cctctgggct tcaccattgt gattttgttt attttttcta ttaaaaacaa aaagtcacac 3780 attcaacaag gtgtgccgtg tgggtctctc agccttgccc ctcctgctcc tctacgctgc 3840 ctcaggcccc cagccctgtg gcttccacct cagctctaga agcctgctcc ctctgcaggg 3900 ggtggtggtg tcttcccagc cctgtcccat gtgtccctcc ccccattttc ctgcattctg 3960 tctgtccttt tcctccttgg agcctgggcc agctcaaggt gggcacgggg gcccagacag 4020 tactctccag ttctggggcc ccccgagtga ggagggaacg ggaagtcggt gccttggttt 4080 cagctgattt ggggggaaat gccttaattt cactctcctc ccttctccag cctcagggga 4140 ggatctggag gatccactac tgtctttaag atgcagagtg gaggggaggt gggcacccac 4200 cctgcgattc tccacccttt ccccttcttt cgtcctcacc atctctgcag acccctctcc 4260 tcctccttcc tcttggtctc agcactgatg ggaggctggt gcccaagctg tggcctgcag 4320 tctgtgagga gggctgtctt gcctcacact cctcacagcc tacttcccct tccccggggc 4380 tgagagggtg aaagtgtgtg gggaaggaga ggactggttt cctgggttct caggggccag 4440 gaggagtaac agaaccaggt ctgctcccca ccttactcgg atggcctccc tgcccctctg 4500 ctggcacagc ctgggcaagg ggagaaggtg gtccctgcag aggggctcca ggctggtgag 4560 agcccccctg ctgtcaggac cagattttcc cagccatcca gcatgctgcg gggagaaggg 4620 gcagaggctc acctccctcc tggggccttt tgttttggat cctggggatg gtgagaatgg 4680 
aggttctaga aggggtaagg ccagaaccca gggatccagg agtcggctct cagctggagc 4740 ttccatacct tctgggctcc ctttgctgac caccagecca agggagctaa gaccaggagg 4800 gggctgggcg ctgtcccttc tctttcccag gagccctgcc aggggctgtg ggcctacaag 4860 gcttccaggg gatgccatcc agcctgtagg aaaccaaaga tgggaagtgg ctcctagggg 4920 gctgactctt ccttcctcct cctccccagt accacatata ctttctctcc ttctatctcc 4980 agggccccac caatctgttt acatatttat tatcctatgg gggcctgagc aggattgagg 5040 gagccagggg aggggcagga gtcccagcac catcggttca tagtgtgctt gtgtgtttgt 5100 tttagatcct cctgggggat ggggatgggg ccaggctcag tgtactaggc ctctctgtgc 5160 tgagccccag gctcccggcc ccttacccac tctctccctg tggctggtct ggttctcatg 5220 taaacccact ccttgctttg tctccctgga tatggatttc agttaagtat tttgtaaccc 5280 gttacactgt gtgtccttgt gtaaataaac ttgtttctgg cagtgcc 5327 <210> 12 <211> 3002 <212> DNA <213> Homo sapiens <400> 12 gccagtcacc ttcagtttct ggagctggcc gtcaacatgt cctttcctaa ggcgcccttg 60 aaacgattca atgacccttc tggttgtgca ccatctccag gtgcttatga tgttaaaact 120 ttagaagtat tgaaaggacc agtatccttt cagaaatcac aaagatttaa acaacaaaaa 180 gaatctaaac aaaatcttaa tgttgacaaa gatactacct tgcctgcttc agctagaaaa 240 gttaagtctt cggaatcaaa ggaatctcaa aagaatgata aagatttgaa gatattagag 300 aaagagattc gtgttcttct acaggaacgt ggtgcccagg acagccggat ccaggatctg 360 gaaactgagt tggaaaagat ggaagcaagg ctaaatgctg cactaaggga aaaaacatct 420 ctctctgcaa ataatgctac actggaaaaa caacttattg aattgaccag gactaatgaa 480 ctactaaaat ctaagttttc tgaaaatggt aaccagaaga atttgagaat tctaagcttg 540 gagttgatga aacttagaaa caaaagagaa acaaagatga ggggtatgat ggctaagcaa 600 gaaggcatgg agatgaagct gcaggtcacc caaaggagtc tcgaagagtc tcaagggaaa 660 atagcccaac tggagggaaa acttgtttca atagagaaag aaaagattga tgaaaaatct 720 gaaacagaaa aactcttgga atacatcgaa gaaattagtt gtgcttcaga tcaagtggaa 780 aaatacaagc tagatattgc ccagttagaa gaaaatttga aagagaagaa tgatgaaatt 840 ttaagcctta agcagtctct tgaggagaat attgttatat tatctaaaca agtagaagat 900 ctaaatgtga aatgtcagct gcttgaaaaa gaaaaagaag accatgtcaa caggaataga 960 gaacacaacg aaaatctaaa tgcagagatg caaaacttaa aacagaagtt 
tattcttgaa 1020 caacaggaac gtgaaaagct tcaacaaaaa gaattacaaa ttgattcact tctgcaacaa 1080 gagaaagaat tatcttcgag tcttcatcag aagctctgtt cttttcaaga ggaaatggtt 1140 aaagagaaga atctgtttga ggaagaatta aagcaaacac tggatgagct tgataaatta 1200 cagcaaaagg aggaacaagc tgaaaggctg gtcaagcaat tggaagagga agcaaaatct 1260 agagctgaag aattaaaact cctagaagaa aagctgaaag ggaaggaggc tgaactggag 1320 aaaagtagtg ctgctcatac ccaggccacc ctgcttttgc aggaaaagta tgacagtatg 1380 gtgcaaagcc ttgaagatgt tactgctcaa tttgaaagct ataaagcgtt aacagccagt 1440 gagatagaag atcttaagct ggagaactca tcattacagg aaaaagcggc caaggctggg 1500 aaaaatgcag aggatgttca gcatcagatt ttggcaactg agagctcaaa tcaagaatat 1560 gtaaggatgc ttctagatct gcagaccaag tcagcactaa aggaaacaga aattaaagaa 1620 atcacagttt cttttcttca aaaaataact gatttgcaga accaactcaa gcaacaggag 1680 gaagacttta gaaaacagct ggaagatgaa gaaggaagaa aagctgaaaa agaaaataca 1740 acagcagaat taactgaaga aattaacaag tggcgtctcc tctatgaaga actatataat 1800 aaaacaaaac cttttcagct acaactagat gcttttgaag tagaaaaaca ggcattgttg 1860 aatgaacatg gtgcagctca ggaacagcta aataaaataa gagattcata tgctaaatta 1920 ttgggtcatc agaatttgaa acaaaaaatc aagcatgttg tgaagttgaa agatgaaaat 1980 agccaactca aatcggaagt atcaaaactc cgctgtcagc ttgctaaaaa aaaacaaagt 2040 gagacaaaac ttcaagagga attgaataaa gttctaggta tcaaacactt tgatccttca 2100 aaggcttttc atcatgaaag taaagaaaat tttgccctga agaccccatt aaaagaaggc 2160 aatacaaact gttaccgagc tcctatggag tgtcaagaat catggaagta aacatctgag 2220 aaacctgttg aagattattt cattcgtctt gttgttattg atgttgctgt tattatattt 2280 gacatgggta ttttataatg ttgtatttaa ttttaactgc caatccttaa atatgtgaaa 2340 ggaacatttt ttaccaaagt gtcttttgac attttatttt ttcttgcaaa tacctcctcc 2400 ctaatgctca cctttatcac ctcattctga accctttcgc tggctttcca gcttagaatg 2460 catctcatca acttaaaagt cagtatcata ttattatcct cctgttctga aaccttagtt 2520 tcaagagtct aaaccccaga ttcttcagct tgatcctgga ggtcttttct agtctgagct 2580 tctttagcta ggctaaaaca ccttggcttg ttattgcctc tactttgatt ctgataatgc 2640 tcacttggtc ctacctatta tccttctact tgtccagttc aaataagaaa taaggacaag 
2700 cctaacttca tagaaacctc tctattttta atcagttgtt taataattta caggttctta 2760 ggctccatcc tgtttgtatg aaattataat ctgtggattg gcctttaagc ctgcattctt 2820 aacaaactct tcagttaatt cttagataca ctaaaaatct gagaaactct acatgtaact 2880 atttcttcag agtttgtcat atactgcttg tcatctgcat gtctactcag catttgatta 2940 acatttgtgt aatatgaaat aaaattacac agtaagtcat ttaaccaaaa aaaaaaaaaa 3000 aa 3002 <210> 13 <211> 1666 <212> DNA <213> Homo sapiens <400> 13 ctccataagg cacaaacttt cagagacagc agagcacaca agcttctagg acaagagcca 60 ggaagaaacc accggaagga accatctcac tgtgtgtaaa catgacttcc aagctggccg 120 tggctctctt ggcagccttc ctgatttctg cagctctgtg tgaaggtgca gttttgccaa 180 ggagtgctaa agaacttaga tgtcagtgca taaagacata ctccaaacct ttccacccca 240 aatttatcaa agaactgaga gtgattgaga gtggaccaca ctgcgccaac acagaaatta 300 ttgtaaagct ttctgatgga agagagctct gtctggaccc caaggaaaac tgggtgcaga 360 gggttgtgga gaagtttttg aagagggctg agaattcata aaaaaattca ttctctgtgg 420 tatccaagaa tcagtgaaga tgccagtgaa acttcaagca aatctacttc aacacttcat 480 gtattgtgtg ggtctgttgt agggttgcca gatgcaatac aagattcctg gttaaatttg 540 aatttcagta aacaatgaat agtttttcat tgtaccatga aatatccaga acatacttat 600 atgtaaagta ttatttattt gaatctacaa aaaacaacaa ataattttta aatataagga 660 ttttcctaga tattgcacgg gagaatatac aaatagcaaa attgaggcca agggccaaga 720 gaatatccga actttaattt caggaattga atgggtttgc tagaatgtga tatttgaagc 780 atcacataaa aatgatggga caataaattt tgccataaag tcaaatttag ctggaaatcc 840 tggatttttt tctgttaaat ctggcaaccc tagtctgcta gccaggatcc acaagtcctt 900 gttccactgt gccttggttt ctcctttatt tctaagtgga aaaagtatta gccaccatct 960 tacctcacag tgatgttgtg aggacatgtg gaagcacttt aagttttttc atcataacat 1020 aaattatttt caagtgtaac ttattaacct atttattatt tatgtattta tttaagcatc 1080 aaatatttgt gcaagaattt ggaaaaatag aagatgaatc attgattgaa tagttataaa 1140 gatgttatag taaatttatt ttattttaga tattaaatga tgttttatta gataaatttc 1200 aatcagggtt tttagattaa acaaacaaac aattgggtac ccagttaaat tttcatttca 1260 gataaacaac aaataatttt ttagtataag tacattattg tttatctgaa attttaattg 1320 aactaacaat cctagtttga 
tactcccagt cttgtcattg ccagctgtgt tggtagtgct 1380 gtgttgaatt acggaataat gagttagaac tattaaaaca gccaaaactc cacagtcaat 1440 attagtaatt tcttgctggt tgaaacttgt ttattatgta caaatagatt cttataatat 1500 tatttaaatg actgcatttt taaatacaag gctttatatt tttaacttta agatgttttt 1560 atgtgctctc caaatttttt ttactgtttc tgattgtatg gaaatataaa agtaaatatg 1620 aaacatttaa aatataattt gttgtcaaag taaaaaaaaa aaaaaa 1666 <210> 14 <211> 10259 <212> DNA <213> Homo sapiens <400> 14 gctgccggct gaggccQgag ctgccgcctc catgagaggc ttcctcctac accccagggc 60 cagaggaccc tttgccacca gagtgagatc ctagagacca tcatcctggt aaatcccagt 120 gcagacagca tcagctctga ggttcatcat cttcttagca gctcatcagc ttataaacta 180 ctaatcttga gtgggcaaag tttagagcct gggggagacc tcatcctaca gagtggcacc 240 tactcatatg aaaactttgc ccaggtcctt cacaaccccg agatttccca attgctcagc 300 aatagagacc ctgggataca ggccttcctt accgtgtcct gcttagggga aggtgattgg 360 agccacctgg gattatccag ttcccaagag accctgcacc tccggctaaa ccctgagccc 420 actctgccca ccatggacgg cgtggctgag ttctccgagt atgtctctga gactgtggac 480 gtgccatccc catttgacct actagagccc cccacctcag ggggcttcct caagctctcc 540 aagccttgtt gctacatctt cccaggtggt cgtggggact ctgccctctt tgctgtcaat 600 ggtttcaaca tcctggtgga tggtggctct gatcgcaagt cctgtttttg gaagctggta 660 cggcacttgg accgcattga ctcggtgcta ctcacacaca ttggggcaga caacctgcca 720 ggcatcaatg gactactgca gcgcaaagtg gcagagctag aggaggagca gtcccagggc 780 tctagcagtt acagcgactg ggtgaagaac cttatctctc ctgagcttgg agttgtcttt 840 ttcaacgtgc ctgagaagct gcggcttcct gatgcctccc ggaaagccaa gcgtagcatt 900 gaggaggcct gcctcactct gcagcactta aaccgcctgg gcatccaggc tgagcctcta 960 tatcgtgtgg tcagcaatac cattgagcca ctgaccctct tccacaaaat gggtgtgggc 1020 cggctggaca tgtatgtcct caaccctgtc aaggacagca aggagatgca gttcctcatg 1080 caaaagtggg caggcaatag taaagccaag acaggcatcg tgctgcccaa tgggaaggag 1140 gctgagatct ccgtgcccta ccttacctct atcactgctc tggtggtctg gctaccagcc 1200 aatcccactg agaagattgt gcgtgtgctt tttccaggaa atgctcccca aaacaagatc 1260 ttggagggcc tagaaaagct tcggcatctg gacttcctgc gttaccctgt ggccacgcag 1320 
aaggacctgg cttctggggc tgtgcctacc aacctcaagc ccagcaaaat caaacagcgg 1380 gctgatagca aggagagcct caaagccact accaagacgg ccgtgagcaa gttggccaaa 1440 cgggaggagg tggtagaaga gggagccaag gaggcacgtt cagagctggc caaggagtta 1500 gccaagacag agaagaaggc aaaagagtca tctgagaagc ccccagagaa gcctgccaag 1560 cctgagaggg tgaagacaga gtcaagtgag gcactgaagg cagagaagcg aaagctgatc 1620 aaagacaagg tagggaaaaa gcaccttaaa gaaaagatat caaagctgga agaaaaaaaa 1680 gacaaggaga aaaaagagat caaaaaggag aggaaagagc tcaagaagga tgaaggaagg 1740 aaggaggaga agaaggatgc caagaaggag gagaagagga aagataccaa acctgagctc 1800 aagaagattt ccaagccaga cctaaagccc tttactcctg aggtacgtaa gaccctctat 1860 aaaoccaagg tccctggaag agtcaaaata gacaggagcc gtgctatccg tggggagaag 1920 gagctgtctt ctgagcccca gacaccccca gcccagaagg gaactgtacc actcccaacc 1980 atcagtgggc acagggagct ggtcctatcc tcaccagagg acctcacaca ggactttgag 2040 gagatgaagc gtgaggagag ggctttgctg gctgaacaaa gggacacagg actaggagat 2100 aagccattcc ctctagacac tgcagaggag ggacccccaa gtacagctat ccagggaaca 2160 ccaccctctg ttccagggct gggacaagaa gaacatgtga tgaaggagaa agagcttgtc 2220 ccagaggtcc ctgaggaaca aggcagcaag gacagaggcc tagactctgg ggctgaaaca 2280 gaggaagaga aagatacctg ggaggaaaag aagcagaggg aagcagagag gctcccagac 2340 agaacagaag ccagagagga aagtgaacct gaagtaaagg aggatgtgat agaaaaggct 2400 gagttagaag aaatggagga ggtacaccct tcagatgagg aggaagagga cgcgacaaaa 2460 gctgagggtt tttaccaaaa acatatgcag gaacccttga aggtaactcc aaggagccgg 2520 gaggcttttg ggggtcggga attgggactc cagggcaagg cccctgagaa ggagacctcg 2580 ttattcctaa gcagcctgac cacacctgca ggagccactg agcatgtctc ttacatccag 2640 gatgagacaa tccctggcta ctcagagact gagcagacca tctcagatga ggagatccat 2700 gatgagccgg aggagcgccc agctccaccc agatttcata caagtacata tgacctgccc 2760 gggcctgaag gtgctggccc attcgaagcc agccaacctg ccgatagtgc tgttcctgct 2820 acctctggca aagtctatgg aacgccagag actgaactca cctaccccac taacatagtg 2880 gctgcccctt tggctgaaga ggaacatgtg tcctcggcca cttcaatcac tgagtgtgac 2940 aaactttctt cctttgccac atcagtggct gaggaccaat ctgtggcctc acttacagct 3000 ccccagacag 
aggagacagg caagagctcc ctgctgcttg acacagtcac aagcatccct 3060 tcctcccgta ctgaagctac gcagggcttg gactatgtgc catcagctgg taccatctca 3120 cccacctcct cactggaaga agacaagggc ttcaaatcac caccctgtga ggacttctct 3180 gtgactgggg agtcagagaa gagaggagag atcataggga aaggcttgtc tggagagaga 3240 gctgtggaag aggaagagga ggagacagca aacgtagaga tgtctgagaa actttgcagt 3300 caatatggaa ctccagtgtt tagtgcccct gggcatgccc tacatccagg agaaccagcc 3360 cttggagaag cagaggagcg gtgccttagc ccagatgaca gcacagtgaa gatggcttct 3420 cctccaccat ctggcccacc cagtgccacc cacacaccct ttcatcagtc cccagtggaa 3480 gaaaagtctg agccccaaga ctttcaggag gcagactcct ggggagacac taagcgcaca 3540 ccaggtgtgg gcaaagaaga tgctgctgag gagacagtca agccagggcc tgaagagggc 3600 acactagaga aggaagagaa agttcctcct cccaggagcc cccaggccca ggaagcacct 3660 gtcaacattg atgaggggct tacaggctgt accattcaac tgttgccagc acaggataaa 3720 gcaatagtct ttgagattat ggaggcagga gagcccacag gcccaattct gggagcagaa 3780 gcccttcccg gaggtttgag gactttaccc caagaacctg gcaaacctca gaaagatgag 3840 gtgctcagat atcctgaccg aagcctctct cctgaagatg cagaatccct ctctgtcctc 3900 agcgtgccct ccccagacac tgccaaccaa gagcctaccc ccaagtctcc ctgtggcctg 3960 acagaacagt acctacacaa agaccgttgg ccagaggtat ctccagaaga cacccagtca 4020 ctttctctgt cagaagagag tcccagcaag gagacctccc tggatgtctc ttctaagcag 4080 ctctctccag aaagccttgg caccctccag tttggggaac taaaccttgg gaaggaagaa 4140 atggggcatc tgatgcaggc cgaggatacc tctcaccaca cagctcccat gtctgttcca 4200 gagccccatg cagccacagc gtcacctccc acagatggga caactcgata ctctgcacag 4260 acagacatca cagatgacag ccttgacagg aagtcacctg ccagctcatt ctctcactct 4320 acaccttcag gaaatgggaa gtacttacct ggggcgatca caagccctga tgaacacatt 4380 ctgacacctg atagctcctt ctccaagagt cctgagtctt tgccaggccc tgccttggag 4440 gacattgcca taaagtggga agataaagtt ccagggttga aagacagaac ctcagaacag 4500 aagaaggaac ctgagccaaa ggatgaagtt ttacagcaga aagacaaaac tctggagcac 4560 aaggaggtgg tagagccgaa ggatacagcc atctatcaga aagatgaggc tctgcatgta 4620 aagaatgagg ctgtgaaaca gcaggataag gctttagaac aaaagggcag agacttagag 4680 caaaaagaca cagccctaga
acagaaggac aaggccctgg aaccaaaaga caaagactta 4740 gaagaaaaag acaaggccct ggaacagaag gataagattc cagaagagaa agacaaagcc 4800 ttagaacaaa aggatacagc cctggaacag aaggacaagg ccctggaacc aaaagataaa 4860 gacttggaac aaaaggacag ggtcctagaa cagaaggaga agatcccaga agagaaagac 4920 aaagccttag atcaaaaagt cagaagtgtt gaacataagg ctccggagga cacggtcgct 4980 gaaatgaagg acagagacct agaacagaca gacaaagccc ctgaacagaa acaccaggcc 5040 caggaacaaa aggataaagt ctcagaaaag aaggatcagg ccttagaaca aaaatactgg 5100 gctttgggac agaaggatga agccctggaa caaaacattc aggctctgga agagaaccac 5160 caaactcagg agcaggagag cctagtgcag gaggataaaa ccaggaaacc aaagatgcta 5220 gaggaaaaat ccccagaaaa ggtcaaggcc atggaagaga agttagaagc tcttctggag 5280 aagaccaaag ctctgggcct ggaagagagc ctagtgcagg agggcagggc cagagagcag 5340 gaagaaaagt actggagggg gcaggatgtg gtccaggagt ggcaagaaac atctcctacc 5400 agagaggagc cggctggaga acagaaagag cttgccccgg catgggagga cacatctcct 5460 gagcaggaca ataggtattg gaggggcaga gaggatgtgg ccttggaaca ggacacatac 5520 tggagggagc taagctgtga gcggaaggtc tggttccctc acgagctgga tggccagggg 5580 gcccgcccac actacactga ggaacgggaa agcactttcc tagatgaggg cccagatgat 5640 gagcaagaag tacccctgcg ggaacac9ca acccggagcc cctgggcctc agacttcaag 5700 gatttccagg aatcctcacc acagaagggg ctagaggtgg agcgctggct tgctgaatca 5760 ccagttgggt tgccaccaga ggaagaggac aaactgaccc gctctccctt tgagatcatc 5820 tcccctccag cttccccacc tgagatggtt ggacaaaggg ttccttcagc cccaggacaa 5880 gagagtccta tcccagaccc taagctcatg ccacacatga agaatgaacc cactactccc 5940 tcatggctgg ctgacatccc accctgggtg cccaaggaca gacccctccc ccctgcaccc 6000 ctctccccag ctcctggtcc ccccacacct gccccggaat cccatactcc tgcacccttc 6060 tcttggggca cagccgagta tgacagtgtg gtggctgcag tgcaggaggg ggcagctgag 6120 ttggaaggtg ggccatactc ccccctgggg aaggactacc gcaaggctga aggggaaagg 6180 gaagaagaag gtagggctga ggctcctgac aaaagctcac acagctcaaa ggtaccagag 6240 gccagcaaaa gccatgccac cacggagcct gagcagactg agccggagca gagagagccc 6300 acaccctatc ctgatgagag aagctttcag tatgcagaca tctatgagca gatgatgctt 6360 actgggcttg gccctgcatg ccccactaga 
gagcctccac ttggagcagc tggggattgg 6420 cccccatgcc tctcaaccaa ggaggcagct gccggccgaa acacatctgc agagaaggag 6480 ctttcatctc ctatctcacc caagagcctc cagtctgaca ctccaacctt cagctatgca 6540 gccctggcag gacccactgt acccccaagg ccagagccag ggccaagtat ggagcccagc 6600 ctcaccccac ctgcagttcc cccccgtgct cctatcctga gcaaaggccc aagcccccct 6660 cttaatggta acatcctgag ctgcagccca gataggaggt ccccatcccc caaggaatca 6720 ggccggagtc actgggatga cagcactagt gactcagaac tggagaaggg ggctcgggaa 6780 cagccagaaa aagaggccca atccccaagt cctcctcacc ccattcctat ggggtccccc 6840 acattatggc cagaaactga ggcacatgtt agccctccct tggactcaca cctggggcct 6900 gcccgaccca gtctggactt ccctgcttca gcctttggct tctcctcatt gcagccagct 6960 cccccacagc tgccctctcc agctgaaccc cgctcggcac cctgtggctc ccttgccttc 7020 tctggggatc gagctctggc tctggctcca ggacccccca ccagaacccg gcatgatgaa 7080 tacctggaag tgaccaaggc ccccagcctg gattcctcac tgccccagct cccatcaccc 7140 agttctcctg gggcccctct cctctccaat ctgccacgac ctgcctcacc agccctgtct 7200 gagggctcct cctctgaggc taccacgcct gtgatttcaa gtgtggcgga gcgcttctct 7260 ccaagccttg aggctgcaga acaggagtct ggagagctgg acccaggaat ggaaccagct 7320 gcccacagcc tctgggacct cactcctctg agcccagcac ccccagcttc actggacttg 7380 gccctagctc cagctccaag cctgcctgga gacatgggtg atggcatcct gccgtgccac 7440 ctggagtgct cagaggcagc cacggagaag ccaagcccct tccaggttcc ctctgaggat 7500 tgtgcagcca atggcccaac tgaaaccagc cctaaccccc caggccctgc cccagccaag 7560 gctgaaaatg aagaggctgc ggcttgccct gcctgggaac gtggggcctg gcctgaagga 7620 gctgagagga gctcccQgcc tgacacattg ctctcccctg agcagccagt gtgtcctgca 7680 gggggctccg ggggcccacc cagcagtgcc tctcctgagg tc9aagctgg gccccaggga 7740 tgtgccactg agcctcggcc ccatcgtggg gagctctccc catccttcct gaacccacct 7800 ctgcccccat ccatagatga tagggacctc tcaactgagg aagttcggct agtaggaaga 7860 ggggggcggc gccgggtagg ggggccaggg accactgggg gcccatgccc tgtgactgat 7920 gagacacccc ctacatcagc cagtgactca ggctcctcac agtcagattc tgatgtcccg 7980 ccagaaactg aggagtgtcc gtccatcaca gctgaggcag ccctcgactc agatgaagat 8040 ggagacttcc tacctgtgga caaagctggg ggtgtcagtg 
gtactcacca ccccaggcct 8100 ggccatgacc cacctcctct cccacagcca gacccccgcc catcccctcc ccgccctgat 8160 gtgtgcatgg ctgaccccga ggggctcagc tcagagtctg ggagagtaga gaggctacgg 8220 gagaaggaaa aggttcaggg gcgagtaggg cgcagggccc caggcaaggc caagccagcg 8280 tcccctgcac ggcgtctgga tcttcgggga aaacgctcac ccacccctgg taaagggcct 8340 gcagatcgag catcccgggc cccacctcga ccacgcagca ccacaagcca ggtcacccca 8400 gcagaggaaa aggatggaca cagccccatg tccaaaggcc tagtcaatgg actcaaggca 8460 ggaccatggc cttgagttcc aagggcagct ctggtgcccc tgtatatgtg gatctcgcct 8520 acatcccgaa tcattgcagt ggcaagactg ctgaccttga cttcttccgt cgagtgcgtg 8580 catcctacta tgtggtcagt gggaatgacc ctgccaatgg cgagccaagc cgggctgtgc 8640 tggatgccct gctggagggc aaggcccagt ggggggagaa tcttcaggtg agtgactctg 8700 atccctactc atgacacgga ggtgactcgt gagtggtacc aacaaactca tgagcagcag 8760 caacaactga atgtcctggt cctggctagc agcagcaccg tggtgatgca ggatgagtcc 8820 ttccctgcct gcaagattga gttctgaaag agccgccctc ccttccccaa ggatccactc 8880 ccccagctcc tttagagaat ggctactgct gagtcctttg gggttgaggg agatgggagc 8940 tagggggagg ggagggagat gtcttgttgt ggggacttgg gctgggctaa atgggagggg 9000 ttgtccctcc ccatcatcca ttcctgtgag gtgtctcaaa ccaaagttaa cagggagagg 9060 atgggggagg ggacaaatta gaataggata gcatctgatg cctgagaacc ctctcctagc 9120 actgtcaaat gctggtattg aatggggact gaggatgggt ctcagagagc aacctcctcc 9180 ctcgtagagg gagattatat ccccaactcc agggacctct ttatctcaat ctatttattt 9240 ggcatcctgg gaaggatttc caatagtaat ttatgtgacc tggggcagga taccgtcagt 9300 gaggtgccca gagctgcacc ctttcctcca tttcccatcc cccatctcct caaccaccag 9360 ggtctgagtt ctagcagggt cctgggggta tcccactgct atactgttct actgcttccc 9420 tcagtatctg aatgtctcaa tttaaaactt gaagctcttt agaccaatag actggtgaga 9480 ggagaaagga gcttatcccc cagaccctgc tttataccat tcacatccca gggctgtgtc 9540 cagacagcac aaaacggcaa ggagagccca agccccaatg ccagaattct tccaaactcc 9600 ctgactcttt gaagttttta ctcaccccat ttcaattatc ctgatccctt ctcatcccct 9660 gcttggcttc tctgcatgtg gtcatctgct gtggcttggt gtttaatggg ttaaaaataa 9720 gccactgcct gacatcccaa catttgacac cccagcaatg tgtgactccc 
ccaacattcc 9780 actatgccat cctgcagctg aaatgggaac actggctgcc tctccaaacc cgctcttgga 9840 cagaggatct gggaggtgga agccaggcca gaggacttgg ggaaaatgag atggaggaag 9900 gaaaaaggga gaagctgagc cacagcttaa ctcctacaga gtgaaatgaa aacgggctga 9960 aaataccacc ccaggagagg acctcgcccc aagcaagcca gtgagcagcc ctgccagact 10020 actgccagac tgagaaaccc agaagctggt agtcatgtgg gcttgccttc tctgccaaac 10080 gactgggaaa ccaaaatgag cccaccttgt gttcttccta gctccaccct ccccgtgctg 10140 ctgtgttctg ctcctcccca cgcttccctg ctatagttcc cagctgctgt aacggagcca 10200 cctccaactc taacaataaa ccaagttcat tgcagatagt gtaaaaaaaa aaaaaaaaa 10259 <210> 15 <211> 2918 <212> DNA <213> Homo sapiens <400> 15 ggacagggca gctcaagacg ctgaggtggt ggctgcggcc tttgaacaag taagtgagcc 60 accctcggag acccccgcgc tggggacggg aggccggcga gcctcgggac ctctgaaagc 120 cttgaggagg cgcgg9gaca ccatggccga gcctctgaag gaggaagacg gcgaggacgg 180 ctctgcggag ccccccgggc ccgtgaaggc cgaacccgcc cacaccgctg cctctgtagc 240 ggccaagaac ctggccctgc ttaaagcccg ctccttcgat gtgacctttg acgtgggcga 300 cgagtacgag atcatcgaga ccataggcaa cggggcctat ggagtggtgt cctccgcccg 360 ccgccgcctc accggccagc aggtggccat caagaagatc cctaatgctt tcgatgtggt 420 gaccaatgcc aagcggaccc tcagggagct gaagatcctc aagcacttta aacacgacaa 480 catcatcgcc atcaaggaca tcctgaggcc caccgtgccc tatggcgaat tcaaatctgt 540 ctacgtggtc ctggacctga tggaaagcga cctgcaccag atcatccact cctcacagcc 600 cctcacactg gaacacgtgc gctacttcct gtaccaactg ctgcggggcc tgaagtacat 660 gcactcggct caggtcatcc accgtgacct gaagccctcc aacctattgg tgaatgagaa 720 ctgtgagctc aagattggtg actttggtat ggctcgtggc ctgtgcacct cgcccgctga 780 acatcagtac ttcatgactg agtatgtggc cacgcgctgg taccgtgcgc ccgagctcat 840 gctctctttg catgagtata cacaggctat tgacctctgg tctgtgggct gcatctttgg 900 tgagatgctg gcccggcgcc agctcttccc aggcaaaaac tatgtacacc agctacagct 960 catcatgatg gtgctgggta ccccatcacc agccgtgatt caggctgtgg gggctgagag 1020 ggtgcgggcc tatatccaga gcttgccacc acgccagcct gtgccctggg agacagtgta 1080 cccaggtgcc gaccgccagg ccctatcact gctgggtcgc atgctgcgtt ttgagcccag 1140 cgctcgcatc tcagcagctg 
ctgcccttcg ccaccctttc ctggccaagt accatgatcc 1200 tgatgatgag cctgactgtg ccccgccctt tgactttgcc tttgaccgcg aagccctcac 1260 tcgggagcgc attaaggagg ccattgtggc tgaaattgag gacttccatg caaggcgtga 1320 gggcatccgc caacagatcc gcttccagcc ttctctacag cctgtggcta gtgagcctgg 1380 ctgtccagat gttgaaatgc ccagtccctg ggctcccagt ggggactgtg ccatggagtc 1440 tccaccacca gccccgccac catgccccgg ccctgcacct gacaccattg atctgaccct 1500 gcagccacct ccaccagtca gtgagcctgc cccaccaaag aaagatggtg ccatctcaga 1560 caatactaag gctgccctta aagctgccct gctcaagtct ttgaggagcc ggctcagaga 1620 tggccccagc gcacccctgg aggctcctga gcctcggaag ccggtgacag cccaggagcg 1660 ccagcgggag cgggaggaga agcggcggag gcggcaagaa cgagccaagg agcgggagaa 1740 acggcggcag gagcgggagc gaaaggaacg gggggctggg gcctctgggg gcccctccac 1800 tgaccccttg gctggactag tgctcagtga caatgacaga agcctgttgg aacgctggac 1860 tcgaatggcc cggcccgcag ccccagccct cacctctgtg ccggcccctg ccccagcgcc 1920 aacgccaacc ccaaccccag tccaacctac cagtcctcct cctggccctg tagcccagcc 1980 cactggcccg caaccacaat ctgcgggctc tacctctggc cctgtacccc agcctgcctg 2040 cccaccccct ggccctgcac cccaccccac tggccctcct gggcccatcc ctgtccccgc 2100 gccaccccag attgccacct ccaccagcct cctggctgcc cagtcacttg tgccaccccc 2160 tgggctgcct ggctccagca ccccaggagt tttgccttac ttcccacctg gcctgccgcc 2220 cccagacgcc gggggagccc ctcagtcttc catgtcagag tcacctgatg tcaaccttgt 2280 gacccagcag ctatctaagt cacaggtgga ggaccccctg ccccctgtgt tctcaggcac 2340 accaaagggc agtggggctg gctacggtgt tggctttgac ctggaggaat tcttaaacca 2400 gtctttcgac atgggcgtgg ctgatgggcc acaggatggc caggcagatt cagcctctct 2460 ctcagcctcc ctgcttgctg actggctcga aggccatggc atgaaccctg ccgatattga 2520 gtccctgcag cgtgagatcc agatggactc cccaatgctg ctggctgacc tgcctgacct 2580 ccaggacccc tgaggccccc agcctgtgcc ttgctgccac agtagaccta gttccaggat 2640 ccatgggagc attctcaaag gctttagccc tggacccagc aggtgaggct cggcttggat 2700 tattctgcag gttcatctca gacccacctt tcagccttaa gcagccacct gagccaccac 2760 cgagccatgg caggatcggg agaccccaac tccccctgaa caatcctttt cagtattata 2820 tttttattat tattatgtta ttattacact 
gtctttttgc catcaaaatg aggcctgtga 2880 aatacaaggt tcccttctgc aaaaaaaaaa aaaaaaaa 2918 <210> 16 <211> 5888 <212> DNA <213> Homo sapiens <400> 16 atatcaacaa cagccgaggc ggctcaggcg ctcggccccg gttccccgct tgcctgccgc 60 ccgcctgctg gcccccgcgc ccacgacggg ggcccaggcc tcacggcgcc gcccagggcc 120 cgcgcggacg ccggcctcat ttattattct ccccgcccgg agctgcggct tcccggtgtt 180 gaagatcccc cggaccaggg gcgagggcta cccgctcttt gccgtgacaa caccgttccc 240 ccagccgggc tggaggctgt gcagaaggta tcctgcagac catgaactga gcactgttcc 300 cagaccgttc atgagcacag tgtaaggtgt gccgagaccc accacccagc gagcccctcc 360 cctccgtagc actgaggacc cccggagaag atggggagga aaaagattca gatccagcga 420 atcaccgacg agcggaaccg acaggtgact ttcaccaagc ggaagtttgg cctgatgaag 480 aaggcgtatg agctgagcgt gctatgtgac tgcgagatcg cactcatcat cttcaaccac 540 tccaacaagc tgttccagta cgccagcacc gacatggaca aggtgctgct caagtacacg 600 gagtacaatg agccacacga gagccgcacc aacgccgaca tcatcgagac cctgaggaag 660 aagggcttca atggctgcga cagccccgag cccgacgggg aggactcgct ggaacagagc 720 cccctgctgg aggacaagta ccgacgcgcc agcgaggagc tcgacgggct cttccggcgc 780 tatgggtcaa ctgtcccggc ccccaacttt gccatgcctg tcacggtgcc cgtgtccaat 840 cagagctcac tgcagttcag caatcccagc ggctccctgg tcaccccttc cctggtgaca 900 tcatccctca cggacccgcg gctcctgtcc ccccagcagc cagcactaca gaggaacagt 960 gtgtctcctg gcctgcccca gcggccagct agtgcggggg ccatgctggg gggtgacctg 1020 aacagtgcta acggagcctg ccccagccct gttgggaatg gctacgtcag tgctcgggct 1080 tcccctggcc tcctccctgt ggccaatggc aacagcctaa acaaggtcat ccctgccaag 1140 tctccgcccc cacctaccca cagcacccag cttggagccc ccagccgcaa gcccgacctg 1200 cgagtcatca cttcccaggc aggaaagggg ttaatgcatc acttgactga ggaccattta 1260 gatctgaaca atgcccagcg ccttggggtc tcccagtcta ctcattcgct caccacccca 1320 gtggtttctg tggcaacgcc gagtttactc agccagggcc tccccttctc ttccatgccc 1380 actgcctaca acacagatta ccagttgacc agtgcagagc tctcctcctt accagccttt 1440 agttcacctg gggggctgtc gctaggcaat gtcactgcct ggcaacagcc acagcagccc 1500 cagcagccgc agcagccaca gcctccacag cagcagccac cgcagccaca gcagccacag 1560 ccacagcagc ctcagcagcc 
gcaacagcca cctcagcaac agtcccacct ggtccctgta 1620 tctctcagca acctcatccc gggcagcccc ctgccccacg tgggtgctgc cctcacagtc 1680 accacccacc cccacatcag catcaagtca gaaccggtgt ccccaagccg tgagcgcagc 1740 cctgcgcctc cccctccagc tgtgttccca gctgcccgcc ctgagcctgg cgatggtctc 1800 agcagcccag ccgggggatc ctatgagacg ggagaccggg atgacggacg gggggacttc 1860 gggcccacac tgggcctgct gcgcccagcc ccagagcctg aggctgaggg ctcagctgtg 1920 aagaggatgc ggcttgatac ctggacatta aagtgacgat tcccactccc ctcctctcag 1980 cctccctgat gaagagttga caatctcacc gcccgccctt ccctgccccg ggctcctccc 2040 gctcgacccc cacttccttt cttgtgcttc gtgtcctgtt gacggttaca tttgtgtata 2100 attattatat tattattatt attattatat tttttttaat ttggattctc gctttggaga 2160 gggggatgct ctcatcccct ctttctgtac cccccaccat tttcactggc tggggggctc 2220 tctttttcgc gggaaggggg gacactttgc acgttgtaca catatgctgc aggaaggggg 2280 tggggggccc aataaggcct ttgggaaagg acaggtgccg agccctgcat gtggagccct 2340 cccaccccac ccccagatag agggaaataa ccaaaaaact accaaacaac agaaacccac 2400 actctagact gaaaccccaa agtgggcttg atgggtgggt ttgtgtttca aggggaaagt 2460 gaggcagagg ttctgaaaag ggtctctgtt tttgtgttca tgtagccata ggcacatgga 2520 gcagaatact taagcctggc ccccaaatgc ccctgcacac acacgtgcca cacctgcgct 2580 gattcttgtg tgtgctgcac ccccaaggtg tgtgggtgct ggctgagctt tgggccggga 2640 aggcagcctg ggaatctgag gctggagaca ggggtttgag gtgggggcct ctctggaagc 2700 acatttggag ggaaagacaa gagagccatg aggagagggc tgaggagggc agaagggcta 2760 ggcagggggc aaattgagcc cctcccttcc ccagtttttc tctaagatat acagtgcaat 2820 agctccccac ccctcagttg acgccagccc tgtaaagctg gccacagtgt gcagggagaa 2880 tggggagagg gtcttcagtg aggtggctgg ggcgagagtc ggcctggact tccctggggt 2940 gctccaggcc agagctcttt cattggggcg agtgtggtga ggggacgtcc ttggtcttgc 3000 acgcacacta cctgggggag tcaacactgg gatggtctgt ggggtgggag ggcctacgga 3060 tgggtccgta gaggtcccac ctccctcatt cctccttggc ccctctccct agcttctcct 3120 gttagctcct tctgctcctg accccacctc cttgctcttg gcgcccctat tgtctctggc 3180 tacctccttg tcccaccacc tccaggctgc atcccacctt ccctcttggc tactgtaatt 3240 gtaaatagcg acctttggaa aacgttagcg 
gtgtaacagt ccaggaaact gttttttttt 3300 gttgttgttg tattgatatg aaatgagatt ctatttttgt caaagtatat tgtaataata 3360 atgactcaaa cggcccgtac tgtacagacg agattcttct gctgttgttc ttgctcccct 3420 cccctcctct gagtccgccc ctccctgctg cctcctcagt ggggcagtgg gcaaggggcc 3480 caggggcagc cgaagcacgg ggtcctgaga cctcaggcag gattggagat caaaccagag 3540 ggggcaggcc cccagcctgc tctctaggat caccccccgc cctaaggggc ctggcctggg 3600 gtgacgtggc caggcagact gtctgcccca ctccttcaca caagcccagc tcctctgccc 3660 aaggggtgcg gcgccccctt ggggtttcct cccagttgga gagtagagtt aagacaaggc 3720 ccagttttgt gttagtcgac cgtctttgcc cacctctatg acccagcctc ttgcagtatt 3780 cccatacttg atgcagggaa ggaaccagaa gcagaggggc ctctacgcag gtacacacgt 3840 gtacctgagt gtgttcatga gggcatctgg tgtttatgtg tctgagtgta gctttgtatt 3900 tatgtgtgtg tgtgtgtgta tgtctgattg cacgggtgta cttttgtatt tatgtgtgtg 3960 tgtggttgca cgggtgtgcc tctgtgtgtc tctgaccctg gctgggtgtg tgtgcaaatc 4020 tgtgtgactg gagctctagg ggcatctctg tgtctgagtg tgcctggtgt gtgtttacaa 4080 agggagagtt ggctgctcca gctccacagc cctgggaccc caactcctgt cttccctgct 4140 cctttccctg tgttcaccct cagctctgac acattgaact gcagttgggg ggattggcag 4200 ttagccctct gtgcttctcc ctgcagccct acctctgcca aggtctctcc ctccagggac 4260 ctctgcttcc acccacatat gtccacttag tcacccacac ttgacacagt tcctggagta 4320 ccctcttccc ccaaccccag acctgctttc agagcaaaac tcaagtccct cttcctccgt 4380 gaagcttctc cctcagctga gcagtgatca cttactcact cttaacccca atccgctgac 4440 tgggtgggga cagcacgtcc agccttccca cctctcctgc aggcttctag acggagtttc 4500 aaaaactgat gagcctcgat ccagggcttg aaagaagcca gggtgtaatc ttgttcatgc 4560 atgcgtcccc agagcctcgc ccagtgcctg gcacatagta ggcactcaat aaatgctgaa 4620 tgggtgaata gttgaatgat aggtgctcaa taaatgaatg aatggccttc ccttctcagg 4680 ctattcccaa cattagtctg cccacctttc taggctgggc ttggccacca ttaaacacgg 4740 ggtgggggtg agggcccctg caattcacgg tgcaatattc accagttttg ccctctgcct 4800 cataaaggca aacctggctt ttgattacca tgtgtggatg tttcagtgtc ctttcttctc 4860 tgtccctggg gatggggtgg tctgtgaata tgtgacattt ctgcagttca gtatccgaag 4920 gtttctcttg ggggtagggg ctcctgggcg gccagatgaa 
tgggtccctg ggaacccaga 4980 cctcagatga ggacttaatg tcttcttcct ctcaagccaa attcgcctcc acccactccc 5040 tctgaagaac tgggcatttg ccaaagtaac cactggagtc atctaatggc cctccccctc 5100 cccaggtttc ccacagcttt cagggacagt gggcaagagg acaccccccc ccaccacctc 5160 agtggaacac accattctcc ccccctcaac agcacactca gtgcagcaag actgacccct 5220 gaccccctcc cagccctccc taccttggac aggaaggaag taatgcacct tctcttgctg 5280 attatttatt tgtttggaga gacagaaatg taaaagtgta tctagaaata tctatatctc 5340 tatatatttt taactgactc tttggaatcc cctggggtgg ggtgaggggt aagtttaggc 5400 tttcgcggag gggaggagac atggagcctg ggaactcctt gttctcccct ctgctgcctc 5460 tccccacccc ttaaagcagt tggtagaagg aatggtattt gtatgggggg agggaggctg 5520 gaatggagaa tctggattct ctcctcttcc ccattctcca gagggaggga ggtggtgagg 5580 aagaggaagg gaggggcagg atgggccatg gaggtgcccc acccccacac ctgacaatca 5640 cccacactcc tggggctctt cctgggtcct ggggcagggc gagtccaagt gtgaggctgt 5700 tgatttgttt tcaatatttc ttttcgtgct gtatggtgat gctttcttag tattctacac 5760 aataagaaaa gacaaagtcc tcgagattct tatgagtttt gtttgaaaac tctttcacta 5820 tatttgttgt aaagaggttt actattaaaa gaaaaagaat acacgtttct gataaaaaaa 5880 aaaaaaaa 5888 <210> 17 <211> 983 <212> DNA <213> Homo sapiens <400> 17 atttttaaaa gcaacttctg agaagggctt agaacaaatt ttttcccgga gtgccatttc 60 ccaaaggtac tcacagaaca atcaggtgtg accataatgg ctgcactgag ttgtctcttg 120 gacagtgtca gaagggacat aaagaaggtg gacagagaac taaggcaact gagatgcatc 180 gacgaattta gcacacggtg cctgtgcgac ttgtatatgc acccctattg ctgctgtgac 240 ttgcacccat atccgtactg cttgtgctat tccaagcgat cacgctcttg cggcctgtgt 300 gatctctacc catgttgcct gtgtgattat aagctttact gtctgcgacc atctctcaga 360 agtttggaga ggaaagccat cagagccata gaagatgaga agcgagagct tgccaaactg 420 agaagaacaa caaatagaat tctggcttcc tcctgctgta gcagtaacat tttaggatcg 480 gtgaatgtat gcggttttga acccgatcaa gtcaaagttc gagtgaagga tggaaaggta 540 tgtgtgtcgg ctgagcggga gaacaggtac gactgccttg gatcgaaaaa gtacagctac 600 atgaacatct gcaaagagtt cagcttgccg ccctgtgtgg atgagaagga tgtaacatac 660 tcctatgggc tcggcagctg tgtcaagatc gagtctcctt gctacccttg cacttctcct 
720 tgcagcccgt gcagcccgtg caacccctgc aacccctgca gcccctgcaa cccgtgcagc 780 ccatatgatc cttgcaaccc gtgttatccc tgtggaagcc gattttcctg taggaagatg 840 attttgtaaa gtgcgcatag gaacccatta cttaatagaa gtcagttact ccagccaggc 900 agctctccca atgtttctcc tctccttccc atggcccctg ttgttgaagt acgtaggaaa 960 ctgaatacat aactgcaatc tgc 983 <210> 18 <211> 4286 <212> DNA <213> Homo sapiens <400> 18 gatggttgac aactcccctc ctcttccccc tcttctactg tctactcctg ggaccaagtg 60 agccacgcca gctcagatac tacactgacc acagggaatc ccaccttttc caaggaatgg 120 aagttgtgta gggaatattc aaatgttgct tagcattgcc ttagataaga accaaaggga 180 cagggaaatc ctctgacagc tatctgcctt ataactttca ttttactgtg cctaaaatat 240 gctcagaacc cagaaagagg cataattcct aattttggca ggctctaatc taaaataatg 300 attctcaaac atggtgtgac ttttgtctat ttgctttatc ctgggtcact gctcctcttc 360 tgtcagatac tgggattcca atgagacaaa tggaaatgga gacgtagacc ctctgacctt 420 ctatctttta tctatacaca tacacctgtg tgtgtgtgtg tgtgtgtgtg tgtgcgtgtg 480 taaaaccgag tgggtttttt tcttggaatg aaagaatgga ctaacattac aaaaaataaa 540 aacttgaaac agaatgtgta ttatccttgg ttgtgtttcc ttggccctgc agcaggatga 600 agctctccac tggcatcatt ttctgctccc tggtcctggg tgtcagcagc caaggatggt 660 taacattcct caaggcagct ggccaaggtg aggtccacag gatagggggc aggaggctgc 720 ttctggctgc ccccaggatg cagctgagca gaggccacat ccccactggg caaaggtgct 780 agtgatgcca cagatggata gagaaggggc atggtttttc ataagcgtgg ttcctcatgc 840 ttttctggac agctttgaca ctcttctatg aggatcctcc agccgaggtc gcataaggtg 900 tgagctgcct cttttcagca ggaccatgag agagatgtgg agttgagggg tgcatgttcc 960 cataataccQ gtggggctct actgccccct agtgggaaat ctgggacagt tcatgtctat 1020 gtctcctggg aagccaggaa gcaggtggat caaaagtgtg aggcgagtcc atggggaagc 1080 tgaacggagc caaccgtccc cataaaaaca accaagctta gctgagattt taatacgtac 1140 taggcactgt ttaaatgtac taatgaattg gtttccatca tttagtccta tgatgcaagc 1200 agcattatcc cttaacagag aagctaacac acacacacac acacacacac taacacacac 1260 acacacacac acacacacac aaaccccaag atacgtaaag aagttccaaa gcagagcagg 1320 attaacccag gcagtcttgc tctgcagaac ttgctcttaa tcaaggtact ctgctgcttt 1380 caaaacaaga 
gtttcggatt tgtgaacaca tagctcatcc tttatctaag aaatggcaaa 1440 taggatgtgg tgcctttgga aggtaagtct agctccactt atcccagtaa aacctacagt 1500 gaattacctt gatggtggtt ctactggggc ttatatatgg ccaggaaact gctagcaaga 1560 gaaatatacc ccgagggctg ggcacagtgg ctcacacctg taatcccagc actttgggag 1620 gctgaggtgg gcagatcacc tgaggtcaag agttcgagac cagcctggcc aacatggcga 1680 aatcctgtct ctactaaaaa tacagaaatt agccgggtgt ggtggcatgc gcctataatc 1740 ccagcctctc gggaggctga gggagaagaa ttgcttgaac tcaggaggca gaggttgcag 1800 tgagctgtga tcacaccact gcactccagc ctaggagaca gagcaagact ccatctagag 1860 agacagagag agagagagag ggagaaatat accccactag ccataataaa gtggcaaaat 1920 tttgttttca gaatgcagta ttttaaattt caggtattat tatttttctg agtctctgaa 1980 aaatggtttt aaggatttgc ttttaatcct atttacatgt tcacacactc aactacaaat 2040 atctttcatt ccttaggtta atatttttca aagggttgtt ctgggaccac ttgcgtgaga 2100 atcacctgga ttctgggatg ctttgtgaaa tgaaatgaag attcccgggt ccatacccta 2160 ccccctgccc ccaacagcca cagtctcttg ggacagagcc tagaaatctt gcctttgcta 2220 agcacctcgg tagattttta tgcacagcaa aggttgagaa ccactacctc ttgttttgct 2280 gctgaaagtg ataaaatgtg ccaggaattt tggaagtact tattaagcca atctgaacat 2340 caaggagcca tttaagtcag taactcagag gaataagtag agtaaaaatg tcataaactc 2400 tcaataaaag
caatcaattt aacaccagga gtaataaatg cataaaatga agatgagtta 2460 tctaatagag aaattatata aaccatgatt ataactctat atttgagttc ccccttttcc 2520 gtaatcagtt aattttctaa aaaatcttcg tcacttaatt ctagcttgat cagatccctt 2580 cagtccgtaa ctccctgctc ctcatcttag tttagccctt cttttttctt atgccacctt 2640 tcctaaggac cagagaagtg aaatgataat atattggcca cctacaatgt tctagacatc 2700 atacatgtat tttctctgct cttctgcata atcactgtga ggcaggcaat actcctccat 2760 ttcattgggg aggacattga ggttctgaac tagtgggtca gttgtccttt ttctgaattt 2820 gattacccag tagtataaag ctttcttagg taactcacct ttatcacttg ctgactgaat 2880 tctgacagat gtcagtttct aattatagcc tggacattca gatgtattca ggaccaagtt 2940 gtcctcactc tacctacagg catgaatttc tctcattgac taggttagga gcgccatatg 3000 tctgcagcct ccctcagaat cccctgtgtt ctcacaccag ggaactgagg gttccctggg 3060 tccttccagg tagaagttca ttgtacaatg aaacatccct taaggaccat ttcatctctt 3120 ctttaggtgc atcacacatg gttaaaacaa agtaataaca gaacttagaa tggaatcaaa 3180 cagaatgaaa cttacaccaa gtacaattct cattacatta acccagagaa gtgaaaagta 3240 gaagaatatt tatttcaagc caatataatt tccaagggct ttgttgaagg ctgaaatctt 3300 cgggaggaaa gtagtgagaa gaaaactgtt cattcctcta ttttcccagt atataattgt 3360 tttgatcatt ttctttcctt tccagggact aaagacatgt ggaaagccta ctctgacatg 3420 aaagaagcca attacaaaaa attcagacaa atacttccat gcttggggga actatgatgc 3480 tgtacaaagg gggcttgggg ctgtctgggc tacagaagtg atcaggtaat gcacattcct 3540 gatgttgcca ggaatgagtg agcagagctt gactgccttg gacagtcagg agagaggtaa 3600 gctccttgca gagaagttag aggctgcagc ccctcctcct cttgccctct ctctgcctgt 3660 gtgcttagtg cgagggtctg agtggatggt agaagtgagt gattcctcac cctccctctc 3720 tgggtgctgt tcatccagcc taggggtgcc cagcctggct gagtggggca gtgcccaggc 3780 agggtcattg ttttcacccc tccttccttg gccttcctgg gcttctccca gagtcctccc 3840 ttggaaagca gagaatggga aggtgggctg ttgctcactg gcctggtgat taatctcctt 3900 gcttgcctgg actacagcga tgccagagag aacgtccaga gactcacagg agaccatgca 3960 gaggattcgc tggctggcca ggctaccaac aaatggggcc agagtggcaa agaccccaat 4020 cacttccgac ctgctggcct gccagagaaa tactgagctt ccttttcaat ctgctctcag 4080 gag acctggc tgtgagcccc 
tgagggcagg gacatttgtt gacctacagt tactgaattc 4140 tatatcccta gtacttgata tagaacacat aaaaatgctt aataaatgct tgtgaaatcc 4200 agtttgttat tggaatctgg aagcagaata tgacagtctt cctgggatca tgggcctgtt 4260 tagtaccata gggatgacca ataaac 4286 <210> 19 <211> 3206 <212> DNA <213> Homo sapiens <400> 19 acaacaggct gtggaagagc tgcgagcggc gcggcggggc gggcgaactc ggcgcggcac 60 agagcctcgg gaggctgatg caactttccc tttaagaaag ccacctgggc gcaccgcggt 120 gcggacccag cacgcctggg ccgggggctg cagcatgctc ttgagatctg tggcctgaaa 180 ggcgctggaa gcagagcctg tgagtgtggt ccccgtcacc agagccccaa cccaccgccg 240 ccatggtagg aaaaggtgcc aaagggatgc tgaatggtgc tgtgcccagc gaggccacca 300 agagggacca gaacctcaaa cggggcaact ggggcaacca gatcgagttt gtactgacga 360 gcgtgggcta tgccgtgggc ctgggcaatg tctggcgctt cccatacctc tgctatcgca 420 acgggggagg cgccttcatg ttcccctact tcatcatgct catcttctgc gggatccccc 480 tcttcttcat ggagctctcc ttcggccagt ttgcaagcca ggggtgcctg ggggtctgga 540 ggatcagccc catgttcaaa ggagtgggct atggtatgat ggtggtgtcc acctacatcg 600 gcatctacta caatgtggtc atctgcatcg ccttctacta cttcttctcg tccatgacgc 660 acgtgctgcc ctgggcctac tgcaataacc cctggaacac gcatgactgc gccggtgtac 720 tggacgcctc caacctcacc aatggctctc ggccagccgc cttgcccagc aacctctccc 780 acctgctcaa ccacagcctc cagaggacca gccccagcga ggagtactgg aggctgtacg 840 tgctgaagct gtcagatgac attgggaact ttggggaggt gcggctgccc ctccttggct 900 gcctcggtgt ctcctggttg gtcgtcttcc tctgcctcat ccgaggggtc aagtcttcag 960 ggaaagtggt gtacttcacg gccacgttcc cctacgtggt gctgaccatt ctgtttgtcc 1020 gcggagtgac cctggaggga gcctttgacg gcatcatgta ctacctaacc ccgcagtggg 1080 acaagatcct ggaggccaag gtgtggggtg atgctgcctc ccagatcttc tactcactgg 1140 gctgcgcgtg gggaggcctc atcaccatgg cttcctacaa caagttccac aataactgtt 1200 accgggacag tgtcatcatc agcatcacca actgtgccac cagcgtctat gctggcttcg 1260 tcatcttctc catcctcggc ttcatggcca atcacctggg cgtggatgtg tcccgtgtgg 1320 cagaccacgg ccctggcctg gccttcgtgg cttaccccga ggccctcaca ctacttccca 1380 tctccccgct gtggtctctg ctcttcttct tcatgcttat cctgctgggg ctgggcactc 1440 agttctgcct cctggagacg 
ctggtcacag ccattgtgga tgaggtgggg aatgagtgga 1500 tcctgcagaa aaagacctat gtgaccttgg gcgtggctgt ggctggcttc ctgctgggca 1560 tccccctcac cagccaggca ggcatctatt ggctgctgct gatggacaac tatgcggcca 1620 gcttctcctt ggtggtcatc tcctgcatca tgtgtgtggc catcatgtac atctacgggc 1680 accggaacta cttccaggac atccagatga tgctgggatt cccaccaccc ctcttctttc 1740 agatctgctg gcgcttcgtc tctcccgcca tcatcttctt tattctagtt ttcactgtga 1800 tccagtacca gccgatcacc tacaaccact accagtaccc aggctgggcc gtggccattg 1860 gcttcctcat ggctctgtcc tccgtcctct gcatccccct ctacgccatg ttccggctct 1920 gccgcacaga cggggacacc ctcctccagc gtttgaaaaa tgccacaaag ccaagcagag 1980 actggggccc tgccctcctg gagcaccgga cagggcgcta cgcccccacc atagccccct 2040 ctcctgagga cggcttcgag gtccagccac tgcacccgga caaggcgcag atccccattg 2100 tgggcagtaa tggctccagc cgcctccagg actcccggat atgagcacag ctgccagggg 2160 agtggcccca cccccacccc gtgctcccac cgcagagact ggtgaggcag aggccaggtg 2220 tctctgcctg ccccctgcca cgccctggcc aggacggctg ctgtcacctt ggtcaccact 2280 gctagtgcag tcattcatgc tcatgtcccc agtgtttaca tgtcctttgg atgccaagat 2340 agcagctggg gggaggggtg ggagtggagg gttgctggga ggtccaaagc actttggagg 2400 ggtctcgggc caggtcccca gcagcctgga tggctttacg tggcctccga tacccttata 2460 ccctgccctg agctgaggtt ctgggtgggc ctccagcccc atgactagtg ctcctgccct 2520 cagagccgac cccagcctct gccaggcaca tttggctatt cctccctagg ggcaggatga 2580 agggctgggg gactgcccag tgttacttgt caggctgtgc tgcccagccc tgtttatctg 2640 tgtaattatt tttgtaaaca ttgtattctc tgtggtcgcc acctcctcgc ccccagcctc 2700 gggttcagtc tgtcttccag gcctgcttgc acctcactgg gctgctccgg ggtctctgcc 2760 cctcattcca ggcctggctg tcaggcccag ccagcagggg cccgtgaccc agcagcctgc 2820 ccaaagcatt tgtttctggg ggatggggtg ggggctgctc cacaggaggt ttgagcccag 2880 caccctgggg aaggggaccc tgcacgacac ccccttgccc tccctccatg aggctaaagg 2940 cccagtcttc ccaaatgtgc tgccctcgtt catgtgccaa atggccccag cccacatgcc 3000 cctcccctct ctggagtggg agccccttct gaagtgtctg aatccctgaa gtgttcattt 3060 gtccgtcctc tgtgcagtga cagccccggc caagccacct ctaatcctct gtagcaataa 3120 cggtgcgccg cccatccctg cccattgtgc 
accactagga ttttaaagtc catagatttt 3180 aatgaaattt ctattcctgt ctctga 3206 <210> 20 <211> 6871 <212> DNA <213> Homo sapiens <400> 20 gcctcgggag gtggtggagt gacctggccc cagtgctgcg tccttatcag ccgagccggt 60 cccagctctt gctcctgcct gtttgcctgg aaatggccac gcttctcctt ctccttgggg 120 tgctggtggt aagcccagac gctctgggga gcacaacagc agtgcagaca cccacctccg 180 gagagccttt ggtctctact agcgagcccc tgagctcaaa gatgtacacc acttcaataa 240 caagtgaccc taaggccgac agcactgggg accagacctc agccctacct ccctcaactt 300 ccatcaatga gggatcccct ctttggactt ccattggtgc cagcactggt tcccctttac 360 ctgagccaac aacctaccag gaagtttcca tcaagatgtc atcagtgccc caggaaaccc 420 ctcatgcaac cagtcatcct gctgttccca taacagcaaa ctctctagga tcccacaccg 480 tgacaggtgg aaccataaca acgaactctc cagaaacctc cagtaggacc agtggagccc 540 ctgttaccac ggcagctagc tctctggaga cctccagagg cacctctgga ccccctctta 600 ccatggcaac tgtctctctg gagacttcca aaggcacctc tggaccccct gttaccatgg 660 caactgactc tctggagacc tccactggga ccactggacc ccctgttacc atgacaactg 720 gctctctgga gccctccagc ggggccagtg gaccccaggt ctctagcgta aaactatcta 780 caatgatgtc tccaacgacc tccaccaacg caagcactgt gcccttccgg aacccagatg 840 agaactcacg aggcatgctg ccagtggctg tgcttgtggc cctgctggcg gtcatagtcc 900 tcgtggctct gctcctgctg tggcgccggc ggcagaagcg gcggactggg gccctcgtgc 960 tgagcagagg cggcaagcgt aacggggtgg tggacgcctg ggctgggcca gcccaggtcc 1020 ctgaggaggg ggccgtgaca gtgaccgtgg gagggtccgg gggcgacaag ggctctgggt 1080 tccccgatgg ggaggggtct agccgtcggc ccacgctcac cactttcttt ggcagacgga 1140 agtctcgcca gggctccctg gcgatggagg agctgaagtc tgggtcaggc cccagcctca 1200 aaggggagga ggagccactg gtggccagtg aggatggggc tgtggacgcc ccagctcctg 1260 atgagcccga agggggagac ggggctgccc cttaagtgtc ggtgaatagt gaggctggag 1320 gccggaatct cagccagcct ccagcacctt ccctctcacc atcccactgc cccctcgctc 1380 ccatgtttcc acccggcacc ctgatcctca cccgaatctc cttttttttt ttcttttgag 1440 acagagtttc gctttgtcgc ccaggctgga gtgcaatgca cgatctcagt tcactgcaac 1500 ctctgcctcc taagttcagg cgattctcct gcctcagctt cccgagtaac tgagattaca 1560 ggcacccacc accatgccca gctgcttttt 
tgtatttttg gtagagatgg ggtttcacca 1620 tgttggctag gctggtctca aactcctgac ctcaggtgat ctacctgcct cagcctccca 1680 aagtgctgag attacagaca tgagcctccg cgccttgcct cctcacccac ctcttcactc 1740 tgaatcctca tgaggcttct cagccctgga tttcctgctg ccatcctcac ccagcaccca 1800 caactagcgc ctgggcaggg cagggctggc acctctcaac gtctgtggac tgaatgaata 1860 aaccctcctc atccacccct atttatctcc atcaccattt ccccctcttt cttgttcctg 1920 gaaacggctg ctgagtctcc atcggccaaa cttatctgcc ctgtgatttc tttgacaatt 1980 ctccttttcc cccagaaccc accctgggtt gaccagagtc tgggaagaag gacaagagaa 2040 cccggcaaac tccctcctag gattaacttt gtaaagcacc cttgccctgt agctgcaagg 2100 gctgtggaac ctgggcagcc cgcaaccacc tttagctctg ggccccccag gccagcctgg 2160 agcatggctg ggtggggcca ccagcccatg ctctcaggcg ggcctgtgat ctttcccagg 2220 gcacatggac tgtaggctgg ccctggccca caccaccaca ctctccccag ccatggacag 2280 aggcagccag aggcctcacg gtttctcctc cgagtttctg gctgggtgta gttctcagaa 2340 accccagtgc ctgcgtgtgt ccactcgtgg gtgtggtttg tgtgcaagag ctgaggattt 2400 ggcgatgctt gggaggggta gttgtgggta cagacggtgt gggggtggga agtggtgcag 2460 agactgaaga gggtcaacct gggcatgggg gacacaggga ctgctgagaa cgtgcgtgtc 2520 atctttgctc tgatggggtg gacatagcag aaaatctaac tctgtctgta gccccataca 2580 gaatgccagg gtgagcacag tggctggtgc ctttaatccc agcactttgg aaagttgagg 2640 caggaggatc gcttgagccc aggagttcga gtctgaagtg agctgtgatt gcaccactgc 2700 acttcagcct gggcaacaga gtgagcccct gtctcaaaaa agaaaagaaa aagaaagcca 2760 ggcttcatgg aaagatcgta tgtgtgaccc aaatatgagt tcttcagctc agccatggta 2820 atcccttcct tgaagtctcc atttctgcag tacacatgca tgtgcgctct ctctctctct 2880 ctctctctca cacacacaca cacacacaca cacacacgcg cgcgcgcgcg cgctctcctg 2940 cgaacagagg cagggggaga ggggtttgcc ctggtctcgg ggactggtct ggctggcgct 3000 tccccactgc acgtttccag gtttagtttg tctgtgtctc ctcttccatc ccaggggctg 3060 agccccttcc atcctccaag aggaaccagt gagagtgagt gaaggagggg cctggagcca 3120 gggacttccc ctgtggggcc tgggtggaga ggggagaact caatggtgct gcctttgaga 3180 ccagcccagg ctacagccca ggagcacaca tgggccaggg cagttggtat ttcccgagga 3240 caaagaggaa attttcaaag aggaagttgt tgagttagag 
cttgcggtgg ctgagagcag 3300 acaggttgac ctgcaaaaaa agacagggga ggcatgtgag tgtgacagcc ctgctctgtg 3360 gcctgggcag gagatggggg aaagggtcag gtgggggatg ggctcgtgca gtgggagagg 3420 agacggaggg agggagcggg aaggggcttg cttagtgggt gggaagagct gagctcggat 3480 ggaaccagct tctaccagcc aggctgggca cccactgggc tgcatctggt ggccttttct 3540 gattgctatt tggactcact gcagctgcag aatgacagag gccatgtcca aaatccctta 3600 gagacactgt tgtcttagag ttgttaaaat aagagccccc atatcaggtt tagaaaatac 3660 tgtcaccgaa cgaacgtcgc tgtcctcagc tccacctccc tltcctttga cagatatggt 3720 tgttttctaa gccaggactg gttttagtca ggtcctgggc gaatcctgaa aaaaagaggt 3780 agtacgggta aggaaggcac ccaacagggc tttcacaatc cagaaaatat caaaatataa 3840 gtgttaaaag agaggcacag gccgggtgcg gtggctcacg cctgtaatct cagcactttg 3900 ggaggccaag gtgggcagat catgaggtca ggagtttgag accagcctgg ccaatatgat 3960 gaaaccccgt ttctactaaa aatacaaaag ttagccaggc atggtggtgt gctcctgtaa 4020 tcccagctac ttaggaggct gaggccagag aattgcttga accctggagt cagaggttgc 4080 agtgagccgg gatcatgcca ctgtactcca ggctgggtga caaagtgaga ctgtctcaaa 4140 aaataaaaat aaataaaata aataaaagag aggcacaaac agtgttatga atgcaccaag 4200 gaaaatggtg cattcataac tctcaggtga agcctaccaa gccatgcgtg tgtgcacata 4260 tgtgtgtacg tgtgcatgtg cgtgcgtgca tgtgcgtgcg tgcatgtgcc tgtgtgtgta 4320 tgtgtgcaca tgtgtgtgcg catgtgtgtg tgtgcgcgca tgtgtgtgtg catgcatgtt 4380 ctcccatgca tgtgtactgt ggcaagggag actttgagga agagattcca gtggctgagc 4440 agaagggctc gcattgccct ggcgaaaggt tggaaggctt cacctgagag tgtgtcgtgg 4500 cctttgtcat atccactgct tgattccttt ctttaaaaat tatttttatt gttttctaca 4560 tatgagaacc accacacctg gctaattttt gtattttttg tagagatggg gtttcaccat 4620 gttgtcccgg ctggtctcaa actcccgggc acaagagatc cacctgcctc agcctcccaa 4680 aatgctggga ctataggcat gagccactgc acccagccac tgcttcattc ctggtggctg 4740 ctgtgcctgg catgttgcag atcctccatg aatatgcatt tgaatgaatg aatgaatgaa 4800 tgaatgaatg aatggagatg acgcctcaga gattctttct tttgagatga ggtctcattc 4860 tgtcacccag actagagggc agtggtgcaa tcacagctca ccacagcctc aacctcctgg 4920 gcctcccaag tagctgcgat cacaggtgtg caccaacatg cccagctaat 
tttttttttt 4980 aatttttaat ttgtacagac agggtcttgc tgtgttgccc aggctggtct caaactcctg 5040 ggctcaagtg gtcctcccac ctaagcttcc ccaaatactg ggattatagg tgtgagccac 5100 tgtgcccagg cttgcctcag atatttgaag gctgggaagg attttgcaaa gctgggaaaa 5160 ggaaaaggca ttcccagcag aggggatagc aggtggaaat acataattaa aaaaaaaaaa 5220 cgtggagcag atccagcgca gtggctcatg cctgtaatcc cagcactttg ggaggcagag 5280 cagggaggat tgcttgagtc taggagttca agaccagcct gggtaacata gaaagaccct 5340 gtctctacaa aaacacaaaa aattagccag gcgtggtggt gcatgcctgt agtaccagct 5400 acttgagaag ctgaggcagg aggactgctt gaagccagga gtttgagacc agcctgggca 5460 acatagtgag accccgtgtc tacaaaaagt aaacatttat atatatattt tttaaagtgg 5520 agcagttcaa tatagagtct tttttgaaca aacgtgaaat agatgtcttt tttttttttt 5580 tgagatggag ttttcactct tgttacccag gctggagtgc aatggcgtga tcttggctca 5640 ccagaacctc cgcctcctgg gttcaaacaa ttctcctgcc tcagcctccc aggtagctgg 5700 gattacaggc atgcaccacc aaacccggat aattttgtat ttttagtaga gatggggttt 5760 caccatgttg gtcaagctgg tcttgaactc ccgacctctg ctgatccgta tgcctcggcc 5820 tcccaaagtg ctgggattac atgcgtgagc caccgtgccc gacaatagat gtcttttaat 5880 tttctggagg aaaaagcaaa gcaaaagaag cagtggatat tttaagacta aaaaggaaaa 5940 caaaaaaagg agatagagca ggccagacgt ggtggctcaa cgtctgtaat cccagcactt 6000 tgggaggccg aggcaggtgg atcacctgag gtcaggagtt caagaccagc ctgaccaaca 6060 tggtgaaacc ctgtttcaaa atacaaaaaa ttagctgggc gtggtggcgg gcacctgtaa 6120 tcccagctac ttgggaggct gaggcaggag aatcccttga acccaggagg tggaggttgc 6180 agtgagccga gatcacgcca ttgcactcca gcctgggcga caagtaaaaa actccatctc 6240 aaaaaaaaaa aaggagatag agcaaggaac agtaagaaaa tagttgggtg cagtggctat 6300 gcggtggcac tataggaggc tgaggcgggc agatcacctg aggtcaggag ttggagacca 6360 gcctgggcaa catagacccc tatctctaca aaaaatttga aatatgaaaa attagccagg 6420 tgtagtggtg tgcgcctgtg gtaccagcta ctcaagaggc gaaggcaggg aagattgctt 6480 gagcccagga gtttgaggct atagtgagct gtgatcatgc cactgcactc cagcctgggc 6540 aatagtgtga gactctgtct caaaagaaag aacatggcca ggcgtggtgg ctcacacctg 6600 taatcccagc actttgggag gccgaggcag tcagatcatg aggccagttt gaaaccagcc 
6660 tggccaacat ggtgaaaccc tgtctctact aaaaatacaa aaattagcca ggcgtggtgg 6720 catgcccccg taatcccagc tacttggggg tctgaggcag aagaattgct tgaaaccggg 6780 aggcagaggt tgcagtgagc cgagatcgtg tcattgcact ctagtctggg cgacagagca 6840 agactccgtc ttggaaaaaa atttaaaaaa a 6871 <210> 21 <211> 2033 <212> DNA <213> Homo sapiens <400> 21 ggagaatccc cggaaaggct gagtctccag ctcaaggtca aaacgtccaa ggccgaaagc 60 cctccagttt cccctggacg ccttgctcct gcttctgcta cgaccttctg gggaaaacga 120 atttctcatt ttcttcttaa attgccatn tcgctttagg agatgaatgt tttcctttgg 180 ctgttttggc aatgactctg aattaaagcg atgctaacgc ctcttttccc cctaattgtt 240 aaaagctatg gactgcagga agatggcccg cttctcttac agtgtgattt ggatcatggc 300 catttctaaa gtctttgaac tgggattagt tgccgggctg ggccatcagg aatttgctcg 360 tccatctcgg ggatacctgg ccttcagaga tgacagcatt tggccccagg aggagcctgc 420 aattcggcct cggtcttccc agcgtgtgcc gcccatgggg atacagcaca gtaaggagct 480 aaacagaacc tgctgcctga atgggggaac ctgcatgctg gggtcctttt gtgcctgccc 540 tccctccttc tacggacgga actgtgagca cgatgtgcgc aaagagaact gtgggtctgt 600 gccccatgac acctggctgc ccaagaagtg ttccctgtgt aaatgctggc acggtcagct 660 ccgctgcttt cctcaggcat ttctacccgg ctgtgatggc cttgtgatgg atgagcacct 720 cgtggcttcc aggactccag aactaccacc gtctgcacgt actaccactt ttatgctagt 780 tggcatctgc ctttctatac aaagctacta ttaatcgaca ttgacctatt tccagaaata 840 caattttaga tatcatgcaa atttcatgac cagtaaaggc tgctgctaca atgtcctaac 900 tgaaagatga tcatttgtag ttgccttaaa ataatgaata caatttccaa aatggtctct 960 aacatttcct tacagaacta cttcttactt ctttgccctg ccctctccca aaaaactact 1020 tcttttttca aaagaaagtc agccatatct ccattgtgcc taagtccagt gtttcttttt 1080 tttttttttt ttgagacgga gtctcactct gtcacccagg ctggactgca atgacgcgat 1140 cttggttcac tgcaacctcc gcatccgggg ttcaagccat tctcctgcct aagcctccca 1200 agtaactggg attacaggca tgtgtcacca tgcccagcta atttttttgt attttagtag 1260 agatgggggt ttcaccatat tggccagtct ggtctcgaac tctgaccttg tgatccatcg 1320 atcagcctct cgagtgctga gattacacac gtgagcaact gtgcaaggcc tggtgtttct 1380 tgatacatgt aattctacca aggtcttctt aatatgttct tttaaatgat tgaattatat 
1440 gttcagatta ttggagacta attctaatgt ggaccttaga atacagtttt gagtagagtt 1500 gatcaaaatc aattaaaata gtctctttaa aaggaaagaa aacatcttta aggggaggaa 1560 ccagagtgct gaaggaatgg aagtccatct gcgtgtgtgc agggagactg ggtaggaaag 1620 aggaagcaaa tag aagagag aggttgaaaa acaaaatggg ttacttgatt ggtgattagg 1680 tggtggtaga gaagcaagta aaaaggctaa atggaagggc aagtttccat catctataga 1740 aagctatata agacaagaac tccccttttt ttcccaaagg cattataaaa agaatgaagc 1800 ctccttagaa aaaaaattat acctcaatgt ccccaacaag attgcttaat aaattgtgtt 1860 tcctccaagc tattcaattc ttttaactgt tgtagaagac aaaatgttca caatatattt 1920 agttgtaaac caagtgatca aactacatat tgtaaagccc atttttaaaa tacattgtat 1980 atatgtgtat gcacagtaaa aatggaaact atattgacct aaaaaaaaaa aaa 2033 <210> 22 <211> 1791 <212> DNA <213> Homo sapiens <400> 22 gagcccggct gcggatctgg gaagcgcctc ttcacggcac tgggatccgc atctgcctgg 60 gatcatcaag ccctagaagc tgggtttctt taaattaggg ctgccgtttt ctgtttctcc 120 ctgggctgcg gaaagccaga agattttatc tagcttatac aaggctgctg gtgttccctc 180 tttttttcca cgagggtgtt tttggctgca attgcatgaa atcccaatgg tgtagaccag 240 tggcgatgga tctaggagtt taccaactga gacatttttc aatttctttc ttgtcatcct 300 tgctggggac tgaaaacgct tctgtgagac ttgataatag ctcctctggt gcaagtgtgg 360 tagctattga caacaaaatc gagcaagcta tggatctagt gaaaagccat ttgatgtatg 420 cggtcagaga agaagtggag gtcctcaaag agcaaatcaa agaactaata gagaaaaatt 480 cccagctgga gcaggagaac aatctgctga agacactggc cagtcctgag cagcttgccc 540 agtttcaggc ccagctgcag actggctccc cccctgccac cacccagcca cagggcacca 600 cacagccccc cgcccagcca gcatcgcagg gctcaggacc aaccgcatag ctgcctatgc 660 ccccgcagaa ctggctgctg cgtgtgaact gaacagacgg agaagatgtg ctagggagaa 720 tctgcctcca cagtcaccca tttcattgct cgctgcgaaa gagacgtgag actgacatat 780 gccattatct cttttccagt attaaacact catatgctta tggcttggag aaatttctta 840 gttgggtgaa ttaaaggtta atccgagaat tagcatggat ataccgggac ctcatgcagc 900 ttggcagata tctgagaaat ggtttaattc atgctcagga gctgtgtgcc tttccatccc 960 ttccggctcc ctacccctca cttccaaggg ttctctctcc tgcttgcgct tagtgtccta 1020 catggggttg tgaagcgatg gagctcctca 
ctggactcgc ctctctcctc tcctcccccc 1080 aggaggaact tgaaaggagg gtaaaaagac taaaatgagg gggaacagag ttcactgtac 1140 aaatttgaca actgtcacca aaattcataa aaaacaatag tactgtgcct ctttcttctc 1200 aaacaatgga tgacacaaaa ctatgagagt gacaaaatgg tgacaggtag ctgggaccta 1260 ggctatctta ccatgaaggt tgttttgctt attgtatatt tgtgtatgta gtgtaactat 1320 tttgtacaat agaggactgt aactactatt taggttgtac agattgaaat ttagttgttt 1380 cattggctgt ctgaggaggt gtggactttt atatatagat ctacataaaa actgctacat 1440 gacaaaaacc acacctaaag aaattttaag aatttggcac agttactcac tttgtgtaat 1500 ctgaaatcta gctgctgaat acgctgaagt aaatccttgt tcactgaagt ctttcaattg 1560 agctggttga atactttgaa aaatgctcag ttctaactaa tgaaatggat ttcccagtag 1620 gggtttctgc atatcacctg tatagtagtt atatgcatat gtttctgtgc atgttctcta 1680 cacaattgta aggtgtcact gtatttaact gttgcacttg tcaactttca ataaagcata 1740 taaatgttga taaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa a 1791

Claims (19)

1. Method for screening a biological sample to detect early stages of infection, SIRS or sepsis comprising the steps of: a) detecting expression of a first set of informative biomarkers by RT-PCR and detecting expression of a second set of informative biomarkers by RT-PCR; b) analysing the results of detection; c) classifying said sample according to the likelihood and/or timing of the development of overt infection, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 1 selected from the list consisting of CD40, CD5, CD79A, CRX, CTNND1, CX3CL1, ENTPD2, ENTPD5, EPHA8, GPR44, HMMR, IL-8, MAP1A, MAPK7, MEF2D, ODF1, SAA3P, SLC6A9, SPN, TDGF1, TSC22D1 and HDAC5 and wherein the second set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 1 selected from the list consisting of CD178, MCP-1, TNFα, IL-1β, IL-6, IL-10, INF-α, INF-γ.
2. Method according to claim 1, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 2 selected from the list.
3. Method according to either claim 1 or claim 2, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 3 selected from the list.
4. Method according to any preceding claim, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 4 selected from the list.
5. Method according to any preceding claim, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 6 selected from the list.
6. Method according to any preceding claim, wherein the first set of informative biomarkers the expression of which is detected by means of RT-PCR consists of at least 12 selected from the list.
7. Method according to claim 6, wherein the set of informative biomarkers consists of CD40, CRX, ENTPD2, EPHA8, IL-8, HMMR, MAP1A, MAPK7, MEF2D, SAA3P, SPN, TSC22D1.
8. Method according to claim 1, wherein at least 2 biomarkers selected from the second list are detected.
9. Method according to claim 8, wherein at least 3 biomarkers selected from the second list are detected.
10. Method according to claim 9, wherein at least 4 biomarkers selected from the second list are detected.
11. Method according to claim 10, wherein at least 6 biomarkers selected from the second list are detected.
12. Method according to claim 11, wherein at least 8 biomarkers selected from the second list are detected.
13. Method according to any preceding claim, wherein analysis of the results yields a prediction of a probability of clinical SIRS or sepsis developing.
14. Method according to any of claims 1 to 12, wherein analysis of the results yields a binary yes/no prediction of clinical SIRS or sepsis developing.
15. Method according to any preceding claim, wherein if the development of clinical SIRS or sepsis is predicted, the results are subjected to a second analysis to determine the likely timing and/or severity of the clinical disease.
16. Method of any preceding claim wherein analysis is by means of a neural network.
17. Method of claim 16 wherein the neural network is a multilayered perceptron neural network.
18. Method of any preceding claim, wherein the analysis is capable of correctly predicting clinical SIRS or sepsis in greater than 80% of cases.
19. Analysis according to the method of any preceding claim for the preparation of a diagnostic means for the diagnosis of infection, SIRS or sepsis.
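
As an illustration only, the analysis step of claims 13, 14, 16 and 17 can be sketched as a forward pass through a small multilayer perceptron: normalised RT-PCR expression values go in, a probability of clinical SIRS or sepsis comes out (claim 13), and a threshold turns that probability into a yes/no call (claim 14). Everything specific here is an assumption and is not taken from the patent: the function names, the single tanh hidden layer, the sigmoid output, the 0.5 threshold and the example weights, which are arbitrary placeholders and not trained values.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_probability(
    features: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    hidden_biases: Sequence[float],
    output_weights: Sequence[float],
    output_bias: float,
) -> float:
    """Forward pass of a one-hidden-layer perceptron.

    features: one normalised expression value per biomarker (hypothetical input).
    Returns a value in (0, 1), read here as the probability of clinical
    SIRS or sepsis developing.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def classify(probability: float, threshold: float = 0.5) -> bool:
    """Binary yes/no prediction; the 0.5 threshold is an assumed default."""
    return probability >= threshold


# Placeholder parameters for three biomarkers and two hidden units.
# These numbers are illustrative only and carry no biological meaning.
HIDDEN_W = [[0.4, -0.7, 0.2], [-0.3, 0.5, 0.9]]
HIDDEN_B = [0.1, -0.2]
OUT_W = [1.2, -0.8]
OUT_B = 0.05

if __name__ == "__main__":
    sample = [0.3, -1.2, 0.8]  # hypothetical normalised expression values
    p = mlp_probability(sample, HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
    print(f"probability={p:.3f} predicted={'yes' if classify(p) else 'no'}")
```

In practice the weights would be fitted to labelled patient samples. The second analysis of claim 15 (likely timing and/or severity) could be a separate model that is run only when `classify` returns a positive call.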
GB0820899A 2007-11-16 2008-11-17 Diagnosing infection, SIRS or sepsis using RT-PCR and biomarkers Withdrawn GB2454799A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0722582A GB0722582D0 (en) 2007-11-16 2007-11-16 Early detection of sepsis

Publications (2)

Publication Number Publication Date
GB0820899D0 GB0820899D0 (en) 2008-12-24
GB2454799A true GB2454799A (en) 2009-05-20

Family

ID=38896479

Family Applications (2)

Application Number Title Priority Date Filing Date
GB0722582A Ceased GB0722582D0 (en) 2007-11-16 2007-11-16 Early detection of sepsis
GB0820899A Withdrawn GB2454799A (en) 2007-11-16 2008-11-17 Diagnosing infection, SIRS or sepsis using RT-PCR and biomarkers

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GB0722582A Ceased GB0722582D0 (en) 2007-11-16 2007-11-16 Early detection of sepsis

Country Status (2)

Country Link
GB (2) GB0722582D0 (en)
WO (1) WO2009063249A2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013107826A2 (en) * 2012-01-17 2013-07-25 Institut Pasteur Use of cellular biomarkers expression to diagnose sepsis among intensive care patients
CN110129425A (en) * 2013-06-28 2019-08-16 睿智研究实验室私人有限公司 Pyemia biomarker and its application
CN103882039B (en) * 2013-09-23 2016-04-20 中国农业科学院上海兽医研究所 Detect the fluorescent quantitative RT-PCR method of duck MAPK1 gene
CN107840876A (en) * 2016-09-18 2018-03-27 北京奥维亚生物技术有限公司 A kind of mouse Odf1 polypeptides and its preparation method for antibody

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040096917A1 (en) * 2002-11-12 2004-05-20 Becton, Dickinson And Company Diagnosis of sepsis or SIRS using biomarker profiles
WO2004043236A2 (en) * 2002-11-12 2004-05-27 Becton, Dickinson And Company Diagnosis of sepsis or sirs using biomarker profiles
WO2006061644A1 (en) * 2004-12-09 2006-06-15 Secretary Of State For Defence Early detection of sepsis
US20060246495A1 (en) * 2005-04-15 2006-11-02 Garrett James A Diagnosis of sepsis

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1611255A2 (en) * 2003-04-02 2006-01-04 SIRS-Lab GmbH Method for recognising acute generalised inflammatory conditions (sirs), sepsis, sepsis-like conditions and systemic infections
DE102004015605B4 (en) * 2004-03-30 2012-04-26 Sirs-Lab Gmbh Method for predicting the individual disease course in sepsis
DE102004016437A1 (en) * 2004-04-04 2005-10-20 Oligene Gmbh Method for detecting signatures in complex gene expression profiles

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040096917A1 (en) * 2002-11-12 2004-05-20 Becton, Dickinson And Company Diagnosis of sepsis or SIRS using biomarker profiles
WO2004043236A2 (en) * 2002-11-12 2004-05-27 Becton, Dickinson And Company Diagnosis of sepsis or sirs using biomarker profiles
WO2004044554A2 (en) * 2002-11-12 2004-05-27 Becton, Dickinson And Company Diagnosis of sepsis or sirs using biomarker profiles
WO2006061644A1 (en) * 2004-12-09 2006-06-15 Secretary Of State For Defence Early detection of sepsis
US20060246495A1 (en) * 2005-04-15 2006-11-02 Garrett James A Diagnosis of sepsis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lancet, Vol. 346, 1995, Dybowski et al, 'Artificial neural networks in pathology and medical laboratories', pp. 1203-1207 *

Also Published As

Publication number Publication date
GB0820899D0 (en) 2008-12-24
WO2009063249A2 (en) 2009-05-22
GB0722582D0 (en) 2007-12-27
WO2009063249A3 (en) 2009-10-15

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)