WO2017208001A1 - Biomarkers for platelet disorders - Google Patents

Biomarkers for platelet disorders

Info

Publication number
WO2017208001A1
Authority
WO
WIPO (PCT)
Prior art keywords
biomarkers
disorder
modulator
disease
ensg000001
Application number
PCT/GB2017/051572
Other languages
French (fr)
Inventor
Peter Fraser
Stefan SCHOENFELDER
Mikhail SPIVAKOV
Original Assignee
Babraham Institute
Application filed by Babraham Institute
Priority to EP17731938.1A (patent EP3465219A1)
Publication of WO2017208001A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere

Definitions

  • the invention relates to novel targets and biomarkers for the diagnosis and treatment of selected diseases or disorders, in particular blood disorders (such as platelet disorders and red blood cell disorders), autoimmune disorders (such as ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease), diabetes (such as Type 1 diabetes and Type 2 diabetes), height/growth disorders, disorders related to lipid metabolism, disorders related to glucose metabolism, disorders related to insulin metabolism, disorders related to bone mineral density, disorders related to blood pressure and disorders related to body mass index.
  • the invention also relates to methods of diagnosing said diseases or disorders and methods of screening for modulators of said biomarkers.
  • Genomic regulatory elements such as enhancers determine spatiotemporal patterns of gene expression. It has been estimated that there are up to 1 million enhancer elements with gene regulatory potential in mammalian genomes (The ENCODE Project Consortium, 2012), based primarily on detection of specific combinations of chromatin features, and to a lesser extent, on their ability to activate transgenic reporters in vivo (Akhtar et al., 2013; Arnold et al., 2013).
  • Hi-C has begun to shed light on global genome organization, revealing higher-order structures such as contact domains and topologically associated domains (TADs) (Dixon et al., 2012; Nora et al., 2012; Sexton et al., 2012), and enabling the structural modelling of global chromosomal architecture in multiple species (Ay et al., 2014; Duan et al., 2010; Lieberman-Aiden et al., 2009; Nagano et al., 2013).
  • TADs topologically associated domains
  • sequence capture to enrich for Hi-C interactions that involve specific regions of interest is a versatile approach to overcome the limitations imposed by complexity (Dryden et al., 2014; Jager et al., 2015; Mifsud et al., 2015; Sahlen et al., 2015; Schoenfelder et al., 2015a), enabling robust and sensitive interaction calling based on statistical significance (Cairns et al., 2016).
  • Capture Hi-C avoids the challenges associated with alternative methods of target enrichment, such as selective PCR amplification as in 5C (Sanyal et al., 2012), or immunoprecipitation as in ChIA-PET (Fullwood et al., 2009), and allows interactome profiling of large numbers of baited regions in a single experiment, regardless of their activity state or factor binding.
  • PCHi-C promoter Capture Hi-C
  • the inventors have recently developed promoter Capture Hi-C (PCHi-C) which targets all annotated promoters, and used it to study the genomic regulatory architecture in mouse and human cell lines (Mifsud et al., 2015; Schoenfelder et al., 2015a, 2015b).
  • the studies described herein use PCHi-C in primary cells to generate a comprehensive catalogue of the interactomes of 31,253 annotated promoters in 17 human primary blood cell types. Devising a statistical methodology to link GWAS SNPs to their putative target genes based on PCHi-C interaction data, hundreds of new candidate genes and pathways associated with common diseases have been identified.
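The idea of linking GWAS SNPs to putative target genes through PCHi-C interaction data can be illustrated with a minimal sketch. This is not the inventors' statistical methodology, which is not detailed in this passage; it only shows the underlying overlap step, assuming each interaction is recorded as a baited gene plus the coordinates of its promoter-interacting region (PIR). The names `Interaction` and `link_snps_to_genes`, and all coordinates, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Interaction(NamedTuple):
    gene: str    # gene whose promoter was baited (hypothetical label)
    chrom: str   # chromosome of the promoter-interacting region (PIR)
    start: int   # PIR start, inclusive
    end: int     # PIR end, inclusive


def link_snps_to_genes(snps, interactions):
    """Map each SNP id to the sorted list of genes whose PIRs contain it.

    SNPs that fall in no PIR are omitted from the result.
    """
    links = defaultdict(set)
    for snp_id, (chrom, pos) in snps.items():
        for it in interactions:
            if it.chrom == chrom and it.start <= pos <= it.end:
                links[snp_id].add(it.gene)
    return {snp: sorted(genes) for snp, genes in links.items()}


# Toy, made-up coordinates purely for illustration.
interactions = [
    Interaction("GENE_A", "chr1", 1_000, 5_000),
    Interaction("GENE_B", "chr1", 4_000, 9_000),
    Interaction("GENE_C", "chr2", 1_000, 5_000),
]
snps = {"snp1": ("chr1", 4_500), "snp2": ("chr2", 9_999)}

print(link_snps_to_genes(snps, interactions))
```

A real analysis would additionally need to weigh interaction scores and the GWAS association statistics; the sketch deliberately leaves those out.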
  • a modulator of one or more of the biomarkers of Tables 3 to 7 for use in the treatment of an autoimmune disorder is provided.
  • a modulator of one or more of the biomarkers of Tables 8 or 9 for use in the treatment of diabetes is provided.
  • a modulator of one or more of the biomarkers of Table 10 for use in the treatment of height/growth disorders.
  • a modulator of one or more of the biomarkers of Table 11 for use in the treatment of disorders related to lipid metabolism.
  • a modulator of one or more of the biomarkers of Table 12 for use in the treatment of disorders related to glucose metabolism.
  • a modulator of one or more of the biomarkers of Table 13 for use in the treatment of disorders related to insulin metabolism.
  • a modulator of one or more of the biomarkers of Table 14 for use in the treatment of disorders related to bone mineral density.
  • a modulator of one or more of the biomarkers of Table 15 for use in the treatment of disorders related to blood pressure.
  • a modulator of one or more of the biomarkers of Table 16 for use in the treatment of disorders related to body mass index.
  • biomarkers as defined herein for the diagnosis or prognosis of a disease or disorder as defined herein.
  • a method of diagnosing a disease or disorder as described herein comprising:
  • a method of prognosing the development of a disease or disorder as described herein in an individual comprising:
  • a method of monitoring efficacy of a therapy in a subject having, suspected of having, or of being predisposed to a disease or disorder as described herein comprising detecting and/or quantifying, in a sample from said subject, the biomarkers as defined herein.
  • a further aspect of the invention provides ligands, such as naturally occurring or chemically synthesised compounds, capable of specific binding to the biomarker.
  • a ligand according to the invention may comprise a peptide, an antibody or a fragment thereof, or an aptamer or oligonucleotide, capable of specific binding to the biomarker.
  • the antibody can be a monoclonal antibody or a fragment thereof capable of specific binding to the biomarker.
  • a ligand according to the invention may be labelled with a detectable marker, such as a luminescent, fluorescent or radioactive marker; alternatively or additionally a ligand according to the invention may be labelled with an affinity tag, e.g. a biotin, avidin, streptavidin or His (e.g. hexa-His) tag.
  • a biosensor according to the invention may comprise the biomarker or a structural/shape mimic thereof capable of specific binding to an antibody against the biomarker. Also provided is an array comprising a ligand or mimic as described herein.
  • ligands as described herein, which may be naturally occurring or chemically synthesised, and is suitably a peptide, antibody or fragment thereof, aptamer or oligonucleotide, or the use of a biosensor of the invention, or an array of the invention, or a kit of the invention to detect and/or quantify the biomarker.
  • the detection and/or quantification can be performed on a biological sample such as from the group consisting of whole blood, blood serum, plasma, CSF, urine, saliva, or other bodily fluid, breath, e.g. as condensed breath, or an extract or purification therefrom, or dilution thereof.
  • kits for performing methods of the invention.
  • kits will suitably comprise a ligand according to the invention, for detection and/or quantification of the biomarker, and/or a biosensor, and/or an array as described herein, optionally together with instructions for use of the kit.
  • kits comprising a biosensor capable of detecting and/or quantifying the biomarkers as defined herein for monitoring, prognosing or diagnosing a disease or disorder as described herein or a predisposition thereto.
  • Biomarkers for the diseases or disorders as described herein are essential targets for discovery of novel targets and drug molecules that retard or halt progression of the disease or disorder. As the level of the biomarker is indicative of disorder and of drug response, the biomarker is useful for identification of novel therapeutic compounds in in vitro and/or in vivo assays. Biomarkers of the invention can be employed in methods for screening for compounds that modulate the activity of the biomarker.
  • a ligand as described, which can be a peptide, antibody or fragment thereof or aptamer or oligonucleotide according to the invention; or the use of a biosensor according to the invention, or an array according to the invention; or a kit according to the invention, to identify a substance capable of promoting and/or of suppressing the generation of the biomarker.
  • Also there is provided a method of identifying a substance capable of promoting or suppressing the generation of the biomarker in a subject comprising administering a test substance to a subject animal and detecting and/or quantifying the level of the biomarker present in a test sample from the subject.
  • a method for treating a disease or disorder as described herein may comprise treating a patient with suitable medicament and/or non-drug therapies. Treatment may be based upon a diagnosis or suspicion of a disease or disorder as described herein derived from the methods, biomarkers and specific panels of biomarkers as described herein.
  • results of any analyses according to the invention will often be communicated to physicians and/or patients (or other interested parties such as researchers) in a transmittable form that can be communicated or transmitted to any of the above parties. Therefore, according to a further aspect of the invention, there is provided systems for diagnosing and treating a disease or disorder as described herein. These systems may comprise sample analyzers, computers and software as described herein.
  • FIG. 1 Promoter Capture Hi-C (PCHi-C) across 17 human primary blood cell types.
  • C Interaction landscape of the INPP4B, RHAG, ZEB2-AS and ALAD promoters in naive CD4+ cells (nCD4), erythroblasts (Ery) and monocytes (Mon).
  • nCD4 naive CD4+ cells
  • Ery erythroblasts
  • Mon monocytes
  • FIG. 2 Promoter interactions reflect the lineage relationships of the haemopoietic tree.
  • PCA Principal Component Analysis
  • (B) Top (dendrogram): hierarchical clustering of the cell types according to their interaction profiles, with lymphoid-lineage cells on the left (nB: naive B cells, tB: total B cells, FetT: fetal thymus, aCD4: activated CD4, naCD4: non-activated CD4, tCD4: total CD4, nCD8: naive CD8, nCD4: naive CD4, tCD8: total CD8), and myeloid cells on the right (Mon: monocytes, Neu: neutrophils, MΦ0-2: macrophages M0, M1, M2, EndP: endothelial precursors, MK: megakaryocytes, Ery: erythroblasts).
  • Bottom (heatmap): Autoclass Bayesian clustering of interactions according to their cell-type specificity. Cluster IDs are shown on the right.
  • (C) Cell-type specificity of interaction clusters.
  • Cell types and clusters are arranged as in (B).
  • the specificity score is a lineage-tree weighted deviation from the mean interaction score per cluster (see Materials and Methods for details).
  • Interactions in clusters 1 - 15 are generally lymphoid-specific, in clusters 28 - 34 generally myeloid-specific, while the remaining clusters show broad specificity across lineages.
  • FIG. 3 Promoters preferentially connect to active enhancers.
  • FIG. 5 Promoter-interacting regions are enriched for eQTLs.
  • A, B Proportion of significant associations of gene expression with SNPs within PIRs in monocytes (A) and B cells (B). Values are compared to randomised PIRs at binned distances of SNPs from the transcription start site. Asterisks represent the significance of enrichment at observed versus randomised PIRs (permutation test *p < 0.05; **p < 0.01; ***p < 0.001).
  • A Enrichment of GWAS summary statistics at PIRs by tissue type. Axes reflect BLOCKSHIFTER z-scores for two different tissue group comparisons, first lymphoid vs myeloid, then additionally within the myeloid lineage (Mon, monocytes; MΦ, macrophages; Neu, neutrophils; MK, megakaryocytes; Ery, erythroblasts).
  • BMI Body Mass Index
  • BP_D Diastolic blood pressure
  • BP_S Systolic blood pressure
  • CD Crohn's disease
  • CEL Celiac disease
  • FNBMD Femoral neck bone mineral density
  • GLC Glucose sensitivity
  • GLC_B Glucose sensitivity BMI-adjusted
  • HB Haemoglobin
  • HDL High density lipoprotein
  • HEIGHT Height
  • INS Insulin sensitivity
  • INS_B Insulin sensitivity BMI-adjusted
  • LDL Low density lipoprotein
  • LSBMD Lumbar spine bone mineral density
  • MCH Mean corpuscular haemoglobin
  • MCHC Mean corpuscular haemoglobin concentration
  • MCV Mean corpuscular volume
  • MS Multiple sclerosis
  • PBC Primary biliary cirrhosis
  • PCV Packed cell volume
  • PLT Platelet count
  • PV Platelet volume
  • RA Rheumatoid arthritis
  • bubble size indicates the ratio of test genes to those in the pathway, and blue to red corresponds to decreasing adjusted p value for enrichment.
  • Figure 7 Higher-order topological properties of eight blood cell types.
  • Top panel Distributions of the frequencies of promoter interactions (per bait) that cross the cognate TAD boundaries in three representative cell types. Black bars show the observed frequencies, and grey bars show expected frequencies computed by permuting TAD boundaries 1000 times (see Materials and Methods for details). The error bars show the most extreme observations in the 1000 permutations. On the X axis, 1 corresponds to a scenario whereby all interactions of a given bait localise within the same TAD as the bait, and 0 corresponds to a scenario whereby all interactions of a given bait cross TAD boundaries.
  • Bottom panel examples of baits with PIRs mapping fully within (left) or fully outside (right) the baits' TADs.
  • (B) Coverage-and-distance corrected Hi-C matrices of chromosome 1 show the log2- enrichment of interactions between chromatin segments binned at 1 Mb resolution.
  • the eight analysed cell types (MK, megakaryocytes; Ery, erythroblasts; Neu, neutrophils; Mon, monocytes; MΦ, macrophages M0; nCD4, naive CD4+ T cells; nCD8, naive CD8+ T cells) are shown in columns, and the respective biological replicates are in rows.
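The per-bait statistic described in the Figure 7 legend (1 when all of a bait's interactions stay within the bait's TAD, 0 when all cross a boundary) and its permutation control can be sketched as follows. This is a simplified illustration on a single chromosome with point coordinates. The legend does not say how boundaries were permuted, so the circular shift used here is an assumption, as are the function names and toy coordinates.

```python
import bisect
import random


def tad_index(boundaries, pos):
    """Index of the TAD containing pos, given sorted boundary coordinates."""
    return bisect.bisect_right(boundaries, pos)


def within_tad_fraction(bait_pos, pir_positions, boundaries):
    """Fraction of a bait's interactions whose PIR lies in the bait's TAD.

    Assumes pir_positions is non-empty and boundaries is sorted.
    """
    bait_tad = tad_index(boundaries, bait_pos)
    same = sum(tad_index(boundaries, p) == bait_tad for p in pir_positions)
    return same / len(pir_positions)


def permuted_fractions(bait_pos, pir_positions, boundaries, chrom_len,
                       n_perm=1000, seed=0):
    """Recompute the fraction after randomly shifting all TAD boundaries.

    Assumption: boundaries are permuted by a circular shift along the
    chromosome, which preserves TAD sizes but not their positions.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_perm):
        shift = rng.randrange(chrom_len)
        shifted = sorted((b + shift) % chrom_len for b in boundaries)
        fractions.append(within_tad_fraction(bait_pos, pir_positions, shifted))
    return fractions


# Toy example: three boundaries on a 400-unit chromosome.
boundaries = [100, 200, 300]
observed = within_tad_fraction(150, [120, 180, 250, 350], boundaries)
expected = permuted_fractions(150, [120, 180, 250, 350], boundaries, 400)
print(observed, sum(expected) / len(expected))
```

Comparing `observed` with the distribution in `expected` corresponds to the black versus grey bars in the legend; the extreme values of `expected` play the role of the error bars.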
  • Figure 8 Validation of promoter interactions using reciprocal capture Hi-C.
  • FIG. 9 Venn diagram showing the numbers of promoter baits with interactions mapping to the "myeloid", "lymphoid" and "invariant" sets of clusters. See Figure 2B-C and the main text for details.
  • Figure 10 Additional evidence of the link between promoter interactions and gene expression.
  • Figure 11 Further details on the enrichment of eQTLs at promoter-interacting regions.
  • A, B Proportion of genes with at least one eQTL at PIRs in monocytes (Mon, panel A) and total B cells (tB, panel B), evaluated at binned distances of SNPs from the transcription start site. Results obtained with randomised PIRs are shown as controls. Asterisks represent the significance of enrichment at observed versus randomised PIRs (permutation test *p < 0.05; **p < 0.01; ***p < 0.001).
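The asterisk convention used in the eQTL enrichment legends (permutation test, *p < 0.05; **p < 0.01; ***p < 0.001) can be made concrete with a short sketch. The thresholds come from the legends; the one-sided empirical p-value with a +1 correction is a common convention assumed here, not something the source specifies, and the function names are hypothetical.

```python
def permutation_p_value(observed, null_values):
    """One-sided empirical p-value for enrichment.

    Counts how many values from randomised PIRs are at least as large as
    the observed value. The +1 correction (an assumption) keeps the
    estimate above zero for a finite number of permutations.
    """
    exceed = sum(v >= observed for v in null_values)
    return (exceed + 1) / (len(null_values) + 1)


def significance_stars(p):
    """Translate a p-value into the asterisk code used in the legends."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Toy example: observed proportion versus 999 made-up randomised values.
null = [0.10] * 999
p = permutation_p_value(0.30, null)
print(p, significance_stars(p))
```

With 999 permutations the smallest attainable p-value under this convention is 0.001, which earns two asterisks rather than three; more permutations would be needed to reach the strictest threshold.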
  • Figure 12 Colocalization of GWAS and eQTL signals at prioritised candidate genes.
  • Promoter capture Hi-C is used herein to identify interacting regions of 31,253 promoters in 17 human primary haematopoietic cell types. It is shown that long-range promoter interactions are highly cell-type specific, preferentially linking active promoters and enhancers. Patterns of promoter interactions reflect cell lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched for expression quantitative trait loci with expression effects on their interacting target genes.
  • a modulator of one or more of the biomarkers of Tables 1 or 2 for use in the treatment of a blood disorder is provided. The blood disorder is selected from a platelet disorder and a red blood cell disorder.
  • the blood disorder is a platelet disorder and the one or more biomarkers are selected from Table 1.
  • the blood disorder is a platelet disorder and the one or more biomarkers are selected from one or more, or all, of: CD22, BAZ2A, JAK2, CHRNE, PTGES3, CYP27B1, OPRD1, CA14, CD274, ABCC4, FFAR2, PRMT1, PLD2, SLC39A5, MINK1, SIRT3, RNPEPL1, PSMB6, SLC2A12, TAOK1, NUAK2, GPR182, BRD3, JMJD1C, NLRP6, TBK1 and NRBP2.
  • the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from Table 2.
  • the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from one or more, or all, of: CASP10, SLC25A39, CLK1, ATP2B4, CASP8, PTGS2, STRADB, AURKA, JAK2, IL2RB, KAT8, KCNN4, DOT1L, SLC12A7, PLCG1, LPIN3, AMHR2, GABBR2, ADAM10, IFNAR1, SLC6A3, PADI3, SLC25A37, UGCG, KCNMA1, MAPK13, ITGAD, IFNAR2, PADI4, NEK8, NUAK2, FABP1, SLC51A, ABCA1, BRD7, SMPD1, ILK, CDK12, VKORC1, BRD3, GPR152, RPS6KB2, TMPRSS6, TOP1
  • a modulator of one or more of the biomarkers of Tables 3 to 7 for use in the treatment of an autoimmune disorder is provided.
  • the autoimmune disorder is selected from ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease.
  • the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from Table 3.
  • the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from Table 4.
  • the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from one or more, or all, of: CYP24A1, IFNGR1, PDE4A, MAPK1, CSF2RB, PTK6, SLC17A7, PDE4C, SCNN1A, LTBR, USP5, NEK9, SRMS, SLC34A1, IL2RA, SLC26A10, CD27, NCOA2, PTPRK, CXCR5, ATP1A1, SLC9B1, SLC9B2, IL22RA2, NSD1, PIP4K2C, RGS14, ADRA1B, PTGER4, GPR160, S1PR5, LPAR5, SEMA4D, P2RY11 and GPR162.
  • the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from Table 5.
  • the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from one or more, or all, of: IFNGR1, PDE4A, IDI1, FDFT1, KEAP1, PREP, MAP3K1, CD40, CDK6, SLC26A8, CCR6, PCCB, CYP20A1, NEK9, ACAT2, SLC44A2, IL6ST, IL2RA, SLC26A10, BLK, SLC35B2, TNFRSF14, IFNAR2, CXCR5, IL6R, NEK10, CTSB, PIP4K2C, CXCR6 and S1PR5.
  • the autoimmune disorder is celiac disease and the one or more biomarkers are selected from Table 6.
  • the autoimmune disorder is celiac disease and the one or more biomarkers are selected from one or more, or all, of: IL20RA, IFNGR1, DAPK2, ZMYND8, ITGA4, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, PTPRK, IL22RA2 and GPR160.
  • the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from Table 7.
  • the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from one or more, or all, of: ATP6V0A1, ITGA8, MAP3K1, JAK2, ATP5D, TYK2, CCR6, LNPEP, KCNJ13, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, STK11, CD274, PIK3CA, SLC44A2, IL2RA, GPR65, FURIN, HCN3, JAK1, USP1, SLC9B1, ERAP1, ERAP2, BRD7, INPP5D, ADRA1B, PTGER4, TRIB1, CLK2, S1PR5, FES and P2RY11.
  • a modulator of one or more of the biomarkers of Tables 8 or 9 for use in the treatment of diabetes is provided.
  • the diabetes is selected from Type 1 diabetes and Type 2 diabetes.
  • the diabetes is Type 1 diabetes and the one or more biomarkers are selected from Table 8.
  • the diabetes is Type 1 diabetes and the one or more biomarkers are selected from one or more, or all, of: FYN, HDAC9, ERBB3, PRKCQ, TRPM5, TRIB2, PTPRC, CTSH, CDK6, SLC22A18, RGS2, GPR18, CCR7, RARA, IL2RA, AMHR2, SLC25A47, CAMK2D, NEK7, ATP6V1G3, GPR183, PTEN, SSTR2, GPR19, TMPRSS6, SLC25A29 and CD3E.
  • the diabetes is Type 2 diabetes and the one or more biomarkers are selected from Table 9.
  • the diabetes is Type 2 diabetes and the one or more biomarkers are selected from one or more, or all, of: RORA, MAP3K1, PLCG1, RIOK1 and TOP1.
  • the one or more height/growth disorders biomarkers are selected from one or more, or all, of: MARK4, MMP25, PRKCH, MAP2K3, MAPK9, LPAR2, MAP2K4, PDE4A, ATP2B1, TRIB2, EED, SPTLC1, CMA1, SLC7A8, SIRT1, JAK2, PCSK5, CTSG, GZMB, CTSZ, SLCO4A1, NTSR1, LPIN2, KCNN4, DOT1L, VRK3, TYK2, GSK3A, CDK6, EZH2, SLC16A6, CPZ, SLC11A2, ITK, ACVR2B, ODC1, ECE1, NEK2, CA14, SLC16A7, NT5C3A, TRIM24,
  • the one or more lipid metabolism biomarkers are selected from one or more, or all, of: MARK4, BAZ1B, NPC1L1, PRSS8, SLC12A3, EIF2AK1, MAP3K1, CYP26A1, KAT8, AEBP1, PCCB, PLCG1, SLC25A35, RIPK3, ADCY4, RAF1, PPARG, ABCB10, BLK, ADAM10, TSSK4, NLRC5, PNMT, CELSR2, SCN3A, GPR61, CPA2, SLC45A3, SIK3, PCSK7, GPR146, ABCA1, SLC35G2, PCSK9, FPR2, FPR1, JMJD1C, SLC16A11, SLC16A13, IL20RB, F
  • Suitable disorders related to lipid metabolism include but are not restricted to stroke, heart disease and vascular disease.
  • a modulator of one or more of the biomarkers of Table 12 for use in the treatment of disorders related to glucose metabolism is provided.
  • the one or more glucose metabolism biomarkers are selected from one or more, or all, of: NPC1L1, NR1H3, AEBP1, PLCG1, STK33, LPIN3, MTNR1B, SLC9B1, SLC9B2, SLC30A8, SLC39A13, STK39 and TOP1.
  • suitable disorders related to glucose metabolism include but are not restricted to Type II diabetes and obesity.
  • a modulator of one or more of the biomarkers of Table 13 for use in the treatment of disorders related to insulin metabolism is provided.
  • the one or more insulin metabolism biomarkers are selected from either or both of: PPARG and TBCK.
  • suitable disorders related to insulin metabolism include but are not restricted to Type II diabetes and obesity.
  • a modulator of one or more of the biomarkers of Table 14 for use in the treatment of disorders related to bone mineral density.
  • the one or more bone mineral density biomarkers are selected from one or more, or all, of: SLC25A39, CLCN7, AMHR2, MAP3K12, ITGB7, TNFRSF11A, SLC26A1, RARG and GAK.
  • suitable disorders related to bone mineral density include but are not restricted to osteoporosis and fracture risk.
  • a modulator of one or more of the biomarkers of Table 15 for use in the treatment of disorders related to blood pressure.
  • the one or more blood pressure biomarkers are selected from one or more, or all, of: CLCN6, CSK, PREPL, FURIN, PLCD3, NEK10 and FES.
  • suitable disorders related to blood pressure include but are not restricted to heart attack, stroke, kidney disease and vascular dementia.
  • a modulator of one or more of the biomarkers of Table 16 for use in the treatment of disorders related to body mass index is provided.
  • the one or more body mass index biomarkers are selected from one or more, or all, of: GIPR, PRSS8, AQP6, BCKDK, KAT8, MTCH2, ASIC1, PPARG, CSNK1G2, MAP2K5, ITGAX, SIRT3, CPO, GPR61, ADCY9, USP1, F2RL1, KCNG3, KCNK3, CD19, SLC25A22, GPBAR1 and ATP2A1.
  • suitable disorders related to body mass index include but are not restricted to obesity.
  • biomarker means a distinctive biological or biologically derived indicator of a process, event, or condition. Biomarkers can be used in methods of diagnosis, e.g. clinical screening, and prognosis assessment and in monitoring the results of therapy, identifying patients most likely to respond to a particular therapeutic treatment, drug screening and development. Biomarkers and uses thereof are valuable for identification of new drug treatments and for discovery of new targets for drug treatment.
  • References herein to the term “modulator” refer to any agent, such as an inhibitor (i.e. competitive, non-competitive or un-competitive inhibitor) or antagonist (i.e. competitive, noncompetitive or un-competitive antagonist), activator or agonist (i.e. full inverse agonist, partial inverse agonist, silent antagonist, partial agonist, full agonist or super agonist) capable of modulating the signaling effected by the biomarker.
  • biomarkers as defined herein for the diagnosis or prognosis of a disease or disorder as defined herein.
  • the blood disorder is selected from a platelet disorder and a red blood cell disorder.
  • the blood disorder is a platelet disorder and the one or more biomarkers are selected from Table 1.
  • the blood disorder is a platelet disorder and the one or more biomarkers are selected from one or more, or all, of: CD22, BAZ2A, JAK2, CHRNE, PTGES3, CYP27B1, OPRD1, CA14, CD274, ABCC4, FFAR2, PRMT1, PLD2, SLC39A5, MINK1, SIRT3, RNPEPL1, PSMB6, SLC2A12, TAOK1, NUAK2, GPR182, BRD3, JMJD1C, NLRP6, TBK1 and NRBP2.
  • the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from Table 2.
  • the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from one or more, or all, of: CASP10, SLC25A39, CLK1, ATP2B4, CASP8, PTGS2, STRADB, AURKA, JAK2, IL2RB, KAT8, KCNN4, DOT1L, SLC12A7, PLCG1, LPIN3, AMHR2, GABBR2, ADAM10, IFNAR1, SLC6A3, PADI3, SLC25A37, UGCG, KCNMA1, MAPK13, ITGAD, IFNAR2, PADI4, NEK8, NUAK2, FABP1, SLC51A, ABCA1, BRD7, SMPD1, ILK, CDK12, VKORC1, BRD3, GPR152, RPS6KB2, TMPRSS6, TOP1
  • biomarkers from Tables 1 and 2 may be used to differentially diagnose a first blood disorder from a second blood disorder.
  • the autoimmune disorder is selected from ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease.
  • the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from Table 3.
  • the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from one or more, or all, of: CDK11A, SLC11A1, IFNGR1, ATP6V0A1, ENTPD2, EIF2AK1, SLC26A3, JAK2, ABCA2, CHRNE, IL1RL2, CD274, SLC7A10, CDK4, SLC26A10, ADAM10, MINK1, PSMB6, RORC, ADAMTS16, INPP5E, PLCH2, STK32B, TNFRSF14, STK36, BRD7, PIP4K2C, ADAM9, ADRA1B, PTGER4, BCL2, DPP7, SLC9A4, CXCR2, EHMT1, PRKAR1B, TUBB4B, MMP23B, PIM3, SLC34A3, SGMS1, SLC35E2 and CDK11B.
  • the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from Table 4.
  • the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from one or more, or all, of: CYP24A1, IFNGR1, PDE4A, MAPK1, CSF2RB, PTK6, SLC17A7, PDE4C, SCNN1A, LTBR, USP5, NEK9, SRMS, SLC34A1, IL2RA, SLC26A10, CD27, NCOA2, PTPRK, CXCR5, ATP1A1, SLC9B1, SLC9B2, IL22RA2, NSD1, PIP4K2C, RGS14, ADRA1B, PTGER4, GPR160, S1PR5, LPAR5, SEMA4D, P2RY11 and GPR162.
  • the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from Table 5.
  • the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from one or more, or all, of: IFNGR1, PDE4A, IDI1, FDFT1, KEAP1, PREP, MAP3K1, CD40, CDK6, SLC26A8, CCR6, PCCB, CYP20A1, NEK9, ACAT2, SLC44A2, IL6ST, IL2RA, SLC26A10, BLK, SLC35B2, TNFRSF14, IFNAR2, CXCR5, IL6R, NEK10, CTSB, PIP4K2C, CXCR6 and S1PR5.
  • the autoimmune disorder is celiac disease and the one or more biomarkers are selected from Table 6.
  • the autoimmune disorder is celiac disease and the one or more biomarkers are selected from one or more, or all, of: IL20RA, IFNGR1, DAPK2, ZMYND8, ITGA4, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, PTPRK, IL22RA2 and GPR160.
  • the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from Table 7.
  • the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from one or more, or all, of: ATP6V0A1, ITGA8, MAP3K1, JAK2, ATP5D, TYK2, CCR6, LNPEP, KCNJ13, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, STK11, CD274, PIK3CA, SLC44A2, IL2RA, GPR65, FURIN, HCN3, JAK1, USP1, SLC9B1, ERAP1, ERAP2, BRD7, INPP5D, ADRA1B, PTGER4, TRIB1, CLK2, S1PR5, FES and P2RY11.
  • biomarkers from Tables 3 to 7 may be used to differentially diagnose a first autoimmune disorder from one or more further autoimmune disorders.
  • the diabetes is selected from Type 1 diabetes and Type 2 diabetes.
  • the diabetes is Type 1 diabetes and the one or more biomarkers are selected from Table 8.
  • the diabetes is Type 1 diabetes and the one or more biomarkers are selected from one or more, or all, of: FYN, HDAC9, ERBB3, PRKCQ, TRPM5, TRIB2, PTPRC, CTSH, CDK6, SLC22A18, RGS2, GPR18, CCR7, RARA, IL2RA, AMHR2, SLC25A47, CAMK2D, NEK7, ATP6V1G3, GPR183, PTEN, SSTR2, GPR19, TMPRSS6, SLC25A29 and CD3E.
  • the diabetes is Type 2 diabetes and the one or more biomarkers are selected from Table 9.
  • the diabetes is Type 2 diabetes and the one or more biomarkers are selected from one or more, or all, of: RORA, MAP3K1, PLCG1, RIOK1 and TOP1.
  • biomarkers from Tables 8 and 9 may be used to differentially diagnose Type 1 diabetes from Type 2 diabetes.
  • there is provided the use of one or more of the biomarkers of Table 10 for the diagnosis or prognosis of height/growth disorders.
  • the one or more height/growth disorders biomarkers are selected from one or more, or all, of: MARK4, MMP25, PRKCH, MAP2K3, MAPK9, LPAR2, MAP2K4, PDE4A, ATP2B1, TRIB2, EED, SPTLC1, CMA1, SLC7A8, SIRT1, JAK2, PCSK5, CTSG, GZMB, CTSZ, SLCO4A1, NTSR1, LPIN2, KCNN4, DOT1L, VRK3, TYK2, GSK3A, CDK6, EZH2, SLC16A6, CPZ, SLC11A2, ITK, ACVR2B, ODC1, ECE1, NEK2, CA14, SLC16A7, NT5C3A, TRIM24, CCR7, MAP2K2, PPAT, SLC44A2, RIPK3, ADCY4, STK33, DNMT1, AN01
  • the one or more lipid metabolism biomarkers are selected from one or more, or all, of: MARK4, BAZ1B, NPC1L1, PRSS8, SLC12A3, EIF2AK1, MAP3K1, CYP26A1, KAT8, AEBP1, PCCB, PLCG1, SLC25A35, RIPK3, ADCY4, RAF1, PPARG, ABCB10, BLK, ADAM10, TSSK4, NLRC5, PNMT, CELSR2, SCN3A, GPR61, CPA2, SLC45A3, SIK3, PCSK7, GPR146, ABCA1, SLC35G2, PCSK9, FPR2, FPR1, JMJD1C, SLC16A11, SLC16A13, IL20RB,
  • the one or more glucose metabolism biomarkers are selected from one or more, or all, of: NPC1L1, NR1H3, AEBP1, PLCG1, STK33, LPIN3, MTNR1B, SLC9B1, SLC9B2, SLC30A8, SLC39A13, STK39 and TOP1.
  • the one or more insulin metabolism biomarkers are selected from either or both of: PPARG and TBCK.
  • the one or more bone mineral density biomarkers are selected from one or more, or all, of: SLC25A39, CLCN7, AMHR2, MAP3K12, ITGB7, TNFRSF11A, SLC26A1, RARG and GAK.
  • the use of one or more of the biomarkers of Table 15 for the diagnosis or prognosis of disorders related to blood pressure.
  • the one or more blood pressure biomarkers are selected from one or more, or all, of: CLCN6, CSK, PREPL, FURIN, PLCD3, NEK10 and FES.
  • the one or more body mass index biomarkers are selected from one or more, or all, of: GIPR, PRSS8, AQP6, BCKDK, KAT8, MTCH2, ASIC1, PPARG, CSNK1G2, MAP2K5, ITGAX, SIRT3, CPO, GPR61, ADCY9, USP1, F2RL1, KCNG3, KCNK3, CD19, SLC25A22, GPBAR1 and ATP2A1.
  • GLTPD2 ENSG00000182327 VMO1 ENSG00000182853
  • Table 2 Blood Disorders (Red Blood Cell Disorders)
  • CEBPD ENSG00000221869 CEBPA ENSG00000245848
  • DNAJC5 ENSG00000101152 SMC4 ENSG00000113810
  • GIGYF2 ENSG00000204120 MUSTN1 ENSG00000248592

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Cell Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention relates to novel targets and biomarkers for the diagnosis and treatment of selected diseases or disorders, in particular blood disorders (such as platelet disorders and red blood cell disorders), autoimmune disorders (such as ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease), diabetes (such as Type 1 diabetes and Type 2 diabetes), height/growth disorders, disorders related to lipid metabolism, disorders related to glucose metabolism, disorders related to insulin metabolism, disorders related to bone mineral density, disorders related to blood pressure and disorders related to body mass index. The invention also relates to methods of diagnosing said diseases or disorders and methods of screening for modulators of said biomarkers.

Description

BIOMARKERS FOR PLATELET DISORDERS
FIELD OF THE INVENTION
The invention relates to novel targets and biomarkers for the diagnosis and treatment of selected diseases or disorders, in particular blood disorders (such as platelet disorders and red blood cell disorders), autoimmune disorders (such as ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease), diabetes (such as Type 1 diabetes and Type 2 diabetes), height/growth disorders, disorders related to lipid metabolism, disorders related to glucose metabolism, disorders related to insulin metabolism, disorders related to bone mineral density, disorders related to blood pressure and disorders related to body mass index. The invention also relates to methods of diagnosing said diseases or disorders and methods of screening for modulators of said biomarkers.
BACKGROUND OF THE INVENTION
Genomic regulatory elements such as enhancers determine spatiotemporal patterns of gene expression. It has been estimated that there are up to 1 million enhancer elements with gene regulatory potential in mammalian genomes (The ENCODE Project Consortium, 2012), based primarily on detection of specific combinations of chromatin features, and to a lesser extent, on their ability to activate transgenic reporters in vivo (Akhtar et al., 2013; Arnold et al., 2013). While a number of well-characterised enhancers map close to their target genes, assignment based on linear proximity is error-prone, and many likely map large distances away from their targets (Krivega et al., 2012; Mifsud et al., 2015; Sanyal et al., 2012; Schoenfelder et al., 2015a). Long-range gene regulation by enhancers in vivo involves close spatial proximity between distal enhancers and their target gene promoters in the three dimensional nuclear space (Carter et al., 2002), most likely involving a direct interaction (Bartman et al., 2016; Deng et al., 2014), while the intervening sequences are looped out. Thus a comprehensive catalogue of promoter-interacting regions (PIRs) is a requisite to fully understand genome transcriptional control.
Thousands of disease and trait-associated genetic variants have been identified by genome-wide association studies (GWAS). The vast majority are single nucleotide polymorphisms (SNPs) located in non-coding regions of the genome, often at considerable genomic distances from any known gene (Manolio, 2010; Welter et al., 2014), making assessment of their potential function in disease aetiology problematic. Recent evidence indicates that GWAS SNPs often map in close proximity to DNase I hypersensitive sites, potentially disrupting transcription factor binding sites, suggesting that they may contribute to disease by altering the function of distal regulatory elements in gene control (Maurano et al., 2012).
Therefore, promoter interactions may link disease-associated variants to their putative target genes (Mifsud et al., 2015).
Recent advances in chromosome conformation capture technologies such as Hi-C and ChIA-PET have raised the potential to understand long-range gene control (Dekker et al., 2013). In particular, Hi-C (Lieberman-Aiden et al., 2009) has begun to shed light on global genome organization, revealing higher-order structures such as contact domains and topologically associated domains (TADs) (Dixon et al., 2012; Nora et al., 2012; Sexton et al., 2012), and enabling the structural modelling of global chromosomal architecture in multiple species (Ay et al., 2014; Duan et al., 2010; Lieberman-Aiden et al., 2009; Nagano et al., 2013). However, the enormous combinatorial complexity of DNA fragment pairs in Hi-C libraries impedes high-resolution detection of specific regulatory interactions between individual genetic elements in a robust fashion. Using sequence capture to enrich for Hi-C interactions that involve specific regions of interest is a versatile approach to overcome the limitations imposed by complexity (Dryden et al., 2014; Jager et al., 2015; Mifsud et al., 2015; Sahlen et al., 2015; Schoenfelder et al., 2015a), enabling robust and sensitive interaction calling based on statistical significance (Cairns et al., 2016). Capture Hi-C avoids the challenges associated with alternative methods of target enrichment, such as selective PCR amplification as in 5C (Sanyal et al., 2012), or immunoprecipitation as in ChIA-PET (Fullwood et al., 2009), and allows interactome profiling of large numbers of baited regions in a single experiment, regardless of their activity state or factor binding.
The inventors have recently developed promoter Capture Hi-C (PCHi-C), which targets all annotated promoters, and used it to study the genomic regulatory architecture in mouse and human cell lines (Mifsud et al., 2015; Schoenfelder et al., 2015a, 2015b). The studies described herein use PCHi-C in primary cells to generate a comprehensive catalogue of the interactomes of 31,253 annotated promoters in 17 human primary blood cell types. By devising a statistical methodology to link GWAS SNPs to their putative target genes based on PCHi-C interaction data, hundreds of new candidate genes and pathways associated with common diseases have been identified.
SUMMARY OF THE INVENTION
According to a first aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 1 or 2 for use in the treatment of a blood disorder.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 3 to 7 for use in the treatment of an autoimmune disorder.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 8 or 9 for use in the treatment of diabetes.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 10 for use in the treatment of height/growth disorders.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 11 for use in the treatment of disorders related to lipid metabolism.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 12 for use in the treatment of disorders related to glucose metabolism.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 13 for use in the treatment of disorders related to insulin metabolism.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 14 for use in the treatment of disorders related to bone mineral density.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 15 for use in the treatment of disorders related to blood pressure.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 16 for use in the treatment of disorders related to body mass index.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers as defined herein for the diagnosis or prognosis of a disease or disorder as defined herein.
According to a further aspect of the invention, there is provided a method of diagnosing a disease or disorder as described herein, comprising:
(a) quantifying the amounts of the biomarkers as defined herein in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative of said disease or disorder, or predisposition thereto.
According to a further aspect of the invention, there is provided a method of prognosing the development of a disease or disorder as described herein in an individual, comprising:
(a) quantifying the amounts of the biomarkers as defined herein in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative that the individual will develop said disease or disorder.
According to a further aspect of the invention, there is provided a method of monitoring efficacy of a therapy in a subject having, suspected of having, or of being predisposed to a disease or disorder as described herein, comprising detecting and/or quantifying, in a sample from said subject, the biomarkers as defined herein.
A further aspect of the invention provides ligands, such as naturally occurring or chemically synthesised compounds, capable of specific binding to the biomarker. A ligand according to the invention may comprise a peptide, an antibody or a fragment thereof, or an aptamer or oligonucleotide, capable of specific binding to the biomarker. The antibody can be a monoclonal antibody or a fragment thereof capable of specific binding to the biomarker. A ligand according to the invention may be labelled with a detectable marker, such as a luminescent, fluorescent or radioactive marker; alternatively or additionally a ligand according to the invention may be labelled with an affinity tag, e.g. a biotin, avidin, streptavidin or His (e.g. hexa-His) tag. A biosensor according to the invention may comprise the biomarker or a structural/shape mimic thereof capable of specific binding to an antibody against the biomarker. Also provided is an array comprising a ligand or mimic as described herein.
Also provided by the invention is the use of one or more ligands as described herein, which may be naturally occurring or chemically synthesised, and is suitably a peptide, antibody or fragment thereof, aptamer or oligonucleotide, or the use of a biosensor of the invention, or an array of the invention, or a kit of the invention to detect and/or quantify the biomarker. In these uses, the detection and/or quantification can be performed on a biological sample such as from the group consisting of whole blood, blood serum, plasma, CSF, urine, saliva, or other bodily fluid, breath, e.g. as condensed breath, or an extract or purification therefrom, or dilution thereof.
Diagnostic, prognostic or monitoring kits are provided for performing methods of the invention. Such kits will suitably comprise a ligand according to the invention, for detection and/or quantification of the biomarker, and/or a biosensor, and/or an array as described herein, optionally together with instructions for use of the kit.
According to a further aspect of the invention, there is provided the use of a kit comprising a biosensor capable of detecting and/or quantifying the biomarkers as defined herein for monitoring, prognosing or diagnosing a disease or disorder as described herein or a predisposition thereto.
Biomarkers for the diseases or disorders as described herein are essential targets for discovery of novel targets and drug molecules that retard or halt progression of the disease or disorder. As the level of the biomarker is indicative of disorder and of drug response, the biomarker is useful for identification of novel therapeutic compounds in in vitro and/or in vivo assays. Biomarkers of the invention can be employed in methods for screening for compounds that modulate the activity of the biomarker.
Thus, in a further aspect of the invention, there is provided the use of a ligand, as described, which can be a peptide, antibody or fragment thereof or aptamer or oligonucleotide according to the invention; or the use of a biosensor according to the invention, or an array according to the invention; or a kit according to the invention, to identify a substance capable of promoting and/or of suppressing the generation of the biomarker.
Also there is provided a method of identifying a substance capable of promoting or suppressing the generation of the biomarker in a subject, comprising administering a test substance to a subject animal and detecting and/or quantifying the level of the biomarker present in a test sample from the subject.
In general, when a doctor or other medical practitioner is apprised that a patient is suffering from a disease or disorder as described herein, the practitioner will treat the individual to alleviate the causes or symptoms of the disorder. Thus, according to a further aspect of the invention, there is provided a method for treating a disease or disorder as described herein. Methods of treatment may comprise treating a patient with suitable medicament and/or non-drug therapies. Treatment may be based upon a diagnosis or suspicion of a disease or disorder as described herein derived from the methods, biomarkers and specific panels of biomarkers as described herein.
The results of any analyses according to the invention will often be communicated to physicians and/or patients (or other interested parties such as researchers) in a transmittable form that can be communicated or transmitted to any of the above parties. Therefore, according to a further aspect of the invention, there are provided systems for diagnosing and treating a disease or disorder as described herein. These systems may comprise sample analyzers, computers and software as described herein.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1: Promoter Capture Hi-C (PCHi-C) across 17 human primary blood cell types.
(A) Schematic representation of the project, using 31,253 annotated promoters to identify 708,007 unique interactions.
(B) Interaction landscape of INPP4B gene promoter (dotted black vertical line) along a 5Mb region in naive CD4+ (nCD4) cells (PCHi-C, top panel). Each dot represents a sequenced di-tag mapping, on one end, to the baited HindIII fragment overlapping the INPP4B gene promoter, with the X axis coordinates corresponding to the location of the HindIII fragment mapping to the other end; the Y axis shows read counts per di-tag. Red dots represent significant PIRs (CHiCAGO score>5). Grey lines denote expected read counts per di-tag according to the CHiCAGO background model, and dashed lines show the upper bound of the 95% confidence interval. Red arcs below the dot plot denote significant INPP4B-PIR interactions. Interactions only involve a few selected DNase hypersensitivity sites (DHSs, middle panel) defined in the same cell type (ENCODE CD4+_Naive_Wb11970640 DNase-seq from University of Washington). Some of these interactions occur within the same topologically associated domain (TADs, black line, as defined according to the standardised directionality index score, sDI), while others span TAD boundaries. A conventional Hi-C profile for the same locus in nCD4 cells is shown in the bottom panel.
(C) Interaction landscape of the INPP4B, RHAG, ZEB2-AS and ALAD promoters in naive CD4+ cells (nCD4), erythroblasts (Ery) and monocytes (Mon). The dot plots as in (B), with promoter locations shown with vertical lines and the respective ORFs depicted in red below. Significant PIRs are shown in red (CHiCAGO score >= 5) and sub-threshold PIRs (3 < CHiCAGO score < 5) are shown in blue.
(D) A cumulative plot summarizing the number of interactions (blue) and PIRs (red) detected per cell type.
(E) Frequencies of interactions crossing TAD boundaries per cell type. Black bars show the observed frequencies of interactions that cross TAD boundaries. White bars show expected frequencies of TAD-crossing interactions according to a randomization test (see Materials and Methods for more details). Error bars show the most extreme observations in the 1000 permutations.
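The score thresholds quoted for panel (C) of Figure 1 can be illustrated with a minimal sketch. It assumes the cutoffs stated there (significant at a CHiCAGO score of 5 or more, sub-threshold strictly between 3 and 5); the function name, the labels and the example scores are hypothetical and are not part of the CHiCAGO software.

```python
def classify_interaction(score: float) -> str:
    """Label a promoter interaction by its CHiCAGO score (hypothetical helper)."""
    if score >= 5:
        return "significant"    # shown in red in Figure 1C
    if 3 < score < 5:
        return "sub-threshold"  # shown in blue in Figure 1C
    return "not called"         # below both thresholds


# Made-up scores, for illustration only.
scores = [7.2, 5.0, 4.1, 3.0, 0.5]
labels = [classify_interaction(s) for s in scores]
```

Keeping both cutoffs in one function makes the boundary cases explicit: a score of exactly 5 is treated as significant, and a score of exactly 3 falls outside the sub-threshold band.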
Figure 2: Promoter interactions reflect the lineage relationships of the haemopoietic tree.
(A) Principal Component Analysis (PCA) of the CHiCAGO scores for each individual biological replicate. Shown are the first two principal components (PC1, PC2). Replicates cluster closely together, and it can be seen that PC1 separates the myeloid (PC1>0) and lymphoid lineages (PC1<0), with Neu forming an outgroup.
(B) Top (dendrogram): hierarchical clustering of the cell types according to their interaction profiles, with lymphoid-lineage cells on the left (nB: naive B cells, tB: total B cells, FetT: fetal thymus, aCD4: activated CD4, naCD4: non-activated CD4, tCD4: total CD4, nCD8: naive CD8, nCD4: naive CD4, tCD8: total CD8), and myeloid cells on the right (Mon: monocytes, Neu: neutrophils, Μφ0-2: macrophages M0, M1, M2, EndP: endothelial precursors, MK: megakaryocytes, Ery: erythroblasts). Bottom (heatmap): Autoclass Bayesian clustering of interactions according to their cell-type specificity. Cluster IDs are shown on the right.
(C) Cell-type specificity of interaction clusters. Cell types and clusters are arranged as in (B). The specificity score is a lineage-tree weighted deviation from the mean interaction score per cluster (see Materials and Methods for details). Interactions in clusters 1 - 15 are generally lymphoid-specific, in clusters 28 - 34 generally myeloid-specific, while the remaining clusters show broad specificity across lineages.
Figure 3: Promoters preferentially connect to active enhancers.
(A) PIR enrichment for active histone marks. Overlaps of observed PIRs and promoter distance-matched random regions with histone marks in naive CD8 (nCD8) and Macrophage M1 (Μφ1) cells. Error bars show 95% confidence intervals across 100 draws of random regions.
(B) Heatmap showing the enrichment/depletion of histone modifications at PIRs versus promoter distance-matched random regions in the nine analysed cell types for which BLUEPRINT histone modification data is available. Enrichment is expressed in terms of z-scores, $(x_{obs} - \mu_{exp})/\sigma_{exp}$. Note a particularly high PIR enrichment for H3K4me1 that is associated with active enhancers.
(C) Promoter interactions and chromatin features in the β-globin locus. PCHi-C data from 3 cell types (Ery, Mon, nCD8), with annotation from the Ensembl Regulatory Build based on the BLUEPRINT data, coloured by function. The image is based on a screenshot produced with Ensembl v83 using GRCh37 assembly and GENCODE v19 gene annotations. The β-globin Locus Control Region (LCR) is highlighted in a blue box.
(D) Heatmap showing the enrichment of PIRs for active distal enhancers in 9 PCHi-C cell types with available BLUEPRINT annotations. The regulatory annotation is derived from a matched set of 20 BLUEPRINT biological replicates for 9 cell types (shown per biological replicate).
(E) Enrichment of promoter-enhancer interactions (observed/expected) for connections between active promoters and active enhancers. The p-value is for the overdispersion-adjusted χ²-test of independence of promoter and enhancer states at either end of interactions.
(F) Interactions that connect an active promoter with an active enhancer in at least one cell type tend to be absent in cell types in which the respective enhancer is inactive. Shown are odds ratios for the "synchronous" (enhancer state and interaction are "in sync") and "asynchronous" scenarios conditioned on the promoter remaining active. The p-value is for the overdispersion-adjusted χ²-test of independence of the enhancer state and the presence of interaction.
Figure 4: Active enhancers at PIRs associate with lineage-specific gene expression.
(A) Partial residual plot of log2-gene expression as a function of the number of active enhancers interacting with the respective baited region, in the cell types, where the promoter is active in all analysed cell types. The trendline is from a linear regression using iterated reweighted least squares (see Materials and Methods for details).
(B) Heatmap of "gene specificity scores" for 7,004 protein-coding genes uniquely mapping to a HindIII baited fragment (rows), capturing specificity of a gene's enhancer-promoter interactions to each of eight cell types (columns). See Materials and Methods for details. Genes are partitioned into 12 clusters (cluster ID) resulting from k-means clustering of their specificity scores and cell types are ordered by hierarchical clustering (dendrogram).
(C) Mean gene specificity score (based on PCHi-C) for each of the clusters in (B) is plotted against analogous mean gene specificity scores based on expression data for nCD4, MK, Ery and Neu cells (see Materials and Methods for details). Error bars indicate ±s.d. Plots for Mon and Μφ1-3 are shown in Figure 10B.
(D) A subset of the heatmap in (B), showing interaction-based gene specificity scores for the top 100 nCD4-specifically expressed genes (obtained by ranking genes according to their nCD4 gene specificity scores based on expression data), together with cluster IDs.
(E) For each cell type, enrichment scores are shown quantifying the enrichment of each of the 12 clusters in (B) for the 100 genes expressed with highest specificity for that cell type.
Figure 5: Promoter-interacting regions are enriched for eQTLs.
(A, B) Proportion of significant associations of gene expression with SNPs within PIRs in monocytes (A) and B cells (B). Values are compared to randomised PIRs at binned distances of SNPs from the transcription start site. Asterisks represent the significance of enrichment at observed versus randomised PIRs (permutation test *p<0.05; **p<0.01; ***p<0.001).
(C, D) A single common lead eQTL SNP was identified for two genes (ARID1A and ZDHHC18 - panel C; NDUFAF4 and ZBTB2 - panel D) with the opposite directionality of the effect. SNPs have been tested within PIRs plus an additional 500 bp window. The Manhattan plot depicts the similarity of the measured signal for both genes. The grey dashed line represents the significance threshold.
Figure 6: Promoter interactions link GWAS SNPs with putative target genes.
(A) Enrichment of GWAS summary statistics at PIRs by tissue type. Axes reflect BLOCKSHIFTER z-scores for two different tissue group comparisons, first lymphoid vs myeloid, then additionally within the myeloid lineage (Mon, monocytes; Μφ, macrophages; Neu, neutrophils; MK, megakaryocytes; Ery, erythroblasts). Traits are labelled and coloured by category (BMI, Body Mass Index; BP_D, Diastolic blood pressure; BP_S, Systolic blood pressure; CD, Crohn's disease; CEL, Celiac disease; FNBMD, Femoral neck bone mineral density; GLC, Glucose sensitivity; GLC_B, Glucose sensitivity BMI-adjusted; HB, Haemoglobin; HDL, High density lipoprotein; HEIGHT, Height; INS, Insulin sensitivity; INS_B, Insulin sensitivity BMI-adjusted; LDL, Low density lipoprotein; LSBMD, Lumbar spine bone mineral density; MCH, Mean corpuscular haemoglobin; MCHC, Mean corpuscular haemoglobin concentration; MCV, Mean corpuscular volume; MS, Multiple sclerosis; PBC, Primary Biliary Cirrhosis; PCV, Packed cell volume; PLT, Platelet count; PV, Platelet volume; RA, Rheumatoid arthritis; RBC, Red blood cell count; SLE, Systemic Lupus Erythematosus; T1D, Type 1 diabetes; T2D, Type 2 diabetes; TC, Total Cholesterol; TG, Triglycerides; UC, Ulcerative Colitis).
(B) Heatmap of BLOCKSHIFTER enrichment Z-scores of GWAS summary statistics in PIRs by individual tissue type using endothelial cells as a control. Red indicates enrichment in the labelled tissue (rows MK, megakaryocytes; Ery, erythroblasts; Μφ0, macrophages M0; Μφ1, macrophages M1; Μφ2, macrophages M2; tB, total B cells; nB, naive B cells; FetT, fetal thymus; Neu, neutrophils; Mon, monocytes; naCD4, non-activated CD4+ T cells; tCD4, total CD4+ T cells; aCD4, activated CD4+ T cells; tCD8, total CD8+ T cells; nCD8, naive CD8+ T cells; nCD4, naive CD4+ T cells); green indicates enrichment in the endothelial cell control.
(C) Example of gene prioritisation method using PCHi-C in the 1p13.1 rheumatoid arthritis susceptibility region. Top stanza shows GWAS summary p-values from (Okada et al., 2012). Middle stanza shows these transformed into posterior probabilities for a variant being causal. Bottom stanza indicates how these are integrated with the PCHi-C to compute gene scores. The histogram widths are proportional to HindIII fragment width and heights are proportional to the sum of posterior probabilities taking into account LD (grey/black: non-promoter HindIII fragment, coloured: captured promoter fragment with green: RP4-753F5.1, yellow: CD101, red: TTF2, and blue: TRIM45); arcs represent physical interactions.
(D) Bubble plot of traits with significant enrichment (p.adj < 0.05) in one or more Reactome pathways. Top numbers indicate the total number of genes analysed for each trait (gene score > 0.5), bubble size indicates the ratio of test genes to those in the pathway, and blue to red corresponds to decreasing adjusted p value for enrichment.
Figure 7: Higher-order topological properties of eight blood cell types.
(A) Top panel: Distributions of the frequencies of promoter interactions (per bait) that cross the cognate TAD boundaries in three representative cell types. Black bars show the observed frequencies, and grey bars show expected frequencies computed by permuting TAD boundaries 1000 times (see Materials and Methods for details). The error bars show the most extreme observations in the 1000 permutations. On the X axis, 1 corresponds to a scenario whereby all interactions of a given bait localise within the same TAD as the bait, and 0 corresponds to a scenario whereby all interactions of a given bait cross TAD boundaries. Bottom panel: examples of baits with PIRs mapping fully within (left) or fully outside (right) the baits' TADs. Blue mark and arrow show the baited region, purple arcs show significant interactions called by CHiCAGO, red marks show TAD boundaries. Plots above show the directionality index (DI) profiles in the displayed regions, with TAD boundaries defined on the basis of a switch from a negative to a positive DI.
(B) Coverage-and-distance corrected Hi-C matrices of chromosome 1 show the log2-enrichment of interactions between chromatin segments binned at 1 Mb resolution. The eight analysed cell types (MK, megakaryocytes; Ery, erythroblasts; Neu, neutrophils; Mon, monocytes; Μφ0, macrophages M0; nCD4, naive CD4+ T cells; nCD8, naive CD8+ T cells) are shown in columns, and the respective biological replicates are in rows.
(C) The first principal component of the 100kb-binned interaction correlation matrix for chromosome 1 shows compartmentalisation (positive values are associated with A and negative values with B compartment). Each biological replicate of the eight analysed cell types is shown.
(D) Correlation matrices of the genome-wide concatenated first principal components with dendrograms from hierarchical clustering show the grouping of cell types according to the compartment signal.
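The per-bait quantity plotted on the X axis of panel (A) of Figure 7 can be sketched as follows. This is an illustration only: it assumes each PIR is summarised by a single coordinate (for example a fragment midpoint) and that the bait's TAD is a half-open interval; the function name and all coordinates are hypothetical, and the permutation of TAD boundaries described in the legend is not reproduced here.

```python
def within_tad_fraction(tad_start: int, tad_end: int, pir_positions: list[int]) -> float:
    """Fraction of a bait's PIRs that lie inside the bait's own TAD.

    1.0 means every interaction stays within the TAD; 0.0 means every
    interaction crosses a TAD boundary.
    """
    if not pir_positions:
        raise ValueError("bait has no interactions")
    inside = sum(1 for pos in pir_positions if tad_start <= pos < tad_end)
    return inside / len(pir_positions)


# Hypothetical bait whose TAD spans 1.0-2.0 Mb, with four PIRs.
fraction = within_tad_fraction(
    1_000_000, 2_000_000, [1_200_000, 1_900_000, 2_500_000, 900_000]
)
```

In this made-up example two of the four PIRs fall inside the TAD, so the bait would sit halfway along the X axis.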
Figure 8: Validation of promoter interactions using reciprocal capture Hi-C.
(A) Cumulative density plots showing the distributions of asinh-transformed CHiCAGO interaction scores for promoter-containing reciprocal capture Hi-C fragment pairs that are detected as significant interactions in the PCHi-C analyses in the respective cell types (blue line - SI; CHiCAGO score >= 5) versus those that are not detected as SI in PCHi-C (grey line). Vertical lines show the SI CHiCAGO score cutoff of 5 on the asinh-transformed scale (≈2.31) for the reciprocal capture Hi-C samples and the q∑ cutoffs minimising the total misclassification error across the PCHi-C and reciprocal capture Hi-C samples for each cell type (Blangiardo and Richardson, 2007). See Materials and Methods for details.
(B, C) Comparison of interactions detected with PCHi-C (top) and reciprocal capture (bottom two panels) for two example regions in erythroblasts (Ery, panel B) and non-activated CD4 cells (naCD4, panel C). The PCHi-C baits capture the TRPC3 and TES promoters, respectively, while reciprocal capture baits were designed to capture their respective PIRs. Interactions are plotted in the same way as in Figure 1C.
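The asinh-transformed cutoff mentioned in panel (A) of Figure 8 can be checked with a short sketch. It assumes only that the transform is the standard inverse hyperbolic sine, asinh(x) = ln(x + sqrt(x^2 + 1)); the variable names are illustrative.

```python
import math

# CHiCAGO score cutoff for significant interactions, as stated in the legend.
raw_cutoff = 5.0

# Inverse hyperbolic sine of the cutoff: ln(5 + sqrt(26)).
asinh_cutoff = math.asinh(raw_cutoff)

# The same value computed from the closed-form definition, as a cross-check.
closed_form = math.log(raw_cutoff + math.sqrt(raw_cutoff**2 + 1))
```

The transform behaves like a logarithm for large scores while remaining defined at zero, which is presumably why it suits score distributions that include many low values.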
Figure 9: Additional properties of promoter interactions.
(A) Venn diagram showing the numbers of promoter baits with interactions mapping to the "myeloid", "lymphoid" and "invariant" sets of clusters. See Figure 2B-C and the main text for details.
(B) Evidence that promoters preferentially have interactions with a similar tissue specificity. A histogram of the observed variance of the cluster specificity scores across interactions of the same bait (blue) versus the same obtained by permuting cluster labels (expected, grey). Cluster specificity score was computed as the maximum of such scores across all tissues.
(C) A zoomed-out view of promoter interactions and chromatin features in and around the β-globin locus. PCHi-C data from 3 cell types (Ery, erythroblasts; Mon, monocytes; nCD8, naive CD8+ T cells), with annotation from the Ensembl Regulatory Build based on the BLUEPRINT data, coloured by function. The image is based on a screenshot produced with Ensembl v83 using GRCh37 assembly and GENCODE v19 gene annotations. The blue square denotes the approximate location of the β-globin Locus Control Region (LCR).
Figure 10: Additional evidence of the link between promoter interactions and gene expression.
(A) Partial residual plot of log2-gene expression as a function of the number of PIRs interacting with the respective baited region in the cell types, where the promoter is active in all analysed cell types. The trendline is from a linear regression using iterated reweighted least squares (see Materials and Methods for details).
(B) Mean gene specificity score (based on PCHi-C) for each of the clusters in (B) is plotted against analogous mean gene specificity scores based on expression data for monocytes (Mon) and macrophages MO, M 1 , M2 (ΜφΟ-2). Error bars indicate ±s.d. Plots for nCD4, MK, Ery and Neu are shown in Figure 4C. See Materials and Methods for details.
Figure 11: Further details on the enrichment of eQTLs at promoter-interacting regions.
(A, B) Proportion of genes with at least one eQTL at PIRs in monocytes (Mon, panel A) and total B cells (tB, panel B), evaluated at binned distances of SNPs from the transcription start site. Results obtained with randomised PIRs are shown as controls. Asterisks represent the significance of enrichment at observed versus randomised PIRs (permutation test *p<0.05; **p<0.01; ***p<0.001).
(C) An example of an extremely long range association between rs3817995 and AURKA expression in total B cells (tB), with the SNP located >30 Mb away from AURKA TSS. The grey dashed line represents the significance threshold.
(D) An example of two independent eQTL signals detected for NCOA4 in monocytes (Mon), with the primary eQTL SNP (rs4948673) located > 5 Mb away from the TSS. The second, independent eQTL SNP (rs10821610) is located close (<20kb) to the NCOA4 TSS. The grey dashed line represents the significance threshold.
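The asterisks in panels A and B summarise a comparison of observed versus randomised PIRs. As a minimal sketch, assuming a one-sided empirical p-value with the common +1 correction (the caption does not specify the exact statistic or randomisation scheme), the step from null-distribution values to significance stars could look like this. The function names are hypothetical and are not taken from the source.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of randomised values that are
    at least as large as the observed value, with a +1 correction
    (an assumption; the source does not state the formula used)."""
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


def significance_stars(p_value):
    """Map a p-value to the asterisk notation used in the figure legend."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""
```

Here `observed` would be the proportion of genes with at least one eQTL at the real PIRs in a distance bin, and `null_values` the same proportion recomputed for each set of randomised PIRs.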
Figure 12: Colocalization of GWAS and eQTL signals at prioritised candidate genes.
2 Mb windows around the genes prioritised by the GWAS/PCHi-C based algorithm in rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) were overlapped with eQTLs for the same genes in B cells and monocytes. In five cases high LD (r2>0.8) was detected between the GWAS lead SNP and the eQTL lead SNP in the 2 Mb regions. Manhattan plots are shown for two SLE-prioritized genes (SLC15A4, panel A; BLK, panel B) and two RA-prioritized genes (GIN1, panel C; RASGRP1, panel D) for which high LD (r2>0.8) was detected between the GWAS lead SNP and the eQTL lead SNP, providing further evidence for colocalization of the GWAS and eQTL signals in these regions.
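The r2>0.8 criterion above can be illustrated with a short sketch. It assumes LD is approximated as the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of an allele) at two SNPs across the same individuals; the source does not state how LD was computed, and the function name and threshold constant below are illustrative only.

```python
HIGH_LD_THRESHOLD = 0.8  # threshold quoted in the figure legend


def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two genotype dosage vectors
    (an approximation of LD r2; an assumption, not the source's method)."""
    n = len(dosages_a)
    if n == 0 or n != len(dosages_b):
        raise ValueError("dosage vectors must be non-empty and of equal length")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var_a * var_b)


def in_high_ld(dosages_a, dosages_b):
    """True when the two SNPs exceed the r2 threshold."""
    return ld_r2(dosages_a, dosages_b) > HIGH_LD_THRESHOLD
```

In this sketch a GWAS lead SNP and an eQTL lead SNP would be flagged as candidates for colocalization when `in_high_ld` returns `True` for their dosage vectors.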
DETAILED DESCRIPTION OF THE INVENTION
Long-range interactions between DNA regulatory elements and their target genes play major roles in gene regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Promoter capture Hi-C is used herein to identify interacting regions of 31,253 promoters in 17 human primary haematopoietic cell types. It is shown that long-range promoter interactions are highly cell-type specific, preferentially linking active promoters and enhancers. Patterns of promoter interactions reflect cell lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched for expression quantitative trait loci with expression effects on their interacting target genes. This rich resource of interactome maps is exploited to connect non-coding disease variants to their target promoters, identifying hundreds of new disease-candidate genes. The results herein demonstrate the power of promoter interactomes from primary cells to reveal insights into global genomic regulatory mechanisms and gene pathways underlying disease pathologies.
Drug Targets
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 1 or 2 for use in the treatment of a blood disorder. In one embodiment, the blood disorder is selected from a platelet disorder and a red blood cell disorder. In a further embodiment, the blood disorder is a platelet disorder and the one or more biomarkers are selected from Table 1. In a yet further embodiment, the blood disorder is a platelet disorder and the one or more biomarkers are selected from one or more, or all, of: CD22, BAZ2A, JAK2, CHRNE, PTGES3, CYP27B1, OPRD1, CA14, CD274, ABCC4, FFAR2, PRMT1, PLD2, SLC39A5, MINK1, SIRT3, RNPEPL1, PSMB6, SLC2A12, TAOK1, NUAK2, GPR182, BRD3, JMJD1C, NLRP6, TBK1 and NRBP2.
In an alternative embodiment, the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from Table 2. In a yet further embodiment, the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from one or more, or all, of: CASP10, SLC25A39, CLK1, ATP2B4, CASP8, PTGS2, STRADB, AURKA, JAK2, IL2RB, KAT8, KCNN4, DOT1L, SLC12A7, PLCG1, LPIN3, AMHR2, GABBR2, ADAM10, IFNAR1, SLC6A3, PADI3, SLC25A37, UGCG, KCNMA1, MAPK13, ITGAD, IFNAR2, PADI4, NEK8, NUAK2, FABP1, SLC51A, ABCA1, BRD7, SMPD1, ILK, CDK12, VKORC1, BRD3, GPR152, RPS6KB2, TMPRSS6, TOP1, S1PR3 and FNTB.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 3 to 7 for use in the treatment of an autoimmune disorder. In one embodiment, the autoimmune disorder is selected from ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease. In a further embodiment, the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from Table 3. In a yet further embodiment, the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from one or more, or all, of: CDK11A, SLC11A1, IFNGR1, ATP6V0A1, ENTPD2, EIF2AK1, SLC26A3, JAK2, ABCA2, CHRNE, IL1RL2, CD274, SLC7A10, CDK4, SLC26A10, ADAM10, MINK1, PSMB6, RORC, ADAMTS16, INPP5E, PLCH2, STK32B, TNFRSF14, STK36, BRD7, PIP4K2C, ADAM9, ADRA1B, PTGER4, BCL2, DPP7, SLC9A4, CXCR2, EHMT1, PRKAR1B, TUBB4B, MMP23B, PIM3, SLC34A3, SGMS1, SLC35E2 and CDK11B.
In an alternative embodiment, the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from Table 4. In a yet further embodiment, the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from one or more, or all, of: CYP24A1, IFNGR1, PDE4A, MAPK1, CSF2RB, PTK6, SLC17A7, PDE4C, SCNN1A, LTBR, USP5, NEK9, SRMS, SLC34A1, IL2RA, SLC26A10, CD27, NCOA2, PTPRK, CXCR5, ATP1A1, SLC9B1, SLC9B2, IL22RA2, NSD1, PIP4K2C, RGS14, ADRA1B, PTGER4, GPR160, S1PR5, LPAR5, SEMA4D, P2RY11 and GPR162.
In an alternative embodiment, the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from Table 5. In a yet further embodiment, the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from one or more, or all, of: IFNGR1, PDE4A, IDI1, FDFT1, KEAP1, PREP, MAP3K1, CD40, CDK6, SLC26A8, CCR6, PCCB, CYP20A1, NEK9, ACAT2, SLC44A2, IL6ST, IL2RA, SLC26A10, BLK, SLC35B2, TNFRSF14, IFNAR2, CXCR5, IL6R, NEK10, CTSB, PIP4K2C, CXCR6 and S1PR5.
In an alternative embodiment, the autoimmune disorder is celiac disease and the one or more biomarkers are selected from Table 6. In a yet further embodiment, the autoimmune disorder is celiac disease and the one or more biomarkers are selected from one or more, or all, of: IL20RA, IFNGR1, DAPK2, ZMYND8, ITGA4, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, PTPRK, IL22RA2 and GPR160.
In an alternative embodiment, the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from Table 7. In a yet further embodiment, the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from one or more, or all, of: ATP6V0A1, ITGA8, MAP3K1, JAK2, ATP5D, TYK2, CCR6, LNPEP, KCNJ13, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, STK11, CD274, PIK3CA, SLC44A2, IL2RA, GPR65, FURIN, HCN3, JAK1, USP1, SLC9B1, ERAP1, ERAP2, BRD7, INPP5D, ADRA1B, PTGER4, TRIB1, CLK2, S1PR5, FES and P2RY11.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Tables 8 or 9 for use in the treatment of diabetes.
In one embodiment, the diabetes is selected from Type 1 diabetes and Type 2 diabetes.
In a further embodiment, the diabetes is Type 1 diabetes and the one or more biomarkers are selected from Table 8. In a yet further embodiment, the diabetes is Type 1 diabetes and the one or more biomarkers are selected from one or more, or all, of: FYN, HDAC9, ERBB3, PRKCQ, TRPM5, TRIB2, PTPRC, CTSH, CDK6, SLC22A18, RGS2, GPR18, CCR7, RARA, IL2RA, AMHR2, SLC25A47, CAMK2D, NEK7, ATP6V1G3, GPR183, PTEN, SSTR2, GPR19, TMPRSS6, SLC25A29 and CD3E.
In an alternative embodiment, the diabetes is Type 2 diabetes and the one or more biomarkers are selected from Table 9. In a yet further embodiment, the diabetes is Type 2 diabetes and the one or more biomarkers are selected from one or more, or all, of: RORA, MAP3K1, PLCG1, RIOK1 and TOP1.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 10 for use in the treatment of height/growth disorders. In one embodiment, the one or more height/growth disorders biomarkers are selected from one or more, or all, of: MARK4, MMP25, PRKCH, MAP2K3, MAPK9, LPAR2, MAP2K4, PDE4A, ATP2B1, TRIB2, EED, SPTLC1, CMA1, SLC7A8, SIRT1, JAK2, PCSK5, CTSG, GZMB, CTSZ, SLCO4A1, NTSR1, LPIN2, KCNN4, DOT1L, VRK3, TYK2, GSK3A, CDK6, EZH2, SLC16A6, CPZ, SLC11A2, ITK, ACVR2B, ODC1, ECE1, NEK2, CA14, SLC16A7, NT5C3A, TRIM24, CCR7, MAP2K2, PPAT, SLC44A2, RIPK3, ADCY4, STK33, DNMT1, ANO1, RARA, PRKAB2, PRKAA1, ADAM19, AMHR2, PREPL, MAP3K12, ITGB7, TSSK4, SLC16A3, PIP4K2B, SIRT3, PADI3, CELSR2, SLC18B1, HTR7, ADAM12, RGS10, PTPRJ, PIP4K2A, NR3C2, KCNJ16, PRKCA, PDE6D, NPR2, SIK3, CXCR5, CHRNB2, FGFR4, PLCD3, ADCY9, NEK10, RYK, KDM1B, BRD7, PIP4K2C, FASN, CDC42BPG, PTGER4, SLC25A33, EIF2AK3, OXSR1, CHRNA9, HTR3C, S1PR5, SCN5A, KCNJ12, NT5C1B, PLCD1, LPAR1, SLC5A3, GRK5, L3MBTL3, LTB4R, LTB4R2, DHFR, P2RY11, FNTB, S1PR2, LTB4R2 and SLC5A3. Examples of suitable height/growth disorders include but are not restricted to dwarfism and gigantism.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 11 for use in the treatment of disorders related to lipid metabolism. In one embodiment, the one or more lipid metabolism biomarkers are selected from one or more, or all, of: MARK4, BAZ1B, NPC1L1, PRSS8, SLC12A3, EIF2AK1, MAP3K1, CYP26A1, KAT8, AEBP1, PCCB, PLCG1, SLC25A35, RIPK3, ADCY4, RAF1, PPARG, ABCB10, BLK, ADAM10, TSSK4, NLRC5, PNMT, CELSR2, SCN3A, GPR61, CPA2, SLC45A3, SIK3, PCSK7, GPR146, ABCA1, SLC35G2, PCSK9, FPR2, FPR1, JMJD1C, SLC16A11, SLC16A13, IL20RB, F2, SLC25A42, NRBP2, BACE1 and TOP1.
Examples of suitable disorders related to lipid metabolism include but are not restricted to stroke, heart disease and vascular disease.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 12 for use in the treatment of disorders related to glucose metabolism. In one embodiment, the one or more glucose metabolism biomarkers are selected from one or more, or all, of: NPC1L1, NR1H3, AEBP1, PLCG1, STK33, LPIN3, MTNR1B, SLC9B1, SLC9B2, SLC30A8, SLC39A13, STK39 and TOP1.
Examples of suitable disorders related to glucose metabolism include but are not restricted to Type II diabetes and obesity.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 13 for use in the treatment of disorders related to insulin metabolism. In one embodiment, the one or more insulin metabolism biomarkers are selected from either or both of: PPARG and TBCK. Examples of suitable disorders related to insulin metabolism include but are not restricted to Type II diabetes and obesity.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 14 for use in the treatment of disorders related to bone mineral density. In one embodiment, the one or more bone mineral density biomarkers are selected from one or more, or all, of: SLC25A39, CLCN7, AMHR2, MAP3K12, ITGB7, TNFRSF11A, SLC26A1, RARG and GAK. Examples of suitable disorders related to bone mineral density include but are not restricted to osteoporosis and fracture risk.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 15 for use in the treatment of disorders related to blood pressure. In one embodiment, the one or more blood pressure biomarkers are selected from one or more, or all, of: CLCN6, CSK, PREPL, FURIN, PLCD3, NEK10 and FES. Examples of suitable disorders related to blood pressure include but are not restricted to heart attack, stroke, kidney disease and vascular dementia.
According to a further aspect of the invention there is provided a modulator of one or more of the biomarkers of Table 16 for use in the treatment of disorders related to body mass index. In one embodiment, the one or more body mass index biomarkers are selected from one or more, or all, of: GIPR, PRSS8, AQP6, BCKDK, KAT8, MTCH2, ASIC1, PPARG, CSNK1G2, MAP2K5, ITGAX, SIRT3, CPO, GPR61, ADCY9, USP1, F2RL1, KCNG3, KCNK3, CD19, SLC25A22, GPBAR1 and ATP2A1. Examples of suitable disorders related to body mass index include but are not restricted to obesity.
References herein to the term "one or more" include references to any number between one and the maximum number of biomarkers within each table. In particular, one or more includes references to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more biomarkers.
The term "biomarker" means a distinctive biological or biologically derived indicator of a process, event, or condition. Biomarkers can be used in methods of diagnosis, e.g. clinical screening, and prognosis assessment and in monitoring the results of therapy, identifying patients most likely to respond to a particular therapeutic treatment, drug screening and development. Biomarkers and uses thereof are valuable for identification of new drug treatments and for discovery of new targets for drug treatment.
References herein to the term "modulator" refer to any agent, such as an inhibitor (i.e. competitive, non-competitive or un-competitive inhibitor) or antagonist (i.e. competitive, non-competitive or un-competitive antagonist), activator or agonist (i.e. full inverse agonist, partial inverse agonist, silent antagonist, partial agonist, full agonist or super agonist) capable of modulating the signaling effected by the biomarker.
Biomarkers
According to a further aspect of the invention there is provided the use of one or more of the biomarkers as defined herein for the diagnosis or prognosis of a disease or disorder as defined herein.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Tables 1 or 2 for the diagnosis or prognosis of a blood disorder.
In one embodiment, the blood disorder is selected from a platelet disorder and a red blood cell disorder. In a further embodiment, the blood disorder is a platelet disorder and the one or more biomarkers are selected from Table 1. In a yet further embodiment, the blood disorder is a platelet disorder and the one or more biomarkers are selected from one or more, or all, of: CD22, BAZ2A, JAK2, CHRNE, PTGES3, CYP27B1, OPRD1, CA14, CD274, ABCC4, FFAR2, PRMT1, PLD2, SLC39A5, MINK1, SIRT3, RNPEPL1, PSMB6, SLC2A12, TAOK1, NUAK2, GPR182, BRD3, JMJD1C, NLRP6, TBK1 and NRBP2.
In an alternative embodiment, the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from Table 2. In a yet further embodiment, the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from one or more, or all, of: CASP10, SLC25A39, CLK1, ATP2B4, CASP8, PTGS2, STRADB, AURKA, JAK2, IL2RB, KAT8, KCNN4, DOT1L, SLC12A7, PLCG1, LPIN3, AMHR2, GABBR2, ADAM10, IFNAR1, SLC6A3, PADI3, SLC25A37, UGCG, KCNMA1, MAPK13, ITGAD, IFNAR2, PADI4, NEK8, NUAK2, FABP1, SLC51A, ABCA1, BRD7, SMPD1, ILK, CDK12, VKORC1, BRD3, GPR152, RPS6KB2, TMPRSS6, TOP1, S1PR3 and FNTB.
It will be appreciated that any one or more of the biomarkers from Tables 1 and 2 (or any preferred subsets of biomarkers listed in the above mentioned embodiments) may be used to differentially diagnose a first blood disorder from a second blood disorder.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Tables 3 to 7 for the diagnosis or prognosis of an autoimmune disorder.
In one embodiment, the autoimmune disorder is selected from ulcerative colitis, multiple sclerosis, rheumatoid arthritis, celiac disease and Crohn's disease. In a further embodiment, the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from Table 3. In a yet further embodiment, the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from one or more, or all, of: CDK11A, SLC11A1, IFNGR1, ATP6V0A1, ENTPD2, EIF2AK1, SLC26A3, JAK2, ABCA2, CHRNE, IL1RL2, CD274, SLC7A10, CDK4, SLC26A10, ADAM10, MINK1, PSMB6, RORC, ADAMTS16, INPP5E, PLCH2, STK32B, TNFRSF14, STK36, BRD7, PIP4K2C, ADAM9, ADRA1B, PTGER4, BCL2, DPP7, SLC9A4, CXCR2, EHMT1, PRKAR1B, TUBB4B, MMP23B, PIM3, SLC34A3, SGMS1, SLC35E2 and CDK11B.
In an alternative embodiment, the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from Table 4. In a yet further embodiment, the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from one or more, or all, of: CYP24A1, IFNGR1, PDE4A, MAPK1, CSF2RB, PTK6, SLC17A7, PDE4C, SCNN1A, LTBR, USP5, NEK9, SRMS, SLC34A1, IL2RA, SLC26A10, CD27, NCOA2, PTPRK, CXCR5, ATP1A1, SLC9B1, SLC9B2, IL22RA2, NSD1, PIP4K2C, RGS14, ADRA1B, PTGER4, GPR160, S1PR5, LPAR5, SEMA4D, P2RY11 and GPR162.
In an alternative embodiment, the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from Table 5. In a yet further embodiment, the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from one or more, or all, of: IFNGR1, PDE4A, IDI1, FDFT1, KEAP1, PREP, MAP3K1, CD40, CDK6, SLC26A8, CCR6, PCCB, CYP20A1, NEK9, ACAT2, SLC44A2, IL6ST, IL2RA, SLC26A10, BLK, SLC35B2, TNFRSF14, IFNAR2, CXCR5, IL6R, NEK10, CTSB, PIP4K2C, CXCR6 and S1PR5.
In an alternative embodiment, the autoimmune disorder is celiac disease and the one or more biomarkers are selected from Table 6. In a yet further embodiment, the autoimmune disorder is celiac disease and the one or more biomarkers are selected from one or more, or all, of: IL20RA, IFNGR1, DAPK2, ZMYND8, ITGA4, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, PTPRK, IL22RA2 and GPR160.
In an alternative embodiment, the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from Table 7. In a yet further embodiment, the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from one or more, or all, of: ATP6V0A1, ITGA8, MAP3K1, JAK2, ATP5D, TYK2, CCR6, LNPEP, KCNJ13, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, STK11, CD274, PIK3CA, SLC44A2, IL2RA, GPR65, FURIN, HCN3, JAK1, USP1, SLC9B1, ERAP1, ERAP2, BRD7, INPP5D, ADRA1B, PTGER4, TRIB1, CLK2, S1PR5, FES and P2RY11.
It will be appreciated that any one or more of the biomarkers from Tables 3 to 7 (or any preferred subsets of biomarkers listed in the above mentioned embodiments) may be used to differentially diagnose a first autoimmune disorder from one or more further autoimmune disorders.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Tables 8 or 9 for the diagnosis or prognosis of diabetes.
In one embodiment, the diabetes is selected from Type 1 diabetes and Type 2 diabetes.
In a further embodiment, the diabetes is Type 1 diabetes and the one or more biomarkers are selected from Table 8. In a yet further embodiment, the diabetes is Type 1 diabetes and the one or more biomarkers are selected from one or more, or all, of: FYN, HDAC9, ERBB3, PRKCQ, TRPM5, TRIB2, PTPRC, CTSH, CDK6, SLC22A18, RGS2, GPR18, CCR7, RARA, IL2RA, AMHR2, SLC25A47, CAMK2D, NEK7, ATP6V1G3, GPR183, PTEN, SSTR2, GPR19, TMPRSS6, SLC25A29 and CD3E.
In an alternative embodiment, the diabetes is Type 2 diabetes and the one or more biomarkers are selected from Table 9. In a yet further embodiment, the diabetes is Type 2 diabetes and the one or more biomarkers are selected from one or more, or all, of: RORA, MAP3K1, PLCG1, RIOK1 and TOP1.
It will be appreciated that any one or more of the biomarkers from Tables 8 and 9 (or any preferred subsets of biomarkers listed in the above mentioned embodiments) may be used to differentially diagnose Type 1 diabetes from Type 2 diabetes.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 10 for the diagnosis or prognosis of height/growth disorders. In one embodiment, the one or more height/growth disorders biomarkers are selected from one or more, or all, of: MARK4, MMP25, PRKCH, MAP2K3, MAPK9, LPAR2, MAP2K4, PDE4A, ATP2B1, TRIB2, EED, SPTLC1, CMA1, SLC7A8, SIRT1, JAK2, PCSK5, CTSG, GZMB, CTSZ, SLCO4A1, NTSR1, LPIN2, KCNN4, DOT1L, VRK3, TYK2, GSK3A, CDK6, EZH2, SLC16A6, CPZ, SLC11A2, ITK, ACVR2B, ODC1, ECE1, NEK2, CA14, SLC16A7, NT5C3A, TRIM24, CCR7, MAP2K2, PPAT, SLC44A2, RIPK3, ADCY4, STK33, DNMT1, ANO1, RARA, PRKAB2, PRKAA1, ADAM19, AMHR2, PREPL, MAP3K12, ITGB7, TSSK4, SLC16A3, PIP4K2B, SIRT3, PADI3, CELSR2, SLC18B1, HTR7, ADAM12, RGS10, PTPRJ, PIP4K2A, NR3C2, KCNJ16, PRKCA, PDE6D, NPR2, SIK3, CXCR5, CHRNB2, FGFR4, PLCD3, ADCY9, NEK10, RYK, KDM1B, BRD7, PIP4K2C, FASN, CDC42BPG, PTGER4, SLC25A33, EIF2AK3, OXSR1, CHRNA9, HTR3C, S1PR5, SCN5A, KCNJ12, NT5C1B, PLCD1, LPAR1, SLC5A3, GRK5, L3MBTL3, LTB4R, LTB4R2, DHFR, P2RY11, FNTB, S1PR2, LTB4R2 and SLC5A3.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 11 for the diagnosis or prognosis of disorders related to lipid metabolism. In one embodiment, the one or more lipid metabolism biomarkers are selected from one or more, or all, of: MARK4, BAZ1B, NPC1L1, PRSS8, SLC12A3, EIF2AK1, MAP3K1, CYP26A1, KAT8, AEBP1, PCCB, PLCG1, SLC25A35, RIPK3, ADCY4, RAF1, PPARG, ABCB10, BLK, ADAM10, TSSK4, NLRC5, PNMT, CELSR2, SCN3A, GPR61, CPA2, SLC45A3, SIK3, PCSK7, GPR146, ABCA1, SLC35G2, PCSK9, FPR2, FPR1, JMJD1C, SLC16A11, SLC16A13, IL20RB, F2, SLC25A42, NRBP2, BACE1 and TOP1.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 12 for the diagnosis or prognosis of disorders related to glucose metabolism. In one embodiment, the one or more glucose metabolism biomarkers are selected from one or more, or all, of: NPC1L1, NR1H3, AEBP1, PLCG1, STK33, LPIN3, MTNR1B, SLC9B1, SLC9B2, SLC30A8, SLC39A13, STK39 and TOP1.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 13 for the diagnosis or prognosis of disorders related to insulin metabolism. In one embodiment, the one or more insulin metabolism biomarkers are selected from either or both of: PPARG and TBCK.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 14 for the diagnosis or prognosis of disorders related to bone mineral density. In one embodiment, the one or more bone mineral density biomarkers are selected from one or more, or all, of: SLC25A39, CLCN7, AMHR2, MAP3K12, ITGB7, TNFRSF11A, SLC26A1, RARG and GAK.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 15 for the diagnosis or prognosis of disorders related to blood pressure. In one embodiment, the one or more blood pressure biomarkers are selected from one or more, or all, of: CLCN6, CSK, PREPL, FURIN, PLCD3, NEK10 and FES.
According to a further aspect of the invention there is provided the use of one or more of the biomarkers of Table 16 for the diagnosis or prognosis of disorders related to body mass index. In one embodiment, the one or more body mass index biomarkers are selected from one or more, or all, of: GIPR, PRSS8, AQP6, BCKDK, KAT8, MTCH2, ASIC1, PPARG, CSNK1G2, MAP2K5, ITGAX, SIRT3, CPO, GPR61, ADCY9, USP1, F2RL1, KCNG3, KCNK3, CD19, SLC25A22, GPBAR1 and ATP2A1.
Table 1: Blood Disorders (Platelet disorders)
Gene Name Ensembl ID Gene Name Ensembl ID
ABCC4 ENSG00000125257 RPL13A ENSG00000142541
IRF1 ENSG00000125347 ANKMY1 ENSG00000144504
FFAR2 ENSG00000126262 RP11-977G19.10 ENSG00000144785
BCL2L12 ENSG00000126453 IQGAP2 ENSG00000145703
IRF3 ENSG00000126456 LIX1 ENSG00000145721
PRMT1 ENSG00000126457 SLC2A12 ENSG00000146411
EMG1 ENSG00000126749 ASB15 ENSG00000146809
HELB ENSG00000127311 MFHAS1 ENSG00000147324
MKLN1 ENSG00000128585 AK3 ENSG00000147853
PLD2 ENSG00000129219 STOM ENSG00000148175
APOE ENSG00000130203 GSN ENSG00000148180
KIF1A ENSG00000130294 SURF4 ENSG00000148248
ZFC3H1 ENSG00000133858 INCENP ENSG00000149503
USP30 ENSG00000135093 DST ENSG00000151914
AGAP2 ENSG00000135439 BEND6 ENSG00000151917
TSPAN31 ENSG00000135452 ZFP36L2 ENSG00000152518
ZC3H10 ENSG00000135482 PLOD2 ENSG00000152952
HNRNPA1 ENSG00000135486 RASSF3 ENSG00000153179
OS9 ENSG00000135506 AHCTF1 ENSG00000153207
GNS ENSG00000135677 CMIP ENSG00000153815
PIGC ENSG00000135845 CHD1 ENSG00000153922
DOCK10 ENSG00000135905 TAOK1 ENSG00000160551
ISCU ENSG00000136003 ALDH16A1 ENSG00000161618
LRCH1 ENSG00000136141 ASB16 ENSG00000161664
MARCH9 ENSG00000139266 FAM171A2 ENSG00000161682
FOXN4 ENSG00000139445 MED11 ENSG00000161920
SLC39A5 ENSG00000139540 CXCL16 ENSG00000161921
NABP2 ENSG00000139579 SEPN1 ENSG00000162430
SMARCC2 ENSG00000139613 MATN1 ENSG00000162510
ESYT1 ENSG00000139641 NUAK2 ENSG00000163545
MORN3 ENSG00000139714 UBLCP1 ENSG00000164332
TPM1 ENSG00000140416 IQUB ENSG00000164675
UNC45B ENSG00000141161 REEP3 ENSG00000165476
PELP1 ENSG00000141456 SLFN5 ENSG00000166750
ARRB2 ENSG00000141480 GPR182 ENSG00000166856
ZMYND15 ENSG00000141497 ZBTB39 ENSG00000166860
MINK1 ENSG00000141503 TAC3 ENSG00000166863
SIRT3 ENSG00000142082 MYO1A ENSG00000166866
ATHL1 ENSG00000142102 TMEM194A ENSG00000166881
RNPEPL1 ENSG00000142327 NAB2 ENSG00000166886
PSMB6 ENSG00000142507 STAT6 ENSG00000166888
XRCC6BP1 ENSG00000166896 TANGO2 ENSG00000183597
MARS ENSG00000166986 TBK1 ENSG00000183735
GATAD2A ENSG00000167491 C9orf66 ENSG00000183784
TP53I13 ENSG00000167543 BRF1 ENSG00000185024
GGT6 ENSG00000167741 ANO9 ENSG00000185101
RAB3IL1 ENSG00000167994 NRBP2 ENSG00000185189
BEST1 ENSG00000167995 TNFAIP2 ENSG00000185215
FTH1 ENSG00000167996 GP1BA ENSG00000185245
TMUB2 ENSG00000168591 SOCS1 ENSG00000185338
ABHD15 ENSG00000168792 PSMD13 ENSG00000185627
BRD3 ENSG00000169925 NDUFA4L2 ENSG00000185633
WIBG ENSG00000170473 BLOC1S4 ENSG00000186222
PA2G4 ENSG00000170515 ZNF669 ENSG00000188295
KIAA0232 ENSG00000170871 DUSP28 ENSG00000188542
RPS7 ENSG00000171863 TMEM120B ENSG00000188735
JMJD1C ENSG00000171988 OR13G1 ENSG00000197437
LCLAT1 ENSG00000172954 C5orf56 ENSG00000197536
THAP2 ENSG00000173451 PDCD1LG2 ENSG00000197646
MYO1H ENSG00000174527 YTHDF2 ENSG00000198492
NLRP6 ENSG00000174885 FICD ENSG00000198855
PTDSS2 ENSG00000174915 C9orf96 ENSG00000198870
SH3BP5L ENSG00000175137 PHACTR4 ENSG00000204138
INHBC ENSG00000175189 C17orf107 ENSG00000205710
RMI2 ENSG00000175643 LBH ENSG00000213626
HIC1 ENSG00000177374 CAPN14 ENSG00000214711
FAM187B ENSG00000177558 PHB2 ENSG00000215021
PLEC ENSG00000178209 EPPK1 ENSG00000227184
PARP10 ENSG00000178685 RPL41 ENSG00000229117
GRINA ENSG00000178719 OR2W3 ENSG00000238243
FAM101A ENSG00000178882 OR14A2 ENSG00000241128
MRFAP1 ENSG00000179010 CCDC71L ENSG00000253276
PACS2 ENSG00000179364 CLDN23 ENSG00000253958
R3HDM2 ENSG00000179912 CNPY2 ENSG00000257727
PUF60 ENSG00000179950 RP11-571M6.15 ENSG00000257921
TRNAU1AP ENSG00000180098 RP4-576H24.4 ENSG00000260861
C1orf105 ENSG00000180999 OVCA2 ENSG00000262664
RNF41 ENSG00000181852 RP11-145E5.5 ENSG00000264545
DEXI ENSG00000182108 HBCBP ENSG00000268162
B4GALNT4 ENSG00000182272 AP003733.1 ENSG00000269089
GLTPD2 ENSG00000182327 VMO1 ENSG00000182853
Table 2: Blood Disorders (Red Blood Cell Disorders)
Gene Name Ensembl ID Gene Name Ensembl ID
AKAP10 ENSG00000108599 UBTF ENSG00000108312
MYL2 ENSG00000111245 RAB22A ENSG00000124209
CUX2 ENSG00000111249 PPP4R1L ENSG00000124224
SH2B3 ENSG00000111252 ZNF576 ENSG00000124444
HDDC2 ENSG00000111906 LYPD3 ENSG00000124466
HINT3 ENSG00000111911 PRICKLE4 ENSG00000124593
C6orf62 ENSG00000112308 MED20 ENSG00000124641
GMNN ENSG00000112312 NRSN2 ENSG00000125841
HBS1L ENSG00000112339 MAX ENSG00000125952
QKI ENSG00000112531 CEP250 ENSG00000126001
CCND3 ENSG00000112576 PLEKHG3 ENSG00000126822
BYSL ENSG00000112578 ADM2 ENSG00000128165
SLC12A7 ENSG00000113504 SPECC1 ENSG00000128487
SSR3 ENSG00000114850 DAD1 ENSG00000129562
CHMP3 ENSG00000115561 CDH15 ENSG00000129910
ORC2 ENSG00000115942 KLF16 ENSG00000129911
TRAK2 ENSG00000115993 XPO7 ENSG00000130227
CEP104 ENSG00000116198 LSM7 ENSG00000130332
WRAP73 ENSG00000116213 KLHDC7B ENSG00000130487
RPL22 ENSG00000116251 SCO2 ENSG00000130489
PHF13 ENSG00000116273 GATA5 ENSG00000130700
PDC ENSG00000116703 PPIL4 ENSG00000131013
C1orf109 ENSG00000116922 RBM39 ENSG00000131051
PADI2 ENSG00000117115 ZNF428 ENSG00000131116
UAP1 ENSG00000117143 NPHP4 ENSG00000131697
ARTN ENSG00000117407 C19orf40 ENSG00000131944
CREB1 ENSG00000118260 FIGNL1 ENSG00000132436
MYB ENSG00000118513 FLOT2 ENSG00000132589
CCND2 ENSG00000118971 ERAL1 ENSG00000132591
NDUFB3 ENSG00000119013 LPIN3 ENSG00000132793
HOXB8 ENSG00000120068 CHIT1 ENSG00000133063
HOXB5 ENSG00000120075 CCNA1 ENSG00000133101
HOXB3 ENSG00000120093 SPG20 ENSG00000133104
SPG20OS ENSG00000120664 RFXAP ENSG00000133111
UFM1 ENSG00000120686 BTBD2 ENSG00000133243
CEP89 ENSG00000121289 C1QTNF6 ENSG00000133466
FMOD ENSG00000122176 CDCA8 ENSG00000134690
RBM19 ENSG00000122965 TBX3 ENSG00000135111
ACADS ENSG00000122971 AMHR2 ENSG00000135409
PLCG1 ENSG00000124181 HEY2 ENSG00000135547
CCDC102A ENSG00000135736 IFNAR2 ENSG00000159110
HUS1 ENSG00000136273 PADI4 ENSG00000159339
IREB2 ENSG00000136381 BTG2 ENSG00000159388
GABBR2 ENSG00000136928 GPR114 ENSG00000159618
C9orf156 ENSG00000136932 TEPP ENSG00000159648
FRS3 ENSG00000137218 NEK8 ENSG00000160602
HIST1H2AB ENSG00000137259 TLCD1 ENSG00000160606
TAF8 ENSG00000137413 NACC1 ENSG00000160877
ADAM10 ENSG00000137845 IER2 ENSG00000160888
MSMB ENSG00000138294 YDJC ENSG00000161179
PPCDC ENSG00000138621 CCDC116 ENSG00000161180
MAN2C1 ENSG00000140400 FAM171A2 ENSG00000161682
ARMC5 ENSG00000140691 SYCE2 ENSG00000161860
TMC6 ENSG00000141524 TAL1 ENSG00000162367
IFNAR1 ENSG00000142166 KLHL21 ENSG00000162413
SLC6A3 ENSG00000142319 MEGF6 ENSG00000162591
PADI3 ENSG00000142619 SGOL2 ENSG00000163535
THNSL2 ENSG00000144115 NUAK2 ENSG00000163545
METTL21A ENSG00000144401 FABP1 ENSG00000163586
PHLDB2 ENSG00000144824 C4orf36 ENSG00000163633
MTFR2 ENSG00000146410 TIPARP ENSG00000163659
GIGYF1 ENSG00000146830 CCNL1 ENSG00000163660
SLC25A37 ENSG00000147454 DNASE1L3 ENSG00000163687
UGCG ENSG00000148154 ZDHHC19 ENSG00000163958
PRSS53 ENSG00000151006 SLC51A ENSG00000163959
ZFP36L2 ENSG00000152518 MAD2L1 ENSG00000164109
SAMSN1 ENSG00000155307 C7orf72 ENSG00000164500
LARP1 ENSG00000155506 SAP30L ENSG00000164576
CNOT8 ENSG00000155508 USP49 ENSG00000164663
KCTD18 ENSG00000155729 SPIDR ENSG00000164808
FAM126B ENSG00000155744 TP53INP1 ENSG00000164938
TMEM237 ENSG00000155755 INTS8 ENSG00000164941
RASA2 ENSG00000155903 ABCA1 ENSG00000165029
KCNMA1 ENSG00000156113 MARCH8 ENSG00000165406
HKDC1 ENSG00000156510 C10orf25 ENSG00000165511
MAPK13 ENSG00000156711 ZNF22 ENSG00000165512
COX6A2 ENSG00000156885 BRD7 ENSG00000166164
ITGAD ENSG00000156886 SMPD1 ENSG00000166311
C1orf27 ENSG00000157181 APBB1 ENSG00000166313
RNF207 ENSG00000158286 ILK ENSG00000166333
NPM2 ENSG00000158806 TAF10 ENSG00000166337
DCHS1 ENSG00000166341 UNC1 19B ENSG00000175970 Gene Name Ensembl ID Gene Name Ensembl ID
RPL27A ENSG00000166441 ZNF575 ENSG00000176472
NYAP1 ENSG00000166924 ZBTB38 ENSG0000017731 1
TSC22D4 ENSG00000166925 GRB2 ENSG00000177885
C15orf39 ENSG00000167173 ODF3B ENSG00000177989
TBC1D2B ENSG00000167202 PRSS36 ENSG00000178226
CDK12 ENSG00000167258 DDC8 ENSG00000178404
POP5 ENSG00000167272 FARSA ENSG00000179115
IRGQ ENSG00000167378 CALR ENSG00000179218
ZNF668 ENSG00000167394 RAD23A ENSG00000179262
ZNF646 ENSG00000167395 GADD45GIP1 ENSG00000179271
VKORC1 ENSG00000167397 DAND5 ENSG00000179284
DHRS13 ENSG00000167536 BBS12 ENSG00000181004
TMC8 ENSG00000167895 CLDN7 ENSG00000181885
ITFG3 ENSG00000167930 PAPPA ENSG00000182752
ACOX2 ENSG00000168306 SH2D7 ENSG00000183476
IRS1 ENSG00000169047 ZFP36L1 ENSG00000185650
XPO6 ENSG00000169180 IKZF1 ENSG00000185811
DFFB ENSG00000169598 C7orf61 ENSG00000185955
C15orf40 ENSG00000169609 C9orf47 ENSG00000186354
ZNF32 ENSG00000169740 FSD2 ENSG00000186628
BRD3 ENSG00000169925 PRR5 ENSG00000186654
ZNF778 ENSG00000170100 TMPRSS6 ENSG00000187045
FAXDC2 ENSG00000170271 HMX3 ENSG00000188620
ELP5 ENSG00000170291 OPTC ENSG00000188770
HOXB9 ENSG00000170689 PRELP ENSG00000188783
KRCC1 ENSG00000172086 NIF3L1 ENSG00000196290
CEBPB ENSG00000172216 PLXNB2 ENSG00000196576
AFF1 ENSG00000172493 HIST1H2AD ENSG00000196866
TMEM134 ENSG00000172663 HIST1H3D ENSG00000197409
ZFAND4 ENSG00000172671 LEKR1 ENSG00000197980
CORO1B ENSG00000172725 RPL23A ENSG00000198242
ANKRD13D ENSG00000172932 TOP1 ENSG00000198900
ADRBK1 ENSG00000173020 C1orf174 ENSG00000198912
FAM222B ENSG00000173065 KIAA1279 ENSG00000198954
ZHX3 ENSG00000174306 SAP25 ENSG00000205307
C20orf166 ENSG00000174407 PTPRCAP ENSG00000213402
GPR152 ENSG00000175514 S1PR3 ENSG00000213694
CABP4 ENSG00000175544 KCTD11 ENSG00000213859
RPS6KB2 ENSG00000175634 NUDT19 ENSG00000213965
PLEKHF2 ENSG00000175895 SMCO1 ENSG00000214097
RP1-139D8.6 ENSG00000214732 TOMM6 ENSG00000214736
AC003102.1 ENSG00000214921 PPIL3 ENSG00000240344
SYCE3 ENSG00000217442 PLCXD2 ENSG00000240891
CEBPD ENSG00000221869 CEBPA ENSG00000245848
SRRM5 ENSG00000226763 TNFSF12-TNFSF13 ENSG00000248871
C21orf54 ENSG00000229086 RNF103-CHMP3 ENSG00000249884
ZNF668 ENSG00000232748 RP11-196G11.1 ENSG00000255439
PINLYP ENSG00000234465 FNTB ENSG00000257365
MCIDAS ENSG00000234602 AE000662.92 ENSG00000259003
SMIM1 ENSG00000235169 HOXB7 ENSG00000260027
RNF103 ENSG00000239305 RP1-4G17.5 ENSG00000262302
TNFSF12 ENSG00000239697 L34079.2 ENSG00000268361
Table 3: Autoimmune Disorders (Ulcerative Colitis)
Gene Name Ensembl ID Gene Name Ensembl ID
TWISTNB ENSG00000105849 MTRR ENSG00000124275
HBP1 ENSG00000105856 FASTKD3 ENSG00000124279
CBLL1 ENSG00000105879 HIF3A ENSG00000124440
USP42 ENSG00000106346 PCNXL4 ENSG00000126773
CDC37L1 ENSG00000106993 VIL1 ENSG00000127831
PLGRKT ENSG00000107020 AAMP ENSG00000127837
KIAA1432 ENSG00000107036 PNKD ENSG00000127838
NPDC1 ENSG00000107281 DLL4 ENSG00000128917
ABCA2 ENSG00000107331 ILF3 ENSG00000129351
GATA3 ENSG00000107485 CDKN2D ENSG00000129355
CUL2 ENSG00000108094 INS-IGF2 ENSG00000129965
CHRNE ENSG00000108556 UBAC1 ENSG00000130560
GLI1 ENSG00000111087 LSP1 ENSG00000130592
LRP3 ENSG00000130881 TNNI2 ENSG00000130598
COX4I1 ENSG00000131143 SLC7A10 ENSG00000130876
EMC8 ENSG00000131148 MED12L ENSG00000144893
FCRLA ENSG00000132185 LYAR ENSG00000145220
TMEM128 ENSG00000132406 ROPN1L ENSG00000145491
PDZD2 ENSG00000133401 MARCH6 ENSG00000145495
ETS1 ENSG00000134954 ADAMTS16 ENSG00000145536
AGAP2 ENSG00000135439 LIX1 ENSG00000145721
CDK4 ENSG00000135446 SDK1 ENSG00000146555
TSPAN31 ENSG00000135452 MCPH1 ENSG00000147316
B4GALNT1 ENSG00000135454 AK3 ENSG00000147853
SLC26A10 ENSG00000135502 AUH ENSG00000148090
OS9 ENSG00000135506 INPP5E ENSG00000148384
USP37 ENSG00000135913 SEC16A ENSG00000148396
IL10 ENSG00000136634 DPH7 ENSG00000148399
ADAM10 ENSG00000137845 SYT8 ENSG00000149043
MARCH9 ENSG00000139266 PLCH2 ENSG00000149527
INHBE ENSG00000139269 CCT5 ENSG00000150753
NKD1 ENSG00000140807 FAM173B ENSG00000150756
IRF8 ENSG00000140968 FOXO1 ENSG00000150907
CIRH1A ENSG00000141076 IPMK ENSG00000151151
PELP1 ENSG00000141456 FLI1 ENSG00000151702
ARRB2 ENSG00000141480 HHEX ENSG00000152804
MINK1 ENSG00000141503 STK32B ENSG00000152953
FBXO15 ENSG00000141665 CEBPG ENSG00000153879
NFIC ENSG00000141905 CHD1 ENSG00000153922
IL19 ENSG00000142224 BUB3 ENSG00000154473
PSMB6 ENSG00000142507 ACSS1 ENSG00000154930
MMEL1 ENSG00000142606 AGPAT5 ENSG00000155189
DPT ENSG00000143196 HEATR3 ENSG00000155393
FCGR2A ENSG00000143226 KIF5A ENSG00000155980
RORC ENSG00000143365 SST ENSG00000157005
TUFT1 ENSG00000143367 ETS2 ENSG00000157557
CGN ENSG00000143375 FAM213B ENSG00000157870
SNX27 ENSG00000143376 TNFRSF14 ENSG00000157873
SELENBP1 ENSG00000143416 PANK4 ENSG00000157881
POGZ ENSG00000143442 PEX10 ENSG00000157911
SPAG16 ENSG00000144451 ANKRD61 ENSG00000157999
CTDSP1 ENSG00000144579 RNF207 ENSG00000158286
RQCD1 ENSG00000144580 FBXW5 ENSG00000159069
LRIG1 ENSG00000144749 PSMB4 ENSG00000159377
GPSM1 ENSG00000160360 CELF3 ENSG00000159409
IKZF3 ENSG00000161405 TM2D2 ENSG00000169490
TMCO4 ENSG00000162542 HTRA4 ENSG00000169495
FCRLB ENSG00000162746 PLEKHA2 ENSG00000169499
IL20 ENSG00000162891 HINT1 ENSG00000169567
IL24 ENSG00000162892 AGPAT2 ENSG00000169692
FAIM3 ENSG00000162894 ADRA1B ENSG00000170214
PIGR ENSG00000162896 PTGER4 ENSG00000171522
FCAMR ENSG00000162897 BCL2 ENSG00000171791
REL ENSG00000162924 C11orf40 ENSG00000171987
C1orf106 ENSG00000163362 TMEM196 ENSG00000173452
ARPC2 ENSG00000163466 CHD2 ENSG00000173575
RNF25 ENSG00000163481 TRAPPC3L ENSG00000173626
STK36 ENSG00000163482 AGFG1 ENSG00000173744
BSN ENSG00000164061 SH3PXD2B ENSG00000174705
KIAA0947 ENSG00000164151 INHBC ENSG00000175189
ANKRD33B ENSG00000164236 DCTN2 ENSG00000175203
FAM26D ENSG00000164451 RMI2 ENSG00000175643
SP8 ENSG00000164651 TPRN ENSG00000176058
HEATR2 ENSG00000164818 SSNA1 ENSG00000176101
UNCX ENSG00000164853 ANAPC2 ENSG00000176248
PMPCA ENSG00000165688 C8G ENSG00000176919
SDCCAG3 ENSG00000165689 DPP7 ENSG00000176978
FAM69B ENSG00000165716 MAN1B1 ENSG00000177239
ZMYND19 ENSG00000165724 MAMDC4 ENSG00000177943
SPINT1 ENSG00000166145 FAM26E ENSG00000178033
BRD7 ENSG00000166164 DTX3 ENSG00000178498
TAC3 ENSG00000166863 R3HDM2 ENSG00000179912
MYO1A ENSG00000166866 SLC9A4 ENSG00000180251
STAT6 ENSG00000166888 NRIP1 ENSG00000180530
PIP4K2C ENSG00000166908 C9orf139 ENSG00000180539
MARS ENSG00000166986 FUT7 ENSG00000180549
SNX20 ENSG00000167208 CXCR2 ENSG00000180871
IGF2 ENSG00000167244 EHMT1 ENSG00000181090
GGT6 ENSG00000167741 TNFSF15 ENSG00000181634
ADAM9 ENSG00000168615 S100A7A ENSG00000184330
DNAJC21 ENSG00000168724 LRRC26 ENSG00000184709
CHTF8 ENSG00000168802 ZFP90 ENSG00000184939
ZBTB49 ENSG00000168826 MROH2A ENSG00000185038
TM4SF20 ENSG00000168955 STAC3 ENSG00000185482
MFF ENSG00000168958 NDUFA4L2 ENSG00000185633
ZPBP2 ENSG00000186075 TMEM210 ENSG00000185863
SAPCD2 ENSG00000186193 RNF208 ENSG00000212864
LYRM7 ENSG00000186687 CCDC183 ENSG00000213213
TSPYL4 ENSG00000187189 DNLZ ENSG00000213221
EXD3 ENSG00000187609 AL031590.1 ENSG00000214854
TMEM203 ENSG00000187713 SLC35E2 ENSG00000215790
CARD9 ENSG00000187796 RP11-211G3.3 ENSG00000228804
LCN10 ENSG00000187922 C9orf172 ENSG00000232434
WNT7B ENSG00000188064 RNF224 ENSG00000233198
FAM166A ENSG00000188163 ARHGEF25 ENSG00000240771
PRKAR1B ENSG00000188191 C7orf73 ENSG00000243317
TUBB4B ENSG00000188229 TMEM141 ENSG00000244187
NDOR1 ENSG00000188566 CEBPA ENSG00000245848
HMX3 ENSG00000188620 CDK11B ENSG00000248333
NOXA1 ENSG00000188747 INS ENSG00000254647
HMX2 ENSG00000188816 DKFZP43401614 ENSG00000258729
FAM26F ENSG00000188820 AL807752.1 ENSG00000268996
NELFB ENSG00000188986 RP1 1-216L13.17 ENSG00000272896
MMP23B ENSG00000189409 HES5 ENSG00000197921
CD55 ENSG00000196352 TOR4A ENSG00000198113
C9orf163 ENSG00000196366 PIM3 ENSG00000198355
RABL6 ENSG00000196642 WWP2 ENSG00000198373
DTHD1 ENSG00000197057 RTP2 ENSG00000198471
ARRDC1 ENSG00000197070 SLC34A3 ENSG00000198569
C9orf169 ENSG00000197191 SELT ENSG00000198843
UAP1L1 ENSG00000197355 SGMS1 ENSG00000198964
MIB2 ENSG00000197530 C9orf37 ENSG00000203993
C9orf173 ENSG00000197768 IGFL4 ENSG00000204869
CNEP1R1 ENSG00000205423 KRTAP5-6 ENSG00000205864
C17orf107 ENSG00000205710
Table 4: Autoimmune Disorders (Multiple Sclerosis)
Gene Name Ensembl ID Gene Name Ensembl ID
SEC62 ENSG00000008952 CD5 ENSG00000110448
NCAPD2 ENSG00000010292 VWF ENSG00000110799
IFFO1 ENSG00000010295 SCNN1A ENSG00000111319
CD4 ENSG00000010610 LTBR ENSG00000111321
LRRC23 ENSG00000010626 MRPL51 ENSG00000111639
CD6 ENSG00000013725 GAPDH ENSG00000111640
CYP24A1 ENSG00000019186 NOP2 ENSG00000111641
GLRX2 ENSG00000023572 CHD4 ENSG00000111642
IFNGR1 ENSG00000027697 ACRBP ENSG00000111644
SPATA7 ENSG00000042317 COPS7A ENSG00000111652
GALC ENSG00000054983 ING4 ENSG00000111653
PDE4A ENSG00000065989 USP5 ENSG00000111667
ZFYVE26 ENSG00000072121 TPI1 ENSG00000111669
PVR ENSG00000073008 SPSB2 ENSG00000111671
TEAD2 ENSG00000074219 ATN1 ENSG00000111676
TCF7 ENSG00000081059 C12orf57 ENSG00000111678
LAG3 ENSG00000089692 SOD2 ENSG00000112096
MLF2 ENSG00000089693 BACH2 ENSG00000112182
ICAM1 ENSG00000090339 IL12B ENSG00000113302
MAPK1 ENSG00000100030 TTC1 ENSG00000113312
NCF4 ENSG00000100365 GNPDA1 ENSG00000113552
CSF2RB ENSG00000100368 FGF1 ENSG00000113578
DNAJC5 ENSG00000101152 SMC4 ENSG00000113810
PRPF6 ENSG00000101161 PDCD10 ENSG00000114209
PTK6 ENSG00000101213 PNO1 ENSG00000115946
RGCC ENSG00000102760 PLEK ENSG00000115956
PIH1D1 ENSG00000104872 TROVE2 ENSG00000116747
SLC17A7 ENSG00000104888 UCHL5 ENSG00000116750
DKKL1 ENSG00000104901 NEK9 ENSG00000119638
OLFM2 ENSG00000105088 ARAP3 ENSG00000120318
MRPL4 ENSG00000105364 POPDC2 ENSG00000121577
ICAM4 ENSG00000105371 PRM2 ENSG00000122304
CDC37 ENSG00000105401 ANXA11 ENSG00000122359
PDE4C ENSG00000105650 CKS2 ENSG00000123975
ISYNA1 ENSG00000105655 SRMS ENSG00000125508
FAM208B ENSG00000108021 C20orf195 ENSG00000125531
NFKB1 ENSG00000109320 ZNF384 ENSG00000126746
UBE2D3 ENSG00000109332 EMG1 ENSG00000126749
CRTAM ENSG00000109943 KRI1 ENSG00000129347
ILF3 ENSG00000129351 KIF5A ENSG00000155980
CDKN2D ENSG00000129355 BATF ENSG00000156127
APOE ENSG00000130203 PTMS ENSG00000159335
GADD45G ENSG00000130222 CXCR5 ENSG00000160683
SSBP4 ENSG00000130511 ATP1A1 ENSG00000163399
KIAA1683 ENSG00000130518 EOMES ENSG00000163508
LSM4 ENSG00000130520 AZI2 ENSG00000163512
ZBTB46 ENSG00000130584 SERPINI1 ENSG00000163536
SAMD10 ENSG00000130590 SLC9B1 ENSG00000164037
PPAN ENSG00000130810 SLC9B2 ENSG00000164038
EIF3G ENSG00000130811 BDH2 ENSG00000164039
COX4I1 ENSG00000131143 IL22RA2 ENSG00000164485
EMC8 ENSG00000131148 SYTL3 ENSG00000164674
SLC34A1 ENSG00000131183 TAGAP ENSG00000164691
PRR7 ENSG00000131188 GTF2A1 ENSG00000165417
SEC61G ENSG00000132432 NSD1 ENSG00000165671
RBM17 ENSG00000134453 TMEM52B ENSG00000165685
IL2RA ENSG00000134460 PIP4K2C ENSG00000166908
SLC26A10 ENSG00000135502 DNAJC21 ENSG00000168724
CAB39 ENSG00000135932 RGS14 ENSG00000169220
HUS1 ENSG00000136273 LMAN2 ENSG00000169223
SKIL ENSG00000136603 ADRA1B ENSG00000170214
RPS24 ENSG00000138326 FOS ENSG00000170345
GABARAPL1 ENSG00000139112 TMED10 ENSG00000170348
VAMP1 ENSG00000139190 PTGER4 ENSG00000171522
TAPBPL ENSG00000139192 MALT1 ENSG00000172175
CD27 ENSG00000139193 CLEC7A ENSG00000172243
NCOA2 ENSG00000140396 MAB21L3 ENSG00000173212
IRF8 ENSG00000140968 GPR160 ENSG00000173890
CISD2 ENSG00000145354 DCTN2 ENSG00000175203
DOK3 ENSG00000146094 CTDSP2 ENSG00000175215
DYNLT1 ENSG00000146425 RMI2 ENSG00000175643
WTAP ENSG00000146457 OLIG3 ENSG00000177468
MCPH1 ENSG00000147316 PTRF ENSG00000177469
POLR3A ENSG00000148606 PRM3 ENSG00000178257
SESN3 ENSG00000149212 TNP2 ENSG00000178279
ZFP36L2 ENSG00000152518 DTX3 ENSG00000178498
SPEF2 ENSG00000152582 S1PR5 ENSG00000180739
PTPRK ENSG00000152894 RAD51B ENSG00000182185
AGPAT5 ENSG00000155189 DDX41 ENSG00000183258
ELMO1 ENSG00000155849 GKN2 ENSG00000183607
LPAR5 ENSG00000184574 ZCWPW2 ENSG00000206559
TMED9 ENSG00000184840 TRIM59 ENSG00000213186
SOCS1 ENSG00000185338 MXD3 ENSG00000213347
CEACAM19 ENSG00000186567 PHB2 ENSG00000215021
CMC1 ENSG00000187118 PPP3R1 ENSG00000221823
SECISBP2 ENSG00000187742 ARHGEF25 ENSG00000240771
SEMA4D ENSG00000187764 PPAN-P2RY11 ENSG00000243207
ZNF512B ENSG00000196700 WDR92 ENSG00000243667
IGF2R ENSG00000197081 P2RY11 ENSG00000244165
CLEC9A ENSG00000197992 GPR162 ENSG00000250510
C6orf99 ENSG00000203711 MPV17L2 ENSG00000254858
ZNF783 ENSG00000204946 CLEC12B ENSG00000256660
RP11-293M10.1 ENSG00000258740 RP11-474G23.1 ENSG00000273398
AL158091.1 ENSG00000269223
Table 5: Autoimmune Disorders (Rheumatoid Arthritis)
Gene Name Ensembl ID Gene Name Ensembl ID
HS3ST1 ENSG00000002587 CNTRL ENSG00000119397
RNASET2 ENSG00000026297 FBXW2 ENSG00000119402
IFNGR1 ENSG00000027697 PHF19 ENSG00000119403
CLEC16A ENSG00000038532 NEK9 ENSG00000119638
WDR37 ENSG00000047056 ACAT2 ENSG00000120437
TRAF1 ENSG00000056558 PRM2 ENSG00000122304
GDI2 ENSG00000057608 AARS2 ENSG00000124608
PRDM1 ENSG00000057657 ARHGAP22 ENSG00000128805
PDE4A ENSG00000065989 KRI1 ENSG00000129347
IDI1 ENSG00000067064 ILF3 ENSG00000129351
KCNAB2 ENSG00000069424 SLC44A2 ENSG00000129353
ICAM3 ENSG00000076662 CDKN2D ENSG00000129355
FDFT1 ENSG00000079459 RSPH3 ENSG00000130363
KEAP1 ENSG00000079999 ATG4D ENSG00000130734
DUSP12 ENSG00000081721 COX4I1 ENSG00000131143
PREP ENSG00000085377 EMC8 ENSG00000131148
PGS1 ENSG00000087157 NPHP4 ENSG00000131697
MAP3K1 ENSG00000095015 FCRLA ENSG00000132185
PSMD5 ENSG00000095261 PTPN22 ENSG00000134242
SYNGR1 ENSG00000100321 PTGFRN ENSG00000134247
CD40 ENSG00000101017 TRIM45 ENSG00000134253
CDC37 ENSG00000105401 CD101 ENSG00000134256
CDK6 ENSG00000105810 IL6ST ENSG00000134352
AHR ENSG00000106546 FBXO18 ENSG00000134452
GATA3 ENSG00000107485 RBM17 ENSG00000134453
C1QBP ENSG00000108561 IL2RA ENSG00000134460
UPK2 ENSG00000110375 FADS2 ENSG00000134824
COMMD9 ENSG00000110442 ETS1 ENSG00000134954
SLC26A8 ENSG00000112053 PRR5L ENSG00000135362
CCR6 ENSG00000112486 AGAP2 ENSG00000135439
QKI ENSG00000112531 TSPAN31 ENSG00000135452
PCCB ENSG00000114054 SLC26A10 ENSG00000135502
SSR3 ENSG00000114850 OS9 ENSG00000135506
CD58 ENSG00000116815 FLNB ENSG00000136068
TTF2 ENSG00000116830 BLK ENSG00000136573
PADI2 ENSG00000117115 GATA4 ENSG00000136574
ATF6 ENSG00000118217 MYC ENSG00000136997
TNFAIP3 ENSG00000118503 STAT4 ENSG00000138378
CYP20A1 ENSG00000119004 SUOX ENSG00000139531
TLE3 ENSG00000140332 FCRLB ENSG00000162746
IRF8 ENSG00000140968 REL ENSG00000162924
MMEL1 ENSG00000142606 NEK10 ENSG00000163491
GPA33 ENSG00000143167 AZI2 ENSG00000163512
POU2F1 ENSG00000143190 TIPARP ENSG00000163659
FCGR2A ENSG00000143226 DNASE1L3 ENSG00000163687
ROPN1L ENSG00000145491 C3orf67 ENSG00000163689
MARCH6 ENSG00000145495 ANKRD33B ENSG00000164236
FAM105A ENSG00000145569 SYTL3 ENSG00000164674
GIN1 ENSG00000145723 TAGAP ENSG00000164691
PPIP5K2 ENSG00000145725 CTSB ENSG00000164733
TNIP1 ENSG00000145901 TAF3 ENSG00000165632
TCTE1 ENSG00000146221 PIP4K2C ENSG00000166908
NFKBIE ENSG00000146232 MBD6 ENSG00000166987
DYNLT1 ENSG00000146425 FAM107A ENSG00000168309
ZEB1 ENSG00000148516 FOS ENSG00000170345
ARID5B ENSG00000150347 TMED10 ENSG00000170348
CCT5 ENSG00000150753 PFKFB3 ENSG00000170525
FAM173B ENSG00000150756 RSL1D1 ENSG00000171490
FOXO1 ENSG00000150907 CXCR6 ENSG00000172215
MFSD6 ENSG00000151690 RASGRP1 ENSG00000172575
PRDM8 ENSG00000152784 NABP1 ENSG00000173559
CHD1 ENSG00000153922 MSL2 ENSG00000174579
ANKH ENSG00000154122 INHBC ENSG00000175189
FAM167A ENSG00000154319 DDIT3 ENSG00000175197
NEIL2 ENSG00000154328 DCTN2 ENSG00000175203
ELMO1 ENSG00000155849 PTPN2 ENSG00000175354
KIF5A ENSG00000155980 RMI2 ENSG00000175643
SLC35B2 ENSG00000157593 C15orf53 ENSG00000175779
FAM213B ENSG00000157870 FOXR1 ENSG00000176302
TNFRSF14 ENSG00000157873 DDX10 ENSG00000178105
PEX10 ENSG00000157911 KDELC2 ENSG00000178202
SKI ENSG00000157933 TMEM151B ENSG00000178233
IFNAR2 ENSG00000159110 DTX3 ENSG00000178498
CXCR5 ENSG00000160683 R3HDM2 ENSG00000179912
IL6R ENSG00000160712 S1PR5 ENSG00000180739
UBE2Q1 ENSG00000160714 DEXI ENSG00000182108
YDJC ENSG00000161179 SOCS3 ENSG00000184557
CCDC116 ENSG00000161180 ZFP36L1 ENSG00000185650
IKZF3 ENSG00000161405 ZPBP2 ENSG00000186075
RAVER1 ENSG00000161847 BCL9L ENSG00000186174
CMC1 ENSG00000187118 SPATS1 ENSG00000249481
MTF1 ENSG00000188786 RP11-297N6.4 ENSG00000255046
LITAF ENSG00000189067 C8orf49 ENSG00000255394
TMEM194B ENSG00000189362 AGAP2-AS1 ENSG00000255737
ANXA6 ENSG00000197043 CTD-2369P2.12 ENSG00000267303
IGF2R ENSG00000197081 AL590822.2 ENSG00000269554
HES5 ENSG00000197921 RP11-444E17.6 ENSG00000272442
LEKR1 ENSG00000197980 ZCWPW2 ENSG00000206559
SPRED2 ENSG00000198369 DKFZP667F0711 ENSG00000212743
CD247 ENSG00000198821 FGFR1OP ENSG00000213066
DUSP27 ENSG00000198842 C19orf38 ENSG00000214212
C1orf174 ENSG00000198912 TTC34 ENSG00000215912
DEFB134 ENSG00000205882 SMIM1 ENSG00000235169
DEFB135 ENSG00000205883 ARHGEF25 ENSG00000240771
DEFB136 ENSG00000205884 RP11-514O12.4 ENSG00000249141
Table 6: Autoimmune Disorders (Celiac Disease)
Gene Name Ensembl ID Gene Name Ensembl ID
CLEC16A ENSG00000038532 WTAP ENSG00000146457
TAB2 ENSG00000055208 POLR3A ENSG00000148606
IFT80 ENSG00000068885 FLI1 ENSG00000151702
MYNN ENSG00000085274 PTPRK ENSG00000152894
RBM22 ENSG00000086589 CD1C ENSG00000158481
TBC1D2 ENSG00000095383 REL ENSG00000162924
HNRNPH3 ENSG00000096746 SERPINI1 ENSG00000163536
ZMYND8 ENSG00000101040 IL22RA2 ENSG00000164485
PBLD ENSG00000108187 SYTL3 ENSG00000164674
SOD2 ENSG00000112096 TAGAP ENSG00000164691
BACH2 ENSG00000112182 C15orf39 ENSG00000167173
PEX7 ENSG00000112357 TTC21A ENSG00000168026
SMC4 ENSG00000113810 XIRP1 ENSG00000168334
PDCD10 ENSG00000114209 IL12A ENSG00000168811
GORASP1 ENSG00000114745 GPR160 ENSG00000173890
ITGA4 ENSG00000115232 PTPN2 ENSG00000175354
PAPOLG ENSG00000115421 RMI2 ENSG00000175643
IL1R2 ENSG00000115590 ATOX1 ENSG00000177556
IL1RL2 ENSG00000115598 SUMO4 ENSG00000177688
IL1RL1 ENSG00000115602 DEXI ENSG00000182108
IL18R1 ENSG00000115604 GKN2 ENSG00000183607
SLC9A2 ENSG00000115616 ACTRT3 ENSG00000184378
PLEK ENSG00000115956 IGF2R ENSG00000197081
TROVE2 ENSG00000116747 PDCD1LG2 ENSG00000197646
UCHL5 ENSG00000116750 RCSD1 ENSG00000198771
ANXA11 ENSG00000122359 CD247 ENSG00000198821
PPIL4 ENSG00000131013 DUSP27 ENSG00000198842
CDC73 ENSG00000134371 C6orf99 ENSG00000203711
HOOK1 ENSG00000134709 FLJ20373 ENSG00000233404
ETS1 ENSG00000134954 MYC ENSG00000136997
SKIL ENSG00000136603 RPS24 ENSG00000138326
C9orf156 ENSG00000136932
Table 7: Autoimmune Disorders (Crohn's Disease)
Gene Name Ensembl ID Gene Name Ensembl ID
GALC ENSG00000054983 CCR6 ENSG00000112486
PRDM1 ENSG00000057657 TTC1 ENSG00000113312
MPC1 ENSG00000060762 LNPEP ENSG00000113441
SBNO2 ENSG00000064932 GNPDA1 ENSG00000113552
NGEF ENSG00000066248 ACTR8 ENSG00000113812
PTPN21 ENSG00000070778 BCL6 ENSG00000113916
ZFYVE26 ENSG00000072121 PAPOLG ENSG00000115421
DGKD ENSG00000077044 KCNJ13 ENSG00000115474
ITGA8 ENSG00000077943 IL1R2 ENSG00000115590
FYB ENSG00000082074 IL1RL2 ENSG00000115598
ICAM1 ENSG00000090339 IL1RL1 ENSG00000115602
ZNF268 ENSG00000090612 IL18R1 ENSG00000115604
MAP3K1 ENSG00000095015 SLC9A2 ENSG00000115616
CREM ENSG00000095794 PLCL1 ENSG00000115896
JAK2 ENSG00000096968 SCAMP3 ENSG00000116521
CIRBP ENSG00000099622 STK11 ENSG00000118046
ATP5D ENSG00000099624 GOT1 ENSG00000120053
C19orf26 ENSG00000099625 RCL1 ENSG00000120158
POLR2E ENSG00000099817 CD274 ENSG00000120217
MLC1 ENSG00000100427 PAPD5 ENSG00000121274
TSC22D1 ENSG00000102804 PIK3CA ENSG00000121879
OLFM2 ENSG00000105088 CD244 ENSG00000122223
MRPL4 ENSG00000105364 ANXA11 ENSG00000122359
ICAM4 ENSG00000105371 FAM213A ENSG00000122378
ICAM5 ENSG00000105376 CISD1 ENSG00000122873
TYK2 ENSG00000105397 IRF1 ENSG00000125347
CDC37 ENSG00000105401 SLC2A4RG ENSG00000125520
HOXA1 ENSG00000105991 CPSF3L ENSG00000127054
HOXA3 ENSG00000105997 LIF ENSG00000128342
CDC37L1 ENSG00000106993 KRI1 ENSG00000129347
PLGRKT ENSG00000107020 ILF3 ENSG00000129351
KIAA1432 ENSG00000107036 SLC44A2 ENSG00000129353
TFAM ENSG00000108064 CDKN2D ENSG00000129355
ATG4D ENSG00000130734 JAK1 ENSG00000162434
PPAN ENSG00000130810 USP1 ENSG00000162607
EIF3G ENSG00000130811 VANGL2 ENSG00000162738
TBC1D5 ENSG00000131374 REL ENSG00000162924
DYDC2 ENSG00000133665 SLC9B1 ENSG00000164037
PTPN22 ENSG00000134242 BSN ENSG00000164061
IL2RA ENSG00000134460 ERAP1 ENSG00000164307
FADS2 ENSG00000134824 ERAP2 ENSG00000164308
CAB39 ENSG00000135932 SYTL3 ENSG00000164674
HUS1 ENSG00000136273 TAGAP ENSG00000164691
ACTL6A ENSG00000136518 KIAA0196 ENSG00000164961
NDUFB5 ENSG00000136521 GTF2A1 ENSG00000165417
MRPL47 ENSG00000136522 BRD7 ENSG00000166164
MYC ENSG00000136997 SMAD3 ENSG00000166949
CH25H ENSG00000138135 TCP10L2 ENSG00000166984
RPS24 ENSG00000138326 SNX20 ENSG00000167208
RBM26 ENSG00000139746 GPX4 ENSG00000167468
GPR65 ENSG00000140030 MIDN ENSG00000167470
FURIN ENSG00000140564 GHDC ENSG00000167925
IQGAP1 ENSG00000140575 BEST1 ENSG00000167995
CRTC3 ENSG00000140577 INPP5D ENSG00000168918
NKD1 ENSG00000140807 ADRA1B ENSG00000170214
HCN3 ENSG00000143630 DYDC1 ENSG00000170788
LIX1 ENSG00000145721 PTGER4 ENSG00000171522
BOD1 ENSG00000145919 RASGRP1 ENSG00000172575
DYNLT1 ENSG00000146425 TRIB1 ENSG00000173334
AK3 ENSG00000147853 PTPN2 ENSG00000175354
POLR3A ENSG00000148606 C15orf53 ENSG00000175779
FADS1 ENSG00000149485 CLK2 ENSG00000176444
INCENP ENSG00000149503 PTRF ENSG00000177469
IPMK ENSG00000151151 HMHA1 ENSG00000180448
SERP2 ENSG00000151778 NRIP1 ENSG00000180530
CEBPG ENSG00000153879 S1PR5 ENSG00000180739
CHD1 ENSG00000153922 TNFSF15 ENSG00000181634
HEATR3 ENSG00000155393 ADO ENSG00000181915
NSMCE2 ENSG00000156831 DEXI ENSG00000182108
ETS2 ENSG00000157557 RAD51B ENSG00000182185
ZNF233 ENSG00000159915 FES ENSG00000182511
GPSM1 ENSG00000160360 SATB1 ENSG00000182568
FAM189B ENSG00000160767 C2orf82 ENSG00000182600
IKZF3 ENSG00000161405 NGRN ENSG00000182768
GDPGP1 ENSG00000183208 ZNF234 ENSG00000263002
HDDC3 ENSG00000184508 AC005549.3 ENSG00000268024
IKZF1 ENSG00000185811 AC004076.9 ENSG00000268163
ZPBP2 ENSG00000186075 RP4-583P15.14 ENSG00000273047
PLEKHN1 ENSG00000187583 ZNF774 ENSG00000196391
C1orf170 ENSG00000187642 MAN2A2 ENSG00000196547
CARD9 ENSG00000187796 ZNF470 ENSG00000197016
HMX3 ENSG00000188620 IGF2R ENSG00000197081
ZGPAT ENSG00000197114 AC011475.1 ENSG00000226104
BLM ENSG00000197299 FLJ20373 ENSG00000233404
C5orf56 ENSG00000197536 GABARAPL3 ENSG00000238244
ZNF790 ENSG00000197863 PPAN-P2RY11 ENSG00000243207
ZNF583 ENSG00000198440 P2RY11 ENSG00000244165
SFT2D1 ENSG00000198818 CEBPA ENSG00000245848
LIME1 ENSG00000203896 CTD-2260A17.2 ENSG00000247121
GIGYF2 ENSG00000204120 TMEM110-MUSTN1 ENSG00000248592
ZNF155 ENSG00000204920 RP11-514O12.4 ENSG00000249141
CNEP1R1 ENSG00000205423 ZNF345 ENSG00000251247
DNLZ ENSG00000213221 ZNF10 ENSG00000256223
TMEM110 ENSG00000213533 ZNF225 ENSG00000256294
ZNF891 ENSG00000214029 ZNF350 ENSG00000256683
AL031590.1 ENSG00000214854 CTD-2140B24.4 ENSG00000256825
ZGLP1 ENSG00000220201
Table 8: Type 1 Diabetes
Gene Name Ensembl ID Gene Name Ensembl ID
STIM2 ENSG00000109689 SSTR2 ENSG00000180616
UBE4A ENSG00000110344 MFSD5 ENSG00000182544
SLC22A18 ENSG00000110628 GPR19 ENSG00000183150
CD69 ENSG00000110848 CHST6 ENSG00000183196
MYL2 ENSG00000111245 ASCL2 ENSG00000183734
CUX2 ENSG00000111249 TSSC4 ENSG00000184281
CDKN1B ENSG00000111276 SOCS1 ENSG00000185338
MDN1 ENSG00000112159 ZFP36L1 ENSG00000185650
BACH2 ENSG00000112182 ZNRF1 ENSG00000186187
RGS2 ENSG00000116741 TMPRSS6 ENSG00000187045
TROVE2 ENSG00000116747 RSBN1L ENSG00000187257
UCHL5 ENSG00000116750 SLC25A29 ENSG00000197119
EPC1 ENSG00000120616 HOXA4 ENSG00000197576
PRM2 ENSG00000122304 CLEC9A ENSG00000197992
PFDN5 ENSG00000123349 CD3E ENSG00000198851
GPR18 ENSG00000125245 KLLN ENSG00000227268
PLSCR5 ENSG00000231213 CLEC12B ENSG00000256660
SLC22A18AS ENSG00000254827 RP11-603J24.9 ENSG00000257411
Table 9: Type 2 Diabetes
Gene Name Ensembl ID Gene Name Ensembl ID
SNRNP48 ENSG00000168566 PLEKHF2 ENSG00000175895
CTRB1 ENSG00000168925 CHST6 ENSG00000183196
CTRB2 ENSG00000168928 SLIT3 ENSG00000184347
ZHX3 ENSG00000174306
Table 10: Height/Growth Disorders
Gene Name Ensembl ID Gene Name Ensembl ID
AOC1 ENSG00000002726 ITIH4 ENSG00000055955
WNT16 ENSG00000002745 ITIH1 ENSG00000055957
HOXA11 ENSG00000005073 ZFR ENSG00000056097
DLX6 ENSG00000006377 TRAF1 ENSG00000056558
PHTF2 ENSG00000006576 MXD1 ENSG00000059728
PAF1 ENSG00000006712 OGFR ENSG00000060491
MARK4 ENSG00000007047 MON2 ENSG00000061987
CEACAM21 ENSG00000007129 CS ENSG00000062485
RPUSD1 ENSG00000007376 MED29 ENSG00000063322
MMP25 ENSG00000008516 RNF4 ENSG00000063978
IL32 ENSG00000008517 LPAR2 ENSG00000064547
MED24 ENSG00000008838 SNX24 ENSG00000064652
VTA1 ENSG00000009844 FAR2 ENSG00000064763
PLAUR ENSG00000011422 AP3D1 ENSG00000065000
MVP ENSG00000013364 WDR3 ENSG00000065183
POLR3B ENSG00000013503 MAP2K4 ENSG00000065559
IGF1 ENSG00000017427 TMEM206 ENSG00000065600
SNAI2 ENSG00000019549 PDE4A ENSG00000065989
SAMD4A ENSG00000020577 DHX29 ENSG00000067248
RNH1 ENSG00000023191 METTL22 ENSG00000067365
SNAPC1 ENSG00000023608 PITX1 ENSG00000069011
PLEKHO1 ENSG00000023902 WIPI1 ENSG00000070540
PRKCH ENSG00000027075 ATP2B1 ENSG00000070961
POU2F2 ENSG00000028277 NCK2 ENSG00000071051
CENPQ ENSG00000031691 TRIB2 ENSG00000071575
MAP2K3 ENSG00000034152 PTPN18 ENSG00000072135
SKIV2L2 ENSG00000039123 ACAP1 ENSG00000072818
ZPBP ENSG00000042813 XRCC1 ENSG00000073050
POLR2B ENSG00000047315 FRY ENSG00000073910
TSPAN17 ENSG00000048140 EED ENSG00000074266
COL9A2 ENSG00000049089 FNDC3B ENSG00000075420
H6PD ENSG00000049239 ICAM3 ENSG00000076662
LTBP1 ENSG00000049323 ACTL6B ENSG00000077080
LETMD1 ENSG00000050426 CAPZB ENSG00000077549
MAPK9 ENSG00000050748 HOXA9 ENSG00000078399
BCAR1 ENSG00000050820 EDN1 ENSG00000078401
CUL1 ENSG00000055130 CIC ENSG00000079432
USP36 ENSG00000055483 PAFAH1B3 ENSG00000079462
PUM2 ENSG00000055917 KIF22 ENSG00000079616
NDC80 ENSG00000080986 TNRC6B ENSG00000100354
CYLD ENSG00000083799 KIAA0930 ENSG00000100364
NOA1 ENSG00000084092 UPK3A ENSG00000100373
REST ENSG00000084093 FAM118A ENSG00000100376
GSTP1 ENSG00000084207 KHNYN ENSG00000100441
PGS1 ENSG00000087157 SDR39U1 ENSG00000100445
GNAS ENSG00000087460 CTSG ENSG00000100448
ERGIC2 ENSG00000087502 GZMB ENSG00000100453
SMOX ENSG00000088826 CGRRF1 ENSG00000100532
TMEM230 ENSG00000089063 HIF1A ENSG00000100644
PXN ENSG00000089159 CPNE6 ENSG00000100884
GMIP ENSG00000089639 CHD8 ENSG00000100888
ZBTB25 ENSG00000089775 PCK2 ENSG00000100889
SPTLC1 ENSG00000090054 DCAF11 ENSG00000100897
ICAM1 ENSG00000090339 EMC9 ENSG00000100908
ZNF268 ENSG00000090612 PSME2 ENSG00000100911
EXOC1 ENSG00000090989 REC8 ENSG00000100918
APOH ENSG00000091583 TM9SF1 ENSG00000100926
CMA1 ENSG00000092009 GMPR2 ENSG00000100938
PSME1 ENSG00000092010 RABGGTA ENSG00000100949
JPH4 ENSG00000092051 NFATC4 ENSG00000100968
MYH7 ENSG00000092054 NELFCD ENSG00000101158
SLC7A8 ENSG00000092068 CTSZ ENSG00000101160
RNF31 ENSG00000092098 TUBB1 ENSG00000101162
HNRNPC ENSG00000092199 SLMO2 ENSG00000101166
SUPT16H ENSG00000092201 SLCO4A1 ENSG00000101187
TGM1 ENSG00000092295 NTSR1 ENSG00000101188
TINF2 ENSG00000092330 MRGBP ENSG00000101189
COL9A3 ENSG00000092758 RASSF2 ENSG00000101265
TGFB2 ENSG00000092969 CDS2 ENSG00000101290
NUP50 ENSG00000093000 TTI1 ENSG00000101407
VNN3 ENSG00000093134 RPRD1B ENSG00000101413
CDC6 ENSG00000094804 METTL4 ENSG00000101574
IKZF5 ENSG00000095574 LPIN2 ENSG00000101577
HIVEP1 ENSG00000095951 SMCHD1 ENSG00000101596
SIRT1 ENSG00000096717 SMAD7 ENSG00000101665
JAK2 ENSG00000096968 COTL1 ENSG00000103187
PCSK5 ENSG00000099139 CRISPLD2 ENSG00000103196
MYO9B ENSG00000099331 ZNF500 ENSG00000103199
TRIOBP ENSG00000100106 STUB1 ENSG00000103266
TIMP3 ENSG00000100234 RHBDL1 ENSG00000103269
PIEZO1 ENSG00000103335 RPA3 ENSG00000106399
CORO7-PAM16 ENSG00000103426 GLCCI1 ENSG00000106415
RBL2 ENSG00000103479 EZH2 ENSG00000106462
QPRT ENSG00000103485 RARRES2 ENSG00000106538
MAZ ENSG00000103495 MEGF9 ENSG00000106780
CDIPT ENSG00000103502 ASPN ENSG00000106819
INTS10 ENSG00000104613 AMBP ENSG00000106927
ERI1 ENSG00000104626 PLGRKT ENSG00000107020
KCTD9 ENSG00000104756 KIAA1432 ENSG00000107036
BNIP3L ENSG00000104765 DNMBP ENSG00000107554
KCNN4 ENSG00000104783 PLEKHA1 ENSG00000107679
CLPTM1 ENSG00000104853 DNAJC12 ENSG00000108176
DOT1L ENSG00000104885 NUFIP2 ENSG00000108256
PLEKHJ1 ENSG00000104886 GIT1 ENSG00000108262
SF3A2 ENSG00000104897 CYTH1 ENSG00000108669
RSPH6A ENSG00000104941 SLC16A6 ENSG00000108932
VRK3 ENSG00000105053 NDUFC1 ENSG00000109390
OLFM2 ENSG00000105088 GAB1 ENSG00000109458
DENND3 ENSG00000105339 CPZ ENSG00000109625
MRPL4 ENSG00000105364 ELP4 ENSG00000109911
CD79A ENSG00000105369 ZNF259 ENSG00000109917
ICAM4 ENSG00000105371 APOA5 ENSG00000110243
ICAM5 ENSG00000105376 UPK2 ENSG00000110375
TYK2 ENSG00000105397 SOX6 ENSG00000110693
CDC37 ENSG00000105401 C11orf58 ENSG00000110696
NAPA ENSG00000105402 PITPNM1 ENSG00000110697
CNFN ENSG00000105427 AIP ENSG00000110711
LENG1 ENSG00000105617 CAPRIN2 ENSG00000110888
ERF ENSG00000105722 SLC11A2 ENSG00000110911
GSK3A ENSG00000105723 PARP11 ENSG00000111224
ATP13A1 ENSG00000105726 MANSC1 ENSG00000111261
ETHE1 ENSG00000105755 DUSP16 ENSG00000111266
SMG9 ENSG00000105771 TIMELESS ENSG00000111602
CDK6 ENSG00000105810 BAG2 ENSG00000112208
DLX5 ENSG00000105880 VNN1 ENSG00000112299
HOXA1 ENSG00000105991 VNN2 ENSG00000112303
HOXA2 ENSG00000105996 RPS12 ENSG00000112306
HOXA3 ENSG00000105997 C7 ENSG00000112936
HOXA13 ENSG00000106031 FAF2 ENSG00000113194
CPED1 ENSG00000106034 HAND1 ENSG00000113196
EVX1 ENSG00000106038 ITK ENSG00000113263
MSH3 ENSG00000113318 UBN1 ENSG00000118900
ARRDC3 ENSG00000113369 SATB2 ENSG00000119042
GOLPH3 ENSG00000113384 CNTRL ENSG00000119397
NUP155 ENSG00000113569 TCTN3 ENSG00000119977
C9 ENSG00000113600 PROSER1 ENSG00000120685
SELK ENSG00000113811 TBX2 ENSG00000121068
ACTR8 ENSG00000113812 PAPD5 ENSG00000121274
ACVR2B ENSG00000114739 BAI2 ENSG00000121753
EIF1B ENSG00000114784 ANXA11 ENSG00000122359
NCL ENSG00000115053 BBS9 ENSG00000122507
ACTR3 ENSG00000115091 NT5C3A ENSG00000122643
FN1 ENSG00000115414 TRIM24 ENSG00000122779
SF3B1 ENSG00000115524 AKR1D1 ENSG00000122787
ODC1 ENSG00000115758 MED13L ENSG00000123066
SDC1 ENSG00000115884 CDKN2C ENSG00000123080
PNO1 ENSG00000115946 ACSL3 ENSG00000123983
MORN1 ENSG00000116151 VAPB ENSG00000124164
TCEANC2 ENSG00000116205 SNRNP27 ENSG00000124380
TMEM59 ENSG00000116209 USP22 ENSG00000124422
MRPL37 ENSG00000116221 ZNF576 ENSG00000124444
FBXO2 ENSG00000116661 LYPD3 ENSG00000124466
SRSF11 ENSG00000116754 SOX4 ENSG00000124766
C1orf109 ENSG00000116922 RREB1 ENSG00000124782
RLF ENSG00000117000 NUP153 ENSG00000124789
SLAMF1 ENSG00000117090 EEF1E1 ENSG00000124802
CD48 ENSG00000117091 IRF1 ENSG00000125347
MFAP2 ENSG00000117122 BHLHE23 ENSG00000125533
ECE1 ENSG00000117298 SYMPK ENSG00000125755
APH1A ENSG00000117362 RRBP1 ENSG00000125844
AKR1A1 ENSG00000117448 MAX ENSG00000125952
DIEXF ENSG00000117597 CHURC1-FNTB ENSG00000125954
NEK2 ENSG00000117650 GDF5 ENSG00000125965
NENF ENSG00000117691 CCR7 ENSG00000126353
CENPF ENSG00000117724 NSRP1 ENSG00000126653
STAG1 ENSG00000118007 ZBTB1 ENSG00000126804
APOA1 ENSG00000118137 MAP2K2 ENSG00000126934
C1orf54 ENSG00000118292 WDR24 ENSG00000127580
CA14 ENSG00000118298 FBXL16 ENSG00000127585
SLC16A7 ENSG00000118596 CHTF18 ENSG00000127586
MYL12B ENSG00000118680 AAMP ENSG00000127837
PPL ENSG00000118898 PNKD ENSG00000127838
SHFM1 ENSG00000127922 PRKAB2 ENSG00000131791
PTPN12 ENSG00000127947 VIMP ENSG00000131871
PAICS ENSG00000128050 CHSY1 ENSG00000131873
PPAT ENSG00000128059 EMILIN2 ENSG00000132205
ANAPC13 ENSG00000129055 PRKAA1 ENSG00000132356
DCTD ENSG00000129187 CARD6 ENSG00000132357
KRI1 ENSG00000129347 TBC1D14 ENSG00000132405
ILF3 ENSG00000129351 PCNA ENSG00000132646
SLC44A2 ENSG00000129353 TOE1 ENSG00000132773
CDKN2D ENSG00000129355 MUTYH ENSG00000132781
RIPK3 ENSG00000129465 FBXO44 ENSG00000132879
ADCY4 ENSG00000129467 PEMT ENSG00000133027
SNX6 ENSG00000129515 BTBD2 ENSG00000133243
NRL ENSG00000129535 TMEM254 ENSG00000133678
NEDD8 ENSG00000129559 IPO8 ENSG00000133704
DAD1 ENSG00000129562 DPF2 ENSG00000133884
TTI2 ENSG00000129696 NUMB ENSG00000133961
CDH15 ENSG00000129910 ADAMDEC1 ENSG00000134028
KLF16 ENSG00000129911 BHLHE40 ENSG00000134107
LBP ENSG00000129988 EDEM1 ENSG00000134109
ERMARD ENSG00000130023 CDCA8 ENSG00000134690
APOE ENSG00000130203 FADS2 ENSG00000134824
LSM7 ENSG00000130332 ETS1 ENSG00000134954
STK33 ENSG00000130413 RFK ENSG00000135002
TAF4 ENSG00000130699 ADAM19 ENSG00000135074
LAMA5 ENSG00000130702 TMEM60 ENSG00000135211
OSBPL2 ENSG00000130703 AVIL ENSG00000135407
ADRM1 ENSG00000130706 AMHR2 ENSG00000135409
ATG4D ENSG00000130734 ZC3H10 ENSG00000135482
PPAN ENSG00000130810 KLHL36 ENSG00000135686
EIF3G ENSG00000130811 KIAA0513 ENSG00000135709
DNMT1 ENSG00000130816 PIGC ENSG00000135845
ZNF428 ENSG00000131116 CKAP4 ENSG00000136026
CHMP1A ENSG00000131165 KIAA1033 ENSG00000136051
NFATC1 ENSG00000131196 VILL ENSG00000136059
PPT1 ENSG00000131238 RNASEH2B ENSG00000136104
HAUS8 ENSG00000131351 SPRY2 ENSG00000136158
PSMC3IP ENSG00000131470 IGF2BP3 ENSG00000136231
ANO1 ENSG00000131620 HUS1 ENSG00000136273
TRAF7 ENSG00000131653 CIDEB ENSG00000136305
RARA ENSG00000131759 ZFHX2 ENSG00000136367
HLX ENSG00000136630 CBX8 ENSG00000141570
VPS45 ENSG00000136631 SECTM1 ENSG00000141574
DNAJC1 ENSG00000136770 PIP4K2B ENSG00000141720
KIF12 ENSG00000136883 ARL5C ENSG00000141748
MYC ENSG00000136997 NFIC ENSG00000141905
TPMT ENSG00000137364 SIRT3 ENSG00000142082
MGARP ENSG00000137463 APP ENSG00000142192
SYTL2 ENSG00000137501 GEMIN7 ENSG00000142252
TGS1 ENSG00000137574 ZNF473 ENSG00000142528
BUD13 ENSG00000137656 CTU1 ENSG00000142544
RSL24D1 ENSG00000137876 PRDM16 ENSG00000142611
PPM1B ENSG00000138032 PADI3 ENSG00000142619
PREPL ENSG00000138078 PIGK ENSG00000142892
RPS24 ENSG00000138326 CTTNBP2NL ENSG00000143079
APH1B ENSG00000138613 CELSR2 ENSG00000143126
LEF1 ENSG00000138795 RGL1 ENSG00000143344
NABP2 ENSG00000139579 LYPLAL1 ENSG00000143353
SMARCC2 ENSG00000139613 ANP32E ENSG00000143401
MAP3K12 ENSG00000139625 TP53BP2 ENSG00000143514
ITGB7 ENSG00000139626 ETNK2 ENSG00000143845
CSAD ENSG00000139631 OSR1 ENSG00000143867
ESYT1 ENSG00000139641 GDF7 ENSG00000143869
ZNF740 ENSG00000139651 CAMKMT ENSG00000143919
CBLN3 ENSG00000139899 EML4 ENSG00000143924
TSSK4 ENSG00000139908 ZNF514 ENSG00000144026
TLE3 ENSG00000140332 COPS7B ENSG00000144524
PML ENSG00000140464 CTDSPL ENSG00000144677
CELF6 ENSG00000140488 LPP ENSG00000145012
FANCI ENSG00000140525 EIF2B5 ENSG00000145191
ZNF710 ENSG00000140548 RPL37 ENSG00000145592
GLYR1 ENSG00000140632 RASA1 ENSG00000145715
PARN ENSG00000140694 MUT ENSG00000146085
TLDC1 ENSG00000140950 RASGEF1C ENSG00000146090
RHOT2 ENSG00000140983 PRIM2 ENSG00000146143
RPS2 ENSG00000140988 ARHGAP18 ENSG00000146376
SSH2 ENSG00000141298 SLC18B1 ENSG00000146409
ARSG ENSG00000141337 FERD3L ENSG00000146618
SS18 ENSG00000141380 CSGALNACT1 ENSG00000147408
TMC6 ENSG00000141524 DOCK5 ENSG00000147459
SLC16A3 ENSG00000141526 SURF4 ENSG00000148248
FOXK2 ENSG00000141568 POLR3A ENSG00000148606
HERC4 ENSG00000148634 CMIP ENSG00000153815
ANKRD1 ENSG00000148677 KCNJ16 ENSG00000153822
HTR7 ENSG00000148680 HS2ST1 ENSG00000153936
RPP30 ENSG00000148688 MSI2 ENSG00000153944
ADAM12 ENSG00000148848 PRKCA ENSG00000154229
RGS10 ENSG00000148908 LONRF1 ENSG00000154359
LIN7C ENSG00000148943 FAM69A ENSG00000154511
IMMP1L ENSG00000148950 WNT7A ENSG00000154764
PTPRJ ENSG00000149177 CCDC174 ENSG00000154781
C11orf73 ENSG00000149196 MOV10 ENSG00000155363
SERPINH1 ENSG00000149257 MED7 ENSG00000155868
RPS3 ENSG00000149273 C9orf41 ENSG00000156017
FADS1 ENSG00000149485 ANKRD9 ENSG00000156381
CDC42EP2 ENSG00000149798 FGF18 ENSG00000156427
FAM57B ENSG00000149926 PDE6D ENSG00000156973
DOC2A ENSG00000149927 EXOG ENSG00000157036
SAP18 ENSG00000150459 RBPMS ENSG00000157110
PIP4K2A ENSG00000150867 CDCP2 ENSG00000157211
FOXO1 ENSG00000150907 SSBP3 ENSG00000157216
CRIM1 ENSG00000150938 SUSD3 ENSG00000157303
CCRN4L ENSG00000151014 HYDIN ENSG00000157423
ENKUR ENSG00000151023 AASDH ENSG00000157426
NGLY1 ENSG00000151092 RAB28 ENSG00000157869
OXSM ENSG00000151093 RER1 ENSG00000157916
SRFBP1 ENSG00000151304 BRE ENSG00000158019
NR3C2 ENSG00000151623 FANCC ENSG00000158169
FLI1 ENSG00000151702 RNF166 ENSG00000158717
DST ENSG00000151914 NBL1 ENSG00000158747
BEND6 ENSG00000151917 PRAC1 ENSG00000159182
TIAL1 ENSG00000151923 HOXB13 ENSG00000159184
FAM168B ENSG00000152102 IGF2BP1 ENSG00000159217
PTPN14 ENSG00000152104 GIP ENSG00000159224
ARL14EP ENSG00000152219 ARHGAP27 ENSG00000159314
CCDC50 ENSG00000152492 ATP13A2 ENSG00000159363
CARHSP1 ENSG00000153048 ZFYVE28 ENSG00000159733
DAB2 ENSG00000153071 LYPD5 ENSG00000159871
BCL2L11 ENSG00000153094 NPR2 ENSG00000159899
SMARCA5 ENSG00000153147 ZNF233 ENSG00000159915
BMP6 ENSG00000153162 DEDD2 ENSG00000160570
RPIA ENSG00000153574 SIK3 ENSG00000160584
ZDHHC7 ENSG00000153786 CXCR5 ENSG00000160683
UBE2Q1 ENSG00000160714 C9orf89 ENSG00000165233
CHRNB2 ENSG00000160716 LOH12CR1 ENSG00000165714
FGFR4 ENSG00000160867 METTL3 ENSG00000165819
PLCD3 ENSG00000161714 TCP11L2 ENSG00000166046
RAVER1 ENSG00000161847 IL25 ENSG00000166090
FGF11 ENSG00000161958 CMTM5 ENSG00000166091
WDR90 ENSG00000161996 BRD7 ENSG00000166164
JMJD8 ENSG00000161999 ARIH1 ENSG00000166233
ADCY9 ENSG00000162104 AP1G1 ENSG00000166747
TAL1 ENSG00000162367 MESP1 ENSG00000166823
TMCO4 ENSG00000162542 PIP4K2C ENSG00000166908
ARPC5 ENSG00000162704 SMAD3 ENSG00000166949
SLAMF6 ENSG00000162739 AKTIP ENSG00000166971
IL20 ENSG00000162891 RP11-93B14.6 ENSG00000167046
IL24 ENSG00000162892 C16orf92 ENSG00000167194
FAIM3 ENSG00000162894 SNX20 ENSG00000167208
PIGR ENSG00000162896 PRRT2 ENSG00000167371
FCAMR ENSG00000162897 IRGQ ENSG00000167378
REL ENSG00000162924 SPATA33 ENSG00000167523
PUS10 ENSG00000162927 TP53I13 ENSG00000167543
PEX13 ENSG00000162928 CORO6 ENSG00000167549
SMC6 ENSG00000163029 TMC4 ENSG00000167608
ZNF2 ENSG00000163067 LENG8 ENSG00000167615
OTUD7B ENSG00000163113 ZNF526 ENSG00000167625
CDC42EP3 ENSG00000163171 ZNF283 ENSG00000167637
PYGO2 ENSG00000163348 YIF1B ENSG00000167645
NEK10 ENSG00000163491 NXN ENSG00000167693
AZI2 ENSG00000163512 AC018755.1 ENSG00000167765
FBLN2 ENSG00000163520 CDK2AP2 ENSG00000167797
RYBP ENSG00000163602 CTD-2369P2.10 ENSG00000167807
RYK ENSG00000163785 TMC8 ENSG00000167895
S100P ENSG00000163993 TMEM68 ENSG00000167904
PGRMC2 ENSG00000164040 MLST8 ENSG00000167965
NAA15 ENSG00000164134 E4F1 ENSG00000167967
HHIP ENSG00000164161 DNASE1L2 ENSG00000167968
ANAPC10 ENSG00000164162 CASKIN1 ENSG00000167971
ABCE1 ENSG00000164163 SCARA3 ENSG00000168077
OTUD4 ENSG00000164164 ANKS3 ENSG00000168096
UBLCP1 ENSG00000164332 NUDT16L1 ENSG00000168101
C7orf72 ENSG00000164500 HEXIM2 ENSG00000168517
KDM1B ENSG00000165097 TET2 ENSG00000168769
CTRB2 ENSG00000168928 ADRBK1 ENSG00000173020
HEXDC ENSG00000169660 RHOD ENSG00000173156
LRRC45 ENSG00000169683 INSM1 ENSG00000173404
STRA13 ENSG00000169689 MINOS1 ENSG00000173436
ASPSCR1 ENSG00000169696 TMEM196 ENSG00000173452
FASN ENSG00000169710 APOBEC4 ENSG00000173627
ACTRT2 ENSG00000169717 PSMD1 ENSG00000173692
DCXR ENSG00000169738 WFIKKN2 ENSG00000173714
RAC3 ENSG00000169750 CD7 ENSG00000173762
NSG2 ENSG00000170091 TIGD3 ENSG00000173825
ZNF778 ENSG00000170100 CTU2 ENSG00000174177
USP38 ENSG00000170185 ZBTB4 ENSG00000174282
C7orf33 ENSG00000170279 CHRNA9 ENSG00000174343
ELP5 ENSG00000170291 MSL2 ENSG00000174579
FAM71B ENSG00000170613 SRP72 ENSG00000174780
ZNF296 ENSG00000170684 PTDSS2 ENSG00000174915
TTLL6 ENSG00000170703 SEZ6L2 ENSG00000174938
CHCHD7 ENSG00000170791 ASPHD1 ENSG00000174939
KBTBD2 ENSG00000170852 PARL ENSG00000175193
KIAA0232 ENSG00000170871 CTDSP2 ENSG00000175215
SOCS5 ENSG00000171150 CLCF1 ENSG00000175505
RBKS ENSG00000171174 ERCC4 ENSG00000175595
CDC42BPG ENSG00000171219 CALCB ENSG00000175868
RP11-1055B8.7 ENSG00000171282 ZNF683 ENSG00000176083
MAP1LC3B2 ENSG00000171471 DLEU1 ENSG00000176124
WIPF2 ENSG00000171475 GPX2 ENSG00000176153
PTGER4 ENSG00000171522 CCDC57 ENSG00000176155
OTP ENSG00000171540 FOXR1 ENSG00000176302
CXXC5 ENSG00000171604 ZNF575 ENSG00000176472
SLC25A33 ENSG00000171612 FOXL1 ENSG00000176678
ATF7IP ENSG00000171681 FOXC2 ENSG00000176692
RPS21 ENSG00000171858 EFCAB5 ENSG00000176927
RPS7 ENSG00000171863 ZBTB38 ENSG00000177311
PRND ENSG00000171864 AGTRAP ENSG00000177674
EIF2AK3 ENSG00000172071 HTR3C ENSG00000178084
KRCC1 ENSG00000172086 VN1R1 ENSG00000178201
ISG20 ENSG00000172183 PLEC ENSG00000178209
ID4 ENSG00000172201 GEN1 ENSG00000178295
EFCAB12 ENSG00000172771 DDC8 ENSG00000178404
ANKRD13D ENSG00000172932 PARP10 ENSG00000178685
OXSR1 ENSG00000172939 GRINA ENSG00000178719
RNF186 ENSG00000178828 SOCS3 ENSG00000184557
APOLD1 ENSG00000178878 CDCA2 ENSG00000184661
FAM101A ENSG00000178882 TCTE3 ENSG00000184786
ZBTB7A ENSG00000178951 H1FX ENSG00000184897
MRFAP1L1 ENSG00000178988 FMNL1 ENSG00000184922
MRFAP1 ENSG00000179010 NT5C1B ENSG00000185013
AP001885.1 ENSG00000179038 C5orf47 ENSG00000185056
PER1 ENSG00000179094 ANO9 ENSG00000185101
PRNT ENSG00000180259 FAF1 ENSG00000185104
CCDC66 ENSG00000180376 C6orf120 ENSG00000185127
YOD1 ENSG00000180667 PSMD13 ENSG00000185627
S1PR5 ENSG00000180739 SMIM23 ENSG00000185662
C1orf105 ENSG00000180999 THNSL1 ENSG00000185875
AEN ENSG00000181026 C16orf54 ENSG00000185905
TMEM102 ENSG00000181284 PAGR1 ENSG00000185928
OGFOD3 ENSG00000181396 KRTAP5-5 ENSG00000185940
ACBD4 ENSG00000181513 GSAP ENSG00000186088
PLAG1 ENSG00000181690 BCL9L ENSG00000186174
RNF41 ENSG00000181852 BLOC1S4 ENSG00000186222
CLDN7 ENSG00000181885 FOXD2 ENSG00000186564
MRPS11 ENSG00000181991 HEXIM1 ENSG00000186834
ERCC6L2 ENSG00000182150 C17orf82 ENSG00000187013
CREB3L2 ENSG00000182158 TEAD1 ENSG00000187079
B4GALNT4 ENSG00000182272 PLCD1 ENSG00000187091
SGK223 ENSG00000182319 RSBN1L ENSG00000187257
TEX19 ENSG00000182459 USP7 ENSG00000187555
MFSD5 ENSG00000182544 NHLRC1 ENSG00000187566
BRICD5 ENSG00000182685 C17orf97 ENSG00000187624
C16orf72 ENSG00000182831 ZFP69B ENSG00000187801
C21orf90 ENSG00000182912 TPRG1 ENSG00000188001
CEP63 ENSG00000182923 CENPP ENSG00000188312
SEP15 ENSG00000183291 PRR19 ENSG00000188368
FAM101B ENSG00000183688 BLOC1S5 ENSG00000188428
NOG ENSG00000183691 TMEM201 ENSG00000188807
ZNF703 ENSG00000183779 NHLRC3 ENSG00000188811
SCN5A ENSG00000183873 RPL14 ENSG00000188846
TTC32 ENSG00000183891 UTS2B ENSG00000188958
KCNJ12 ENSG00000184185 FAM53B ENSG00000189319
PGP ENSG00000184207 ACADSB ENSG00000196177
TM2D3 ENSG00000184277 IARS ENSG00000196305
WDR27 ENSG00000184465 ZBTB44 ENSG00000196323
IPO4 ENSG00000196497 LTB4R2 ENSG00000213906
GDAP2 ENSG00000196505 MDP1 ENSG00000213920
COL27A1 ENSG00000196739 IRF9 ENSG00000213928
FAM3C ENSG00000196937 AP1G2 ENSG00000213983
NOP9 ENSG00000196943 ZNF891 ENSG00000214029
ZNF585A ENSG00000196967 NEURL4 ENSG00000215041
ZNF470 ENSG00000197016 CYB5RL ENSG00000215883
GTF2E2 ENSG00000197265 AC007401.2 ENSG00000217075
SPN ENSG00000197471 PAM16 ENSG00000217930
ZNF628 ENSG00000197483 RPA3-AS1 ENSG00000219545
C5orf56 ENSG00000197536 ZGLP1 ENSG00000220201
MYH6 ENSG00000197616 PPP3R1 ENSG00000221823
ZNF790 ENSG00000197863 PPP2R2A ENSG00000221914
NOL8 ENSG00000198000 C12orf61 ENSG00000221949
CD2AP ENSG00000198087 AC112721.1 ENSG00000222022
LPAR1 ENSG00000198121 C5orf66 ENSG00000224186
ZNF583 ENSG00000198440 APOC4-APOC2 ENSG00000224916
CALM1 ENSG00000198668 SRRM5 ENSG00000226763
SSBP3-AS1 ENSG00000198711 C1orf143 ENSG00000228208
ANKRD13B ENSG00000198720 DHFR ENSG00000228716
SLC5A3 ENSG00000198743 TMEM114 ENSG00000232258
COLGALT2 ENSG00000198756 RP11-47122.3 ENSG00000232774
C9orf96 ENSG00000198870 AC008394.1 ENSG00000233828
GRK5 ENSG00000198873 PINLYP ENSG00000234465
L3MBTL3 ENSG00000198945 APOC2 ENSG00000234906
MAFB ENSG00000204103 AC018816.3 ENSG00000235978
ZNF155 ENSG00000204920 TXNDC5 ENSG00000239264
AC006486.1 ENSG00000204957 TNFSF12 ENSG00000239697
SNX2 ENSG00000205302 RDH14 ENSG00000240857
AC074091.13 ENSG00000205334 ATP5O ENSG00000241837
CNEP1R1 ENSG00000205423 MRPL33 ENSG00000243147
LOH12CR2 ENSG00000205791 PPAN-P2RY11 ENSG00000243207
CYS1 ENSG00000205795 WDR92 ENSG00000243667
KRTAP5-6 ENSG00000205864 RP5-966M1.6 ENSG00000243696
NYNRIN ENSG00000205978 MRPS6 ENSG00000243927
ZCWPW2 ENSG00000206559 P2RY11 ENSG00000244165
SCAF8 ENSG00000213079 ASPRV1 ENSG00000244617
TMEM110 ENSG00000213533 TMEM110-MUSTN1 ENSG00000248592
ZNF134 ENSG00000213762 AL357673.1 ENSG00000248835
KCTD11 ENSG00000213859 TNFSF12-TNFSF13 ENSG00000248871
LTB4R ENSG00000213903 AC006486.9 ENSG00000268643
AP000304.12 ENSG00000249209 AC093323.1 ENSG00000268791
RP11-503N18.3 ENSG00000249428 MINOS1-NBL1 ENSG00000270136
ZNF345 ENSG00000251247 MTRNR2L2 ENSG00000271043
HOXA10 ENSG00000253293 MUSTN1 ENSG00000272573
NACA2 ENSG00000253506 DOC2B ENSG00000272636
CLDN23 ENSG00000253958 LTB4R2 ENSG00000272658
CHMP4A ENSG00000254505 SLC5A3 ENSG00000272962
TM9SF1 ENSG00000254692 CELF6 ENSG00000273025
NEDD8-MDP1 ENSG00000255526 RP11-474G23.1 ENSG00000273398
ZNF10 ENSG00000256223 RP11-468E2.4 ENSG00000259529
ZNF225 ENSG00000256294 CORO7 ENSG00000262246
RP11-446E24.4 ENSG00000256407 RP1-4G17.5 ENSG00000262302
CTD-2140B24.4 ENSG00000256825 ZNF234 ENSG00000263002
RP1-170O19.20 ENSG00000257184 RP11-1055B8.6 ENSG00000263053
CTD-2006C1.10 ENSG00000257355 PAGR1 ENSG00000263136
FNTB ENSG00000257365 EEF1E1-BLOC1S5 ENSG00000265818
ZNF878 ENSG00000257446 CTD-2369P2.12 ENSG00000267303
CHURC1 ENSG00000258289 APOC4 ENSG00000267467
RP11-80A15.1 ENSG00000258744 S1PR2 ENSG00000267534
RP11-944C7.1 ENSG00000258792 FDX1L ENSG00000267673
RP11-934B9.3 ENSG00000258973 AC004076.9 ENSG00000268163
RP11-47122.4 ENSG00000258989 L34079.2 ENSG00000268361
BLOC1S5-TXNDC5 ENSG00000259040 AC026202.1 ENSG00000268509
THTPA ENSG00000259431 RP11-468E2.2 ENSG00000259522
MRPL46 ENSG00000259494
Table 11: Lipid Metabolism
Gene Name Ensembl ID Gene Name Ensembl ID
MARK4 ENSG00000007047 BCAT1 ENSG00000060982
TRAPPC6A ENSG00000007255 MEF2BNB-MEF2B ENSG00000064489
MYLIP ENSG00000007944 RFXANK ENSG00000064490
BAZ1B ENSG00000009954 SUGP2 ENSG00000064607
NPC1L1 ENSG00000015520 TRMT11 ENSG00000066651
GAB2 ENSG00000033327 SLC12A3 ENSG00000070915
HERPUD1 ENSG00000051108 WBSCR22 ENSG00000071462
PRSS8 ENSG00000052344 CPSF1 ENSG00000071894
TSG101 ENSG00000074319 FSD1L ENSG00000106701
MKRN2 ENSG00000075975 ZNF259 ENSG00000109917
UBA5 ENSG00000081307 CRTAM ENSG00000109943
APOB ENSG00000084674 DCPS ENSG00000110063
GCKR ENSG00000084734 FOXRED1 ENSG00000110074
EIF2AK1 ENSG00000086232 APOA5 ENSG00000110243
PGS1 ENSG00000087157 CEP164 ENSG00000110274
CETP ENSG00000087237 HPS5 ENSG00000110756
FUS ENSG00000089280 GTF2H1 ENSG00000110768
TGM1 ENSG00000092295 MLEC ENSG00000110917
TINF2 ENSG00000092330 C2CD5 ENSG00000111731
MAP3K1 ENSG00000095015 MCM3 ENSG00000112118
CYP26A1 ENSG00000095596 PCCB ENSG00000114054
EFHC1 ENSG00000096093 GTF3C2 ENSG00000115207
IFT74 ENSG00000096872 EIF2B4 ENSG00000115211
STX1B ENSG00000099365 SNX17 ENSG00000115234
HSD3B7 ENSG00000099377 SDC1 ENSG00000115884
HNRNPM ENSG00000099783 DHCR24 ENSG00000116133
MARCH2 ENSG00000099785 RALGPS2 ENSG00000116191
MTAP ENSG00000099810 ANGPTL1 ENSG00000116194
KHNYN ENSG00000100441 STAG1 ENSG00000118007
TM9SF1 ENSG00000100926 APOA1 ENSG00000118137
GMPR2 ENSG00000100938 RPS25 ENSG00000118181
PLTP ENSG00000100979 RAB3GAP2 ENSG00000118873
PCIF1 ENSG00000100982 SUPT7L ENSG00000119760
ACD ENSG00000102977 TMEM214 ENSG00000119777
PARD6A ENSG00000102981 PLS1 ENSG00000120756
PSMD7 ENSG00000103035 CBX3 ENSG00000122565
KAT8 ENSG00000103510 HNRNPA2B1 ENSG00000122566
CLPTM1 ENSG00000104853 POLM ENSG00000122678
PPP1R37 ENSG00000104866 ACADS ENSG00000122971
CKM ENSG00000104879 ENKD1 ENSG00000124074
SIGLEC8 ENSG00000105366 PLCG1 ENSG00000124181
NAPA ENSG00000105402 ATXN1 ENSG00000124788
SIGLEC5 ENSG00000105501 SLC25A35 ENSG00000125434
HAS1 ENSG00000105509 VASP ENSG00000125753
LENG1 ENSG00000105617 BPIFB1 ENSG00000125999
ARMC6 ENSG00000105676 KLHDC10 ENSG00000128607
SUGP1 ENSG00000105705 RIPK3 ENSG00000129465
STX1A ENSG00000106089 ADCY4 ENSG00000129467
AEBP1 ENSG00000106624 NEDD8 ENSG00000129559
TBL2 ENSG00000106638 MAU2 ENSG00000129933
DOCK6 ENSG00000130158 FADS1 ENSG00000149485
APOE ENSG00000130203 INCENP ENSG00000149503
APOC1 ENSG00000130208 SIDT2 ENSG00000149577
BPIFA3 ENSG00000131059 TAGLN ENSG00000149591
STARD3 ENSG00000131748 TIRAP ENSG00000150455
USP29 ENSG00000131864 TGOLN2 ENSG00000152291
RAF1 ENSG00000132155 SCN3A ENSG00000153253
PPARG ENSG00000132170 CMIP ENSG00000153815
FCRLA ENSG00000132185 UBASH3B ENSG00000154127
VAV3 ENSG00000134215 FAM167A ENSG00000154319
SORT1 ENSG00000134243 NEIL2 ENSG00000154328
LDHA ENSG00000134333 CCDC174 ENSG00000154781
FST ENSG00000134363 TTC39B ENSG00000155158
FADS2 ENSG00000134824 PSD3 ENSG00000156011
HNF1A ENSG00000135100 GPR61 ENSG00000156097
URB2 ENSG00000135763 NSMCE2 ENSG00000156831
ABCB10 ENSG00000135776 COX6A2 ENSG00000156885
TAF5L ENSG00000135801 TIMP4 ENSG00000157150
GLUL ENSG00000135821 CABP1 ENSG00000157782
BLK ENSG00000136573 BRE ENSG00000158019
STAM ENSG00000136738 NCK1 ENSG00000158092
PLAA ENSG00000137055 CPA2 ENSG00000158516
ADAM10 ENSG00000137845 SLC45A3 ENSG00000158715
KHK ENSG00000138030 RSPRY1 ENSG00000159579
EMILIN1 ENSG00000138080 RLTPR ENSG00000159753
CEP55 ENSG00000138180 C16orf86 ENSG00000159761
MSMB ENSG00000138294 SIK3 ENSG00000160584
INHBE ENSG00000139269 PCSK7 ENSG00000160613
CBLN3 ENSG00000139899 YDJC ENSG00000161179
TSSK4 ENSG00000139908 BCL6B ENSG00000161940
ARMC5 ENSG00000140691 SEPN1 ENSG00000162430
NLRC5 ENSG00000140853 ATXN7L2 ENSG00000162650
TMC6 ENSG00000141524 FCRLB ENSG00000162746
PNMT ENSG00000141744 C4orf36 ENSG00000163633
GEMIN7 ENSG00000142252 TOPBP1 ENSG00000163781
TMEM61 ENSG00000143001 ZNF513 ENSG00000163795
CELSR2 ENSG00000143126 SLC4A1AP ENSG00000163798
FCGR2A ENSG00000143226 PLB1 ENSG00000163803
ANXA9 ENSG00000143412 GPR146 ENSG00000164849
C7orf50 ENSG00000146540 KIAA0196 ENSG00000164961
HAUS6 ENSG00000147874 SNAPC3 ENSG00000164975
ARFGAP2 ENSG00000149182 PSIP1 ENSG00000164985
CCDC171 ENSG00000164989 CCDC121 ENSG00000176714
ABCA1 ENSG00000165029 PLEC ENSG00000178209
TMEM246 ENSG00000165152 PRSS36 ENSG00000178226
MARCH8 ENSG00000165406 DDC8 ENSG00000178404
REEP3 ENSG00000165476 PARP10 ENSG00000178685
PACSIN3 ENSG00000165912 GRINA ENSG00000178719
ARL5B ENSG00000165997 PFAS ENSG00000178921
WDR88 ENSG00000166359 HES7 ENSG00000179111
SUN5 ENSG00000167098 PUF60 ENSG00000179950
RNF214 ENSG00000167257 F2 ENSG00000180210
POP5 ENSG00000167272 SLC25A42 ENSG00000181035
ZNF668 ENSG00000167394 SRPR ENSG00000182934
TMC4 ENSG00000167608 RP1L1 ENSG00000183638
AC018755.1 ENSG00000167765 C8orf12 ENSG00000184608
TMC8 ENSG00000167895 NRBP2 ENSG00000185189
BEST1 ENSG00000167995 RAB11B ENSG00000185236
PAFAH1B2 ENSG00000168092 BPIFB4 ENSG00000186191
IRF2BP2 ENSG00000168264 BACE1 ENSG00000186318
SLC35G2 ENSG00000168917 CEACAM19 ENSG00000186567
PCSK9 ENSG00000169174 BCAM ENSG00000187244
ZNF296 ENSG00000170684 HAPLN4 ENSG00000187664
XKR6 ENSG00000171044 PPP3R2 ENSG00000188386
FPR2 ENSG00000171049 BLOC1S3 ENSG00000189114
FPR1 ENSG00000171051 NCOR2 ENSG00000196498
RBKS ENSG00000171174 TRAPPC4 ENSG00000196655
RPS7 ENSG00000171863 CD2AP ENSG00000198087
JMJD1C ENSG00000171988 BPIFA1 ENSG00000198183
MTBP ENSG00000172167 PEG3 ENSG00000198300
MRPL13 ENSG00000172172 MYL4 ENSG00000198336
AFF1 ENSG00000172493 GPN1 ENSG00000198522
ZFAND4 ENSG00000172671 TOP1 ENSG00000198900
FAM192A ENSG00000172775 TEDDM1 ENSG00000203730
TCAP ENSG00000173991 MAFB ENSG00000204103
CYB561D1 ENSG00000174151 DPRX ENSG00000204595
ZHX3 ENSG00000174306 AC074091.13 ENSG00000205334
SLC16A11 ENSG00000174326 DEFB136 ENSG00000205884
SLC16A13 ENSG00000174327 CEACAM18 ENSG00000213822
IL20RB ENSG00000174564 MDP1 ENSG00000213920
UBE2C ENSG00000175063 TM6SF2 ENSG00000213996
UNC119B ENSG00000175970 MEF2B ENSG00000213999
DNAJC30 ENSG00000176410 FADS3 ENSG00000221968
VPS37D ENSG00000176428 APOC4-APOC2 ENSG00000224916
OST4 ENSG00000228474 C8orf49 ENSG00000255394
APOC2 ENSG00000234906 RP11-934B9.3 ENSG00000258973
NSUN6 ENSG00000241058 RP11-145E5.5 ENSG00000264545
MRPL33 ENSG00000243147 CTB-129P6.11 ENSG00000267114
CTC-236F12.4 ENSG00000248727 APOC4 ENSG00000267467
SIGLEC14 ENSG00000254415 AC005779.2 ENSG00000267545
CHMP4A ENSG00000254505 SIGLEC5 ENSG00000268500
TM9SF1 ENSG00000254692 AC135048.1 ENSG00000268863
MEF2BNB ENSG00000254901 AL121901.1 ENSG00000269117
RP11-712L6.5 ENSG00000255062 ZIM2 ENSG00000269699
Table 12: Glucose Metabolism
Figure imgf000061_0001
Gene Name Ensembl ID Gene Name Ensembl ID
OR2T4 ENSG00000196944 FRRS1L ENSG00000260230
STK39 ENSG00000198648 RP11-145E5.5 ENSG00000264545
TOP1 ENSG00000198900 AP003733.1 ENSG00000269089
AC022498.1 ENSG00000213132
Table 13: Insulin Metabolism
Gene Name Ensembl ID Gene Name Ensembl ID
COBLL1 ENSG00000082438 TBCK ENSG00000145348
TGFB2 ENSG00000092969 FAM167A ENSG00000154319
KLHDC10 ENSG00000128607 DNAJB14 ENSG00000164031
PPARG ENSG00000132170 H2AFZ ENSG00000164032
PPA2 ENSG00000138777 WDR88 ENSG00000166359
GSTCD ENSG00000138780 TET2 ENSG00000168769
INTS12 ENSG00000138785 CEBPA ENSG00000245848
Table 14: Bone Mineral Density
Gene Name Ensembl ID Gene Name Ensembl ID
WNT16 ENSG00000002745 CCDC170 ENSG00000120262
SLC25A39 ENSG00000013306 HOXC13 ENSG00000123364
POLR3B ENSG00000013503 NFE2 ENSG00000123405
ZPBP ENSG00000042813 HOXC12 ENSG00000123407
TRAF1 ENSG00000056558 SMUG1 ENSG00000123415
CCAR1 ENSG00000060339 C17orf53 ENSG00000125319
ING3 ENSG00000071243 MKKS ENSG00000125863
MAEA ENSG00000090316 IDUA ENSG00000127415
AAAS ENSG00000094914 FGFRL1 ENSG00000127418
PSMD5 ENSG00000095261 TMEM175 ENSG00000127419
RTDR1 ENSG00000100218 NFATC1 ENSG00000131196
CLCN7 ENSG00000103249 AMHR2 ENSG00000135409
ANKRD27 ENSG00000105186 CKAP4 ENSG00000136026
BRAT1 ENSG00000106009 TTYH3 ENSG00000136295
CPED1 ENSG00000106034 TARBP2 ENSG00000139546
RUNDC3A ENSG00000108309 DHH ENSG00000139549
DPH1 ENSG00000108963 NPFF ENSG00000139574
SOX6 ENSG00000110693 MAP3K12 ENSG00000139625
C11orf58 ENSG00000110696 ITGB7 ENSG00000139626
WNT5B ENSG00000111186 CSAD ENSG00000139631
COPZ1 ENSG00000111481 ZNF740 ENSG00000139651
WLS ENSG00000116729 TNFRSF11A ENSG00000141655
SPP1 ENSG00000118785 RERE ENSG00000142599
CNTRL ENSG00000119397 DGKQ ENSG00000145214
SLC26A1 ENSG00000145217 BANF1 ENSG00000175334
C6orf211 ENSG00000146476 EIF1AD ENSG00000175376
SLX4IP ENSG00000149346 SART1 ENSG00000175467
MKX ENSG00000150051 HIC1 ENSG00000177374
RMND1 ENSG00000155906 RNF212 ENSG00000178222
ASB16 ENSG00000161664 GAK ENSG00000178950
FAM171A2 ENSG00000161682 HOXC9 ENSG00000180806
EN1 ENSG00000163064 COLEC10 ENSG00000184374
TCP11L2 ENSG00000166046 PCGF3 ENSG00000185619
SOST ENSG00000167941 RGS9BP ENSG00000186326
LDLRAD4 ENSG00000168675 FAM3C ENSG00000196937
SP7 ENSG00000170374 HOXC6 ENSG00000197757
HOXC5 ENSG00000172789 HOXC4 ENSG00000198353
RARG ENSG00000172819 HN1L ENSG00000206053
ZNF408 ENSG00000175213 OVCA2 ENSG00000262664
ARHGAP1 ENSG00000175220 RP11-793H13.10 ENSG00000267281
CST6 ENSG00000175315
Table 15: Blood Pressure
Gene Name Ensembl ID Gene Name Ensembl ID
CLCN6 ENSG00000011021 FURIN ENSG00000140564
HERPUD1 ENSG00000051108 IQGAP1 ENSG00000140575
SLMO2 ENSG00000101166 CRTC3 ENSG00000140577
HM13 ENSG00000101294 CTTNBP2NL ENSG00000143079
CSK ENSG00000103653 CAMKMT ENSG00000143919
BRAT1 ENSG00000106009 CNNM2 ENSG00000148842
MED13 ENSG00000108510 NGLY1 ENSG00000151092
KLF3 ENSG00000109787 OXSM ENSG00000151093
MYL2 ENSG00000111245 ZFP36L2 ENSG00000152518
CUX2 ENSG00000111249 ARHGAP27 ENSG00000159314
MFN2 ENSG00000116688 PLCD3 ENSG00000161714
MIIP ENSG00000116691 NEK10 ENSG00000163491
TBX2 ENSG00000121068 AZI2 ENSG00000163512
ID1 ENSG00000125968 TP53INP1 ENSG00000164938
GFAP ENSG00000131095 CACNB2 ENSG00000165995
RBM38 ENSG00000132819 HEXIM2 ENSG00000168517
SBF2 ENSG00000133812 AMZ1 ENSG00000174945
TTYH3 ENSG00000136295 MTHFR ENSG00000177000
BRIP1 ENSG00000136492 RPP25 ENSG00000178718
PREPL ENSG00000138078 FAM219B ENSG00000178761
SEMA7A ENSG00000138623 MPI ENSG00000178802
ACBD4 ENSG00000181513 AC021860.1 ENSG00000196355
FES ENSG00000182511 ZNF774 ENSG00000196391
NGRN ENSG00000182768 MAN2A2 ENSG00000196547
GDPGP1 ENSG00000183208 BLM ENSG00000197299
HDDC3 ENSG00000184508 SCAMP5 ENSG00000198794
FMNL1 ENSG00000184922 ZCWPW2 ENSG00000206559
HEXIM1 ENSG00000186834 GABARAPL3 ENSG00000238244
Table 16: Body Mass Index
Figure imgf000064_0001
Gene Name Ensembl ID Gene Name Ensembl ID
CPO ENSG00000144410 PIDD ENSG00000177595
RQCD1 ENSG00000144580 BET1L ENSG00000177951
ANKRD31 ENSG00000145700 RIC8A ENSG00000177963
IQGAP2 ENSG00000145703 SH2B1 ENSG00000178188
BTF3 ENSG00000145741 PRSS36 ENSG00000178226
HSD17B12 ENSG00000149084 TUFM ENSG00000178952
TMEM18 ENSG00000151353 GPBAR1 ENSG00000179921
POC5 ENSG00000152359 FAM73A ENSG00000180488
PAN3 ENSG00000152520 FIGN ENSG00000182263
CEBPG ENSG00000153879 ADI1 ENSG00000182551
CHD1 ENSG00000153922 ST6GALNAC3 ENSG00000184005
RASA2 ENSG00000155903 PCDH9 ENSG00000184226
GPR61 ENSG00000156097 CEND1 ENSG00000184524
COX6A2 ENSG00000156885 PSMD13 ENSG00000185627
ETS2 ENSG00000157557 ZNF267 ENSG00000185947
TNNI1 ENSG00000159173 POLR1D ENSG00000186184
CSRP1 ENSG00000159176 POFUT2 ENSG00000186866
ADCY9 ENSG00000162104 C15orf61 ENSG00000189227
TAL1 ENSG00000162367 FAM150B ENSG00000189292
USP1 ENSG00000162607 ATP2A1 ENSG00000196296
TMEM161B ENSG00000164180 SPTAN1 ENSG00000197694
F2RL1 ENSG00000164251 IPO9 ENSG00000198700
AGGF1 ENSG00000164252 PPM1N ENSG00000213889
ANKRA2 ENSG00000164331 LSM14A ENSG00000257103
UTP15 ENSG00000164338 AC135048.1 ENSG00000268863
NSA2 ENSG00000164346
Methods of diagnosis, prognosis or monitoring
According to a further aspect of the invention, there is provided a method of diagnosing a disease or disorder as described herein, or a predisposition thereto, in an individual, comprising:
(a) quantifying the amounts of the biomarkers as defined herein in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative of said disease or disorder, or predisposition thereto.
According to a further aspect of the invention, there is provided a method of prognosing the development of a disease or disorder as described herein in an individual, comprising:
(a) quantifying the amounts of the biomarkers as defined herein in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative that the individual will develop said disease or disorder.
It should be noted that references to biomarker amounts or levels also include references to a biomarker range.
It will be appreciated that references herein to "difference in the level" refer to either a higher or lower level of the biomarker(s) in the test biological sample compared with the reference sample(s).
In one embodiment, the lower level is a <1 fold difference relative to the reference sample, such as a fold difference of 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01 or any ranges therebetween. In one embodiment, the lower level is between a 0.1 and 0.9 fold difference, such as between a 0.2 and 0.5 fold difference, relative to the reference sample.
In one embodiment, the higher level is a >1 fold difference relative to the reference sample, such as a fold difference of 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10, 10.5, 11, 11.5, 12, 12.5, 15 or 20 or any ranges therebetween. In one embodiment, the higher level is between a 1 and 15 fold difference, such as between a 2 and 10 fold difference, relative to the reference sample.
According to a further aspect of the invention, there is provided a method of monitoring efficacy of a therapy in a subject having, suspected of having, or of being predisposed to a disease or disorder as described herein, comprising detecting and/or quantifying, in a sample from said subject, the biomarkers as defined herein.
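By way of illustration only, the fold-difference embodiments described above can be sketched in Python. The function names are hypothetical and do not form part of the described methods. The sketch assumes that a fold difference is the ratio of the test level to the reference level, and that the stated ranges (0.1 to 0.9 for a lower level, 1 to 15 for a higher level) include their end points, except that a ratio of exactly 1 is treated as equivalent rather than higher.

```python
def fold_difference(test_level: float, reference_level: float) -> float:
    """Ratio of the test-sample level to the reference-sample level (assumed definition)."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return test_level / reference_level


def classify_fold(fold: float) -> str:
    """Label a fold difference as 'lower' (<1), 'higher' (>1) or 'equivalent' (exactly 1)."""
    if fold < 1.0:
        return "lower"
    if fold > 1.0:
        return "higher"
    return "equivalent"


def in_embodiment_range(fold: float) -> bool:
    """True if the fold difference lies in one of the ranges named in the text:
    0.1 to 0.9 for a lower level, or above 1 and up to 15 for a higher level.
    Treating the end points as inclusive is an assumption of this sketch."""
    return 0.1 <= fold <= 0.9 or 1.0 < fold <= 15.0


if __name__ == "__main__":
    fold = fold_difference(test_level=30.0, reference_level=10.0)
    print(fold, classify_fold(fold), in_embodiment_range(fold))
```

In practice a tolerance around 1 would usually be needed before two measured levels are called equivalent; the exact comparison here is kept only for simplicity.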
Monitoring methods of the invention can be used to monitor onset, progression, stabilisation, amelioration and/or remission.
In methods of diagnosing, prognosing or monitoring according to the invention, detecting and/or quantifying the biomarker in a biological sample from a test subject may be performed on two or more occasions. Comparisons may be made between the level of biomarker in samples taken on two or more occasions. Assessment of any change in the level of the biomarker in samples taken on two or more occasions may be performed. Modulation of the biomarker level is useful as an indicator of the state of a disease or disorder as described herein or predisposition thereto. An increase in the level of the biomarker over time is indicative of onset or progression, i.e. worsening, of the disorder, whereas a decrease in the level of the biomarker indicates amelioration or remission of the disorder, or vice versa.
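As a non-limiting illustration, the comparison of samples taken on two or more occasions can be sketched as follows. The names `level_trend`, `interpret_trend`, `tolerance` and `raised_in_disease` are hypothetical. The sketch assumes that only the first and last measurements are compared; the `raised_in_disease` flag covers the "vice versa" case, in which a falling level rather than a rising one indicates worsening.

```python
from typing import Sequence


def level_trend(levels: Sequence[float], tolerance: float = 0.0) -> str:
    """Compare the first and last of two or more measurements of one biomarker.

    `tolerance` is a hypothetical margin below which a change is ignored.
    """
    if len(levels) < 2:
        raise ValueError("samples from at least two occasions are required")
    change = levels[-1] - levels[0]
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "stable"


def interpret_trend(trend: str, raised_in_disease: bool = True) -> str:
    """Map a trend onto the disease states named in the text.

    If the biomarker is lowered in disease, the interpretation is reversed.
    """
    if trend == "stable":
        return "no change detected"
    worsening = (trend == "increase") == raised_in_disease
    return "onset or progression" if worsening else "amelioration or remission"


if __name__ == "__main__":
    trend = level_trend([1.0, 1.4, 2.1])
    print(trend, "->", interpret_trend(trend))
```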
A method of diagnosis, prognosis or monitoring according to the invention may comprise quantifying the biomarker in a test biological sample from a test subject and comparing the level of the biomarker present in said test sample with one or more controls.
The control used in a method of the invention can be one or more control(s) selected from the group consisting of: the level of biomarker found in a normal control sample from a normal subject; a normal biomarker level; a normal biomarker range; the level in a sample from a subject with a disease or disorder as described herein, or a diagnosed predisposition thereto; the biomarker level of a disease or disorder as described herein; or the biomarker range of a disease or disorder as described herein.
Also provided is a method of monitoring efficacy of a therapy for a disease or disorder as described herein in a subject having such a disorder, suspected of having such a disorder, or of being predisposed thereto, comprising detecting and/or quantifying the biomarker present in a biological sample from said subject.
In monitoring methods, test samples may be taken on two or more occasions. The method may further comprise comparing the level of the biomarker present in the test sample with one or more reference(s) and/or with one or more previous test sample(s) taken earlier from the same test subject, e.g. prior to commencement of therapy, and/or from the same test subject at an earlier stage of therapy. The method may comprise detecting a change in the level of the biomarker in test samples taken on different occasions.
In one embodiment, the method comprises comparing the amount of biomarker(s) in said test biological sample with the amount present in one or more samples taken from said individual prior to commencement of treatment, and/or one or more samples taken from said individual during treatment.
For biomarkers which are increased in individuals with a disease or disorder as described herein, a higher level of the biomarker in the test sample relative to the level in the normal control is indicative of the presence of a disease or disorder as described herein, or predisposition thereto; an equivalent or lower level of the biomarker in the test sample relative to the normal control is indicative of absence of a disease or disorder as described herein and/or absence of a predisposition thereto.
For biomarkers which are decreased in individuals with a disease or disorder as described herein, a lower level of the biomarker in the test sample relative to the level in the normal control is indicative of the presence of said disease or disorder, or predisposition thereto; an equivalent or higher level of the biomarker in the test sample relative to the normal control is indicative of absence of said disease or disorder and/or absence of a predisposition thereto.
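By way of illustration only, the comparison of a test sample with a normal control can be sketched as follows. The function name, the `direction` labels and the `tolerance` margin (the range within which two levels are treated as equivalent) are assumptions introduced for this sketch and do not form part of the methods defined herein.

```python
from typing import Literal

Direction = Literal["increased", "decreased"]


def indicates_disorder(test_level: float, control_level: float,
                       direction: Direction, tolerance: float = 0.0) -> bool:
    """Return True if the test level points to presence of the disorder.

    direction: whether the biomarker is increased or decreased in
               individuals with the disorder.
    tolerance: hypothetical margin within which the test and control
               levels are treated as equivalent.
    """
    if direction == "increased":
        # A higher level than the normal control indicates presence.
        return test_level > control_level + tolerance
    if direction == "decreased":
        # A lower level than the normal control indicates presence.
        return test_level < control_level - tolerance
    raise ValueError(f"unknown direction: {direction!r}")
```

A level that is equivalent to the control, or that differs from it in the opposite direction, returns False, corresponding to absence of the disorder or of a predisposition thereto.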
The term "diagnosis" as used herein encompasses identification, confirmation, and/or characterisation of a disease or disorder as described herein, or predisposition thereto. The term "prognosis" as used herein encompasses the prediction of whether a patient is likely to develop a disease or disorder as described herein. By "predisposition" it is meant that a subject does not currently present with the disorder, but is liable to be affected by the disorder in time.
Methods of monitoring and of diagnosis or prognosis according to the invention are useful to confirm the existence of a disorder, or predisposition thereto; to monitor development of the disorder by assessing onset and progression, or to assess amelioration or regression of the disorder. Methods of monitoring and of diagnosis or prognosis are also useful in methods for assessment of clinical screening, choice of therapy, evaluation of therapeutic benefit, i.e. for drug screening and drug development. Efficient diagnosis, prognosis and monitoring methods provide very powerful "patient solutions" with the potential for improved prognosis, by establishing the correct diagnosis, allowing rapid identification of the most appropriate treatment (thus lessening unnecessary exposure to harmful drug side effects), reducing "down-time" and relapse rates.
Methods for monitoring efficacy of a therapy can be used to monitor the therapeutic effectiveness of existing therapies and new therapies in human subjects and in non-human animals (e.g. in animal models). These monitoring methods can be incorporated into screens for new drug substances and combinations of substances.
Suitably, the time elapsed between taking samples from a subject undergoing diagnosis or monitoring will be 3 days, 5 days, a week, two weeks, a month, 2 months, 3 months, 6 or 12 months. Samples may be taken prior to and/or during and/or following therapy for a disease or disorder described herein. Samples can be taken at intervals over the remaining life, or a part thereof, of a subject. The term "detecting" as used herein means confirming the presence of the biomarker present in the sample. Quantifying the amount of the biomarker present in a sample may include determining the concentration of the biomarker present in the sample. Detecting and/or quantifying may be performed directly on the sample, or indirectly on an extract therefrom, or on a dilution thereof.
In alternative aspects of the invention, the presence of the biomarker is assessed by detecting and/or quantifying antibody or fragments thereof capable of specific binding to the biomarker that are generated by the subject's body in response to the biomarker and thus are present in a biological sample from a subject having a disease or disorder as described herein or a predisposition thereto.
Detecting and/or quantifying can be performed by any method suitable to identify the presence and/or amount of a specific biomarker in a biological sample from a patient or a purification or extract of a biological sample or a dilution thereof. In methods of the invention, quantifying may be performed by measuring the concentration of the biomarker in the sample or samples. Biological samples that may be tested in a method of the invention include whole blood, blood serum, plasma, cerebrospinal fluid (CSF), urine, saliva, or other bodily fluid (stool, tear fluid, synovial fluid, sputum), breath, e.g. as condensed breath, or an extract or purification therefrom, or dilution thereof. Biological samples also include tissue homogenates, tissue sections and biopsy specimens from a live subject, or taken post-mortem. The samples can be prepared, for example where appropriate diluted or concentrated, and stored in the usual manner. It will be understood that methods of the invention may be performed in vitro. In one embodiment, the biological sample is whole blood, blood serum or plasma, such as blood serum. Detection and/or quantification of biomarkers may be performed by detection of the biomarker or of a fragment thereof, e.g. a fragment with C-terminal truncation, or with N-terminal truncation. Fragments are suitably greater than 4 amino acids in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids in length. In one embodiment, the biomarker defined herein may be replaced by a molecule, or a measurable fragment of the molecule, found upstream or downstream of the biomarker in a biological pathway.
Methods of detection
As used herein, the term "biosensor" means anything capable of detecting the presence of the biomarker. Examples of biosensors are described herein.
Biosensors according to the invention may comprise a ligand or ligands, as described herein, capable of specific binding to the biomarker. Such biosensors are useful in detecting and/or quantifying a biomarker of the invention.
The biomarker may be directly detected, e.g. by SELDI or MALDI-TOF. Alternatively, the biomarker may be detected directly or indirectly via interaction with a ligand or ligands such as an antibody or a biomarker-binding fragment thereof, or other peptide, or ligand, e.g. aptamer, or oligonucleotide, capable of specifically binding the biomarker. The ligand may possess a detectable label, such as a luminescent, fluorescent or radioactive label, and/or an affinity tag.
For example, detecting and/or quantifying can be performed by one or more method(s) selected from the group consisting of: SELDI (-TOF), MALDI (-TOF), a 1-D gel-based analysis, a 2-D gel-based analysis, mass spectroscopy (MS) such as selected reaction monitoring (SRM), reverse phase (RP) LC, size permeation (gel filtration), ion exchange, affinity, HPLC, UPLC and other LC or LC MS-based techniques. Appropriate LC MS techniques include ICAT® (Applied Biosystems, CA, USA), or iTRAQ® (Applied Biosystems, CA, USA). Liquid chromatography (e.g. high pressure liquid chromatography (HPLC) or low pressure liquid chromatography (LPLC)), thin-layer chromatography, NMR (nuclear magnetic resonance) spectroscopy could also be used. In one embodiment, the detecting and/or quantifying is performed using mass spectroscopy (MS). In a further embodiment, the detecting and/or quantifying is performed using selected reaction monitoring (SRM). SRM is a method used in tandem mass spectrometry in which an ion of a particular mass is selected in the first stage of a tandem mass spectrometer and an ion product of a fragmentation reaction of the precursor ion is selected in the second mass spectrometer stage for detection. Specific analyte panels can be developed for SRM matching the analytes on the biomarker panel. The analyte panels can quantitatively measure the protein analytes with high precision. This methodology has the advantage of allowing raw blood to be used instead of blood serum, which minimizes the number of intermediate processing steps.
Methods according to the invention may comprise analysing a sample of blood serum by SELDI-TOF or MALDI-TOF to detect the presence or level of the biomarker. These methods are also suitable for clinical screening, prognosis, monitoring the results of therapy, identifying patients most likely to respond to a particular therapeutic treatment, for drug screening and development, and identification of new targets for drug treatment.
Detecting and/or quantifying the biomarkers may be performed using an immunological method, involving an antibody, or a fragment thereof capable of specific binding to the biomarker. Suitable immunological methods include sandwich immunoassays, such as sandwich ELISA, in which the detection of the biomarkers is performed using two antibodies which recognize different epitopes on a biomarker; radioimmunoassays (RIA), direct, indirect or competitive enzyme linked immunosorbent assays (ELISA), enzyme immunoassays (EIA), fluorescence immunoassays (FIA), western blotting, immunoprecipitation and any particle-based immunoassay (e.g. using gold, silver, or latex particles, magnetic particles, or Q-dots). Immunological methods may be performed, for example, in microtitre plate or strip format.
Immunological methods in accordance with the invention may be based, for example, on any of the following methods.
Immunoprecipitation is the simplest immunoassay method; this measures the quantity of precipitate, which forms after the reagent antibody has incubated with the sample and reacted with the target antigen present therein to form an insoluble aggregate. Immunoprecipitation reactions may be qualitative or quantitative. In particle immunoassays, several antibodies are linked to the particle, and the particle is able to bind many antigen molecules simultaneously. This greatly accelerates the speed of the visible reaction. This allows rapid and sensitive detection of the biomarker. In immunonephelometry, the interaction of an antibody and target antigen on the biomarker results in the formation of immune complexes that are too small to precipitate. However, these complexes will scatter incident light and this can be measured using a nephelometer. The antigen, i.e. biomarker, concentration can be determined within minutes of the reaction. Radioimmunoassay (RIA) methods employ radioactive isotopes such as I125 to label either the antigen or antibody. The isotope used emits gamma rays, which are usually measured following removal of unbound (free) radiolabel. The major advantages of RIA, compared with other immunoassays, are higher sensitivity, easy signal detection, and well-established, rapid assays. The major disadvantages are the health and safety risks posed by the use of radiation and the time and expense associated with maintaining a licensed radiation safety and disposal program. For this reason, RIA has been largely replaced in routine clinical laboratory practice by enzyme immunoassays.
Enzyme immunoassays (EIA) were developed as an alternative to radioimmunoassays (RIA). These methods use an enzyme to label either the antibody or target antigen. The sensitivity of EIA approaches that of RIA, without the danger posed by radioactive isotopes. One of the most widely used EIA methods for detection is the enzyme-linked immunosorbent assay (ELISA). ELISA methods may use two antibodies, one of which is specific for the target antigen and the other of which is coupled to an enzyme; addition of the substrate for the enzyme results in production of a chemiluminescent or fluorescent signal.
Fluorescent immunoassay (FIA) refers to immunoassays which utilize a fluorescent label or an enzyme label which acts on the substrate to form a fluorescent product. Fluorescent measurements are inherently more sensitive than colorimetric (spectrophotometric) measurements. Therefore, FIA methods have greater analytical sensitivity than EIA methods, which employ absorbance (optical density) measurement.
Chemiluminescent immunoassays utilize a chemiluminescent label, which produces light when excited by chemical energy; the emissions are measured using a light detector. Immunological methods according to the invention can thus be performed using well-known methods. Any direct (e.g., using a sensor chip) or indirect procedure may be used in the detection of the biomarker of the invention. The Biotin-Avidin or Biotin-Streptavidin systems are generic labelling systems that can be adapted for use in immunological methods of the invention. One binding partner (hapten, antigen, ligand, aptamer, antibody, enzyme etc) is labelled with biotin and the other partner (surface, e.g. well, bead, sensor etc) is labelled with avidin or streptavidin. This is conventional technology for immunoassays, gene probe assays and (bio)sensors, but is an indirect immobilisation route rather than a direct one. For example a biotinylated ligand (e.g. antibody or aptamer) specific for a biomarker of the invention may be immobilised on an avidin or streptavidin surface, the immobilised ligand may then be exposed to a sample containing or suspected of containing the biomarker in order to detect and/or quantify a biomarker of the invention. Detection and/or quantification of the immobilised antigen may then be performed by an immunological method as described herein.
The term "antibody" as used herein includes, but is not limited to: polyclonal, monoclonal, bispecific, humanised or chimeric antibodies, single chain antibodies, Fab fragments and F(ab')2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies and epitope-binding fragments of any of the above. The term "antibody" as used herein also refers to immunoglobulin molecules and immunologically-active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds an antigen. The immunoglobulin molecules of the invention can be of any class (e.g., IgG, IgE, IgM, IgD and IgA) or subclass of immunoglobulin molecule.
The identification of key biomarkers specific to a disease is central to integration of diagnostic procedures and therapeutic regimes. Using predictive biomarkers, appropriate diagnostic tools such as biosensors can be developed. Accordingly, in methods and uses of the invention, detecting and quantifying can be performed using a biosensor, microanalytical system, microengineered system, microseparation system, immunochromatography system or other suitable analytical devices. The biosensor may incorporate an immunological method for detection of the biomarker, or electrical, thermal, magnetic, optical (e.g. hologram) or acoustic technologies. Using such biosensors, it is possible to detect the target biomarker at the anticipated concentrations found in biological samples.
Thus, according to a further aspect of the invention there is provided an apparatus for monitoring a disease or disorder as described herein, which comprises a biosensor, microanalytical, microengineered, microseparation and/or immunochromatography system configured to detect and/or quantify the biomarker defined herein.
The biomarker of the invention can be detected using a biosensor incorporating technologies based on "smart" holograms, or high frequency acoustic systems; such systems are particularly amenable to "bar code" or array configurations.
In smart hologram sensors (Smart Holograms Ltd, Cambridge, UK), a holographic image is stored in a thin polymer film that is sensitised to react specifically with the biomarker. On exposure, the biomarker reacts with the polymer leading to an alteration in the image displayed by the hologram. The test result read-out can be a change in the optical brightness, image, colour and/or position of the image. For qualitative and semi-quantitative applications, a sensor hologram can be read by eye, thus removing the need for detection equipment. A simple colour sensor can be used to read the signal when quantitative measurements are required. Opacity or colour of the sample does not interfere with operation of the sensor. The format of the sensor allows multiplexing for simultaneous detection of several substances. Reversible and irreversible sensors can be designed to meet different requirements, and continuous monitoring of a particular biomarker of interest is feasible. Suitably, biosensors for detection of the biomarker of the invention combine biomolecular recognition with appropriate means to convert detection of the presence, or quantitation, of the biomarker in the sample into a signal. Biosensors can be adapted for "alternate site" diagnostic testing, e.g. in the ward, outpatients' department, surgery, home, field and workplace.
Biosensors to detect the biomarker of the invention include acoustic, plasmon resonance, holographic and microengineered sensors. Imprinted recognition elements, thin film transistor technology, magnetic acoustic resonator devices and other novel acousto-electrical systems may be employed in biosensors for detection of the biomarker of the invention.
Methods involving detection and/or quantification of the biomarker of the invention can be performed on bench-top instruments, or can be incorporated onto disposable, diagnostic or monitoring platforms that can be used in a non-laboratory environment, e.g. in the physician's office or at the patient's bedside. Suitable biosensors for performing methods of the invention include "credit" cards with optical or acoustic readers. Biosensors can be configured to allow the data collected to be electronically transmitted to the physician for interpretation and thus can form the basis for e-neuromedicine. Any suitable animal may be used as a subject non-human animal, for example a non-human primate, horse, cow, pig, goat, sheep, dog, cat, fish, rodent, e.g. guinea pig, rat or mouse; insect (e.g. Drosophila), amphibian (e.g. Xenopus) or C. elegans.
There is provided a method of identifying a substance capable of promoting or suppressing the generation of the biomarker in a subject, comprising exposing a test cell to a test substance and monitoring the level of the biomarker within said test cell, or secreted by said test cell. The test cell could be prokaryotic; however, a eukaryotic cell will suitably be employed in cell-based testing methods. Suitably, the eukaryotic cell is a yeast cell, insect cell, Drosophila cell, amphibian cell (e.g. from Xenopus), C. elegans cell or is a cell of human, non-human primate, equine, bovine, porcine, caprine, ovine, canine, feline, piscine, rodent or murine origin. The test substance can be a known chemical or pharmaceutical substance, such as, but not limited to, a medicament for the disease or disorder as described herein; or the test substance can be a novel synthetic or natural chemical entity, or a combination of two or more of the aforesaid substances. In methods for identifying substances of potential therapeutic use, non-human animals or cells can be used that are capable of expressing the biomarker.
Screening methods also encompass a method of identifying a ligand capable of binding to the biomarker according to the invention, comprising incubating a test substance in the presence of the biomarker in conditions appropriate for binding, and detecting and/or quantifying binding of the biomarker to said test substance.
Thus, according to a further aspect of the invention there is provided a method of screening for a modulator of one or more biomarkers as defined herein, which comprises the steps of:
(a) incubating a test compound in the presence of said one or more biomarkers; and
(b) detecting and/or quantifying binding of the biomarker to said test substance.
High-throughput screening technologies based on the biomarker, uses and methods of the invention, e.g. configured in an array format, are suitable to monitor biomarker signatures for the identification of potentially useful therapeutic compounds, e.g. ligands such as natural compounds, synthetic chemical compounds (e.g. from combinatorial libraries), peptides, monoclonal or polyclonal antibodies or fragments thereof, which may be capable of binding the biomarker.
Methods of the invention can be performed in array format, e.g. on a chip, or as a multiwell array. Methods can be adapted into platforms for single tests, or multiple identical or multiple non-identical tests, and can be performed in high throughput format. Methods of the invention may comprise performing one or more additional, different tests to confirm or exclude diagnosis, and/or to further characterise a condition. The invention further provides a substance, e.g. a ligand, identified or identifiable by an identification or screening method or use of the invention. Such substances may be capable of inhibiting, directly or indirectly, the activity of the biomarker, or of suppressing generation of the biomarker. The term "substances" includes substances that do not directly bind the biomarker and directly modulate a function, but instead indirectly modulate a function of the biomarker. Ligands are also included in the term substances; ligands of the invention (e.g. a natural or synthetic chemical compound, peptide, aptamer, oligonucleotide, antibody or antibody fragment) are capable of binding, suitably specific binding, to the biomarker.
The invention further provides a substance according to the invention for use in the treatment of a disease or disorder as described herein, or predisposition thereto.
In one embodiment, the method additionally comprises administering a medicament for a disease or disorder as described herein to a patient who is diagnosed with or predicted to have said disease or disorder.
Thus, according to a further aspect of the invention there is provided a method of treating a patient suffering from a disease or disorder as described herein, which comprises the step of administering a medicament for said disease or disorder to a patient identified as having differing levels of the biomarkers as defined herein when compared to the levels of said biomarkers from a normal subject.
According to a further aspect of the invention there is provided a method of treating a patient suffering from a disease or disorder as described herein, which comprises the following steps:
(a) quantifying the amounts of the biomarkers as defined herein in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative of said disease or disorder, or predisposition thereto; and
(c) administering a medicament for said disease or disorder to a patient diagnosed in step (b) as a patient with said disease or disorder.
Also provided is the use of a substance according to the invention in the treatment of a disease or disorder as described herein, or predisposition thereto.
Also provided is the use of a substance according to the invention as a medicament.
Diagnostic kits
A further aspect of the invention provides a kit for diagnosing and/or monitoring a disease or disorder as described herein comprising reagents and/or a biosensor capable of detecting and/or quantifying the biomarkers described herein. Suitably a kit according to the invention may contain one or more components selected from the group: a ligand specific for the biomarker or a structural/shape mimic of the biomarker, one or more controls, one or more reagents and one or more consumables; optionally together with instructions for use of the kit in accordance with any of the methods defined herein. In one embodiment, the kit additionally comprises a questionnaire for use in diagnosing a patient with a disease or disorder as described herein.
Diagnostic kits for the diagnosis and monitoring of a disease or disorder as described herein are described herein. In one embodiment, the kits additionally contain a biosensor capable of detecting and/or quantifying a biomarker.
Biomarker-based tests provide a first line assessment of 'new' patients, and provide objective measures for accurate and rapid diagnosis, in a time frame and with precision, not achievable using the current subjective measures.
Furthermore, diagnostic biomarker tests are useful to identify family members or patients at high risk of developing a disease or disorder as described herein. This permits initiation of appropriate therapy, or preventive measures, e.g. managing risk factors. These approaches are recognised to improve outcome and may prevent overt onset of the disease or disorder.
Biomarker monitoring methods, biosensors and kits are also vital as patient monitoring tools, to enable the physician to determine whether relapse is due to worsening of the disorder, poor patient compliance or substance abuse. If pharmacological treatment is assessed to be inadequate, then therapy can be reinstated or increased; a change in therapy can be given if appropriate. As the biomarker is sensitive to the state of the disorder, it provides an indication of the impact of drug therapy or of substance abuse.
The following studies illustrate the invention.
MATERIALS AND METHODS
Cell isolation and purity test
Cells were isolated from venous or cord blood and, in some cases, cultured and differentiated in vitro following BLUEPRINT standard protocols. In all cases purity or morphology were tested using samples with purity over 97%. Libraries were obtained from either a single healthy donor (Mon, Neu, Mφ0 (2/3 reps), Mφ1, Mφ2 (1/3 reps), Ery, EndP, nCD4 (1/4 reps), tCD4, tCD8 (2/3 reps), tB) or pooled from multiple healthy donors (MK, Mφ0 (1/3 reps), Mφ2 (2/3 reps), nCD4 (3/4 reps), naCD4, aCD4, nCD8, tCD8 (1/3 reps), nB). A single fetal thymus per library was used. Monocytes were isolated from venous blood after CD16+ depletion and CD14+ selection of peripheral blood mononuclear cells (PBMCs) by Miltenyi Biotec kits, as described in detail at
http://www.blueprint-epigenome.eu/UserFiles/file/Protocols/UCAM_BluePrint_Monocyte.pdf.
Neutrophils were isolated from venous blood after erythrocyte lysis and CD16+ selection by Miltenyi Biotec kits. Macrophages were in vitro differentiated from monocytes isolated from venous blood. Briefly, M0 resting macrophages were obtained after stimulation of monocytes with 50 ng/ml M-CSF for 8 days. M1 inflammatory macrophages were obtained after stimulation of monocytes with 50 ng/ml M-CSF for 7 days followed by LPS alone at 100 ng/ml for the last 18 hours. M2 anti-inflammatory macrophages were obtained after stimulation of monocytes with 15 ng/ml IL-13 and 0.1 μM rosiglitazone. See http://www.blueprint-epigenome.eu/UserFiles/file/Protocols/UCAM_BluePrint_Macrophage.pdf for full details.
Erythroblasts and megakaryocytes were cultured from CD34+ cells isolated from cord blood mononuclear cells obtained with the human CD34 isolation kit (Miltenyi Biotec) as described in (Chen et al., 2014). Erythroblasts were generated with erythropoietin, SCF and IL3 for 14 days, while megakaryocytes were obtained with thrombopoietin and IL1β in 10 days.
Endothelial precursors (blood outgrowth endothelial cells (BOECs)) were generated from circulating endothelial progenitors in adult peripheral blood after long-term culturing of PBMCs with endothelial cell growth medium and colony isolation (Ormiston et al., 2015).
Naive CD4+ lymphocytes were obtained from PBMCs from venous blood by using custom kit (Catalog#19309) from STEMCELL Technologies. Total CD4+ lymphocytes were obtained from PBMCs from venous blood by negative selection using EasySep Human CD4+ T Cell Enrichment kit (Catalog#19052) from STEMCELL Technologies.
Activated and non-activated total CD4+ T cells were enriched from whole blood using RosetteSep human CD4+ T cell enrichment cocktail according to the manufacturer's protocol (STEMCELL Technologies, Vancouver, Canada). The enriched CD4+ T cell interface was washed twice in X-VIVO-15 media (Lonza, Basel, Switzerland) supplemented with 1% human AB serum (Lonza) and penicillin/streptomycin (Gibco, ThermoFisher). 250,000 CD4+ T cells (93-99% pure) were stimulated with anti-CD3/CD28 T cell activator beads (Dynal, ThermoFisher). Beads were added at a ratio of 0.3 beads per CD4+ T cell (75,000 beads per well) and the cells +/- beads were cultured for 4 hours at 37°C + 5% CO2.
Naive CD8+ lymphocytes were obtained from PBMCs from venous blood by negative selection using EasySep Human Naive CD8+ T Cell Enrichment kit (Catalog#19158) from STEMCELL Technologies. Total CD8+ lymphocytes were obtained from PBMCs from venous blood by negative selection using EasySep Human CD8+ T-cell Enrichment kit, (Catalog#19053) from STEMCELL Technologies.
Naive B lymphocytes were obtained from PBMCs from venous blood by negative selection using EasySep Naive B Cell Enrichment kit (Catalog#19254) from STEMCELL Technologies. Total B lymphocytes were obtained from PBMCs from venous blood by negative selection using EasySep Human B cell Enrichment kit, (Catalog#19054) from STEMCELL Technologies. The blood samples were consented under Study Title: A Blueprint of Blood Cells; REC reference 12/EE/0040; NRES Committee East of England-Hertfordshire. Fetal thymus cells were obtained after cell disaggregation from fetal thymus tissue from Advanced Bioscience Resources (Alameda, CA, USA) sourced, processed and banked in accordance with UK's Human Tissue Act 2004. Ficoll isolation was used to select healthy cells.
Cell fixation
~8 × 10^7 cells per library were resuspended in 30.825 ml of DMEM supplemented with 10% FBS, and 4.375 ml of formaldehyde was added (16% stock solution; 2% final concentration). The fixation reaction continued for 10 min at room temperature with mixing and was then quenched by the addition of 5 ml of 1 M glycine (125 mM final concentration). Cells were incubated at room temperature for 5 min and then on ice for 15 min. Cells were pelleted by centrifugation at 400g for 10 min at 4°C, and the supernatant was discarded. The pellet was washed briefly in cold PBS, and samples were centrifuged again to pellet the cells. The supernatant was removed, and the cells were flash frozen in liquid nitrogen and stored at -80 °C.
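The stated final concentrations follow from the volumes given above, as the short calculation below illustrates. It assumes that the glycine addition is 5 ml of a 1 M stock and that the volumes are simply additive; the function and variable names are chosen for this sketch only.

```python
def final_concentration(stock_conc: float, stock_volume_ml: float,
                        total_volume_ml: float) -> float:
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_ml / total_volume_ml


medium_ml = 30.825        # DMEM + 10% FBS containing the cells
formaldehyde_ml = 4.375   # 16% formaldehyde stock
glycine_ml = 5.0          # assumed: 1 M glycine stock

# Fixation step: formaldehyde diluted into the cell suspension.
fix_volume_ml = medium_ml + formaldehyde_ml
formaldehyde_pct = final_concentration(16.0, formaldehyde_ml, fix_volume_ml)

# Quenching step: glycine diluted into the fixation reaction.
quench_volume_ml = fix_volume_ml + glycine_ml
glycine_mM = final_concentration(1000.0, glycine_ml, quench_volume_ml)

print(f"formaldehyde: {formaldehyde_pct:.2f}%")  # about 2%
print(f"glycine: {glycine_mM:.1f} mM")           # about 125 mM
```

Under these assumptions the formaldehyde concentration comes to approximately 1.99% and the glycine concentration to approximately 124 mM, in agreement with the nominal 2% and 125 mM values.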
Hi-C library preparation
Hi-C library generation was carried out with in-nucleus ligation as described previously (Nagano et al., 2015). Chromatin was then de-crosslinked and purified by phenol-chloroform extraction. DNA concentration was measured using Quant-iT PicoGreen (Life Technologies), and 40 μg of DNA was sheared to an average size of 400 bp, using the manufacturer's instructions (Covaris). The sheared DNA was end-repaired, adenine-tailed and double size-selected using AMPure XP beads to isolate DNA ranging from 250 to 550 bp. Ligation fragments marked by biotin were immobilized using MyOne Streptavidin C1 DynaBeads (Invitrogen) and ligated to paired-end adaptors (Illumina). The immobilized Hi-C libraries were amplified using PE PCR 1.0 and PE PCR 2.0 primers (Illumina) with 7-8 PCR amplification cycles.
Biotinylated RNA bait library design
Biotinylated 120-mer RNA baits were designed to the ends of HindIII restriction fragments that overlap Ensembl-annotated promoters of protein-coding, noncoding, antisense, snRNA, miRNA and snoRNA transcripts (Mifsud et al., 2015). A target sequence was accepted if its GC content ranged between 25% and 85%, the sequence contained no more than two consecutive Ns and was within 330 bp of the HindIII restriction fragment terminus. A total of 22,076 HindIII fragments were captured, containing a total of 31,253 annotated promoters for 18,202 protein-coding and 10,929 non-protein genes according to Ensembl v75 (http://grch37.ensembl.org).
PCHi-C
Capture Hi-C of promoters was carried out with SureSelect target enrichment, using the custom-designed biotinylated RNA bait library and custom paired-end blockers according to the manufacturer's instructions (Agilent Technologies). After library enrichment, a post-capture PCR amplification step was carried out using PE PCR 1.0 and PE PCR 2.0 primers with 4 PCR amplification cycles.
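The acceptance criteria applied to bait target sequences in the design of the biotinylated RNA bait library (GC content between 25% and 85%, no more than two consecutive Ns, and location within 330 bp of the restriction fragment terminus) can be sketched as follows. The function names are hypothetical, and the inclusive boundaries and the calculation of GC content over the full sequence length are assumptions; the sketch is not the design software actually used.

```python
import re


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases over the full sequence length."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def accept_bait_target(seq: str, distance_to_terminus_bp: int) -> bool:
    """Apply the three acceptance criteria to a candidate target sequence."""
    seq = seq.upper()
    gc_ok = 0.25 <= gc_fraction(seq) <= 0.85       # GC content 25% to 85%
    n_ok = re.search(r"N{3,}", seq) is None        # at most two consecutive Ns
    near_ok = distance_to_terminus_bp <= 330       # within 330 bp of terminus
    return gc_ok and n_ok and near_ok
```

For example, a 120-mer with 50% GC content located 100 bp from the fragment terminus is accepted, whereas the same sequence located 331 bp from the terminus, or one containing a run of three Ns, is rejected.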
Sequencing
Hi-C and PCHi-C libraries were sequenced on the Illumina HiSeq 2500 platform. 3 sequencing lanes per PCHi-C library and 2 sequencing lanes per Hi-C library were used.
Hi-C and PCHi-C sequence alignment
Raw sequencing reads were processed using the HiCUP pipeline (Wingett et al., 2015), which maps the positions of di-tags against the human genome (hg19), filters out experimental artefacts, such as circularized reads and re-ligations, and removes all duplicate reads.
Hi-C data processing and the definition of TAD boundaries
Aligned Hi-C data were analysed using HOMER (Heinz et al., 2010). Using binned Hi-C data, we computed the coverage- and distance-related background in the Hi-C data at 25 kb, 100 kb and 1 Mb resolutions, based on an iterative correction algorithm (Imakaev et al., 2012). General genome organisation in the eight selected cell types was compared by plotting the distance-and-coverage corrected Hi-C matrices at 1 Mb resolution, and by computing the compartment signal-related (1st or 2nd) principal component of the distance-and-coverage corrected interaction profile correlation matrix (Lieberman-Aiden et al., 2009) at 100 kb resolution, with positive values aligned with H3K4me3 ChIP-seq on human CD20+ (https://www.encodeproject.org/experiments/ENCSR000DQR/ ENCFF001WXC). The compartment signal for the selected cell types in each replicate was plotted for comparison, and the genome-wide concatenated ChIP-seq aligned principal components were clustered using hierarchical clustering (using 1 - Pearson correlation as the distance metric). Directionality indices (Dixon et al., 2012) were calculated from the number of interactions 1 Mb upstream and downstream using a 25 kb sliding window moved in 5 kb steps, and were smoothed using a +/-25 kb window. Topologically associated domain (TAD) boundaries were called between consecutive negative and positive local extrema of the smoothed directionality indices with a standard score above 0.5. For each analysed cell type, TADs called on individual biological replicates were merged by taking the mean of the TAD boundary genome locations; TADs showing an intersection of less than 75% between biological replicates were removed from the analysis.
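The directionality index step can be illustrated with a minimal Python sketch. It uses the general form of the index published by Dixon et al. (2012) and is not the HOMER implementation used here; the function names, the per-bin count inputs and the simple moving-average smoother are assumptions made for illustration only.

```python
def directionality_index(upstream, downstream):
    """Directionality index for one bin, in the form given by Dixon et al. (2012).

    upstream   (A): contacts between the bin and the region up to 1 Mb upstream.
    downstream (B): contacts between the bin and the region up to 1 Mb downstream.
    """
    a, b = float(upstream), float(downstream)
    if a == b:  # no directional bias (also covers a == b == 0)
        return 0.0
    expected = (a + b) / 2.0  # E: expected count if there were no bias
    sign = 1.0 if b > a else -1.0
    return sign * ((a - expected) ** 2 / expected + (b - expected) ** 2 / expected)


def smooth(values, half_width):
    """Centred moving average over +/- half_width bins (shorter window at the edges)."""
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half_width), min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out
```

A bin with more downstream than upstream contacts gets a positive index and a bin with the opposite bias gets a negative one; boundaries are then sought between consecutive negative and positive extrema of the smoothed profile.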
PCHi-C interaction calling
Significantly interacting regions were called using the CHiCAGO pipeline (Cairns et al., 2016). Briefly, CHiCAGO calls interactions based on a convolution background model reflecting both the 'Brownian' (real, but expected interactions) and 'technical' (assay and sequencing artefacts) components. The resulting p-values are adjusted using a weighted false discovery control procedure that specifically accommodates the fact that increasingly larger numbers of tests are performed at regions where progressively smaller numbers of interactions are expected. The weights were learned based on the decrease of the reproducibility of interaction calls in individual replicates with distance. Interaction scores are then computed for each fragment pair as -log10-transformed, soft-thresholded, weighted p-values. Interactions with a CHiCAGO score ≥ 5 in at least one cell type were used for further analysis.
Reciprocal Capture Hi-C
A capture system containing 949 PIRs identified in the PCHi-C experiments in at least one of the following cell types: activated, non-activated CD4+ T cells, erythroblasts, and monocytes was used to probe the Hi-C material in these cell types. Data processing and interaction calling were performed in the same way as for PCHi-C.
Comparing PCHi-C and Reciprocal Capture Hi-C
Determining consistent signals between genomics datasets is a non-trivial problem that requires leveraging both false-positive and false-negative rates (Blangiardo and Richardson, 2007; Jeffries et al., 2009), particularly in undersampled datasets such as PCHi-C (Cairns et al., 2016). Here we took advantage of a published method, sdef (Blangiardo et al., 2010), to determine the so-called q∑ thresholds on CHiCAGO interaction scores that minimize the global misclassification error by balancing sensitivity and specificity. The q∑ thresholds (Ery: 0.27; MK: 0.14; nCD4: 1.23; aCD4: 1.20) were below 5 in all cases, indicating that the consistency range between PCHi-C and reciprocal capture Hi-C datasets extends considerably below the stringent threshold used for interaction detection (as also evident from Figure 8A). The proportion of significant interactions called in PCHi-C (score >= 5) that fell within the consistency range in the reciprocal capture (score >= q∑ in both experiments) was, respectively, 96.3% (Ery), 98.7% (MK), 92.9% (nCD4), and 91.6% (aCD4).
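The final proportion reported above can be computed as in the following Python sketch. It does not reproduce the sdef threshold estimation itself; the function name and the representation of the data as paired scores are illustrative assumptions.

```python
def fraction_in_consistency_range(pairs, q_threshold, call_threshold=5.0):
    """Share of PCHi-C-significant interactions that are consistent in both assays.

    pairs: iterable of (pchic_score, reciprocal_score) for the same fragment pair.
    q_threshold: consistency threshold for the cell type (e.g. 0.27 for Ery).
    call_threshold: CHiCAGO score used to call significant interactions.
    """
    significant = [(p, r) for p, r in pairs if p >= call_threshold]
    if not significant:
        return float("nan")
    consistent = sum(
        1 for p, r in significant if p >= q_threshold and r >= q_threshold
    )
    return consistent / len(significant)
```

Because the consistency thresholds are all below 5, the condition on the PCHi-C score is automatically satisfied for significant interactions, so the proportion is driven by the reciprocal capture score.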
Promoter interaction localisation with respect to TADs
Significant PCHi-C interactions were classified as either "within-TAD" or "TAD boundary-crossing" (only interactions with baits located within TAD boundaries were considered in the analysis). Localisation expected at random was estimated by randomly reshuffling the distances between baits and the TAD boundaries on both their flanks across baits, thus preserving the overall structure of promoter interactions and bait positioning within TADs.
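The classification can be sketched in Python as follows. This is a minimal illustration only: representing each bait and PIR by a single coordinate (for example a fragment midpoint) and the function name are assumptions, and the reshuffling-based null is not shown.

```python
from bisect import bisect_right


def classify_interaction(bait_pos, pir_pos, tads):
    """Classify one interaction relative to TADs on a single chromosome.

    tads: sorted, non-overlapping list of (start, end) intervals.
    Returns "within-TAD", "TAD boundary-crossing", or None when the bait
    does not lie inside any TAD (such interactions are not considered).
    """
    starts = [start for start, _ in tads]
    i = bisect_right(starts, bait_pos) - 1  # last TAD starting at or before the bait
    if i < 0 or bait_pos > tads[i][1]:
        return None
    start, end = tads[i]
    return "within-TAD" if start <= pir_pos <= end else "TAD boundary-crossing"
```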
Clustering and Principal Component Analysis
Interactions with a CHiCAGO score ≥ 5 in at least one cell type were clustered by the Bayesian algorithm "autoclass" (Cheeseman et al., 1988) based on the full range of asinh-transformed CHiCAGO scores in each cell type. The algorithm was trained on a sample of 30,000 interactions, and then used in the "predict" mode to classify the complete dataset. The relative error parameter was set to 0.1. This resulted in 34 clusters, with cluster sizes ranging from 108,066 interactions to 12 interactions and a mean cluster size of 21,436 interactions. Clustering of the cell types based on their interaction profiles was performed using a hierarchical algorithm with average linkage, based on Euclidean distances. Principal component analysis was performed using the prcomp function in R.
Histone modification ChIP and the definition of chromatin states
All histone modification ChIP-seq datasets were taken from the BLUEPRINT project (in the January 2015 GRCh37-based release) in the processed form. Histone modification enrichment at PIRs was computed using the peakEnrichment4Features function in the CHiCAGO package with respect to randomised PIRs generated so as to preserve the distribution of PIR distances to promoters. To form genome segmentations, ChromHMM (Ernst and Kellis, 2012) and all BLUEPRINT samples with full reference epigenome histone modification alignment files were used, using default settings and defining 25 epigenetic states. This data set was used as the basis for the Ensembl Regulatory Build (Zerbino et al., 2015) method, defining regulatory features based on the histone profiles (transcription start site, proximal enhancer, distal enhancer, poised, repressed), and also assigning activity statuses based on sample specific experiments (active, inactive, poised, repressed) (Zerbino et al., 2016). Baits and PIRs were then overlapped with Ensembl Regulatory Build regulatory features.
Hierarchical clustering was conducted on the presence or absence of significant interactions and active distal enhancers using the binary distance and complete linkage. Enrichment was calculated as observed over expected, where observed is the number of active distal enhancers overlapping PIRs, and expected is the expected number under the null model of no association between enhancer activity and PIR presence.
For the analyses in Figures 3E and 3F, one representative BLUEPRINT sample was selected for each cell type to avoid double counting interactions. A bait fragment was labeled as active if it overlapped at least one active promoter regulatory element, and a PIR was labeled as active if it overlapped at least one active distal enhancer. Sets were formed of overlapping promoter features and baits, and overlapping distal enhancers and PIRs. 2x2 contingency tables were generated by summarizing these sets: either the full set (Figure 3E) or the subset where at least one cell type has a significant interaction between an active promoter and an active distal enhancer (Figure 3F). The p-values for the null hypothesis of independence between interaction state and regulatory state were calculated by the χ2 test. Overdispersion was expected in the underlying null distribution due to correlated observations arising from the shared baits of multiple interactions. Bootstrapping was therefore performed to estimate overdispersion, and the χ2-distribution was scaled by a factor of sqrt(2) divided by the square root of the variance of the 1000 bootstrap-resampled χ2-statistics.
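The overdispersion adjustment can be sketched in Python under one reading of the description above: since a χ2 variable with one degree of freedom has variance 2 (standard deviation sqrt(2)), the observed statistic is multiplied by sqrt(2) divided by the bootstrap standard deviation before being referred to the 1-degree-of-freedom χ2 distribution. This reading, the function names and the way the bootstrap statistics are supplied are assumptions, not the exact procedure used.

```python
import math
from statistics import variance


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom


def adjusted_p_value(observed_stat, bootstrap_stats):
    """Overdispersion-adjusted p-value on one degree of freedom.

    bootstrap_stats: chi-square statistics from bootstrap resamples (e.g. 1000).
    """
    scale = math.sqrt(2.0) / math.sqrt(variance(bootstrap_stats))
    scaled = observed_stat * scale
    # Upper tail of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(scaled / 2.0))
```

When the bootstrap variance equals 2 the scale factor is 1 and the ordinary χ2 p-value is recovered; a larger bootstrap variance shrinks the statistic and makes the test more conservative.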
Relationship between active enhancers and gene expression
Gene expression quantifications for the GENCODE v15 (Harrow et al., 2012) genes were downloaded from the BLUEPRINT site for all BLUEPRINT segmentation samples (ftp://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150128/homo_sapiens/20150128.data.index). DESeq2 (Love et al., 2014) was used to calculate normalized gene counts across individuals and to convert to log2 scale. The data were filtered so that the promoter regulatory feature was within 500 bp upstream and 50 bp downstream of an annotated transcription start site for the gene. Only genes with active promoters in all BLUEPRINT samples were used in this analysis, to remove the large effect of promoter status on gene expression. A linear model was fitted by robust regression using iterated reweighted least squares, where the gene expression was modeled by the number of interacting active enhancers (Figure 4A) or any interacting PIRs (Figure 10A).
Interaction-based gene specificity scores and gene-centric clustering
Data pre-processing. Calculation of interaction-based gene specificity scores was restricted to the eight cell types for which BLUEPRINT expression and histone modification data were available. The original set of significant interactions was filtered to (i) only contain baits that mapped to a unique protein-coding gene promoter and (ii) only contain interactions for which at least one of the eight cell types has both a CHiCAGO score ≥ 5 and an active enhancer (according to the histone modification data). This resulted in a set of 139,835 interactions and 7,004 unique baits. To focus the analysis on tissue-specificity of interactions with active enhancers, for each interaction CHiCAGO scores were set to zero for cell types where the enhancer had an inactive status. Finally, to avoid large CHiCAGO scores dominating the specificity analysis, scores were asinh-transformed and values larger than a threshold of 4.3 (equivalent to a score ≈ 36.8) were set to 4.3. Below, these scores are referred to as "processed CHiCAGO scores".
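The score processing for a single interaction can be sketched in Python as follows. The dictionary-based data layout and the function name are illustrative assumptions; the zeroing, asinh transformation and cap at 4.3 follow the description above.

```python
import math

CAP = 4.3  # cap on the asinh scale stated in the text; sinh(4.3) is about 36.8


def processed_scores(scores, enhancer_active):
    """Return processed CHiCAGO scores for one interaction.

    scores: dict mapping cell type -> raw CHiCAGO score.
    enhancer_active: dict mapping cell type -> True if the enhancer is active there.
    """
    out = {}
    for cell_type, score in scores.items():
        if not enhancer_active.get(cell_type, False):
            score = 0.0  # inactive enhancer: score set to zero
        out[cell_type] = min(math.asinh(score), CAP)
    return out
```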
Calculation of interaction-based gene specificity scores. First, calculation of specificity scores for a single enhancer-promoter interaction was considered. Let x_i denote the processed CHiCAGO score for cell type i. Then, the specificity score s_c for a given cell type c is a weighted mean of the differences x_c - x_i for i ≠ c,
$$s_c = \frac{\sum_{i \neq c} d_{c,i}\,(x_c - x_i)}{\sum_{i \neq c} d_{c,i}}$$
where the weights d_{c,i} are distances between cell type c and cell types i, calculated using CHiCAGO scores for the full set of interactions (CHiCAGO scores asinh-transformed with upper threshold of 4.3; distances calculated using Euclidean distance metric). These distance weights focus the calculation of s_c on cell types that are not close to cell type c in the interaction-based lineage tree (Figure 2B). For example, among the eight cell types are three types of macrophages that have very similar interaction profiles and so are very close in the lineage tree. The distance weights result in the calculation of s_c for each type of macrophage placing relatively little weight on the other types of macrophages. Without this weighting, specificity scores for macrophages would be smaller on average simply because macrophages are over-represented among the eight cell types.
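The weighted mean described above can be written as a short Python sketch. The dictionary layout and the function name are assumptions made for illustration; the distance weights are taken as given rather than computed from the full interaction set.

```python
def specificity_score(x, distances, c):
    """Distance-weighted mean of x[c] - x[i] over all other cell types i.

    x: dict mapping cell type -> processed CHiCAGO score for one interaction.
    distances: dict mapping (c, i) -> distance between cell types c and i.
    c: the cell type for which the score is computed.
    """
    others = [i for i in x if i != c]
    total_weight = sum(distances[(c, i)] for i in others)
    weighted = sum(distances[(c, i)] * (x[c] - x[i]) for i in others)
    return weighted / total_weight
```

For example, with scores of 4.0 in Ery, 1.0 in M0 and 2.0 in M1, and distances from Ery of 1 (to M0) and 3 (to M1), the Ery score is (1 x 3 + 3 x 2) / 4 = 2.25.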
A single gene (protein-coding gene promoter) g is now considered. Let n_g denote the number of enhancer interactions this gene has among the set of 139,835 interactions. The gene then has n_g specificity scores s_c for cell type c, one for each interaction. These n_g scores are averaged to obtain the interaction-based gene specificity score for cell type c. The heat map in Figure 4B shows these scores for eight cell types and 7,004 genes.
Clustering of interaction-based gene specificity scores. The 7,004 genes were clustered based on their interaction-based gene specificity scores across the eight cell types. Clustering was performed in R using k-means with Euclidean distance metric and 10,000 random starts each with a maximum of 10,000 iterations. The analysis was repeated for number of clusters varying between 2 and 30. 12 clusters (shown in Figure 4B) were selected by inspecting the scree plot of within-cluster sum of squares versus number of clusters.
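The scree-plot procedure can be illustrated with a small pure-Python sketch. It uses plain Lloyd iterations with random restarts and is not the R kmeans routine used in the analysis; the function names, default parameters and seeding are assumptions chosen for a compact, reproducible example.

```python
import random


def _sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, n_starts=10, max_iter=100, seed=0):
    """Best within-cluster sum of squares over several random starts."""
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_starts):
        centres = rng.sample(points, k)  # initialise from k distinct data points
        for _ in range(max_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda c: _sqdist(p, centres[c]))
                clusters[j].append(p)
            # Recompute centroids; an empty cluster keeps its previous centre.
            new = [
                tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centres[j]
                for j, cl in enumerate(clusters)
            ]
            if new == centres:
                break
            centres = new
        wcss = sum(min(_sqdist(p, c) for c in centres) for p in points)
        best = min(best, wcss)
    return best


def scree(points, ks, **kwargs):
    """Within-cluster sum of squares for each candidate number of clusters."""
    return {k: kmeans_wcss(points, k, **kwargs) for k in ks}
```

Plotting the returned values against the number of clusters gives the scree plot; the number of clusters is then chosen by inspection, typically where the curve flattens.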
The cell types were also clustered according to their interaction-based gene specificity scores across genes. Hierarchical clustering was applied with Euclidean distance and complete linkage (see dendrogram in Figure 4B).
Calculation of expression-based specificity scores. For each of the 7,004 genes, expression-based specificity scores were calculated for each cell type based on BLUEPRINT expression data processed as previously described (Chen et al., 2014). The scores were calculated in a similar way as the interaction-based scores above, replacing processed CHiCAGO scores with asinh-transformed expression data (distance weights were also based on expression data).
eQTL analysis
Preprocessed, publicly available eQTL datasets (Fairfax et al., 2012) and preprocessed PCHi-C datasets were merged for matching cell types. eQTL analysis was performed using PCHi-C data to select SNPs for testing. Associations between expression data and pre-selected SNPs were tested using LIMIX (Lippert et al., 2014). p-values were corrected using Benjamini-Hochberg correction over all tested SNPs of all genes, and the FDR threshold was set to 10% for all analyses. To create randomised interactions maintaining the distribution of PIRs, real interaction profiles were randomly associated to baits in 1000 iterations. Strands of genes were taken into consideration. Interchromosomal interactions and interactions reaching further than 1 Mb were discarded. As distance to the TSS is known to have a strong effect on association strength of SNPs, all enrichment analyses were performed in a binned manner, partitioning the selection of tested SNPs by their distance to the TSS. Multiple testing correction was performed for each bin individually.
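The Benjamini-Hochberg correction can be sketched in Python as follows. This is the standard step-up adjustment and is given for illustration only; the function name is an assumption and the association testing itself (performed with LIMIX) is not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A SNP-gene association would then be retained at the 10% FDR threshold when its adjusted value is below 0.1.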
GWAS Summary statistics
Blood trait summary data (Gieger et al., 2011; van der Harst et al., 2012) were kindly provided by N. Soranzo; autoimmune disease summary data were retrieved from ImmunoBase (http://www.immunobase.org) (Anderson et al., 2011; Barrett et al., 2009; Bentham et al., 2015; Cordell et al., 2015; Dubois et al., 2010; Franke et al., 2010; International Multiple Sclerosis Genetics Consortium et al., 2011; Stahl et al., 2010); the remaining GWAS summary data were retrieved from various internet resources (Estrada et al., 2012; International Consortium for Blood Pressure Genome-Wide Association Studies et al., 2011; Locke et al., 2015; Manning et al., 2012; Morris et al., 2012; Teslovich et al., 2010; Wood et al., 2014). Where necessary, liftOver or in-house scripts were used to convert to GRCh37 coordinates. In order to remove SNPs with spuriously strong association statistics, SNPs with P < 5 × 10⁻⁸ were removed if there were no SNPs with P < 10⁻⁵ in LD with them (r² > 0.6, using the 1000 genomes EUR cohort as a reference genotype panel (1000 Genomes Project Consortium et al., 2015)) or within 50 kb.
Poor man's imputation (PMI)
A pipeline was developed that approximates the p-value for missing SNP summary statistics for a given study using a suitable reference genotype set. Firstly, the genome was split into regions based on a recombination frequency of 0.1 cM using HapMap recombination rate data (International HapMap Consortium et al., 2007). For each region, all SNPs with MAF > 1% were retrieved from the reference genotype set (1000 genomes EUR cohort (1000 Genomes Project Consortium et al., 2015)) and used to compute pairwise LD. Each SNP from the reference set for which a p-value was unavailable was paired with the SNP from the summary statistics set, for which a p-value was present, that showed the maximum pairwise r² (r²max); if r²max > 0.6, the missing p-value was imputed as that at the paired SNP. SNPs with missing data without a pair above this r²max threshold were discarded, as were SNPs that were included in the study but did not map to the reference genotype set. Wakefield's synthesis (Wakefield, 2009) was used to compute approximate Bayes factors and thus posterior probabilities for each SNP within a region being causal assuming a single causal variant (Wellcome Trust Case Control Consortium et al., 2012). The MHC region (GRCh37:chr6:25-35Mb) was masked from all downstream analysis due to its extended LD and known strong and complex association with autoimmune diseases. An implementation of this pipeline and further documentation is available in the accompanying code repository.
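The Bayes factor and posterior probability step can be illustrated with a minimal Python sketch. It uses the standard form of Wakefield's approximate Bayes factor and a simple single-causal-variant normalisation with equal priors that ignores the null model; the function names, the inputs (a Z-score and the variance V of the effect estimate per SNP) and the default prior variance W = 0.04 are assumptions and may differ from the pipeline's own settings.

```python
import math


def log_abf(z, v, w=0.04):
    """Log approximate Bayes factor in favour of association (Wakefield, 2009).

    z: Z-score of the SNP; v: variance of the effect estimate;
    w: prior variance of the effect size (assumed value).
    ABF = sqrt(v / (v + w)) * exp(z^2 / 2 * w / (v + w)).
    """
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def posterior_probabilities(log_abfs):
    """Posterior probability that each SNP in a region is the single causal variant."""
    top = max(log_abfs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in log_abfs]
    total = sum(weights)
    return [weight / total for weight in weights]
```

Within a region the posterior probabilities sum to one, so a SNP with a much larger Bayes factor than its neighbours takes most of the probability mass.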
GWAS tissue set enrichment analysis of PCHi-C
A method, BLOCKSHIFTER, was developed based on ideas implemented in GOSHIFTER (Trynka et al., 2015) to examine the enrichment of GWAS signals at PIRs while overcoming linkage disequilibrium (LD) and interaction fragment correlation. BLOCKSHIFTER implements a competitive test of enrichment between a test set of PIRs and a control set. Firstly, the coordinates of the PIRs in the union of the test and control sets are retrieved, and PIRs with no GWAS signal overlap, or that are found in both the test and control sets, are discarded. For the remaining PIRs, the number and sum of overlapping GWAS posterior probabilities are stored and used to compute δ, the difference in the means between the test and control sets. Due to spatial correlation between GWAS signals and between PIRs, the variance of δ is inflated; it is therefore computed empirically using permutation. Runs of one or more PIRs (separated by at most one HindIII fragment) are combined into 'blocks', which are labeled unmixed (either test or control PIRs) or mixed (block contains both test and control PIRs). Unmixed blocks are permuted in a standard fashion by reassigning either test or control labels randomly, taking into account the number of blocks in the observed sets. Mixed blocks are permuted by conceptually circularising each block and rotating the labels. Each of these precomputed block permutations is randomly sampled n times so that the proportion of underlying PIR labels is the same as in the observed set, and this is used to compute the set of δnull. The set of δnull is then used to compute an empirical Z-score:
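$$Z = \frac{\delta - \bar{\delta}_{\mathrm{null}}}{\sqrt{\operatorname{Var}(\delta_{\mathrm{null}})}}$$
where the bar denotes the mean over the permuted values.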
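Two pieces of this procedure, the rotation of labels within a circularised mixed block and the empirical Z-score, can be sketched in Python as follows. This is not the BLOCKSHIFTER implementation; the function names and the list-based data layout are illustrative assumptions, and block construction and the sampling of permutations are omitted.

```python
from statistics import mean, stdev


def rotate_labels(labels, shift):
    """Circularise a mixed block and rotate its test/control labels by `shift`."""
    shift %= len(labels)
    return labels[-shift:] + labels[:-shift] if shift else list(labels)


def delta(values, labels):
    """Difference in mean GWAS posterior probability: test minus control."""
    test = [v for v, lab in zip(values, labels) if lab == "test"]
    control = [v for v, lab in zip(values, labels) if lab == "control"]
    return mean(test) - mean(control)


def empirical_z(observed_delta, null_deltas):
    """Z-score of the observed delta against the permutation distribution."""
    return (observed_delta - mean(null_deltas)) / stdev(null_deltas)
```

Rotating rather than shuffling the labels keeps neighbouring PIRs together, so the local correlation structure within a block is preserved in each permutation.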
An implementation of BLOCKSHIFTER and further documentation is available from https://github.com/ollyburren/CHIGP.
Integration of GWAS summary statistics with tissue specific PCHi-C and functional information
In order to prioritise genes, traits and tissues for further study, an algorithm was developed to compute tissue specific gene scores for each GWAS trait, taking into account linkage disequilibrium, interactions and functional SNP annotation. For each gene annotation that has at least one significant interaction, and for each recombination block (used by PMI (see above) to compute trait posterior probabilities), a block gene score is computed that is composed of the contributions of three components: (1) coding SNPs in the annotated gene as computed by VEP (McLaren et al., 2010), (2) promoter SNPs, which are defined as SNPs that overlap a region encompassing the bait and flanking HindIII fragments and not any coding SNPs, (3) SNPs that overlap PIRs for a tissue or set of tissues that do not overlap coding SNPs. Thus, for a given target gene, recombination block and trait, a block "genescore" can be derived that is the sum of the posterior probabilities (as computed by PMI) of SNPs overlapping each component. If independence is assumed, blocks can be combined to get an overall "genescore" such that:
$$\mathrm{genescore} = 1 - \prod_{\mathrm{block}} \left(1 - \mathrm{genescore}_{\mathrm{block}}\right).$$
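The block score and its combination across blocks can be sketched in Python as follows. The function names and the list inputs are assumptions for illustration; the SNP annotation and overlap steps that produce the posterior probabilities are not shown.

```python
def block_genescore(coding_pp, promoter_pp, pir_pp):
    """Block gene score: sum of SNP posterior probabilities over the three components."""
    return sum(coding_pp) + sum(promoter_pp) + sum(pir_pp)


def combine_block_scores(block_scores):
    """Overall gene score, assuming blocks are independent: 1 - prod(1 - score)."""
    remaining = 1.0
    for score in block_scores:
        remaining *= 1.0 - score
    return 1.0 - remaining
```

For example, two blocks each with a block score of 0.5 give an overall gene score of 1 - 0.5 x 0.5 = 0.75.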
An implementation of the gene prioritization algorithm and further documentation is available from https://github.com/ollyburren/CHIGP.
Core autoimmune network
For each of the eight analysed autoimmune traits (CD, CEL, RA, UC, PBC, SLE, MS, T1D) top-scoring genes were selected based on the following criteria: genescore > 0.5, no more than top 75 genes per condition. The resulting 421 genes were combined into a single list, and disease associations were assigned to each gene based on the respective genescore > 0.5. This gene list was used as input to the GeneMania 3.4.0 plugin (Montojo et al., 2010) for Cytoscape 3.3.0 (Cline et al., 2007) to construct a network based on prior knowledge about these 421 genes (shown in Figure 6E). The following information was used for linking gene pairs: physical interaction (all sources in the plugin), co-localization (the "Satoh-Yamamoto-2013" dataset only), predicted interaction (I2D-based datasets only), shared pathway annotation. Only the 421 network genes were plotted ("find 0 related genes") and query-gene-based weights were used.
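The gene selection criteria can be sketched in Python as follows. The function names, the dictionary layout and the gene identifiers in any usage are hypothetical; the network construction in GeneMania and Cytoscape is not shown.

```python
def top_genes(genescores, threshold=0.5, max_genes=75):
    """Genes for one trait with genescore above the threshold, best first, capped."""
    passing = [(gene, s) for gene, s in genescores.items() if s > threshold]
    passing.sort(key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in passing[:max_genes]]


def combined_gene_list(scores_by_trait, threshold=0.5, max_genes=75):
    """Union of the per-trait selections, with the traits that selected each gene."""
    combined = {}
    for trait, genescores in scores_by_trait.items():
        for gene in top_genes(genescores, threshold, max_genes):
            combined.setdefault(gene, []).append(trait)
    return combined
```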
Reactome Pathway analysis
For each trait, all protein-coding genes having an overall gene score above 0.5 were selected. We converted Ensembl gene identifiers to Entrez identifiers using biomaRt and used ReactomePA to compute enrichment of genes in Reactome pathways using an FDR cutoff of 0.05. We plotted a bubble plot of significant results for each trait using the R package clusterProfiler (Yu et al., 2012). All R code to conduct this analysis is available in the accompanying code repository.
RESULTS
Example 1: Promoter Capture Hi-C
A minimum of three biological replicate PCHi-C experiments were performed, using in-nucleus ligation (Nagano et al., 2015) and capturing 22,076 fragments containing 31,253 annotated promoters in 17 human primary blood cell types. This produced over 11 billion unique, valid read-pairs involving promoters (Table 17), which is analogous in promoter interaction detection power to over 165 billion conventional Hi-C read-pairs (Schoenfelder et al., 2015a).
Table 17: Summary of PCHi-C datasets generated in this study.
Figure imgf000089_0001
Naive CD8+ T cells nCD8 3 747,834,572 187,399
Total CD8+ T cells tCD8 3 628,771,947 183,964
Total 11,299,489,740 708,007²
¹Total numbers of valid read pairs across all biological replicates are listed. See Table S1 for replicate-level statistics.
²Unique interactions detected in at least one cell type.
The CHiCAGO pipeline (Cairns et al., 2016) was used to call statistically significant interactions with the captured promoters (Figures 1A-C), detecting approximately 165,000 interactions per cell type (Figure 1D; Tables 1 and S1), with a median of 4 interactions per promoter per cell type. Abundant examples of tissue-specific interactions were found, such as those for INPP4B, RHAG and ZEB2AS in naive CD4 T cells (nCD4), erythroid progenitors (Ery) and monocytes (Mon), respectively (Figure 1C). Numerous tissue-invariant interactions were also found, such as in the case of the ubiquitously expressed ALAD gene, which encodes a heme biosynthetic enzyme. Interestingly, ALAD is specifically upregulated in erythroid progenitors, where additional erythroid-specific interactions are evident (Figure 1C). In total, 708,007 unique, significant interactions were detected across all cell types, of which 9.6% were promoter-to-promoter interactions and 90.4% promoter-to-PIR, with a median linear distance between promoters and their interacting regions of 328 kb. Approximately 20% of interactions were between fragments greater than 1 Mb apart and 5,023 mapped across chromosomes ('trans-interactions'). In total, 253,148 unique interacting fragments were detected, including 21,102 captured fragments and 232,046 PIRs (Figure 1D). Sixteen of the Hi-C libraries from 8 cell types were also sequenced prior to capture, and topologically associated domains (TADs) were identified using the directionality index score (Dixon et al., 2012) (Figure 1B). It was found that the incidence of PCHi-C-identified long-range interactions crossing TAD boundaries is significantly below that expected at random in all 8 cell types (Figures 1E and 7A), consistent with previous results (Schoenfelder et al., 2015a).
However, abundant examples of strong interactions crossing one or multiple boundaries were evident (Figures 1B and 7A), including cases where all interactions of a given promoter were located outside its respective TAD (Figure 7A).
Approximately 1000 long-range interactions were chosen for validation using a reciprocal capture system, using the PIRs of these interactions as capture baits in 8 Hi-C libraries from 4 cell types (Figure 8A). The signals in PCHi-C and reciprocal capture Hi-C were found to be highly consistent (Figure 8 and Materials and Methods), validating the present approach.
Example 2: Promoter interactomes are lineage- and cell-type specific
Principal component analysis (PCA) of all PCHi-C biological replicates for the 17 cell types (at least three replicates per cell type, Table 17) revealed close clustering of the replicates and separation of the individual cell types (Figure 2A).
This demonstrates signal reproducibility across replicates and suggests strong cell-type specificity of the interactomes. It was noted that neutrophils showed a distinct PCA profile, potentially reflecting their unusual segmented nuclear morphology. Hierarchical clustering of the 17 cell types based on their patterns of promoter interactions confirmed cell-type specificity and demonstrated consistency with the haematopoietic lineage tree (Figure 2B, top). Distinct lymphoid and myeloid branches were evident, as well as close clustering among the subgroups of cell types with common progenitors. For example, CD4+ and CD8+ T cell interactomes cluster closely, reflecting their close lineage relationship, as do the M0, M1 and M2 macrophage interactomes (Figures 2A and 2B). The cell-type specificity and lineage relationships of the interactomes were further confirmed globally using conventional Hi-C at the level of large-scale A/B nuclear compartments (Figures 7B-7D).
Autoclass Bayesian clustering (Cheeseman et al., 1988) was then used to partition interactions based on their tissue specificity. Autoclass jointly optimises the number of clusters and their membership, which resulted in a total of 34 interaction clusters (Figure 2B, heatmap). Cluster "specificity scores" were devised to quantify the distribution of interaction signals in each cluster across samples, accounting for the lineage tree (Figure 2C, see Materials and Methods for details), and used to define sets of tissue-specific clusters. Just under half (47.4%) of interactions mapped to predominantly lymphoid-specific clusters (1-15, 25, 26). Examples of genes whose promoter interactions predominantly map to this set of clusters include T-cell receptor components (CD247, CD3D and CD3G), as well as IKZF3 coding for the AIOLOS protein that has a key role in lymphoid development (Thompson et al., 2007). 38.9% of the interactions mapped to mainly myeloid-specific clusters (27-34). Promoters with predominant interactions in this set of clusters include, for example, DIP2C (Disco-Interacting Protein 2 Homolog C) that shows high expression in acute myeloid leukemia (Cancer Genome Atlas Research Network et al., 2013). Of the interactions mapping to the myeloid branch, clusters 16-18 contained predominantly monocyte- and neutrophil-specific interactions (9.4% total), while in the lymphoid branch interactions in clusters 25 and 26 were restricted to B cells (3.1%). Finally, clusters 19-24, containing 13.6% of interactions, showed strong signals in both lineages.
It was found that 60.4% of captured fragments (12,579) had at least one interaction in both myeloid and lymphoid lineages. However, approximately 99% (12,505) of them had additional lineage- or cell-type-specific interactions (Figure 8A). On the whole, interactions sharing the same promoter tended to have more similar tissue specificities than expected at random (Figure 8B). This suggests a complex and potentially cooperative effect of tissue-specific and invariant interactions in setting up genome organisation and expression.
Collectively, the tissue-specificity and lineage-relatedness of promoter interactomes suggests that higher-order genome structure undergoes widespread and coordinated remodeling during lineage specification, dynamically reshaping transcriptional decisions.
Example 3: Promoters preferentially connect to active enhancers
To investigate the regulatory potential of the interactions, the chromatin properties of PIRs were assessed using data from the BLUEPRINT project (Adams et al., 2012) for 9 matching blood cell types for which sufficient information was available. PIRs were found to be significantly enriched for the histone marks associated with active enhancers, such as H3K27ac and particularly H3K4me1, in comparison with distance-matched random controls (Figures 3A and 3B). Enrichment for H3K4me3 and H3K36me3 was also found at PIRs, which are marks associated with promoters and transcribed regions, respectively, consistent with non-coding transcription of regulatory regions (Natoli and Andrau, 2012). Conversely, PIRs were selectively depleted of the H3K27me3 and H3K9me3 marks associated, respectively, with Polycomb repression and constitutive heterochromatin (Figures 3A and 3B). Notably, this lack of H3K27me3 enrichment at PIRs in human blood cells is in contrast to that observed in human and mouse ES cells (Schoenfelder et al., 2015a, 2015b; Freire-Pritchett et al., in preparation).
Regions annotated as promoters and enhancers in the Ensembl Regulatory Build were then focussed upon, defining their activity on the basis of ChromHMM (Ernst and Kellis, 2012) segmentations of the BLUEPRINT histone ChIP data (see Materials and Methods for details). It was asked whether the tissue-specific activity state of enhancers correlated with their connectivity to promoters or, alternatively, whether enhancer-promoter interactions tended to be primed irrespective of activity (Ghavi-Helm et al., 2014). Consistent with previous findings in the β-globin locus (Tolhuis et al., 2002), Figures 3C and 9C show that interactions between the Locus Control Region (LCR) enhancers and the HBB and HBG genes occur in erythroblasts, in which they are active (Forrester et al., 1990), but not in monocytes or CD4+ T cells. This activity-state dependent connectivity of enhancers with promoters was observed globally (Figure 3D), and formally confirmed using overdispersion-adjusted statistical tests (Figures 3E-F, see Materials and Methods for details). These results demonstrate that the dynamic nature of enhancer-promoter interactions is largely coupled with the tissue-specific activity of the regulatory elements they connect.
Example 4: Enhancer activity associates with lineage-specific gene expression
To gain insight into the role of promoter contacts in regulating lineage-specific gene expression, information on chromatin states at promoters and enhancers was integrated with global transcriptional profiles in the same cells available from the BLUEPRINT consortium. Comparing gene expression across tissues, it was observed that promoter interactions with active enhancers generally had an additive effect on tissue-specific expression levels (β=0.11 mean-centred log2-expression units, p < 2 × 10⁻¹⁶; Figure 4A). Notably, a weaker but also significant additive effect was observed when all PIRs, irrespective of their annotation, were considered for the analysis (β=0.003, p < 2 × 10⁻¹⁶; Figure 10A), with the fraction of active enhancers among them providing an independent predictor (β=0.004, p < 2 × 10⁻¹⁶, data not shown). These results confirm that active enhancers, and potentially other elements devoid of canonical enhancer features, quantitatively contribute to gene expression.
It was then sought to partition genes based on the tissue-specificity of their interactions with active enhancers. For each gene, CHiCAGO interaction scores and enhancer activity states were used to calculate a "gene specificity score" for each cell type (see Materials and Methods for further details). A large gene specificity score for a given cell type indicates that the gene's promoter-enhancer interactions are predominantly specific to that cell type, while a large negative score indicates that interactions are not present in the given cell type, but are present in many others. Small specificity values across all cell types indicate no predominant specificity. Applying k-means clustering to the resulting gene specificity scores, the 12 clusters shown in Figure 4B were obtained. This revealed clusters of genes with predominant activity in one cell type (e.g., Ery, cluster 4; MK, cluster 5; Neu, cluster 6; nCD4, cluster 8) or multiple related cell types (such as different types of Μφ, cluster 2).
The gene specificity scores based on interactions with active enhancers were compared with analogous scores that capture the tissue-specificity of the respective genes' expression (see Materials and Methods). As can be seen in Figures 4C and 10B, genes mapping to a tissue-specific cluster based on their interactions with active enhancers were preferentially expressed in the same tissue. The link between tissue-specificity of enhancer interactions and gene expression was most apparent when focusing on the 100 most tissue-specifically expressed genes in each cell type. For example, 46% of the top 100 nCD4-specifically expressed genes mapped to the nCD4-specific cluster 8 based on their enhancer activity, while a further 37% showed enhancer activities in both nCD4 cells and other tissues (Figure 4D). Overall, clusters characterised by predominant enhancer activity in a given tissue were the most enriched for the 100 genes expressed with highest specificity for that tissue (Figure 4E). Taken together, these results support a direct functional role of enhancer-promoter interactions in transcriptional control.
Example 5: PIRs are enriched for expression quantitative trait loci
To gain insight into the functional role of promoter interactions in the regulation of gene expression, evidence was used from population genetics. Expression quantitative trait loci (eQTLs) are genetic variants such as single nucleotide polymorphisms (SNPs) that affect the expression level of specific target genes. While many eQTL SNPs map in proximity to the promoters they influence, others map considerable distances (up to megabases) away from them (Albert and Kruglyak, 2015), either affecting gene expression indirectly ('trans-eQTLs') or potentially localising to the target genes' long-range regulatory regions. Publicly available eQTL data from monocytes and B cells (Fairfax et al., 2012) was used to assess the enrichment of eQTLs at PIRs connected to the eQTL-associated genes. Specifically, it was asked what fraction of SNPs mapping to PIRs are eQTLs (association q-value<0.1) for the PCHi-C-identified target gene. As controls, SNPs mapping to "randomised PIRs" generated by jointly permuting the gene labels of all PIRs mapping to the same gene were considered, thus accounting for interdependencies in PIR locations (see Materials and Methods for details). As can be seen in Fig. 5A-B, selective enrichment of eQTLs at PIRs versus random regions was observed across a broad range of distances in both analysed cell types. This result was also confirmed at the gene level, considering only a single lead eQTL per gene (Fig. 11A-B).
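The comparison against "randomised PIRs" can be sketched as a label-permutation test. The Python snippet below is a minimal illustration under stated assumptions: the input layout (a mapping from each gene to the SNPs lying in its PIRs, plus a set of eQTL SNP-gene pairs), the function names and the toy data are hypothetical, and distance matching and other details of the procedure in Materials and Methods are not reproduced. Gene labels are shuffled so that all PIRs of one gene are reassigned together, as described above.

```python
import random


def eqtl_fraction(pir_snps_by_gene, eqtl_pairs):
    """Fraction of (SNP, gene) pairs linked through a PIR that are eQTLs."""
    pairs = [(snp, gene) for gene, snps in pir_snps_by_gene.items() for snp in snps]
    if not pairs:
        return 0.0
    return sum(pair in eqtl_pairs for pair in pairs) / len(pairs)


def permute_gene_labels(pir_snps_by_gene, rng):
    """Reassign gene labels so that all PIRs of one gene move together."""
    genes = list(pir_snps_by_gene)
    shuffled = genes[:]
    rng.shuffle(shuffled)
    return {new: pir_snps_by_gene[old] for old, new in zip(genes, shuffled)}


def eqtl_enrichment(pir_snps_by_gene, eqtl_pairs, n_perm=1000, seed=0):
    """Observed eQTL fraction, mean permuted fraction and empirical p-value."""
    rng = random.Random(seed)
    observed = eqtl_fraction(pir_snps_by_gene, eqtl_pairs)
    null = [
        eqtl_fraction(permute_gene_labels(pir_snps_by_gene, rng), eqtl_pairs)
        for _ in range(n_perm)
    ]
    n_extreme = sum(f >= observed for f in null)
    p_value = (1 + n_extreme) / (n_perm + 1)
    return observed, sum(null) / n_perm, p_value


# Hypothetical toy input: SNPs located in the PIRs of each gene, and the set
# of (SNP, gene) pairs called as eQTLs (for example at q-value < 0.1).
toy_pirs = {
    "G1": ["s1", "s2"],
    "G2": ["s3", "s4"],
    "G3": ["s5", "s6"],
    "G4": ["s7", "s8"],
    "G5": ["s9", "s10"],
}
toy_eqtls = {("s1", "G1"), ("s3", "G2"), ("s5", "G3")}

observed, null_mean, p_value = eqtl_enrichment(toy_pirs, toy_eqtls)
print(observed, null_mean, p_value)
```

Permuting whole genes rather than individual PIRs keeps the PIRs of one promoter together, which is one simple way to respect the interdependencies in PIR locations mentioned above.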
Notable examples of eQTLs in PIRs included those with regulatory effects on more than a single gene. For instance, eQTL SNP rs71636780 localises to a PIR of two genes, ARID1A and ZDHHC18 in monocytes (located within 50kb and 100kb from them, respectively), with its variants showing an opposite effect on these genes' expression (Fig 5C). A similar effect is observed in B cells for eQTL SNP rs117561058 within a PIR of NDUFAF4 and ZBTB2 that, strikingly, is located ~10Mb and ~60Mb from these genes, respectively (Fig 5D). Further examples of long-range PIRs harbouring eQTLs are shown in Fig 11C-D. Taken together, these results support a functional role of promoter interactions spanning a broad range of distances in gene regulation.
Example 6: Promoter interactions reveal putative targets of disease-associated SNPs
The majority of common single-nucleotide polymorphisms associated with human phenotypes fall outside of protein coding and promoter regions, but are enriched at tissue specific regulatory elements (Farh et al., 2014; Maurano et al., 2012). The phenotypic effects of non-coding polymorphisms may therefore be mediated by DNA looping interactions between the regulatory elements harbouring the polymorphism and specific gene promoters in relevant tissues (Claussnitzer et al., 2015; Davison et al., 2011; Smemo et al., 2014).
To examine whether there was evidence for tissue-specific enrichment of single nucleotide polymorphisms (SNPs) associated with complex traits within PIRs, summary statistics were assembled from 31 genome-wide association studies (GWAS), including 8 autoimmune diseases, 8 blood cell traits, 9 metabolic and 6 other traits. BLOCKSHIFTER was devised, a method that compares enrichment between sets of tissues taking into account correlation structure in both GWAS and PIR datasets (see Materials and Methods for details). Using this method, it was found that variants associated with autoimmune disease are enriched at PIRs in lymphoid as compared to myeloid cells (Figure 6A). This enrichment is strongest in activated CD4+ T cells as compared to endothelial cells (Figure 6B), which is further explored in a companion study focusing on CD4+ T cell activation (Burren et al., submitted). In contrast, SNPs associated with platelet- and red blood cell-specific traits were predominantly enriched at PIRs in myeloid lineages (Figure 6A). In particular, SNPs associated with erythroid traits, including mean haemoglobin concentration (MCH), mean corpuscular volume (MCV) and red blood cell count (RBC) showed a selective enrichment at PIRs in erythroblasts and megakaryocytes compared to PIRs identified in monocytes, macrophages and neutrophils (Figure 6A). Finally, SNPs associated with traits generally unrelated to hematopoietic cells, such as blood pressure (systolic, BP S and diastolic, BP D) and bone mineral density (in femoral neck, FNMD and lumbar spine, LSMD) were not selectively enriched at PIRs in any analysed cell types (Figure 6B).
A Bayesian prioritisation strategy was next developed to use PIRs to rank putative disease-associated genes and tissues across the 31 GWAS traits. This algorithm integrates statistical fine mapping of GWAS signals across SNPs mapping to gene coding regions, promoters and PIRs to provide a single measure of support for each gene (see Materials and Methods for details). An example of the prioritisation algorithm at work in the 1p13.1 rheumatoid arthritis (RA) susceptibility region is shown in Figure 6C. In a given linkage disequilibrium (LD) block, the GWAS summary p-values at each detected or imputed SNP (Figure 6C, top panel (Okada et al., 2012)) are transformed into approximate Bayes factors (Wakefield, 2009) and then posterior probabilities for the variant being causal, assuming a single causal variant in the region (Figure 6C, middle panel). For each gene, the posterior probabilities are then summed over SNPs overlapping the PIRs, promoter and coding region to compute gene-level scores. In the 1p13.1 susceptibility region, this strategy prioritises RP4-753F5.1, CD101, TTF2 and TRIM45 as RA candidate genes (Figure 6C, bottom panel), consistent with CD101's previously reported possible role in RA (Jovanovic et al., 2011).
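The three steps described above (approximate Bayes factors, posterior probabilities under a single causal variant, and gene-level sums) can be sketched as follows. This is a minimal Python illustration under stated assumptions and not the implementation used herein: z-scores and effect-estimate variances are taken as given rather than derived from p-values, the prior effect variance of 0.04 is an assumed value, every SNP in the block receives equal prior weight, the hypothesis of no causal variant in the block is omitted, and all gene and variable names are hypothetical.

```python
import math


def log_abf(z, v, w=0.04):
    """Log approximate Bayes factor in favour of association.

    z: z-score of the SNP; v: variance of the effect estimate;
    w: prior variance of the true effect (0.04 is an assumed value).
    """
    r = w / (v + w)
    # BF = sqrt(v / (v + w)) * exp(r * z^2 / 2), computed on the log scale.
    return 0.5 * math.log(1.0 - r) + 0.5 * r * z * z


def posterior_probabilities(log_bfs):
    """Posterior probability of each SNP being causal, assuming exactly one
    causal variant in the LD block and equal prior weight for every SNP."""
    m = max(log_bfs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in log_bfs]
    total = sum(weights)
    return [wt / total for wt in weights]


def gene_scores(posteriors, gene_to_snps):
    """Sum posteriors over the SNPs overlapping each gene's PIRs, promoter
    and coding region; sets of SNP indices avoid double counting."""
    return {
        gene: sum(posteriors[i] for i in snp_idx)
        for gene, snp_idx in gene_to_snps.items()
    }


# Hypothetical LD block with four SNPs.
z_scores = [5.2, 4.8, 1.0, 0.3]
variances = [0.01, 0.01, 0.01, 0.01]
log_bfs = [log_abf(z, v) for z, v in zip(z_scores, variances)]
posteriors = posterior_probabilities(log_bfs)

# Hypothetical mapping from genes to the indices of overlapping SNPs.
overlaps = {"GENE_A": {0, 1}, "GENE_B": {1, 2}, "GENE_C": {3}}
scores = gene_scores(posteriors, overlaps)
prioritised = [gene for gene, score in scores.items() if score > 0.5]
print(scores, prioritised)
```

Because one SNP may overlap the PIRs of several genes, gene-level scores within a block need not sum to one; each score is read on its own as the posterior support for that gene.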
Using this algorithm genome-wide for 31 diseases and blood cell traits, a total of 2,604 candidate genes (gene-level score > 0.5) were prioritised. The prioritised genes exhibited both expected and unexpected enrichments for specific pathways in the Reactome Pathway Database (Fabregat et al., 2016). In particular, genes prioritised for autoimmune diseases were enriched in inflammation and immune response-related pathways, such as interleukin and T cell receptor signaling, whereas genes prioritised for platelet traits were preferentially associated with platelet production and hemostasis (Figure 6D). A number of these trait associations are investigated and validated in detail in companion studies focusing on T cells and MK cells (Burren et al., submitted; Frontini et al., in preparation). Unexpected pathway associations were also identified, such as, but not limited to, free oxygen species metabolism in celiac disease, and post-translational and epigenetic modifications of proteins and nucleic acids in the red blood cell traits, inviting further in-depth validation by specialist communities (Figure 6D).
A subset of the top 421 genes prioritised for at least one autoimmune disease (see Materials and Methods for details) was examined further. Taking into account known protein-protein interactions and pathway co-localisation of their products, a consolidated "autoimmune disease network" (Figure 6E and Methods) was constructed. The highly-connected core of this network (Figure 6E, inset) includes cytokine genes such as IL19 and IL24, signalling and transcription factors controlling proliferation, inflammation and lineage identity (such as MYC, JAK, ETS, CDKN1B, NFKB, FOXO1 and IKZF). According to ImmunoBase (http://www.immunobase.org), the majority (76%) of the genes in the core autoimmune disease network were not previously implicated as causal candidates for autoimmune diseases, and 65% fall outside of known disease susceptibility regions (DSLs). The number of newly identified candidates increased to 81% (with 67% falling outside known DSLs), when considering all genes prioritised for autoimmune diseases (gene-level score>0.5).
The RA and systemic lupus erythematosus (SLE) GWAS datasets (Bentham et al., 2015; Okada et al., 2012), for which imputed results are publicly available, were used to ask if the GWAS signals that drove candidate gene prioritisation are supported by eQTLs in the respective LD blocks. Genome-wide, this analysis revealed that out of 456 genes prioritized for these two diseases, 136 had eQTLs, of which three genes (RASGRP1, SUOX, and GIN1) showed evidence for possible co-localization in RA and two genes (BLK and SLC15A4) in SLE (see Figure 12 for examples). In addition, the genes prioritised for RA included 5/9 candidates (C8orf13, BLK, TRAF1, FADS2 and SYNGR1) that were identified in a recent study (Zhu et al., 2016) combining whole-blood eQTL with RA GWAS data by Mendelian randomisation. The relatively large number of prioritized genes without eQTL support is in agreement with previous reports of limited overlap of disease variants with eQTLs (Guo et al., 2015; Huang et al., 2015). This demonstrates complementary benefits of eQTL-based and physical interaction-based prioritization approaches for identifying candidate target genes of non-coding disease variants.
Taken together, the results herein reveal large numbers of newly identified potential disease genes and pathways, and demonstrate the power of high-resolution 3D promoter interactomes for large-scale interpretation of GWAS data.
DISCUSSION
A comprehensive analysis of promoter genome architecture in primary cells by using Promoter-Capture Hi-C is presented herein to identify distal sequences interacting with 31,253 known promoters in 17 primary human haematopoietic cell types. It is shown that promoter interactomes are highly cell-type specific, preferentially connect active promoters with active enhancers, and reflect the lineage relationships between blood cell types of the haematopoietic tree. Collectively, these results suggest that three-dimensional genome architecture undergoes stepwise remodeling during lineage specification. This process likely complements the well-characterised developmental chromatin remodelling at the level of individual gene regulatory elements (Calo and Wysocka, 2013).
Theoretically, enhancer-promoter contacts can be either "instructive" (triggering transcriptional activation) or "permissive" (poised for activation). The mechanistically verified model of instructive interactions is provided by loops in the β-globin locus (Bartman et al., 2016; Deng et al., 2012, 2014). The observations in blood cells presented herein provide additional evidence for the "instructive" model. However, it is likely that both mechanisms are operational, particularly in early development. For example, permissive interactions were previously detected for early mesodermal enhancers in Drosophila (Ghavi-Helm et al., 2014) and in mouse and human embryonic stem cells (Schoenfelder et al., 2015a; Freire-Pritchett et al., in preparation), as well as for TNF-α response genes in fibroblasts (Jin et al., 2013). High-resolution interaction information makes it possible to connect genes to their active enhancers. Using this approach, it was observed that enhancers show generally additive effects on the expression of their target genes. This suggests a largely independent enhancer action (Spivakov, 2014) and is consistent with functional evidence from mouse (Bender et al., 2001) and fly models (Arnold et al., 2013). However, deviations from this simple additive model have also been described. For example, very strong enhancers in Drosophila were found to show subadditive behaviour (Bothma et al., 2015). Interestingly, additive effects, albeit weaker, were also observed for PIRs that were not annotated as enhancers. This provides additional support to recent findings that regions without "classic" gene-regulatory signatures may also be involved in activation of gene expression (Rajagopal et al., 2016). The generally additive enhancer action may explain why genes are often able to buffer the effects of deleterious mutations at individual enhancers (Frankel et al., 2010; Kasowski et al., 2013; Waszak et al., 2015). This buffering, in turn, may underlie the fact that many non-coding GWAS SNPs, while enriched at regulatory regions, are not detectable as eQTLs, particularly under normal conditions (Guo et al., 2015; Huang et al., 2015).
Recent studies by the inventors and others have made a strong case for using 3D genome information to interpret non-coding variation (Claussnitzer et al., 2015; Davison et al., 2011; Dryden et al., 2014; Grubert et al., 2015; Jager et al., 2015; Martin et al., 2015; Mifsud et al., 2015; Smemo et al., 2014; Stadhouders et al., 2014). Herein, this has been exploited globally by applying an improved PCHi-C protocol (Nagano et al., 2015) to a broad array of human primary cells at high coverage, coupled with robust interaction signal detection and a formal approach to leverage the full distributions of GWAS summary statistics. This has allowed the inventors to link thousands of GWAS SNPs to their putative target genes, and identify more than 2,500 potential disease-associated genes, of which only about a quarter were previously known. The enclosed work establishes a systematic approach to tackle the problem of interpreting non-coding genetic variation, and creates an unprecedented opportunity to unlock the seemingly intractable promise created by current and future GWAS, to identify disease candidate genes and pathways.
REFERENCES
1000 Genomes Project Consortium, Auton, A., Brooks, L.D., Durbin, R.M., Garrison, E.P., Kang, H.M., Korbel, J.O., Marchini, J.L., McCarthy, S., McVean, G.A., et al. (2015). A global reference for human genetic variation. Nature 526, 68-74.
Adams, D., Altucci, L., Antonarakis, S.E., Ballesteros, J., Beck, S., Bird, A., Bock, C., Boehm, B., Campo, E., Caricasole, A., et al. (2012). BLUEPRINT to decode the epigenetic signature written in blood. Nat. Biotechnol. 30, 224-226.
Akhtar, W., Waseem, A., de Jong, J., Pindyurin, A.V., Ludo, P., Wouter, M., de Ridder, J., Anton, B., Wessels, L.F.A., van Lohuizen, M., et al. (2013). Chromatin Position Effects Assayed by Thousands of Reporters Integrated in Parallel. Cell 154, 914-927.
Albert, F.W., and Kruglyak, L. (2015). The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197-212.
Anderson, C.A., Boucher, G., Lees, C.W., Franke, A., D'Amato, M., Taylor, K.D., Lee, J.C., Goyette, P., Imielinski, M., Latiano, A., et al. (2011). Meta-analysis identifies 29 additional ulcerative colitis risk loci, increasing the number of confirmed associations to 47. Nat. Genet. 43, 246-252.
Arnold, C.D., Gerlach, D., Stelzer, C., Boryń, L.M., Rath, M., and Stark, A. (2013). Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074-1077.
Ay, F., Bunnik, E.M., Varoquaux, N., Bol, S.M., Prudhomme, J., Vert, J.-P., Noble, W.S., and Le Roch, K.G. (2014). Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression. Genome Res. 24, 974-988.
Barrett, J.C., Clayton, D.G., Concannon, P., Akolkar, B., Cooper, J.D., Erlich, H.A., Julier, C., Morahan, G., Nerup, J., Nierras, C., et al. (2009). Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat. Genet. 41, 703-707.
Bartman, C.R., Hsu, S.C., Hsiung, C.C.-S., Raj, A., and Blobel, G.A. (2016). Enhancer Regulation of Transcriptional Bursting Parameters Revealed by Forced Chromatin Looping. Mol. Cell 62, 237-247.
Bender, M.A., Roach, J.N., Halow, J., Close, J., Alami, R., Bouhassira, E.E., Groudine, M., and Fiering, S.N. (2001). Targeted deletion of 5'HS1 and 5'HS4 of the beta-globin locus control region reveals additive activity of the DNaseI hypersensitive sites. Blood 98, 2022-2027.
Bentham, J., Morris, D.L., Cunninghame Graham, D.S., Pinder, C.L., Tombleson, P., Behrens, T.W., Martin, J., Fairfax, B.P., Knight, J.C., Chen, L., et al. (2015). Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat. Genet.
Blangiardo, M., and Richardson, S. (2007). Statistical tools for synthesizing lists of differentially expressed features in related experiments. Genome Biol. 8, R54.
Blangiardo, M., Cassese, A., and Richardson, S. (2010). sdef: an R package to synthesize lists of significant features in related experiments. BMC Bioinformatics 11, 270.
Bothma, J.P., Garcia, H.G., Ng, S., Perry, M.W., Gregor, T., and Levine, M. (2015). Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo. Elife 4.
Cairns, J., Freire-Pritchett, P., Wingett, S.W., Varnai, C., Dimond, A., Plagnol, V., Zerbino, D., Schoenfelder, S., Javierre, B.-M., Osborne, C., et al. (2016). CHiCAGO: Robust Detection of DNA Looping Interactions in Capture Hi-C data. Genome Biol., in press.
Calo, E., and Wysocka, J. (2013). Modification of enhancer chromatin: what, how, and why? Mol. Cell 49, 825-837.
Cancer Genome Atlas Research Network, Weinstein, J.N., Collisson, E.A., Mills, G.B., Shaw, K.R.M., Ozenberger, B.A., Ellrott, K., Shmulevich, I., Sander, C., and Stuart, J.M. (2013). The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113-1120.
Carter, D., Chakalova, L., Osborne, C.S., Dai, Y.-F., and Fraser, P. (2002). Long-range chromatin regulatory interactions in vivo. Nat. Genet. 32, 623-626.
Cheeseman, P., Peter, C., James, K., Matthew, S., John, S., Will, T., and Don, F. (1988). AutoClass: A Bayesian Classification System. In Machine Learning Proceedings 1988, pp. 54-64.
Chen, L., Kostadima, M., Martens, J.H.A., Canu, G., Garcia, S.P., Turro, E., Downes, K., Macaulay, I.C., Bielczyk-Maczynska, E., Coe, S., et al. (2014). Transcriptional diversity during lineage commitment of human blood progenitors. Science 345, 1251033.
Claussnitzer, M., Dankel, S.N., Kim, K.-H., Quon, G., Meuleman, W., Haugen, C., Glunk, V., Sousa, I.S., Beaudry, J.L., Puviindran, V., et al. (2015). FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N. Engl. J. Med. 373, 895-907.
Cline, M.S., Smoot, M., Cerami, E., Kuchinsky, A., Landys, N., Workman, C., Christmas, R., Avila-Campilo, I., Creech, M., Gross, B., et al. (2007). Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc. 2, 2366-2382.
Cordell, H.J., Han, Y., Mells, G.F., Li, Y., Hirschfield, G.M., Greene, C.S., Xie, G., Juran, B.D., Zhu, D., Qian, D.C., et al. (2015). International genome-wide meta-analysis identifies new primary biliary cirrhosis risk loci and targetable pathogenic pathways. Nat. Commun. 6, 8019.
Davison, L.J., Wallace, C., Cooper, J.D., Cope, N.F., Wilson, N.K., Smyth, D.J., Howson, J.M.M., Saleh, N., Al-Jeffery, A., Angus, K.L., et al. (2011). Long-range DNA looping and gene expression analyses identify DEXI as an autoimmune disease candidate gene. Hum. Mol. Genet. 21, 322-333.
Dekker, J., Job, D., Marti-Renom, M.A., and Mirny, L.A. (2013). Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data. Nat. Rev. Genet. 14, 390-403.
Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P.D., Dean, A., and Blobel, G.A. (2012). Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233-1244.
Deng, W., Rupon, J.W., Krivega, I., Breda, L., Motta, I., Jahn, K.S., Reik, A., Gregory, P.D., Rivella, S., Dean, A., et al. (2014). Reactivation of developmentally silenced globin genes by forced chromatin looping. Cell 158, 849-860.
Dixon, J.R., Siddarth, S., Feng, Y., Audrey, K., Yan, L., Yin, S., Ming, H., Liu, J.S., and Bing, R. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-380.
Dryden, N.H., Broome, L.R., Dudbridge, F., Johnson, N., Orr, N., Schoenfelder, S., Nagano, T., Andrews, S., Wingett, S., Kozarewa, I., et al. (2014). Unbiased analysis of potential targets of breast cancer susceptibility loci by Capture Hi-C. Genome Res. 24, 1854-1868.
Duan, Z., Andronescu, M., Schutz, K., McIlwain, S., Kim, Y.J., Lee, C., Shendure, J., Fields, S., Blau, C.A., and Noble, W.S. (2010). A three-dimensional model of the yeast genome. Nature 465, 363-367.
Dubois, P.C.A., Trynka, G., Franke, L., Hunt, K.A., Romanos, J., Curtotti, A., Zhernakova, A., Heap, G.A.R., Adany, R., Aromaa, A., et al. (2010). Multiple common variants for celiac disease influencing immune gene expression. Nat. Genet. 42, 295-302.
Ernst, J., and Kellis, M. (2012). ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215-216.
Estrada, K., Styrkarsdottir, U., Evangelou, E., Hsu, Y.-H., Duncan, E.L., Ntzani, E.E., Oei, L., Albagha, O.M.E., Amin, N., Kemp, J.P., et al. (2012). Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nat. Genet. 44, 491-501.
Fabregat, A., Sidiropoulos, K., Garapati, P., Gillespie, M., Hausmann, K., Haw, R., Jassal, B., Jupe, S., Korninger, F., McKay, S., et al. (2016). The Reactome pathway Knowledgebase. Nucleic Acids Res. 44, D481-D487.
Fairfax, B.P., Makino, S., Radhakrishnan, J., Plant, K., Leslie, S., Dilthey, A., Ellis, P., Langford, C., Vannberg, F.O., and Knight, J.C. (2012). Genetics of gene expression in primary immune cells identifies cell type-specific master regulators and roles of HLA alleles. Nat. Genet. 44, 502-510.
Farh, K.K.-H., Alexander, M., Jiang, Z., Markus, K., Housley, W.J., Samantha, B., Noam, S., Holly, W., Ryan, R.J.H., Shishkin, A.A., et al. (2014). Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337-343.
Forrester, W.C., Epner, E., Driscoll, M.C., Enver, T., Brice, M., Papayannopoulou, T., and Groudine, M. (1990). A deletion of the human beta-globin locus activation region causes a major alteration in chromatin structure and replication across the entire beta-globin locus. Genes Dev. 4, 1637-1649.
Franke, A., McGovern, D.P.B., Barrett, J.C., Wang, K., Radford-Smith, G.L., Ahmad, T., Lees, C.W., Balschun, T., Lee, J., Roberts, R., et al. (2010). Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci. Nat. Genet. 42, 1118-1125.
Frankel, N., Davis, G.K., Vargas, D., Wang, S., Payre, F., and Stern, D.L. (2010). Phenotypic robustness conferred by apparently redundant transcriptional enhancers. Nature 466, 490-493.
Fullwood, M.J., Liu, M.H., Pan, Y.F., Liu, J., Xu, H., Mohamed, Y.B., Orlov, Y.L., Velkov, S., Ho, A., Mei, P.H., et al. (2009). An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 462, 58-64.
Ghavi-Helm, Y., Yad, G.-H., Klein, F.A., Tibor, P., Lucia, C., Daan, N., Wolfgang, H., and Furlong, E.E.M. (2014). Enhancer loops appear stable during development and are associated with paused polymerase. Nature.
Gieger, C., Radhakrishnan, A., Cvejic, A., Tang, W., Porcu, E., Pistis, G., Serbanovic-Canic, J., Elling, U., Goodall, A.H., Labrune, Y., et al. (2011). New gene functions in megakaryopoiesis and platelet formation. Nature 480, 201-208.
Grubert, F., Zaugg, J.B., Kasowski, M., Ursu, O., Spacek, D.V., Martin, A.R., Greenside, P., Srivas, R., Phanstiel, D.H., Pekowska, A., et al. (2015). Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions. Cell 162, 1051-1065.
Guo, H., Fortune, M.D., Burren, O.S., Schofield, E., Todd, J.A., and Wallace, C. (2015). Integration of disease association and eQTL data using a Bayesian colocalisation approach highlights six candidate causal genes in immune-mediated diseases. Hum. Mol. Genet. 24, 3305-3313.
Harrow, J., Frankish, A., Gonzalez, J.M., Tapanari, E., Diekhans, M., Kokocinski, F., Aken, B.L., Barrell, D., Zadissa, A., Searle, S., et al. (2012). GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760-1774.
van der Harst, P., Zhang, W., Mateo Leach, I., Rendon, A., Verweij, N., Sehmi, J., Paul, D.S., Elling, U., Allayee, H., Li, X., et al. (2012). Seventy-five genetic loci influencing the human red blood cell. Nature 492, 369-375.
Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576-589.
Huang, H., Hailiang, H., Ming, F., Luke, J., Mirkov, M.U., Gabrielle, B., Anderson, C.A., Vibeke, A., Isabelle, C., Adrian, C., et al. (2015). Association mapping of inflammatory bowel disease loci to single variant resolution.
Imakaev, M., Fudenberg, G., McCord, R.P., Naumova, N., Goloborodko, A., Lajoie, B.R., Dekker, J., and Mirny, L.A. (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999-1003.
International Consortium for Blood Pressure Genome-Wide Association Studies, Ehret, G.B., Munroe, P.B., Rice, K.M., Bochud, M., Johnson, A.D., Chasman, D.I., Smith, A.V., Tobin, M.D., Verwoert, G.C., et al. (2011). Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 478, 103-109.
International HapMap Consortium, Frazer, K.A., Ballinger, D.G., Cox, D.R., Hinds, D.A., Stuve, L.L., Gibbs, R.A., Belmont, J.W., Boudreau, A., Hardenbol, P., et al. (2007). A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861.
International Multiple Sclerosis Genetics Consortium, Wellcome Trust Case Control Consortium 2, Sawcer, S., Hellenthal, G., Pirinen, M., Spencer, C.C.A., Patsopoulos, N.A., Moutsianas, L., Dilthey, A., Su, Z., et al. (2011). Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 476, 214-219.
Jager, R., Migliorini, G., Henrion, M., Kandaswamy, R., Speedy, H.E., Heindl, A., Whiffin, N., Carnicer, M.J., Broome, L., Dryden, N., et al. (2015). Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. Nat. Commun. 6, 6178.
Jeffries, C.D., Ward, W.O., Perkins, D.O., and Wright, F.A. (2009). Discovering collectively informative descriptors from high-throughput experiments. BMC Bioinformatics 10, 431.
Jin, F., Fulai, J., Yan, L., Dixon, J.R., Siddarth, S., Zhen, Y., Lee, A.Y., Chia-An, Y., Schmitt, A.D., Espinoza, C.A., et al. (2013). A high-resolution map of the three-dimensional chromatin interactome in human cells. Nature.
Jovanovic, D.V., Boumsell, L., Bensussan, A., Chevalier, X., Mancini, A., and Di Battista, J.A. (2011). CD101 expression and function in normal and rheumatoid arthritis-affected human T cells and monocytes/macrophages. J. Rheumatol. 38, 419-428.
Kasowski, M., Kyriazopoulou-Panagiotopoulou, S., Grubert, F., Zaugg, J.B., Kundaje, A., Liu, Y., Boyle, A.P., Zhang, Q.C., Zakharia, F., Spacek, D.V., et al. (2013). Extensive variation in chromatin states across humans. Science 342, 750-752.
Krivega, I., Ivan, K., and Ann, D. (2012). Enhancer and promoter interactions— long distance calls. Curr. Opin. Genet. Dev. 22, 79-85.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science 326, 289-293.
Lippert, C., Casale, F.P., Rakitsch, B., and Stegle, O. (2014). LIMIX: genetic analysis of multiple traits. BioRxiv.
Locke, A.E., Kahali, B., Berndt, S.I., Justice, A.E., Pers, T.H., Day, F.R., Powell, C., Vedantam, S., Buchkovich, M.L., Yang, J., et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
Manning, A.K., Hivert, M.-F., Scott, R.A., Grimsby, J.L., Bouatia-Naji, N., Chen, H., Rybin, D., Liu, C.-T., Bielak, L.F., Prokopenko, I., et al. (2012). A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nat. Genet. 44, 659-669.
Manolio, T.A. (2010). Genomewide association studies and assessment of the risk of disease. N. Engl. J. Med. 363, 166-176.
Martin, P., McGovern, A., Orozco, G., Duffus, K., Yarwood, A., Schoenfelder, S., Cooper, N.J., Barton, A., Wallace, C., Fraser, P., et al. (2015). Capture Hi-C reveals novel candidate genes and complex long-range interactions with related autoimmune risk loci. Nat. Commun. 6, 10069.
Maurano, M.T., Humbert, R., Rynes, E., Thurman, R.E., Haugen, E., Wang, H., Reynolds, A.P., Sandstrom, R., Qu, H., Brody, J., et al. (2012). Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190-1195.
McLaren, W., Pritchard, B., Rios, D., Chen, Y., Flicek, P., and Cunningham, F. (2010). Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069-2070.
Mifsud, B., Borbala, M., Filipe, T.-C., Young, A.N., Robert, S., Stefan, S., Lauren, F., Wingett, S.W., Simon, A., William, G., et al. (2015). Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598-606.
Montojo, J., Zuberi, K., Rodriguez, H., Kazi, F., Wright, G., Donaldson, S.L., Morris, Q., and Bader, G.D. (2010). GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop. Bioinformatics 26, 2927-2928.
Morris, A.P., Voight, B.F., Teslovich, T.M., Ferreira, T., Segre, A.V., Steinthorsdottir, V., Strawbridge, R.J., Khan, H., Grallert, H., Mahajan, A., et al. (2012). Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981-990.
Nagano, T., Lubling, Y., Stevens, T.J., Schoenfelder, S., Yaffe, E., Dean, W., Laue, E.D., Tanay, A., and Fraser, P. (2013). Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature 502, 59-64.
Nagano, T., Takashi, N., Csilla, V., Stefan, S., Biola-Maria, J., Wingett, S.W., and Peter, F. (2015). Comparison of Hi-C results using in-solution versus in-nucleus ligation. Genome Biol. 16.
Natoli, G., and Andrau, J.-C. (2012). Noncoding transcription at enhancers: general principles and functional models. Annu. Rev. Genet. 46, 1-19.
Nora, E.P., Lajoie, B.R., Schulz, E.G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N.L., Meisig, J., Sedat, J., et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381-385.
Okada, Y., Terao, C., Ikari, K., Kochi, Y., Ohmura, K., Suzuki, A., Kawaguchi, T., Stahl, E.A., Kurreeman, F.A.S., Nishida, N., et al. (2012). Meta-analysis identifies nine new loci associated with rheumatoid arthritis in the Japanese population. Nat. Genet. 44, 511-516.
Ormiston, M.L., Toshner, M.R., Kiskin, F.N., Huang, C.J.Z., Groves, E., Morrell, N.W., and Rana, A.A. (2015). Generation and Culture of Blood Outgrowth Endothelial Cells from Human Peripheral Blood. J. Vis. Exp. e53384.
Rajagopal, N., Srinivasan, S., Kooshesh, K., Guo, Y., Edwards, M.D., Banerjee, B., Syed, T., Emons, B.J.M., Gifford, D.K., and Sherwood, R.I. (2016). High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167-174.
Sahlen, P., Abdullayev, I., Ramskold, D., Matskova, L., Rilakovic, N., Lotstedt, B., Albert, T.J., Lundeberg, J., and Sandberg, R. (2015). Genome-wide mapping of promoter-anchored interactions with close to single-enhancer resolution. Genome Biol. 16, 156.
Sanyal, A., Lajoie, B.R., Jain, G., and Dekker, J. (2012). The long-range interaction landscape of gene promoters. Nature 489, 109-113.
Schoenfelder, S., Stefan, S., Mayra, F.-M., Borbala, M., Filipe, T.-C., Robert, S., Biola-Maria, J., Takashi, N., Yulia, K., Moorthy, S., et al. (2015a). The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements. Genome Res. 25, 582-597.
Schoenfelder, S., Sugar, R., Dimond, A., Javierre, B.-M., Armstrong, H., Mifsud, B., Dimitrova, E., Matheson, L., Tavares-Cadete, F., Furlan-Magaril, M., et al. (2015b). Polycomb repressive complex PRC1 spatially constrains the mouse embryonic stem cell genome. Nat. Genet. 47, 1179-1186.
Schofield, E.C., Carver, T., Achuthan, P., Freire-Pritchett, P., Spivakov, M., Todd, J.A., and Burren, O.S. (2016). CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics.
Sexton, T., Tom, S., Eitan, Y., Ephraim, K., Frederic, B., Benjamin, L., Michael, H., Hugues, P., Amos, T., and Giacomo, C. (2012). Three-Dimensional Folding and Functional Organization Principles of the Drosophila Genome. Cell 148, 458-472.
Smemo, S., Tena, J.J., Kim, K.-H., Gamazon, E.R., Sakabe, N.J., Gomez-Marin, C., Aneas, I., Credidio, F.L., Sobreira, D.R., Wasserman, N.F., et al. (2014). Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371-375.
Spivakov, M. (2014). Spurious transcription factor binding: non-functional or genetically redundant? Bioessays 36, 798-806.
Stadhouders, R., Aktuna, S., Thongjuea, S., Aghajanirefah, A., Pourfarzad, F., van Ijcken, W., Lenhard, B., Rooks, H., Best, S., Menzel, S., et al. (2014). HBS1L-MYB intergenic variants modulate fetal hemoglobin via long-range MYB enhancers. J. Clin. Invest. 124, 1699-1710.
Stahl, E.A., Raychaudhuri, S., Remmers, E.F., Xie, G., Eyre, S., Thomson, B.P., Li, Y., Kurreeman, F.A.S., Zhernakova, A., Hinks, A., et al. (2010). Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci. Nat. Genet. 42, 508-514.
Teslovich, T.M., Musunuru, K., Smith, A.V., Edmondson, A.C., Stylianou, I.M., Koseki, M., Pirruccello, J.P., Ripatti, S., Chasman, D.I., Willer, C.J., et al. (2010). Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707-713.
The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74.
Thompson, E.C., Cobb, B.S., Sabbattini, P., Meixlsperger, S., Parelho, V., Liberg, D., Taylor, B., Dillon, N., Georgopoulos, K., Jumaa, H., et al. (2007). Ikaros DNA-binding proteins as integral components of B cell developmental-stage-specific regulatory circuits. Immunity 26, 335-344.
Tolhuis, B., Palstra, R.J., Splinter, E., Grosveld, F., and de Laat, W. (2002). Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol. Cell 10, 1453-1465.
Trynka, G., Westra, H.-J., Slowikowski, K., Hu, X., Xu, H., Stranger, B.E., Klein, R.J., Han, B., and Raychaudhuri, S. (2015). Disentangling the Effects of Colocalizing Genomic Annotations to Functionally Prioritize Non-coding Variants within Complex-Trait Loci. Am. J. Hum. Genet. 97, 139-152.
Wakefield, J. (2009). Bayes factors for genome-wide association studies: comparison with P-values. Genet. Epidemiol. 33, 79-86.
Waszak, S.M., Delaneau, O., Gschwind, A.R., Kilpinen, H., Raghav, S.K., Witwicki, R.M., Orioli, A., Wiederkehr, M., Panousis, N.I., Yurovsky, A., et al. (2015). Population Variation and Genetic Control of Modular Chromatin Architecture in Humans. Cell 162, 1039-1050.
Wellcome Trust Case Control Consortium, Maller, J.B., McVean, G., Byrnes, J., Vukcevic, D., Palin, K., Su, Z., Howson, J.M.M., Auton, A., Myers, S., et al. (2012). Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 44, 1294-1301.
Welter, D., MacArthur, J., Morales, J., Burdett, T., Hall, P., Junkins, H., Klemm, A., Flicek, P., Manolio, T., Hindorff, L., et al. (2014). The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001-D1006.
Wingett, S., Ewels, P., Furlan-Magaril, M., Nagano, T., Schoenfelder, S., Fraser, P., and Andrews, S. (2015). HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 4, 1310.
Wood, A.R., Esko, T., Yang, J., Vedantam, S., Pers, T.H., Gustafsson, S., Chu, A.Y., Estrada, K., Luan, J., Kutalik, Z., et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173-1186.
Yu, G., Wang, L.-G., Han, Y., and He, Q.-Y. (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284-287.
Zerbino, D.R., Wilder, S.P., Johnson, N., Juettemann, T., and Flicek, P.R. (2015). The Ensembl regulatory build. Genome Biol. 16, 56.
Zerbino, D.R., Johnson, N., Juettemann, T., Sheppard, D., Wilder, S.P., Lavidas, I., Nuhn, M., Perry, E., Raffaillac-Desfosses, Q., Sobral, D., et al. (2016). Ensembl regulation resources. Database 2016.
Zhu, Z., Zhang, F., Hu, H., Bakshi, A., Robinson, M.R., Powell, J.E., Montgomery, G.W., Goddard, M.E., Wray, N.R., Visscher, P.M., et al. (2016). Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481-487.

Claims

1. A modulator of one or more of the biomarkers of Tables 1 or 2 for use in the treatment of a blood disorder.
2. The modulator for use as defined in claim 1, wherein the blood disorder is a platelet disorder and the one or more biomarkers are selected from Table 1.
3. The modulator for use as defined in claim 2, wherein the blood disorder is a platelet disorder and the one or more biomarkers are selected from one or more, or all, of: CD22, BAZ2A, JAK2, CHRNE, PTGES3, CYP27B1, OPRD1, CA14, CD274, ABCC4, FFAR2, PRMT1, PLD2, SLC39A5, MINK1, SIRT3, RNPEPL1, PSMB6, SLC2A12, TAOK1, NUAK2, GPR182, BRD3, JMJD1C, NLRP6, TBK1 and NRBP2.
4. The modulator for use as defined in claim 1, wherein the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from Table 2.
5. The modulator for use as defined in claim 4, wherein the blood disorder is a red blood cell disorder and the one or more biomarkers are selected from one or more, or all, of: CASP10, SLC25A39, CLK1, ATP2B4, CASP8, PTGS2, STRADB, AURKA, JAK2, IL2RB, KAT8, KCNN4, DOT1L, SLC12A7, PLCG1, LPIN3, AMHR2, GABBR2, ADAM10, IFNAR1, SLC6A3, PADI3, SLC25A37, UGCG, KCNMA1, MAPK13, ITGAD, IFNAR2, PADI4, NEK8, NUAK2, FABP1, SLC51A, ABCA1, BRD7, SMPD1, ILK, CDK12, VKORC1, BRD3, GPR152, RPS6KB2, TMPRSS6, TOP1, S1PR3 and FNTB.
6. A modulator of one or more of the biomarkers of Tables 3 to 7 for use in the treatment of an autoimmune disorder.
7. The modulator for use as defined in claim 6, wherein the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from Table 3.
8. The modulator for use as defined in claim 7, wherein the autoimmune disorder is ulcerative colitis and the one or more biomarkers are selected from one or more, or all, of: CDK11A, SLC11A1, IFNGR1, ATP6V0A1, ENTPD2, EIF2AK1, SLC26A3, JAK2, ABCA2, CHRNE, IL1RL2, CD274, SLC7A10, CDK4, SLC26A10, ADAM10, MINK1, PSMB6, RORC, ADAMTS16, INPP5E, PLCH2, STK32B, TNFRSF14, STK36, BRD7, PIP4K2C, ADAM9, ADRA1B, PTGER4, BCL2, DPP7, SLC9A4, CXCR2, EHMT1, PRKAR1B, TUBB4B, MMP23B, PIM3, SLC34A3, SGMS1, SLC35E2 and CDK11B.
9. The modulator for use as defined in claim 6, wherein the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from Table 4.
10. The modulator for use as defined in claim 9, wherein the autoimmune disorder is multiple sclerosis and the one or more biomarkers are selected from one or more, or all, of: CYP24A1, IFNGR1, PDE4A, MAPK1, CSF2RB, PTK6, SLC17A7, PDE4C, SCNN1A, LTBR, USP5, NEK9, SRMS, SLC34A1, IL2RA, SLC26A10, CD27, NCOA2, PTPRK, CXCR5, ATP1A1, SLC9B1, SLC9B2, IL22RA2, NSD1, PIP4K2C, RGS14, ADRA1B, PTGER4, GPR160, S1PR5, LPAR5, SEMA4D, P2RY11 and GPR162.
11. The modulator for use as defined in claim 6, wherein the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from Table 5.
12. The modulator for use as defined in claim 11, wherein the autoimmune disorder is rheumatoid arthritis and the one or more biomarkers are selected from one or more, or all, of: IFNGR1, PDE4A, IDI1, FDFT1, KEAP1, PREP, MAP3K1, CD40, CDK6, SLC26A8, CCR6, PCCB, CYP20A1, NEK9, ACAT2, SLC44A2, IL6ST, IL2RA, SLC26A10, BLK, SLC35B2, TNFRSF14, IFNAR2, CXCR5, IL6R, NEK10, CTSB, PIP4K2C, CXCR6 and S1PR5.
13. The modulator for use as defined in claim 6, wherein the autoimmune disorder is celiac disease and the one or more biomarkers are selected from Table 6.
14. The modulator for use as defined in claim 13, wherein the autoimmune disorder is celiac disease and the one or more biomarkers are selected from one or more, or all, of: IL20RA, IFNGR1, DAPK2, ZMYND8, ITGA4, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, PTPRK, IL22RA2 and GPR160.
15. The modulator for use as defined in claim 6, wherein the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from Table 7.
16. The modulator for use as defined in claim 15, wherein the autoimmune disorder is Crohn's disease and the one or more biomarkers are selected from one or more, or all, of: ATP6V0A1, ITGA8, MAP3K1, JAK2, ATP5D, TYK2, CCR6, LNPEP, KCNJ13, IL1R2, IL1RL2, IL1RL1, IL18R1, SLC9A2, STK11, CD274, PIK3CA, SLC44A2, IL2RA, GPR65, FURIN, HCN3, JAK1, USP1, SLC9B1, ERAP1, ERAP2, BRD7, INPP5D, ADRA1B, PTGER4, TRIB1, CLK2, S1PR5, FES and P2RY11.
17. A modulator of one or more of the biomarkers of Tables 8 or 9 for use in the treatment of diabetes.
18. The modulator for use as defined in claim 17, wherein the diabetes is Type 1 diabetes and the one or more biomarkers are selected from Table 8.
19. The modulator for use as defined in claim 18, wherein the diabetes is Type 1 diabetes and the one or more biomarkers are selected from one or more, or all, of: FYN, HDAC9, ERBB3, PRKCQ, TRPM5, TRIB2, PTPRC, CTSH, CDK6, SLC22A18, RGS2, GPR18, CCR7, RARA, IL2RA, AMHR2, SLC25A47, CAMK2D, NEK7, ATP6V1G3, GPR183, PTEN, SSTR2, GPR19, TMPRSS6, SLC25A29 and CD3E.
20. The modulator for use as defined in claim 17, wherein the diabetes is Type 2 diabetes and the one or more biomarkers are selected from Table 9.
21. The modulator for use as defined in claim 20, wherein the diabetes is Type 2 diabetes and the one or more biomarkers are selected from one or more, or all, of: RORA, MAP3K1, PLCG1, RIOK1 and TOP1.
22. A modulator of one or more of the biomarkers of Table 10 for use in the treatment of height/growth disorders.
23. The modulator for use as defined in claim 22, wherein the one or more height/growth disorders biomarkers are selected from one or more, or all, of: MARK4, MMP25, PRKCH, MAP2K3, MAPK9, LPAR2, MAP2K4, PDE4A, ATP2B1, TRIB2, EED, SPTLC1, CMA1, SLC7A8, SIRT1, JAK2, PCSK5, CTSG, GZMB, CTSZ, SLCO4A1, NTSR1, LPIN2, KCNN4, DOT1L, VRK3, TYK2, GSK3A, CDK6, EZH2, SLC16A6, CPZ, SLC11A2, ITK, ACVR2B, ODC1, ECE1, NEK2, CA14, SLC16A7, NT5C3A, TRIM24, CCR7, MAP2K2, PPAT, SLC44A2, RIPK3, ADCY4, STK33, DNMT1, ANO1, RARA, PRKAB2, PRKAA1, ADAM19, AMHR2, PREPL, MAP3K12, ITGB7, TSSK4, SLC16A3, PIP4K2B, SIRT3, PADI3, CELSR2, SLC18B1, HTR7, ADAM12, RGS10, PTPRJ, PIP4K2A, NR3C2, KCNJ16, PRKCA, PDE6D, NPR2, SIK3, CXCR5, CHRNB2, FGFR4, PLCD3, ADCY9, NEK10, RYK, KDM1B, BRD7, PIP4K2C, FASN, CDC42BPG, PTGER4, SLC25A33, EIF2AK3, OXSR1, CHRNA9, HTR3C, S1PR5, SCN5A, KCNJ12, NT5C1B, PLCD1, LPAR1, SLC5A3, GRK5, L3MBTL3, LTB4R, LTB4R2, DHFR, P2RY11, FNTB, S1PR2, LTB4R2 and SLC5A3.
24. A modulator of one or more of the biomarkers of Table 11 for use in the treatment of disorders related to lipid metabolism.
25. The modulator for use as defined in claim 24, wherein the one or more lipid metabolism biomarkers are selected from one or more, or all, of: MARK4, BAZ1B, NPC1L1, PRSS8, SLC12A3, EIF2AK1, MAP3K1, CYP26A1, KAT8, AEBP1, PCCB, PLCG1, SLC25A35, RIPK3, ADCY4, RAF1, PPARG, ABCB10, BLK, ADAM10, TSSK4, NLRC5, PNMT, CELSR2, SCN3A, GPR61, CPA2, SLC45A3, SIK3, PCSK7, GPR146, ABCA1, SLC35G2, PCSK9, FPR2, FPR1, JMJD1C, SLC16A11, SLC16A13, IL20RB, F2, SLC25A42, NRBP2, BACE1 and TOP1.
26. A modulator of one or more of the biomarkers of Table 12 for use in the treatment of disorders related to glucose metabolism.
27. The modulator for use as defined in claim 26, wherein the one or more glucose metabolism biomarkers are selected from one or more, or all, of: NPC1L1, NR1H3, AEBP1, PLCG1, STK33, LPIN3, MTNR1B, SLC9B1, SLC9B2, SLC30A8, SLC39A13, STK39 and TOP1.
28. A modulator of one or more of the biomarkers of Table 13 for use in the treatment of disorders related to insulin metabolism.
29. The modulator for use as defined in claim 28, wherein the one or more insulin metabolism biomarkers are selected from either or both of: PPARG and TBCK.
30. A modulator of one or more of the biomarkers of Table 14 for use in the treatment of disorders related to bone mineral density.
31. The modulator for use as defined in claim 30, wherein the one or more bone mineral density biomarkers are selected from one or more, or all, of: SLC25A39, CLCN7, AMHR2, MAP3K12, ITGB7, TNFRSF11A, SLC26A1, RARG and GAK.
32. A modulator of one or more of the biomarkers of Table 15 for use in the treatment of disorders related to blood pressure.
33. The modulator for use as defined in claim 32, wherein the one or more blood pressure biomarkers are selected from one or more, or all, of: CLCN6, CSK, PREPL, FURIN, PLCD3, NEK10 and FES.
34. A modulator of one or more of the biomarkers of Table 16 for use in the treatment of disorders related to body mass index.
35. The modulator for use as defined in claim 34, wherein the one or more body mass index biomarkers are selected from one or more, or all, of: GIPR, PRSS8, AQP6, BCKDK, KAT8, MTCH2, ASIC1, PPARG, CSNK1G2, MAP2K5, ITGAX, SIRT3, CPO, GPR61, ADCY9, USP1, F2RL1, KCNG3, KCNK3, CD19, SLC25A22, GPBAR1 and ATP2A1.
36. Use of one or more of the biomarkers as defined in any of claims 1 to 35 for the diagnosis or prognosis of a disease or disorder as defined in any of claims 1 to 35.
37. A method of screening for a modulator of one or more biomarkers as defined in any of claims 1 to 35, which comprises the steps of:
(a) incubating a test compound in the presence of said one or more biomarkers; and
(b) detecting and/or quantifying binding of the biomarker to said test substance.
38. A method of diagnosing a disease or disorder as defined in any of claims 1 to 35 or predisposition in an individual thereto, comprising:
(a) quantifying the amounts of the biomarkers as defined in any of claims 1 to 35 in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative of said disease or disorder, or predisposition thereto.
39. A method of prognosing the development of a disease or disorder as defined in any of claims 1 to 35 in an individual, comprising:
(a) quantifying the amounts of the biomarkers as defined in any one of claims 1 to 35 in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative that the individual will develop said disease or disorder.
40. A method of monitoring efficacy of a therapy in a subject having, suspected of having, or of being predisposed to a disease or disorder as defined in any of claims 1 to 35, comprising detecting and/or quantifying, in a sample from said subject, the biomarkers as defined in any of claims 1 to 35.
41. A method as defined in any one of claims 38 to 40, which is conducted on samples taken on two or more occasions from a test subject.
42. A method as defined in any of claims 38 to 41, further comprising comparing the level of the biomarker present in samples taken on two or more occasions.
43. A method as defined in any of claims 38 to 42, comprising comparing the amount of the biomarker in said test sample with the amount present in one or more samples taken from said subject prior to commencement of therapy, and/or one or more samples taken from said subject at an earlier stage of therapy.
44. A method as defined in any of claims 38 to 43, further comprising detecting a change in the amount of the biomarker in samples taken on two or more occasions.
45. A method as defined in any of claims 38 to 44, comprising comparing the amount of the biomarker present in said test sample with one or more controls.
46. A method as defined in claim 45, comprising comparing the amount of the biomarker in a test sample with the amount of the biomarker present in a sample from a normal subject.
47. A method as defined in any of claims 38 to 46, wherein samples are taken prior to and/or during and/or following therapy for said disease or disorder.
48. A method as defined in any of claims 38 to 47, wherein samples are taken at intervals over the remaining life, or a part thereof, of a subject.
49. A method as defined in any of claims 38 to 48, wherein quantifying is performed by measuring the concentration of the biomarker in the or each sample.
50. A method as defined in any of claims 38 to 49, wherein detecting and/or quantifying is performed by one or more methods selected from SELDI (-TOF), MALDI (-TOF), a 1-D gel-based analysis, a 2-D gel-based analysis, mass spectroscopy (MS) such as selected reaction monitoring (SRM), reverse phase (RP) LC, size permeation (gel filtration), ion exchange, affinity, HPLC, UPLC or other LC or LC-MS-based technique.
51. A method as defined in any of claims 38 to 50, wherein detecting and/or quantifying is performed using an immunological method.
52. A method as defined in any of claims 38 to 51, wherein the detecting and/or quantifying is performed using a biosensor or a microanalytical, microengineered, microseparation or immunochromatography system.
53. A method as defined in any of claims 38 to 52, wherein the biological sample is whole blood, blood serum, plasma, cerebrospinal fluid, urine, saliva, or other bodily fluid, or breath, condensed breath, or an extract or purification therefrom, or dilution thereof, such as whole blood, blood serum or plasma, in particular blood serum.
54. Use of a kit comprising a biosensor capable of detecting and/or quantifying the biomarkers as defined in any of claims 1 to 35 for monitoring, prognosing or diagnosing a disease or disorder as defined in any of claims 1 to 35.
55. A method of treating a patient suffering from a disease or disorder as defined in any of claims 1 to 35, which comprises the step of administering a medicament for said disease or disorder to a patient identified as having differing levels of the biomarkers as defined in any of claims 1 to 35 when compared to the levels of said biomarkers from a normal subject.
56. A method of treating a patient suffering from a disease or disorder as defined in any of claims 1 to 35, which comprises the following steps:
(a) quantifying the amounts of the biomarkers as defined in any of claims 1 to 35 in a biological sample obtained from an individual;
(b) comparing the amounts of the biomarkers in the biological sample with the amounts present in a normal control biological sample from a normal subject, such that a difference in the level of the biomarkers in the biological sample is indicative of said disease or disorder, or predisposition thereto; and
(c) administering a medicament for said disease or disorder to a patient diagnosed in step (b) as a patient with said disease or disorder.
PCT/GB2017/051572 2016-06-03 2017-06-01 Biomarkers for platelet disorders WO2017208001A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP17731938.1A EP3465219A1 (en) 2016-06-03 2017-06-01 Biomarkers for platelet disorders

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB1609712.3A GB201609712D0 (en) 2016-06-03 2016-06-03 Disease targets and biomarkers
GB1609712.3 2016-06-03

Publications (1)

Publication Number Publication Date
WO2017208001A1 true WO2017208001A1 (en) 2017-12-07

Family

ID=56508012

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2017/051572 WO2017208001A1 (en) 2016-06-03 2017-06-01 Biomarkers for platelet disorders

Country Status (3)

Country Link
EP (1) EP3465219A1 (en)
GB (1) GB201609712D0 (en)
WO (1) WO2017208001A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108254558A (en) * 2018-02-11 2018-07-06 山东省千佛山医院 Applications of the PADI3 in colon cancer is diagnosed and/or treated
KR20180133180A (en) * 2017-06-05 2018-12-13 한국과학기술원 Biomarker For Mature Pancreatic Beta Cell And Methods Of Using The Same
CN109504778A (en) * 2019-01-11 2019-03-22 复旦大学附属中山医院 It is a kind of that model is early diagnosed based on the 5hmC polymolecular marker apparently modified and colorectal cancer
CN109628588A (en) * 2019-02-27 2019-04-16 河北医科大学第二医院 Osteoarthritis disorders screening gene PRXL2A and ACTR8 and application thereof
US20200408779A1 (en) * 2018-02-09 2020-12-31 City Of Hope Doc2b as a biomarker for type 1 diabetes
US20220091136A1 (en) * 2020-09-16 2022-03-24 Ajou University Industry-Academic Cooperation Foundation Early detection marker for degenerative osteoarthritis with trim24-rip3 axis
WO2023043257A1 (en) * 2021-09-16 2023-03-23 경북대학교 산학협력단 Pharmaceutical composition for preventing or treating osteoarthritis, containing smpd1 regulator as active ingredient

Non-Patent Citations (103)

* Cited by examiner, † Cited by third party
Title
"An integrated encyclopedia of DNA elements in the human genome", NATURE, vol. 489, 2012, pages 57 - 74
"HBS1 L-MYB intergenic variants modulate fetal hemoglobin via long-range MYB enhancers", J. CLIN. INVEST., vol. 124, 2014, pages 1699 - 1710
"Integration of disease association and eQTL data using a Bayesian colocalisation approach highlights six candidate causal genes in immune-mediated diseases", HUM. MOL. GENET., vol. 24, pages 3305 - 3313
"The Cancer Genome Atlas Pan-Cancer analysis project", NAT. GENET., vol. 45, 2013, pages 1113 - 1120
ADAMS, D.; ALTUCCI, L.; ANTONARAKIS, S.E.; BALLESTEROS, J.; BECK, S.; BIRD, A.; BOCK, C.; BOEHM, B.; CAMPO, E.; CARICASOLE, A. ET: "BLUEPRINT to decode the epigenetic signature written in blood", NAT. BIOTECHNOL., vol. 30, 2012, pages 224 - 226
AKHTAR, W.; WASEEM, A.; DE JONG, J.; PINDYURIN, A.V.; LUDO, P.; WOUTER, M.; DE RIDDER, J.; ANTON, B.; WESSELS, L.F.A.; VAN LOHUIZE: "Chromatin Position Effects Assayed by Thousands of Reporters Integrated in Parallel", CELL, vol. 154, 2013, pages 914 - 927
ALBERT, F.W.; KRUGLYAK, L.: "The role of regulatory variation in complex traits and disease", NAT. REV. GENET., vol. 16, 2015, pages 197 - 212
ANDERSON, C.A.; BOUCHER, G.; LEES, C.W.; FRANKE, A.; D'AMATO, M.; TAYLOR, K.D.; LEE, J.C.; GOYETTE, P.; IMIELINSKI, M.; LATIANO, A: "Meta-analysis identifies 29 additional ulcerative colitis risk loci, increasing the number of confirmed associations to 47", NAT. GENET., vol. 43, 2011, pages 246 - 252
ARNOLD, C.D.; GERLACH, D.; STELZER, C.; BORYRI, L.M.; RATH, M.; STARK, A.: "Genome-wide quantitative enhancer activity maps identified by STARR-seq", SCIENCE, vol. 339, 2013, pages 1074 - 1077
ATSUSHI TAKAHASHI ET AL: "A novel potent tumour promoter aberrantly overexpressed in most human cancers", SCIENTIFIC REPORTS, vol. 1, no. 15, 14 June 2011 (2011-06-14), pages 1 - 12, XP055076925, DOI: 10.1038/srep00015 *
AUTON, A.; BROOKS, L.D.; DURBIN, R.M.; GARRISON, E.P.; KANG, H.M.; KORBEL, J.O.; MARCHINI, J.L.; MCCARTHY, S.; MCVEAN, G.A. ET AL.: "A global reference for human genetic variation", NATURE, vol. 526, 2015, pages 68 - 74
AY, F.; BUNNIK, E.M.; VAROQUAUX, N.; BOL, S.M.; PRUDHOMME, J.; VERT, J.-P.; NOBLE, W.S.; LE ROCH, K.G: "Three-dimensional modeling of the P. falciparum genome during the erythrocytic cycle reveals a strong connection between genome architecture and gene expression", GENOME RES., vol. 24, 2014, pages 974 - 988
BARRETT, J.C.; CLAYTON, D.G.; CONCANNON, P.; AKOLKAR, B.; COOPER, J.D.; ERLICH, H.A.; JULIER, C.; MORAHAN, G.; NERUP, J.; NIERRAS,: "Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes", NAT. GENET., vol. 41, 2009, pages 703 - 707
BARTMAN, C.R.; HSU, S.C.; HSIUNG, C.C.-S.; RAJ, A.; BLOBEL, G.A.: "Enhancer Regulation of Transcriptional Bursting Parameters Revealed by Forced Chromatin Looping.", MOL. CELL, vol. 62, 2016, pages 237 - 247
BENDER, M.A.; ROACH, J.N.; HALOW, J.; CLOSE, J.; ALAMI, R.; BOUHASSIRA, E.E.; GROUDINE, M.; FIERING, S.N.: "Targeted deletion of 5'HS1 and 5'HS4 of the beta-globin locus control region reveals additive activity of the DNaseI hypersensitive sites", BLOOD, vol. 98, 2001, pages 2022 - 2027
BENTHAM, J.; MORRIS, D.L.; CUNNINGHAME GRAHAM, D.S.; PINDER, C.L.; TOMBLESON, P.; BEHRENS, T.W.; MARTIN, J.; FAIRFAX, B.P.; KNIGHT: "Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus", NAT. GENET., 2015
BLANGIARDO, M.; CASSESE, A.; RICHARDSON, S.: "sdef: an R package to synthesize lists of significant features in related experiments", BMC BIOINFORMATICS, vol. 11, 2010, pages 270
BLANGIARDO, M.; RICHARDSON, S.: "Statistical tools for synthesizing lists of differentially expressed features in related experiments", GENOME BIOL., vol. 8, 2007, pages R54
BOTHMA, J.P.; GARCIA, H.G.; NG, S.; PERRY, M.W.; GREGOR, T.; LEVINE, M: "Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo", ELIFE, vol. 4, 2015
CAIRNS, J.; FREIRE-PRITCHETT, P.; WINGETT, S.W.; VARNAI, C.; DIMOND, A.; PLAGNOL, V.; ZERBINO, D.; SCHOENFELDER, S.; JAVIERRE, B.-: "CHiCAGO: Robust Detection of DNA Looping Interactions in Capture Hi-C data", GENOME BIOL, 2016
CALO, E.; WYSOCKA, J.: "Modification of enhancer chromatin: what, how, and why?", MOL. CELL, vol. 49, 2013, pages 825 - 837
CARTER, D.; CHAKALOVA, L.; OSBORNE, C.S.; DAI, Y.-F.; FRASER, P.: "Long-range chromatin regulatory interactions in vivo", NAT. GENET., vol. 32, 2002, pages 623 - 626
CHEESEMAN, P.; PETER, C.; JAMES, K.; MATTHEW, S.; JOHN, S.; WILL, T.; DON, F: "AutoClass: A Bayesian Classification System", MACHINE LEARNING PROCEEDINGS, vol. 1988, 1988, pages 54 - 64
CHEN, L.; KOSTADIMA, M.; MARTENS, J.H.A.; CANU, G.; GARCIA, S.P.; TURRO, E.; DOWNES, K.; MACAULAY, I.C.; BIELCZYK-MACZYNSKA, E.; C: "Transcriptional diversity during lineage commitment of human blood progenitors", SCIENCE, vol. 345, 2014, pages 1251033
CLAUSSNITZER, M.; DANKEL, S.N.; KIM, K.-H.; QUON, G.; MEULEMAN, W.; HAUGEN, C.; GLUNK, V.; SOUSA, I.S.; BEAUDRY, J.L.; PUVIINDRAN,: "FTO Obesity Variant Circuitry and Adipocyte Browning in Humans", N. ENGL. J. MED., vol. 373, 2015, pages 895 - 907
CLINE, M.S.; SMOOT, M.; CERAMI, E.; KUCHINSKY, A.; LANDYS, N.; WORKMAN, C.; CHRISTMAS, R.; AVILA-CAMPILO, I.; CREECH, M.; GROSS, B: "Integration of biological networks and gene expression data using Cytoscape", NAT. PROTOC., vol. 2, 2007, pages 2366 - 2382
CORDELL, H.J.; HAN, Y.; MELLS, G.F.; LI, Y.; HIRSCHFIELD, G.M.; GREENE, C.S.; XIE, G.; JURAN, B.D.; ZHU, D.; QIAN, D.C. ET AL.: "International genome-wide meta-analysis identifies new primary biliary cirrhosis risk loci and targetable pathogenic pathways", NAT. COMMUN, vol. 6, 2015, pages 8019
DAVISON, L.J.; WALLACE, C.; COOPER, J.D.; COPE, N.F.; WILSON, N.K.; SMYTH, D.J.; HOWSON, J.M.M.; SALEH, N.; AL-JEFFERY, A.; ANGUS,: "Long-range DNA looping and gene expression analyses identify DEXI as an autoimmune disease candidate gene", HUM. MOL. GENET., vol. 21, 2011, pages 322 - 333
DEKKER, J.; JOB, D.; MARTI-RENOM, M.A.; MIRNY, L.A.: "Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data", NAT. REV. GENET., vol. 14, 2013, pages 390 - 403
DENG, W.; LEE, J.; WANG, H.; MILLER, J.; REIK, A.; GREGORY, P.D.; DEAN, A.; BLOBEL, G.A: "Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor", CELL, vol. 149, 2012, pages 1233 - 1244
DENG, W.; RUPON, J.W.; KRIVEGA, I.; BREDA, L.; MOTTA, I.; JAHN, K.S.; REIK, A.; GREGORY, P.D.; RIVELLA, S.; DEAN, A. ET AL.: "Reactivation of developmentally silenced globin genes by forced chromatin looping", CELL, vol. 158, 2014, pages 849 - 860
DIXON, J.R.; SIDDARTH, S.; FENG, Y.; AUDREY, K.; YAN, L.; YIN, S.; MING, H.; LIU, J.S.; BING, R.: "Topological domains in mammalian genomes identified by analysis of chromatin interactions", NATURE, vol. 485, 2012, pages 376 - 380
DRYDEN, N.H.; BROOME, L.R.; DUDBRIDGE, F.; JOHNSON, N.; ORR, N.; SCHOENFELDER, S.; NAGANO, T.; ANDREWS, S.; WINGETT, S.; KOZAREWA,: "Unbiased analysis of potential targets of breast cancer susceptibility loci by Capture Hi-C", GENOME RES., vol. 24, 2014, pages 1854 - 1868
DUAN, Z.; ANDRONESCU, M.; SCHUTZ, K.; MCILWAIN, S.; KIM, Y.J.; LEE, C.; SHENDURE, J.; FIELDS, S.; BLAU, C.A.; NOBLE, W.S.: "A three-dimensional model of the yeast genome", NATURE, vol. 465, 2010, pages 363 - 367
DUBOIS, P.C.A.; TRYNKA, G.; FRANKE, L.; HUNT, K.A.; ROMANOS, J.; CURTOTTI, A.; ZHERNAKOVA, A.; HEAP, G.A.R.; ADANY, R.; AROMAA, A.: "Multiple common variants for celiac disease influencing immune gene expression", NAT. GENET, vol. 42, 2010, pages 295 - 302
EHRET, G.B.; MUNROE, P.B.; RICE, K.M.; BOCHUD, M.; JOHNSON, A.D.; CHASMAN, D.I.; SMITH, A.V.; TOBIN, M.D.; VERWOERT, G.C. ET AL.: "Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk", NATURE, vol. 478, 2011, pages 103 - 109
ERNST, J.; KELLIS, M.: "ChromHMM: automating chromatin-state discovery and characterization", NAT. METHODS, vol. 9, 2012, pages 215 - 216
ESTRADA, K.; STYRKARSDOTTIR, U.; EVANGELOU, E.; HSU, Y.-H.; DUNCAN, E.L.; NTZANI, E.E.; OEI, L.; ALBAGHA, O.M.E.; AMIN, N.; KEMP,: "Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture", NAT. GENET., vol. 44, 2012, pages 491 - 501
FABREGAT, A.; SIDIROPOULOS, K.; GARAPATI, P.; GILLESPIE, M.; HAUSMANN, K.; HAW, R.; JASSAL, B.; JUPE, S.; KORNINGER, F.; MCKAY, S.: "The Reactome pathway Knowledgebase", NUCLEIC ACIDS RES., vol. 44, 2016, pages D481 - D487
FAIRFAX, B.P.; MAKINO, S.; RADHAKRISHNAN, J.; PLANT, K.; LESLIE, S.; DILTHEY, A.; ELLIS, P.; LANGFORD, C.; VANNBERG, F.O.; KNIGHT,: "Genetics of gene expression in primary immune cells identifies cell type-specific master regulators and roles of HLA alleles", NAT. GENET., vol. 44, 2012, pages 502 - 510
FARH, K.K.-H.; ALEXANDER, M.; JIANG, Z.; MARKUS, K.; HOUSLEY, W.J.; SAMANTHA, B.; NOAM, S.; HOLLY, W.; RYAN, R.J.H.; SHISHKIN, A.A: "Genetic and epigenetic fine mapping of causal autoimmune disease variants", NATURE, vol. 518, 2014, pages 337 - 343
FORRESTER, W.C.; EPNER, E.; DRISCOLL, M.C.; ENVER, T.; BRICE, M.; PAPAYANNOPOULOU, T.; GROUDINE, M.: "A deletion of the human beta-globin locus activation region causes a major alteration in chromatin structure and replication across the entire beta-globin locus", GENES DEV., vol. 4, 1990, pages 1637 - 1649
FRANKE, A.; MCGOVERN, D.P.B.; BARRETT, J.C.; WANG, K.; RADFORD-SMITH, G.L.; AHMAD, T.; LEES, C.W.; BALSCHUN, T.; LEE, J.; ROBERTS,: "Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci", NAT. GENET., vol. 42, 2010, pages 1118 - 1125
FRANKEL, N.; DAVIS, G.K.; VARGAS, D.; WANG, S.; PAYRE, F.; STERN, D.L.: "Phenotypic robustness conferred by apparently redundant transcriptional enhancers", NATURE, vol. 466, 2010, pages 490 - 493
FRAZER, K.A.; BALLINGER, D.G.; COX, D.R.; HINDS, D.A.; STUVE, L.L.; GIBBS, R.A.; BELMONT, J.W.; BOUDREAU, A.; HARDENBOL, P. ET AL.: "A second generation human haplotype map of over 3.1 million SNPs", NATURE, vol. 449, 2007, pages 851 - 861
FULLWOOD, M.J.; LIU, M.H.; PAN, Y.F.; LIU, J.; XU, H.; MOHAMED, Y.B.; ORLOV, Y.L.; VELKOV, S.; HO, A.; MEI, P.H. ET AL.: "An oestrogen-receptor-alpha-bound human chromatin interactome", NATURE, vol. 462, 2009, pages 58 - 64
GHAVI-HELM, Y.; YAD, G.-H.; KLEIN, F.A.; TIBOR, P.; LUCIA, C.; DAAN, N.; WOLFGANG, H.; FURLONG, E.E.M.: "Enhancer loops appear stable during development and are associated with paused polymerase", NATURE, 2014
GIEGER, C.; RADHAKRISHNAN, A.; CVEJIC, A.; TANG, W.; PORCU, E.; PISTIS, G.; SERBANOVIC-CANIC, J.; ELLING, U.; GOODALL, A.H.; LABRU: "New gene functions in megakaryopoiesis and platelet formation", NATURE, vol. 480, 2011, pages 201 - 208
GRUBERT, F.; ZAUGG, J.B.; KASOWSKI, M.; URSU, O.; SPACEK, D.V.; MARTIN, A.R.; GREENSIDE, P.; SRIVAS, R.; PHANSTIEL, D.H.; PEKOWSKA: "Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions", CELL, vol. 162, 2015, pages 1051 - 1065
HARROW, J.; FRANKISH, A.; GONZALEZ, J.M.; TAPANARI, E.; DIEKHANS, M.; KOKOCINSKI, F.; AKEN, B.L.; BARRELL, D.; ZADISSA, A.; SEARLE: "GENCODE: the reference human genome annotation for The ENCODE Project", GENOME RES., vol. 22, 2012, pages 1760 - 1774
HEINZ, S.; BENNER, C.; SPANN, N.; BERTOLINO, E.; LIN, Y.C.; LASLO, P.; CHENG, J.X.; MURRE, C.; SINGH, H.; GLASS, C.K.: "Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities", MOL. CELL, vol. 38, 2010, pages 576 - 589
HUANG, H.; HAILIANG, H.; MING, F.; LUKE, J.; MIRKOV, M.U.; GABRIELLE, B.; ANDERSON, C.A.; VIBEKE, A.; ISABELLE, C.; ADRIAN, C. ET AL.: "Association mapping of inflammatory bowel disease loci to single variant resolution", 2015
IMAKAEV, M.; FUDENBERG, G.; MCCORD, R.P.; NAUMOVA, N.; GOLOBORODKO, A.; LAJOIE, B.R.; DEKKER, J.; MIRNY, L.A.: "Iterative correction of Hi-C data reveals hallmarks of chromosome organization", NAT. METHODS, vol. 9, 2012, pages 999 - 1003
JAGER, R.; MIGLIORINI, G.; HENRION, M.; KANDASWAMY, R.; SPEEDY, H.E.; HEINDL, A.; WHIFFIN, N.; CARNICER, M.J.; BROOME, L.; DRYDEN,: "Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci", NAT. COMMUN., vol. 6, 2015, pages 6178
JEFFRIES, C.D.; WARD, W.O.; PERKINS, D.O.; WRIGHT, F.A.: "Discovering collectively informative descriptors from high-throughput experiments", BMC BIOINFORMATICS, vol. 10, 2009, pages 431
JIN, F.; FULAI, J.; YAN, L.; DIXON, J.R.; SIDDARTH, S.; ZHEN, Y.; LEE, A.Y.; CHIA-AN, Y.; SCHMITT, A.D.; ESPINOZA, C.A. ET AL.: "A high-resolution map of the three-dimensional chromatin interactome in human cells", NATURE, 2013
JOVANOVIC, D.V.; BOUMSELL, L.; BENSUSSAN, A.; CHEVALIER, X.; MANCINI, A.; DI BATTISTA, J.A.: "CD101 expression and function in normal and rheumatoid arthritis-affected human T cells and monocytes/macrophages", J. RHEUMATOL., vol. 38, 2011, pages 419 - 428
KASOWSKI, M.; KYRIAZOPOULOU-PANAGIOTOPOULOU, S.; GRUBERT, F.; ZAUGG, J.B.; KUNDAJE, A.; LIU, Y.; BOYLE, A.P.; ZHANG, Q.C.; ZAKHARI: "Extensive variation in chromatin states across humans", SCIENCE, vol. 342, 2013, pages 750 - 752
KRIVEGA, I.; IVAN, K.; ANN, D.: "Enhancer and promoter interactions-long distance calls", CURR. OPIN. GENET. DEV., vol. 22, 2012, pages 79 - 85
LIEBERMAN-AIDEN, E.; VAN BERKUM, N.L.; WILLIAMS, L.; IMAKAEV, M.; RAGOCZY, T.; TELLING, A.; AMIT, I.; LAJOIE, B.R.; SABO, P.J.; DO: "Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome", SCIENCE, vol. 326, 2009, pages 289 - 293
LIPPERT, C.; CASALE, F.P.; RAKITSCH, B.; STEGLE, O.: "LIMIX: genetic analysis of multiple traits", BIORXIV, 2014
LOCKE, A.E.; KAHALI, B.; BERNDT, S.I.; JUSTICE, A.E.; PERS, T.H.; DAY, F.R.; POWELL, C.; VEDANTAM, S.; BUCHKOVICH, M.L.; YANG, J.: "Genetic studies of body mass index yield new insights for obesity biology", NATURE, vol. 518, 2015, pages 197 - 206
LOVE, M.I.; HUBER, W.; ANDERS, S.: "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2", GENOME BIOL., vol. 15, 2014, pages 550
MALLER, J.B.; MCVEAN, G.; BYRNES, J.; VUKCEVIC, D.; PALIN, K.; SU, Z.; HOWSON, J.M.M.; AUTON, A.; MYERS, S. ET AL.: "Bayesian refinement of association signals for 14 loci in 3 common diseases", NAT. GENET., vol. 44, 2012, pages 1294 - 1301
MANNING, A.K.; HIVERT, M.-F.; SCOTT, R.A.; GRIMSBY, J.L.; BOUATIA-NAJI, N.; CHEN, H.; RYBIN, D.; LIU, C.-T.; BIELAK, L.F.; PROKOPE: "A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance", NAT. GENET., vol. 44, 2012, pages 659 - 669
MANOLIO, T.A.: "Genomewide association studies and assessment of the risk of disease", N. ENGL. J. MED., vol. 363, 2010, pages 166 - 176
MARTIN, P.; MCGOVERN, A.; OROZCO, G.; DUFFUS, K.; YARWOOD, A.; SCHOENFELDER, S.; COOPER, N.J.; BARTON, A.; WALLACE, C.; FRASER, P.: "Capture Hi-C reveals novel candidate genes and complex long-range interactions with related autoimmune risk loci", NAT. COMMUN., vol. 6, 2015, pages 10069
MAURANO, M.T.; HUMBERT, R.; RYNES, E.; THURMAN, R.E.; HAUGEN, E.; WANG, H.; REYNOLDS, A.P.; SANDSTROM, R.; QU, H.; BRODY, J. ET AL.: "Systematic localization of common disease-associated variation in regulatory DNA", SCIENCE, vol. 337, 2012, pages 1190 - 1195
MCLAREN, W.; PRITCHARD, B.; RIOS, D.; CHEN, Y.; FLICEK, P.; CUNNINGHAM, F.: "Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor", BIOINFORMATICS, vol. 26, 2010, pages 2069 - 2070
MIFSUD, B.; BORBALA, M.; FILIPE, T.-C.; YOUNG, A.N.; ROBERT, S.; STEFAN, S.; LAUREN, F.; WINGETT, S.W.; SIMON, A.; WILLIAM, G. ET AL.: "Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C", NAT. GENET., vol. 47, 2015, pages 598 - 606
MONTOJO, J.; ZUBERI, K.; RODRIGUEZ, H.; KAZI, F.; WRIGHT, G.; DONALDSON, S.L.; MORRIS, Q.; BADER, G.D.: "GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop", BIOINFORMATICS, vol. 26, 2010, pages 2927 - 2928
MORRIS, A.P.; VOIGHT, B.F.; TESLOVICH, T.M.; FERREIRA, T.; SEGRE, A.V.; STEINTHORSDOTTIR, V.; STRAWBRIDGE, R.J.; KHAN, H.; GRALLER: "Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes", NAT. GENET., vol. 44, 2012, pages 981 - 990
NAGANO, T.; LUBLING, Y.; STEVENS, T.J.; SCHOENFELDER, S.; YAFFE, E.; DEAN, W.; LAUE, E.D.; TANAY, A.; FRASER, P.: "Single-cell Hi-C reveals cell-to-cell variability in chromosome structure", NATURE, vol. 502, 2013, pages 59 - 64
NAGANO, T.; TAKASHI, N.; CSILLA, V.; STEFAN, S.; BIOLA-MARIA, J.; WINGETT, S.W.; PETER, F.: "Comparison of Hi-C results using in-solution versus in-nucleus ligation", GENOME BIOL., 2015, pages 16
NATOLI, G.; ANDRAU, J.-C.: "Noncoding transcription at enhancers: general principles and functional models", ANNU. REV. GENET., vol. 46, 2012, pages 1 - 19
NORA, E.P.; LAJOIE, B.R.; SCHULZ, E.G.; GIORGETTI, L.; OKAMOTO, I.; SERVANT, N.; PIOLOT, T.; VAN BERKUM, N.L.; MEISIG, J.; SEDAT,: "Spatial partitioning of the regulatory landscape of the X-inactivation centre", NATURE, vol. 485, 2012, pages 381 - 385
OKADA, Y.; TERAO, C.; IKARI, K.; KOCHI, Y.; OHMURA, K.; SUZUKI, A.; KAWAGUCHI, T.; STAHL, E.A.; KURREEMAN, F.A.S.; NISHIDA, N. ET AL.: "Meta-analysis identifies nine new loci associated with rheumatoid arthritis in the Japanese population", NAT. GENET., vol. 44, 2012, pages 511 - 516
ORMISTON, M.L.; TOSHNER, M.R.; KISKIN, F.N.; HUANG, C.J.Z.; GROVES, E.; MORRELL, N.W.; RANA, A.A.: "Generation and Culture of Blood Outgrowth Endothelial Cells from Human Peripheral Blood", J. VIS. EXP., 2015, pages E53384
RAJAGOPAL, N.; SRINIVASAN, S.; KOOSHESH, K.; GUO, Y.; EDWARDS, M.D.; BANERJEE, B.; SYED, T.; EMONS, B.J.M.; GIFFORD, D.K.; SHERWOO: "High-throughput mapping of regulatory DNA", NAT. BIOTECHNOL., vol. 34, 2016, pages 167 - 174
SAHLEN, P.; ABDULLAYEV, I.; RAMSKOLD, D.; MATSKOVA, L.; RILAKOVIC, N.; LOTSTEDT, B.; ALBERT, T.J.; LUNDEBERG, J.; SANDBERG, R.: "Genome-wide mapping of promoter-anchored interactions with close to single-enhancer resolution", GENOME BIOL., vol. 16, 2015, pages 156
SANYAL, A.; LAJOIE, B.R.; JAIN, G.; DEKKER, J.: "The long-range interaction landscape of gene promoters", NATURE, vol. 489, 2012, pages 109 - 113
SAWCER, S.; HELLENTHAL, G.; PIRINEN, M.; SPENCER, C.C.A.; PATSOPOULOS, N.A.; MOUTSIANAS, L.; DILTHEY, A.; SU, Z. ET AL.: "Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis", NATURE, vol. 476, 2011, pages 214 - 219
SCHOENFELDER, S.; STEFAN, S.; MAYRA, F.-M.; BORBALA, M.; FILIPE, T.-C.; ROBERT, S.; BIOLA-MARIA, J.; TAKASHI, N.; YULIA, K.; MOORT: "The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements", GENOME RES., vol. 25, 2015, pages 582 - 597
SCHOENFELDER, S.; SUGAR, R.; DIMOND, A.; JAVIERRE, B.-M.; ARMSTRONG, H.; MIFSUD, B.; DIMITROVA, E.; MATHESON, L.; TAVARES-CADETE,: "Polycomb repressive complex PRC1 spatially constrains the mouse embryonic stem cell genome", NAT. GENET., vol. 47, 2015, pages 1179 - 1186
SCHOFIELD, E.C.; CARVER, T.; ACHUTHAN, P.; FREIRE-PRITCHETT, P.; SPIVAKOV, M.; TODD, J.A.; BURREN, O.S.: "CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets", BIOINFORMATICS, 2016
SEXTON, T.; TOM, S.; EITAN, Y.; EPHRAIM, K.; FREDERIC, B.; BENJAMIN, L.; MICHAEL, H.; HUGUES, P.; AMOS, T.; GIACOMO, C.: "Three-Dimensional Folding and Functional Organization Principles of the Drosophila Genome", CELL, vol. 148, 2012, pages 458 - 472
SMEMO, S.; TENA, J.J.; KIM, K.-H.; GAMAZON, E.R.; SAKABE, N.J.; GOMEZ-MARIN, C.; ANEAS, I.; CREDIDIO, F.L.; SOBREIRA, D.R.; WASSER: "Obesity-associated variants within FTO form long-range functional connections with IRX3", NATURE, vol. 507, 2014, pages 371 - 375
SPIVAKOV, M.: "Spurious transcription factor binding: non-functional or genetically redundant?", BIOESSAYS, vol. 36, 2014, pages 798 - 806
STAHL, E.A.; RAYCHAUDHURI, S.; REMMERS, E.F.; XIE, G.; EYRE, S.; THOMSON, B.P.; LI, Y.; KURREEMAN, F.A.S.; ZHERNAKOVA, A.; HINKS,: "Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci", NAT. GENET., vol. 42, 2010, pages 508 - 514
TESLOVICH, T.M.; MUSUNURU, K.; SMITH, A.V.; EDMONDSON, A.C.; STYLIANOU, I.M.; KOSEKI, M.; PIRRUCCELLO, J.P.; RIPATTI, S.; CHASMAN,: "Biological, clinical and population relevance of 95 loci for blood lipids", NATURE, vol. 466, 2010, pages 707 - 713
THOMPSON, E.C.; COBB, B.S.; SABBATTINI, P.; MEIXLSPERGER, S.; PARELHO, V.; LIBERG, D.; TAYLOR, B.; DILLON, N.; GEORGOPOULOS, K.; J: "Ikaros DNA-binding proteins as integral components of B cell developmental-stage-specific regulatory circuits", IMMUNITY, vol. 26, 2007, pages 335 - 344
TOLHUIS, B.; PALSTRA, R.J.; SPLINTER, E.; GROSVELD, F.; DE LAAT, W.: "Looping and interaction between hypersensitive sites in the active beta-globin locus", MOL. CELL, vol. 10, 2002, pages 1453 - 1465
TRYNKA, G.; WESTRA, H.-J.; SLOWIKOWSKI, K.; HU, X.; XU, H.; STRANGER, B.E.; KLEIN, R.J.; HAN, B.; RAYCHAUDHURI, S.: "Disentangling the Effects of Colocalizing Genomic Annotations to Functionally Prioritize Non-coding Variants within Complex-Trait Loci", AM. J. HUM. GENET., vol. 97, 2015, pages 139 - 152
VAN DER HARST, P.; ZHANG, W.; MATEO LEACH, I.; RENDON, A.; VERWEIJ, N.; SEHMI, J.; PAUL, D.S.; ELLING, U.; ALLAYEE, H.; LI, X. ET AL.: "Seventy-five genetic loci influencing the human red blood cell", NATURE, vol. 492, 2012, pages 369 - 375
WAKEFIELD, J.: "Bayes factors for genome-wide association studies: comparison with P-values", GENET. EPIDEMIOL., vol. 33, 2009, pages 79 - 86
WASZAK, S.M.; DELANEAU, O.; GSCHWIND, A.R.; KILPINEN, H.; RAGHAV, S.K.; WITWICKI, R.M.; ORIOLI, A.; WIEDERKEHR, M.; PANOUSIS, N.I.: "Population Variation and Genetic Control of Modular Chromatin Architecture in Humans", CELL, vol. 162, 2015, pages 1039 - 1050
WELTER, D.; MACARTHUR, J.; MORALES, J.; BURDETT, T.; HALL, P.; JUNKINS, H.; KLEMM, A.; FLICEK, P.; MANOLIO, T.; HINDORFF, L. ET AL.: "The NHGRI GWAS Catalog, a curated resource of SNP-trait associations", NUCLEIC ACIDS RES., vol. 42, 2014, pages D1001 - D1006
WINGETT, S.; EWELS, P.; FURLAN-MAGARIL, M.; NAGANO, T.; SCHOENFELDER, S.; FRASER, P.; ANDREWS, S.: "HiCUP: pipeline for mapping and processing Hi-C data", F1000RES., vol. 4, 2015, pages 1310
WOOD, A.R.; ESKO, T.; YANG, J.; VEDANTAM, S.; PERS, T.H.; GUSTAFSSON, S.; CHU, A.Y.; ESTRADA, K.; LUAN, J. 'AN; KUTALIK, Z. ET AL.: "Defining the role of common variation in the genomic and biological architecture of adult human height", NAT. GENET., vol. 46, 2014, pages 1173 - 1186
YU, G.; WANG, L.-G.; HAN, Y.; HE, Q.-Y.: "clusterProfiler: an R package for comparing biological themes among gene clusters", OMICS, vol. 16, 2012, pages 284 - 287
ZERBINO, D.R.; JOHNSON, N.; JUETTEMAN, T.; SHEPPARD, D.; WILDER, S.P.; LAVIDAS, I.; NUHN, M.; PERRY, E.; RAFFAILLAC-DESFOSSES, Q.;: "Ensembl regulation resources", DATABASE, 2016
ZERBINO, D.R.; WILDER, S.P.; JOHNSON, N.; JUETTEMANN, T.; FLICEK, P.R.: "The ensembl regulatory build", GENOME BIOL., vol. 16, 2015, pages 56
ZHU, Z.; ZHANG, F.; HU, H.; BAKSHI, A.; ROBINSON, M.R.; POWELL, J.E.; MONTGOMERY, G.W.; GODDARD, M.E.; WRAY, N.R.; VISSCHER, P.M.: "Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets", NAT. GENET., vol. 48, 2016, pages 481 - 487

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180133180A (en) * 2017-06-05 2018-12-13 한국과학기술원 Biomarker For Mature Pancreatic Beta Cell And Methods Of Using The Same
KR102129624B1 (en) 2017-06-05 2020-07-02 한국과학기술원 Biomarker For Mature Pancreatic Beta Cell And Methods Of Using The Same
US20200408779A1 (en) * 2018-02-09 2020-12-31 City Of Hope Doc2b as a biomarker for type 1 diabetes
CN108254558A (en) * 2018-02-11 2018-07-06 山东省千佛山医院 Applications of PADI3 in the diagnosis and/or treatment of colon cancer
CN108254558B (en) * 2018-02-11 2021-02-09 山东省千佛山医院 Use of PADI3 in diagnosis and/or treatment of colon cancer
CN109504778A (en) * 2019-01-11 2019-03-22 复旦大学附属中山医院 5hmC multi-molecular marker based on apparent modification and colorectal cancer early diagnosis model
CN109504778B (en) * 2019-01-11 2021-11-09 复旦大学附属中山医院 5hmC multi-molecular marker based on apparent modification and colorectal cancer early diagnosis model
CN109628588A (en) * 2019-02-27 2019-04-16 河北医科大学第二医院 Osteoarthritis disorders screening gene PRXL2A and ACTR8 and application thereof
CN109628588B (en) * 2019-02-27 2019-09-17 河北医科大学第二医院 Osteoarthritis disorders screening gene PRXL2A and ACTR8 and application thereof
US20220091136A1 (en) * 2020-09-16 2022-03-24 Ajou University Industry-Academic Cooperation Foundation Early detection marker for degenerative osteoarthritis with trim24-rip3 axis
US12117455B2 (en) * 2020-09-16 2024-10-15 Ajou University Industry-Academic Cooperation Foundation Early detection marker for degenerative osteoarthritis with TRIM24-RIP3 axis
WO2023043257A1 (en) * 2021-09-16 2023-03-23 경북대학교 산학협력단 Pharmaceutical composition for preventing or treating osteoarthritis, containing smpd1 regulator as active ingredient

Also Published As

Publication number Publication date
GB201609712D0 (en) 2016-07-20
EP3465219A1 (en) 2019-04-10

Similar Documents

Publication Publication Date Title
US20220325348A1 (en) Biomarker signature method, and apparatus and kits therefor
EP3465219A1 (en) Biomarkers for platelet disorders
US20240102095A1 (en) Methods for profiling and quantitating cell-free rna
US10002230B2 (en) Screening, diagnosis and prognosis of autism and other developmental disorders
Shah et al. A recurrent germline PAX5 mutation confers susceptibility to pre-B cell acute lymphoblastic leukemia
US7611839B2 (en) Methods for diagnosing RCC and other solid tumors
US20160041153A1 (en) Biomarker compositions and markers
US20230178245A1 (en) Immunotherapy Response Signature
AU2012294458A1 (en) Biomarker compositions and methods
IL301304A (en) Metastasis predictor
AU2018335382B2 (en) Novel cell line and uses thereof
US20240167097A1 (en) Cellular response assays for lung cancer
Yaung et al. Artificial intelligence and high-dimensional technologies in the theragnosis of systemic lupus erythematosus
US20220290243A1 (en) Identification of patients that will respond to chemotherapy
CA2949959A1 (en) Gene expression profiles associated with sub-clinical kidney transplant rejection
Borràs et al. The use of transcriptomics in clinical applications
US20240115699A1 (en) Use of cancer cell expression of cadherin 12 and cadherin 18 to treat muscle invasive and metastatic bladder cancers
US20240150453A1 (en) Methods of predicting response to anti-tnf blockade in inflammatory bowel disease
Chyra Genomic studies in monoclonal gammopathies

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17731938

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017731938

Country of ref document: EP

Effective date: 20190103