NZ791679A - Methods for the detection of genomic copy changes in DNA samples
- Publication number
- NZ791679A
- Authority
- NZ
- New Zealand
- Prior art keywords
- dna
- sample
- region
- nucleotides
- adaptor
- Prior art date
Links
- 229920003013 deoxyribonucleic acid Polymers 0.000 title claims abstract description 432
- 238000001514 detection method Methods 0.000 title abstract description 21
- 239000000523 sample Substances 0.000 claims abstract description 431
- 210000004369 Blood Anatomy 0.000 claims abstract description 16
- 239000008280 blood Substances 0.000 claims abstract description 16
- 239000012472 biological sample Substances 0.000 claims abstract description 13
- 239000002773 nucleotide Substances 0.000 claims description 213
- 125000003729 nucleotide group Chemical group 0.000 claims description 213
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 71
- 230000003321 amplification Effects 0.000 claims description 71
- 238000003199 nucleic acid amplification method Methods 0.000 claims description 70
- 229920002083 cellular DNA Polymers 0.000 claims description 43
- 229920000023 polynucleotide Polymers 0.000 claims description 36
- 239000002157 polynucleotide Substances 0.000 claims description 36
- 210000001519 tissues Anatomy 0.000 claims description 35
- 206010028980 Neoplasm Diseases 0.000 claims description 26
- 238000001574 biopsy Methods 0.000 claims description 15
- 239000012530 fluid Substances 0.000 claims description 13
- 210000002381 Plasma Anatomy 0.000 claims description 9
- 210000004381 Amniotic Fluid Anatomy 0.000 claims description 3
- 210000003296 Saliva Anatomy 0.000 claims description 3
- 210000000582 Semen Anatomy 0.000 claims description 3
- 210000002966 Serum Anatomy 0.000 claims description 3
- 210000004243 Sweat Anatomy 0.000 claims description 3
- 210000002700 Urine Anatomy 0.000 claims description 3
- 230000002490 cerebral Effects 0.000 claims description 3
- 230000001926 lymphatic Effects 0.000 claims description 2
- 230000002068 genetic Effects 0.000 abstract description 123
- 239000000203 mixture Substances 0.000 abstract description 53
- 230000001413 cellular Effects 0.000 abstract description 12
- 230000000869 mutational Effects 0.000 abstract description 7
- 238000004458 analytical method Methods 0.000 description 90
- -1 ABLl Proteins 0.000 description 41
- 210000004027 cells Anatomy 0.000 description 40
- 238000004166 bioassay Methods 0.000 description 39
- 201000011510 cancer Diseases 0.000 description 38
- 230000004536 DNA copy number loss Effects 0.000 description 29
- 229920000272 Oligonucleotide Polymers 0.000 description 27
- 230000002759 chromosomal Effects 0.000 description 27
- 238000000034 method Methods 0.000 description 24
- 241000700605 Viruses Species 0.000 description 23
- 230000035772 mutation Effects 0.000 description 21
- 201000010099 disease Diseases 0.000 description 20
- 230000003902 lesions Effects 0.000 description 19
- 150000007523 nucleic acids Chemical group 0.000 description 18
- 102000015098 Tumor Suppressor Protein p53 Human genes 0.000 description 16
- 108010078814 Tumor Suppressor Protein p53 Proteins 0.000 description 16
- 210000000349 Chromosomes Anatomy 0.000 description 15
- 241000282414 Homo sapiens Species 0.000 description 15
- 108020004707 nucleic acids Proteins 0.000 description 15
- 238000006243 chemical reaction Methods 0.000 description 14
- 150000002500 ions Chemical class 0.000 description 14
- 102000004190 Enzymes Human genes 0.000 description 12
- 108090000790 Enzymes Proteins 0.000 description 12
- 238000006062 fragmentation reaction Methods 0.000 description 10
- 201000002406 genetic disease Diseases 0.000 description 10
- 230000035945 sensitivity Effects 0.000 description 10
- 230000002255 enzymatic Effects 0.000 description 9
- 230000004927 fusion Effects 0.000 description 9
- 238000005259 measurement Methods 0.000 description 9
- 238000003753 real-time PCR Methods 0.000 description 9
- 150000001768 cations Chemical group 0.000 description 8
- 239000003814 drug Substances 0.000 description 8
- 230000001605 fetal Effects 0.000 description 8
- 239000000126 substance Substances 0.000 description 8
- 108060006202 ATM Proteins 0.000 description 7
- 102100000648 ATM Human genes 0.000 description 7
- 108060009497 WRNexo Proteins 0.000 description 7
- 101710030587 ligN Proteins 0.000 description 7
- 101700077585 ligd Proteins 0.000 description 7
- 238000010606 normalization Methods 0.000 description 7
- 210000000056 organs Anatomy 0.000 description 7
- 230000001717 pathogenic Effects 0.000 description 7
- 244000052769 pathogens Species 0.000 description 7
- 238000000746 purification Methods 0.000 description 7
- 241000894007 species Species 0.000 description 7
- 102000008422 EC 2.7.1.78 Human genes 0.000 description 6
- 108010021757 EC 2.7.1.78 Proteins 0.000 description 6
- 206010068052 Mosaicism Diseases 0.000 description 6
- RWQNBRDOKXIBIV-UHFFFAOYSA-N Thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 6
- 230000000875 corresponding Effects 0.000 description 6
- 229940079593 drugs Drugs 0.000 description 6
- 238000003780 insertion Methods 0.000 description 6
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 5
- 238000001712 DNA sequencing Methods 0.000 description 5
- 102000012338 Poly(ADP-ribose) Polymerases Human genes 0.000 description 5
- 108010061844 Poly(ADP-ribose) Polymerases Proteins 0.000 description 5
- 206010060862 Prostate cancer Diseases 0.000 description 5
- 238000003776 cleavage reaction Methods 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- 230000002401 inhibitory effect Effects 0.000 description 5
- 230000000670 limiting Effects 0.000 description 5
- 238000010008 shearing Methods 0.000 description 5
- 238000000527 sonication Methods 0.000 description 5
- 238000004450 types of analysis Methods 0.000 description 5
- 108010000750 BRCA2 Protein Proteins 0.000 description 4
- 102000002280 BRCA2 Protein Human genes 0.000 description 4
- 229920001405 Coding region Polymers 0.000 description 4
- 229920002676 Complementary DNA Polymers 0.000 description 4
- 230000003350 DNA copy number gain Effects 0.000 description 4
- 101700011961 DPOM Proteins 0.000 description 4
- 101710029649 MDV043 Proteins 0.000 description 4
- 101700080605 NUC1 Proteins 0.000 description 4
- 101700061424 POLB Proteins 0.000 description 4
- 101700054624 RF1 Proteins 0.000 description 4
- 229940035295 Ting Drugs 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- 230000027455 binding Effects 0.000 description 4
- 238000007622 bioinformatic analysis Methods 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 239000003112 inhibitor Substances 0.000 description 4
- 230000002934 lysing Effects 0.000 description 4
- 244000005700 microbiome Species 0.000 description 4
- 238000010369 molecular cloning Methods 0.000 description 4
- 101700006494 nucA Proteins 0.000 description 4
- 150000003839 salts Chemical class 0.000 description 4
- 239000011780 sodium chloride Substances 0.000 description 4
- WYWHKKSPHMUBEB-UHFFFAOYSA-N tioguanine Chemical compound N1C(N)=NC(=S)C2=C1N=CN2 WYWHKKSPHMUBEB-UHFFFAOYSA-N 0.000 description 4
- 102100011141 ALK Human genes 0.000 description 3
- 101710033641 ALK Proteins 0.000 description 3
- 229960000643 Adenine Drugs 0.000 description 3
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Natural products NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 3
- 206010006187 Breast cancer Diseases 0.000 description 3
- 102100017371 CYP2C19 Human genes 0.000 description 3
- 101710007600 CYP2C19 Proteins 0.000 description 3
- 102100005543 CYP2D6 Human genes 0.000 description 3
- 101710007490 CYP2D6 Proteins 0.000 description 3
- 102100004057 CYP3A4 Human genes 0.000 description 3
- 101710007540 CYP3A4 Proteins 0.000 description 3
- 102100004059 CYP3A5 Human genes 0.000 description 3
- 101710007537 CYP3A5 Proteins 0.000 description 3
- 101710028159 DNTT Proteins 0.000 description 3
- 102100005049 DPYD Human genes 0.000 description 3
- 229920002024 GDNA Polymers 0.000 description 3
- 102000007648 Glutathione S-Transferase pi Human genes 0.000 description 3
- 108010007355 Glutathione S-Transferase pi Proteins 0.000 description 3
- 102100006988 KCNH2 Human genes 0.000 description 3
- 101700085508 KCNH2 Proteins 0.000 description 3
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 3
- 102100002074 MTHFR Human genes 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 206010035226 Plasma cell myeloma Diseases 0.000 description 3
- 229940113082 Thymine Drugs 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 3
- 231100001075 aneuploidy Toxicity 0.000 description 3
- 230000001093 anti-cancer Effects 0.000 description 3
- 230000003466 anticipated Effects 0.000 description 3
- 239000002246 antineoplastic agent Substances 0.000 description 3
- 238000003766 bioinformatics method Methods 0.000 description 3
- 239000000090 biomarker Substances 0.000 description 3
- 201000009030 carcinoma Diseases 0.000 description 3
- 230000001364 causal effect Effects 0.000 description 3
- 230000002153 concerted Effects 0.000 description 3
- 230000002596 correlated Effects 0.000 description 3
- 230000009089 cytolysis Effects 0.000 description 3
- 230000003247 decreasing Effects 0.000 description 3
- 238000003745 diagnosis Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- OVBPIULPVIDEAO-LBPRGKRZSA-N folic acid Chemical class C=1N=C2NC(N)=NC(=O)C2=NC=1CNC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 OVBPIULPVIDEAO-LBPRGKRZSA-N 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 229960000485 methotrexate Drugs 0.000 description 3
- 238000007481 next generation sequencing Methods 0.000 description 3
- 239000011886 peripheral blood Substances 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 235000018102 proteins Nutrition 0.000 description 3
- 102000004169 proteins and genes Human genes 0.000 description 3
- 108090000623 proteins and genes Proteins 0.000 description 3
- 102000005962 receptors Human genes 0.000 description 3
- 108020003175 receptors Proteins 0.000 description 3
- 230000022983 regulation of cell cycle Effects 0.000 description 3
- 108091007521 restriction endonucleases Proteins 0.000 description 3
- 201000010874 syndrome Diseases 0.000 description 3
- 238000002560 therapeutic procedure Methods 0.000 description 3
- PCHKPVIQAHNQLW-CQSZACIVSA-N 2-[4-[(3S)-piperidin-3-yl]phenyl]indazole-7-carboxamide Chemical compound N1=C2C(C(=O)N)=CC=CC2=CN1C(C=C1)=CC=C1[C@@H]1CCCNC1 PCHKPVIQAHNQLW-CQSZACIVSA-N 0.000 description 2
- HMABYWSNWIZPAG-UHFFFAOYSA-N 283173-50-2 Chemical compound C1=CC(CNC)=CC=C1C(N1)=C2CCNC(=O)C3=C2C1=CC(F)=C3 HMABYWSNWIZPAG-UHFFFAOYSA-N 0.000 description 2
- GHASVSINZRGABV-UHFFFAOYSA-N 5-flurouricil Chemical compound FC1=CNC(=O)NC1=O GHASVSINZRGABV-UHFFFAOYSA-N 0.000 description 2
- 102100014183 ADRB2 Human genes 0.000 description 2
- 108060003354 ADRB2 Proteins 0.000 description 2
- 101700006583 AKT2 Proteins 0.000 description 2
- 102100009823 ALOX5 Human genes 0.000 description 2
- 208000009956 Adenocarcinoma Diseases 0.000 description 2
- 229920002425 Alu element Polymers 0.000 description 2
- 102000003984 Aryl hydrocarbon receptors Human genes 0.000 description 2
- 108090000448 Aryl hydrocarbon receptors Proteins 0.000 description 2
- IUEWAGVJRJORLA-HZPDHXFCSA-N BMN-673 Chemical compound CN1N=CN=C1[C@H]1C(NNC(=O)C2=CC(F)=C3)=C2C3=N[C@@H]1C1=CC=C(F)C=C1 IUEWAGVJRJORLA-HZPDHXFCSA-N 0.000 description 2
- 206010005003 Bladder cancer Diseases 0.000 description 2
- 102100014737 CYP2B6 Human genes 0.000 description 2
- 101710007656 CYP2B6 Proteins 0.000 description 2
- 102100018445 CYP2J2 Human genes 0.000 description 2
- 101710007736 CYP2J2 Proteins 0.000 description 2
- 229960004117 Capecitabine Drugs 0.000 description 2
- GAGWJHPBXLXJQN-UORFTKCHSA-N Capecitabine Chemical compound C1=C(F)C(NC(=O)OCCCCC)=NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](C)O1 GAGWJHPBXLXJQN-UORFTKCHSA-N 0.000 description 2
- 241000713756 Caprine arthritis encephalitis virus Species 0.000 description 2
- 229960004630 Chlorambucil Drugs 0.000 description 2
- JCKYGMPEJWAADB-UHFFFAOYSA-N Chlorambucil Chemical compound OC(=O)CCCC1=CC=C(N(CCCl)CCCl)C=C1 JCKYGMPEJWAADB-UHFFFAOYSA-N 0.000 description 2
- 206010008958 Chronic lymphocytic leukaemia Diseases 0.000 description 2
- 229920000453 Consensus sequence Polymers 0.000 description 2
- 229960004397 Cyclophosphamide Drugs 0.000 description 2
- CMSMOCZEIVJLDB-UHFFFAOYSA-N Cyclophosphamide Chemical compound ClCCN(CCCl)P1(=O)NCCCO1 CMSMOCZEIVJLDB-UHFFFAOYSA-N 0.000 description 2
- 102000002237 Cytochrome P-450 CYP2A6 Human genes 0.000 description 2
- 108010000080 Cytochrome P-450 CYP2A6 Proteins 0.000 description 2
- 102100014129 DMD Human genes 0.000 description 2
- 230000004544 DNA amplification Effects 0.000 description 2
- 239000003298 DNA probe Substances 0.000 description 2
- 101700031267 DPYD Proteins 0.000 description 2
- 101700040453 DRD2 Proteins 0.000 description 2
- 102100014971 DRD2 Human genes 0.000 description 2
- 201000010374 Down syndrome Diseases 0.000 description 2
- 239000004243 E-number Substances 0.000 description 2
- 235000019227 E-number Nutrition 0.000 description 2
- 102000006378 EC 2.1.1.6 Human genes 0.000 description 2
- 108020002739 EC 2.1.1.6 Proteins 0.000 description 2
- 241000713730 Equine infectious anemia virus Species 0.000 description 2
- 102000013165 Exonucleases Human genes 0.000 description 2
- 108060002716 Exonucleases Proteins 0.000 description 2
- 102100020191 FGFR3 Human genes 0.000 description 2
- 108010087740 Fanconi Anemia Complementation Group A Protein Proteins 0.000 description 2
- 102000009095 Fanconi Anemia Complementation Group A Protein Human genes 0.000 description 2
- 201000000106 Fanconi anemia complementation group A Diseases 0.000 description 2
- 241000713800 Feline immunodeficiency virus Species 0.000 description 2
- 241000714165 Feline leukemia virus Species 0.000 description 2
- 241000711950 Filoviridae Species 0.000 description 2
- 229960002949 Fluorouracil Drugs 0.000 description 2
- 108091006011 G proteins Proteins 0.000 description 2
- 102000030007 GTP-Binding Proteins Human genes 0.000 description 2
- 108091000058 GTP-Binding Proteins Proteins 0.000 description 2
- 206010017758 Gastric cancer Diseases 0.000 description 2
- 206010051066 Gastrointestinal stromal tumour Diseases 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N Guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 102100016432 HDAC2 Human genes 0.000 description 2
- 101700061787 HDAC2 Proteins 0.000 description 2
- 206010073071 Hepatocellular carcinoma Diseases 0.000 description 2
- FDGQSTZJBFJUBT-UHFFFAOYSA-N Hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 2
- 229960001101 Ifosfamide Drugs 0.000 description 2
- HOMGKSMUEGBAAB-UHFFFAOYSA-N Ifosfamide Chemical compound ClCCNP1(=O)OCCCN1CCCl HOMGKSMUEGBAAB-UHFFFAOYSA-N 0.000 description 2
- 102100012475 LDLR Human genes 0.000 description 2
- 208000000429 Leukemia, Lymphocytic, Chronic, B-Cell Diseases 0.000 description 2
- 208000008456 Leukemia, Myelogenous, Chronic, BCR-ABL Positive Diseases 0.000 description 2
- 241000712899 Lymphocytic choriomeningitis mammarenavirus Species 0.000 description 2
- 101710033932 MTHFR Proteins 0.000 description 2
- 229920002393 Microsatellite Polymers 0.000 description 2
- 229960001156 Mitoxantrone Drugs 0.000 description 2
- KKZJGLLVHKMTCM-UHFFFAOYSA-N Mitoxantrone Chemical compound O=C1C2=C(O)C=CC(O)=C2C(=O)C2=C1C(NCCNCCO)=CC=C2NCCNCCO KKZJGLLVHKMTCM-UHFFFAOYSA-N 0.000 description 2
- 241000713862 Moloney murine sarcoma virus Species 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- NWIBSHFKIJFRCO-WUDYKRTCSA-N Mytomycin Chemical class C1N2C(C(C(C)=C(N)C3=O)=O)=C3[C@@H](COC(N)=O)[C@@]2(OC)[C@@H]2[C@H]1N2 NWIBSHFKIJFRCO-WUDYKRTCSA-N 0.000 description 2
- 229950011068 Niraparib Drugs 0.000 description 2
- 206010029592 Non-Hodgkin's lymphomas Diseases 0.000 description 2
- 102100019727 P2RY1 Human genes 0.000 description 2
- 102100017318 PTGIS Human genes 0.000 description 2
- 101700002755 PTGIS Proteins 0.000 description 2
- 201000009928 Patau syndrome Diseases 0.000 description 2
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 2
- 210000002307 Prostate Anatomy 0.000 description 2
- 101710043943 RALGDS Proteins 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 241000714474 Rous sarcoma virus Species 0.000 description 2
- 102100017449 SCN5A Human genes 0.000 description 2
- 101700070146 SCN5A Proteins 0.000 description 2
- 108091006693 SLC19A1 Proteins 0.000 description 2
- 102100010056 SLC19A1 Human genes 0.000 description 2
- 241000713311 Simian immunodeficiency virus Species 0.000 description 2
- 241000580858 Simian-Human immunodeficiency virus Species 0.000 description 2
- 102100009449 TPMT Human genes 0.000 description 2
- 108050009735 TPMT Proteins 0.000 description 2
- 102100011050 TYMS Human genes 0.000 description 2
- 101710013732 TYMS Proteins 0.000 description 2
- FOCVUCIESVLUNU-UHFFFAOYSA-N ThioTEPA Chemical compound C1CN1P(N1CC1)(=S)N1CC1 FOCVUCIESVLUNU-UHFFFAOYSA-N 0.000 description 2
- 229960005454 Thioguanine Drugs 0.000 description 2
- 229960001196 Thiotepa Drugs 0.000 description 2
- XFCLJVABOIYOMF-QPLCGJKRSA-N Toremifene Chemical compound C1=CC(OCCN(C)C)=CC=C1C(\C=1C=CC=CC=1)=C(\CCCl)C1=CC=CC=C1 XFCLJVABOIYOMF-QPLCGJKRSA-N 0.000 description 2
- 102000008579 Transposases Human genes 0.000 description 2
- 108010020764 Transposases Proteins 0.000 description 2
- 206010044686 Trisomy 13 Diseases 0.000 description 2
- 108010081267 Type 3 Fibroblast Growth Factor Receptor Proteins 0.000 description 2
- 102100007198 UBE3A Human genes 0.000 description 2
- 101700027248 UBE3A Proteins 0.000 description 2
- 241000711975 Vesicular stomatitis virus Species 0.000 description 2
- 230000004308 accommodation Effects 0.000 description 2
- 150000007513 acids Chemical class 0.000 description 2
- RJURFGZVJUQBHK-IIXSONLDSA-N actinomycin D Chemical compound C[C@H]1OC(=O)[C@H](C(C)C)N(C)C(=O)CN(C)C(=O)[C@@H]2CCCN2C(=O)[C@@H](C(C)C)NC(=O)[C@H]1NC(=O)C1=C(N)C(=O)C(C)=C2OC(C(C)=CC=C3C(=O)N[C@@H]4C(=O)N[C@@H](C(N5CCC[C@H]5C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]4C)=O)C(C)C)=C3N=C21 RJURFGZVJUQBHK-IIXSONLDSA-N 0.000 description 2
- 201000005510 acute lymphocytic leukemia Diseases 0.000 description 2
- 239000011543 agarose gel Substances 0.000 description 2
- 239000012491 analyte Substances 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 239000002738 chelating agent Substances 0.000 description 2
- 201000006934 chronic myeloid leukemia Diseases 0.000 description 2
- 238000007374 clinical diagnostic method Methods 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 230000000295 complement Effects 0.000 description 2
- 238000010192 crystallographic characterization Methods 0.000 description 2
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 2
- 230000014509 gene expression Effects 0.000 description 2
- 239000002955 immunomodulating agent Substances 0.000 description 2
- 230000002584 immunomodulator Effects 0.000 description 2
- 229940121354 immunomodulators Drugs 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 201000005244 lung non-small cell carcinoma Diseases 0.000 description 2
- 108091022076 maltose binding proteins Proteins 0.000 description 2
- GLVAUDGFNGKCSF-UHFFFAOYSA-N mercaptopurine Chemical compound S=C1NC=NC2=C1NC=N2 GLVAUDGFNGKCSF-UHFFFAOYSA-N 0.000 description 2
- 229960001428 mercaptopurine Drugs 0.000 description 2
- 229910052751 metal Inorganic materials 0.000 description 2
- 239000002184 metal Substances 0.000 description 2
- 201000009251 multiple myeloma Diseases 0.000 description 2
- 201000003793 myelodysplastic syndrome Diseases 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- FDLYAMZZIXQODN-UHFFFAOYSA-N olaparib Chemical compound FC1=CC=C(CC=2C3=CC=CC=C3C(=O)NN=2)C=C1C(=O)N(CC1)CCN1C(=O)C1CC1 FDLYAMZZIXQODN-UHFFFAOYSA-N 0.000 description 2
- 201000008968 osteosarcoma Diseases 0.000 description 2
- 238000004393 prognosis Methods 0.000 description 2
- 239000002096 quantum dot Substances 0.000 description 2
- 229950004707 rucaparib Drugs 0.000 description 2
- 201000011549 stomach cancer Diseases 0.000 description 2
- 201000002510 thyroid cancer Diseases 0.000 description 2
- 229960005026 toremifene Drugs 0.000 description 2
- 241001529453 unidentified herpesvirus Species 0.000 description 2
- 229960002066 vinorelbine Drugs 0.000 description 2
- GMRQFYUYWCNGIN-NKMMMXOESA-N (1R,3S,5Z)-5-{2-[(1R,3aS,4E,7aR)-1-[(2R)-6-hydroxy-6-methylheptan-2-yl]-7a-methyl-octahydro-1H-inden-4-ylidene]ethylidene}-4-methylidenecyclohexane-1,3-diol Chemical compound C1(/[C@@H]2CC[C@@H]([C@]2(CCC1)C)[C@@H](CCCC(C)(C)O)C)=C\C=C1\C[C@@H](O)C[C@H](O)C1=C GMRQFYUYWCNGIN-NKMMMXOESA-N 0.000 description 1
- SHGAZHPCJJPHSC-ZVCIMWCZSA-N (2E,4E,6Z,8E)-3,7-dimethyl-9-(2,6,6-trimethylcyclohex-1-en-1-yl)nona-2,4,6,8-tetraenoic acid Chemical compound OC(=O)/C=C(\C)/C=C/C=C(/C)\C=C\C1=C(C)CCCC1(C)C SHGAZHPCJJPHSC-ZVCIMWCZSA-N 0.000 description 1
- WDQLRUYAYXDIFW-RWKIJVEZSA-N (2R,3R,4S,5R,6R)-4-[(2S,3R,4S,5R,6R)-3,5-dihydroxy-4-[(2R,3R,4S,5S,6R)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxy-6-[[(2R,3R,4S,5S,6R)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxymethyl]oxan-2-yl]oxy-6-(hydroxymethyl)oxane-2,3,5-triol Chemical compound O[C@@H]1[C@@H](CO)O[C@@H](O)[C@H](O)[C@H]1O[C@H]1[C@H](O)[C@@H](O[C@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O2)O)[C@H](O)[C@@H](CO[C@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O2)O)O1 WDQLRUYAYXDIFW-RWKIJVEZSA-N 0.000 description 1
- VFKZTMPDYBFSTM-GUCUJZIJSA-N (2R,3S,4R,5S)-1,6-dibromohexane-2,3,4,5-tetrol Chemical compound BrC[C@H](O)[C@@H](O)[C@@H](O)[C@H](O)CBr VFKZTMPDYBFSTM-GUCUJZIJSA-N 0.000 description 1
- FLWWDYNPWOSLEO-HQVZTVAUSA-N (2S)-2-[[4-[1-(2-amino-4-oxo-1H-pteridin-6-yl)ethyl-methylamino]benzoyl]amino]pentanedioic acid Chemical compound C=1N=C2NC(N)=NC(=O)C2=NC=1C(C)N(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FLWWDYNPWOSLEO-HQVZTVAUSA-N 0.000 description 1
- PMATZTZNYRCHOR-CGLBZJNRSA-N (3S,6S,9S,12R,15S,18S,21S,24S,30S,33S)-30-ethyl-33-[(E,1R,2R)-1-hydroxy-2-methylhex-4-enyl]-1,4,7,10,12,15,19,25,28-nonamethyl-6,9,18,24-tetrakis(2-methylpropyl)-3,21-di(propan-2-yl)-1,4,7,10,13,16,19,22,25,28,31-undecazacyclotritriacontane-2,5,8,11,14,17 Chemical compound CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O PMATZTZNYRCHOR-CGLBZJNRSA-N 0.000 description 1
- DLWOTOMWYCRPLK-UVTDQMKNSA-N (4Z)-5-amino-6-(7-amino-6-methoxy-5,8-dioxoquinolin-2-yl)-4-(4,5-dimethoxy-6-oxocyclohexa-2,4-dien-1-ylidene)-3-methyl-1H-pyridine-2-carboxylic acid Chemical compound C1=CC(OC)=C(OC)C(=O)\C1=C\1C(N)=C(C=2N=C3C(=O)C(N)=C(OC)C(=O)C3=CC=2)NC(C(O)=O)=C/1C DLWOTOMWYCRPLK-UVTDQMKNSA-N 0.000 description 1
- XRBSKUSTLXISAB-XVVDYKMHSA-N (5R,6R,7R,8R)-8-hydroxy-7-(hydroxymethyl)-5-(3,4,5-trimethoxyphenyl)-5,6,7,8-tetrahydrobenzo[f][1,3]benzodioxole-6-carboxylic acid Chemical compound COC1=C(OC)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@H](O)[C@@H](CO)[C@@H]2C(O)=O)=C1 XRBSKUSTLXISAB-XVVDYKMHSA-N 0.000 description 1
- NRUKOCRGYNPUPR-QBPJDGROSA-N (5S,5aR,8aR,9R)-5-[[(2R,4aR,6R,7R,8R,8aS)-7,8-dihydroxy-2-thiophen-2-yl-4,4a,6,7,8,8a-hexahydropyrano[3,2-d][1,3]dioxin-6-yl]oxy]-9-(4-hydroxy-3,5-dimethoxyphenyl)-5a,6,8a,9-tetrahydro-5H-[2]benzofuro[6,5-f][1,3]benzodioxol-8-one Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@@H](OC[C@H]4O3)C=3SC=CC=3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 NRUKOCRGYNPUPR-QBPJDGROSA-N 0.000 description 1
- JXVAMODRWBNUSF-KZQKBALLSA-N (7S,9R,10R)-7-[(2R,4S,5S,6S)-5-[[(2S,4aS,5aS,7S,9S,9aR,10aR)-2,9-dimethyl-3-oxo-4,4a,5a,6,7,9,9a,10a-octahydrodipyrano[4,2-a:4',3'-e][1,4]dioxin-7-yl]oxy]-4-(dimethylamino)-6-methyloxan-2-yl]oxy-10-[(2S,4S,5S,6S)-4-(dimethylamino)-5-hydroxy-6-methyloxan-2 Chemical compound O([C@@H]1C2=C(O)C=3C(=O)C4=CC=CC(O)=C4C(=O)C=3C(O)=C2[C@@H](O[C@@H]2O[C@@H](C)[C@@H](O[C@@H]3O[C@@H](C)[C@H]4O[C@@H]5O[C@@H](C)C(=O)C[C@@H]5O[C@H]4C3)[C@H](C2)N(C)C)C[C@]1(O)CC)[C@H]1C[C@H](N(C)C)[C@H](O)[C@H](C)O1 JXVAMODRWBNUSF-KZQKBALLSA-N 0.000 description 1
- KMSKQZKKOZQFFG-YXRRJAAWSA-N (7S,9S)-7-[(2R,4S,5S,6S)-4-amino-6-methyl-5-[(2R)-oxan-2-yl]oxyoxan-2-yl]oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-tetracene-5,12-dione Chemical compound O([C@H]1[C@@H](N)C[C@@H](O[C@H]1C)O[C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@@H]1CCCCO1 KMSKQZKKOZQFFG-YXRRJAAWSA-N 0.000 description 1
- AESVUZLWRXEGEX-GJPCMZTKSA-N (7S,9S)-7-[(2S,4R,5R,6R)-4-amino-5-hydroxy-6-methyloxan-2-yl]oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-tetracene-5,12-dione;iron(3+) Chemical compound [Fe+3].O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@@H]1C[C@@H](N)[C@@H](O)[C@@H](C)O1 AESVUZLWRXEGEX-GJPCMZTKSA-N 0.000 description 1
- IEXUMDBQLIVNHZ-YOUGDJEHSA-N (8S,11R,13R,14S,17S)-11-[4-(dimethylamino)phenyl]-17-hydroxy-17-(3-hydroxypropyl)-13-methyl-1,2,6,7,8,11,12,14,15,16-decahydrocyclopenta[a]phenanthren-3-one Chemical compound C1=CC(N(C)C)=CC=C1[C@@H]1C2=C3CCC(=O)C=C3CC[C@H]2[C@H](CC[C@]2(O)CCCO)[C@@]2(C)C1 IEXUMDBQLIVNHZ-YOUGDJEHSA-N 0.000 description 1
- GDHFOVCRYCPOTK-QBFSEMIESA-N (Z)-2-cyano-3-cyclopropyl-3-hydroxy-N-[3-methyl-4-(trifluoromethyl)phenyl]prop-2-enamide Chemical compound C1=C(C(F)(F)F)C(C)=CC(NC(=O)C(\C#N)=C(/O)C2CC2)=C1 GDHFOVCRYCPOTK-QBFSEMIESA-N 0.000 description 1
- 108091010603 1 family Proteins 0.000 description 1
- BTOTXLJHDSNXMW-POYBYMJQSA-N 1-[(2R,5S)-5-(hydroxymethyl)oxolan-2-yl]pyrimidine-2,4-dione Chemical compound O1[C@H](CO)CC[C@@H]1N1C(=O)NC(=O)C=C1 BTOTXLJHDSNXMW-POYBYMJQSA-N 0.000 description 1
- MHKBMNACOMRIAW-UHFFFAOYSA-N 2,3-dinitrophenol Chemical compound OC1=CC=CC([N+]([O-])=O)=C1[N+]([O-])=O MHKBMNACOMRIAW-UHFFFAOYSA-N 0.000 description 1
- BOMZMNZEXMAQQW-UHFFFAOYSA-N 2,5,11-trimethyl-6H-pyrido[4,3-b]carbazol-2-ium-9-ol;acetate Chemical compound CC([O-])=O.C[N+]1=CC=C2C(C)=C(NC=3C4=CC(O)=CC=3)C4=C(C)C2=C1 BOMZMNZEXMAQQW-UHFFFAOYSA-N 0.000 description 1
- FOYWNSCCNCUEPU-UHFFFAOYSA-N 2-[[2-[bis(2-hydroxyethyl)amino]-4-piperidin-1-ylpyrimido[5,4-d]pyrimidin-6-yl]-(2-hydroxyethyl)amino]ethanol Chemical compound C12=NC(N(CCO)CCO)=NC=C2N=C(N(CCO)CCO)N=C1N1CCCCC1 FOYWNSCCNCUEPU-UHFFFAOYSA-N 0.000 description 1
- QCXJFISCRQIYID-IAEPZHFASA-N 2-amino-1-N-[(3S,6S,7R,10S,16S)-3-[(2S)-butan-2-yl]-7,11,14-trimethyl-2,5,9,12,15-pentaoxo-10-propan-2-yl-8-oxa-1,4,11,14-tetrazabicyclo[14.3.0]nonadecan-6-yl]-4,6-dimethyl-3-oxo-9-N-[(3S,6S,7R,10S,16S)-7,11,14-trimethyl-2,5,9,12,15-pentaoxo-3,10-di(propa Chemical compound C[C@H]1OC(=O)[C@H](C(C)C)N(C)C(=O)CN(C)C(=O)[C@@H]2CCCN2C(=O)[C@H](C(C)C)NC(=O)[C@H]1NC(=O)C1=C(N=C2C(C(=O)N[C@@H]3C(=O)N[C@H](C(N4CCC[C@H]4C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]3C)=O)[C@@H](C)CC)=C(N)C(=O)C(C)=C2O2)C2=C(C)C=C1 QCXJFISCRQIYID-IAEPZHFASA-N 0.000 description 1
- VNBAOSVONFJBKP-UHFFFAOYSA-N 2-chloro-N,N-bis(2-chloroethyl)propan-1-amine;hydrochloride Chemical compound Cl.CC(Cl)CN(CCCl)CCCl VNBAOSVONFJBKP-UHFFFAOYSA-N 0.000 description 1
- SYNHCENRCUAUNM-UHFFFAOYSA-N 2-chloro-N-(2-chloroethyl)-N-methylethanamine oxide;hydron;chloride Chemical compound Cl.ClCC[N+]([O-])(C)CCCl SYNHCENRCUAUNM-UHFFFAOYSA-N 0.000 description 1
- DBIGHPPNXATHOF-UHFFFAOYSA-N 3-(3-methylsulfonyloxypropylamino)propyl methanesulfonate Chemical compound CS(=O)(=O)OCCCNCCCOS(C)(=O)=O DBIGHPPNXATHOF-UHFFFAOYSA-N 0.000 description 1
- PWMYMKOUNYTVQN-UHFFFAOYSA-N 3-(8,8-diethyl-2-aza-8-germaspiro[4.5]decan-2-yl)-N,N-dimethylpropan-1-amine Chemical compound C1C[Ge](CC)(CC)CCC11CN(CCCN(C)C)CC1 PWMYMKOUNYTVQN-UHFFFAOYSA-N 0.000 description 1
- GSCPDZHWVNUUFI-UHFFFAOYSA-N 3-Aminobenzamide Chemical compound NC(=O)C1=CC=CC(N)=C1 GSCPDZHWVNUUFI-UHFFFAOYSA-N 0.000 description 1
- BUJCVBRLTBAYCW-UHFFFAOYSA-N 3-hydroxy-1-(4-hydroxy-3-methoxyphenyl)-2-(2-methoxyphenoxy)propan-1-one Chemical compound COC1=CC=CC=C1OC(CO)C(=O)C1=CC=C(O)C(OC)=C1 BUJCVBRLTBAYCW-UHFFFAOYSA-N 0.000 description 1
- YFTWHEBLORWGNI-UHFFFAOYSA-N 6-(3-methyl-5-nitroimidazol-4-yl)sulfanyl-7H-purin-2-amine Chemical compound CN1C=NC([N+]([O-])=O)=C1SC1=NC(N)=NC2=C1NC=N2 YFTWHEBLORWGNI-UHFFFAOYSA-N 0.000 description 1
- WYXSYVWAUAUWLD-SHUUEZRQSA-N 6-azauridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=N1 WYXSYVWAUAUWLD-SHUUEZRQSA-N 0.000 description 1
- 102100014001 ABCB1 Human genes 0.000 description 1
- 101700002015 ABCB1 Proteins 0.000 description 1
- 102100017861 ABCC2 Human genes 0.000 description 1
- 102100006348 ABCC4 Human genes 0.000 description 1
- 101710024119 ABCC4 Proteins 0.000 description 1
- 102100004948 ABCD1 Human genes 0.000 description 1
- 102100002706 ABCG2 Human genes 0.000 description 1
- 102100019003 ABL2 Human genes 0.000 description 1
- 101700048259 ABL2 Proteins 0.000 description 1
- 102100010371 ADGRA2 Human genes 0.000 description 1
- 101710005080 ADGRA2 Proteins 0.000 description 1
- 101700007462 ADH1A Proteins 0.000 description 1
- 102100000356 ADH1A Human genes 0.000 description 1
- 101710003370 ADH1C Proteins 0.000 description 1
- 102100010849 ADH1C Human genes 0.000 description 1
- AOJJSUZBOXZQNB-TZSSRYMLSA-N ADRIAMYCIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-TZSSRYMLSA-N 0.000 description 1
- 102100001248 AKT1 Human genes 0.000 description 1
- 101700006234 AKT1 Proteins 0.000 description 1
- 102100001250 AKT2 Human genes 0.000 description 1
- 102100006725 AKT3 Human genes 0.000 description 1
- 101700004058 AKT3 Proteins 0.000 description 1
- 102100006705 ALDH4A1 Human genes 0.000 description 1
- 101710037561 ALDH4A1 Proteins 0.000 description 1
- 229940100198 ALKYLATING AGENTS Drugs 0.000 description 1
- 101710036546 ALOX5 Proteins 0.000 description 1
- 229940030486 ANDROGENS Drugs 0.000 description 1
- 229940030495 ANTIANDROGEN SEX HORMONES AND MODULATORS OF THE GENITAL SYSTEM Drugs 0.000 description 1
- 229940100197 ANTIMETABOLITES Drugs 0.000 description 1
- 102100007788 APC Human genes 0.000 description 1
- 101700010938 APC Proteins 0.000 description 1
- 102100011069 ARAF Human genes 0.000 description 1
- 101700086422 ARAF Proteins 0.000 description 1
- 102100010553 AURKB Human genes 0.000 description 1
- 101700037792 AURKB Proteins 0.000 description 1
- ZOZKYEHVNDEUCO-XUTVFYLZSA-N Aceglatone Chemical compound O1C(=O)[C@H](OC(C)=O)[C@@H]2OC(=O)[C@@H](OC(=O)C)[C@@H]21 ZOZKYEHVNDEUCO-XUTVFYLZSA-N 0.000 description 1
- 229950002684 Aceglatone Drugs 0.000 description 1
- 208000008919 Achondroplasia Diseases 0.000 description 1
- 208000004064 Acoustic Neuroma Diseases 0.000 description 1
- 206010069754 Acquired gene mutation Diseases 0.000 description 1
- TXUZVZSFRXZGTL-QPLCGJKRSA-N Afimoxifene Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=C(O)C=C1 TXUZVZSFRXZGTL-QPLCGJKRSA-N 0.000 description 1
- 102000019599 Aldehyde Dehydrogenase 1 Human genes 0.000 description 1
- 108010044647 Aldehyde Dehydrogenase 1 Proteins 0.000 description 1
- QMGUSPDJTPDFSF-UHFFFAOYSA-N Aldophosphamide Chemical compound ClCCN(CCCl)P(=O)(N)OCCC=O QMGUSPDJTPDFSF-UHFFFAOYSA-N 0.000 description 1
- 229940045714 Alkyl sulfonate alkylating agents Drugs 0.000 description 1
- 241000710929 Alphavirus Species 0.000 description 1
- UUVWYPNAQBNQJQ-UHFFFAOYSA-N Altretamine Chemical compound CN(C)C1=NC(N(C)C)=NC(N(C)C)=N1 UUVWYPNAQBNQJQ-UHFFFAOYSA-N 0.000 description 1
- 206010001897 Alzheimer's disease Diseases 0.000 description 1
- 102000007325 Amelogenin Human genes 0.000 description 1
- 108010007570 Amelogenin Proteins 0.000 description 1
- 206010002383 Angina pectoris Diseases 0.000 description 1
- ORWYRWWVDCYOMK-HBZPZAIKSA-N Angiotensin I Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O)C(C)C)C1=CC=C(O)C=C1 ORWYRWWVDCYOMK-HBZPZAIKSA-N 0.000 description 1
- 102400000344 Angiotensin-1 Human genes 0.000 description 1
- 101800000734 Angiotensin-1 Proteins 0.000 description 1
- 229940046836 Anti-estrogens Drugs 0.000 description 1
- 229940064005 Antibiotic throat preparations Drugs 0.000 description 1
- 229940083879 Antibiotics FOR TREATMENT OF HEMORRHOIDS AND ANAL FISSURES FOR TOPICAL USE Drugs 0.000 description 1
- 229940042052 Antibiotics for systemic use Drugs 0.000 description 1
- 229940042786 Antitubercular Antibiotics Drugs 0.000 description 1
- 206010073360 Appendix cancer Diseases 0.000 description 1
- 108010093579 Arachidonate 5-Lipoxygenase Proteins 0.000 description 1
- 241000712891 Arenavirus Species 0.000 description 1
- 201000009695 Argentine hemorrhagic fever Diseases 0.000 description 1
- 108010078554 Aromatase Proteins 0.000 description 1
- 206010003210 Arteriosclerosis Diseases 0.000 description 1
- 241001480043 Arthrodermataceae Species 0.000 description 1
- 206010003571 Astrocytoma Diseases 0.000 description 1
- 102000004000 Aurora Kinase A Human genes 0.000 description 1
- 108090000461 Aurora Kinase A Proteins 0.000 description 1
- 208000010061 Autosomal Dominant Polycystic Kidney Diseases 0.000 description 1
- 206010064097 Avian influenza Diseases 0.000 description 1
- 229960002756 Azacitidine Drugs 0.000 description 1
- 102100013894 BCL2 Human genes 0.000 description 1
- 108060000885 BCL2 Proteins 0.000 description 1
- 102100018921 BCL2A1 Human genes 0.000 description 1
- 101710002713 BCL2A1 Proteins 0.000 description 1
- 102100015652 BCL2L2 Human genes 0.000 description 1
- 101710032376 BCL2L2-PABPN1 Proteins 0.000 description 1
- 102100011377 BCL6 Human genes 0.000 description 1
- 101700024247 BCL6 Proteins 0.000 description 1
- 101700004551 BRAF Proteins 0.000 description 1
- 102100004328 BRAF Human genes 0.000 description 1
- 102100007281 BRCA1 Human genes 0.000 description 1
- 101700076604 BRCA1 Proteins 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 206010004146 Basal cell carcinoma Diseases 0.000 description 1
- GQYIWUVLTXOXAJ-UHFFFAOYSA-N Belustine Chemical compound ClCCN(N=O)C(=O)NC1CCCCC1 GQYIWUVLTXOXAJ-UHFFFAOYSA-N 0.000 description 1
- VGGGPCQERPFHOB-MCIONIFRSA-N Bestatin Chemical compound CC(C)C[C@H](C(O)=O)NC(=O)[C@@H](O)[C@H](N)CC1=CC=CC=C1 VGGGPCQERPFHOB-MCIONIFRSA-N 0.000 description 1
- 206010004593 Bile duct cancer Diseases 0.000 description 1
- 241000335423 Blastomyces Species 0.000 description 1
- OYVAGSVQBOHSSS-WXFSZRTFSA-O Bleomycin Chemical class N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](OC1C(C(O)C(O)C(CO)O1)OC1C(C(OC(N)=O)C(O)C(CO)O1)O)C=1NC=NC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-WXFSZRTFSA-O 0.000 description 1
- 108010006654 Bleomycin Proteins 0.000 description 1
- 201000009694 Bolivian hemorrhagic fever Diseases 0.000 description 1
- 210000001185 Bone Marrow Anatomy 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 210000004556 Brain Anatomy 0.000 description 1
- 210000000133 Brain Stem Anatomy 0.000 description 1
- 208000003362 Bronchogenic Carcinoma Diseases 0.000 description 1
- 210000001217 Buttocks Anatomy 0.000 description 1
- 229960005084 CALCITRIOL Drugs 0.000 description 1
- XREUEWVEMYWFFA-CSKJXFQVSA-N CARUBICIN Chemical compound C1[C@H](N)[C@H](O)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=C(O)C=CC=C3C3=O)=C3C(O)=C2C[C@@](O)(C(C)=O)C1 XREUEWVEMYWFFA-CSKJXFQVSA-N 0.000 description 1
- 229950001725 CARUBICIN Drugs 0.000 description 1
- 102100019530 CCND2 Human genes 0.000 description 1
- 101700059002 CCND2 Proteins 0.000 description 1
- 102100016486 CCND3 Human genes 0.000 description 1
- 101700079292 CCND3 Proteins 0.000 description 1
- 101710005912 CDH2 Proteins 0.000 description 1
- 102100014481 CDH2 Human genes 0.000 description 1
- 101710032834 CDH20 Proteins 0.000 description 1
- 101710035596 CDH5 Proteins 0.000 description 1
- 102100005687 CDH5 Human genes 0.000 description 1
- 101700008359 CDK4 Proteins 0.000 description 1
- 102100019398 CDK4 Human genes 0.000 description 1
- 102100006130 CDK6 Human genes 0.000 description 1
- 102100003970 CDK8 Human genes 0.000 description 1
- 102100019348 CEBPA Human genes 0.000 description 1
- 101700058775 CEBPA Proteins 0.000 description 1
- 102100019698 CHEK2 Human genes 0.000 description 1
- 108060006647 CHEK2 Proteins 0.000 description 1
- FDKXTQMXEQVLRF-ZHACJKMWSA-N CN(C)\N=N\c1[nH]cnc1C(N)=O Chemical compound CN(C)\N=N\c1[nH]cnc1C(N)=O FDKXTQMXEQVLRF-ZHACJKMWSA-N 0.000 description 1
- 102100011432 CRKL Human genes 0.000 description 1
- 101700072217 CRKL Proteins 0.000 description 1
- 102100005573 CYP19A1 Human genes 0.000 description 1
- 102100017368 CYP2C8 Human genes 0.000 description 1
- 102100017367 CYP2C9 Human genes 0.000 description 1
- 229950009908 Cactinomycin Drugs 0.000 description 1
- HXCHCVDVKSCDHU-LULTVBGHSA-N Calicheamicin Chemical compound C1[C@H](OC)[C@@H](NCC)CO[C@H]1O[C@H]1[C@H](O[C@@H]2C\3=C(NC(=O)OC)C(=O)C[C@](C/3=C/CSSSC)(O)C#C\C=C/C#C2)O[C@H](C)[C@@H](NO[C@@H]2O[C@H](C)[C@@H](SC(=O)C=3C(=C(OC)C(O[C@H]4[C@@H]([C@H](OC)[C@@H](O)[C@H](C)O4)O)=C(I)C=3C)OC)[C@@H](O)C2)[C@@H]1O HXCHCVDVKSCDHU-LULTVBGHSA-N 0.000 description 1
- 241000589876 Campylobacter Species 0.000 description 1
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 1
- 229960004562 Carboplatin Drugs 0.000 description 1
- OLESAACUTLOWQZ-UHFFFAOYSA-L Carboplatin Chemical compound O=C1O[Pt]([N]([H])([H])[H])([N]([H])([H])[H])OC(=O)C11CCC1 OLESAACUTLOWQZ-UHFFFAOYSA-L 0.000 description 1
- SHHKQEUPHAENFK-UHFFFAOYSA-N Carboquone Chemical compound O=C1C(C)=C(N2CC2)C(=O)C(C(COC(N)=O)OC)=C1N1CC1 SHHKQEUPHAENFK-UHFFFAOYSA-N 0.000 description 1
- 208000002458 Carcinoid Tumor Diseases 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 210000003169 Central Nervous System Anatomy 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- 201000011470 Charcot-Marie-Tooth disease Diseases 0.000 description 1
- JWBOIMRXGHLCPP-UHFFFAOYSA-N Chloditan Chemical compound C=1C=CC=C(Cl)C=1C(C(Cl)Cl)C1=CC=C(Cl)C=C1 JWBOIMRXGHLCPP-UHFFFAOYSA-N 0.000 description 1
- 102000011045 Chloride Channels Human genes 0.000 description 1
- 108010062745 Chloride Channels Proteins 0.000 description 1
- HAWPXGHAZFHHAD-UHFFFAOYSA-N Chlormethine Chemical compound ClCCN(C)CCCl HAWPXGHAZFHHAD-UHFFFAOYSA-N 0.000 description 1
- MKQWTWSXVILIKJ-LXGUWJNJSA-N Chlorozotocin Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](C=O)NC(=O)N(N=O)CCCl MKQWTWSXVILIKJ-LXGUWJNJSA-N 0.000 description 1
- 206010008723 Chondrodystrophy Diseases 0.000 description 1
- 208000005243 Chondrosarcoma Diseases 0.000 description 1
- 208000006332 Choriocarcinoma Diseases 0.000 description 1
- 210000003483 Chromatin Anatomy 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 210000003917 Chromosomes, Human Anatomy 0.000 description 1
- 241000222290 Cladosporium Species 0.000 description 1
- 241000193403 Clostridium Species 0.000 description 1
- 210000001072 Colon Anatomy 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- 241000711573 Coronaviridae Species 0.000 description 1
- 241000186216 Corynebacterium Species 0.000 description 1
- 206010070976 Craniocerebral injury Diseases 0.000 description 1
- 208000009798 Craniopharyngioma Diseases 0.000 description 1
- 241001337994 Cryptococcus <scale insect> Species 0.000 description 1
- 108010025468 Cyclin-Dependent Kinase 6 Proteins 0.000 description 1
- 108010025415 Cyclin-Dependent Kinase 8 Proteins 0.000 description 1
- 229940119017 Cyclosporine Drugs 0.000 description 1
- 108010036949 Cyclosporine Proteins 0.000 description 1
- 229960000684 Cytarabine Drugs 0.000 description 1
- 108010000561 Cytochrome P-450 CYP2C8 Proteins 0.000 description 1
- 108010000543 Cytochrome P-450 CYP2C9 Proteins 0.000 description 1
- 102000003849 Cytochrome P450 Human genes 0.000 description 1
- 108050008488 Cytochrome P450 Proteins 0.000 description 1
- UHDGCWIWMRVCDJ-CCXZUQQUSA-N Cytosar Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](CO)O1 UHDGCWIWMRVCDJ-CCXZUQQUSA-N 0.000 description 1
- 229940104302 Cytosine Drugs 0.000 description 1
- OPTASPLRGRRNAP-UHFFFAOYSA-N Cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 1
- STQGQHZAVUOBTE-VGBVRHCVSA-N DAUNOMYCIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(C)=O)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 STQGQHZAVUOBTE-VGBVRHCVSA-N 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 108010014303 DNA-Directed DNA Polymerase Proteins 0.000 description 1
- 102000016928 DNA-Directed DNA Polymerase Human genes 0.000 description 1
- 102100006402 DNMT3A Human genes 0.000 description 1
- 101710038368 DNMT3A Proteins 0.000 description 1
- 102100002445 DNTT Human genes 0.000 description 1
- 229960000640 Dactinomycin Drugs 0.000 description 1
- 108010092160 Dactinomycin Proteins 0.000 description 1
- 229960000975 Daunorubicin Drugs 0.000 description 1
- NNJPGOLRFBJNIW-HNNXBMFYSA-N Demecolcine Chemical compound C1=C(OC)C(=O)C=C2[C@@H](NC)CCC3=CC(OC)=C(OC)C(OC)=C3C2=C1 NNJPGOLRFBJNIW-HNNXBMFYSA-N 0.000 description 1
- 208000001490 Dengue Diseases 0.000 description 1
- 206010012310 Dengue fever Diseases 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- BMKDZUISNHGIBY-ZETCQYMHSA-N Dexrazoxane Chemical compound C([C@H](C)N1CC(=O)NC(=O)C1)N1CC(=O)NC(=O)C1 BMKDZUISNHGIBY-ZETCQYMHSA-N 0.000 description 1
- 208000000398 DiGeorge Syndrome Diseases 0.000 description 1
- 206010012818 Diffuse large B-cell lymphoma Diseases 0.000 description 1
- 108010066455 Dihydrouracil Dehydrogenase (NADP) Proteins 0.000 description 1
- ZDZOTLJHXYCWBA-VCVYQWHSSA-N Docetaxel Chemical compound O([C@H]1[C@H]2[C@@](C([C@H](O)C3=C(C)[C@@H](OC(=O)[C@H](O)[C@@H](NC(=O)OC(C)(C)C)C=4C=CC=CC=4)C[C@]1(O)C3(C)C)=O)(C)[C@@H](O)C[C@H]1OC[C@]12OC(=O)C)C(=O)C1=CC=CC=C1 ZDZOTLJHXYCWBA-VCVYQWHSSA-N 0.000 description 1
- 229960004679 Doxorubicin Drugs 0.000 description 1
- NOTIQUSPUUHHEH-UXOVVSIBSA-N Drostanolone propionate Chemical compound C([C@@H]1CC2)C(=O)[C@H](C)C[C@]1(C)[C@@H]1[C@@H]2[C@@H]2CC[C@H](OC(=O)CC)[C@@]2(C)CC1 NOTIQUSPUUHHEH-UXOVVSIBSA-N 0.000 description 1
- 210000001198 Duodenum Anatomy 0.000 description 1
- 108010069091 Dystrophin Proteins 0.000 description 1
- 102000007698 EC 1.1.1.1 Human genes 0.000 description 1
- 108010021809 EC 1.1.1.1 Proteins 0.000 description 1
- 229940022766 EGTA Drugs 0.000 description 1
- 102100002671 EML4 Human genes 0.000 description 1
- 101700040935 EML4 Proteins 0.000 description 1
- 102100001810 EPHA3 Human genes 0.000 description 1
- 101700049294 EPHA3 Proteins 0.000 description 1
- 102100001727 EPHA6 Human genes 0.000 description 1
- 101700082598 EPHA6 Proteins 0.000 description 1
- 102100001728 EPHA7 Human genes 0.000 description 1
- 101700020058 EPHA7 Proteins 0.000 description 1
- 102100009831 EPHB4 Human genes 0.000 description 1
- 102100009565 EPHB6 Human genes 0.000 description 1
- 101700059782 EPHB6 Proteins 0.000 description 1
- 229960001904 EPIRUBICIN Drugs 0.000 description 1
- AOJJSUZBOXZQNB-VTZDEGQISA-N EPIRUBICIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-VTZDEGQISA-N 0.000 description 1
- 101700025368 ERBB2 Proteins 0.000 description 1
- 102100016662 ERBB2 Human genes 0.000 description 1
- 101700041204 ERBB3 Proteins 0.000 description 1
- 102000027776 ERBB3 Human genes 0.000 description 1
- 101700023619 ERBB4 Proteins 0.000 description 1
- 102100009851 ERBB4 Human genes 0.000 description 1
- 102100010876 ERCC2 Human genes 0.000 description 1
- 101700055371 ERG Proteins 0.000 description 1
- 102000033147 ERVK-25 Human genes 0.000 description 1
- ITSGNOIFAJAQHJ-BMFNZSJVSA-N ESORUBICIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)C[C@H](C)O1 ITSGNOIFAJAQHJ-BMFNZSJVSA-N 0.000 description 1
- 102100016237 ESR2 Human genes 0.000 description 1
- 102100002090 ETV1 Human genes 0.000 description 1
- 101700001792 ETV1 Proteins 0.000 description 1
- 102100002087 ETV4 Human genes 0.000 description 1
- 101700082619 ETV4 Proteins 0.000 description 1
- 102100002089 ETV6 Human genes 0.000 description 1
- 101700053672 ETV6 Proteins 0.000 description 1
- 102100016041 EZH2 Human genes 0.000 description 1
- 101700041849 EZH2 Proteins 0.000 description 1
- 241000710945 Eastern equine encephalitis virus Species 0.000 description 1
- 206010014071 Ebola disease Diseases 0.000 description 1
- 201000011001 Ebola hemorrhagic fever Diseases 0.000 description 1
- 201000006360 Edwards syndrome Diseases 0.000 description 1
- 229950000549 Elliptinium acetate Drugs 0.000 description 1
- 206010014599 Encephalitis Diseases 0.000 description 1
- SAMRUMKYXPVKPA-VFKOLLTISA-N Enocitabine Chemical compound O=C1N=C(NC(=O)CCCCCCCCCCCCCCCCCCCCC)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](CO)O1 SAMRUMKYXPVKPA-VFKOLLTISA-N 0.000 description 1
- 229950011487 Enocitabine Drugs 0.000 description 1
- 206010014950 Eosinophilia Diseases 0.000 description 1
- 206010014958 Eosinophilic leukaemia Diseases 0.000 description 1
- 206010014967 Ependymoma Diseases 0.000 description 1
- 108010055323 EphB4 Receptor Proteins 0.000 description 1
- 241001455610 Ephemerovirus Species 0.000 description 1
- 229950002973 Epitiostanol Drugs 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- 241000588722 Escherichia Species 0.000 description 1
- 210000003238 Esophagus Anatomy 0.000 description 1
- 229950002017 Esorubicin Drugs 0.000 description 1
- 208000002047 Essential Thrombocythemia Diseases 0.000 description 1
- 229960001842 Estramustine Drugs 0.000 description 1
- FRPJXPJMRWBBIH-RBRWEJTLSA-N Estramustine Chemical compound ClCCN(CCCl)C(=O)OC1=CC=C2[C@H]3CC[C@](C)([C@H](CC4)O)[C@@H]4[C@@H]3CCC2=C1 FRPJXPJMRWBBIH-RBRWEJTLSA-N 0.000 description 1
- 229960005237 Etoglucid Drugs 0.000 description 1
- UMILHIMHKXVDGH-UHFFFAOYSA-N Etoglucid Chemical compound C1OC1COCCOCCOCCOCC1CO1 UMILHIMHKXVDGH-UHFFFAOYSA-N 0.000 description 1
- 229960005420 Etoposide Drugs 0.000 description 1
- VJJPUSNTGOMMGY-MRVIYFEKSA-N Etoposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@H](C)OC[C@H]4O3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 VJJPUSNTGOMMGY-MRVIYFEKSA-N 0.000 description 1
- 241000579695 European bat 1 lyssavirus Species 0.000 description 1
- 108010062201 F-Box-WD Repeat-Containing Protein 7 Proteins 0.000 description 1
- 102100020361 F5 Human genes 0.000 description 1
- 102100020077 FBXW7 Human genes 0.000 description 1
- 102000027757 FGF receptors Human genes 0.000 description 1
- 108091008101 FGF receptors Proteins 0.000 description 1
- 102100018000 FGFR2 Human genes 0.000 description 1
- 102100020189 FGFR4 Human genes 0.000 description 1
- 101700075612 FGFR4 Proteins 0.000 description 1
- 102100004573 FLT3 Human genes 0.000 description 1
- 101710009074 FLT3 Proteins 0.000 description 1
- 102100013182 FLT4 Human genes 0.000 description 1
- 102100014881 FOXP4 Human genes 0.000 description 1
- 101700084508 FOXP4 Proteins 0.000 description 1
- 108010014172 Factor V Proteins 0.000 description 1
- 229960000301 Factor VIII Drugs 0.000 description 1
- 108010054218 Factor VIII Proteins 0.000 description 1
- 102000001690 Factor VIII Human genes 0.000 description 1
- 108010067741 Fanconi Anemia Complementation Group N Protein Proteins 0.000 description 1
- 229940043168 Fareston Drugs 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 102000004150 Flap endonucleases Human genes 0.000 description 1
- 108090000652 Flap endonucleases Proteins 0.000 description 1
- 241000710781 Flaviviridae Species 0.000 description 1
- ODKNJVUHOIMIIZ-RRKCRQDMSA-N Floxuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(F)=C1 ODKNJVUHOIMIIZ-RRKCRQDMSA-N 0.000 description 1
- 229960000961 Floxuridine Drugs 0.000 description 1
- 229960000304 Folic Acid Drugs 0.000 description 1
- 208000001914 Fragile X Syndrome Diseases 0.000 description 1
- 108009000484 Fragile X Syndrome Proteins 0.000 description 1
- 241000714188 Friend murine leukemia virus Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 241000223218 Fusarium Species 0.000 description 1
- 102100014305 GNAQ Human genes 0.000 description 1
- 101700035643 GNAQ Proteins 0.000 description 1
- 102000033185 GNAS Human genes 0.000 description 1
- 101700086896 GNAS Proteins 0.000 description 1
- 102100010488 GUCY1A2 Human genes 0.000 description 1
- 101710036434 GUCY1A2 Proteins 0.000 description 1
- GYHNNYVSQQEPJS-UHFFFAOYSA-N Gallium Chemical compound [Ga] GYHNNYVSQQEPJS-UHFFFAOYSA-N 0.000 description 1
- 101710041546 Galphas Proteins 0.000 description 1
- 206010018048 Gaucher's disease Diseases 0.000 description 1
- SDUQYLNIPVEERB-QPPQHZFASA-N Gemcitabine Chemical compound O=C1N=C(N)C=CN1[C@H]1C(F)(F)[C@H](O)[C@@H](CO)O1 SDUQYLNIPVEERB-QPPQHZFASA-N 0.000 description 1
- 206010018338 Glioma Diseases 0.000 description 1
- 102000004547 Glucosylceramidase Human genes 0.000 description 1
- 108010017544 Glucosylceramidase Proteins 0.000 description 1
- BLCLNMBMMGCOAS-URPVMXJPSA-N Goserelin Chemical compound C([C@@H](C(=O)N[C@H](COC(C)(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1[C@@H](CCC1)C(=O)NNC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H]1NC(=O)CC1)C1=CC=C(O)C=C1 BLCLNMBMMGCOAS-URPVMXJPSA-N 0.000 description 1
- 229960002913 Goserelin Drugs 0.000 description 1
- 108010069236 Goserelin Proteins 0.000 description 1
- 229940093922 Gynecological Antibiotics Drugs 0.000 description 1
- 102100020458 HLA-A Human genes 0.000 description 1
- 108010075704 HLA-A Antigens Proteins 0.000 description 1
- 102100020459 HLA-B Human genes 0.000 description 1
- 108010058607 HLA-B Antigens Proteins 0.000 description 1
- 102100020457 HLA-C Human genes 0.000 description 1
- 108010052199 HLA-C Antigens Proteins 0.000 description 1
- 108010062347 HLA-DQ Antigens Proteins 0.000 description 1
- 102000006354 HLA-DR Antigens Human genes 0.000 description 1
- 108010058597 HLA-DR Antigens Proteins 0.000 description 1
- 101710027869 HOXA3 Proteins 0.000 description 1
- 102100014797 HOXA3 Human genes 0.000 description 1
- 102100016790 HPRT1 Human genes 0.000 description 1
- 101710011940 HPRT1 Proteins 0.000 description 1
- 102100009283 HRAS Human genes 0.000 description 1
- 101710033925 HRAS Proteins 0.000 description 1
- 206010061192 Haemorrhagic fever Diseases 0.000 description 1
- 230000036499 Half-life Effects 0.000 description 1
- 208000001258 Hemangiosarcoma Diseases 0.000 description 1
- 241000893570 Hendra henipavirus Species 0.000 description 1
- 241000711549 Hepacivirus C Species 0.000 description 1
- 241000700721 Hepatitis B virus Species 0.000 description 1
- 241000724675 Hepatitis E virus Species 0.000 description 1
- 241000724709 Hepatitis delta virus Species 0.000 description 1
- 241000709721 Hepatovirus A Species 0.000 description 1
- 208000009889 Herpes Simplex Diseases 0.000 description 1
- 208000007514 Herpes Zoster Diseases 0.000 description 1
- 102000016871 Hexosaminidase A Human genes 0.000 description 1
- 108010053317 Hexosaminidase A Proteins 0.000 description 1
- 241000228402 Histoplasma Species 0.000 description 1
- 101710043757 Hln-2 Proteins 0.000 description 1
- 206010020243 Hodgkin's disease Diseases 0.000 description 1
- 201000006743 Hodgkin's lymphoma Diseases 0.000 description 1
- 229940088597 Hormone Drugs 0.000 description 1
- 241001502974 Human gammaherpesvirus 8 Species 0.000 description 1
- 241000701027 Human herpesvirus 6 Species 0.000 description 1
- 241000725303 Human immunodeficiency virus Species 0.000 description 1
- 241000713340 Human immunodeficiency virus 2 Species 0.000 description 1
- 235000008694 Humulus lupulus Nutrition 0.000 description 1
- 240000006600 Humulus lupulus Species 0.000 description 1
- 241000282620 Hylobates sp. Species 0.000 description 1
- 206010020772 Hypertension Diseases 0.000 description 1
- 101700033123 IAAT Proteins 0.000 description 1
- 101700030371 IDH2 Proteins 0.000 description 1
- 102100002772 IDH2 Human genes 0.000 description 1
- 102100013307 IGF2R Human genes 0.000 description 1
- 101710032496 IGF2R Proteins 0.000 description 1
- 102100008723 IKBKE Human genes 0.000 description 1
- 101710002884 IKBKE Proteins 0.000 description 1
- 102100008238 INHBA Human genes 0.000 description 1
- 102100002913 ITPA Human genes 0.000 description 1
- 101700030116 ITPA Proteins 0.000 description 1
- 229940015872 Ibandronate Drugs 0.000 description 1
- 229960000908 Idarubicin Drugs 0.000 description 1
- XDXDZDZNSLXDNA-TZNDIEGXSA-N Idarubicin hydrochloride Chemical compound C1[C@H](N)[C@H](O)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2C[C@@](O)(C(C)=O)C1 XDXDZDZNSLXDNA-TZNDIEGXSA-N 0.000 description 1
- 210000003405 Ileum Anatomy 0.000 description 1
- DOUYETYNHWVLEO-UHFFFAOYSA-N Imiquimod Chemical compound C1=CC=CC2=C3N(CC(C)C)C=NC3=C(N)N=C21 DOUYETYNHWVLEO-UHFFFAOYSA-N 0.000 description 1
- 241000713297 Influenza C virus Species 0.000 description 1
- RCINICONZNJXQF-MZXODVADSA-N Intaxel Chemical compound O([C@@H]1[C@@]2(C[C@@H](C(C)=C(C2(C)C)[C@H](C([C@]2(C)[C@@H](O)C[C@H]3OC[C@]3([C@H]21)OC(C)=O)=O)OC(=O)C)OC(=O)[C@H](O)[C@@H](NC(=O)C=1C=CC=CC=1)C=1C=CC=CC=1)O)C(=O)C1=CC=CC=C1 RCINICONZNJXQF-MZXODVADSA-N 0.000 description 1
- 241000229754 Iva xanthiifolia Species 0.000 description 1
- 102100019516 JAK2 Human genes 0.000 description 1
- 101700016050 JAK2 Proteins 0.000 description 1
- 102100019518 JAK3 Human genes 0.000 description 1
- 101700007593 JAK3 Proteins 0.000 description 1
- 101700011826 KCNH6 Proteins 0.000 description 1
- 102100002644 KCNJ11 Human genes 0.000 description 1
- 108060004074 KCNJ11 Proteins 0.000 description 1
- 210000003734 Kidney Anatomy 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- 108010001831 LDL receptors Proteins 0.000 description 1
- 108060004326 LDLR Proteins 0.000 description 1
- 102100012759 LRP2 Human genes 0.000 description 1
- 229950009700 Laflunimus Drugs 0.000 description 1
- 241001520693 Lagos bat lyssavirus Species 0.000 description 1
- 206010023927 Lassa fever Diseases 0.000 description 1
- 206010024190 Leiomyosarcomas Diseases 0.000 description 1
- 229920001491 Lentinan Polymers 0.000 description 1
- 208000009625 Lesch-Nyhan Syndrome Diseases 0.000 description 1
- 229940008250 Leuprolide Drugs 0.000 description 1
- 108010000817 Leuprolide Proteins 0.000 description 1
- 229960004338 Leuprorelin Drugs 0.000 description 1
- 210000004185 Liver Anatomy 0.000 description 1
- WDRYRZXSPDWGEB-UHFFFAOYSA-N Lonidamine Chemical compound C12=CC=CC=C2C(C(=O)O)=NN1CC1=CC=C(Cl)C=C1Cl WDRYRZXSPDWGEB-UHFFFAOYSA-N 0.000 description 1
- 210000004072 Lung Anatomy 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- 210000001165 Lymph Nodes Anatomy 0.000 description 1
- 206010025224 Lymphangiosarcomas Diseases 0.000 description 1
- 241000711828 Lyssavirus Species 0.000 description 1
- 108010068342 MAP Kinase Kinase 1 Proteins 0.000 description 1
- 108010068353 MAP Kinase Kinase 2 Proteins 0.000 description 1
- 102100006473 MAP2K1 Human genes 0.000 description 1
- 102100015877 MAP2K2 Human genes 0.000 description 1
- 101700074567 MCF2L Proteins 0.000 description 1
- 102100019155 MDM2 Human genes 0.000 description 1
- 101700032565 MDM2 Proteins 0.000 description 1
- 102000017274 MDM4 Human genes 0.000 description 1
- 108050005300 MDM4 Proteins 0.000 description 1
- 101710026102 MIC-ACT-2 Proteins 0.000 description 1
- 229950010718 MOPIDAMOL Drugs 0.000 description 1
- 102100013820 MSH2 Human genes 0.000 description 1
- 101700083509 MSH2 Proteins 0.000 description 1
- 229910015837 MSH2 Inorganic materials 0.000 description 1
- 102100002001 MSH6 Human genes 0.000 description 1
- 101700030163 MSH6 Proteins 0.000 description 1
- 102100004834 MT-ND4 Human genes 0.000 description 1
- 101710028315 MT-ND4 Proteins 0.000 description 1
- 102100013322 MTOR Human genes 0.000 description 1
- 101700036611 MTOR Proteins 0.000 description 1
- 102100003827 MUTYH Human genes 0.000 description 1
- 101700053678 MUTYH Proteins 0.000 description 1
- 102100015262 MYC Human genes 0.000 description 1
- 101700075357 MYC Proteins 0.000 description 1
- 102100018882 MYCN Human genes 0.000 description 1
- 241000712898 Machupo mammarenavirus Species 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- MQXVYODZCMMZEM-ZYUZMQFOSA-N Mannomustine Chemical compound ClCCNC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CNCCCl MQXVYODZCMMZEM-ZYUZMQFOSA-N 0.000 description 1
- 206010026798 Mantle cell lymphomas Diseases 0.000 description 1
- 208000000932 Marburg Virus Disease Diseases 0.000 description 1
- 201000011013 Marburg hemorrhagic fever Diseases 0.000 description 1
- 229960004961 Mechlorethamine Drugs 0.000 description 1
- 208000000172 Medulloblastoma Diseases 0.000 description 1
- SGDBTWWWUNNDEQ-LBPRGKRZSA-N Melphalan Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N(CCCl)CCCl)C=C1 SGDBTWWWUNNDEQ-LBPRGKRZSA-N 0.000 description 1
- 108010049137 Member 1 Subfamily D ATP Binding Cassette Transporter Proteins 0.000 description 1
- 108010090306 Member 2 Subfamily G ATP Binding Cassette Transporter Proteins 0.000 description 1
- 206010027191 Meningioma Diseases 0.000 description 1
- 206010027406 Mesothelioma Diseases 0.000 description 1
- 206010054949 Metaplasia Diseases 0.000 description 1
- 108010030837 Methylenetetrahydrofolate Reductase (NADPH2) Proteins 0.000 description 1
- 102000013760 Microphthalmia-Associated Transcription Factor Human genes 0.000 description 1
- 108010050345 Microphthalmia-Associated Transcription Factor Proteins 0.000 description 1
- 241001480037 Microsporum Species 0.000 description 1
- VFKZTMPDYBFSTM-KVTDHHQDSA-N Mitobronitol Chemical compound BrC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CBr VFKZTMPDYBFSTM-KVTDHHQDSA-N 0.000 description 1
- 229960003539 Mitoguazone Drugs 0.000 description 1
- 229960004857 Mitomycin Drugs 0.000 description 1
- 229960000350 Mitotane Drugs 0.000 description 1
- 241000725171 Mokola lyssavirus Species 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- 208000005871 Monkeypox Diseases 0.000 description 1
- 241000711513 Mononegavirales Species 0.000 description 1
- 210000000214 Mouth Anatomy 0.000 description 1
- 210000003205 Muscles Anatomy 0.000 description 1
- 208000003627 Muscular Dystrophy Diseases 0.000 description 1
- 241000186359 Mycobacterium Species 0.000 description 1
- 229960000951 Mycophenolic Acid Drugs 0.000 description 1
- 208000010125 Myocardial Infarction Diseases 0.000 description 1
- 206010068871 Myotonic dystrophy Diseases 0.000 description 1
- 208000001611 Myxosarcoma Diseases 0.000 description 1
- YMVWGSQGCWCDGW-UHFFFAOYSA-N N',N'-dimethyl-N-(1-nitroacridin-9-yl)propane-1,3-diamine Chemical compound C1=CC([N+]([O-])=O)=C2C(NCCCN(C)C)=C(C=CC=C3)C3=NC2=C1 YMVWGSQGCWCDGW-UHFFFAOYSA-N 0.000 description 1
- 108010010748 N-Myc Proto-Oncogene Protein Proteins 0.000 description 1
- NJSMWLQOCQIOPE-OCHFTUDZSA-N N-[(E)-[10-[(E)-(4,5-dihydro-1H-imidazol-2-ylhydrazinylidene)methyl]anthracen-9-yl]methylideneamino]-4,5-dihydro-1H-imidazol-2-amine Chemical compound N1CCN=C1N\N=C\C(C1=CC=CC=C11)=C(C=CC=C2)C2=C1\C=N\NC1=NCCN1 NJSMWLQOCQIOPE-OCHFTUDZSA-N 0.000 description 1
- 102100010499 NF2 Human genes 0.000 description 1
- 101700071070 NF2 Proteins 0.000 description 1
- 102100018287 NKX2-1 Human genes 0.000 description 1
- 101710012901 NKX2-1 Proteins 0.000 description 1
- 101700067251 NQO1 Proteins 0.000 description 1
- 102100002509 NQO1 Human genes 0.000 description 1
- 229940086322 Navelbine Drugs 0.000 description 1
- QZGIWPZCWHMVQL-UIYAJPBUSA-N Neocarzinostatin Chemical compound O1[C@H](C)[C@H](O)[C@H](O)[C@@H](NC)[C@H]1O[C@@H]1C/2=C/C#C[C@H]3O[C@@]3([C@@H]3OC(=O)OC3)C#CC\2=C[C@H]1OC(=O)C1=C(O)C=CC2=C(C)C=C(OC)C=C12 QZGIWPZCWHMVQL-UIYAJPBUSA-N 0.000 description 1
- 206010029260 Neuroblastoma Diseases 0.000 description 1
- 102000007530 Neurofibromin 1 Human genes 0.000 description 1
- 108010085793 Neurofibromin 1 Proteins 0.000 description 1
- XWXYUMMDTVBTOU-UHFFFAOYSA-N Nilutamide Chemical compound O=C1C(C)(C)NC(=O)N1C1=CC=C([N+]([O-])=O)C(C(F)(F)F)=C1 XWXYUMMDTVBTOU-UHFFFAOYSA-N 0.000 description 1
- 241000526636 Nipah henipavirus Species 0.000 description 1
- DLGOEMSEDOSKAD-UHFFFAOYSA-N Nitrumon Chemical compound ClCCNC(=O)N(N=O)CCCl DLGOEMSEDOSKAD-UHFFFAOYSA-N 0.000 description 1
- 208000002154 Non-Small-Cell Lung Carcinoma Diseases 0.000 description 1
- 241000714209 Norwalk virus Species 0.000 description 1
- 241000725177 Omsk hemorrhagic fever virus Species 0.000 description 1
- 229950011093 Onapristone Drugs 0.000 description 1
- 208000008760 Optic Nerve Disease Diseases 0.000 description 1
- 206010061323 Optic neuropathy Diseases 0.000 description 1
- 241000150452 Orthohantavirus Species 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 241000283898 Ovis Species 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- 101700062635 P2RY1 Proteins 0.000 description 1
- 102100004041 PALB2 Human genes 0.000 description 1
- 102100019725 PCDHGB4 Human genes 0.000 description 1
- 101710037580 PCDHGB4 Proteins 0.000 description 1
- 102100004940 PDGFRA Human genes 0.000 description 1
- 101710018349 PDGFRA Proteins 0.000 description 1
- 102100004939 PDGFRB Human genes 0.000 description 1
- 229950003180 PEPLOMYCIN Drugs 0.000 description 1
- 102100019471 PIK3CA Human genes 0.000 description 1
- 101710027440 PIK3CA Proteins 0.000 description 1
- 102100014818 PIK3R1 Human genes 0.000 description 1
- 101710039899 PIK3R1 Proteins 0.000 description 1
- 101710039569 POLM Proteins 0.000 description 1
- 102100003225 PRKDC Human genes 0.000 description 1
- 101700061400 PRKDC Proteins 0.000 description 1
- 108010011536 PTEN Phosphohydrolase Proteins 0.000 description 1
- 102000014160 PTEN Phosphohydrolase Human genes 0.000 description 1
- 102100017818 PTPN11 Human genes 0.000 description 1
- 101710018405 PTPN11 Proteins 0.000 description 1
- 102100005502 PTPRD Human genes 0.000 description 1
- 101700079089 PTPRD Proteins 0.000 description 1
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N PUROMYCIN Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 1
- 229950010131 PUROMYCIN Drugs 0.000 description 1
- 229960001592 Paclitaxel Drugs 0.000 description 1
- 210000000496 Pancreas Anatomy 0.000 description 1
- 208000004019 Papillary Adenocarcinoma Diseases 0.000 description 1
- 241001631646 Papillomaviridae Species 0.000 description 1
- 241000711504 Paramyxoviridae Species 0.000 description 1
- 229960002340 Pentostatin Drugs 0.000 description 1
- FPVKHBSQESCIEP-JQCXWYLXSA-N Pentostatin Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC[C@H]2O)=C2N=C1 FPVKHBSQESCIEP-JQCXWYLXSA-N 0.000 description 1
- 108010057150 Peplomycin Proteins 0.000 description 1
- 108091005771 Peptidases Proteins 0.000 description 1
- 210000001428 Peripheral Nervous System Anatomy 0.000 description 1
- 206010034699 Peroneal muscular atrophy Diseases 0.000 description 1
- 210000003800 Pharynx Anatomy 0.000 description 1
- 208000007641 Pinealoma Diseases 0.000 description 1
- NJBFOOCLYDNZJN-UHFFFAOYSA-N Pipobroman Chemical compound BrCCC(=O)N1CCN(C(=O)CCBr)CC1 NJBFOOCLYDNZJN-UHFFFAOYSA-N 0.000 description 1
- 206010035148 Plague Diseases 0.000 description 1
- 108010051742 Platelet-Derived Growth Factor beta Receptor Proteins 0.000 description 1
- 241000233870 Pneumocystis Species 0.000 description 1
- 208000008696 Polycythemia Vera Diseases 0.000 description 1
- 201000010769 Prader-Willi syndrome Diseases 0.000 description 1
- HFVNWDWLWUCIHC-GUPDPFMOSA-N Prednimustine Chemical compound O=C([C@@]1(O)CC[C@H]2[C@H]3[C@@H]([C@]4(C=CC(=O)C=C4CC3)C)[C@@H](O)C[C@@]21C)COC(=O)CCCC1=CC=C(N(CCCl)CCCl)C=C1 HFVNWDWLWUCIHC-GUPDPFMOSA-N 0.000 description 1
- CPTBDICYNRMXFX-UHFFFAOYSA-N Procarbazine Chemical compound CNNCC1=CC=C(C(=O)NC(C)C)C=C1 CPTBDICYNRMXFX-UHFFFAOYSA-N 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 241000125945 Protoparvovirus Species 0.000 description 1
- 241000589516 Pseudomonas Species 0.000 description 1
- 108010080192 Purinergic Receptors Proteins 0.000 description 1
- 206010037660 Pyrexia Diseases 0.000 description 1
- 101700078798 RARA Proteins 0.000 description 1
- 102100006051 RET Human genes 0.000 description 1
- 101700001630 RET Proteins 0.000 description 1
- 102000020497 RNA-Binding Proteins Human genes 0.000 description 1
- 108091022184 RNA-Binding Proteins Proteins 0.000 description 1
- 229920001186 RNA-Seq Polymers 0.000 description 1
- 102100012618 RPTOR Human genes 0.000 description 1
- 241000711798 Rabies lyssavirus Species 0.000 description 1
- GZUITABIAKMVPG-UHFFFAOYSA-N Raloxifene Chemical compound C1=CC(O)=CC=C1C1=C(C(=O)C=2C=CC(OCCN3CCCCC3)=CC=2)C2=CC=C(O)C=C2S1 GZUITABIAKMVPG-UHFFFAOYSA-N 0.000 description 1
- 229960004622 Raloxifene Drugs 0.000 description 1
- AHHFEZNOXOZZQA-ZEBDFXRSSA-N Ranimustine Chemical compound CO[C@H]1O[C@H](CNC(=O)N(CCCl)N=O)[C@@H](O)[C@H](O)[C@H]1O AHHFEZNOXOZZQA-ZEBDFXRSSA-N 0.000 description 1
- 102000012007 Rapamycin-Insensitive Companion of mTOR Protein Human genes 0.000 description 1
- 108010061204 Rapamycin-Insensitive Companion of mTOR Protein Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 210000000664 Rectum Anatomy 0.000 description 1
- 108010029031 Regulatory-Associated Protein of mTOR Proteins 0.000 description 1
- 208000006265 Renal Cell Carcinoma Diseases 0.000 description 1
- 229920000970 Repeated sequence (DNA) Polymers 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 241000713124 Rift Valley fever virus Species 0.000 description 1
- 229950004892 Rodorubicin Drugs 0.000 description 1
- 241000702670 Rotavirus Species 0.000 description 1
- 241000315672 SARS coronavirus Species 0.000 description 1
- 229950001403 SIZOFIRAN Drugs 0.000 description 1
- 108091006187 SLC-Transporter Proteins 0.000 description 1
- 102000037151 SLC-Transporter Human genes 0.000 description 1
- 102100017015 SLC22A2 Human genes 0.000 description 1
- 108091006650 SLC22A2 Proteins 0.000 description 1
- 102100007110 SLCO1B1 Human genes 0.000 description 1
- 101710042004 SLCO1B1 Proteins 0.000 description 1
- 102100017669 SMAD2 Human genes 0.000 description 1
- 101700012842 SMAD2 Proteins 0.000 description 1
- 102100017668 SMAD3 Human genes 0.000 description 1
- 101700079713 SMAD3 Proteins 0.000 description 1
- 102100017680 SMAD4 Human genes 0.000 description 1
- 101700062085 SMAD4 Proteins 0.000 description 1
- 102100019447 SMARCA4 Human genes 0.000 description 1
- 101710025703 SMARCA4 Proteins 0.000 description 1
- 102100004239 SOD2 Human genes 0.000 description 1
- 101700006931 SOX2 Proteins 0.000 description 1
- 102100018829 SOX2 Human genes 0.000 description 1
- 102100019647 SULT1A1 Human genes 0.000 description 1
- 101710017681 SULT1A1 Proteins 0.000 description 1
- 206010061934 Salivary gland cancer Diseases 0.000 description 1
- 241000607142 Salmonella Species 0.000 description 1
- 206010039447 Salmonellosis Diseases 0.000 description 1
- 206010039491 Sarcoma Diseases 0.000 description 1
- 210000001732 Sebaceous Glands Anatomy 0.000 description 1
- 241000607768 Shigella Species 0.000 description 1
- 206010040550 Shigella infection Diseases 0.000 description 1
- 235000015076 Shorea robusta Nutrition 0.000 description 1
- 208000007056 Sickle Cell Anemia Diseases 0.000 description 1
- 241000700584 Simplexvirus Species 0.000 description 1
- QFJCIRLUMZQUOT-HPLJOQBZSA-N Sirolimus Chemical compound C1C[C@@H](O)[C@H](OC)C[C@@H]1C[C@@H](C)[C@H]1OC(=O)[C@@H]2CCCCN2C(=O)C(=O)[C@](O)(O2)[C@H](C)CC[C@H]2C[C@H](OC)/C(C)=C/C=C/C=C/[C@@H](C)C[C@@H](C)C(=O)[C@H](OC)[C@H](O)/C(C)=C/[C@@H](C)C(=O)C1 QFJCIRLUMZQUOT-HPLJOQBZSA-N 0.000 description 1
- 241000212342 Sium Species 0.000 description 1
- 229920000519 Sizofiran Polymers 0.000 description 1
- 210000003491 Skin Anatomy 0.000 description 1
- 206010054184 Small intestine carcinoma Diseases 0.000 description 1
- 208000001203 Smallpox Diseases 0.000 description 1
- 229950006315 Spirogermanium Drugs 0.000 description 1
- 241001149962 Sporothrix Species 0.000 description 1
- 241000713675 Spumavirus Species 0.000 description 1
- 206010041823 Squamous cell carcinoma Diseases 0.000 description 1
- 241000191940 Staphylococcus Species 0.000 description 1
- 210000002784 Stomach Anatomy 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 229960001052 Streptozocin Drugs 0.000 description 1
- ZSJLQEPLLKMAKR-GKHCUFPYSA-N Streptozotocin Chemical compound O=NN(C)C(=O)N[C@H]1[C@@H](O)O[C@H](CO)[C@@H](O)[C@@H]1O ZSJLQEPLLKMAKR-GKHCUFPYSA-N 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- 241000282890 Sus Species 0.000 description 1
- 206010042863 Synovial sarcoma Diseases 0.000 description 1
- RCINICONZNJXQF-XAZOAEDWSA-N TAXOL® Chemical compound O([C@@H]1[C@@]2(CC(C(C)=C(C2(C)C)[C@H](C([C@]2(C)[C@@H](O)C[C@H]3OC[C@]3(C21)OC(C)=O)=O)OC(=O)C)OC(=O)[C@H](O)[C@@H](NC(=O)C=1C=CC=CC=1)C=1C=CC=CC=1)O)C(=O)C1=CC=CC=C1 RCINICONZNJXQF-XAZOAEDWSA-N 0.000 description 1
- 102100017503 TBX22 Human genes 0.000 description 1
- 101700077983 TBX22 Proteins 0.000 description 1
- 102100008790 TNFRSF14 Human genes 0.000 description 1
- 101710038601 TNFRSF14 Proteins 0.000 description 1
- 229950000212 TRIOXIFENE Drugs 0.000 description 1
- 102100008210 TSC2 Human genes 0.000 description 1
- 101700083014 TSC2 Proteins 0.000 description 1
- 229960001967 Tacrolimus Drugs 0.000 description 1
- QJJXYPPXXYFBGM-LFZNUXCKSA-N Tacrolimus Chemical compound C1C[C@@H](O)[C@H](OC)C[C@@H]1\C=C(/C)[C@@H]1[C@H](C)[C@@H](O)CC(=O)[C@H](CC=C)/C=C(C)/C[C@H](C)C[C@H](OC)[C@H]([C@H](C[C@H]2C)OC)O[C@@]2(O)C(=O)C(=O)N2CCCC[C@H]2C(=O)O1 QJJXYPPXXYFBGM-LFZNUXCKSA-N 0.000 description 1
- 229950004550 Talazoparib Drugs 0.000 description 1
- 229960001603 Tamoxifen Drugs 0.000 description 1
- NKANXQFJJICGDU-QPLCGJKRSA-N Tamoxifen Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=CC=C1 NKANXQFJJICGDU-QPLCGJKRSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- NAVMQTYZDKMPEU-UHFFFAOYSA-N Targretin Chemical compound CC1=CC(C(CCC2(C)C)(C)C)=C2C=C1C(=C)C1=CC=C(C(O)=O)C=C1 NAVMQTYZDKMPEU-UHFFFAOYSA-N 0.000 description 1
- 229960001278 Teniposide Drugs 0.000 description 1
- 206010057644 Testis cancer Diseases 0.000 description 1
- 229960005353 Testolactone Drugs 0.000 description 1
- 102000004377 Thiopurine S-methyltransferases Human genes 0.000 description 1
- 108090000958 Thiopurine S-methyltransferases Proteins 0.000 description 1
- 102000005497 Thymidylate Synthase Human genes 0.000 description 1
- 108010067449 Thymidylate Synthase Proteins 0.000 description 1
- 229950011457 Tiamiprine Drugs 0.000 description 1
- 241000130764 Tinea Species 0.000 description 1
- 208000002474 Tinea Diseases 0.000 description 1
- 229940024982 Topical Antifungal Antibiotics Drugs 0.000 description 1
- 206010057589 Total hypoxanthine-guanine phosphoribosyl transferase deficiency Diseases 0.000 description 1
- 208000005765 Traumatic Brain Injury Diseases 0.000 description 1
- 241000589886 Treponema Species 0.000 description 1
- 229950007229 Tresperimus Drugs 0.000 description 1
- 229950001353 Tretamine Drugs 0.000 description 1
- 229960001727 Tretinoin Drugs 0.000 description 1
- SHGAZHPCJJPHSC-NWVFGJFESA-N Tretinoin Chemical compound OC(=O)/C=C(\C)/C=C/C=C(C)C=CC1=C(C)CCCC1(C)C SHGAZHPCJJPHSC-NWVFGJFESA-N 0.000 description 1
- 229960004560 Triaziquone Drugs 0.000 description 1
- PXSOHRWMIRDKMP-UHFFFAOYSA-N Triaziquone Chemical compound O=C1C(N2CC2)=C(N2CC2)C(=O)C=C1N1CC1 PXSOHRWMIRDKMP-UHFFFAOYSA-N 0.000 description 1
- 241000223238 Trichophyton Species 0.000 description 1
- NOYPYLRCIDNJJB-UHFFFAOYSA-N Trimetrexate Chemical compound COC1=C(OC)C(OC)=CC(NCC=2C(=C3C(N)=NC(N)=NC3=CC=2)C)=C1 NOYPYLRCIDNJJB-UHFFFAOYSA-N 0.000 description 1
- 229960001099 Trimetrexate Drugs 0.000 description 1
- 208000000625 Triple X syndrome Diseases 0.000 description 1
- 208000006284 Trisomy 13 Syndrome Diseases 0.000 description 1
- 206010053884 Trisomy 18 Diseases 0.000 description 1
- 206010044688 Trisomy 21 Diseases 0.000 description 1
- 206010053871 Trisomy 8 Diseases 0.000 description 1
- UMKFEPPTGMDVMI-UHFFFAOYSA-N Trofosfamide Chemical compound ClCCN(CCCl)P1(=O)OCCCN1CCCl UMKFEPPTGMDVMI-UHFFFAOYSA-N 0.000 description 1
- HDZZVAMISRMYHH-LITAXDCLSA-N Tubercidin Chemical compound C1=CC=2C(N)=NC=NC=2N1[C@@H]1O[C@@H](CO)[C@H](O)[C@H]1O HDZZVAMISRMYHH-LITAXDCLSA-N 0.000 description 1
- 102000007581 Tuberous Sclerosis Complex 2 Protein Human genes 0.000 description 1
- 108010007089 Tuberous Sclerosis Complex 2 Protein Proteins 0.000 description 1
- 206010045181 Turner's syndrome Diseases 0.000 description 1
- 108010081268 Type 2 Fibroblast Growth Factor Receptor Proteins 0.000 description 1
- NMUSYJAQQFHJEW-KVTDHHQDSA-N U-18,496 Chemical compound O=C1N=C(N)N=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 NMUSYJAQQFHJEW-KVTDHHQDSA-N 0.000 description 1
- 229950009811 UBENIMEX Drugs 0.000 description 1
- 101700073311 USP9X Proteins 0.000 description 1
- 102100009329 USP9X Human genes 0.000 description 1
- IDPUKCWIGUEADI-UHFFFAOYSA-N Uramustine Chemical compound ClCCN(CCCl)C1=CNC(=O)NC1=O IDPUKCWIGUEADI-UHFFFAOYSA-N 0.000 description 1
- XCCTYIAWTASOJW-XVFCMESISA-N Uridine-5'-Diphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 XCCTYIAWTASOJW-XVFCMESISA-N 0.000 description 1
- 241000700647 Variola virus Species 0.000 description 1
- 108010053100 Vascular Endothelial Growth Factor Receptor-3 Proteins 0.000 description 1
- 230000037115 Vdr Effects 0.000 description 1
- 241000710959 Venezuelan equine encephalitis virus Species 0.000 description 1
- 201000009693 Venezuelan hemorrhagic fever Diseases 0.000 description 1
- 241000711970 Vesiculovirus Species 0.000 description 1
- 241000607265 Vibrio vulnificus Species 0.000 description 1
- 229960004528 Vincristine Drugs 0.000 description 1
- 229960004355 Vindesine Drugs 0.000 description 1
- UGGWPQSBPIFKDZ-KOTLKJBCSA-N Vindesine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(N)=O)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1N=C1[C]2C=CC=C1 UGGWPQSBPIFKDZ-KOTLKJBCSA-N 0.000 description 1
- 102000004210 Vitamin K Epoxide Reductases Human genes 0.000 description 1
- 108090000779 Vitamin K Epoxide Reductases Proteins 0.000 description 1
- 241000710886 West Nile virus Species 0.000 description 1
- 241000710951 Western equine encephalitis virus Species 0.000 description 1
- 208000008383 Wilms Tumor Diseases 0.000 description 1
- RLQVKDVIBJCQGE-WDUCDQOOSA-N Win-25540 Chemical compound O=C1C(C#N)C[C@]2(C)[C@H]3CC[C@](C)([C@H](CC4)O)[C@@H]4[C@@H]3CC[C@@]32O[C@@H]31 RLQVKDVIBJCQGE-WDUCDQOOSA-N 0.000 description 1
- 206010056894 XYY syndrome Diseases 0.000 description 1
- 208000002622 XYY syndrome 47 Diseases 0.000 description 1
- 229940053867 Xeloda Drugs 0.000 description 1
- 108010041009 Xeroderma Pigmentosum Group D Protein Proteins 0.000 description 1
- AOCCBINRVIKJHY-UHFFFAOYSA-N Yamafur Chemical compound CCCCCCNC(=O)N1C=C(F)C(=O)NC1=O AOCCBINRVIKJHY-UHFFFAOYSA-N 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- FBTUMDXHSRTGRV-ALTNURHMSA-N Zorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(\C)=N\NC(=O)C=1C=CC=CC=1)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 FBTUMDXHSRTGRV-ALTNURHMSA-N 0.000 description 1
- QIMGFXOHTOXMQP-GFAGFCTOSA-N [(2R,3S,4S,5R,6R)-2-[(2R,3S,4S,5S,6S)-2-[(1R,2S)-2-[[6-amino-2-[(1S)-3-amino-1-[[(2S)-2,3-diamino-3-oxopropyl]amino]-3-oxopropyl]-5-methylpyrimidine-4-carbonyl]amino]-3-[[(2R,3S,4S)-3-hydroxy-5-[[(2S,3R)-3-hydroxy-1-oxo-1-[2-[4-[4-[3-[[(1S)-1-phenylethyl] Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCCN[C@@H](C)C=1C=CC=CC=1)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1NC=NC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C QIMGFXOHTOXMQP-GFAGFCTOSA-N 0.000 description 1
- SPJCRMJCFSJKDE-ZWBUGVOYSA-N [(3S,8S,9S,10R,13R,14S,17R)-10,13-dimethyl-17-[(2R)-6-methylheptan-2-yl]-2,3,4,7,8,9,11,12,14,15,16,17-dodecahydro-1H-cyclopenta[a]phenanthren-3-yl] 2-[4-[bis(2-chloroethyl)amino]phenyl]acetate Chemical compound O([C@@H]1CC2=CC[C@H]3[C@@H]4CC[C@@H]([C@]4(CC[C@@H]3[C@@]2(C)CC1)C)[C@H](C)CCCC(C)C)C(=O)CC1=CC=C(N(CCCl)CCCl)C=C1 SPJCRMJCFSJKDE-ZWBUGVOYSA-N 0.000 description 1
- IHGLINDYFMDHJG-UHFFFAOYSA-N [2-(4-methoxyphenyl)-3,4-dihydronaphthalen-1-yl]-[4-(2-pyrrolidin-1-ylethoxy)phenyl]methanone Chemical compound C1=CC(OC)=CC=C1C(CCC1=CC=CC=C11)=C1C(=O)C(C=C1)=CC=C1OCCN1CCCC1 IHGLINDYFMDHJG-UHFFFAOYSA-N 0.000 description 1
- LVBMFPUTQOHXQE-UHFFFAOYSA-N [2-[6-(diaminomethylideneamino)hexylamino]-2-oxoethyl] N-[4-(3-aminopropylamino)butyl]carbamate Chemical compound NCCCNCCCCNC(=O)OCC(=O)NCCCCCCN=C(N)N LVBMFPUTQOHXQE-UHFFFAOYSA-N 0.000 description 1
- NUKCGLDCWQXYOQ-UHFFFAOYSA-N [3-[4-(3-methylsulfonyloxypropanoyl)piperazin-1-yl]-3-oxopropyl] methanesulfonate Chemical compound CS(=O)(=O)OCCC(=O)N1CCN(C(=O)CCOS(C)(=O)=O)CC1 NUKCGLDCWQXYOQ-UHFFFAOYSA-N 0.000 description 1
- 230000001594 aberrant Effects 0.000 description 1
- 230000001154 acute Effects 0.000 description 1
- 239000012082 adaptor molecule Substances 0.000 description 1
- 201000005188 adrenal gland cancer Diseases 0.000 description 1
- 230000001800 adrenalinergic Effects 0.000 description 1
- 201000011452 adrenoleukodystrophy Diseases 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N aldehydo-D-arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- 229960001445 alitretinoin Drugs 0.000 description 1
- 150000008052 alkyl sulfonates Chemical class 0.000 description 1
- 239000002168 alkylating agent Substances 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 229960000473 altretamine Drugs 0.000 description 1
- 235000001014 amino acid Nutrition 0.000 description 1
- 150000001413 amino acids Chemical class 0.000 description 1
- 229960002749 aminolevulinic acid Drugs 0.000 description 1
- ZGXJTSGNIOSYLO-UHFFFAOYSA-N aminolevulinic acid Chemical compound NCC(=O)CCC(O)=O ZGXJTSGNIOSYLO-UHFFFAOYSA-N 0.000 description 1
- 239000003098 androgen Substances 0.000 description 1
- 201000003076 angiosarcoma Diseases 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 230000002280 anti-androgenic Effects 0.000 description 1
- 230000001833 anti-estrogenic Effects 0.000 description 1
- 230000003388 anti-hormone Effects 0.000 description 1
- 230000000340 anti-metabolite Effects 0.000 description 1
- 239000000051 antiandrogen Substances 0.000 description 1
- 102000004965 antibodies Human genes 0.000 description 1
- 108090001123 antibodies Proteins 0.000 description 1
- 239000002256 antimetabolite Substances 0.000 description 1
- 229940045687 antimetabolites Folic acid analogs Drugs 0.000 description 1
- 230000001640 apoptogenic Effects 0.000 description 1
- 101700017456 asd-1 Proteins 0.000 description 1
- 201000001320 atherosclerosis Diseases 0.000 description 1
- NOWKCMXCCJGMRR-UHFFFAOYSA-N aziridine Chemical class C1CN1 NOWKCMXCCJGMRR-UHFFFAOYSA-N 0.000 description 1
- 150000001541 aziridines Chemical class 0.000 description 1
- 230000001580 bacterial Effects 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 229960002938 bexarotene Drugs 0.000 description 1
- LKJPYSCBVHEWIU-KRWDZBQOSA-N bicalutamide Chemical compound C([C@@](O)(C)C(=O)NC=1C=C(C(C#N)=CC=1)C(F)(F)F)S(=O)(=O)C1=CC=C(F)C=C1 LKJPYSCBVHEWIU-KRWDZBQOSA-N 0.000 description 1
- 229960000997 bicalutamide Drugs 0.000 description 1
- 201000007180 bile duct carcinoma Diseases 0.000 description 1
- 201000009036 biliary tract cancer Diseases 0.000 description 1
- 230000003115 biocidal Effects 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 229950008548 bisantrene Drugs 0.000 description 1
- 201000001531 bladder carcinoma Diseases 0.000 description 1
- MPBVHIBUJCELCL-UHFFFAOYSA-N bondronat Chemical compound CCCCCN(C)CCC(O)(P(O)(O)=O)P(O)(O)=O MPBVHIBUJCELCL-UHFFFAOYSA-N 0.000 description 1
- 201000005200 bronchus cancer Diseases 0.000 description 1
- 229960002115 carboquone Drugs 0.000 description 1
- 229960003261 carmofur Drugs 0.000 description 1
- 229960005243 carmustine Drugs 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 230000000973 chemotherapeutic Effects 0.000 description 1
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 1
- 229960001480 chlorozotocin Drugs 0.000 description 1
- 201000009047 chordoma Diseases 0.000 description 1
- 229960001265 ciclosporin Drugs 0.000 description 1
- 229920002080 circulating cell-free DNA Polymers 0.000 description 1
- 229960004316 cisplatin Drugs 0.000 description 1
- 201000011231 colorectal cancer Diseases 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 201000003883 cystic fibrosis Diseases 0.000 description 1
- 229960003901 dacarbazine Drugs 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 230000002939 deleterious Effects 0.000 description 1
- 229960005052 demecolcine Drugs 0.000 description 1
- 201000002949 dengue disease Diseases 0.000 description 1
- 229960002923 denileukin diftitox Drugs 0.000 description 1
- 108010017271 denileukin diftitox Proteins 0.000 description 1
- 230000037304 dermatophytes Effects 0.000 description 1
- 230000001809 detectable Effects 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 229950002389 diaziquone Drugs 0.000 description 1
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 1
- 239000003534 dna topoisomerase inhibitor Substances 0.000 description 1
- ZWAOHEXOSAUJHY-ZIYNGMLESA-N doxifluridine Chemical compound O[C@@H]1[C@H](O)[C@@H](C)O[C@H]1N1C(=O)NC(=O)C(F)=C1 ZWAOHEXOSAUJHY-ZIYNGMLESA-N 0.000 description 1
- 229950005454 doxifluridine Drugs 0.000 description 1
- 229940017743 dromostanolone propionate Drugs 0.000 description 1
- 229950004683 drostanolone propionate Drugs 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N edta Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 229950007539 elliptinium Drugs 0.000 description 1
- 201000009051 embryonal carcinoma Diseases 0.000 description 1
- 230000002357 endometrial Effects 0.000 description 1
- 230000002616 endonucleolytic Effects 0.000 description 1
- 102000017256 epidermal growth factor-activated receptor activity proteins Human genes 0.000 description 1
- 108040009258 epidermal growth factor-activated receptor activity proteins Proteins 0.000 description 1
- OBMLHUPNRURLOK-XGRAFVIBSA-N epitiostanol Chemical compound C1[C@@H]2S[C@@H]2C[C@]2(C)[C@H]3CC[C@](C)([C@H](CC4)O)[C@@H]4[C@@H]3CC[C@H]21 OBMLHUPNRURLOK-XGRAFVIBSA-N 0.000 description 1
- 239000000328 estrogen antagonist Substances 0.000 description 1
- 108091007910 estrogen receptors beta Proteins 0.000 description 1
- QSRLNKCNOLVZIR-KRWDZBQOSA-N ethyl (2S)-2-[[2-[4-[bis(2-chloroethyl)amino]phenyl]acetyl]amino]-4-methylsulfanylbutanoate Chemical compound CCOC(=O)[C@H](CCSC)NC(=O)CC1=CC=C(N(CCCl)CCCl)C=C1 QSRLNKCNOLVZIR-KRWDZBQOSA-N 0.000 description 1
- WVYXNIXAMZOZFK-UHFFFAOYSA-N ethyl N-[2,5-bis(aziridin-1-yl)-4-(ethoxycarbonylamino)-3,6-dioxocyclohexa-1,4-dien-1-yl]carbamate Chemical compound O=C1C(NC(=O)OCC)=C(N2CC2)C(=O)C(NC(=O)OCC)=C1N1CC1 WVYXNIXAMZOZFK-UHFFFAOYSA-N 0.000 description 1
- JOYRKODLDBILNP-UHFFFAOYSA-N ethyl urethane Chemical compound CCOC(N)=O JOYRKODLDBILNP-UHFFFAOYSA-N 0.000 description 1
- DEFVIWRASFVYLL-UHFFFAOYSA-N ethylene glycol bis(2-aminoethyl)tetraacetic acid Chemical compound OC(=O)CN(CC(O)=O)CCOCCOCCN(CC(O)=O)CC(O)=O DEFVIWRASFVYLL-UHFFFAOYSA-N 0.000 description 1
- 230000001036 exonucleolytic Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 201000003542 factor VIII deficiency Diseases 0.000 description 1
- 229960000390 fludarabine Drugs 0.000 description 1
- GIUYCYHIANZCFB-FJFJXFQQSA-N fludarabine phosphate Chemical compound C1=NC=2C(N)=NC(F)=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O GIUYCYHIANZCFB-FJFJXFQQSA-N 0.000 description 1
- 235000019152 folic acid Nutrition 0.000 description 1
- 239000011724 folic acid Substances 0.000 description 1
- 201000003444 follicular lymphoma Diseases 0.000 description 1
- 238000005755 formation reaction Methods 0.000 description 1
- 229960004783 fotemustine Drugs 0.000 description 1
- YAKWPXVTIGTRJH-UHFFFAOYSA-N fotemustine Chemical compound CCOP(=O)(OCC)C(C)NC(=O)N(CCCl)N=O YAKWPXVTIGTRJH-UHFFFAOYSA-N 0.000 description 1
- 238000007710 freezing Methods 0.000 description 1
- 230000002538 fungal Effects 0.000 description 1
- 229910052733 gallium Inorganic materials 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 229960005277 gemcitabine Drugs 0.000 description 1
- 210000004602 germ cell Anatomy 0.000 description 1
- 150000002338 glycosides Chemical class 0.000 description 1
- 238000000227 grinding Methods 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 201000010238 heart disease Diseases 0.000 description 1
- 230000002489 hematologic Effects 0.000 description 1
- 201000000907 hemorrhagic fever with renal syndrome Diseases 0.000 description 1
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 229960001330 hydroxycarbamide Drugs 0.000 description 1
- VSNHCAURESNICA-UHFFFAOYSA-N hydroxyurea Chemical compound NC(=O)NO VSNHCAURESNICA-UHFFFAOYSA-N 0.000 description 1
- 229940027318 hydroxyurea Drugs 0.000 description 1
- 229960002751 imiquimod Drugs 0.000 description 1
- 229950008097 improsulfan Drugs 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 230000002757 inflammatory Effects 0.000 description 1
- 200000000004 influenza B Diseases 0.000 description 1
- 108010019691 inhibin beta A subunit Proteins 0.000 description 1
- 229940079866 intestinal antibiotics Drugs 0.000 description 1
- 230000000302 ischemic Effects 0.000 description 1
- KFZMGEQAYNKOFK-UHFFFAOYSA-N iso-propanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 1
- 101700064822 lcb1 Proteins 0.000 description 1
- 229940115286 lentinan Drugs 0.000 description 1
- GFIJNRVAKGFPGQ-LIJARHBVSA-N leuprolide Chemical compound CCNC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CC=1N=CNC=1)NC(=O)[C@H]1NC(=O)CC1)CC1=CC=C(O)C=C1 GFIJNRVAKGFPGQ-LIJARHBVSA-N 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 229960002247 lomustine Drugs 0.000 description 1
- 201000006903 long QT syndrome 3 Diseases 0.000 description 1
- 229960003538 lonidamine Drugs 0.000 description 1
- 201000005202 lung cancer Diseases 0.000 description 1
- FYYHWMGAXLPEAU-UHFFFAOYSA-N magnesium Chemical compound [Mg] FYYHWMGAXLPEAU-UHFFFAOYSA-N 0.000 description 1
- 229910052749 magnesium Inorganic materials 0.000 description 1
- 239000011777 magnesium Substances 0.000 description 1
- 229950008612 mannomustine Drugs 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 229960001924 melphalan Drugs 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 230000003340 mental Effects 0.000 description 1
- 230000015689 metaplastic ossification Effects 0.000 description 1
- VJRAUFKOOPNFIQ-TVEKBUMESA-N methyl (1R,2R,4S)-4-[(2R,4S,5S,6S)-5-[(2S,4S,5S,6S)-5-[(2S,4S,5S,6S)-4,5-dihydroxy-6-methyloxan-2-yl]oxy-4-hydroxy-6-methyloxan-2-yl]oxy-4-(dimethylamino)-6-methyloxan-2-yl]oxy-2-ethyl-2,5,7,10-tetrahydroxy-6,11-dioxo-3,4-dihydro-1H-tetracene-1-carboxylat Chemical compound O([C@H]1[C@@H](O)C[C@@H](O[C@H]1C)O[C@H]1[C@H](C[C@@H](O[C@H]1C)O[C@H]1C[C@]([C@@H](C2=CC=3C(=O)C4=C(O)C=CC(O)=C4C(=O)C=3C(O)=C21)C(=O)OC)(O)CC)N(C)C)[C@H]1C[C@H](O)[C@H](O)[C@H](C)O1 VJRAUFKOOPNFIQ-TVEKBUMESA-N 0.000 description 1
- 229960005485 mitobronitol Drugs 0.000 description 1
- MXWHMTNPTTVWDM-NXOFHUPFSA-N mitoguazone Chemical compound NC(N)=N\N=C(/C)\C=N\N=C(N)N MXWHMTNPTTVWDM-NXOFHUPFSA-N 0.000 description 1
- 229950010913 mitolactol Drugs 0.000 description 1
- ZAHQPTJLOCWVPG-UHFFFAOYSA-N mitoxantrone dihydrochloride Chemical compound Cl.Cl.O=C1C2=C(O)C=CC(O)=C2C(=O)C2=C1C(NCCNCCO)=CC=C2NCCNCCO ZAHQPTJLOCWVPG-UHFFFAOYSA-N 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000006011 modification reaction Methods 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- 108010066419 multidrug resistance-associated protein 2 Proteins 0.000 description 1
- 201000006938 muscular dystrophy Diseases 0.000 description 1
- HPNSFSBZBAHARI-RUDMXATFSA-N mycophenolic acid Chemical compound OC1=C(C\C=C(/C)CCC(O)=O)C(OC)=C(C)C2=C1C(=O)OC2 HPNSFSBZBAHARI-RUDMXATFSA-N 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 230000002071 myeloproliferative Effects 0.000 description 1
- 238000002663 nebulization Methods 0.000 description 1
- 230000001338 necrotic Effects 0.000 description 1
- 201000008026 nephroblastoma Diseases 0.000 description 1
- 230000000955 neuroendocrine Effects 0.000 description 1
- 229960002653 nilutamide Drugs 0.000 description 1
- 229950008607 nitracrine Drugs 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 229950009266 nogalamycin Drugs 0.000 description 1
- MAZYQGHSTXUZJF-ZBRHGPMOSA-N nogalamycin Chemical compound CO[C@@H]1[C@@](O)(C)[C@@H](OC)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=C(O)C=C4[C@]5(C)C[C@H](C[C@@H](O5)CC4=C3C3=O)N(C)C)=C3C=C2[C@@H](C(=O)OC)[C@@](C)(O)C1 MAZYQGHSTXUZJF-ZBRHGPMOSA-N 0.000 description 1
- 238000009376 nuclear reprocessing Methods 0.000 description 1
- 229960000572 olaparib Drugs 0.000 description 1
- 201000010133 oligodendroglioma Diseases 0.000 description 1
- 230000002246 oncogenic Effects 0.000 description 1
- 231100000590 oncogenic Toxicity 0.000 description 1
- 229940005935 ophthalmologic Antibiotics Drugs 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 201000010198 papillary carcinoma Diseases 0.000 description 1
- 230000036961 partial Effects 0.000 description 1
- 201000011252 phenylketonuria Diseases 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- 238000000053 physical method Methods 0.000 description 1
- 201000004123 pineal gland cancer Diseases 0.000 description 1
- 229960000952 pipobroman Drugs 0.000 description 1
- 229950001100 piposulfan Drugs 0.000 description 1
- 229960001221 pirarubicin Drugs 0.000 description 1
- 150000003057 platinum Chemical class 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 229960004694 prednimustine Drugs 0.000 description 1
- 238000009598 prenatal testing Methods 0.000 description 1
- 229960000624 procarbazine Drugs 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 150000003212 purines Chemical class 0.000 description 1
- 150000003230 pyrimidines Chemical class 0.000 description 1
- 238000003908 quality control method Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000004445 quantitative analysis Methods 0.000 description 1
- 229960002185 ranimustine Drugs 0.000 description 1
- 229960000460 razoxane Drugs 0.000 description 1
- 230000000268 renotropic Effects 0.000 description 1
- 201000000582 retinoblastoma Diseases 0.000 description 1
- 229930002330 retinoic acid Natural products 0.000 description 1
- 201000009410 rhabdomyosarcoma Diseases 0.000 description 1
- 125000000548 ribosyl group Chemical group C1([C@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 1
- 201000010208 seminoma Diseases 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 201000003176 severe acute respiratory syndrome Diseases 0.000 description 1
- 229960002930 sirolimus Drugs 0.000 description 1
- 210000001082 somatic cell Anatomy 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 210000000130 stem cell Anatomy 0.000 description 1
- 108020003113 steroid hormone receptors Proteins 0.000 description 1
- 108010045815 superoxide dismutase 2 Proteins 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 238000001356 surgical procedure Methods 0.000 description 1
- 201000010965 sweat gland carcinoma Diseases 0.000 description 1
- 201000008736 systemic mastocytosis Diseases 0.000 description 1
- 229930003347 taxol Natural products 0.000 description 1
- 201000003120 testicular cancer Diseases 0.000 description 1
- BPEWUONYVDABNZ-DZBHQSCQSA-N testolactone Chemical compound O=C1C=C[C@]2(C)[C@H]3CC[C@](C)(OC(=O)CC4)[C@@H]4[C@@H]3CCC2=C1 BPEWUONYVDABNZ-DZBHQSCQSA-N 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 230000001131 transforming Effects 0.000 description 1
- 230000001052 transient Effects 0.000 description 1
- LXZZYRPGZAFOLE-UHFFFAOYSA-L transplatin Chemical compound [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H] LXZZYRPGZAFOLE-UHFFFAOYSA-L 0.000 description 1
- IUCJMVBFZDHPDX-UHFFFAOYSA-N tretamine Chemical compound C1CN1C1=NC(N2CC2)=NC(N2CC2)=N1 IUCJMVBFZDHPDX-UHFFFAOYSA-N 0.000 description 1
- 229960001670 trilostane Drugs 0.000 description 1
- 229960000875 trofosfamide Drugs 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 229960001055 uracil mustard Drugs 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 201000006266 variola major Diseases 0.000 description 1
- 201000000627 variola minor Diseases 0.000 description 1
- OGWKCGZFUXNPDA-XQKSVPLYSA-N vincristine Chemical compound C([N@]1C[C@@H](C[C@]2(C(=O)OC)C=3C(=CC4=C([C@]56[C@H]([C@@]([C@H](OC(C)=O)[C@]7(CC)C=CCN([C@H]67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)C[C@@](C1)(O)CC)CC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-XQKSVPLYSA-N 0.000 description 1
- GBABOYUKABKIAF-IELIFDKJSA-N vinorelbine Chemical compound C1N(CC=2C3=CC=CC=C3NC=22)CC(CC)=C[C@H]1C[C@]2(C(=O)OC)C1=CC([C@]23[C@H]([C@@]([C@H](OC(C)=O)[C@]4(CC)C=CCN([C@H]34)CC2)(O)C(=O)OC)N2C)=C2C=C1OC GBABOYUKABKIAF-IELIFDKJSA-N 0.000 description 1
- CILBMBUYJCWATM-IJDPFCGHSA-N vinorelbine L-tartrate Chemical compound OC(=O)[C@H](O)[C@@H](O)C(O)=O.OC(=O)[C@H](O)[C@@H](O)C(O)=O.C1N(CC=2C3=CC=CC=C3NC=22)CC(CC)=C[C@H]1C[C@]2(C(=O)OC)C1=CC([C@]23[C@H]([C@]([C@H](OC(C)=O)[C@]4(CC)C=CCN([C@H]34)CC2)(O)C(=O)OC)N2C)=C2C=C1OC CILBMBUYJCWATM-IJDPFCGHSA-N 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 229950009268 zinostatin Drugs 0.000 description 1
- 229960000641 zorubicin Drugs 0.000 description 1
Abstract
The present invention includes compositions and methods useful for the detection of a mutational change, SNP, translocation, inversion, deletion, change in copy number, or other genetic variation within a sample of cellular genomic DNA or cell-free DNA (cfDNA). In some embodiments, the compositions and methods of the present invention provide an extremely high level of resolution that is particularly useful in detecting copy number variations in a small fraction of the total cfDNA from a biological sample (e.g., blood).
Description
METHODS FOR THE DETECTION OF GENOMIC COPY CHANGES IN DNA
SAMPLES
REFERENCE TO RELATED APPLICATIONS
This application is a divisional of New Zealand Patent Application No. 750902,
which claims priority to U.S. Provisional Patent Application No. 62/379,593, filed August 25,
2016, and U.S. Provisional Patent Application No. ,538, filed April 4, 2017, each of which
are incorporated herein by reference in their entireties.
STATEMENT REGARDING SEQUENCE LISTING
The sequence listing associated with this application is provided in text format and
is hereby incorporated by reference into the specification.
TECHNICAL FIELD
The invention relates generally to compositions and methods for the quantitative
genetic analysis of biological samples, e.g., direct tissue biopsies or peripheral blood. In particular,
the present invention relates to methods for detection of target-specific copy number change, as
well as genetic characterization and analysis, of biological samples.
BACKGROUND
It is becoming increasingly clear that most, if not all, of the most common human
cancers are diseases of the human genome. It is thought that somatic mutations accumulate
during an individual’s lifetime, some of which increase the probability that the cell in which they
are harbored can develop into a tumor. With just the wrong combination of accumulated
mutational events, a precancerous growth loses constraints that keep uncontrolled proliferation in
check and the resulting cell mass becomes a cancer. The constellations of mutations that are
necessary and sufficient to cause cancer are often collectively referred to as “driver mutations.”
One of the themes that has emerged from recent and intensive molecular analysis is that cancer,
once thought of as a single, tissue-specific disease, is in fact a group of related diseases, each
with a unique molecular pathology. The human genome project laid the groundwork for
genome-wide analysis of cancers.
Changes in gene copy number are a fundamental driver of biological diversity. In
the context of evolution, duplication of genes and divergence of function is a well-recognized
driver of species diversity. In the context of human disease, gene loss and gene amplification
within somatic cells are hallmarks of diseased tissues such as cancer. Certain therapeutic agents
act specifically on cells with these genomic gain and/or loss mutations; however, the identification
of these copy number variations is difficult because such mutations are often present only within
the DNA of diseased or cancerous cells and are not found in other cells of the body. While the
diseased tissue or cells are the major source of the mutated DNA, acquiring DNA through a biopsy
is invasive, risky and often not possible. The observation that dying tumor or cancer cells release
small pieces of their DNA into the bloodstream, termed cell free DNA or circulating DNA, has
allowed for the development of genetic tests that can be performed with less invasive techniques,
such as a blood sample. However, only small amounts of DNA can be obtained by isolating cell
free DNA from a sample, and only a portion of the total DNA will carry the mutation associated
with the disease. For example, in the context of cancer genomics, diagnostically significant tumor
mutations are often only found at minor allele frequencies that are significantly less than 50%.
This is in contrast to conventional SNP genotyping where allele frequencies are generally ~100%,
50% or 0%.
Thus there is a need for genomic techniques capable of detecting genetic copy
number changes in specific target loci.
BRIEF SUMMARY
[0006a] In a first aspect, the invention relates to a kit comprising a set of adaptors,
wherein each adaptor of the set of adaptors comprises a sample tag region selected from a pool
of unique sample tag regions, wherein the pool is selected from a plurality of pools, and wherein
the selected pool is unique to a test sample;
wherein the test sample comprises a plurality of DNA fragments.
[0006b] In a second aspect, the invention relates to a DNA library, wherein the DNA
library comprises a plurality of DNA library fragments, wherein each of the DNA library
fragments comprises an adaptor module and a DNA fragment,
wherein the adaptor module is a DNA polynucleotide comprising (i) an amplification region, (ii)
a sample tag region, and (iii) an anchor region;
wherein the amplification region comprises a polynucleotide sequence capable of serving as a
primer recognition site for PCR amplification;
wherein the sample tag region identifies the test sample; and
wherein the anchor region comprises a polynucleotide sequence that is capable of attaching to a
DNA fragment.
[0006c] In a third aspect, the invention relates to a set of adaptors,
wherein each adaptor of the set of adaptors comprises a sample tag region selected from a pool
of unique sample tag regions, wherein the pool is selected from a plurality of pools, and wherein
the selected pool is unique to a test sample;
wherein each adaptor in said set of adaptors is a DNA polynucleotide that comprises: an
amplification region, a sample tag region, and an anchor region;
wherein the amplification region comprises a polynucleotide sequence capable of serving as a
primer recognition site for PCR amplification;
wherein the sample tag region identifies the test sample; and
wherein the anchor region comprises a polynucleotide sequence that is capable of attaching to a
DNA fragment.
Methods of detecting rare mutations in cfDNA have been previously described in
International PCT Publication No.
requisite sensitivity to detect the rarest copy number losses at very minor allele frequencies.
Provided herein are compositions and methods for detection of target-specific copy number change
that are applicable to several sample types, including direct tissue biopsies, peripheral blood, and
in particular cfDNA. The compositions and methods described herein are sensitive enough to
detect changes in copy number that are present in only a tiny fraction of the total DNA.
The present invention includes, inter alia, compositions and methods
that are useful for the detection of a mutational change, SNP, translocation, inversion,
deletion, change in copy number, or other genetic variation within a sample of cellular
genomic DNA (e.g., from a tissue biopsy sample) or cfDNA (e.g., from a blood sample). In
particular, the compositions and methods of the present invention provide an extremely high
level of resolution that is particularly useful in detecting copy number variations in a small
fraction of the total cfDNA from a biological sample (e.g., blood).
Particular embodiments are drawn to a method for performing a
genetic analysis on a DNA target region from a test sample comprising: (a) generating a
genomic DNA library comprising a plurality of DNA library fragments, wherein each of the
DNA library fragments comprises a genomic DNA fragment from the test sample and an
adaptor; (b) contacting the genomic DNA library with a plurality of capture probes that
specifically bind to a DNA target region, thereby forming complexes between the capture
probes and DNA library fragments comprising the DNA target region; and (c) performing a
quantitative genetic analysis of the genomic DNA fragments comprising the DNA target
region, wherein the adaptor is a DNA polynucleotide that comprises: an amplification region,
a sample tag region, and an anchor region; wherein the amplification region comprises a
polynucleotide sequence capable of serving as a primer recognition site for PCR
amplification; wherein the sample tag comprises a polynucleotide sequence that encodes an
identity of the unique library DNA fragment and encodes an identity of the test sample;
wherein the anchor region comprises a polynucleotide sequence that encodes the identity of
the test sample and wherein the anchor region is capable of attaching to the genomic DNA
fragment, and wherein the genetic analysis is performed to detect a genetic change indicative
of a disease state.
In some embodiments, the genetic change indicative of a disease state
is selected from a single nucleotide variant (SNV), an insertion less than 40 nucleotides in
length, a deletion of a DNA region less than 40 nucleotides in length, and/or a change in copy
number. In particular embodiments, the genetic change indicative of a disease state is a
change in copy number. In some embodiments, the test sample is a tissue biopsy. In various
embodiments, the tissue biopsy is taken from a tumor or a tissue suspected of being a tumor.
In certain embodiments, the genomic DNA is cell free DNA (cfDNA) or cellular DNA. In
particular embodiments, the genomic DNA is cfDNA isolated from the test sample, and
the test sample is a biological sample selected from the group consisting of: amniotic
fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine,
saliva, stool, mucous, and sweat.
In certain embodiments, the genomic DNA fragments are obtained by
steps comprising: (i) isolating cellular DNA from the test sample, and (ii) fragmenting the
cellular DNA to obtain the genomic DNA fragments. In particular embodiments, step (ii) is
performed by contacting the cellular DNA with at least one digestion enzyme. In some
embodiments, step (ii) is performed by applying mechanical stress to the cellular DNA. In
certain embodiments, the mechanical stress is applied by sonicating the cellular DNA.
In particular embodiments, the sample tag further comprises a unique
molecule identifier (UMI) that facilitates the identification of the unique genomic DNA
fragment.
In some embodiments, the amplification region is between 10 and 50
nucleotides in length. In particular embodiments, the amplification region is between 20 and
nucleotides in length. In certain embodiments, the amplification region is 25 nucleotides
in length.
In some embodiments, the sample tag is between 5 and 50 nucleotides
in length. In particular embodiments, the sample tag is between 5 and 15 nucleotides in
length. In certain embodiments, the sample tag is 8 nucleotides in length. In some
embodiments, the UMI multiplier is adjacent to or contained within the sample tag region.
In certain embodiments, the UMI multiplier is between 1 and 5
nucleotides in length. In particular embodiments, the UMI multiplier is 3 nucleotides in
length, and comprises one of 64 possible nucleotide sequences.
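The figure of 64 follows from four possible bases at each of three positions (4 × 4 × 4). A minimal Python sketch of that arithmetic is given here for illustration only; the function name `umi_multipliers` is hypothetical and is not taken from the specification.

```python
from itertools import product

def umi_multipliers(length=3, alphabet="ACGT"):
    """Enumerate every nucleotide sequence of the given length."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]

three_nt = umi_multipliers(3)
print(len(three_nt))   # 64, i.e. 4 ** 3
print(three_nt[:4])    # ['AAA', 'AAC', 'AAG', 'AAT']
```

The same function shows how the count scales over the stated 1 to 5 nucleotide range: 4 sequences for a 1-nucleotide multiplier and 1,024 for a 5-nucleotide multiplier.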
In some embodiments, the anchor region is between 1 and 50
nucleotides in length. In particular embodiments, the anchor region is between 5 and 25
nucleotides in length. In certain embodiments, the anchor region is 10 nucleotides in length.
Particular embodiments of the present invention are drawn to methods
where the step of (a) generating a genomic DNA library comprising a plurality of DNA
library fragments comprises attaching the genomic DNA fragments to a plurality of adaptors.
In certain embodiments, the genomic DNA fragments are end repaired prior to attaching the
genomic DNA fragments to a plurality of adaptors. In particular embodiments, the
amplification region of each adaptor of the plurality of adaptors comprises an identical
nucleotide sequence.
In certain embodiments, the sample tag region of each adaptor of the
plurality of adaptors comprises one of between 2 and 1,000 nucleotide sequences. In particular
embodiments, the sample tag region of each adaptor of the plurality of adaptors comprises one
of between 50 and 500 nucleotide sequences. In various embodiments, the sample tag region
of each adaptor of the plurality of adaptors comprises one of between 100 and 400 nucleotide
sequences. In some embodiments, the sample tag region of each adaptor of the plurality of
adaptors comprises one of between 200 and 300 nucleotide sequences. In certain
embodiments, the sample tag region of each adaptor of the plurality of adaptors is 8
nucleotides in length. In some embodiments, each sequence of the nucleotide sequences is
discrete from any other sequence of the 240 nucleotide sequences by a Hamming distance of at
least two.
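A Hamming distance of at least two means that any two sample tags differ at two or more positions, so a single substitution error cannot turn one valid tag into another. The following Python sketch shows one way such a tag set could be checked. The three example tags and all function names are hypothetical and are not sequences from the specification.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_hamming(tags):
    """Smallest Hamming distance between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))

def is_valid_tag_set(tags, tag_length=8, min_distance=2):
    """True if every tag has the expected length and every pair of tags
    is separated by at least min_distance substitutions."""
    if any(len(t) != tag_length for t in tags):
        return False
    return min_pairwise_hamming(tags) >= min_distance

# Hypothetical 8-nucleotide tags, for illustration only.
good_tags = ["ACGTACGT", "ACGTACCA", "TTGTACGT"]
bad_tags = ["ACGTACGT", "ACGTACGA"]   # differ at a single position

print(is_valid_tag_set(good_tags))  # True
print(is_valid_tag_set(bad_tags))   # False
```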
In particular embodiments, each of the plurality of adaptors comprises
a UMI multiplier that is adjacent to or contained within the sample tag region. In some
embodiments, each of the plurality of adaptors comprises a UMI multiplier that is adjacent to
the sample tag region. In certain embodiments, the UMI multiplier of each adaptor of the
plurality of adaptors is between 1 and 5 nucleotides in length. In some embodiments, the
UMI multiplier of each adaptor of the plurality of adaptors is three nucleotides in length.
In particular embodiments, the anchor tag region of each adaptor of the
plurality of adaptors comprises one of four nucleotide sequences, and each sample region of a
given sequence is paired to only one of the four anchor regions of a given sequence.
In some embodiments, the amplification region of each adaptor of the
plurality of adaptors comprises an identical nucleotide sequence; the sample tag region of
each adaptor of the plurality of adaptors is 8 nucleotides in length; the nucleotide sequence of
each sample tag is discrete from any other nucleotide sequence of the sample tags of the
plurality of adaptors by a Hamming distance of at least two; each of the plurality of adaptors
comprises a UMI multiplier that is adjacent to or contained within the sample tag region; the
UMI multiplier of each adaptor of the plurality of adaptors is three nucleotides in length;
the UMI multiplier of each of the possible nucleotide sequences is paired to each sample tag
region of the plurality of adaptors; the anchor tag region of each adaptor of the plurality of
adaptors comprises one of four nucleotide sequences; and each sample region of a given
sequence is paired to only one of the four anchor regions of a given sequence.
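The combinatorics of such an adaptor set can be sketched in Python as follows. This is an illustration under stated assumptions, not the construction used in the specification: the region order (amplification region, sample tag, UMI multiplier, anchor), the placeholder sequences, and the name `build_adaptor_set` are all hypothetical. Each sample tag is paired with exactly one anchor sequence and with every possible three-nucleotide UMI multiplier.

```python
from itertools import product

def build_adaptor_set(amplification_region, tag_to_anchor, umi_length=3):
    """Combine one shared amplification region with every
    (sample tag, UMI multiplier) pair; each tag keeps its single anchor.
    The region order used here is an assumption for illustration."""
    umis = ["".join(p) for p in product("ACGT", repeat=umi_length)]
    return [
        amplification_region + tag + umi + anchor
        for tag, anchor in tag_to_anchor.items()
        for umi in umis
    ]

# Placeholder sequences (hypothetical): a 25 nt amplification region,
# 8 nt sample tags, and 10 nt anchors.
AMPLIFICATION = "ACGTGCATGCATGGACCTGATCGAT"
TAG_TO_ANCHOR = {
    "ACGTACGT": "AACCGGTTAC",
    "ACGTACCA": "CCGGTTAACG",
    "TTGTACGT": "GGTTAACCGT",
    "GGCATTAC": "TTAACCGGTA",
}

adaptors = build_adaptor_set(AMPLIFICATION, TAG_TO_ANCHOR)
print(len(adaptors))      # 256 = 4 tags x 64 UMI multipliers
print(len(adaptors[0]))   # 46 = 25 + 8 + 3 + 10
```

Under the same pairing, a pool of 240 sample tags would yield 240 × 64 = 15,360 distinct adaptor sequences.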
Particular embodiments of the present invention are drawn to a method
where the step of attaching the genomic DNA fragments to a plurality of adaptors
comprises: (i) attaching an oligonucleotide comprising at least a portion of an anchor region to
each genomic DNA fragment, wherein the oligonucleotide comprising at least a portion of an
anchor region is a DNA duplex comprising a 5’ phosphorylated attachment strand duplexed
with a partner strand, wherein the partner strand is blocked from attachment by chemical
modification at its 3’ end, and wherein the attachment strand is attached to the genomic DNA
fragment; (ii) contacting the genomic DNA fragments attached to the oligonucleotides
comprising at least a portion of the anchor region with DNA oligonucleotides encoding full
length adaptor sequences for each adaptor nucleotide sequence of the plurality of adaptors;
and (iii) contacting the genomic DNA fragments and the DNA oligonucleotides encoding the
full length adaptor sequence with T4 polynucleotide kinase, Taq DNA ligase and full-length
Bst polymerase under conditions suitable for DNA ligation, thereby attaching the plurality of
adaptors to the genomic DNA fragments. In some embodiments, the genomic DNA
fragments are cfDNA. In certain embodiments, the DNA target region is analyzed for a
change in copy number.
In particular embodiments, step (c), performing a quantitative genetic
analysis of the genomic DNA fragments comprising the DNA target region, comprises
purification of the complexes formed between the capture probes and DNA library fragments
comprising the DNA target region. In certain embodiments, step (c) comprises purification of
the complexes formed between the capture probes and DNA library fragments comprising the
DNA target region, and performing primer extension and/or amplification of the DNA library
fragments comprising the region of interest from the genomic DNA library. In some
embodiments, step (c) comprises purification of the complexes formed between the capture
probes and DNA library fragments comprising the DNA target region, and performing primer
extension and amplification of the DNA library fragments comprising the region of interest
from the genomic DNA library. In certain embodiments, step (c) comprises DNA sequencing
of the DNA library fragments comprising the DNA target region to generate a plurality of
sequencing reads.
In some embodiments, the present invention is drawn to a method
wherein the genomic analysis comprises determining a change of copy number in a DNA
region of interest, and wherein step (c), performing a quantitative genetic analysis of the
genomic DNA fragments comprising the DNA target region, comprises determining a copy
number of the region of interest present in the genomic DNA library derived from the test
sample, and comparing it to a copy number of the region of interest present in the genomic
DNA library derived from a reference sample, wherein the reference sample comprises a
known copy number of the DNA target region.
In some embodiments, determining the copy number in the region of
interest comprises DNA sequencing of the DNA library fragments comprising the DNA
target region to generate a plurality of sequencing reads, wherein each sequencing read
comprises a unique molecular identification element (UMIE). In some embodiments, the
UMIE comprises sequencing information from the adaptor and at least a portion of the
genomic DNA sequence. In some embodiments, sequencing reads comprising identical
UMIEs are identified as a unique genomic sequence (UGS).
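The collapse of reads with identical UMIEs into unique genomic sequences can be sketched in Python as below. The read representation (probe identifier, adaptor-derived sequence, genomic sequence), the 20-nucleotide genomic prefix, and the function name are assumptions made for illustration; the specification states only that the UMIE combines adaptor information with at least a portion of the genomic sequence.

```python
from collections import defaultdict

def count_unique_genomic_sequences(reads, genomic_prefix_len=20):
    """Collapse reads sharing a UMIE and count UGSs per capture probe.

    reads: iterable of (probe_id, adaptor_seq, genomic_seq) tuples.
    The UMIE is modelled here as the adaptor-derived sequence plus the
    first genomic_prefix_len bases of the genomic insert (an assumption).
    """
    umies_by_probe = defaultdict(set)
    for probe_id, adaptor_seq, genomic_seq in reads:
        umie = (adaptor_seq, genomic_seq[:genomic_prefix_len])
        umies_by_probe[probe_id].add(umie)
    return {probe: len(umies) for probe, umies in umies_by_probe.items()}

# Hypothetical reads: the first two are duplicates of one library fragment.
reads = [
    ("probe_1", "ACGTACGTAAA", "GATTACAGATTACAGATTACAGGG"),
    ("probe_1", "ACGTACGTAAA", "GATTACAGATTACAGATTACAGGG"),
    ("probe_1", "ACGTACGTCCC", "GATTACAGATTACAGATTACAGGG"),
    ("probe_2", "ACGTACGTAAA", "TTTTCCCCGGGGAAAATTTTCCCC"),
]

print(count_unique_genomic_sequences(reads))  # {'probe_1': 2, 'probe_2': 1}
```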
In some embodiments, methods of determining the copy number
further comprise determining a raw genomic depth (RGD) for each of the capture probes
contacted with the genomic DNA library. In some embodiments, determining the RGD
comprises determining the average number of UGSs associated with each capture probe
sequence within a group of sample replicates. In some embodiments, capture probes
associated with a highly variable number of UGSs are identified as noisy probes and are
removed from further calculations. In some embodiments, determining the RGD further
comprises calculating an RGD for a sample, comprising calculating a numerical average of
all RGDs for all capture probes in the sample. In some embodiments, the RGD values for
noisy probes are not included in calculating an RGD for a sample.
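A minimal Python sketch of these RGD calculations follows. The specification does not state how "highly variable" is measured, so the coefficient-of-variation threshold (`max_cv`), the function names, and the example counts are assumptions introduced only for illustration.

```python
from statistics import mean, pstdev

def probe_rgds(ugs_counts):
    """Per-probe RGD: average UGS count across sample replicates.
    ugs_counts maps probe_id -> list of UGS counts, one per replicate."""
    return {probe: mean(counts) for probe, counts in ugs_counts.items()}

def noisy_probes(ugs_counts, max_cv=0.25):
    """Flag probes whose UGS counts vary strongly between replicates.
    Using the coefficient of variation with a 0.25 cutoff is an assumption."""
    noisy = set()
    for probe, counts in ugs_counts.items():
        average = mean(counts)
        if average == 0 or pstdev(counts) / average > max_cv:
            noisy.add(probe)
    return noisy

def sample_rgd(rgds, noisy):
    """Sample-level RGD: average of per-probe RGDs, excluding noisy probes."""
    return mean(value for probe, value in rgds.items() if probe not in noisy)

# Hypothetical UGS counts for three probes across three replicates.
counts = {
    "probe_1": [100, 110, 90],
    "probe_2": [200, 210, 190],
    "probe_3": [10, 300, 50],   # highly variable
}

rgds = probe_rgds(counts)
noisy = noisy_probes(counts)
print(noisy)                    # {'probe_3'}
print(sample_rgd(rgds, noisy))  # 150
```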
In some embodiments, the RGDs for the capture probes are normalized
across all samples in an experimental group by converting the RGD for each capture probe
into a probe-specific, normalized read count comprising (i) multiplying each capture probe
RGD in a sample by a normalization constant, wherein the normalization constant comprises
any real number, and (ii) dividing the product of (i) by the RGD calculated for the
corresponding sample; or (iii) dividing the product of (i) by an average RGD calculated from
a subset of probes. In some embodiments, the subset of probes is a set of control probes.
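The normalization can be written as normalized count = (probe RGD × constant) / denominator, where the denominator is either the sample RGD or the average RGD of a probe subset such as control probes. The Python sketch below illustrates both options; the function names, the constant of 1,500, and the example values are hypothetical.

```python
from statistics import mean

def normalize_read_counts(rgds, denominator, constant=1000.0):
    """Probe-specific normalized read count:
    (probe RGD * normalization constant) / denominator."""
    return {probe: value * constant / denominator
            for probe, value in rgds.items()}

def control_probe_average(rgds, control_probes):
    """Average RGD over a chosen subset of probes (e.g. control probes)."""
    return mean(rgds[probe] for probe in control_probes)

# Hypothetical per-probe RGDs for one sample.
sample_rgds = {"probe_1": 100, "probe_2": 200, "control_1": 150, "control_2": 250}

# Option (ii): divide by the sample-level RGD (here the mean of all probes, 175).
by_sample = normalize_read_counts(sample_rgds, mean(sample_rgds.values()), constant=1500)

# Option (iii): divide by the average RGD of the control-probe subset (200).
by_controls = normalize_read_counts(
    sample_rgds, control_probe_average(sample_rgds, ["control_1", "control_2"]),
    constant=1500,
)
print(by_controls["probe_1"])  # 750.0
```

Because every probe in a sample is scaled by the same factor, the ratios between probes within a sample are unchanged; only the comparison between samples of different sequencing depth is affected.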
In some embodiments, the probe-specific, normalized read counts are
converted into a copy number value comprising (i) multiplying the probe-specific,
normalized read counts of probes directed to autosomal and/or X-linked regions by 2 in
samples derived from females; (ii) multiplying the probe-specific, normalized read counts of
probes directed to Y-linked and/or X-linked regions by 1 in samples derived from males; (iii)
averaging the products of (i) and/or (ii) across all samples in an experiment; and (iv) dividing
the product of (i) and/or (ii) by the average of (iii). In some embodiments, the approximate
copy number values for all probes that target a specific gene are averaged.
In some embodiments, the present invention is drawn to a method for
highly sensitive detection of copy number gain and copy number loss comprising (i)
determining an RGD for a capture probe; (ii) normalizing the RGD for the capture probe
across all samples in an experimental group by converting the RGD for the capture probe into
a probe-specific, normalized read count; (iii) calculating an approximate copy number value
for each probe-specific, normalized read count; and (iv) averaging the approximate copy
number values for all probes that target a specific gene.
In some embodiments, the present invention is drawn to a method for
measuring chromosome stability comprising (i) designing and validating a set of one or more
chromosomal stability probes, wherein the chromosomal stability probes are uniformly
distributed across human chromosomes; (ii) performing targeted sequencing on patient
samples using the one or more chromosomal stability probes; (iii) determining an
approximate copy number value for each chromosomal probe; and (iv) determining a genomic
phenotype of a patient sample, wherein fluctuations in the copy number values for one or
more chromosomal probes in the patient sample indicate genomic instability.
In some embodiments, the present invention is drawn to a method of
treating a cancer in a subject in need thereof, wherein the subject has been identified as
having a destabilized genome according to the method of claim 62, wherein the method of
treating the cancer comprises administering a pharmaceutically effective amount of a PARP
inhibitor.
In some embodiments, the present invention is drawn to a method
wherein the genomic analysis comprises determining a change of copy number in a DNA
region of interest, and wherein step (c), performing a quantitative genetic analysis of the
genomic DNA fragments comprising the DNA target region, comprises determining a copy
number of the region of interest present in the genomic DNA library derived from the test
sample, and comparing it to a copy number of the region of interest present in the genomic
DNA library derived from a reference sample, wherein the reference sample comprises a
known copy number of the DNA target region. In some embodiments, the region of interest is
a gene or a portion of the gene. In particular embodiments, the gene is associated with a
disease. In certain embodiments, the disease is a cancer. In various embodiments, the gene is
BRCA2, ATM, BRCA1, BRIP1, CHEK2, FANCA, HDAC2, and/or PALB2.
Particular embodiments are drawn to a genomic DNA library
comprising a plurality of DNA library fragments, wherein each of the DNA library fragments
comprises an adaptor and a genomic DNA fragment; wherein the adaptor is a DNA
polynucleotide that comprises: an amplification region, a sample tag region, and an anchor
region, wherein the amplification region comprises a polynucleotide sequence capable of
serving as a primer recognition site for PCR amplification, wherein the sample tag comprises
a polynucleotide sequence that encodes an identity of the unique library DNA fragment and
encodes an identity of the test sample, and wherein the anchor region comprises a
polynucleotide sequence that encodes the identity of the test sample, and wherein the anchor
region is capable of attaching to the genomic DNA fragment. In some embodiments, the
sample tag further comprises a unique molecule identifier (UMI), wherein the UMI facilitates
the identification of the unique genomic DNA fragment. In particular embodiments, the
amplification region is between 10 and 50 nucleotides in length. In particular embodiments,
the amplification region is 25 nucleotides in length. In particular embodiments, the sample
tag is between 5 and 50 nucleotides in length. In certain embodiments, the sample tag is 8
nucleotides in length. In some embodiments, the UMI multiplier is adjacent to or contained
within the sample tag region. In particular embodiments, the UMI multiplier is between 1 and
5 nucleotides in length. In certain embodiments, the anchor region is between 1 and 50
nucleotides in length. In some embodiments, the anchor region is 10 nucleotides in length. In
particular embodiments, the amplification region of each adaptor of the plurality of adaptors
comprises an identical nucleotide sequence. In some embodiments, each nucleotide sequence
of the sample tags is discrete from any other sequence of the nucleotide sequences of the
sample tags by a Hamming distance of at least two. In certain embodiments, each of the plurality of
adaptors comprises a UMI multiplier that is adjacent to or contained within the sample tag
region. In particular embodiments, each of the plurality of adaptors comprises a UMI
multiplier that is adjacent to the sample tag region. In some embodiments, the anchor tag
region of each adaptor of the plurality of adaptors comprises one of four nucleotide
sequences, and wherein each sample region of a given sequence is paired to only one of the
four anchor regions of a given sequence. In some embodiments, the genomic DNA fragment
is cfDNA.
In certain embodiments, the amplification region of each adaptor of
the plurality of adaptors comprises an identical nucleotide sequence; the sample tag region of
each adaptor of the plurality of adaptors is 8 nucleotides in length; the sample tag region of
each adaptor of the plurality of adaptors comprises a nucleotide sequence that is discrete from
any other nucleotide sequence of the sample tags of the plurality of adaptors by a Hamming
distance of at least two; each of the plurality of adaptors comprises a UMI multiplier that
is adjacent to or contained within the sample tag region; the UMI multiplier of each adaptor
of the plurality of adaptors is three nucleotides in length; the UMI multiplier of each of
the possible nucleotide sequences is paired to each of the sample tag regions of the plurality
of adaptors; the anchor tag region of each adaptor of the plurality of adaptors comprises one
of four nucleotide sequences; and each sample region of a given sequence is paired to only
one of the four anchor regions of a given sequence. In some embodiments, the genomic DNA
fragment is cfDNA.
Certain embodiments are drawn to a plurality of genomic DNA
libraries, comprising more than one genomic library described herein. In some embodiments,
the nucleic acid sequences of the sample tag regions of a genomic DNA library belonging to
the plurality of genomic DNA libraries are different from the nucleic acid sequences of the
sample tag regions of other genomic DNA libraries belonging to the plurality of genomic
DNA libraries. In particular embodiments, the nucleic acid sequences of the amplification
regions of a genomic DNA library belonging to the plurality of genomic DNA libraries are
identical to the nucleic acid sequences of the amplification regions of other genomic DNA
libraries belonging to the plurality of genomic DNA libraries.
Certain embodiments are drawn to a method for genetic analysis of a
DNA target region of cell free DNA (cfDNA) comprising: (a) generating a DNA library as
described herein; (b) contacting the cfDNA library with a plurality of capture probes that
specifically bind to a DNA target region, thereby forming complexes between the capture
probes and DNA library fragments comprising the DNA target region; and (c) performing a
quantitative genetic analysis of the cfDNA fragments comprising the DNA target region,
thereby performing genetic analysis of the DNA target region.
Certain embodiments are directed to a method of predicting,
diagnosing, or monitoring a genetic disease in a subject comprising: (a) obtaining a test
sample from the subject; (b) isolating genomic DNA from the test sample; (c) generating a
DNA library comprising a plurality of DNA library fragments, wherein each of the DNA
library fragments comprises a genomic DNA fragment from the test sample and an adaptor;
(d) contacting the cfDNA library with a plurality of capture probes that specifically bind to a
DNA target region, thereby forming complexes between the capture probes and DNA library
fragments comprising the DNA target region; and (e) performing a quantitative genetic
analysis of one or more target genetic loci associated with the genetic disease in the cfDNA
clone library, wherein the identification or detection of one or more genetic lesions in the one
or more target genetic loci is prognostic for, diagnostic of, or monitors the progression of the
genetic disease. In particular embodiments, the quantitative genetic analysis comprises DNA
sequencing to generate a plurality of sequencing reads.
Particular embodiments are drawn to a set of adaptors that encode an
identity of a unique genomic DNA fragment and an identity of a test sample, for use in
generating a genomic DNA library, wherein each adaptor in said set of adaptors is a DNA
polynucleotide that comprises: an amplification region, a sample tag region, and an anchor
region, wherein the amplification region comprises a polynucleotide sequence capable of
serving as a primer recognition site for PCR amplification, wherein the sample tag comprises
a polynucleotide sequence that encodes the identity of the unique library DNA fragment and
encodes the identity of the test sample; and wherein the anchor region comprises a
polynucleotide sequence that encodes the identity of the test sample, and wherein the anchor
region is capable of attaching to the genomic DNA fragment. In some embodiments, the
sample tag further comprises a unique molecule identifier (UMI), wherein the UMI facilitates
the identification of the unique genomic DNA fragment. In various embodiments, the
amplification region is between 10 and 50 nucleotides in length. In certain embodiments, the
amplification region is 25 nucleotides in length. In particular embodiments, the sample tag is
between 5 and 50 nucleotides in length. In some embodiments, the sample tag is 8
nucleotides in length. In particular embodiments, the UMI multiplier is adjacent to or
contained within the sample tag region. In some embodiments, the UMI multiplier is between
1 and 5 nucleotides in length. In particular embodiments, the anchor region is between 1 and
50 nucleotides in length. In some embodiments, the anchor region is 10 nucleotides in length.
In certain embodiments, the amplification region of each adaptor of the plurality of adaptors
comprises an identical nucleotide sequence.
In some embodiments, each nucleotide sequence of the sample tags is
discrete from any other nucleotide sequence of the sample tags of the set of adaptors by a
Hamming distance of at least two. In various embodiments, each of the plurality of adaptors
comprises a UMI multiplier that is adjacent to or contained within the sample tag region. In
particular embodiments, each of the plurality of adaptors comprises a UMI multiplier that is
adjacent to the sample tag region.
In some embodiments, the anchor tag region of each adaptor of the
plurality of adaptors comprises one of four nucleotide sequences, and wherein each sample
region of a given sequence is paired to only one of the four anchor regions of a given
sequence. The set of adaptors of claim 75, wherein the amplification regions of each adaptor of
the plurality of adaptors comprises an identical nucleotide sequence; wherein the sample tag
region of each adaptor is 8 nucleotides in length, wherein each nucleotide sequence of the
sample tags is discrete from any other nucleotide sequence of the sample tags of the set of
adaptors by a Hamming distance of at least two, wherein each of the plurality of adaptors
comprises a UMI multiplier that is adjacent to or contained within the sample tag region,
wherein the UMI multiplier of each adaptor of the plurality of adaptors is three nucleotides in
length, wherein the UMI multiplier comprises one of 64 possible nucleotide sequences, and
wherein the UMI multiplier of each of the 64 possible nucleotide sequences is paired to each
of the sample tag region of the plurality of adaptors, wherein the anchor tag region of each
adaptor of the plurality of adaptors comprises one of four nucleotide sequences, and wherein
each sample region of a given sequence is paired to only one of the four anchor regions of a
given sequence.
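By way of a non-limiting illustration of the tag arithmetic recited above, the following
Python sketch enumerates every three-nucleotide UMI multiplier and pairs each one with each
sample tag. The two 8-nucleotide sample tags and the helper names are hypothetical
placeholders chosen for this example, and placing the multiplier directly before the tag is
only one of the adjacent or contained arrangements described.

```python
from itertools import product

BASES = "ACGT"

def umi_multipliers(length=3):
    """Enumerate every possible UMI multiplier of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]

def pair_tags(sample_tags, multipliers):
    """Pair each UMI multiplier with each sample tag (multiplier placed adjacent to the tag)."""
    return [m + tag for tag in sample_tags for m in multipliers]

multipliers = umi_multipliers(3)

# Hypothetical 8-nt sample tags, for illustration only.
example_tags = ["ACGTACGT", "TTGACCAG"]
combined = pair_tags(example_tags, multipliers)

print(len(multipliers))  # number of distinct 3-nt multipliers
print(len(combined))     # multipliers x sample tags
```

A three-nucleotide multiplier yields 4 x 4 x 4 = 64 sequences, so each sample tag is expanded
64-fold. With the 240 sample tags of the illustrative adaptor set (four sets of 60), the same
pairing gives 240 x 64 = 15,360 combinations.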
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
shows the framework of the copy number loss (CNL) assay.
Each gene (rows) exhibits a characteristic unique read value that is represented here by a
shade. Each sample (columns) is interrogated across the same panel of genes.
shows a diagram illustrating the drivers of the CNL assay
signal.
FIG. 3 shows a diagram illustrating steps of an illustrative CNL assay
performed on cell free DNA (cfDNA).
- 4B shows diagrams of an illustrative first generation adaptor
( and 4B) and an adaptor of the present invention (FIGs. 4C-4E). shows the
first generation adaptor . shows that in the first generation adaptors, there
were a collection of 249 possible sequence tags, each 5 nucleotides (nt) in length that
attached to a single anchor sequence. shows a diagram of a second generation
adaptor. shows an illustrative set of adaptors that are assigned to a single sample that
consists of four sets of 8mer tag sequences with each set having 60 members. Each set of 60
tags is specific to one of four anchor sequences. shows an illustrative DNA sequence
of a 47 nt adaptor.
- shows a diagram illustrating that changing the
position of the UMI multiplier within the sample tag can increase the number of unique
sample tags.
and B shows a diagram illustrating the process of constructing
genomic libraries for a CNL assay. shows the step where the 10 nt anchor sequence
is attached to the 3’ ends of genomic fragments. shows the step where the full length
genomic adaptors are annealed to the initial anchor sequence.
shows DNA inputs into CNL libraries. Agarose gel images are
shown with the sizes of markers (bp) indicated at left.
— shows conventional box-and-whiskers plots of
measured gene copies across eight samples as determined by CNL analysis.
- shows Log10 P-value plots that quantify significant
deviation-from-normal in CNL measurements for fragmented genomic samples. The SNP
percentages at the top show the minor allele frequencies of rare, heterozygous SNPs that are
present in the ΔATM and ΔBRCA2 samples.
A - B shows Log10 P-value plots that quantify
significant deviation-from-normal in CNL measurements for cfDNA samples spiked with
fragmented genomic DNA. The SNP percentages at the top show the minor allele frequencies
of rare, heterozygous SNPs that are present in the ΔATM and ΔBRCA2 samples.
A - 11D illustrate the targeted hybrid capture platform. A shows conversion of cfDNA to a genomic library by the addition of adaptor sequences
that provide universal, single-primer PCR amplification sequences, sample multiplexing tags,
and unique molecular identifiers to every genomic clone. B shows denatured
amplified c hybridized with target specific capture probes and primer extension. C shows a schematic of asymmetric paired-end sequencing. D shows mapping
statistics for 377,711,020 Illumina NextSeq reads from a typical targeted capture sequence
run. 98.5% of reads map to their intended targets. Following de-duplication, 20.40% of reads
(77,053,048) are derived from unique genomic clones.
A — H shows sequences of adaptor oligonucleotides
from Pools 1 — 3.
A — H shows sequences of adaptor oligonucleotides
from Pools 4 — 6.
A - 1 shows sequences of adaptor oligonucleotides
from Pools 7 — 9.
A — H shows sequences of adaptor oligonucleotides
from Pools 10 — 12.
A — H shows sequences of adaptor oligonucleotides
from Pools 13 - 15.
A — H shows sequences of adaptor oligonucleotides
from Pools 16 - 18.
A — H shows sequences of adaptor oligonucleotides
from Pools 19 - 21.
A — H shows sequences of adaptor oligonucleotides
from Pools 22 — 24.
A — H shows sequences of adaptor oligonucleotides
from Pools 25 — 27.
A - H shows sequences of adaptor oligonucleotides
from Pools 28 — 30.
A — H shows sequences of adaptor oligonucleotides
from Pools 31 — 32.
A - 23C shows targeted sequencing of the TP53 gene. A illustrates BedFile display of capture probes. B illustrates coverage depth at each
base position on a scale of 0 to 8000 unique reads. C illustrates a UCSC gene model
display of known TP53 splice variants. The thicker rectangular regions represent the amino
acid coding regions for the TP53-encoded protein.
A — 24C illustrate raw and normalized unique read density for
a single probe, TP53r10_1, across 16 samples. A illustrates the number of raw unique
reads captured by probe TP53r10_1 for 16 independent samples after removal of redundant
reads by “de-duplication.” B shows the global average of unique reads across 2596
capture probes for all 16 samples. C shows normalized unique read depth across 16
samples (calculated as: [sample n unique reads from probe TP53r10_1 × constant ÷ global
average unique reads/probe from sample n]).
shows general consistency of the normalized unique read
counts for all 16 samples within any given TP53 probe despite significant average depth
variation between probes. The normalized unique read counts for all 16 samples are shown as
“pillars” of tightly spaced bar graphs; the results for all 45 probes that target TP53 are shown.
Two probes that exhibit “noisy” counting behavior are highlighted with . Counts from
such probes often appear as outliers in subsequent copy number analysis.
illustrates sample-to-sample consistency of normalized probe-
by-probe unique read counts across a broad panel of 2596 probes. The scatter plots from three
representative samples are shown. Each dot represents a different probe. The x-axis is the
normalized average unique read depth per probe across 16 samples. The y-axis is the
normalized unique read depth per probe for three different individual samples. The consistent
probe-by-probe unique read counts support quantitative analysis of chromosomal copy
variation.
A - 27C illustrate copy number analysis of cfDNA from a
healthy female and male donor and from an advanced stage prostate cancer patient. A
shows analysis of cfDNA from a healthy female donor. The x-axis is a series of control
probes that target regions from all 22 autosomal chromosomes, a series of probes that target
the X-linked AR gene, and a series of probes that target the coding regions of the TP53 gene.
The y-axis shows the calculated ploidy for each probe. This approximation is calculated for
each probe by normalizing the observed unique read counts to a series of control samples
whose ploidy is known ([unique read count for probe Y of sample Z] × 2 ÷ [average unique
read count for probe Y for multiple control samples]). B illustrates that the X-linked
AR gene exhibits a haploid copy number in healthy males. C illustrates copy number
analysis of cfDNA from an advanced prostate cancer patient and shows evidence of very
significant aneuploidy across the control probes, amplification of the AR gene, and loss of the
TP53 gene.
shows whole genome aneuploidy analysis of a prostate patient
cfDNA library relative to a control sample. The approximate ploidy for each of 239 control
probes is shown sorted by chromosome. Patient chromosome 2 probes show consistent copy
loss and the majority of chromosome 5 probes show copy gain. Significant deviation of
approximate ploidy are seen for many, but not all, of the t control probes.
shows analytical validation of copy number loss detection.
Genomic DNA from immortalized line NA02718 (monoallelic ΔATM) and from NA09596
(monoallelic ) were spiked into the “gold standard” genomic DNA from NA12878
at 16%, resulting in the equivalent of an 8% biallelic deletion minor allele frequency.
Following targeted sequencing and CNV analysis, the probe-by-probe values were averaged
for the two target genes. Two unperturbed control genes, BRIP1 and HDAC2, are shown for
comparison.
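The two bracketed calculations given in the figure descriptions above (normalized unique read
depth, and approximate ploidy) can be sketched in Python as follows. This is a minimal
illustration that reads the operator in both bracketed expressions as a division, which is an
assumption based on their use as normalizations. The function names, the default constant of
1000, and the example counts are hypothetical and are not taken from the disclosure.

```python
def normalized_depth(probe_reads, sample_mean_reads_per_probe, constant=1000.0):
    """Normalized unique read depth for one probe in one sample.

    probe_reads x constant / (sample-wide average unique reads per probe).
    The constant is an arbitrary scaling factor (assumed value).
    """
    return probe_reads * constant / sample_mean_reads_per_probe

def approximate_ploidy(sample_count, control_counts, control_ploidy=2):
    """Approximate ploidy for one probe.

    sample count x control ploidy / (mean count for the same probe in control samples).
    Counts are expected to be normalized before being compared.
    """
    control_mean = sum(control_counts) / len(control_counts)
    return sample_count * control_ploidy / control_mean

# Hypothetical numbers: a probe with 500 unique reads in a sample that averages
# 2000 unique reads per probe.
depth = normalized_depth(500, 2000)

# Hypothetical X-linked probe in a male sample, compared with diploid controls.
ploidy = approximate_ploidy(50, [100, 100, 100])
print(depth, ploidy)
```

Dividing by the sample-wide average removes differences in overall sequencing depth between
samples, so that the remaining probe-to-probe differences can be compared against controls of
known ploidy.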
DETAILED DESCRIPTION
A. OVERVIEW
The present invention includes, inter alia, compositions and methods
that are useful for the detection of a mutational change, SNP, translocation, inversion,
deletion, change in copy number or other genetic variation within a sample of cellular
genomic DNA (e.g., from a tissue biopsy sample) or cfDNA (e.g., from a blood sample). The
compositions and methods of the current invention are particularly useful in detecting
difficult-to-detect copy number variations in cfDNA from a biological sample (e.g.,
blood) with exquisite resolution. In particular, some embodiments of the present invention are
drawn to a method for detecting the copy number of a DNA target region from a test sample
by generating a genomic DNA library made up of genomic DNA fragments attached to an
adaptor, capturing DNA target regions with a plurality of capture probes, isolating the DNA
library fragments comprising the DNA target region, and performing a quantitative genetic
analysis of the DNA target region, thereby determining the copy number of the DNA target
region. The adaptors described herein allow for the identification of the individual DNA
fragment that is being sequenced, as well as the identity of the sample or source of the
genomic DNA.
The present invention contemplates, in part, compositions and methods
for detection of target-specific copy number changes that are applicable to several sample
types, including but not limited to direct tissue biopsies and peripheral blood. In the context
of cancer genomics, and in particular cell free DNA (cfDNA) assays for the analysis of solid
tumors, the amount of tumor DNA is often a very small fraction of the overall DNA. Further,
copy number loss is difficult to detect in genomic DNA assays, and in particular, genomic
DNA assays where copy number change may only be present in a portion of the total
genomic DNA from a sample, e.g., cfDNA. For example, most of the cell-free DNA
extracted from a cancer patient will be derived from normal sources and have a diploid copy
number (except for X-linked genes in male subjects). In a cancer patient, the fraction of DNA
derived from tumors often has a low minor allele frequency, such as, for example, a patient in
which 2% of the circulating DNA extracted from plasma is derived from the tumor. The loss
of one copy of a tumor suppressor gene (for example, BRCA1 in breast cancer) means that
the minor allele frequency for the absence of detectable genomic fragments is 1%. In this
scenario, an engineered copy number loss assay must be able to discriminate between 100
copies (normal) and 99 copies (heterozygous gene loss). Thus, particular embodiments
contemplate that the methods and compositions of the present invention allow for the
detection of copy number change with sufficient resolution to detect changes in copy number
at such minor allele frequencies even in the context of cfDNA.
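The arithmetic behind the 100-versus-99 comparison can be checked with a short Python sketch.
The function name and parameters are hypothetical. The model simply mixes a tumor-derived
fraction carrying one copy of the gene with a normal fraction carrying two copies, as in the
example above.

```python
def expected_copy_ratio(tumor_fraction, tumor_copies, normal_copies=2):
    """Expected fragment count at a locus relative to an all-normal sample.

    tumor_fraction: fraction of the DNA derived from the tumor (0 to 1).
    tumor_copies:   copies of the locus per tumor genome.
    normal_copies:  copies of the locus per normal genome (diploid by default).
    """
    mixed = (1 - tumor_fraction) * normal_copies + tumor_fraction * tumor_copies
    return mixed / normal_copies

# 2% tumor-derived DNA with heterozygous loss (one copy remaining in the tumor).
ratio = expected_copy_ratio(tumor_fraction=0.02, tumor_copies=1)
print(round(ratio * 100, 2))  # fragments observed per 100 expected from a normal sample
```

Because only one of the two tumor copies is lost, the 2% tumor fraction translates into a
signal of about 1%, i.e., roughly 99 fragments where 100 would be expected.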
To achieve this level of discrimination, the present invention provides
novel sample adaptor designs. The adaptors of the present invention are designed to include
features that are critical for successful copy number loss assay performance including (i) even
performance across adaptors; (ii) a high number of unique molecule identifiers (UMIs); (iii)
high efficiency attachment; and (iv) accommodation of sample multiplexing. For example,
the adaptors of the present invention provide the following:
Even performance across adaptors: Bioinformatics analysis often
looks at intra-sample probe performance and inter-sample probe performance. Thus, it is
contemplated that any performance fluctuation between adaptor pools across samples will
negatively impact the ability to detect the subtle variations measured by CNL analysis. In the
present invention, this evenness of performance is achieved by having multiple anchor tags
that are all represented in each sample tag pool, with the fixed sample tag sequences (which
serve to identify both the sample and the genomic fragments) being randomly selected for
each pool, and a UMI multiplier that increases the unique sample tag sequences for
identifying the genomic fragments.
High number of Unique Molecule Identifiers (UMIs): While adaptors
must be functionally equivalent from a molecular biology perspective, they must possess a
very large number of unique sequence tags (≥ 10,000) that augment the identification of
unique genomic fragments. In this context, by “augment,” it is meant that each genomic clone
fragment has a particular pair of fragmentation sites corresponding to the position in the
genomic sequence where the double-strand DNA was cleaved. This cleavage site is used to
differentiate unique genomic clones since each clone is likely to possess a different cleavage
site. However, in libraries that possess thousands of independent clones, uniquely derived
fragments will often possess the exact same cleavage sites. Genomic clones (i.e., fragments)
sharing the same cleavage site can be classified as either unique or as redundant with respect
to other clone sequences derived from the same sample. By attaching adaptors that introduce
a high diversity of sequence tags, different genomic clones sharing the same cleavage site are
more likely to be identified as unique. In this system, the UMI is created by a combination of
the sample tag region with the UMI multiplier. The combination of the UMI and the cleavage
site creates a unique molecular identifier element (UMIE), which facilitates the classification
of sequence reads as redundant reads or unique reads. Particular embodiments contemplate
that the UMI multiplier could comprise longer or shorter sequences to increase or lower the
overall UMI complexity.
High efficiency attachment: Adaptors must attach to genomic
fragments with high efficiency. In most oncology applications, the quantities of available
cellular DNA or cfDNA are limited and therefore conversion of these genomic fragments to
genomic library clones must be highly efficient. In order to achieve this, in some aspects of
the present invention, the adaptor systems described herein convert about 25% to about 50%
or more of the genomic input fragments into genomic library clones.
Accommodation of sample multiplexing: In general, there must be
pools of different sets of adaptors where each unique adaptor of the set is assigned to a
different sample. At the same time, each member of the set of adaptors must possess
essentially identical behavior (from a sequence counting perspective) to all other members in
a set. In order to achieve this, in some embodiments, the sample tag regions have a Hamming
distance of 2 from any other possible sample tag combination, reducing the chance for a
read to be spuriously assigned to the wrong sample. In some embodiments, each set of
adaptors is split into pools that are paired with specific anchor sequences, allowing for further
reduction in the possibility of an error in sample de-multiplexing. For example, in an 8mer
tag with Hamming distance of 2, the total number of possible sequences is 16,384.
In a particular embodiment, pre-specified pools of adaptor
oligonucleotides are provided. Such pre-specified pools are used to represent a single sample.
That is, each adapter sequence in each pool of X adapter oligonucleotides (16,384 in the
example given above) is distinct from each adapter sequence in every other pool used to
identify other samples. One of skill in the art will recognize the number of distinct pre-
specified pools that are possible for the adapter oligonucleotides will depend on the length of
the sample tag and/or the UMI multiplier.
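The figure of 16,384 sequences for an 8mer tag set with a Hamming distance of 2 equals 4 to
the 7th power. One standard way to reach that number is a parity-check construction, sketched
below in Python: the first seven bases are free and the eighth is chosen so that the base
indices sum to zero modulo 4. This construction is an assumption made for illustration and is
not stated to be how the disclosed tags were designed. The function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

def parity_tags(length=8):
    """Build all tags of `length` whose base indices (A=0, C=1, G=2, T=3) sum to 0 mod 4."""
    tags = []
    for prefix in product(range(4), repeat=length - 1):
        check = (-sum(prefix)) % 4  # final base fixes the parity
        tags.append("".join(BASES[i] for i in prefix + (check,)))
    return tags

def min_distance_at_least_two(tags):
    """Return True if no two tags differ at exactly one position."""
    tag_set = set(tags)
    for tag in tags:
        for pos, base in enumerate(tag):
            for alt in BASES:
                if alt != base and tag[:pos] + alt + tag[pos + 1:] in tag_set:
                    return False
    return True

tags = parity_tags(8)
print(len(tags))
print(min_distance_at_least_two(tags))
```

Any single-base substitution changes the parity, so a tag with one sequencing error cannot be
mistaken for another valid tag, which is the property that reduces spurious sample assignment.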
Thus, in certain embodiments the adaptors comprise a sequence, i.e.,
the sample tag and adjacent and/or encompassed UMI multiplier, that represents or identifies
both the sample and uniquely identifies the genetic fragment. This is in stark contrast to the
current systems that are used in the art that use a randomly generated tag to identify the
sequence and a separate barcode or sequencer indexing to allow for multiplexing.
An illustrative embodiment for detecting target-specific copy number
changes within DNA obtained from a sample is shown in While generates a
DNA library from cfDNA, this illustrative procedure could be used with DNA from other
sources, e.g., fragmented cellular DNA. As shown in cfDNA is collected (top panel).
Next, a genomic library is generated from cfDNA by ligating genomic library adaptors
(gray circles) of the present invention to the genomic DNA. Genomic DNA fragments are
hybridized with capture probes (black circles) that recognize the genomic region of interest.
The genomic DNA of interest is sequenced, and data analysis is performed for copy loss
analysis and/or characterization of the genomic DNA of interest.
The practice of particular embodiments of the invention will employ,
unless indicated specifically to the contrary, conventional methods of chemistry,
biochemistry, organic chemistry, molecular biology, microbiology, recombinant DNA
techniques, genetics, immunology, and cell biology that are within the skill of the art, many
of which are described below for the purpose of illustration. Such techniques are explained
fully in the literature. See, e.g., Sambrook, et al., Molecular Cloning: A Laboratory Manual
(3rd Edition, 2001); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd
Edition, 1989); Maniatis et al., Molecular Cloning: A Laboratory Manual (1982); Ausubel et
al., Current Protocols in Molecular Biology (John Wiley and Sons, updated July 2008); Short
Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in
Molecular Biology, Greene Pub. Associates and Interscience; Glover, DNA Cloning: A
Practical Approach, vol. I & II (IRL Press, Oxford, 1985); Anand, Techniques for the
Analysis of Complex Genomes, (Academic Press, New York, 1992); Transcription and
Translation (B. Hames & S. Higgins, Eds., 1984); Perbal, A Practical Guide to Molecular
Cloning (1984); and Harlow and Lane, Antibodies, (Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, 1998).
B. DEFINITIONS
Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as commonly understood by those of ordinary skill in the art to which
the invention belongs. Although any methods and materials similar or equivalent to those
described herein can be used in the practice or g of the present invention, preferred
embodiments of compositions, methods and materials are described herein. For the purposes
of the present invention, the following terms are defined below.
The articles “a,” “an,” and “the” are used herein to refer to one or to
more than one (i.e., to at least one) of the grammatical object of the article. By way of
example, “an element” means one element or more than one element.
The use of the alternative (e.g., “or”) should be understood to mean
either one, both, or any combination thereof of the alternatives.
The term “and/or” should be understood to mean either one, or both of
the alternatives.
As used herein, the term “about” or “approximately” refers to a
quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or
length that varies by as much as 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2% or 1% to a
reference quantity, level, value, number, frequency, percentage, dimension, size, amount,
weight or length. In one embodiment, the term “about” or “approximately” refers to a range of
quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or
length ± 15%, ± 10%, ± 9%, ± 8%, ± 7%, ± 6%, ± 5%, ± 4%, ± 3%, ± 2%, or ± 1% about a
reference quantity, level, value, number, frequency, percentage, dimension, size, amount,
weight or length.
Throughout this specification, unless the context requires otherwise,
the words “comprise”, “comprises,” and “comprising” will be understood to imply the
inclusion of a stated step or element or group of steps or elements but not the exclusion of
any other step or element or group of steps or elements. In particular embodiments, the terms
“include,” “has,” “contains,” and “comprise” are used synonymously.
By “consisting of” is meant including, and limited to, whatever follows
the phrase “consisting of.” Thus, the phrase “consisting of” indicates that the listed elements
are required or mandatory, and that no other elements may be present.
By “consisting essentially of” is meant including any elements listed
after the phrase, and limited to other elements that do not interfere with or contribute to the
activity or action specified in the disclosure for the listed elements. Thus, the phrase
“consisting essentially of” indicates that the listed elements are required or mandatory, but
that other elements are optional and may or may not be present depending upon whether
or not they affect the activity or action of the listed elements.
Reference throughout this specification to “one embodiment,” “an
embodiment,” “a particular embodiment,” “a related embodiment,” “a certain embodiment,”
“an additional embodiment,” or “a further embodiment” or combinations thereof means that a
particular feature, structure or characteristic described in connection with the embodiment is
included in at least one embodiment of the present invention. Thus, the appearances of the
foregoing phrases in various places throughout this specification are not necessarily all
referring to the same embodiment. Furthermore, the particular features, structures, or
characteristics may be combined in any suitable manner in one or more embodiments.
As used herein, the term “isolated” means material that is substantially
or essentially free from components that normally accompany it in its native state. In
particular embodiments, the term “obtained” or “derived” is used synonymously with
isolated.
As used herein, the term “DNA” refers to deoxyribonucleic acid. In
various embodiments, the term DNA refers to genomic DNA, recombinant DNA, synthetic
DNA, or cDNA. In one embodiment, DNA refers to genomic DNA or cDNA. In particular
embodiments, the DNA comprises a “target region.” DNA libraries contemplated herein
include genomic DNA libraries and cDNA libraries constructed from RNA, e.g., an RNA
expression library. In various embodiments, the DNA libraries comprise one or more
additional DNA sequences and/or tags.
The terms “target genetic locus” and “DNA target region” are used
interchangeably herein and refer to a region of interest within a DNA sequence. In various
embodiments, targeted genetic analyses are performed on the target genetic locus. In
particular embodiments, the DNA target region is a region of a gene that is associated with a
particular genetic state, genetic condition, or genetic disease; fetal testing; genetic mosaicism;
paternity testing; predicting response to drug treatment; diagnosing or monitoring a medical
condition; microbiome profiling; pathogen screening; or organ transplant monitoring. In
further embodiments, the DNA target region is a DNA sequence that is associated with a
particular human chromosome, such as a particular autosomal or X-linked chromosome, or
region thereof (e.g., a unique chromosome region).
As used herein, the terms “circulating DNA,” “circulating cell-free
DNA,” and “cell-free DNA” are often used interchangeably and refer to DNA that is
extracellular DNA, DNA that has been extruded from cells, or DNA that has been released
from necrotic or apoptotic cells. This term is often used in contrast to “cellular genomic
DNA” or “cellular DNA,” which are used interchangeably herein and refer to genomic DNA
that is contained within the cell (i.e., the nucleus) and is only accessible to molecular
biological techniques such as those described herein, by lysing or otherwise disrupting the
integrity of the cell.
A “subject,” “individual,” or “patient” as used herein, includes any
animal that exhibits a symptom of a condition that can be detected or identified with
compositions contemplated herein. Suitable subjects include laboratory animals (such as
mouse, rat, rabbit, or guinea pig), farm animals (such as horses, cows, sheep, pigs), and
domestic animals or pets (such as a cat or dog). In particular embodiments, the subject is a
mammal. In certain embodiments, the subject is a non-human primate and, in preferred
embodiments, the subject is a human.
As used herein, the term “paired” when used with respect to two
different polynucleotide sequences or regions of DNA comprising different polynucleotide
sequences, means that the two different polynucleotide sequences or regions of DNA
comprising different polynucleotide sequences are present on the same polynucleotide. For
example, if a particular sample tag region of DNA is said to be paired to a particular
amplification region of DNA, it is meant that the sample tag region and the amplification region
are present on the same DNA polynucleotide molecule.
C. METHODS OF COPY NUMBER ANALYSIS
In various embodiments, a method for copy number analysis of a DNA
target region is provided. In certain embodiments, copy number analysis is performed
by generating a genomic DNA library of DNA library fragments that each contain a genomic
DNA fragment and an adaptor, isolating the DNA library fragments containing the DNA
target regions, and performing a quantitative genetic analysis of the DNA target region. By
“quantitative genetic analysis” it is meant an analysis performed by any molecular biological
technique that is able to quantify changes in a DNA (e.g., a gene, genetic locus, target region
of interest, etc.) including but not limited to DNA mutations, SNPs, translocations, deletions,
and copy number variations (CNVs). In certain embodiments, the quantitative genetic
analysis is performed by sequencing, for example, next generation sequencing.
Next-generation DNA sequencing (NGS) is ideally suited for two
diagnostic applications. The first is the determination of DNA sequence on a vast scale. In the
present context, this capability enables the search for rare, actionable variants that guide
effective treatment decisions. The second is counting gene copy number. The output of
millions of independent sequences can enable precise measurement of gene copy number on
a genome-wide scale. The emergence of non-invasive prenatal testing for fetal trisomy from
maternal blood samples is a testament to this capability. RNAseq, that is, the technology of
gene expression profiling using NGS, is another example, albeit the input is RNA (cDNA)
rather than genomic DNA. Comparisons of target capture methods are described in
Samorodnitsky et al., J Mol Diagn. 2015 Jan;17(1):64-75.
The present invention extends NGS counting capability into the realm
of targeted hybrid capture methods. The methods described here are effective for the
detection of copy number variation at least in part because they possess the following four
qualities:
(a) The present methods differentiate between unique clones and redundant
clones. NGS sequencing of amplified genomic DNA library fragments results in a plurality of
individual NGS reads, each comprising adaptor-encoded sequence information linked to a
specific human genomic sequence. These elements define the identity of every clone.
Because captured genomic regions are amplified by PCR, it is not uncommon for the same
clone to be encountered several times in a subsequent NGS analysis. Groups of reads that are
derived from a single cloning and capture process are termed “redundant reads.” Two or
more reads are identified as redundant reads based on the sequencing information
provided by the unique molecular identification elements (UMIE). The UMIE refers to the
combination of the sequence information from the adaptor tags and the start of the genomic
DNA sequence. Two or more reads comprising identical UMIEs are identified as redundant
reads. Redundant reads are grouped together and a single, representative consensus sequence
is assembled from families of redundant reads. This consensus sequence is designated as a
“unique read” or a “unique genomic sequence” (UGS). Each unique read represents a
separate clone from the original DNA specimen. The process of identifying and grouping
redundant clone families and of generating a single unique read representative of this family
is defined as “deduplication.” The adaptors used to create genomic libraries possess a very
deep repertoire of unique sample tag information (15,360 codes per adaptor). When applied
in conjunction with the exact mapping coordinates of each captured genomic clone (which
can span >100 different positions relative to a capture probe), each unique clone that is
created in a genomic library and subsequently retrieved by a target-specific capture probe
has an extremely high likelihood of being differentiable from all other unique clones that
encompass the same capture environment. The ability to differentiate between unique clones
and redundant clones is central to the methods described herein.
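The deduplication step described in (a) can be sketched in Python as follows. This is a
minimal illustration rather than the disclosed pipeline: reads are represented as hypothetical
(adaptor tag, genomic start position, sequence) tuples, the UMIE is taken to be the pair of
adaptor tag and start position, and the consensus is a simple per-position majority vote that
assumes all reads in a family have the same length.

```python
from collections import Counter, defaultdict

def deduplicate(reads):
    """Group reads by UMIE and return one consensus ("unique read") per family.

    reads: iterable of (adaptor_tag, start_position, sequence) tuples.
    Returns a dict mapping each UMIE, i.e. (adaptor_tag, start_position),
    to the consensus sequence of its redundant-read family.
    """
    families = defaultdict(list)
    for tag, start, seq in reads:
        families[(tag, start)].append(seq)

    unique_reads = {}
    for umie, seqs in families.items():
        # Majority vote at each position; assumes equal-length reads in a family.
        consensus = "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))
        unique_reads[umie] = consensus
    return unique_reads

# Hypothetical reads, for illustration only.
reads = [
    ("ACGTACGTAAA", 1000, "GATTACA"),
    ("ACGTACGTAAA", 1000, "GATTACA"),
    ("ACGTACGTAAA", 1000, "GATTGCA"),  # redundant read carrying a single-base error
    ("TTGACCAGCCC", 1000, "GATTACA"),  # same start site, different tag: a distinct clone
]
unique = deduplicate(reads)
print(len(unique))
```

The first three reads share a UMIE and collapse into one unique read, while the fourth is kept
as a separate clone even though it maps to the same start position, which is the case the
adaptor tag diversity is intended to resolve.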
(b) The adaptors used to create genomic libraries permit sample multiplexing
without creating adaptor-to-adaptor variability in copy number counts. A central foundation
of copy number determination is the simultaneous analysis of a set of samples that have all
been processed within a single sequencing run. This allows positive and negative controls to
be included along with clinical samples. A major issue with previous adaptor design
iterations was that they induced subtle shifts in gene copy counts among identical control samples, in effect
setting a signal-to-noise uncertainty threshold that was too high to be clinically useful in
blood-based, solid tumor genotyping assays. The present invention overcomes this issue and
substantially lowers the signal-to-noise threshold such that single copy gene loss is detectable
at ≤ 2% minor allele frequency. This improved signal recognition allows the methods of the
present invention to have significant clinical utility in circulating tumor DNA assays.
(c) The proprietary targeted hybrid capture method used herein must produce
highly uniform “on-target” read coverage across all targets. Methods that rely on counting of
unique genomic fragments to estimate copy number, such as the ones described herein, must
achieve near-saturation in terms of encountering all possible unique fragments. Near-
saturation is only achieved by oversampling, that is to say, gathering more sequencing reads
than the number of unique reads that will ultimately be encountered. To be practical, scalable,
and economical, the unique reads in a targeted hybrid capture library must exhibit sufficient
uniformity such that < 10-fold oversampling of on-target reads, and preferably < 4-fold
oversampling of on-target reads will capture > 90% of unique on-target reads at all target
loci.
(d) The targeted hybrid capture method (see U.S. Patent Publication No.
2014-0274731) must have high on-target capture rates. To be practical, scalable and economical, in
other words to be a distinguishing feature of the present disclosure relative to other art in the
field, the method must achieve >90%, preferably >95% on-target reads. With on-target
mapping rates exceeding 95%, the requirement for 4 to 10-fold oversampling of on-target
reads and the requirement for overall oversampling are one and the same.
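The oversampling requirement in item (c) can be illustrated with a simple calculation. The following is a minimal sketch, assuming perfectly uniform sampling of unique fragments (an idealization that the source does not claim); the function name and the default of 1,000 unique fragments are hypothetical. Under that assumption, the expected fraction of unique fragments seen at least once after k-fold oversampling is approximately 1 - e^(-k).

```python
import math


def expected_unique_fraction(oversampling: float, n_unique: int = 1000) -> float:
    """Expected fraction of unique fragments observed at least once when
    oversampling * n_unique reads are drawn uniformly at random.

    Assumption: every unique fragment is equally likely to be sampled.
    """
    n_reads = round(oversampling * n_unique)
    return 1.0 - (1.0 - 1.0 / n_unique) ** n_reads


if __name__ == "__main__":
    for fold in (1, 2, 4, 10):
        exact = expected_unique_fraction(fold)
        approx = 1.0 - math.exp(-fold)  # Poisson approximation
        print(f"{fold:>2}-fold oversampling: {exact:.3f} (approx. {approx:.3f})")
```

In this idealized model, 4-fold oversampling already recovers more than 90% of unique fragments; non-uniform coverage would be expected to require deeper sampling, which is why uniformity matters.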
In some embodiments, the number of copies of the DNA target region
present in the sample is determined by the quantitative genetic analysis. In some
embodiments, the copy number of the DNA target region is determined by comparing the
number of copies of the DNA target region present in the sample to the amounts of
DNA target regions present in one or more samples with known copy number.
Particular embodiments contemplate that the compositions and
methods described herein are particularly useful for detecting changes in copy number in a
sample of genomic DNA, where only a portion of the total genomic DNA in the sample has a
change in copy number. For example, a significant tumor mutation may be present in a
sample, e.g., a sample of cell free DNA, that is present in a minor allele frequency that is
significantly less than 50% (e.g., in the range of 0.1% to >20%), in contrast to conventional
SNP genotyping where allele frequencies are generally ~100%, 50% or 0%. One of skill in
the art will recognize that the compositions and methods of the current invention are also
useful in detecting other types of mutation including single nucleotide variants (SNVs), short
(e.g., less than 40 base pairs (bp)) insertions and deletions (indels), and genomic
rearrangements including oncogenic gene fusions.
In certain embodiments, the compositions and/or methods of the
present invention described herein are useful for, capable of, suited for, and/or able to detect,
identify, observe, and/or reveal a change in copy number of one or more DNA target regions
present in less than about 20%, less than about 19%, less than about 18%, less than about
17%, less than about 16%, less than about 15%, less than about 14%, less than about 13%,
less than about 12%, less than about 11%, less than about 10%, less than about 9%, less than
about 8%, less than about 7%, less than about 6%, less than about 5%, less than about 4%,
less than about 3%, less than about 2%, less than about 1%, less than about 0.5%, less than
about 0.2%, or less than about 0.1% of the total genomic DNA from the sample. In some
embodiments, the methods of the present invention are useful for, capable of, suited for,
and/or able to detect, identify, observe, and/or reveal a change in copy number of one or more
DNA target regions present in between about 0.01% to about 100%, about 0.01% to about
50%, and/or about 0.1% to about 20% of the total genomic DNA from the sample.
Particular embodiments are represented by the conceptual framework
that is illustrated in the accompanying figure, in which each gene is represented by a row and
each patient sample is represented as a column. Within any given genomic DNA sample, the
number of fragments counted for each individual gene will have some variability, and for any
given DNA region of interest, e.g., a gene, perturbations in copy number are observed as
significant fragment count deviations relative to the normalized counts for the DNA target
region in other samples. Such an assay requires the gene-by-gene fragment counting profile
within a sample to be reproducible, and also requires the sample-by-sample counting profiles
to be highly comparable. Both assay requirements demand excellent signal-to-noise counting
discrimination.
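The gene-by-sample framework above can be sketched in code. This is a minimal illustration under stated assumptions, not the normalization used in the present disclosure: counts are scaled by each sample's total, then each gene is compared with the cohort median for that gene. The gene names, counts, and function names are hypothetical.

```python
from statistics import median


def normalize_by_sample_total(counts):
    """Scale each sample (column) by its total fragment count.

    counts maps gene name -> list of unique fragment counts, one per sample.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    totals = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    return {g: [counts[g][j] / totals[j] for j in range(n_samples)] for g in genes}


def ratio_to_cohort_median(counts):
    """For each gene (row), divide each sample's normalized count by the
    median normalized count of that gene across all samples."""
    normalized = normalize_by_sample_total(counts)
    return {g: [v / median(vals) for v in vals] for g, vals in normalized.items()}


# Hypothetical counts: rows are genes, columns are four samples.
# The fourth sample carries a simulated loss in GENE_A.
example_counts = {
    "GENE_A": [1000, 1000, 1000, 500],
    "GENE_B": [2000, 2000, 2000, 2000],
    "GENE_C": [1500, 1500, 1500, 1500],
}

if __name__ == "__main__":
    for gene, ratios in ratio_to_cohort_median(example_counts).items():
        print(gene, [round(r, 3) for r in ratios])
```

A ratio well below 1 for one gene in one sample, with the other samples near 1, is the kind of fragment count deviation the framework treats as a candidate copy number change.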
Some embodiments contemplate that the assay elements that contribute
to increasing the signal-to-noise ratio are the genomic input, the number of probes, and the
sequencing depth, as illustrated in the figures.
In particular embodiments, a method for genetic analysis of cfDNA
comprises: generating and amplifying a cfDNA library, determining the number of genome
equivalents in the cfDNA library, and performing a quantitative genetic analysis of one or
more genomic target loci.
Particular embodiments contemplate that any of the methods and
compositions described herein are effective for use to efficiently analyze, detect, diagnose,
and/or monitor genetic states, genetic conditions, genetic diseases, genetic mosaicism, fetal
diagnostics, paternity testing, microbiome profiling, pathogen screening, and organ transplant
monitoring using genomic DNA, e.g., cellular DNA or cfDNA, where all or where only a portion of
the total genomic DNA in the sample has a feature of interest, e.g., a genetic lesion, mutation,
or single nucleotide variant (SNV). In some embodiments, a feature of interest is a genetic
feature associated with a disease or condition. For example, a significant tumor mutation may
be present in a sample, e.g., a sample of cfDNA, that is present in a minor allele frequency
that is significantly less than 50% (e.g., in the range of 0.1% to >20%), in contrast to
conventional SNP genotyping where allele frequencies are generally ~100%, 50% or 0%.
In certain embodiments, the compositions and/or methods of the
present invention described herein are useful for, capable of, suited for, and/or able to detect,
identify, observe, and/or reveal a genetic lesion of one or more DNA target regions present in
less than about 20%, less than about 19%, less than about 18%, less than about 17%, less than
about 16%, less than about 15%, less than about 14%, less than about 13%, less than about
12%, less than about 11%, less than about 10%, less than about 9%, less than about 8%, less
than about 7%, less than about 6%, less than about 5%, less than about 4%, less than about
3%, less than about 2%, less than about 1%, less than about 0.5%, less than about 0.2%, or
less than about 0.1% of the total genomic DNA from the sample. In some embodiments, the
methods of the present invention are useful for, capable of, suited for, and/or able to detect,
identify, observe, and/or reveal a genetic lesion of one or more DNA target regions present in
between about 0.01% to about 100%, about 0.01% to about 50%, and/or about 0.1% to about
20% of the total genomic DNA from the sample.
1. GENERATING A DNA LIBRARY
In particular embodiments, methods of genetic analysis contemplated
herein comprise generating a DNA library comprising treating cfDNA or fragmented cellular
genomic DNA with one or more end-repair enzymes to generate end-repaired DNA and
attaching one or more adaptors to each end of the end-repaired DNA to generate the DNA
library.
Genomic DNA
In particular embodiments, the methods and compositions
contemplated herein are designed to efficiently analyze, detect, diagnose, and/or monitor
change in copy number using genomic DNA as an analyte. In certain embodiments, copy
number analysis is performed by generating a genomic DNA library from genomic DNA
obtained from a test sample, e.g., a biological sample such as a tissue biopsy. In certain
embodiments, the genomic DNA is circulating or cell free DNA. In some embodiments, the
genomic DNA is cellular genomic DNA.
In certain embodiments, genomic DNA is obtained from a tissue
sample or biopsy taken from a tissue, including but not limited to, bone marrow, esophagus,
stomach, duodenum, rectum, colon, ileum, pancreas, lung, liver, prostate, brain, nerves,
eal tissue, renal tissue, endometrial tissue, cervical tissue, , lymph node, muscle,
and skin. In certain embodiments, the tissue sample is a biopsy of a tumor or a suspected
tumor. In particular embodiments, the tumor is cancerous or suspected of being cancerous. In
particular embodiments, the tissue sample comprises cancer cells or cells suspected of being
cancerous.
Methods for purifying genomic DNA from cells or from a biologic
tissue comprised of cells are well known in the art, and the skilled artisan will recognize
procedures or commercial kits suited to the tissue and the conditions in which
the tissue is obtained. Some embodiments contemplate that purifying cellular DNA from a
tissue will require cell disruption or cell lysis to expose the cellular DNA within, for example
by chemical and physical methods such as blending, grinding or sonicating the tissue sample;
removing membrane lipids by adding a detergent or surfactants, which also serves in cell
lysis; optionally removing proteins, for example by adding a protease; removing RNA, for
example by adding an RNase; and DNA purification, for example from detergents, proteins,
salts and reagents used during the cell lysis step. DNA purification may be performed by
precipitation, for example with ethanol or isopropanol, or by phenol-chloroform extraction.
In particular embodiments, cellular DNA obtained from tissues and/or
cells is fragmented prior to and/or during obtaining, generating, making, forming, and/or
producing a genomic DNA library as described herein. One of skill in the art will understand
that there are several suitable techniques for DNA fragmentation, and is able to recognize and
identify suitable techniques for fragmenting cellular DNA for the purposes of generating a
genomic DNA library for DNA sequencing, including but not limited to next-generation
sequencing. Certain embodiments contemplate that cellular DNA can be fragmented into
fragments of appropriate and/or sufficient length for generating a library by methods
including but not limited to physical fragmentation, enzymatic fragmentation, and chemical
shearing.
Physical fragmentation can include, but is not limited to, acoustic
shearing, sonication, and hydrodynamic shear. In some embodiments, cellular DNA is
fragmented by physical fragmentation. In particular embodiments, cellular DNA is
fragmented by acoustic shearing or sonication. Particular embodiments contemplate that
acoustic shearing and sonication are common physical methods used to shear cellular DNA.
The Covaris® instrument (Woburn, MA) is an acoustic device for breaking DNA into fragments
of 100 bp to 5 kb. Covaris also manufactures tubes (gTubes) which will process samples in the
6-20 kb range for mate-pair libraries. The Bioruptor® (Denville, NJ) is a sonication device utilized for
shearing chromatin, DNA and disrupting tissues. Small volumes of DNA can be sheared to
150 bp to 1 kb in length. Hydroshear from Digilab (Marlborough, MA) utilizes hydrodynamic
forces to shear DNA. Nebulizers (Life Tech, Grand Island, NY) can also be used to atomize
liquid using compressed air, shearing DNA into fragments of 100 bp to 3 kb in seconds. Nebulization
is low cost, but the process can cause a loss of about 30% of the cellular DNA from the
original sample. In certain embodiments, cellular DNA is fragmented by sonication.
Enzymatic fragmentation can include, but is not limited to, treatment
with a restriction endonuclease, e.g., DNase I, or treatment with a nonspecific nuclease. In
some embodiments, cellular DNA is fragmented by enzymatic fragmentation. In particular
embodiments, the cellular DNA is fragmented by treatment with a restriction endonuclease.
In some embodiments, the cellular DNA is fragmented by treatment with a nonspecific
nuclease. In certain embodiments, the cellular DNA is fragmented by treatment with a
transposase. Certain embodiments contemplate that enzymatic methods to shear cellular
DNA into small pieces include DNase I, a combination of maltose binding protein (MBP)-T7
Endo I and a non-specific nuclease, Vibrio vulnificus (Vvn), New England Biolabs’s
(Ipswich, MA) Fragmentase, and Nextera tagmentation technology (Illumina, San Diego,
CA). The combination of non-specific nuclease and T7 Endo works synergistically to produce
non-specific nicks and counter nicks, generating fragments that disassociate 8 nucleotides or
less from the nick site. Tagmentation uses a transposase to simultaneously fragment and
insert adapters onto double stranded DNA.
Chemical fragmentation can include treatment with heat and divalent
metal cation. In some embodiments, genomic DNA is fragmented by chemical fragmentation.
Particular embodiments contemplate that chemical shear is more commonly used for the
breakup of long RNA fragments as opposed to genomic DNA. Chemical fragmentation is
typically performed through the heat digestion of DNA with a divalent metal cation
(magnesium or zinc). The length of DNA fragments can be adjusted by increasing or
decreasing the time of incubation.
In particular embodiments, the methods and compositions
contemplated herein are designed to efficiently analyze, detect, diagnose, and/or monitor
change in copy number using cell-free DNA (cfDNA) as an analyte. The size distribution of
cfDNA ranges from about 150 bp to about 180 bp fragments. Fragmentation of cfDNA may
be the result of endonucleolytic and/or exonucleolytic activity and presents a formidable
challenge to the accurate, reliable, and robust analysis of cfDNA. Another challenge for
analyzing cfDNA is its short half-life in the blood stream, on the order of about 15 minutes.
Without wishing to be bound to any particular theory, the present invention contemplates, in
part, that analysis of cfDNA is like a “liquid biopsy” and is a real-time snapshot of current
biological processes.
Moreover, because cfDNA is not found within cells and may be
obtained from a number of suitable sources including, but not limited to, biological fluids and
stool samples, it is not subject to the existing limitations that plague next generation
sequencing analysis, such as direct access to the tissues being analyzed.
Illustrative examples of biological fluids that are suitable sources from
which to isolate cfDNA in particular embodiments include, but are not limited to, amniotic
fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine,
saliva, mucous, and sweat. In particular embodiments, the biological fluid is blood or blood
plasma.
In certain embodiments, commercially available kits and other
methods known to the skilled artisan can be used to isolate cfDNA directly from the biological
fluids of a subject or from a previously obtained and optionally stabilized biological sample,
e.g., by freezing and/or addition of enzyme chelating agents including, but not limited to,
EDTA, EGTA, or other chelating agents specific for divalent cations.
(a) Generating End-Repaired cfDNA
In particular embodiments, generating a genomic DNA library
comprises the end-repair of isolated cfDNA or fragmented cellular DNA. The fragmented
cfDNA or cellular DNA is processed by end-repair enzymes to generate end-repaired cfDNA
with blunt ends, 5’-overhangs, or 3’-overhangs. In some embodiments, the end-repair
enzymes can yield for example. In some embodiments, the end-repaired cfDNA or cellular
DNA contains blunt ends. In some embodiments, the end-repaired cellular DNA or cfDNA is
processed to contain blunt ends. In some embodiments, the blunt ends of the end-repaired
cfDNA or cellular DNA are further modified to contain a single base pair overhang. In some
embodiments, end-repaired cfDNA or cellular DNA containing blunt ends can be further
processed to contain an adenine (A)/thymine (T) overhang. In some embodiments, end-repaired
cfDNA or cellular DNA containing blunt ends can be further processed to contain an adenine
(A)/thymine (T) overhang as the single base pair overhang. In some embodiments, the
end-repaired cfDNA or cellular DNA has non-templated 3’ overhangs. In some embodiments, the
end-repaired cfDNA or cellular DNA is processed to contain 3’ overhangs. In some
embodiments, the end-repaired cfDNA or cellular DNA is processed with terminal
transferase (TdT) to contain 3’ overhangs. In some embodiments, a G-tail can be added by
TdT. In some embodiments, the end-repaired cfDNA or cellular DNA is processed to contain
overhang ends using partial digestion with any known restriction enzymes (e.g., with the
enzyme Sau3A, and the like).
(b) Attaching Adaptor Molecules to End-Repaired cfDNA
In particular embodiments, generating a cfDNA library comprises
attaching one or more adaptors to each end of the end-repaired cfDNA. The present invention
contemplates, in part, an adaptor module designed to accommodate large numbers of genome
equivalents in cfDNA libraries. Adaptor modules are configured to measure the number of
genome equivalents present in cfDNA libraries, and, by extension, the sensitivity of
sequencing assays used to identify sequence mutations.
As used herein, the terms “adaptor” and “adaptor module” are used
interchangeably, and refer to a polynucleotide that comprises at least three
elements: an amplification region, a sample tag region, and an anchor region. In particular
embodiments, the adaptor comprises an amplification region, a sample tag region, and an
anchor region. In some embodiments, the adaptor also comprises a unique molecule identifier
(UMI). In particular embodiments, the adaptor comprises one or more amplification regions, one or
more sample tag regions, one or more UMIs, and/or one or more anchor regions. In some
embodiments, the adaptor comprises, in order from 5’ to 3’, an amplification region, a sample
tag region, a UMI, and an anchor region. In particular embodiments, the adaptor comprises,
in order from 5’ to 3’, an amplification region, a sample tag region, a UMI, and an anchor
region. In certain embodiments, the UMI is contained within the sample tag region, and the
adaptor comprises, in order from 5’ to 3’, an amplification region, an integrated sample
tag/UMI region, and an anchor region.
As used herein, the term “amplification region” refers to an element of
the adaptor molecule that comprises a polynucleotide sequence capable of serving as a primer
recognition site for PCR amplification. In particular embodiments, an adaptor comprises an
amplification region that comprises one or more primer recognition sequences for
single-primer amplification of a genomic DNA library. In some embodiments, the amplification
region comprises one, two, three, four, five, six, seven, eight, nine, ten, or more primer
recognition sequences for single-primer amplification of a genomic DNA library.
In some embodiments, the amplification region is between about 5
and 50 nucleotides, between 10 and 45 nucleotides, between 15 and 40 nucleotides, or
between 20 and 30 nucleotides in length. In some embodiments, the amplification region is
10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides,
16 nucleotides, 17 nucleotides, about 18 nucleotides, 19 nucleotides, 20 nucleotides, 21
nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27
nucleotides, 28 nucleotides, 29 nucleotides, 30 nucleotides, 31 nucleotides, 32 nucleotides, 33
nucleotides, 34 nucleotides, 35 nucleotides, 36 nucleotides, 37 nucleotides, 38 nucleotides, 39
nucleotides, or 40 nucleotides or more. In particular embodiments, the amplification region is
nucleotides in length.
As used herein, the terms “sample tag” and “sample tag region” are used
interchangeably and refer to an element of the adaptor that comprises a polynucleotide
sequence that uniquely identifies the particular DNA fragment as well as the sample from
which it was derived.
In certain embodiments, the sample tag region is between about 3
and 50 nucleotides, between 3 and 25 nucleotides, or between 5 and 15 nucleotides in length.
In some embodiments, the sample tag region is 3 nucleotides, 4 nucleotides, 5 nucleotides, 6
nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, about 11 nucleotides,
12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides,
18 nucleotides, 19 nucleotides, or 20 nucleotides or more in length.
In certain embodiments, the adaptor comprises a UMI multiplier,
wherein the UMI multiplier is at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at
least 7, at least 8, at least 9, or at least 10 nucleotides in length.
In certain embodiments, each nucleotide position of the UMI multiplier
can comprise any of adenine, guanine, cytosine, or thymine. Thus, in some embodiments, a
UMI multiplier comprising n nucleotides can comprise any of 4^n possible
nucleotide sequences. In some embodiments, the UMI multiplier is one nucleotide in length
and comprises one of four possible sequences. In some embodiments, the UMI multiplier is
two nucleotides in length and comprises one of sixteen possible sequences. In some
embodiments, the UMI multiplier is three nucleotides in length and comprises one of 64
possible sequences. In some embodiments, the UMI multiplier is four nucleotides in length
and comprises one of 256 possible sequences. In some embodiments, the UMI multiplier is
five nucleotides in length and comprises one of 1,024 possible sequences. In some
embodiments, the UMI multiplier is six nucleotides in length and comprises one of 4,096
possible sequences. In some embodiments, the UMI multiplier is seven nucleotides in length
and comprises one of 16,384 possible sequences. In some embodiments, the UMI multiplier
is eight nucleotides in length and comprises one of 65,536 possible sequences. In some
embodiments, the UMI multiplier is nine nucleotides in length and comprises one of 262,144
possible sequences. In some embodiments, the UMI multiplier is ten or more nucleotides in
length and comprises one of 1,048,576 or more possible sequences.
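The sequence counts listed above follow from four possible bases at each of n positions. As a brief check, the following sketch (the function name is hypothetical) reproduces the figures for lengths one through ten.

```python
def umi_multiplier_diversity(length: int) -> int:
    """Number of distinct sequences for a UMI multiplier of the given
    length, with A, C, G, or T allowed at every position (4 ** length)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return 4 ** length


if __name__ == "__main__":
    for n in range(1, 11):
        print(f"{n:>2} nt: {umi_multiplier_diversity(n):,} possible sequences")
```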
In particular embodiments, the adaptor comprises a UMI multiplier,
wherein the UMI multiplier is adjacent to or contained within the sample tag region.
Illustrative examples of UMI multipliers adjacent to or contained within the sample tag are
shown in the figures, in which an 8-mer sample tag region is shown with an adjacent UMI
multiplier (top and bottom rows) or a UMI multiplier incorporated within the sample tag
(middle 7 rows). In some embodiments, the adaptor comprises a sample tag that is eight
nucleotides in length and a UMI multiplier that is three nucleotides in length and comprises
one of 64 possible sequences, and wherein the UMI multiplier is adjacent to or contained
within the sample tag region. In some embodiments, identical processes attach the full length
adaptor to the other end of the genomic fragments.
In particular embodiments, an adaptor module comprises one or more
anchor sequences. As used herein, an “anchor region” and “anchor sequence” are used
interchangeably and refer to a nucleotide sequence that hybridizes to a partner
oligonucleotide. In some embodiments, the anchor region comprises the following three
properties: (1) each anchor sequence is part of a family of two or more anchor sequences that
collectively represent each of the four possible DNA bases at each site within extension; this
feature, balanced base representation, is useful to calibrate proper base calling in sequencing
reads in particular embodiments; (2) each anchor sequence is composed of only two of four
possible bases, and these are specifically chosen to be either an equal number of A + C or an
equal number of G + T; an anchor sequence formed from only two bases reduces the
possibility that the anchor sequence will participate in secondary structure formation that
would preclude proper adaptor function; and (3) because each anchor sequence is composed
of equal numbers of A + C or G + T, each anchor sequence shares roughly the same melting
temperature and duplex stability as every other anchor sequence in a set of four.
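The three anchor properties can be expressed as simple checks. The sketch below uses a hypothetical family of four 10-nucleotide anchors built from one arbitrary base pattern; these are not the anchor sequences of the present disclosure. The melting temperature estimate uses the Wallace rule (2 °C per A or T, 4 °C per G or C), which is only a rough approximation for short oligonucleotides and is an assumption of this example.

```python
def is_two_base_balanced(seq: str) -> bool:
    """Property (2): only A + C or only G + T, in equal numbers."""
    for first, second in (("A", "C"), ("G", "T")):
        if set(seq) == {first, second}:
            return seq.count(first) == seq.count(second)
    return False


def family_has_balanced_bases(anchors) -> bool:
    """Property (1): at every position the family shows A, C, G and T."""
    if len({len(a) for a in anchors}) != 1:
        return False
    return all(set(column) == set("ACGT") for column in zip(*anchors))


def wallace_tm(seq: str) -> int:
    """Property (3), roughly: Wallace-rule melting temperature estimate."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical anchors derived from one A/C pattern by base substitution.
_base = "AACCACACCA"
hypothetical_anchors = [
    _base,
    _base.translate(str.maketrans("AC", "CA")),
    _base.translate(str.maketrans("AC", "GT")),
    _base.translate(str.maketrans("AC", "TG")),
]

if __name__ == "__main__":
    for anchor in hypothetical_anchors:
        print(anchor, is_two_base_balanced(anchor), wallace_tm(anchor))
    print("balanced family:", family_has_balanced_bases(hypothetical_anchors))
```

Because every anchor in such a family contains the same number of strong (G or C) and weak (A or T) bases, the rough melting temperature estimate is identical across the set.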
In some embodiments, the anchor sequence is between 1 and 50
nucleotides in length. In some embodiments, the anchor sequence is between 4 and 40
nucleotides in length. In certain embodiments, the anchor region is between 5 and 25
nucleotides in length. In particular embodiments, the anchor region is at least 4 nucleotides,
at least 6 nucleotides, at least 8 nucleotides, at least 10 nucleotides, at least 12 nucleotides,
at least 14 nucleotides, or at least 16 nucleotides in length. In particular embodiments, the
anchor region is 10 nucleotides in length.
In particular embodiments, an attachment step comprises
attaching/ligating an adaptor module to the end-repaired cfDNA or cellular DNA to generate
a “tagged” genomic DNA library. In some embodiments, a single adaptor module is
employed. In some embodiments, two, three, four or five adaptor modules are employed. In
some embodiments, an adaptor module of identical sequence is attached to each end of the
fragmented end-repaired DNA.
In some embodiments, a plurality of adaptor species is attached to
end-repaired cellular or cell free genomic DNA fragments. Each of the plurality of adaptors
may comprise one or more amplification regions for the amplification of the cfDNA or
cellular DNA library; one or more sample tag regions for the identification of the cfDNA or
cellular genomic DNA fragment and identification of the individual sample; and one or more
sequences for DNA sequencing.
In some embodiments, a plurality of adaptor species is attached to
end-repaired cellular or cell free genomic DNA fragments of a sample, and the plurality of
adaptors all comprise amplification regions of an identical nucleotide sequence.
In certain embodiments, the genomic DNA from a sample is attached
with a plurality of adaptors that comprise sample tag sequences that all are different from
other sequences of sample tag regions in adaptors that are attached to genomic DNA
fragments from other samples.
In particular embodiments, a plurality of adaptor species is attached to
end-repaired cellular or cell free genomic DNA fragments from a sample, and the plurality
of adaptors all comprise one or more sample tag regions comprising one of between 2 and
10,000 nucleotide sequences, one of between 5 and 5,000 nucleotide sequences, one of
between 25 and 1,000 nucleotide sequences, one of between 50 and 500 nucleotide
sequences, one of between 100 and 400 nucleotide sequences, or one of between 200 and 300
nucleotide sequences. In some embodiments, the sample tag region of each adaptor is 8
nucleotides in length, and each sample tag region of the plurality of adaptors comprises one
of 240 nucleotide sequences.
In certain embodiments, a plurality of adaptor species is attached to
end-repaired cellular or cell free genomic DNA fragments from a sample, and the sample tag
regions of the plurality of adaptors comprise nucleotide sequences that are different from
each other by a Hamming distance of 1, 2, 3, 4 or greater than 4. In particular embodiments,
the Hamming distance is 2.
In particular embodiments, the sample tag regions of the plurality of
adaptors that are attached to genomic DNA fragments of a sample are 8 nucleotides in length,
and comprise one of 240 nucleotide sequences that are different from each other by a
Hamming distance of 2.
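A set of 8-nucleotide tags with a pairwise Hamming distance of at least 2 can be built in many ways. The following is a minimal sketch using a greedy search in lexicographic order; it is an illustrative assumption, not the tag design procedure of the present disclosure, and the function names are hypothetical.

```python
from itertools import combinations, product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def greedy_tag_set(length: int = 8, min_distance: int = 2, size: int = 240):
    """Collect tags in lexicographic order, keeping a candidate only if it
    differs from every tag kept so far at min_distance or more positions."""
    chosen = []
    for bases in product("ACGT", repeat=length):
        tag = "".join(bases)
        if all(hamming(tag, kept) >= min_distance for kept in chosen):
            chosen.append(tag)
            if len(chosen) == size:
                break
    return chosen


if __name__ == "__main__":
    tags = greedy_tag_set()
    closest = min(hamming(a, b) for a, b in combinations(tags, 2))
    print(len(tags), "tags; minimum pairwise Hamming distance:", closest)
```

With a minimum distance of 2, a single sequencing error in a tag cannot silently convert it into another valid tag, so such reads can be flagged rather than misassigned.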
In certain embodiments, the sample tag region serves to identify
individual genomic DNA fragments and to identify the individual sample, i.e., the genomic
library source. For example, when the sample tags of a plurality of adaptors attached to a
sample have one of 240 possible sequences, each sample is identified as having one of 240
possible tags, and each sample receives a set of 240 tags that are discrete from any other
sample by a Hamming distance of two (meaning two base changes are required to change one
tag into another). These same tags are used to enumerate clone diversity and thus they also
serve as sequence tags, i.e., to identify genomic DNA fragments. To further augment the
diversity of possible sequence tags, UMI multipliers may be added. For example, a UMI
multiplier can be added to the adaptor region comprising 3 nucleotides consisting of the 64
possible combinations of 3 bases. In addition, the plurality of adaptors can comprise more
than one anchor sequence. For example, a plurality of adaptors may contain 4 different
anchor sequences that are used simultaneously. These anchor sequences may also be used during
sample de-multiplexing to lower errors.
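The use of tags to enumerate clone diversity can be sketched as a grouping step. In this minimal illustration, reads that share the same sample tag, UMI multiplier, and mapping coordinate are treated as redundant copies of one clone. The record fields (`tag`, `umi`, `chrom`, `start`) and the example reads are hypothetical, and a production pipeline would typically also handle sequencing errors in the tags.

```python
from collections import Counter


def count_unique_clones(reads):
    """Group reads by (sample tag, UMI multiplier, chromosome, start).

    Returns a Counter mapping each unique clone key to the number of
    redundant reads observed for that clone.
    """
    clones = Counter()
    for read in reads:
        key = (read["tag"], read["umi"], read["chrom"], read["start"])
        clones[key] += 1
    return clones


# Hypothetical reads: the first two are redundant copies of one clone.
example_reads = [
    {"tag": "AACCGGTT", "umi": "ACG", "chrom": "chr17", "start": 7_670_000},
    {"tag": "AACCGGTT", "umi": "ACG", "chrom": "chr17", "start": 7_670_000},
    {"tag": "AACCGGTT", "umi": "TTA", "chrom": "chr17", "start": 7_670_000},
    {"tag": "AACCGGTT", "umi": "ACG", "chrom": "chr17", "start": 7_670_042},
]

if __name__ == "__main__":
    clones = count_unique_clones(example_reads)
    print("total reads:", sum(clones.values()))
    print("unique clones:", len(clones))
```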
The figures show an illustrative comparison between a first generation
adaptor (FIG. 4A and 4B) and an adaptor of the present invention. FIG. 4A and 4B show an
example of a first generation adaptor that is 40 nt in length and
consists of a discrete PCR amplification sequence, sequence tag, and sample tag. Here, the
sample is identified by a fixed sequence (sample tag) that is present on all adaptors that are
used to generate a DNA library from the sample. Individual genomic fragments are identified
by a separate and distinct sequence (sequence tag). The figures also show an illustrative
example of an adaptor of the present invention. The illustrative adaptor shown is 47
nucleotides in length, and the sequence tag is combined with the sample tag. There is an
additional 3 nt sequence, the UMI multiplier, consisting of the 64 possible combinations of 3
bases. The 10 nt anchor sequence is one of four different distinct sequences.
Thus, in the illustrative example, a set of
adaptors that are used in conjunction with a single sample comprises 240 sample tag sequences
that can be split into four sets of sample tag sequences with each set comprising 60 tags (one
for each nucleotide, A, C, T and G). Thus, each set of 60 tags is specific to one of four anchor
sequences. In total, a pool of 240 sample tag configurations is possible per sample.
Specifically, in this scenario, the 240 sample tag sequences are divided into four sets of 60
sequences, with each set directed to a specific anchor region. Therefore, the sample ID
involves not only the sequence information from the eight nucleotide sample tag, but also the
associated anchor sequence information. In addition, the position of sequences within the
read is fixed, and therefore the sample tags and anchor sequences must have a fixed position
within a sequencing read in order to pass inclusion filters for downstream consideration.
Further, the inclusion of the UMI multiplier increases the sequence tag diversity from 240 to
240 x 64 = 15,360 possible sequence tags.
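The fixed-position inclusion filter and the tag arithmetic can be sketched as follows. The read layout assumed here (an 8 nt sample tag, then a 3 nt UMI multiplier, then a 10 nt anchor, starting at the first base of the read) is a hypothetical simplification of the illustrative adaptor, and the tag, anchor, and read sequences are invented for the example.

```python
TAG_LEN, UMI_LEN, ANCHOR_LEN = 8, 3, 10  # lengths from the illustrative adaptor


def parse_read_prefix(read: str, tag_to_anchor: dict):
    """Return (tag, umi, anchor) if the read passes the inclusion filter,
    otherwise None.

    Assumed layout (hypothetical): tag, UMI multiplier, anchor, in that
    order, at fixed positions from the start of the read. The filter
    requires the tag to be known and paired with its expected anchor.
    """
    prefix_len = TAG_LEN + UMI_LEN + ANCHOR_LEN
    if len(read) < prefix_len:
        return None
    tag = read[:TAG_LEN]
    umi = read[TAG_LEN:TAG_LEN + UMI_LEN]
    anchor = read[TAG_LEN + UMI_LEN:prefix_len]
    if tag_to_anchor.get(tag) != anchor:
        return None
    return tag, umi, anchor


def sequence_tag_diversity(n_anchors=4, tags_per_anchor=60, umi_len=UMI_LEN):
    """4 anchor sets x 60 tags x 4**3 UMI multipliers in the example."""
    return n_anchors * tags_per_anchor * 4 ** umi_len


# Hypothetical lookup of one tag and its associated anchor.
example_lookup = {"ACGTACGT": "AACCACACCA"}

if __name__ == "__main__":
    good = "ACGTACGT" + "TTG" + "AACCACACCA" + "GATTACAGATTACA"
    bad = "ACGTACGT" + "TTG" + "GGTTGTGTTG" + "GATTACAGATTACA"
    print(parse_read_prefix(good, example_lookup))
    print(parse_read_prefix(bad, example_lookup))
    print(sequence_tag_diversity())
```

Checking the anchor alongside the tag reflects the point made above that the sample ID depends on both pieces of sequence information.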
Attachment of one or more adaptors contemplated herein may be
carried out by methods known to those of ordinary skill in the art. In particular embodiments,
one or more adaptors contemplated herein are attached to end-repaired cfDNA that comprises
blunt ends. In certain embodiments, one or more adaptors contemplated herein are attached to
end-repaired cfDNA that comprises complementary ends appropriate for the attachment
method employed. In certain embodiments, one or more adaptors contemplated herein are
attached to end-repaired cfDNA that comprises a 3’ overhang.
In some embodiments, attaching the genomic DNA fragments to a
plurality of adaptors includes the steps of attaching the end-repaired cfDNA or cellular DNA
fragments to an oligonucleotide containing at least a portion of an anchor region. In some
embodiments, the oligonucleotide contains the whole anchor region. In particular
embodiments, the oligonucleotide is a DNA duplex comprising a 5’ phosphorylated
attachment strand duplexed with a partner strand, wherein the partner strand is blocked from
attachment by chemical modification at its 3’ end, and wherein the attachment strand is
attached to the genomic DNA fragment. In certain embodiments, the DNA fragments
attached with at least a portion of the anchor region are then annealed with DNA
oligonucleotides encoding the full length adaptor sequences. In particular embodiments, one
or more polynucleotide kinases, one or more DNA ligases, and/or one or more DNA
polymerases are added to the genomic DNA fragments and the DNA oligonucleotides
encoding the full length adaptor sequence. In some embodiments, the polynucleotide kinase
is T4 polynucleotide kinase. In some embodiments, the DNA ligase is Taq DNA ligase. In
certain embodiments, the DNA polymerase is Taq polymerase. In particular embodiments,
the DNA polymerase is full length Bst polymerase.
The figures show an illustrative method for attaching a plurality of
adaptors to the 3’ end of end-repaired DNA fragments. In the first step, the anchor sequence is
attached to the 3’ ends of genomic fragments. In this step, the anchor portion is a DNA
duplex in which the ten nucleotide 5’ phosphorylated “attachment strand” is duplexed with an
eight nucleotide “partner strand” that is blocked from attachment by chemical modification at
its 3’ end. The anchor duplex is blunt-ended on the phosphorylated/blocked end and can
therefore attach to blunt-ended genomic fragments. In the next step, pools of oligonucleotides
encoding the full adaptor sequences are annealed to the initial anchor sequence. The
combined action of T4 polynucleotide kinase, Taq DNA ligase, and full-length Bst
polymerase attaches this oligonucleotide via ligation as illustrated for the top strand and extends
the initial anchor sequence by DNA polymerization on the bottom strand to complete the
full-length adaptor sequence. Identical processes may be used to attach full length adaptors to the
5’ end of the genomic fragments.
2. DNA LIBRARY AMPLIFICATION
In particular embodiments, methods of genetic analysis contemplated
herein comprise amplification of a genomic DNA library, e.g., a cellular DNA library or a
cfDNA library, to generate a DNA clone library or a library of DNA clones, e.g., a cfDNA
clone library or a library of cfDNA clones, or a cellular DNA clone library or a library of
cellular DNA clones. Each molecule of the DNA library comprises an adaptor attached to
each end of an end-repaired DNA fragment, and each adaptor comprises one or more
amplification regions. In some embodiments, different adaptors are attached to different ends
of the end-repaired cfDNA. In particular embodiments, different adaptors are attached to
different ends of the end-repaired cellular DNA.
In some embodiments, the same adaptor is attached to both ends of the
DNA fragment. Attachment of the same adaptor to both ends of end-repaired DNA allows for
PCR amplification with a single primer sequence. In particular embodiments, a portion of the
adaptor-attached cfDNA library will be amplified using standard PCR techniques with a
single primer sequence driving amplification. In one embodiment, the single primer sequence
is about 25 nucleotides, optionally with a projected Tm of ≥ 55° C under standard ionic
strength conditions.
In particular embodiments, picograms of the initial genomic DNA
library, e.g., a cellular DNA library or cfDNA library, are amplified into micrograms of DNA
clones, implying a 10,000-fold amplification. The amount of amplified product can be
measured using methods known in the art, e.g., quantification on a Qubit 2.0 or Nanodrop
instrument.
3. DETERMINING THE NUMBER OF GENOME EQUIVALENTS
In various embodiments, a method for genetic analysis of genomic
DNA comprises determining the number of genome equivalents in the DNA clone library. As
used herein, the term “genome equivalent” refers to the number of genome copies in each
library. An important challenge met by the compositions and methods contemplated herein is
achieving sufficient assay sensitivity to detect and analyze rare genetic mutations or
differences in genetic sequence. To determine an assay sensitivity value on a sample-by-sample
basis, the numbers of different and distinct sequences that are present in each sample are
measured by measuring the number of genome equivalents that are present in a sequencing
library. To establish sensitivity, the number of genome equivalents must be measured for
each sample library.
The number of genome equivalents can be determined by qPCR assay
or by using bioinformatics-based counting after sequencing is performed. In the process flow
of clinical samples, qPCR measurement of genome equivalents is used as a QC step for DNA
libraries, e.g., cfDNA libraries or genomic DNA libraries. It establishes an expectation for
assay sensitivity prior to sequence analysis and allows a sample to be excluded from analysis
if its corresponding DNA clone library lacks the required depth of genome equivalents.
Ultimately, the bioinformatics-based counting of genome equivalents is also used to identify
the genome equivalents (and hence the assay sensitivity and false negative estimates) for
each given DNA clone library.
The empirical qPCR assay and statistical counting assays should be
well correlated. In cases where sequencing fails to reveal the sequence depth in a DNA clone
library, reprocessing of the DNA clone library and/or additional sequencing may be required.
In one embodiment, the genome equivalents in a cellular DNA or
cfDNA clone library are determined using a quantitative PCR (qPCR) assay. In a particular
embodiment, a standard library of known concentration is used to construct a standard curve,
the measurements from the qPCR assay are fit to the resulting standard curve, and a value
for genome equivalents is derived from the fit. The present inventors have discovered that a
qPCR “repeat-based” assay comprising one primer that specifically hybridizes to a common
sequence in the genome, e.g., a repeat sequence, and another primer that binds to the primer
binding site in the adaptor, measured an 8-fold increase in genome equivalents compared to
methods using just the adaptor specific primer (present on both ends of the cfDNA clone).
The number of genome equivalents measured by the repeat-based assays provides a more
consistent library-to-library performance and a better alignment between qPCR estimates of
genome equivalents and bioinformatically counted tag equivalents in sequencing runs.
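The standard-curve step described above can be sketched as follows. This is a minimal illustration only, not the assay's actual analysis: the function names, the dilution series, and the Cq values are hypothetical, and the sketch assumes the usual log-linear relationship between quantification cycle (Cq) and the amount of input library.

```python
import math

def fit_standard_curve(known_ge, cq_values):
    """Least-squares fit of Cq = slope * log10(genome equivalents) + intercept."""
    xs = [math.log10(g) for g in known_ge]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def genome_equivalents_from_cq(cq, slope, intercept):
    """Invert the fitted curve to estimate genome equivalents for a sample."""
    return 10 ** ((cq - intercept) / slope)

# Hypothetical standard library dilutions (genome equivalents) and measured Cq values.
standards_ge = [10, 100, 1000, 10000]
standards_cq = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards_ge, standards_cq)
sample_ge = genome_equivalents_from_cq(25.05, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample GE={sample_ge:.0f}")
```

A sample whose estimate falls below the depth required for the intended sensitivity could then be excluded at the QC step described above.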
Illustrative examples of repeats suitable for use in the repeat-based
genome equivalent assays contemplated herein include, but are not limited to: short interspersed
nuclear elements (SINEs), e.g., Alu repeats, long interspersed nuclear elements (LINEs), e.g.,
LINE1, LINE2, LINE3, microsatellite repeat elements, e.g., short tandem repeats (STRs),
simple sequence repeats (SSRs), and mammalian-wide interspersed repeats (MIRs).
In one embodiment, the repeat is an Alu repeat.
4. QUANTITATIVE GENETIC ANALYSIS
In various embodiments, a method for genetic analysis of genomic
DNA, e.g., genomic cellular DNA or cfDNA, comprises quantitative genetic analysis of one or
more target genetic loci of the DNA library clones. Quantitative genetic analysis comprises
one or more of, or all of, the following steps: capturing DNA clones comprising a target
genetic locus; amplification of the captured targeted genetic locus; sequencing of the
amplified captured targeted genetic locus; and bioinformatic analysis of the resulting
sequence reads. As used herein, the term “DNA library clone” refers to a DNA library
fragment wherein the combination of the adaptor and the genomic DNA fragment results in a
unique DNA sequence (e.g., a DNA sequence that can be distinguished from that of another
DNA library clone).
(a) Capture of Target Genetic Locus
The present invention contemplates, in part, a capture probe module
designed to retain the efficiency and reliability of larger probes but that minimizes
uninformative sequence generation in a genomic DNA library that comprises smaller DNA
fragments, e.g., a cfDNA clone library. The terms “capture probe” and “capture probe module”
are used interchangeably herein and refer to a polynucleotide that comprises a capture probe
sequence and a tail sequence. In particular embodiments, the capture probe module sequence
or a portion thereof serves as a primer binding site for one or more sequencing primers.
In particular embodiments, a capture probe module comprises a
capture probe. As used herein a “capture probe” refers to a region capable of hybridizing to a
specific DNA target region. In some embodiments, the capture probes are used with a genomic
DNA library constructed from cellular DNA. In particular embodiments, the capture probes
are used with a genomic DNA library constructed from cfDNA. Because the average size of
cfDNA is about 150 to about 170 bp and cfDNA is highly fragmented, certain embodiments are
directed to compositions and methods that comprise the use of high density and
relatively short capture probes to interrogate DNA target regions of interest. In some
embodiments, the capture probes are capable of hybridizing to DNA target regions that are
distributed across all chromosomal segments at a uniform density. A set of such capture
probes is referred to herein as “chromosomal stability probes.” Chromosomal stability probes
are used to interrogate copy number variations on a genome-wide scale in order to provide a
genome-wide measurement of chromosomal copy number (e.g., chromosomal ploidy).
One particular concern with using high density capture probes is that
generally capture probes are designed using specific “sequence rules.” For example, regions
of redundant sequence or that exhibit extreme base composition biases are generally excluded
in designing capture probes. However, the present inventors have discovered that the lack of
flexibility in capture probe design rules does not substantially impact probe performance. In
contrast, capture probes chosen strictly by positional constraint provided on-target sequence
information; exhibit very little off-target and unmappable read capture; and yield uniform,
useful, on-target reads with only few exceptions. Moreover, the high redundancy at close
probe spacing more than compensates for occasional poor-performing capture probes.
In particular embodiments, a target region is targeted by a plurality of
capture probes, wherein any two or more capture probes are designed to bind to the target
region within 10 nucleotides of each other, within 15 nucleotides of each other, within 20
nucleotides of each other, within 25 nucleotides of each other, within 30 nucleotides of each
other, within 35 nucleotides of each other, within 40 nucleotides of each other, within 45
nucleotides of each other, or within 50 nucleotides or more of each other, as well as all
intervening nucleotide lengths.
In one embodiment, the capture probe is about 25 nucleotides, about
26 nucleotides, about 27 nucleotides, about 28 nucleotides, about 29 nucleotides, about 30
nucleotides, about 31 nucleotides, about 32 nucleotides, about 33 nucleotides, about 34
nucleotides, about 35 nucleotides, about 36 nucleotides, about 37 nucleotides, about 38
nucleotides, about 39 nucleotides, about 40 nucleotides, about 41 nucleotides, about 42
nucleotides, about 43 nucleotides, about 44 nucleotides, or about 45 nucleotides.
In one embodiment, the capture probe is about 100 nucleotides, about
200 nucleotides, about 300 nucleotides, about 400 nucleotides, or about 500 nucleotides. In
another embodiment, the capture probe is from about 100 nucleotides to about 500
nucleotides, about 200 nucleotides to about 500 nucleotides, about 300 nucleotides to about
500 nucleotides, or about 400 nucleotides to about 500 nucleotides, or any intervening range
thereof.
In a particular embodiment, the capture probe is 60 nucleotides. In
another embodiment, the capture probe is substantially smaller than 60 nucleotides but
hybridizes comparably to, as well as, or better than a 60 nucleotide capture probe targeting the
same DNA target region. In a certain embodiment, the capture probe is 40 nucleotides.
In certain embodiments, a capture probe module comprises a tail
sequence. As used herein, the term “tail sequence” refers to a polynucleotide at the 5’ end of
the capture probe module, which in particular embodiments can serve as a primer binding
site. In particular embodiments, a sequencing primer binds to the primer binding site in the
tail region.
In particular embodiments, the tail sequence is about 5 to about 100
nucleotides, about 10 to about 100 nucleotides, about 5 to about 75 nucleotides, about 5 to
about 50 nucleotides, about 5 to about 25 nucleotides, or about 5 to about 20 nucleotides. In
certain embodiments, the third region is from about 10 to about 50 nucleotides, about 15 to
about 40 nucleotides, about 20 to about 30 nucleotides or about 20 nucleotides, or any
intervening number of nucleotides.
In particular embodiments, the tail sequence is about 30 nucleotides,
about 31 nucleotides, about 32 nucleotides, about 33 nucleotides, about 34 nucleotides, about
35 nucleotides, about 36 nucleotides, about 37 nucleotides, about 38 nucleotides, about 39
nucleotides, or about 40 nucleotides.
In various embodiments, the capture probe module comprises a
specific member of a binding pair to enable isolation and/or purification of one or more
captured fragments of a tagged and/or amplified genomic DNA library (e.g., a cellular DNA or
cfDNA library) that hybridizes to the capture probe. In particular embodiments, the capture
probe module is conjugated to biotin or another suitable hapten, e.g., dinitrophenol or
digoxigenin.
In various embodiments, the capture probe module is hybridized to a
tagged and optionally amplified DNA library to form a complex. In some embodiments, the
multifunctional capture probe module substantially hybridizes to a specific genomic target
region in the DNA library.
Hybridization or hybridizing conditions can include any reaction
conditions where two nucleotide sequences form a stable complex; for example, the tagged
DNA library and capture probe module forming a stable tagged DNA library-capture probe
module complex. Such reaction conditions are well known in the art and those of skill in the
art will appreciate that such conditions can be modified as appropriate, e.g., decreased
annealing temperatures with shorter length capture probes, and remain within the scope of the
present invention. Substantial hybridization can occur when the second region of the capture
probe complex exhibits 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%,
89%, 88%, 85%, 80%, 75%, or 70% sequence identity, homology or complementarity to a
region of the tagged DNA library.
In particular embodiments, the capture probe is about 40 nucleotides
and has an optimal annealing temperature of about 44° C to about 47° C.
In certain embodiments, the methods contemplated herein comprise
isolating a tagged cfDNA library-capture probe module complex. In particular
embodiments, methods for isolating DNA complexes are well known to those skilled in the
art and any methods deemed appropriate by one of skill in the art can be employed with the
methods of the present invention (Ausubel et al., Current Protocols in Molecular Biology,
2007-2012). In particular embodiments, the complexes are isolated using biotin-streptavidin
isolation techniques.
In particular embodiments, removal of the single stranded 3’ ends from
the isolated tagged DNA library fragments-capture probe module complex is contemplated.
In certain embodiments, the methods comprise 3’-5’ exonuclease enzymatic processing of the
isolated tagged DNA library-multifunctional capture probe module complex to remove the
single stranded 3’ ends.
In certain other embodiments, the methods comprise performing 5'-3’
DNA polymerase extension of the multifunctional capture probe utilizing the isolated tagged
DNA library fragments as template.
In certain other embodiments, the methods comprise creating a hybrid
capture probe-isolated tagged DNA target molecule, e.g., a tagged cfDNA target molecule or
a tagged cellular DNA target molecule, through the concerted action of a 5’ FLAP
endonuclease, DNA polymerization and nick closure by a DNA ligase.
A variety of enzymes can be employed for the 3'-5' exonuclease
enzymatic processing of the isolated tagged DNA library-multifunctional capture probe
module complex. Illustrative examples of suitable enzymes, which exhibit 3’-5’ exonuclease
enzymatic activity, that can be employed in particular embodiments include, but are not
limited to: T4 or Exonucleases I, III, V (See also, Shevelev IV, Hubscher U., Nat Rev Mol
Cell Biol. 3(5):364—76 (2002)). In particular embodiments, the enzyme comprising 3'-5’
exonuclease activity is T4 polymerase. In particular embodiments, an enzyme which exhibits
3’-5’ exonuclease enzymatic activity and is capable of primer template extension can be
employed, including for example T4 or Exonucleases I, III, V. Id.
In some embodiments, the methods contemplated herein comprise
performing sequencing and/or PCR on the 3'-5' exonuclease enzymatically processed
complex discussed supra and elsewhere herein. In particular embodiments, a tail portion of a
capture probe molecule is copied in order to generate a hybrid nucleic acid molecule. In one
embodiment, the hybrid nucleic acid molecule generated comprises the target region capable
of hybridizing to the capture probe module and the complement of the capture probe module
tail sequence.
In a particular embodiment, genetic analysis comprises a) hybridizing
one or more capture probe modules to one or more target genetic loci in a plurality of
genomic DNA library clones to form one or more capture probe module-DNA library clone
complexes; b) isolating the one or more capture probe module-DNA library clone complexes
from a); c) enzymatically processing the one or more isolated capture probe module-DNA
library clone complexes from step b); d) performing PCR on the enzymatically processed
complex from c) wherein the tail portion of the capture probe molecule is copied in order to
generate amplified hybrid nucleic acid molecules, wherein the amplified hybrid nucleic acid
molecules comprise a target sequence in the target genomic locus capable of hybridizing to
the capture probe and the complement of the capture probe module tail sequence; and e)
performing quantitative genetic analysis on the amplified hybrid nucleic acid molecules from d).
In a particular embodiment, methods for determining copy number of a
specific target genetic locus are contemplated comprising: a) hybridizing one or more capture
probe modules to one or more target genetic loci in a plurality of DNA library clones to form
one or more capture probe module-DNA library clone complexes; b) isolating the one or
more capture probe module-DNA library clone complexes from a); c) enzymatically
processing the one or more isolated capture probe module-DNA library clone complexes
from step b); d) performing PCR on the enzymatically processed complex from c) wherein
the tail portion of the capture probe molecule is copied in order to generate amplified hybrid
nucleic acid molecules, wherein the amplified hybrid nucleic acid molecules comprise a
target sequence in the target genetic locus capable of hybridizing to the capture probe and the
complement of the capture probe module tail sequence; e) performing PCR amplification of
the amplified hybrid nucleic acid molecules in d); and f) quantitating the PCR reaction in e),
wherein the quantitation allows for a determination of copy number of the specific target
region.
In one embodiment, the enzymatic processing of step c) comprises
performing 3’-5’ exonuclease enzymatic processing on the one or more capture probe
module-DNA library clone complexes from b) using an enzyme with 3’-5’ exonuclease
activity to remove the single stranded 3’ ends; creating one or more hybrid capture probe
module-cfDNA library clone molecules through the concerted action of a 5’ FLAP
endonuclease, DNA polymerization and nick closure by a DNA ligase; or performing 5’-3’
DNA polymerase extension of the capture probe using the isolated DNA clone in the
complex as a template.
In one embodiment, the enzymatic processing of step c) comprises
performing 5’-3' DNA polymerase extension of the capture probe using the isolated DNA
clone in the complex as a template.
In particular embodiments, PCR can be performed using any standard
PCR reaction conditions well known to those of skill in the art. In certain embodiments, the
PCR reaction in e) employs two PCR primers. In one embodiment, the PCR reaction in e)
employs a first PCR primer that hybridizes to a repeat within the target genetic locus. In a
particular embodiment, the PCR reaction in e) employs a second PCR primer that hybridizes
to the hybrid nucleic acid molecules at the target genetic locus/tail junction. In certain
embodiments, the PCR reaction in e) employs a first PCR primer that hybridizes to the target
genetic locus and a second PCR primer that hybridizes to the amplified hybrid nucleic acid
molecules at the target genetic locus/tail junction. In particular embodiments, the second
primer hybridizes to the target genetic locus/tail junction such that at least one or more
nucleotides of the primer hybridize to the target genetic locus and at least one or more
nucleotides of the primer hybridize to the tail sequence.
In certain embodiments, the amplified hybrid nucleic acid molecules
obtained from step e) are sequenced and the sequences aligned horizontally, i.e., aligned to
one another but not aligned to a reference sequence. In particular embodiments, steps a)
through e) are repeated one or more times with one or more capture probe modules. The
capture probe modules can be the same or different and designed to target either cfDNA
strand of a target genetic locus. In some embodiments, when the capture probes are different,
they hybridize at overlapping or adjacent target sequences within a target genetic locus in the
tagged cfDNA clone library. In one embodiment, a high density capture probe strategy is
used wherein a plurality of capture probes hybridize to a target genetic locus, and wherein
each of the plurality of capture probes hybridizes to the target genetic locus within about 5,
10, 15, 20, 25, 30, 35, 40, 45, 50, 100, 200 bp or more of any other capture probe that
hybridizes to the target genetic locus in a tagged DNA clone library, including all intervening
distances.
In some embodiments, the method can be performed using two capture
probe modules per target genetic locus, wherein one hybridizes to the “Watson” strand (non-coding
or template strand) upstream of the target region and one hybridizes to the “Crick”
strand (coding or non-template strand) downstream of the target region.
In particular embodiments, the methods contemplated herein can
further be performed multiple times with any number of capture probe modules, for example
2, 3, 4, 5, 6, 7, 8, 9, or 10 or more capture probe modules per target genetic locus, any number
of which hybridize to the Watson or Crick strand in any combination. In some embodiments,
the sequences obtained can be aligned to one another in order to identify any of a number of
differences.
In certain embodiments, a plurality of target genetic loci are
interrogated, e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000,
3500, 4000, 4500, 5000, 10000, 50000, 100000, 500000 or more in a single reaction, using
one or more capture probe modules.
(b) Sequencing
In particular embodiments, the quantitative genetic analysis comprises
sequencing a plurality of hybrid nucleic acid molecules, as discussed elsewhere herein, supra,
to generate sufficient sequencing depths to obtain a plurality of unique sequencing reads. The
terms “unique reads” or “unique genomic sequences” (UGS) are used interchangeably herein
and are identified by grouping individual redundant reads together into a “family.”
Redundant reads are sequence reads that share an identical UMIE (e.g., share the same read
code and the same DNA sequence start position within genomic sequence) and are derived
from a single attachment event and are therefore amplification-derived “siblings” of one
another. A single consensus representative of a family of redundant reads is carried forward
as a unique read or UGS. Each unique read or UGS is considered a unique attachment event.
The sum of unique reads corresponding to a particular capture probe is referred to as the “raw
genomic depth” (RGD) for that particular capture probe. Each capture probe yields a set of
unique reads that are computationally distilled from total reads by grouping into families. The
unique reads for a given sample (e.g., raw genomic depth for a sample) are then computed as
the average of all the unique reads observed on a probe-by-probe basis. Unique reads are
important because each unique read must be derived from a unique genomic DNA clone.
Each unique read represents the input and analysis of a haploid equivalent of genomic DNA.
The sum of unique reads is the sum of haploid genomes analyzed. The number of genomes
analyzed, in turn, defines the sensitivity of the sequencing assay. By way of a non-limiting
example, if the average unique read count is 100 genome equivalents, then that particular
assay has a sensitivity of being able to detect one mutant read in 100, or 1%. Any observation
less than this is not defensible.
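The grouping of redundant reads into families, and the resulting depth and sensitivity figures, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method's actual implementation: the read tuples, probe names, and function names are hypothetical, and a family is keyed only on the read code and start position associated with a given capture probe.

```python
from collections import defaultdict

def unique_reads_per_probe(reads):
    """Count families (unique reads) per capture probe.

    reads: iterable of (probe_id, read_code, start_position) tuples.
    Reads sharing probe, read code, and start position are treated as
    amplification-derived siblings and collapse into one family.
    """
    families = defaultdict(set)
    for probe_id, read_code, start in reads:
        families[probe_id].add((read_code, start))
    return {probe: len(keys) for probe, keys in families.items()}

def sample_depth_and_sensitivity(per_probe_counts):
    """Average unique reads across probes, and the 1-in-N detection limit."""
    depth = sum(per_probe_counts.values()) / len(per_probe_counts)
    return depth, 1.0 / depth

# Hypothetical reads: (probe, read code, genomic start position).
reads = [
    ("P1", "ACGTA", 1000), ("P1", "ACGTA", 1000),  # siblings, one family
    ("P1", "TTGCA", 1000), ("P1", "ACGTA", 1012),
    ("P2", "GGATC", 5000), ("P2", "GGATC", 5000),
]
per_probe = unique_reads_per_probe(reads)
depth, sensitivity = sample_depth_and_sensitivity(per_probe)
print(per_probe, depth, sensitivity)
```

With an average of 100 unique reads per probe, the same calculation gives a sensitivity of 1 in 100, matching the example in the text.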
Cases where there is an obvious copy number change (e.g., instances
of noisy probes) are excluded from the data set used to compute the sample average. Herein,
a “noisy probe” refers to a probe that captures a highly variable number of unique reads
among a large set of identical samples (e.g., a highly variable number of unique reads among
12-16 sample replicates). In some embodiments, the number of unique reads associated with a
noisy probe is increased compared to the average number of unique reads for the sample by
50% or more. In some embodiments, the number of unique reads associated with a noisy
probe is decreased compared to the average number of unique reads for the sample by 50% or
more. In some embodiments, about 2% to about 4% of probes used in a particular analysis are
identified as noisy probes and are excluded from calculations to determine the average
number of unique reads for a given sample.
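One way to picture the exclusion step is the single-pass filter below. It is a simplified sketch, not the procedure the text prescribes: the text defines noisy probes by their variability across replicates, whereas this example, for brevity, flags any probe whose count in one sample deviates from the all-probe mean by 50% or more. The function name, probe names, and counts are hypothetical.

```python
def exclude_noisy_probes(per_probe_counts, threshold=0.5):
    """Return (sample average without flagged probes, sorted flagged probes).

    A probe is flagged when its unique read count differs from the mean of
    all probes by `threshold` (as a fraction of that mean) or more.
    """
    initial_mean = sum(per_probe_counts.values()) / len(per_probe_counts)
    kept = {
        probe: count
        for probe, count in per_probe_counts.items()
        if abs(count - initial_mean) < threshold * initial_mean
    }
    noisy = sorted(set(per_probe_counts) - set(kept))
    filtered_mean = sum(kept.values()) / len(kept) if kept else float("nan")
    return filtered_mean, noisy

# Hypothetical unique read counts per probe for one sample.
counts = {"P1": 100, "P2": 104, "P3": 96, "P4": 100, "P5": 250}
filtered_mean, noisy = exclude_noisy_probes(counts)
print(filtered_mean, noisy)
```

A replicate-based definition would instead compute each probe's spread across the 12-16 replicates before deciding which probes to drop.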
In some embodiments, sequencing reads are identified as either “on-
target reads” or “off-target reads.” On-target reads possess a genomic DNA sequence that
maps within the vicinity of a capture probe used to create the genomic library. In some
embodiments, where each genomic sequence is physically linked to a specific capture probe
and where the sequence of the genomic segment and capture probe are both determined as a
unified piece of information, an on-target read is defined as any genomic sequence whose
starting coordinate maps within 400 bp, and more generally within 200 bp of the 3’ end of the
corresponding capture probe. Off-target reads are defined as having genomic sequence that
aligns to the reference genome at a location ≥ 500 base pairs (and more often mapping to
entirely different chromosomes) relative to the capture probe.
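The on-target and off-target definitions above can be expressed as a small classifier. This is a sketch under assumptions: the function and parameter names are hypothetical, distance is taken as an absolute coordinate difference without regard to strand, and reads falling between the two thresholds are labeled "unclassified" because the text does not assign them to either category.

```python
def classify_read(read_chrom, read_start, probe_chrom, probe_3prime_end,
                  on_target_window=400, off_target_distance=500):
    """Label a read relative to the 3' end of its associated capture probe."""
    if read_chrom != probe_chrom:
        # Reads mapping to a different chromosome are off-target.
        return "off-target"
    distance = abs(read_start - probe_3prime_end)
    if distance <= on_target_window:
        return "on-target"
    if distance >= off_target_distance:
        return "off-target"
    return "unclassified"

print(classify_read("chr7", 1150, "chr7", 1000))   # within 400 bp
print(classify_read("chr7", 1600, "chr7", 1000))   # 600 bp away
print(classify_read("chr12", 1010, "chr7", 1000))  # different chromosome
```

Setting `on_target_window=200` gives the tighter window that the text describes as the more general case.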
In particular embodiments, the quantitative genetic analysis comprises
multiplex sequencing of hybrid nucleic acid molecules derived from a plurality of samples.
In various embodiments, the quantitative genetic analysis comprises
obtaining one or more or a plurality of tagged DNA library clones, each clone comprising a
first DNA sequence and a second DNA sequence, wherein the first DNA sequence comprises
a sequence in a targeted genetic locus and the second DNA sequence comprises a capture
probe sequence; performing a paired end sequencing reaction on the one or more clones and
obtaining one or more sequencing reads or performing a sequencing reaction on the one or
more clones in which a single long sequencing read of greater than about 100, 200, 300, 400,
500 or more nucleotides is obtained, wherein the read is sufficient to identify both the first
DNA sequence and the second DNA sequence; and ordering or clustering the sequencing
reads of the one or more clones according to the probe sequences of the sequencing reads.
(c) Bioinformatics Analysis
In various embodiments, the quantitative genetic analysis further
comprises bioinformatic analysis of the sequencing reads. Bioinformatic analysis excludes
any purely mental analysis performed in the absence of a composition or method for
sequencing. In certain embodiments, bioinformatics analysis includes, but is not limited to:
sequence alignments; genome equivalents analysis; single nucleotide variant (SNV) analysis;
gene copy number variation (CNV) analysis; measurement of chromosomal copy number;
and detection of genetic lesions. In particular embodiments, bioinformatics analysis is useful
to quantify the number of genome equivalents analyzed in the cfDNA clone library; to detect
the genetic state of a target genetic locus; to detect genetic lesions in a target genetic locus;
and to measure copy number fluctuations within a target genetic locus.
Sequence alignments may be performed between the sequence reads
and one or more human reference DNA sequences. In particular embodiments, sequencing
alignments can be used to detect genetic lesions in a target genetic locus including, but not
limited to, detection of a nucleotide transition or transversion, a nucleotide insertion or
deletion, a genomic rearrangement, a change in copy number, or a gene fusion. Detection of
genetic lesions that are causal or prognostic indicators may be useful in the diagnosis,
prognosis, treatment, and/or monitoring of a particular genetic condition or disease.
Also contemplated herein are methods for sequence alignment
analysis that can be performed without the need for alignment to a reference sequence,
referred to herein as horizontal sequence analysis. Such analysis can be performed on any
sequences generated by the methods contemplated herein or any other methods. In particular
embodiments, the sequence analysis comprises performing sequence alignments on the reads
obtained by the methods contemplated herein.
In one embodiment, the genome equivalents in a cfDNA clone library
are determined using bioinformatics-based counting after sequencing is performed. Each
sequencing read is associated with a particular capture probe, and the collection of reads
assigned to each capture probe is parsed into groups. Within a group, sets of individual reads
share the same read code and the same DNA sequence start position within genomic
sequence. These individual reads are grouped into a “family” and a single consensus
representative of this family is carried forward as a “unique read.” All of the individual reads
that constituted a family are derived from a single attachment event and thus they are
amplification-derived “siblings” of one another. Each unique read is considered a unique
attachment event and the sum of unique reads is considered equivalent to the number of
genome equivalents analyzed.
As the number of unique clones approaches the total number of
possible sequence combinations, probability dictates that the same code and start site
combinations will be created by independent events and that these independent events will be
inappropriately grouped within single families. The net result will be an underestimate of
genome equivalents analyzed, and rare mutant reads may be discarded as sequencing errors
because they overlap with wild-type reads bearing the same identifiers.
In particular embodiments, to provide an accurate analysis for cfDNA
clone libraries, the number of genome equivalents analyzed is about 1/10, about 1/12, about
1/14, about 1/16, about 1/18, about 1/20, about 1/25 or less the number of possible unique
clones. It should be understood that the procedure outlined above is merely illustrative and
not limiting.
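The effect of identifier collisions on the genome equivalent count can be estimated with a simple occupancy calculation. The sketch below is illustrative only and rests on an assumption the text does not state: that each clone receives one of the possible code and start site combinations independently and with equal probability. The function names and the example numbers are hypothetical.

```python
def expected_distinct_identifiers(n_clones, n_possible):
    """Expected number of distinct identifiers observed among n_clones.

    Under uniform, independent assignment, a given identifier is missed by
    every clone with probability (1 - 1/n_possible) ** n_clones.
    """
    return n_possible * (1.0 - (1.0 - 1.0 / n_possible) ** n_clones)

def expected_undercount_fraction(n_clones, n_possible):
    """Fraction of true clones lost because they share an identifier."""
    return 1.0 - expected_distinct_identifiers(n_clones, n_possible) / n_clones

n_possible = 10_000  # hypothetical number of code/start-site combinations
for ratio in (1.0, 1 / 10, 1 / 25):
    n_clones = int(n_possible * ratio)
    loss = expected_undercount_fraction(n_clones, n_possible)
    print(f"clones/possible = {ratio:.2f}: expected undercount = {loss:.1%}")
```

Under these assumptions, a library holding as many clones as there are identifiers would be undercounted by roughly a third, while keeping the clone count near one tenth of the identifier space holds the expected undercount to about 5%.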
In some embodiments, the number of genome equivalents to be
analyzed may need to be increased. To expand the depth of genome equivalents, at least two
solutions are contemplated. The first solution is to use more than one adaptor set per sample.
By combining adaptors, it is possible to multiplicatively expand the total number of possible
clones and therefore expand the comfortable limits of genomic input. The second solution is
to expand the read code by 1, 2, 3, 4, or 5, or more bases. The number of possible read codes
that differ by at least 2 bases from every other read code scales as 4^(n-1), where n is the number
of bases within a read code. Thus, in a non-limiting example, if a read code is 5 nucleotides,
then 4^(5-1) = 256; therefore, the inclusion of additional bases expands the available repertoire
by a factor of four for each additional base.
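The 4^(n-1) scaling can be checked with a small construction. The sketch below is one possible way to build such a code set and is an assumption of this example, not a design taken from the text: the first n-1 bases are chosen freely and the last base is a checksum, which guarantees that any two codes differ in at least two positions. The function names are hypothetical.

```python
from itertools import combinations, product

BASES = "ACGT"

def checksum_read_codes(n):
    """Build 4**(n-1) read codes of length n with pairwise distance >= 2.

    The final base is chosen so the base indices sum to 0 modulo 4, so a
    single-base change can never turn one valid code into another.
    """
    codes = []
    for prefix in product(range(4), repeat=n - 1):
        check = (-sum(prefix)) % 4
        codes.append("".join(BASES[i] for i in prefix + (check,)))
    return codes

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

codes = checksum_read_codes(5)
min_distance = min(hamming(a, b) for a, b in combinations(codes, 2))
print(len(codes), min_distance)
print(len(checksum_read_codes(6)) // len(codes))  # growth per added base
```

Because a single sequencing error cannot convert one valid code into another, a read carrying a code outside the set can be recognized as erroneous rather than assigned to the wrong family.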
In one embodiment, quantitative genetic analysis comprises
bioinformatic analysis of sequencing reads to identify rare single nucleotide variants (SNV).
Next-generation sequencing has an inherent error rate of roughly 0.2-0.5%,
meaning that anywhere from 1/200 to 1/500 base calls are incorrect. To detect
variants and other mutations that occur at frequencies lower than this, for example at
frequencies of 1 per 1000 sequences, it is necessary to invoke molecular annotation
strategies. By way of a non-limiting example, analysis of 5000 unique molecules using
targeted sequence capture technology would generate, at sufficient sequencing depths of
>50,000 reads, a collection of 5000 unique reads, with each unique read belonging to a
“family” of reads that all possess the same read code. A SNV that occurs within a family is a
candidate for being a rare variant. When this same variant is observed in more than one
family, it becomes a very strong candidate for being a rare variant that exists within the
starting sample. In contrast, variants that occur sporadically within families are likely to be
sequencing errors and variants that occur within one and only one family are either rare or the
result of a base alteration that occurred ex vivo (e.g., oxidation of a DNA base or
PCR-introduced errors).
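The family-based reasoning above can be sketched as a simple triage of candidate variants. This is an illustration under assumptions, not the method's actual variant caller: a variant is treated as belonging to a family only if every read in that family carries it, anything seen in just some reads of a family is treated as sporadic, and the variant labels and function name are hypothetical.

```python
from collections import Counter

def classify_variants(families):
    """Triage variants by how many families carry them in every read.

    families: {family_id: [set of variants seen in each read, ...]}
    """
    family_support = Counter()
    sporadic = set()
    for reads in families.values():
        shared = set.intersection(*reads)  # present in every sibling read
        seen = set.union(*reads)           # present in at least one read
        family_support.update(shared)
        sporadic |= seen - shared
    calls = {variant: "likely sequencing error" for variant in sporadic}
    # Family-level support overrides a sporadic sighting elsewhere.
    for variant, n_families in family_support.items():
        calls[variant] = ("strong candidate" if n_families > 1
                          else "rare variant or ex vivo alteration")
    return calls

# Hypothetical families; each inner set lists the variants in one read.
families = {
    "F1": [{"chr7:100A>T"}, {"chr7:100A>T"}, {"chr7:100A>T", "chr7:150G>C"}],
    "F2": [{"chr7:100A>T"}, {"chr7:100A>T"}],
    "F3": [{"chr7:210C>T"}, {"chr7:210C>T"}],
}
calls = classify_variants(families)
for variant, label in sorted(calls.items()):
    print(variant, "->", label)
```

A production pipeline would typically build a consensus that tolerates a small fraction of discordant reads rather than requiring unanimity within a family.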
In one embodiment, the methods of detecting SNVs comprise
introducing 10-fold more genomic input (genomes or genome equivalents) as the desired
target sensitivity of the assay. In one non-limiting example, if the desired sensitivity is 2% (2
in 100), then the experimental target is an input of 2000 genomes.
In particular embodiments, bioinformatics analysis of sequencing data
is used to detect or identify SNV associated with a genetic state, condition or disease, genetic
mosaicism, fetal testing, paternity testing, predicting response to drug treatment, diagnosing
or monitoring a medical condition, microbiome profiling, pathogen screening, and
monitoring organ transplants.
In various embodiments, a method for copy number determination
analysis is provided comprising obtaining one or more or a plurality of clones, each clone
comprising a first DNA sequence and a second DNA sequence, wherein the first DNA
sequence comprises a sequence in a targeted genetic locus and the second DNA sequence
comprises a capture probe sequence. In related embodiments, a paired end sequencing
reaction on the one or more clones is performed and one or more sequencing reads are
obtained. In another embodiment, a sequencing reaction on the one or more clones is
performed in which a single long sequencing read of greater than about 100 nucleotides is
obtained, wherein the read is sufficient to identify both the first DNA sequence and the
second DNA sequence. The sequencing reads of the one or more clones can be ordered or
clustered according to the probe sequence of the sequencing reads.
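Clustering reads by probe sequence can be sketched as below. The input layout is an assumption made for illustration: each clone is a hypothetical `(genomic_read, probe_read)` pair, and the probe is identified by an exact match on the first `probe_len` bases of the probe read. A practical pipeline would likely tolerate mismatches.

```python
from collections import defaultdict

def cluster_by_probe(clones, probes, probe_len):
    """Group genomic reads by the capture probe found in the paired probe read.

    clones: iterable of (genomic_read, probe_read) pairs.
    probes: dict mapping probe sequence -> probe name.
    """
    clusters = defaultdict(list)
    for genomic_read, probe_read in clones:
        name = probes.get(probe_read[:probe_len])  # exact-match lookup
        if name is not None:                       # unmatched reads are dropped
            clusters[name].append(genomic_read)
    return dict(clusters)

probes = {"ACGTAC": "TP53_p1", "TTGGCC": "ATM_p1"}
clones = [
    ("GATTACA", "ACGTACGG"),
    ("CCCGGG", "TTGGCCAA"),
    ("AAAT", "ACGTACTT"),
    ("GGGG", "NNNNNNNN"),
]
clusters = cluster_by_probe(clones, probes, probe_len=6)
```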
Copy number analyses include, but are not limited to, analyses that
examine the number of copies of a particular gene or mutation that occurs in a given genomic
DNA sample and can further include quantitative determination of the number of copies of a
given gene or sequence differences in a given sample. In particular embodiments, copy
number analysis is used to detect or identify gene amplification associated with genetic states,
conditions, or diseases, fetal testing, genetic mosaicism, paternity testing, predicting response
to drug treatment, diagnosing or monitoring a medical condition, microbiome profiling,
pathogen screening, and monitoring organ transplants.
In some embodiments, copy number analysis is used to measure
chromosomal instability. In such embodiments, sets of capture probes that comprise
chromosomal stability probes are used to determine copy number variations at a uniform
density across all sets of chromosomes. Copy number analyses are performed for each
chromosomal stability probe and the chromosomal stability probes are then ordered
according to their chromosomal target. This allows for visualization of copy number losses or
gains across the genome and can serve as a measure of chromosomal stability.
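Ordering probe-level copy number values by chromosomal target can be sketched as below. The tuple layout `(chromosome, position, copy_number)` and the function names are assumptions for illustration only.

```python
def chromosome_key(chrom):
    """Sort key placing autosomes 1-22 first, then X, Y, then anything else."""
    label = chrom[3:] if chrom.startswith("chr") else chrom
    if label.isdigit():
        return int(label)
    return {"X": 23, "Y": 24}.get(label, 25)

def order_probes(measurements):
    """measurements: list of (chromosome, position, copy_number) tuples."""
    return sorted(measurements, key=lambda m: (chromosome_key(m[0]), m[1]))

measurements = [
    ("chrX", 66_000_000, 1.0),
    ("chr11", 108_000_000, 1.9),
    ("chr2", 5, 2.0),
    ("chr11", 100, 2.0),
]
ordered = order_probes(measurements)
```

Plotting the third field of `ordered` in sequence gives a genome-wide view in which runs of values below or above the expected copy number stand out as losses or gains.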
In particular embodiments, bioinformatics analysis of sequencing data
is used to detect or identify one or more sequences or genetic lesions in a target locus
including, but not limited to detection of a nucleotide transition or transversion, a nucleotide
insertion or deletion, a genomic rearrangement, a change in copy number, or a gene fusion.
Detection of genetic lesions that are causal or prognostic indicators may be useful in the
diagnosis, prognosis, treatment, and/or monitoring of a particular genetic condition or
disease. In one embodiment, genetic lesions are associated with genetic states, conditions, or
diseases, fetal testing, genetic mosaicism, paternity testing, predicting response to drug
treatment, diagnosing or monitoring a medical condition, microbiome profiling, pathogen
screening, and monitoring organ transplants.
D. CLINICAL APPLICATIONS OF QUANTITATIVE CNL ASSAYS
In various embodiments, the present invention contemplates a method
of detecting, identifying, predicting, diagnosing, or monitoring a condition or disease in a
subject by detecting a mutational change, SNP, translocation, inversion, deletion, change in
copy number or other genetic variation in a region of interest.
E. CLINICAL APPLICATIONS OF QUANTITATIVE GENETIC ANALYSIS
In various embodiments, the present invention contemplates a method
of detecting, identifying, predicting, diagnosing, or monitoring a condition or disease in a
subject.
In particular embodiments, a method of detecting, identifying,
predicting, diagnosing, or monitoring a genetic state, condition or disease in a subject
comprises performing a quantitative genetic analysis of one or more target genetic loci in a
DNA clone library to detect or identify a change in the sequence at the one or more target
genetic loci. In some embodiments, the change is a change in copy number.
In one embodiment, a method of detecting, identifying, predicting,
diagnosing, or monitoring a genetic state, condition or disease comprises isolating or
obtaining cellular DNA or cfDNA from a biological sample of a subject; treating the cellular
DNA or cfDNA with one or more end-repair enzymes to generate end-repaired DNA;
attaching one or more adaptors to each end of the end-repaired DNA to generate a genomic
DNA library; amplifying the DNA library to generate a DNA clone library; determining the
number of genome equivalents in the DNA clone library; and performing a quantitative
genetic analysis of one or more target genetic loci in a DNA clone library to detect or identify
a change in the sequence, e.g., an SNP, a translocation, an inversion, a deletion, or a change
in copy number at the one or more target genetic loci.
In particular embodiments, a method of detecting, identifying,
predicting, diagnosing, or monitoring a genetic state, or genetic condition or disease selected
from the group consisting of: genetic diseases; genetic mosaicism; fetal testing; paternity
testing; predicting response to drug treatment; diagnosing or monitoring a
medical condition; microbiome profiling; pathogen screening; and organ transplant
monitoring comprises isolating or obtaining genomic DNA from a biological sample of a
subject, treating the DNA with one or more end-repair enzymes to generate end-repaired
DNA, attaching one or more adaptors to each end of the end-repaired DNA to generate a
genomic DNA library, amplifying the genomic DNA library to generate a DNA clone library,
determining the number of genome equivalents in the DNA clone library, and performing a
quantitative genetic analysis of one or more target genetic loci in a DNA clone library to
detect or identify a nucleotide transition or transversion, a nucleotide insertion or deletion, a
genomic rearrangement, a change in copy number, or a gene fusion in the sequence at the one
or more target genetic loci.
Illustrative examples of genetic diseases that can be detected,
identified, predicted, diagnosed, or monitored with the compositions and methods
contemplated herein include, but are not limited to cancer, Alzheimer’s disease (APOE1),
Charcot-Marie-Tooth disease, Leber hereditary optic neuropathy (LHON), Angelman
syndrome (UBE3A, ubiquitin-protein ligase E3A), Prader-Willi syndrome (region in
chromosome 15), β-Thalassaemia (HBB, β-Globin), Gaucher disease (type 1) (GBA,
Glucocerebrosidase), Cystic fibrosis (CFTR, Epithelial chloride channel), Sickle cell disease
(HBB, β-Globin), Tay-Sachs disease (HEXA, Hexosaminidase A), Phenylketonuria (PAH,
Phenylalanine hydroxylase), Familial hypercholesterolaemia (LDLR, Low density lipoprotein
receptor), Adult polycystic kidney disease (PKD1, Polycystin), Huntington disease (HDD,
Huntingtin), Neurofibromatosis type I (NF1, NF1 tumour suppressor gene), Myotonic
dystrophy (DM, Myotonin), Tuberous sclerosis (TSC1, Tuberin), Achondroplasia (FGFR3,
Fibroblast growth factor receptor), Fragile X syndrome (FMR1, RNA-binding protein),
Duchenne muscular dystrophy (DMD, Dystrophin), Haemophilia A (F8C, Blood coagulation
factor VIII), Lesch-Nyhan syndrome (HPRT1, Hypoxanthine guanine ribosyltransferase 1),
and Adrenoleukodystrophy (ABCD1).
Illustrative examples of cancers that can be detected, identified,
predicted, diagnosed, or monitored with the compositions and methods contemplated herein
include, but are not limited to: B cell cancer, e.g., multiple myeloma, melanomas, breast
cancer, lung cancer (such as non-small cell lung carcinoma or NSCLC), bronchus cancer,
colorectal cancer, prostate cancer, pancreatic cancer, stomach cancer, ovarian cancer, urinary
bladder cancer, brain or central nervous system cancer, peripheral nervous system cancer,
esophageal cancer, cervical cancer, uterine or endometrial cancer, cancer of the oral cavity or
pharynx, liver cancer, kidney cancer, testicular cancer, biliary tract cancer, small bowel or
appendix cancer, salivary gland cancer, thyroid gland cancer, adrenal gland cancer,
osteosarcoma, chondrosarcoma, cancer of hematological tissues, adenocarcinomas,
inflammatory myofibroblastic tumors, gastrointestinal stromal tumor (GIST), colon cancer,
multiple myeloma (MM), myelodysplastic syndrome (MDS), myeloproliferative disorder
(MPD), acute lymphocytic leukemia (ALL), acute myelocytic leukemia (AML), chronic
myelocytic leukemia (CML), chronic lymphocytic leukemia (CLL), polycythemia vera,
Hodgkin lymphoma, non-Hodgkin lymphoma (NHL), soft-tissue sarcoma, fibrosarcoma,
myxosarcoma, liposarcoma, osteogenic sarcoma, chordoma, angiosarcoma,
endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma,
mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, squamous cell
carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland
carcinoma, papillary carcinoma, papillary adenocarcinomas, medullary carcinoma,
bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma,
choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, bladder carcinoma,
epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma,
ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma,
meningioma, neuroblastoma, retinoblastoma, follicular lymphoma, diffuse large B-cell
lymphoma, mantle cell lymphoma, hepatocellular carcinoma, thyroid cancer, gastric cancer,
head and neck cancer, small cell cancers, essential thrombocythemia, agnogenic myeloid
metaplasia, hypereosinophilic syndrome, systemic mastocytosis, familial hypereosinophilia,
chronic eosinophilic leukemia, neuroendocrine cancers, carcinoid tumors, and the like.
In one embodiment, the genetic lesion is a lesion annotated in the
Cosmic database (the lesions and sequence data are available online and can be downloaded
from the Cancer Gene Census section of the Cosmic website) or a lesion annotated in the
Cancer Genome Atlas (the lesions and sequence data are available online and can be
downloaded from The Cancer Genome Atlas website).
Illustrative examples of genes that harbor one or more genetic lesions
associated with cancer that can be detected, identified, predicted, diagnosed, or monitored
with the compositions and methods contemplated herein include, but are not limited to
ABCB1, ABCC2, ABCC4, ABCG2, ABL1, ABL2, AKT1, AKT2, AKT3, ALDH4A1, ALK,
APC, AR, ARAF, ARFRP1, ARID1A, ATM, ATR, AURKA, AURKB, BCL2, BCL2A1,
BCL2L1, BCL2L2, BCL6, BRAF, BRCA1, BRCA2, C1orf144, CARD11, CBL, CCND1,
CCND2, CCND3, CCNE1, CDH1, CDH2, CDH20, CDH5, CDK4, CDK6, CDK8,
CDKN2A, CDKN2B, CDKN2C, CEBPA, CHEK1, CHEK2, CRKL, CRLF2, CTNNB1,
, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DNMT3A, DOT1L, DPYD,
EGFR, EPHA3, EPHA5, EPHA6, EPHA7, EPHB1, EPHB4, EPHB6, EPHX1, ERBB2,
ERBB3, ERBB4, ERCC2, ERG, ESR1, ESR2, ETV1, ETV4, ETV5, ETV6, EWSR1, EZH2,
FANCA, FBXW7, , FGFR1, FGFR2, FGFR3, FGFR4, FLT1, FLT3, FLT4,
FOXP4, GATA1, GNA11, GNAQ, GNAS, GPR124, GSTP1, GUCY1A2, HOXA3, HRAS,
A1, IDH1, IDH2, IGF1R, IGF2R, IKBKE, IKZF1, INHBA, IRS2, ITPA, JAK1,
JAK2, JAK3, JUN, KDR, KIT, KRAS, LRP1B, LRP2, LTK, MAN1B1, MAP2K1,
MAP2K2, , MCL1, MDM2, MDM4, MEN1, MET, MITF, MLH1, MLL, MPL,
MRE11A, MSH2, MSH6, MTHFR, MTOR, MUTYH, MYC, MYCL1, MYCN, NF1, NF2,
NKX2-1, NOTC
PDGFRA, PDGFRB, PIK3CA, PIK3R1, PKHD1, PLCG1, PRKDC, PTCH1, PTEN,
PTPN11, PTPRD, RAF1, RARA, RB1, RET, RICTOR, RPTOR, RUNX1, SLC19A1,
SLC22A2, SLCO1B3, SMAD2, SMAD3, SMAD4, SMARCA4, SMARCB1, SMO, SOD2,
SOX10, SOX2, SRC, STK11, l, TBX22, TET2, TGFBR2, TMPRSS2, TNFRSF14,
TOP1, TP53, TPMT, TSC1, TSC2, TYMS, UGT1A1, UMPS, USP9X, VHL, and WT1.
In particular embodiments, the genetic lesion comprises a nucleotide
transition or transversion, a nucleotide insertion or deletion, a genomic rearrangement, a
change in copy number, or a gene fusion.
In one embodiment, the genetic lesion is a gene fusion that fuses the 3'
coding region of the ALK gene to another gene.
In one embodiment, the genetic lesion is a gene fusion that fuses the 3'
coding region of the ALK gene to the EML4 gene.
Illustrative examples of conditions suitable for fetal testing that can be
detected, identified, predicted, diagnosed, or monitored with the compositions and methods
contemplated herein include but are not limited to: Down Syndrome (Trisomy 21), Edwards
Syndrome (Trisomy 18), Patau Syndrome (Trisomy 13), Klinefelter's Syndrome (XXY),
Triple X syndrome, XYY syndrome, Trisomy 8, Trisomy 16, Turner Syndrome (XO),
Robertsonian translocation, DiGeorge Syndrome and Wolf-Hirschhorn Syndrome.
Illustrative examples of alleles suitable for paternity testing that can be
detected, identified, predicted, diagnosed, or monitored with the compositions and methods
contemplated herein include but are not limited to 16 or more of: D20S1082, D6S474,
D12ATA63, D22S1045, D10S1248, D1S1677, 63, D4S2364, D9S1122, D2S1776,
D10S1425, 3, D5S2500, D1S1627, D3S4529, D2S441, D17S974, D6S1017,
D4S2408, D9S2157, Amelogenin, D17S1301, D1GATA113, D18S853, D20S482, and
D14S1434.
Illustrative examples of genes suitable for predicting the response to
drug treatment that can be detected, identified, predicted, diagnosed, or monitored with the
compositions and methods contemplated herein include, but are not limited to, one or more of
the following genes: ABCB1 (ATP-binding cassette, sub-family B (MDR/TAP), member 1),
ACE (angiotensin I converting enzyme), ADH1A (alcohol dehydrogenase 1A (class I), alpha
polypeptide), ADH1B (alcohol dehydrogenase 1B (class I), beta polypeptide), ADH1C
(alcohol dehydrogenase 1C (class I), gamma polypeptide), ADRB1 (adrenergic, beta-1-,
receptor), ADRB2 (adrenergic, beta-2-, receptor, surface), AHR (aryl hydrocarbon receptor),
ALDH1A1 (aldehyde dehydrogenase 1 family, member A1), ALOX5 (arachidonate 5-lipoxygenase),
BRCA1 (breast cancer 1, early onset), COMT (catechol-O-methyltransferase),
CYP2A6 (cytochrome P450, family 2, subfamily A, polypeptide 6), CYP2B6 (cytochrome
P450, family 2, subfamily B, polypeptide 6), CYP2C9 (cytochrome P450, family 2,
subfamily C, polypeptide 9), CYP2C19 (cytochrome P450, family 2, subfamily C,
polypeptide 19), CYP2D6 (cytochrome P450, family 2, subfamily D, polypeptide 6), CYP2J2
(cytochrome P450, family 2, subfamily J, polypeptide 2), CYP3A4 (cytochrome P450, family
3, subfamily A, polypeptide 4), CYP3A5 (cytochrome P450, family 3, subfamily A,
polypeptide 5), DPYD (dihydropyrimidine dehydrogenase), DRD2 (dopamine receptor D2),
F5 (coagulation factor V), GSTP1 (glutathione S-transferase pi), HMGCR (3-hydroxy-3-
methylglutaryl-Coenzyme A reductase), KCNH2 (potassium voltage-gated channel,
subfamily H (eag-related), member 2), KCNJ11 (potassium inwardly-rectifying channel,
subfamily J, member 11), MTHFR (5,10-methylenetetrahydrofolate reductase (NADPH)),
NQO1 (NAD(P)H dehydrogenase, quinone 1), P2RY1 (purinergic receptor P2Y, G-protein
coupled, 1), P2RY12 (purinergic receptor P2Y, G-protein coupled, 12), PTGIS
(prostaglandin I2 (prostacyclin) synthase), SCN5A (sodium channel, voltage-gated, type V,
alpha (long QT syndrome 3)), SLC19A1 (solute carrier family 19 (folate transporter),
member 1), SLCO1B1 (solute carrier organic anion transporter family, member 1B1),
SULT1A1 (sulfotransferase family, cytosolic, 1A, phenol-preferring, member 1), TPMT
(thiopurine S-methyltransferase), TYMS (thymidylate synthetase), UGT1A1 (UDP
glucuronosyltransferase 1 family, polypeptide A1), VDR (vitamin D (1,25-dihydroxyvitamin
D3) receptor), and VKORC1 (vitamin K epoxide reductase complex, subunit 1).
Illustrative examples of medical conditions that can be detected,
identified, predicted, diagnosed, or monitored with the compositions and methods
contemplated herein include, but are not limited to: stroke, transient ischemic attack,
traumatic brain injury, heart disease, heart attack, angina, atherosclerosis, and high blood
pressure.
Illustrative examples of pathogens that can be screened for with the
compositions and methods contemplated herein include, but are not limited to: bacteria, fungi,
and viruses.
Illustrative examples of bacterial species that can be screened for with
the compositions and methods contemplated herein include, but are not limited to: a
Mycobacterium spp., a Pneumococcus spp., an Escherichia spp., a Campylobacter spp., a
Corynebacterium spp., a Clostridium spp., a Streptococcus spp., a Staphylococcus spp., a
Pseudomonas spp., a Shigella spp., a Treponema spp., or a Salmonella spp.
Illustrative examples of fungal species that can be screened for with
the compositions and methods contemplated herein include, but are not limited to: an
Aspergillus spp., a Blastomyces spp., a Candida spp., a Coccidioides spp., a Cryptococcus
spp., dermatophytes, a Tinea spp., a Trichophyton spp., a Microsporum spp., a Fusarium spp.,
a Histoplasma spp., a Mucoromycotina spp., a Pneumocystis spp., a Sporothrix spp., an
Exserohilum spp., or a Cladosporium spp.
Illustrative examples of viruses that can be screened for with the
compositions and methods contemplated herein include, but are not limited to: Influenza A
such as H1N1, H1N2, H3N2 and H5N1 (bird flu), Influenza B, Influenza C virus, Hepatitis A
virus, Hepatitis B virus, Hepatitis C virus, Hepatitis D virus, Hepatitis E virus, Rotavirus, any
virus of the Norwalk virus group, enteric adenoviruses, parvovirus, Dengue fever virus,
Monkey pox, Mononegavirales, Lyssavirus such as rabies virus, Lagos bat virus, Mokola
virus, Duvenhage virus, European bat virus 1 & 2 and Australian bat virus, Ephemerovirus,
Vesiculovirus, Vesicular Stomatitis Virus (VSV), Herpesviruses such as Herpes simplex
virus types 1 and 2, varicella zoster, cytomegalovirus, Epstein-Barr virus (EBV), human
herpesviruses (HHV), human herpesvirus type 6 and 8, Moloney murine leukemia virus
(M-MuLV), Moloney murine sarcoma virus (MoMSV), Harvey murine sarcoma virus
(HaMuSV), murine mammary tumor virus (MuMTV), gibbon ape leukemia virus (GaLV),
feline leukemia virus (FLV), spumavirus, Friend murine leukemia virus, Murine Stem Cell
Virus (MSCV) and Rous Sarcoma Virus (RSV), HIV (human immunodeficiency virus,
including HIV type 1, and HIV type 2), visna-maedi virus (VMV), the caprine arthritis-encephalitis
virus (CAEV), equine infectious anemia virus (EIAV), feline immunodeficiency
virus (FIV), bovine immune deficiency virus (BIV), and simian immunodeficiency virus
(SIV), papilloma virus, murine gammaherpesvirus, Arenaviruses such as Argentine
hemorrhagic fever virus, Bolivian hemorrhagic fever virus, Sabia-associated hemorrhagic
fever virus, Venezuelan hemorrhagic fever virus, Lassa fever virus, Machupo virus,
Lymphocytic choriomeningitis virus (LCMV), Bunyaviridae such as Crimean-Congo
hemorrhagic fever virus, Hantavirus, hemorrhagic fever with renal syndrome causing virus,
Rift Valley fever virus, Filoviridae (filovirus) including Ebola hemorrhagic fever and
Marburg hemorrhagic fever, Flaviviridae including Kyasanur Forest disease virus, Omsk
hemorrhagic fever virus, Tick-borne encephalitis causing virus and Paramyxoviridae such as
Hendra virus and Nipah virus, variola major and variola minor (smallpox), alphaviruses such
as Venezuelan equine encephalitis virus, eastern equine encephalitis virus, western equine
encephalitis virus, SARS-associated coronavirus (SARS-CoV), West Nile virus, and any
encephalitis causing virus.
Illustrative examples of genes suitable for monitoring an organ
transplant in a transplant recipient that can be detected, identified, predicted, diagnosed, or
monitored with the compositions and methods contemplated herein include, but are not
limited to, one or more of the following genes: HLA-A, HLA-B, HLA-C, HLA-DR, HLA-DP,
and HLA-DQ.
In particular embodiments, a bioinformatic analysis is used to quantify
the number of genome equivalents analyzed in the cfDNA clone library, detect genetic
variants in a target genetic locus, detect mutations within a target genetic locus, detect genetic
fusions within a target genetic locus; or measure copy number fluctuations within a target
genetic locus.
F. COMPANION DIAGNOSTICS
In various embodiments, a companion diagnostic for a genetic disease
is provided, comprising: isolating or obtaining genomic DNA from a biological sample of a
subject; treating the DNA with one or more end-repair enzymes to generate end-repaired
DNA; attaching one or more adaptors to each end of the end-repaired DNA to generate a
DNA library; amplifying the DNA library to generate a DNA clone library; determining the
number of genome equivalents in the DNA clone library; and performing a quantitative
genetic analysis of one or more biomarkers associated with the genetic disease in the DNA
clone library, wherein detection of, or failure to detect, at least one of the one or more
biomarkers indicates whether the subject should be treated for the genetic disease. In some
embodiments, the DNA is cfDNA. In particular embodiments, the DNA is cellular DNA.
As used herein, the term “companion diagnostic” refers to a diagnostic
test that is linked to a particular anti-cancer therapy. In a particular embodiment, the
diagnostic methods comprise detection of a genetic lesion in a biomarker associated with cancer in a
biological sample, thereby allowing for prompt identification of patients who should or should not
be treated with the anti-cancer therapy.
Anti-cancer therapy includes, but is not limited to surgery, radiation,
chemotherapeutics, anti-cancer drugs, and immunomodulators.
Illustrative examples of anti-cancer drugs include, but are not limited
to: alkylating agents such as thiotepa and cyclophosphamide (CYTOXAN™); alkyl
sulfonates such as busulfan, improsulfan and piposulfan; aziridines such as benzodopa,
carboquone, meturedopa, and uredopa; ethylenimines and methylamelamines including
altretamine, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide
and trimethylolomelamine; nitrogen mustards such as chlorambucil, chlornaphazine,
cholophosphamide, estramustine, ifosfamide, mechlorethamine, mechlorethamine oxide
hydrochloride, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uracil
mustard; nitrosureas such as carmustine, chlorozotocin, fotemustine, lomustine, nimustine,
ranimustine; antibiotics such as aclacinomysins, actinomycin, authramycin, azaserine,
bleomycins, cactinomycin, calicheamicin, carabicin, carminomycin, carzinophilin,
chromomycins, dactinomycin, daunorubicin, detorubicin, 6-diazo-5-oxo-L-norleucine,
doxorubicin and its pegylated formulations, epirubicin, esorubicin, idarubicin,
marcellomycin, mitomycins, mycophenolic acid, nogalamycin, olivomycins, peplomycin,
potfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin,
ubenimex, zinostatin, zorubicin; anti-metabolites such as methotrexate and 5-fluorouracil
(5-FU); folic acid analogues such as denopterin, methotrexate, pteropterin, trimetrexate; purine
analogs such as fludarabine, 6-mercaptopurine, thiamiprine, thioguanine; pyrimidine analogs
such as ancitabine, azacitidine, 6-azauridine, carmofur, cytarabine, dideoxyuridine,
doxifluridine, enocitabine, floxuridine, 5-FU; androgens such as calusterone, dromostanolone
propionate, epitiostanol, mepitiostane, testolactone; anti-adrenals such as aminoglutethimide,
mitotane, trilostane; folic acid replenisher such as frolinic acid; aceglatone; aldophosphamide
glycoside; aminolevulinic acid; amsacrine; bestrabucil; bisantrene; edatraxate; defofamine;
demecolcine; diaziquone; elformithine; elliptinium acetate; etoglucid; gallium nitrate;
hydroxyurea; lentinan; lonidamine; mitoguazone; mitoxantrone; mopidamol; nitracrine;
pentostatin; phenamet; pirarubicin; podophyllinic acid; 2-ethylhydrazide; procarbazine;
PSK®; razoxane; sizofiran; spirogermanium; tenuazonic acid; triaziquone; 2,2',2''-
trichlorotriethylamine; urethan; vindesine; dacarbazine; mannomustine; mitobronitol;
mitolactol; pipobroman; gacytosine; arabinoside (“Ara-C”); cyclophosphamide; thiotepa;
taxoids, e.g., paclitaxel (TAXOL®, Bristol-Myers Squibb Oncology, Princeton, N.J.) and
docetaxel (TAXOTERE®, Rhône-Poulenc Rorer, Antony, France); chlorambucil;
gemcitabine; 6-thioguanine; mercaptopurine; methotrexate; platinum analogs such as
cisplatin and carboplatin; vinblastine; platinum; etoposide (VP-16); ifosfamide; mitomycin C;
mitoxantrone; vincristine; vinorelbine; navelbine; novantrone; teniposide; aminopterin;
xeloda; ibandronate; CPT-11; topoisomerase inhibitor RFS 2000; difluoromethylornithine
(DMFO); retinoic acid derivatives such as Targretin™ (bexarotene), Panretin™ (alitretinoin);
ONTAK™ (denileukin diftitox); esperamicins; capecitabine; and pharmaceutically
acceptable salts, acids or derivatives of any of the above. Also included in this definition are
anti-hormonal agents that act to regulate or inhibit hormone action on cancers such as
anti-estrogens including for example tamoxifen, raloxifene, aromatase inhibiting 4(5)-imidazoles,
4-hydroxytamoxifen, trioxifene, keoxifene, LY117018, onapristone, and toremifene
(Fareston); and anti-androgens such as flutamide, nilutamide, bicalutamide, leuprolide, and
goserelin; and pharmaceutically acceptable salts, acids or derivatives of any of the above.
Illustrative examples of immunomodulators include, but are not limited
to: cyclosporine, tacrolimus, tresperimus, pimecrolimus, sirolimus, everolimus, laflunimus,
laquinimod and imiquimod, as well as analogs, derivatives, salts, ions and complexes thereof.
In some embodiments, an anti-cancer drug may include a poly-ADP
ribose polymerase (PARP) inhibitor. Illustrative examples of PARP inhibitors include, but
are not limited to, olaparib (AZD-2281), rucaparib (AG014699 or PF-01367338), niraparib
(MK-4827), talazoparib (BMN-673), veliparib (ABT-888), CEP 9722, E7016, BGB-290, and
3-aminobenzamide.
All publications, patent applications, and issued patents cited in this
specification are herein incorporated by reference as if each individual publication, patent
application, or issued patent were specifically and individually indicated to be incorporated
by reference. In particular, the entire contents of International PCT Publication No. WO
2016/028316 are specifically incorporated by reference.
Although the foregoing invention has been described in some detail by
way of illustration and example for purposes of clarity of understanding, it will be readily
apparent to one of ordinary skill in the art in light of the teachings of this invention that
certain changes and modifications may be made thereto without departing from the spirit or
scope of the appended claims. The following examples are provided by way of illustration
only and not by way of limitation. Those of skill in the art will readily recognize a variety of
noncritical parameters that could be changed or modified to yield essentially similar results.
EXAMPLES
Example 1: Copy Number Analysis of Samples Containing Blends of Fragmented
Genomic DNA
Meticulous blends of fragmented genomic DNA were generated that
contained DNA derived from ΔATM or ΔBRCA2 immortalized human samples spiked into a
fragmented wild-type human gDNA sample. The advantage of this sample type is that the
composition can be carefully controlled and sample availability is essentially unlimited.
Wild-type, human female genomic DNA was purified from whole
blood samples donated by a healthy volunteer. Genomic DNA isolated from an immortalized
cell harboring a heterozygous deletion covering the entire ATM gene (NA09596, ΔATM) and
a separate sample bearing a heterozygous deletion of BRCA2 18, ΔBRCA2) were
obtained from the Coriell repository. Importantly, these samples appeared to have an
otherwise normal ploidy across the remainder of the genomes. The ΔATM sample was
derived from a male donor and was therefore also hemizygous in copy number for the
X-linked AR gene. Cell free DNA (cfDNA) was isolated from healthy donor plasma samples
of female or male origin. For library construction, genomic DNA was sonicated on a setting
of 200 bp with a Covaris instrument, then further size selected using a “two-sided” DNA
bead purification. Library input DNA samples are shown in
Appropriate combinations of fragmented and cfDNA samples were
blended to defined percentages, end-repaired, and converted to genomic libraries.
Approximately 500 ng of each library was combined in sets of eight samples and hybridized
to the copy number loss (CNL) prostate probe pool that contained 2304 DNA probes.
Following sample processing, each set of eight samples was sequenced on an Illumina
NextSeq NGS instrument to a depth of ~480 million pass-filter reads, which corresponds to 60
million reads/sample. Roughly 95% of reads possessed legitimate sample ID tags and aligned
to the human reference genome and of these, ~98% mapped to the intended target loci. The
overall sequencing depth, measured as the number of reads per input genome per probe
(calculated as on-target reads (60 million) divided by average genome depth (2500) and
divided by probe count (2400)) was approximately 10 reads per genome per probe. A graphic
representation of the copy number loss analysis is shown in Copy number
perturbations are highlighted by . (Sample 1, 5% male DNA into female DNA; sample
2, 5% ΔATM DNA (male) into female DNA; sample 3, 5% ΔBRCA2 DNA (female) into
female DNA; sample 4, pure female DNA).
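The depth arithmetic quoted above can be reproduced with a few lines. This is only a restatement of the figures in the text; the function name is hypothetical, and the probe count of 2400 is the rounded value used in the text's own calculation rather than the 2304 probes in the pool.

```python
def reads_per_genome_per_probe(total_reads, samples, genome_depth, probe_count):
    """Reads per input genome per probe, following the calculation in the text."""
    reads_per_sample = total_reads / samples
    return reads_per_sample / genome_depth / probe_count

depth = reads_per_genome_per_probe(
    total_reads=480_000_000,  # pass-filter reads per run
    samples=8,                # samples pooled per run
    genome_depth=2500,        # average genome depth
    probe_count=2400,         # rounded probe count used in the text
)
print(depth)  # prints 10.0
```

Using the exact pool size of 2304 probes instead gives a slightly higher value, a little over 10 reads per genome per probe.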
The CNL caller identifies redundant reads and condenses these into a
single consensus read that is then quantified at each probe location. This information was
further condensed into gene-by-gene copy number averages. Finally, a statistical significance
was assigned to deviations detected in each CNL measurement; this is shown graphically as
the log10 P-value of statistical significance.
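The first two steps of the CNL caller described above can be sketched as follows. This is an illustration under stated assumptions, not the caller itself: redundant reads are taken to be those sharing a hypothetical `(probe, read_code, start)` triple, the sample and reference counts are assumed to be already normalized for sequencing depth, and the statistical significance step is not modeled.

```python
from collections import defaultdict
from statistics import mean

def unique_counts(reads):
    """Collapse redundant reads and count unique molecules per probe.

    reads: iterable of (probe, read_code, start) tuples.
    """
    unique = set(reads)  # identical triples collapse to one consensus read
    counts = defaultdict(int)
    for probe, _code, _start in unique:
        counts[probe] += 1
    return dict(counts)

def gene_copy_number(sample_counts, reference_counts, probe_to_gene, ploidy=2):
    """Average per-probe copy number estimates into one value per gene."""
    per_gene = defaultdict(list)
    for probe, ref in reference_counts.items():
        ratio = sample_counts.get(probe, 0) / ref
        per_gene[probe_to_gene[probe]].append(ploidy * ratio)
    return {gene: mean(values) for gene, values in per_gene.items()}

reads = [("p1", "AAC", 100), ("p1", "AAC", 100), ("p1", "GGT", 105), ("p2", "AAC", 200)]
counts = unique_counts(reads)
copies = gene_copy_number(
    {"p1": 5, "p2": 5}, {"p1": 10, "p2": 10}, {"p1": "ATM", "p2": "ATM"}
)
```

In the toy input, probes reporting half the reference count yield a gene-level estimate of one copy, as expected for a heterozygous deletion in an otherwise diploid sample.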
shows box-and-whisker plots of copy number determinations
for the AR () and ATM () genes in fragmented and blended genomic
libraries. Because the ΔATM sample is male, the AR gene (X-linked, hemizygous) and the
ATM gene both exhibited CNL behavior. As anticipated, the magnitude of measured copy
variation was modest. The statistical analysis shown in FIG. 9B demonstrates that the
observed copy fluctuation was statistically significant. Moreover, very little significant
fluctuation was observed in the remaining genes that were predicted to exhibit uniform copy
characteristics. These values correlated well with frequencies predicted for the various
genomic blends. shows that statistically significant copy fluctuation was also readily
observed in samples that were primarily cfDNA with minor spike-ins of either cfDNA from
the opposite sex or minor additions of fragmented gDNA. These values correlated well with
frequencies predicted for the various genomic blends. The results seen with both fragmented
gDNA and with cfDNA were comparable, thereby demonstrating the integrity of the assay
and suggesting that the utility will translate to clinical samples.
These data demonstrate the ability of the assay system to detect subtle
changes in gene copy number down to minor allele frequencies of 2%. While the focus of
demonstrated examples presented is on copy number loss, the technology is equally well
suited to the detection of copy number gains, including increases in gene copy that occur
through chromosomal arm duplications and focal amplifications. This assay further retains
the ability to detect other types of genomic variants, including SNVs, indels and gene fusions
(chromosomal rearrangements). Importantly, these data demonstrate that the method can be
applied to genomic DNA derived from plasma, but also to genomic DNA derived from other
sources such as tissue and other bodily sources.
Example 2: Copy number analysis of cfDNA from healthy donors and a cancer patient
The following example illustrates the manner in which the molecular
features added during genomic library construction and post-hybridization processing are
used to generate copy number analysis. DNA was extracted from the plasma of sixteen
healthy donors and one castration-resistant prostate cancer patient using the Qiagen
Circulating Nucleic Acids extraction kit (Qiagen, Hilden, Germany). The yield of
double-strand DNA was quantified using a Qubit fluorometer (Thermo Fisher, Waltham, MA) and
the corresponding hsDNA quantitation kit. Size analysis was performed using gel
electrophoresis on 2% agarose gels with PCR markers as size standards (New England
Biolabs, Ipswich, MA). Approximately 40-100 ng of cfDNA, depending on the yield of
cfDNA from the sample, was used for library construction.
The basic features of library construction are illustrated in FIGs. 11A-11C.
The cfDNA was first dephosphorylated and then repaired to blunt ends in a two-step
process. Short, 10 nt anchor sequences consisting of a phosphorylated ligation strand and an
inert partner strand were then ligated to the cfDNA. The eight oligonucleotides used to create
the set of four anchor sequences are shown in Table 1.
Table 1: Ligation anchor oligonucleotides
Ligation strand oligo_16-4 /5Phos/ACC TGA TGC A**
A, C, G, or T)-Q] denotes a modified base in which the hydroxyl group resides on the 2’ position of the ribose ring
** /5Phos/ denotes the chemical addition of a 5’ phosphate group to the 5’ base position
The adaptor structures were completed by the addition of full-length
adaptor sequences that annealed to the anchor sequence. Thirty-two sets of adaptor
sequences, each composed of 240 members, are shown in — . These adaptors
were attached to the cfDNA and extended through the concerted actions of polynucleotide
kinase, DNA polymerase and DNA ligase to generate genomic libraries. As a pre-sequencing
quality control step, the resulting genomic libraries were quantified by qPCR for depth of
coverage. The genomic libraries were then ed and hybridized to probe sets targeting
specific genes (B). Following hybridization, primer extension of the probe was used
to copy the captured genomic sequences and the information encoded in the attached adaptor
(FIG. 11C). An example of post-sequencing analysis using standard next-generation analysis
software is shown in FIG. 11D. This analysis was performed on a sequencing run that
contained 32 samples (28 cancer patient samples and 4 wild-type controls) and it displays the
overall distribution of sequencing reads.
A central feature of the targeted hybrid capture platform described
herein is that it provides multiple types of genomic information. One essential function of
capture probes is to enable mutation detection across target regions at a high depth of
coverage. This function is governed by the sequence context, density, and placement of the
capture probes and is illustrated in with the TP53 gene (TP53 probe sequences are
shown in Table 2 below). Of equal significance, the targeted hybrid capture platform assay
generated a readout of equal depth of coverage in regions where no significant mutations
were detected. These data are critical to physicians and patients as they add statistical
significance in cases where no deleterious mutations were detected.
Table 2: TP53 Probes
TP53_9 GCAGAGACCTGTGGGAAGCGAAAATTCCATGGGACTGACT 7697
TP53_11 GCAGGGGGATACGGCCAGGCATTGAAGTCTCATGGAAGCC 7699
CCATCGCTATCTGAGCAGCGCTCATGGTGGGGGCAGCGCC
GCCATCTACAAGCAGTCACAGCACATGACGGAGGTTGTGA
TP53_23 CATGGCGCGGACGCGGGTGCCGGGCGGGGGTGTGGAATCA 7711
GAGGGCCACTGACAACCACCCTTAACCCCTCCTCCCAGAG 7713
TP53_37 TCTCCCAGGACAGGCACAAACACGCACCTCAAAGCTGTTC 7725
TP53_44 CCTGGAGTGAGCCCTGCTCCCCCCTGGCTCCTTCCCAGCC 7732
TP53_45 TCCGAGAGCTGAATGAGGCCTTGGAACTCAAGGATGCCCA 7733
The linkage of the capture probe with captured genomic sequence
(C) also facilitated measurement of genomic depth at each probe location. The
number of unique reads associated with every capture probe used in the experiment was
measured (). The data shown in was derived from a sequencing run in which
16 healthy donor cfDNA samples were analyzed. The depth of unique reads encountered in
each sample at one probe location in the TP53 gene was calculated (raw unique read counts
shown in A). Each sample comprised a unique library depth, as reflected in the broad
sample-to-sample distribution of unique reads. The global average of unique read depth
across all 2596 capture probes in the experiment was also calculated (B).
Significantly, normalization of the observed read depth at the single probe site displayed in
C by the global unique read depth measured for all probes revealed a uniform density
of normalized unique reads. These data indicate that the capture performance of a particular
probe chosen for analysis was uniform from sample-to-sample and proportional to the
genomic depth of each individual library.
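The normalization described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used in the experiment: the function name, the dictionary layout, the donor labels and the read counts are all hypothetical. It assumes the global unique read depth of a sample is the arithmetic mean of unique reads over all of its probes.

```python
from statistics import mean

def normalize_by_global_depth(unique_reads):
    """Divide each probe's unique read count by the sample's mean over all probes.

    unique_reads: dict mapping sample name -> dict mapping probe name -> count.
    (Hypothetical data layout chosen for illustration.)
    """
    normalized = {}
    for sample, probe_counts in unique_reads.items():
        global_depth = mean(probe_counts.values())  # assumed definition of global depth
        normalized[sample] = {
            probe: count / global_depth for probe, count in probe_counts.items()
        }
    return normalized

# Hypothetical counts: donor_B was sequenced three times deeper than donor_A.
example = {
    "donor_A": {"TP53_9": 300, "Chr_1_9": 200, "Chr_2_9": 100},
    "donor_B": {"TP53_9": 900, "Chr_1_9": 600, "Chr_2_9": 300},
}
normalized = normalize_by_global_depth(example)
for sample, probes in normalized.items():
    print(sample, probes)
```

In this toy case the two libraries differ three-fold in raw depth, yet each probe receives the same normalized value in both samples, which is the behavior the paragraph above describes for a well-behaved probe.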
This same normalization function was applied to the 45 TP53-specific
probes shown in (normalization data shown in ). Whereas shows the
aggregate contribution of all probes to the sequencing depth of TP53 coding regions,
shows the normalized depth retrieved by each individual probe. The normalized depth
retrieved by each individual probe was generally consistent from sample-to-sample for any
given probe but somewhat variable when one probe was compared to another. Several factors
governed the differences in the post-normalization capture depths observed between probes,
the most significant being the placement of probes relative to one another and the proximity
of probes to genomic repeat regions. Not all probes exhibited uniform capture behavior; two
probes whose capture performance was not consistent are highlighted by arrows in .
However, these data indicate that such probes are rare and easily identified. As such, they
can be excluded from downstream copy number analysis.
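One way such inconsistent probes might be flagged is by their sample-to-sample variability. The sketch below is an assumption, not a method stated in this document: it uses the coefficient of variation of normalized depth across samples, and the threshold `max_cv`, the function name and the probe labels are hypothetical.

```python
from statistics import mean, pstdev

def flag_noisy_probes(normalized_depth, max_cv=0.25):
    """Return probes whose normalized depth varies too much across samples.

    normalized_depth: dict mapping probe name -> list of normalized depths,
    one value per sample. max_cv is an illustrative cutoff, not a value
    taken from the source.
    """
    noisy = []
    for probe, values in normalized_depth.items():
        average = mean(values)
        # Coefficient of variation: population standard deviation over the mean.
        cv = pstdev(values) / average if average > 0 else float("inf")
        if cv > max_cv:
            noisy.append(probe)
    return noisy

# Hypothetical normalized depths for two probes across four samples.
depths = {
    "probe_1": [1.0, 1.1, 0.9, 1.0],   # consistent capture
    "probe_2": [0.2, 1.8, 0.9, 1.5],   # erratic capture
}
flagged = flag_noisy_probes(depths)
print(flagged)
```

Probes returned by such a filter would simply be left out of the later copy number calculation.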
The uniform capture performance exhibited by the 45 TP53-targeting
probes in is a general feature of the targeted hybrid capture platform described
herein. In , the average capture depth for each probe in a panel of 2596 capture probes
was calculated for all 16 normal cfDNA libraries that were profiled in this experiment. The
average was then compared individually with three representative samples using scatter plot
analysis. Each dot represents a different probe and its position on the graph is a comparison
of the average on the x-axis and the individual sample on the y-axis. The tight diagonal
distribution of the majority of probes reflected the highly-correlated unique read capture
performance of most probes (R² correlation ≥ 0.95 for all three graphs). Importantly, the
consistency of probe-by-probe sequencing depth supports the use of the targeted hybrid
capture platform in copy number measurement.
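The scatter plot comparison above amounts to correlating per-probe depth in one sample with the per-probe average over all samples. A minimal sketch of that calculation follows, assuming that R² is the square of the Pearson correlation coefficient; the function name and the depth values are hypothetical.

```python
def r_squared(x, y):
    """Square of the Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical per-probe depths: panel average (x-axis) vs. one sample (y-axis).
average_depth = [100, 200, 300, 400]
sample_depth = [210, 390, 610, 790]
r2 = r_squared(average_depth, sample_depth)
print(f"R^2 = {r2:.4f}")
```

A sample whose probes fall on a tight diagonal, as described above, yields a value close to 1 regardless of how deeply that sample was sequenced.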
With respect to copy number, the most straightforward treatment of
probe data is to further normalize the adjusted genomic depth values that occur in autosomal
chromosomes to a diploid-averaged value of “2”. The same is true for probe values that occur
in females for X-linked loci. For X-linked and Y-linked regions in normal males, averaged
copy values are appropriately set to “1”. This numerical transformation was applied to a set
of chromosomal control probes (239 probes that target select loci on all 22 autosomal
chromosomes, Table 3), a set of 199 probes that target the X-linked AR gene, and the 45
TP53-specific probes considered in detail above (FIGs. 27A and 27B). Each dot represents the
value for an individual probe. With the exception of infrequent “noisy” probes, the vast
majority of individual probe counts in regions anticipated to be diploid possessed values that
were approximately “2”. Probes for the AR gene in a healthy male fluctuated with an average
value close to the anticipated “1”.
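The numerical transformation described above can be illustrated as a per-probe rescaling against a healthy reference. The sketch below is one plausible reading rather than the documented procedure: it assumes the reference is the average normalized depth of each probe in healthy controls, and that the expected copy value is 2 for autosomal probes and 1 for X-linked probes in males. The probe name `AR_1` and all depth values are hypothetical.

```python
def estimate_copy_number(sample_depth, reference_depth, expected_copies):
    """Rescale normalized probe depth so that healthy controls sit at their expected copy value.

    sample_depth:    probe -> normalized depth in the sample under test
    reference_depth: probe -> average normalized depth in healthy controls
    expected_copies: probe -> copy value expected in the controls
                     (2 for autosomal loci, 1 for X-linked loci in males)
    """
    return {
        probe: expected_copies[probe] * depth / reference_depth[probe]
        for probe, depth in sample_depth.items()
    }

# Hypothetical values for one autosomal probe and one X-linked probe in a male.
reference = {"TP53_9": 1.5, "AR_1": 0.6}
expected = {"TP53_9": 2, "AR_1": 1}

healthy_male = {"TP53_9": 1.5, "AR_1": 0.6}
altered_sample = {"TP53_9": 0.75, "AR_1": 3.0}

healthy_calls = estimate_copy_number(healthy_male, reference, expected)
altered_calls = estimate_copy_number(altered_sample, reference, expected)
print(healthy_calls)
print(altered_calls)
```

Under these assumptions a sample that matches the reference returns the expected values of 2 and 1, while a halved autosomal depth or a five-fold X-linked depth is reported as a copy loss or a copy gain, respectively.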
Table 3: Chromosomal Control Probes
Chr_1_9 GTGAGCCTTCTCTCACCATTCTGTCCAAAATAGCAGCCCT 7742
CCCAGCGCCCGTGGCTTTGGCTCCTCAGTCCCATTTAAAT
TATACCACCAAGTCTACCTACTGCCTGCACATGCTATGGC
Chr_2_9 GGTCAATCCGGCACTACTGGTTGTCCAAAGGGAGGTTACT 7754
Chr_2_11 GTGTCTCCTGGAGGTGCATGGGTGGTTTTGAACTTCATTG 7756
GTCCCTGGGACCATCTGTGCATTGTTCTTGTAACTGGAAA
GACCGAATGGCGAACGCAGTGAATAGATCAGGAGGGAAAA
Chr_3_1 GAAGGAATGGAGTGGAACAGATAGGGGTGAGGGAATAACG 7764
Chr_3_3 ATCCAGGCTTCATGTTCAAATGCAATGGCCCTTGCCCCAT 7766
Chr_4_3 CATGAGTCCTTCTATGACTCCCTCTCAGACATGCAGGAAG 7778
Chr_5_9 GTCGGTCAGAAGGAACACCTGAGAAACCGCTTTA 7792
GGAGACAACTTAGGAGGTTATCTAGACCATTCCCGCCTTC
GTGTTTCCTCCCAGCATGCACTTTGTGGCTGCCTTTCTTT
Chr_5_14 TGGCTTGTGTAGCGTGTTTCATTTTGGAACCTTGGAGCCG 7797
Chr_5_16 GTTTCAGATCTTGCAATGGGAGGGATCGACTCGGCCCTTT 7799
Chr_6_1 GAGTTTTTCTTTCAGGTAGTCTGAGATGGCCCGCACCAAG 7802
Chr_6_3 GGCAGATTCGATGGGACTTTAGACACTTGCTTTGCTCCCT 7804
Chr_7_4 CCATGACTTATGTGCAGCTTGCGCATCCAGGGGTAGATCT 7816
Chr_8_6 TGGCTTTGGCGCTTTAAGGCCAGACACGGCATTAAAAAGC 7830
CCATGGTTCTGTGAGACTGGTAGAAAGCACAGACCCCTTA
AATGTGCTTATCACTCGTGATGGGGTCCTGAAGCTGGCAG
Chr_9_2 AGGGTCTCATTTTAAGACAGCTTGATTTGAGGGTGAGGGG 7835
Chr_9_4 GTCTAAGGGCATCTTACCTCCAAGAACTGCTTGAGGCGTA 7837
Chr_9_7 AGTGTCGGAAGAAACCTACCTGCGTTTCTTAGAA 7840
Chr_9_9 TATATCTCACGTGACCGAGGATGGGTCGTGGGCATTCACA 7842
Chr_11_4 AAGGTATAGAGCTGGGCGGCTTTCCTCGTTATAGGTGGAG 7854
Chr_12_7 ACATTATATCCGGTCCAGGAATATCTGGCTCAGGCTGGGT 7868
AAAAGAAATGCGATCAGCGCAACCCATCCGGTGTGGCGCT
GGCAGTGGTACCATGACATACTTAGCAGAGATGGACTACA
Chr_13_1 ATTTCCCATGCGAGAGGTAGCTTGCCCAGGCTGTTGGATA 7873
Chr_13_3 TCACGGGAGCTTCCTTCACTGAGTTCTGCGAATCTGAAGC 7875
Chr_13_6 GTTCACTCGTCGGTTTTTCACCAACCACAGACTAGCCTCA 7878
Chr_13_8 TCTCAGTGAACAGAGGGCTCACTGAGAGGACTTTGAATAC 7880
Chr_15_4 TTCAATCAGGTACTCCGAGTTCCCTTGGAGGCCAAAAGGA 7892
Chr_16_6 CTGGCATTGGTGAGTAATAGGAGCCAGACGGGTCTGTGTT 7906
GTGCTACCCTCCTCCCTTCAGGTTATGTGGTCCAGGCTTT
TAAGTGGAACAACATTCCCTTCATTATAGCCCTTCGTGGG
Chr_16_11 GCAACGTCAACAACTACTACGTGCACAAGCGCCTCTACTG 7911
Chr_17_2 GTGGTCACCATCTCTTCAAACCATTTGGACTGGGCCTGGT 7913
Chr_17_5 GTTGTCATTGGGGCTATAGACATAAGCACCTTCCGGAATC 7916
Chr_17_7 GTCAGACCCTGTCCTCGTCTCCTTTACCTTGTCTCGATTT 7918
Chr_18_8 CTATGAGCATACTGGGGAGGGAAACCTCTAAGCGGAACTT 7930
Chr_19_11 CCTCTTAGTCCTGGGCCATACCTTAGCCTTGTGC 7944
TCTAGATGGAAGCTGTATCCAAGGATGCTCCGGAATGTTG
ATCTTCTCTGCCTGCCGCACTAGCTTCTTGGTGACTTCTC
Chr_20_5 ATCGAGTTGTCGAGCCCCATGATTCGACACCAAGATCCCA 7949
Chr_20_7 GTGCACTGTCAGATCTTGGAAACGGCCAAAGGATTTTTCC 7951
Chr_20_10 CTCCTCCAGGAGCTGGCAGCATCAAGACCCCACTTCGCTT 7954
Chr_21_2 AAGTCTGACAGCATCTGCTTGAACTGAGGCACAGTGATGG 7956
AGACCCAGCCTACCTGCATGATCTCTTGTACAGCTTTGCA
TCATGGAACATGGGCCTTGCAAAGGGGTCAAGATCACAAC
Chr_22_5 CATTCCCCATTCTGCAGGATCCGTTCCCCTGGCA 7968
CAGAAGGATACTAGAATGGAATGTCCTGCGTGACGAAAGC
CATCTGATTCTCCTATGGCTGCTAGGCTCCAGGA
Significantly, when the same analysis was applied to cfDNA collected
from the blood plasma fraction of a castration-resistant prostate cancer patient using healthy
samples as normalization controls, three ent features emerged (C). First, all of
the control probes exhibited noisy counting behavior. Second, the counts across all AR
probes were significantly elevated from a normal value of “1” to an ed value of
approximately “5”. Amplification of the AR gene is consistently observed in advanced
prostate cancer patients. Third, the TP53 probe counts, while more tightly clustered,
possessed an average value far closer to “1” than the expected value of “2”. This likely
reflected inactivation of one or both alleles of TP53 by copy number loss in the fraction of
circulating DNA derived from tumor tissue.
These data indicated that the methods of the present invention
comprise three important karyotyping aspects. Namely, the methods described herein detect
generalized chromosomal aneuploidy, copy increases of specific, targeted genes, and copy
losses in the same specific, targeted genes. These results further indicate that the methods and
platforms described herein can guide the use of precision therapies, as all three of these
genomic abnormalities occur frequently in .
Generalized chromosomal aneuploidy for castration-resistant prostate
cancer patient samples (blue dots) relative to a healthy control (brown dots) was measured
(). In this analysis, the approximate ploidy for all 239 control probes used in the
experiment were d according to their chromosomal targets. For some chromosomes
(e.g., chromosome 1 and chromosome 22) a similar ploidy value of “2” was observed
between patient and control samples. In other cases, deviation between the two samples was
observed. The degree of information regarding overall genomic ploidy provided by these
experiments was constrained by the number and density of control probes used. However,
these data indicate that a denser probe panel covering all chromosomal segments at uniform
density can be used in conjunction with the additional unique features of the present
invention. Such analyses will provide a higher resolution, genome-wide measurement of
chromosomal copy number.
These data further highlight the capabilities of the present invention as
a guide for precision therapy. For example, tumors that possess genomic deficiencies in
homologous recombination repair often exhibit highly destabilized chromosomal es,
and patients with such tumors are good candidates for inhibitors of the PARP enzyme
complex (See Popova et al., Genome Biol. 0(11):R128). Unlike most sequencing
assays that seek to genotype a tumor, the assays described herein use sequencing to detect
destabilized chromosomal ploidy as a tumor phenotype, even if the causal mutations driving
this phenotype remain hidden from targeted analysis.
The ability to detect gene loss in DNA shed from solid tumors is
especially significant. Mutation and deletion of tumor suppressor genes is a frequent event in
cancer genomes; moreover, individuals with germline loss of tumor suppressor genes are
uniquely vulnerable to developing cancer later in life. The diagnostic value of a liquid biopsy
copy number loss (CNL) assay is directly proportional to its sensitivity. To determine the
lower limit of detection for the invention described here, the immortalized lines described in
Example 1 were systematically diluted into the “genome-in-a-bottle” reference cell line,
NA12878. One line had a single copy deletion (monoallelic loss) of ATM, the other a single
copy deletion of BRCA2. The experiment included four control samples of pure NA12878
and eight spike-in samples containing 16% of each monoallelic deletion line (). For
reporting purposes, this corresponds to an 8% minor allele frequency of biallelic loss.
ed values for all probes targeting specific genes and two additional, undeleted control
genes are shown in . Copy loss of ATM and BRCA2 was confined to spike-in
samples only. Additional computational treatment of the data revealed confident copy loss
calling of biallelic deletions down to 2% minor allele frequencies. This sensitivity indicated
that the present invention required no specialized considerations in order to routinely include
copy loss calls in standard blood-based genotyping assays.
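The dilution arithmetic above can be made explicit. The following sketch assumes that the stated percentages are fractions of genomes in the blend and that the background line carries two copies of each gene; the function names are hypothetical and the calculation is offered only as an illustration of the reporting convention.

```python
def expected_copy_number(deleted_fraction, copies_in_deleted_line, normal_copies=2):
    """Average copies per genome in a blend of a deletion line and a normal line."""
    return (normal_copies * (1 - deleted_fraction)
            + copies_in_deleted_line * deleted_fraction)

def biallelic_equivalent_fraction(deleted_fraction, copies_lost, normal_copies=2):
    """Express a partial-copy loss as the equivalent fraction with complete (biallelic) loss."""
    return deleted_fraction * copies_lost / normal_copies

# 16% spike-in of a line carrying a single-copy (monoallelic) deletion.
blend_copies = expected_copy_number(0.16, copies_in_deleted_line=1)
equivalent = biallelic_equivalent_fraction(0.16, copies_lost=1)
print(f"expected copies per genome: {blend_copies:.2f}")
print(f"biallelic-equivalent fraction: {equivalent:.2%}")

# Expected copy value at the 2% biallelic-loss level cited above.
at_limit = expected_copy_number(0.02, copies_in_deleted_line=0)
print(f"expected copies at 2% biallelic loss: {at_limit:.2f}")
```

Under these assumptions a 16% monoallelic spike-in lowers the expected copy value from 2 to 1.84, the same shift produced by an 8% fraction with biallelic loss, and the 2% level corresponds to an expected value of 1.96.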
These data demonstrate the use of probe-specific genomic capture data
for the analysis of copy number, including both copy number gain and copy number loss of
target genomic loci. Additionally, the invention described herein has been shown to possess
the sensitive ability to detect single nucleotide variants, insertions and deletions ranging from
single nucleotides to many thousands of base pairs, and gene fusions resulting from
chromosomal rearrangement by aberrant mutational processes (See PCT Publication No. WO
2016/028316, and U.S. Patent Publication No. 2014-0274731). All of these mutational
processes can contribute to the transformation of normal tissue to neoplastic cancers, and as
precision therapies continue to emerge, accurate diagnosis of these ed genomic
signatures will become an increasingly indispensable feature of precision medicine.
Claims (21)
1. A kit comprising a set of adaptors, wherein each adaptor of the set of adaptors comprises a sample tag region selected from a pool of unique sample tag regions, wherein the pool is selected from a plurality of pools, and wherein the selected pool is unique to a test sample; wherein the test sample comprises a plurality of DNA fragments.
2. The kit of claim 1, wherein each adaptor of the set of adaptors is a DNA polynucleotide that comprises (i) an amplification region, (ii) a sample tag region; and (iii) an anchor region.
3. The kit of claim 2, wherein the amplification region comprises a polynucleotide sequence capable of serving as a primer recognition site for PCR amplification.
4. The kit of claim 2 or claim 3, wherein the amplification region is about 10 to about 50 nucleotides in length, about 20 to about 30 nucleotides in length or wherein the amplification region is about 25 nucleotides in length.
5. The kit of any one of claims 2-4, wherein the sample tag region identifies the test sample.
6. The kit of any one of claims 2-5, wherein the sample tag region identifies a DNA fragment attached thereto.
7. The kit of any one of claims 2-6, wherein the sample tag region is about 5 to about 50 nucleotides in length, about 5 to about 15 nucleotides in length or wherein the sample tag region is about 8 nucleotides in length.
8. The kit of any one of claims 2-7, wherein the amplification region of each adaptor of the set of adaptors is identical to the amplification region of every other adaptor of the set of adaptors.
9. The kit of any one of claims 2-8, wherein the anchor region comprises a polynucleotide sequence that is capable of attaching to a DNA fragment.
10. The kit of any one of claims 2-9, wherein the anchor region is about 1 to about 50 nucleotides in length, about 5 to about 25 nucleotides in length or wherein the anchor region is about 10 nucleotides in length.
11. The kit of any one of claims 1-10, wherein each adaptor of the set of adaptors is configured to attach to a DNA fragment of the plurality of DNA fragments to generate a DNA library comprising at least two unique sample tag regions, wherein each of the DNA library fragments comprises a DNA fragment attached to an adaptor.
12. The kit of any one of claims 1-11, wherein the test sample is a tissue biopsy, optionally wherein the tissue biopsy is taken from a tumor or a tissue suspected of being a tumor.
13. The kit of any one of claims 1-12, wherein the DNA fragments are cell-free DNA (cfDNA) or cellular DNA, optionally wherein the cfDNA is isolated from the test sample; and wherein the test sample is a biological sample selected from the group consisting of: amniotic fluid, blood, plasma, serum, semen, lymphatic fluid, cerebral spinal fluid, ocular fluid, urine, saliva, stool, mucous, and sweat.
14. The kit of any one of claims 2-10, wherein each adaptor of the set of adaptors further comprises a unique molecule identifier multiplier (UMI multiplier) that is adjacent to or contained within the sample tag region, optionally wherein the UMI multiplier is about 1 to about 5 nucleotides in length or wherein the UMI multiplier is about 3 nucleotides in length.
15. The kit of claim 14, wherein the UMI multiplier increases the number of unique sequences for identifying the plurality of DNA fragments.
16. The kit of any one of claims 1-15, wherein each sample tag region of the pool of sample tag regions comprises one of 2 to 1,000 unique nucleotide sequences, one of 50 to 500 unique nucleotide sequences, one of 100 to 400 unique nucleotide sequences, one of 200 to 300 unique nucleotide sequences, or wherein each sample tag region of the pool of sample tag regions comprises one of 240 unique nucleotide sequences.
17. The kit of claim 16, wherein each sequence of the unique nucleotide sequences is discrete from any other sequence by a Hamming distance of at least two.
18. The kit of any one of claims 1-17, further comprising one or more capture probe modules.
19. The kit of claim 18, wherein each capture probe module comprises a tail sequence and a capture probe sequence capable of hybridizing to a target sequence in the test sample.
20. A DNA library, wherein the DNA library comprises a plurality of DNA library fragments, wherein each of the DNA library fragments comprises an adaptor module and a DNA fragment, wherein the adaptor module is a DNA polynucleotide comprising (i) an amplification region, (ii) a sample tag region, and (iii) an anchor region; wherein the amplification region comprises a polynucleotide sequence capable of serving as a primer recognition site for PCR amplification; wherein the sample tag region identifies the test sample; and wherein the anchor region comprises a polynucleotide sequence that is capable of attaching to a DNA fragment.
21. A set of adaptors, wherein each adaptor of the set of adaptors comprises a sample tag region selected from a pool of unique sample tag regions, wherein the pool is selected from a plurality of pools, and wherein the selected pool is unique to a test sample; wherein each adaptor in said set of adapters is a DNA polynucleotide that comprises: an amplification region, a sample tag region, and an anchor region; wherein the amplification region comprises a polynucleotide sequence capable of serving as a primer recognition site for PCR amplification; wherein the sample tag region identifies the test sample; and wherein the anchor region comprises a polynucleotide sequence that is capable of attaching to a DNA fragment.
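The Hamming distance condition recited in claim 17 can be checked mechanically for any candidate pool of sample tag sequences. The sketch below is illustrative only: the four 8-nucleotide tags are hypothetical and are not taken from the adaptor sets of this document, and the function names are assumptions.

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_hamming(tags):
    """Smallest Hamming distance between any two tags in the pool."""
    return min(hamming_distance(a, b) for a, b in combinations(tags, 2))

# Hypothetical 8-nucleotide sample tags.
tags = ["ACGTACGT", "ACGTTGGT", "TTGTACGA", "CAGTACCT"]
minimum = min_pairwise_hamming(tags)
print("minimum pairwise Hamming distance:", minimum)
print("satisfies a distance of at least two:", minimum >= 2)
```

With a minimum distance of two, a single sequencing error in a tag cannot convert it into another valid tag of the same pool.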
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US62/379,593 | 2016-08-25 | ||
US62/481,538 | 2017-04-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
NZ791679A (en) | 2022-08-26 |