WO2024119461A1 - Compositions and methods for detecting target cleavage sites of CRISPR/Cas nucleases and DNA translocation
- Publication number
- WO2024119461A1 (application PCT/CN2022/137789)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1068—Template (nucleic acid) mediated chemical library synthesis, e.g. chemical and enzymatical DNA-templated organic molecule synthesis, libraries prepared by non ribosomal polypeptide synthesis [NRPS], DNA/RNA-polymerase mediated polypeptide synthesis
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Definitions
- the present disclosure relates to complexes, polynucleotides, vectors, kits, and methods for detecting cleavage sites of CRISPR/Cas nucleases and DNA translocations at cleavage sites in a genome.
- CRISPR-based genome editing has exhibited enormous potential in both biological research and clinical applications.
- CRISPR therapy has its unique advantage of directly targeting the nucleic acid sequences of previously undruggable targets.
- non-specific targeting by gRNAs, which might introduce undesired edits, causes unexpected cell genotoxicity.
- there is an urgent need to understand the outcomes of off-target edits and the resulting DNA translocations, which challenge the great translational potential of CRISPR technology in treating genetic disorders and other human diseases.
- GUIDE-seq labels and enriches double-strand breaks in the genome of living cells using exogenous double-stranded oligodeoxynucleotides (dsODNs), which are integrated through the DNA repair process (Tsai et al., Nat Biotechnol 2015).
- BLISS is another type of in cellula technique, which utilizes in situ DSB ligation in fixed cells and characterizes the off-target sites for both SpCas9 and As/LbCpf1 (Yan et al., Nat Commun 2017) .
- as CRISPR technology holds therapeutic potential for many unmet medical needs, off-target identification for in vivo CRISPR editing and evaluation of the corresponding genotoxicity are in high demand.
- one strategy is to use in vitro or computational approaches to prioritize a list of genomic regions and validate them on in vivo samples one by one through targeted amplicon sequencing (Amplicon-seq) (Newby et al., Nature 2021; Musunuru et al., Nature 2021; Akcakaya et al., Nature 2018), which risks overlooking in vivo-specific off-targets and requires tedious manual work if the prior data yield a long candidate list.
- DISCOVER-seq utilizes the chromatin immunoprecipitation signal of MRE11, which is involved in the DNA repair pathway, to represent and enrich genomic sites undergoing DSB-induced repair (Wienert et al., Science 2019).
- the dynamic nuclease activity of Cas9 might not be fully captured by the “snapshot” signal from MRE11 immunoprecipitation.
- DNA translocation has been a significant concern for CRISPR editing, as it typically causes higher genotoxicity, although it occurs at a relatively lower frequency (Wei et al., Cell 2016) .
- concern about the risk of DNA translocation has often focused on the use of CRISPR editing to produce CAR-T cells, since multiple gRNAs are introduced into T cells and create a risk of translocation between DNA double-strand break (DSB) ends (Liu et al., Cell 2017; Ren et al., Clin Cancer Res 2017).
- compositions and methods for detecting target cleavage sites of CRISPR/Cas nucleases and DNA translocation are described in International Application No. PCT/CN2021/124025, filed October 15, 2021, which is incorporated herein by reference in its entirety.
- CRISPR technology holds significant promise for biological studies and gene therapies because of its high flexibility and efficiency when applied in mammalian cells.
- Cas9 potentially generates undesired edits; thus, there is an urgent need to comprehensively identify off-target sites so that the genotoxicities can be accurately assessed.
- the present disclosure provides a new technology, which is referred to as “PEAC-seq” in some embodiments, for detecting cleavage sites of CRISPR/Cas nucleases and DNA translocations at cleavage sites in a genome.
- PEAC-seq adopts the Prime Editor, or a modified version of the Prime Editor, to insert a sequence-optimized sequence (i.e., a label or a tag) to the Cas nuclease editing sites and enrich the labeled regions with site-specific primers for high throughput sequencing (HTS) .
- PEAC-seq employs a Cas nuclease, a reverse transcriptase, and a guide RNA (also called “pegRNA” ) .
- PEAC-seq can identify Cas editing sites, as well as DNA translocations, which are more genotoxic but usually overlooked by other off-target detection methods. As PEAC-seq does not rely on exogenous oligodeoxynucleotides (ODNs) to label the editing site, it can be used in vivo for off-target identification. PEAC-seq provides a comprehensive and streamlined strategy to identify CRISPR off-target sites in vitro and in vivo, as well as DNA translocation events. This new technique further diversifies the toolkit for evaluating the genotoxicity of CRISPR applications in research and clinical settings.
- PEAC-seq provides a method to detect Cas9 cleavage sites with high accuracy and sensitivity.
- PEAC-seq can be used in vitro and in vivo, as illustrated in the Examples of this disclosure.
- PEAC-seq can also be used to detect DNA translocations at Cas cleavage sites.
- PEAC-seq is designed to insert an insertion sequence (e.g., a label or a tag) into a Cas9 cleavage site (including both on-target and off-target sites) in the genome. These insertion sequences function as labels, marking the Cas9 cleavage sites.
- the incorporation of the insertion sequences in the genomic DNA is also referred to herein as “labeling.”
- the insertion sequence (e.g., a label or a tag) can be optimized in composition and length to increase insertion efficiency. For instance, the insertion sequence can incorporate a tag sequence to represent and enrich the edited sites in the genome.
- the reverse transcriptase and the Cas9 nuclease are fused together as, e.g., a fusion protein.
- the labeling of the genomic DNA by an insertion sequence is performed at the same location of a cleavage site right after the Cas nuclease cleaves the genomic DNA at that cleavage site.
- the Cas cleavage sites can be identified on the genome. The accompanying process of cut-and-insertion ensures consistency between cutting events and insertion events.
- DNA translocations at Cas9 cleavage sites can be identified by a detection method disclosed herein.
- the present disclosure provides a comprehensive and streamlined method to identify CRISPR targeting sites both in vitro and in vivo, as well as DNA translocation events.
- the method employs a guide RNA comprising an insertion sequence reverse transcriptase (RT) template and does not rely on additional exogenous label sequence.
- the present disclosure provides a complex comprising a Cas nuclease, a reverse transcriptase (RT) , and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template (RTT) , and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the spacer, the scaffold, the insertion sequence RT template, and the PBS are arranged from 5’ to 3’ in the guide RNA.
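- The 5’ to 3’ arrangement described above can be sketched in code. This is a minimal illustration, not the disclosed implementation: the class name `GuideRNA`, its field names, and the short placeholder sequences are assumptions introduced here and do not correspond to any SEQ ID NO in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GuideRNA:
    """Hypothetical container for the four guide RNA components."""
    spacer: str    # DNA-targeting motif
    scaffold: str  # CRISPR binding motif
    rtt: str       # insertion sequence RT template
    pbs: str       # primer binding site

    # Component order from 5' to 3', as stated in the text.
    ORDER = ("spacer", "scaffold", "rtt", "pbs")

    def sequence(self) -> str:
        """Concatenate the components in 5' -> 3' order."""
        return "".join(getattr(self, name) for name in self.ORDER)

    def component_spans(self) -> Dict[str, Tuple[int, int]]:
        """Return 0-based, half-open coordinates of each component."""
        spans = {}
        start = 0
        for name in self.ORDER:
            end = start + len(getattr(self, name))
            spans[name] = (start, end)
            start = end
        return spans


# Placeholder sequences for illustration only (not from the disclosure).
example_guide = GuideRNA(spacer="GACCC", scaffold="GUUUUAGA",
                         rtt="ACGUAC", pbs="CCUG")
```

- Calling `example_guide.component_spans()` shows where each component sits in the assembled sequence, which makes it easy to confirm that the PBS is the 3’-most element.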
- the present disclosure provides a complex comprising a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS) , wherein the PBS is located downstream to the 3’ end of the insertion sequence RT template, and wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the RTT-PBS sequence is a linear sequence or a circularized sequence.
- the RTT-PBS sequence further comprises an MS2 hairpin.
- the Cas nuclease is selected from Cas9, its variants, and mutants of any of the variants.
- the reverse transcriptase is selected from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, variants thereof and mutants of any of the variants.
- the Cas nuclease and the reverse transcriptase are not fused or linked.
- the Cas nuclease and the reverse transcriptase are formed as a fusion protein, optionally operably connected by a linker.
- the fusion protein is encoded by a sequence of SEQ ID NO: 208.
- the insertion sequence RT template is about 10 to 30 nucleotides.
- the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- the insertion sequence RT template encodes one or more tags suitable for hybrid capture.
- the guide RNA comprises an RNA structural motif at the 3’ end.
- the RNA structural motif is a modified prequeosine1-1 riboswitch aptamer (evopreQ1) or a frameshifting pseudoknot from Moloney Murine Leukemia Virus (MMLV) .
- the PBS comprises random nucleotides.
- the present disclosure provides a polynucleotide encoding the Cas nuclease, the reverse transcriptase, the spacer, the scaffold, the insertion sequence RT template, and the PBS in any one of the complexes disclosed herein.
- the present disclosure provides a polynucleotide encoding a guide RNA comprising an insertion sequence RT template, wherein the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- the present disclosure provides a polynucleotide encoding an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS) , wherein the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- the present disclosure provides a vector comprising any one of the polynucleotides disclosed herein.
- the present disclosure provides a kit comprising one or more polynucleotide sequences encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the present disclosure provides a kit comprising one or more polynucleotide sequences encoding a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an insertion sequence RT template, and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the present disclosure provides a kit comprising one or more vectors encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, and an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the present disclosure provides a kit comprising one or more vectors encoding a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an insertion sequence RT template, and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the Cas nuclease is selected from Cas9, its variants and mutants of any of the variants.
- the reverse transcriptase is selected from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, variants thereof and mutants of any of the variants.
- the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- the present disclosure provides a method for labeling Cas nuclease cleavage sites in genomic DNA, comprising contacting the genomic DNA with the complex in any one of claims 1-16, wherein the genomic DNA is cleaved at one or more cleavage sites, and one or more sequences that are reverse transcribed from the insertion sequence RT template in part or in whole are inserted into the one or more cleavage sites, and wherein the one or more sequences inserted into the one or more cleavage sites are labels.
- the cleavage site is on-target or off-target.
- the present disclosure provides a method for detecting Cas9 cleavage sites and/or detecting DNA translocation in genomic DNA, comprising
- the one or more amplicons each comprise a portion of genomic DNA that is immediately upstream or downstream to the one or more labels.
- the present disclosure provides a method for identifying off-target Cas cleavage sites, comprising comparing the Cas cleavage sites identified by the method disclosed herein with a target sequence, wherein a cleavage site that is not identical to the target sequence is an off-target site.
- the labeled genomic DNA is processed by Tn5 tagmentation before enrichment, wherein sequencing adapters that include unique molecular identifiers (UMI) are embedded in the Tn5 transposases.
- the labeled genomic DNA is targeted and enriched by PCR or a hybrid capture-based target enrichment method.
- the enrichment is performed by two PCR reactions, wherein in one PCR reaction the insertion sequence is used as the forward primer binding site and in the other PCR reaction the insertion sequence is used as the reverse primer binding site.
- the 3’ end of the primers that bind to the insertion sequence is at least 2-bp away from the insertion boundary.
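- The primer-placement constraint above can be checked programmatically. The sketch below rests on assumptions that are not stated in the disclosure: coordinates are 0-based and half-open on the labeled top strand, the primer lies entirely within the insertion, a forward primer has its 3’ end facing the downstream insertion boundary, and a reverse primer has its 3’ end facing the upstream boundary. The function names are hypothetical.

```python
def three_prime_gap(ins_start: int, ins_end: int,
                    primer_start: int, primer_end: int,
                    direction: str) -> int:
    """Number of insertion bases between the primer 3' end and the boundary.

    All intervals are 0-based and half-open: [start, end).
    """
    if not (ins_start <= primer_start < primer_end <= ins_end):
        raise ValueError("primer must lie entirely within the insertion")
    if direction == "forward":
        # 3' end points downstream, toward ins_end.
        return ins_end - primer_end
    if direction == "reverse":
        # 3' end points upstream, toward ins_start.
        return primer_start - ins_start
    raise ValueError("direction must be 'forward' or 'reverse'")


def primer_is_acceptable(ins_start: int, ins_end: int,
                         primer_start: int, primer_end: int,
                         direction: str, min_gap: int = 2) -> bool:
    """True if the primer 3' end is at least `min_gap` bp from the boundary."""
    gap = three_prime_gap(ins_start, ins_end, primer_start, primer_end, direction)
    return gap >= min_gap
```

- For example, with an insertion spanning positions 100 to 130, a forward primer covering 105 to 128 leaves a 2-bp gap and passes, whereas a reverse primer covering 101 to 125 leaves only 1 bp and fails.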
- the present disclosure provides a method for determining the relative specificity of a plurality of guide RNAs comprising
- a guide RNA having fewer off-target sites is more specific than a guide RNA having more off-target sites.
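- The comparison rule above amounts to ordering guide RNAs by their number of detected off-target sites. A minimal sketch follows; the function name, the alphabetical tie-break, and the guide names and counts in the usage note are illustrative assumptions, not data from the disclosure.

```python
from typing import Dict, List


def rank_guides_by_specificity(off_target_counts: Dict[str, int]) -> List[str]:
    """Order guide RNAs from most to least specific.

    A guide with fewer detected off-target sites ranks earlier. Ties are
    broken alphabetically, which is an arbitrary choice made here only to
    keep the output deterministic.
    """
    return sorted(off_target_counts, key=lambda g: (off_target_counts[g], g))
```

- With hypothetical counts such as `{"gRNA_A": 12, "gRNA_B": 3, "gRNA_C": 40}`, the function returns `["gRNA_B", "gRNA_A", "gRNA_C"]`.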
- the present disclosure provides a method for determining the relative specificity of a plurality of Cas nuclease variants or mutants of any of the variants comprising
- the present disclosure provides a method for determining the relative genotoxicity of a plurality of guide RNAs comprising
- a guide RNA having fewer off-target sites and fewer DNA translocations is more specific than a guide RNA having more off-target sites and more DNA translocations.
- Fig. 1 illustrates an embodiment in this disclosure: PEAC-seq.
- Fig. 1A is a schematic representation of a PEAC-seq experimental procedure.
- the gDNA was extracted and underwent Tn5 tagmentation.
- the Tn5 was embedded with UMI-adaptors to eliminate PCR duplications. After tagmentation, fragments were amplified by pairs of primers (one priming at the PEAC-seq insertion, another priming with the Tn5 adaptor) .
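- The role of the UMI adaptors in removing PCR duplicates can be illustrated as follows. This is a simplified sketch, not the disclosed pipeline: it assumes that two reads are PCR duplicates when they share the same UMI, chromosome, fragment start, and strand, and the record layout and example reads are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def count_unique_molecules(reads: Iterable[Dict]) -> Counter:
    """Collapse PCR duplicates and count unique molecules per position.

    Each read is a dict with keys 'umi', 'chrom', 'start', and 'strand'.
    Reads sharing all four values are treated as copies of one molecule.
    """
    molecules = {(r["umi"], r["chrom"], r["start"], r["strand"]) for r in reads}
    return Counter((chrom, start) for _umi, chrom, start, _strand in molecules)


# Hypothetical reads for illustration only.
example_reads = [
    {"umi": "AACGT", "chrom": "chr1", "start": 1000, "strand": "+"},
    {"umi": "AACGT", "chrom": "chr1", "start": 1000, "strand": "+"},  # duplicate
    {"umi": "GGTCA", "chrom": "chr1", "start": 1000, "strand": "+"},
    {"umi": "AACGT", "chrom": "chr2", "start": 500, "strand": "-"},
]
```

- Here the four example reads collapse to three molecules: two at chr1:1000 and one at chr2:500.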
- Fig. 1B is a schematic representation of the two forward primers and two reverse primers designed for tag enrichment and library preparation of PEAC-seq.
- Each forward primer was paired with a downstream Tn5 primer to generate amplicons including the PEAC-seq tag sequence and its downstream genomic sequences.
- Each reverse primer was paired with an upstream Tn5 primer to generate amplicons including the PEAC-seq tag sequence and its upstream genomic sequences.
- five Amplicon-seq datasets from the three forward primers and two reverse primers were generated, and six candidate lists of putative off-targets were inferred from the five Amplicon-seq datasets using a modified GUIDE-seq analysis pipeline.
- Figs. 1C-1E are Venn diagrams showing the shared and unique off-targets identified by PEAC-seq and GUIDE-seq.
- Fig. 2 shows an analysis of PEAC-seq off-target sites.
- Fig. 2A is the visualization of PEAC-seq on-target and off-target sites.
- the symbol ‘*’ represented a PEAC-seq site that was also called by the GUIDE-seq.
- the symbol ‘**’ represented a PEAC-seq off-target (PEAC-seq-unique) that was identified by Amplicon-seq but not called by the GUIDE-seq.
- PEAC score: quantitative enrichment of the PEAC-seq tag at the edited sites.
- PEAC-ID: each site (on-target or off-target) identified by PEAC-seq was assigned a PEAC-ID, ordered by PEAC score in descending order.
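- The ordering by PEAC score can be sketched as a simple ranking. The details below are assumptions made for illustration: IDs are integers starting at 1, ties are broken by site name, and the site names and scores in the usage note are hypothetical. How the PEAC score itself is computed is not specified here and is taken as an input.

```python
from typing import Dict, List, Tuple


def assign_peac_ids(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Rank sites by PEAC score (descending) and number them from 1.

    Returns a list of (peac_id, site, score) tuples. Sites with equal
    scores are ordered by name so that the result is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(i, site, score) for i, (site, score) in enumerate(ranked, start=1)]
```

- For instance, `assign_peac_ids({"siteA": 12.0, "siteB": 40.5})` places `siteB` first with ID 1 and `siteA` second with ID 2.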
- Fig. 2B shows that the read counts at the sites shared by PEAC-seq and GUIDE-seq are highly correlated.
- Fig. 2C shows screenshots of PEAC-seq signal tracks from the IGV Genome Browser.
- One on-target site, one shared off-target site, and one PEAC-seq unique off-target site were presented.
- signals from both the PEAC-seq and the wild-type (WT, no Cas9-MMLV treatment) samples were included.
- the first track represented signals from the amplicons of a forward primer and a downstream Tn5 primer
- the second track represented signals from the amplicons of a reverse primer and an upstream Tn5 primer.
- the model on the right side showed the direction of spacer and PAM of each case.
- Fig. 2D shows that the shared off-targets (grey bars) tend to have fewer mismatches compared to the on-target site, while the PEAC-seq unique sites (slashed bars) and the GUIDE-seq unique sites (dashed bars) tend to have more mismatches.
- Fig. 2E shows the mutation frequencies were plotted at each position alongside the gRNA and PAM sequences (from 5’ to 3’ ) . From top to bottom are profiles of VEGFA TS1, TS2 and TS3.
- Fig. 3 shows PEAC-seq identified DNA translocations relevant to CRISPR genome editing.
- Fig. 3A shows signal tracks of one PEAC-seq site with unexpected upstream signals from the F-primer amplicon. Dashed grey bar: cutting site; Light grey and dark grey peaks: expected signals from the F-primer; White peak with board line and slashed bars: unexpected signals from the F-primer.
- Fig. 3B shows proposed models of the generation of unexpected upstream signals. Both the Receiver Site and the Donor Site can generate DSBs and are proximal to each other within the nucleus. Model (i) and Model (ii) joined DSB ends from the same Receiver Site. Model (iii), Model (iv) and Model (v) joined one Donor DSB and one Receiver DSB. If the Donor DSB carried the PEAC-seq insertion, the unexpected upstream signal would be observed at the Receiver Site. In the models, the gRNA location was set on the top strand.
- Fig. 3C shows the design of validation PCR to identify the genomic sequence of the Donor Sites.
- Two specific primers (Nest-F1 and Nest-F2) were designed upstream of the gRNA of the Receiver Site.
- the Nest-F1 and Nest-F2 were sequentially used with the downstream Tn5 primer, and two amplicons were generated.
- the 2nd amplicons were sent for Amplicon-seq.
- Fig. 3D shows the translocation cases identified by PEAC-seq + Amplicon-seq.
- Fig. 3E shows the translocation scores of all sites were plotted. The two arrows indicate the Receiver Site in Fig4D. A DNA translocation score was calculated as “translocation reads number” /(“normal reads number” + “translocation reads number” + 10) .
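- The translocation score defined above can be written as a small function. The formula is taken directly from the text; the function and parameter names are introduced here for illustration, and the constant 10 is exposed as a `pseudocount` argument as an assumption about its role (damping scores at sites with few reads).

```python
def translocation_score(translocation_reads: int, normal_reads: int,
                        pseudocount: int = 10) -> float:
    """Compute translocation / (normal + translocation + pseudocount)."""
    if translocation_reads < 0 or normal_reads < 0:
        raise ValueError("read counts must be non-negative")
    return translocation_reads / (normal_reads + translocation_reads + pseudocount)
```

- As a worked example, a site with 10 translocation reads and 80 normal reads scores 10 / (80 + 10 + 10) = 0.1, while a site with no reads at all scores 0 rather than raising a division error, because of the added constant in the denominator.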
- Fig. 4 shows PEAC-seq identified pcsk9 off-targets from an edited mouse embryo.
- Fig. 4A is a schematic representation of an in vivo PEAC-seq experiment.
- Fig. 4B is a Venn diagram showing the overlap between the PEAC-seq on-target and off-targets of PCSK9 and the top 18 editing sites (including the on-target) identified by DISCOVER-seq.
- Fig. 4C is the sequence visualization of the PCSK9 on-target and off-targets.
- One off-target was identified from one of the two embryos. The site was also reported by DISCOVER-seq and validated by Amplicon-seq. The scale bar represented the indel frequency reported by CRISPResso.
- Fig. 4D shows signal tracks of the on-target and off-target sites identified from PEAC-seq in two different embryos and wild-type control.
- the signal of the WT control at chr4: 106463845 was 1000-fold lower than the samples and was considered as background.
- Fig. 5 illustrates ePEAC-seq, an enhanced version of PEAC-seq with higher sensitivity to identify off-targets.
- Fig. 5A is a schematic representation of the five modified versions of PEAC-seq.
- Fig. 5B shows the insertion frequencies of PEAC-seq tag in PEAC-seq and its five modifications.
- Fig. 5C is the Venn diagram of EMX1 off-targets identified by PEAC-seq and GUIDE-seq.
- Fig. 5D shows that ePEAC-seq identified two more verified off-targets that were missed by PEAC-seq.
- Fig. 6 shows genomic context of PEAC-seq off-target and translocations.
- Fig. 6A shows signals of the ATAC-seq peaks and ChIP-seq peaks of multiple histone modifications and proteins surrounding the PEAC-seq off-targets.
- Fig. 6B shows signals of the DSB surrounding the PEAC-seq translocation sites (left panel) and random controls (right panel) .
- Fig. 7 illustrates library preparation and modified GUIDE-seq pipeline to generate six lists of candidate sites.
- Amplicons were enriched by PEAC-seq insertion-specific primers and Tn5 primers. Three forward primers and two reverse primers were used with the upstream (light blue) and downstream (yellow) Tn5 primers, in five separate PCR reactions. A total of five NGS libraries were generated and sequenced.
- a modified GUIDE-seq analysis pipeline was applied, and six lists of candidate sites were generated, one from each pair of forward and reverse primers.
- Fig. 8 shows indel frequency and tag insertion ability of a Cas9-MMLV system.
- Fig. 8A shows the results from Amplicon-seq, which was conducted to quantify the indel frequency of ten on-target sites.
- the indels were generated by Cas9 or Cas9-MMLV.
- Fig. 8B shows the frequency of tag insertion estimated at the same ten on-target sites.
- Fig. 9 shows signal tracks of PEAC-seq at VEGFA TS1. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 10 shows signal tracks of PEAC-seq at VEGFA TS2.
- Fig. 10A is a Venn diagram showing the overlap of on-target and off-targets of VEGFA TS2 between PEAC-seq and GUIDE-seq. Eighty-one sites were overlapped. Seventy-one sites were GUIDE-seq unique and thirty-four sites were PEAC-seq unique.
- Fig. 10B is a GUIDE-seq visualization output of PEAC-seq sites at VEGFA TS2.
- Fig. 10C shows signal tracks of PEAC-seq sites at VEGFA TS2. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 11 shows signal tracks of PEAC-seq at VEGFA TS3.
- Fig. 11A is a Venn diagram showing the overlap of on-target and off-targets of VEGFA TS3 between PEAC-seq and GUIDE-seq. Thirty-five sites overlapped. Twenty-five sites were GUIDE-seq unique, and eight sites were PEAC-seq unique.
- Fig. 11B is a GUIDE-seq visualization output of PEAC-seq sites at VEGFA TS3.
- Fig. 11C shows signal tracks of PEAC-seq sites at VEGFA TS3. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 12 shows signal tracks of PEAC-seq at EMX1.
- Fig. 12A is a GUIDE-seq visualization output of PEAC-seq sites at EMX1.
- Fig. 12B shows signal tracks of PEAC-seq sites at EMX1. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 13 shows signal tracks of PEAC-seq at RNF2.
- Fig. 13A is a Venn diagram showing the overlap of on-target and off-targets of RNF2 between PEAC-seq and GUIDE-seq. One site was called by both methods.
- Fig. 13B is a GUIDE-seq visualization output of PEAC-seq sites at RNF2.
- Fig. 13C shows signal tracks of PEAC-seq sites at RNF2. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 14 shows signal tracks of PEAC-seq at FANCF.
- Fig. 14A is a GUIDE-seq visualization output of PEAC-seq sites at FANCF.
- Fig. 14B shows signal tracks of PEAC-seq sites at FANCF. Chromosome locations and the overlap with GUIDE-seq were also shown.
- Fig. 15 shows the translocations called by PEAC-seq.
- Fig. 15A shows the primerE.geometric_mean and translocation rates of on-target and off-target sites called by PEAC-seq. primerE.geometric_mean: geometric mean of the numbers of reads, with distinct molecular indices, amplified by the forward and reverse primers. Translocation Rate: the ratio of reads amplified by the PEAC-seq forward primer but with reverse orientation.
- Figs. 15B and 15C show two of the translocation sites with highest translocation rates called by PEAC-seq, which were validated by unidirectional targeted sequencing (UDiTaS) .
- Circos plots show the chromosome rearrangements at the receiver sites Translocation Validation site1 (chr22: 37266776-37266799) (15B) and Translocation Validation site2 (chr14: 61612048-61612071) (15C) . Both sites are off-targets of VEGFA TS3. Arcs were used to represent the rearrangements between the Translocation validation sites and other sites. The receiver sites were marked as diamonds, and the known VEGFA TS3 off target sites were marked as stars.
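- The two per-site quantities described for Fig. 15A can be sketched as below. Several points are assumptions rather than statements from the disclosure: the geometric mean is taken over exactly two counts (forward-primer and reverse-primer reads after UMI deduplication), and the translocation rate uses all forward-primer reads at the site as its denominator. Function and parameter names are hypothetical.

```python
import math


def primer_geometric_mean(forward_reads: int, reverse_reads: int) -> float:
    """Geometric mean of UMI-deduplicated forward- and reverse-primer reads."""
    if forward_reads < 0 or reverse_reads < 0:
        raise ValueError("read counts must be non-negative")
    return math.sqrt(forward_reads * reverse_reads)


def translocation_rate(reverse_oriented_forward_reads: int,
                       total_forward_reads: int) -> float:
    """Fraction of forward-primer reads found in the reverse orientation.

    The denominator (all forward-primer reads at the site) is an assumption.
    """
    if total_forward_reads <= 0:
        raise ValueError("total_forward_reads must be positive")
    if not 0 <= reverse_oriented_forward_reads <= total_forward_reads:
        raise ValueError("reverse-oriented reads must be within the total")
    return reverse_oriented_forward_reads / total_forward_reads
```

- Using the geometric mean rather than the arithmetic mean penalizes sites supported by only one primer direction: a site with 4 forward and 9 reverse reads scores 6, whereas a site with 13 forward and 0 reverse reads scores 0.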
- Fig. 16 shows PEAC-seq identified mPnpla3 off-targets from edited mouse embryo.
- Fig. 16A is a Venn diagram that shows the overlap between the PEAC-seq on-target and off-targets of Pnpla3 and the top 21 off-targets validated by WGS (Anderson et al., 2018). Three cleavage sites were identified from two different embryos in our study. All three sites were reported previously.
- Fig. 16B is a sequence visualization of the Pnpla3 on-target and off-targets.
- One off-target site was identified by both embryos, and each embryo identified an embryo-specific off-target. All three off-targets were reported previously and also verified by Amplicon-NGS.
- Fig. 16C shows signal tracks of the on-target and off-target sites identified by PEAC-seq in two different embryos and a wild-type control.
- Fig. 17 illustrates ePEAC-seq, an enhanced version of PEAC-seq.
- the Venn diagrams show the VEGFA TS2 off-targets identified by GUIDE-seq and PEAC-seq.
- Fig. 18 illustrates ePEAC-seq, an enhanced version of PEAC-seq.
- the Venn diagrams show the EMX1 off-targets identified by GUIDE-seq and PEAC-seq.
- Fig. 19 illustrates mut-pegRNA, an enhanced version of the pegRNA for PEAC-seq.
- Random nucleotide was incorporated into the PBS region of pegRNAs to improve the binding between pegRNA and off-targets with PBS mismatches.
- RTT is an insertion sequence RT template. This illustration shows five different mut-pegRNAs, each including one random nucleotide shown as “N”.
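- The mut-pegRNA idea, one random nucleotide per PBS, can be sketched as follows. The positions that are randomized in the actual constructs are shown in the figure and are not reproduced here, so the positions, the placeholder PBS sequence, and the function names below are assumptions for illustration only.

```python
from typing import Iterable, List


def mut_pbs_variants(pbs: str, positions: Iterable[int]) -> List[str]:
    """Return one PBS variant per position, with that base replaced by 'N'."""
    variants = []
    for i in positions:
        if not 0 <= i < len(pbs):
            raise IndexError(f"position {i} is outside the PBS")
        variants.append(pbs[:i] + "N" + pbs[i + 1:])
    return variants


def expand_single_n(seq: str, alphabet: str = "ACGU") -> List[str]:
    """List the concrete sequences represented by a sequence with one 'N'.

    Only the first 'N' is expanded, which is sufficient here because each
    mut-pegRNA variant carries exactly one random nucleotide.
    """
    i = seq.find("N")
    if i == -1:
        return [seq]
    return [seq[:i] + base + seq[i + 1:] for base in alphabet]
```

- For a placeholder PBS `"ACGUACGU"` and positions 0 through 4, `mut_pbs_variants` yields five variants, and each variant expands to four concrete sequences, one of which is the original PBS.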
- Fig. 20 shows target sites identified by PEAC-seq in cellulo.
- Figs. 20A-20F show the Cas9 target sequences (i.e., cleavage sites) identified by PEAC-seq targeting six genes (VEGFA TS1 (Fig. 20A), VEGFA TS2 (Fig. 20B), VEGFA TS3 (Fig. 20C), EMX1 (Fig. 20D), RNF2 (Fig. 20E) and FANCF (Fig. 20F)), and their chromosome locations.
- the number of mismatches is also shown.
- the cleavage site with 0 mismatch is the on-target site and the others are off-target sites.
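- The on-target versus off-target rule above reduces to counting mismatches against the target sequence. The sketch below assumes equal-length, gap-free comparison (no bulges), which is a simplification; the function names and the sequences in the usage note are hypothetical.

```python
def count_mismatches(site: str, target: str) -> int:
    """Count positions at which a cleavage site differs from the target."""
    if len(site) != len(target):
        raise ValueError("site and target must have the same length")
    return sum(a != b for a, b in zip(site.upper(), target.upper()))


def classify_site(site: str, target: str) -> str:
    """A site with 0 mismatches is on-target; any other site is off-target."""
    return "on-target" if count_mismatches(site, target) == 0 else "off-target"
```

- For example, against a hypothetical target `"GACCCTGG"`, the site `"GACCCTGG"` is classified as on-target, while `"GATCCTGA"` has two mismatches and is classified as off-target.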
- Fig. 21 shows target sites identified by PEAC-seq in vivo.
- Fig. 21A shows the Cas9 target sequences (i.e., cleavage sites) from Embryo #5 and Embryo #12 identified by PEAC-seq targeting Pcsk9.
- Fig. 21B shows the Cas9 target sequences (i.e., cleavage sites) from Embryo #21 and Embryo #31 identified by PEAC-seq targeting Pnpla3.
- Fig. 22 shows the primers, oligos, and vectors used in the development of PEAC-seq.
- Fig. 23 shows the primers used in the validation of chromosome translocation. (Examples 1, 3)
- Fig. 24 shows the primers and vectors used in PEAC-seq and Amplicon-NGS in vivo.
- Fig. 25 shows the result of Amplicon-seq validation.
- Fig. 26 shows the insertion efficiency of four insertion sequences on four target sites, and the nucleotide composition of each insertion sequence.
- Fig. 27 shows the insertion efficiency of two insertion sequences at HEK3 site, and the nucleotide composition of each insertion sequence.
- Fig. 28 shows off-targets sites of VEGFA TS2 called with GUIDE-seq, PEAC-seq, and ePEAC-seq.
- nucleic acids are written left to right in the 5' to 3' orientation, and amino acid sequences are written left to right in amino to carboxy orientation.
- a “variant” refers to a varied form of a subject, which includes wild-type forms and naturally occurring or artificially mutant forms.
- an “insertion sequence” refers to a DNA sequence that is encoded by the RT template comprised in a guide RNA and the products reverse transcribed from this RT template. Both partial and full-length products may exist in a reverse transcription. When “insertion sequence” is used to refer to the reverse transcription products, it includes both the partial and full-length products.
- a guide RNA refers to a synthetic or expressed RNA sequence that comprises a CRISPR binding motif and a spacer.
- a “spacer” is a DNA-targeting motif, which is a sequence that is complementary to a target specific DNA region.
- a CRISPR binding motif is sometimes called a “scaffold.”
- the CRISPR binding motif of a guide RNA can bind to a Cas enzyme and DNA-targeting motif of the gRNA can guide the complex to a specific target location on a DNA.
- a gRNA may further comprise an insertion sequence RT template.
- the guide RNA is a pegRNA.
- a “complex” refers to a system of components that achieves a function as disclosed herein, e.g., detecting cleavage sites of CRISPR/Cas nucleases and DNA translocations at cleavage sites. Some or all of the components of the system may be connected (covalently or non-covalently associated) or not connected.
- a “fusion protein” is a protein comprising at least two domains that are encoded by separate genes that have been joined into a single polypeptide.
- a fusion protein can comprise two domains that are encoded by separate genes that have been joined so that they are transcribed and translated as a single unit, producing a single polypeptide.
- the at least two domains are fused together directly.
- the domains are connected by one or more linkers.
- the present disclosure provides a new method for identifying Cas protein cleavage sites.
- This method can be used for off-target identification. As explained below, this takes advantage of the sequence insertion ability from the Prime Editor (PE) , so it is referred to as PEAC-seq (Prime Editor Assisted off-target Characterization) in some embodiments (e.g., as illustrated in Fig. 1A, Fig. 17, and Fig. 18) .
- the Prime Editing system is a “search-and-replace” genome editing technology that mediates targeted insertions, deletions, and base-to-base conversions and combinations thereof in human cells without the need for double strand breaks (DSBs) or donor DNA templates.
- Prime Editors use a reverse transcriptase (RT) fused to an RNA-programmable nickase (e.g., Cas9 nickase) and a prime editing guide RNA (also known as pegRNA) to copy genetic information directly from an extension on the pegRNA into the target genomic locus (Anzalone et al., 2019) .
- the template sequence on the pegRNA extension will be reverse transcribed into DNA and hybridize to the unedited complementary strand with the help of another endonuclease (e.g., FEN1) .
- the native PE system utilizes a pegRNA (Prime Editor gRNA) containing extra sequences at the 3’ end of the gRNA, which serve as a priming site and reverse transcriptase template, allowing reverse transcription from the exposed 3’-hydroxyl group of the non-targeting strand to incorporate additional DNA sequences into the cleavage sites.
- in PEAC-seq, the sequences reverse transcribed from the template and inserted into the cleavage sites are used as labels in subsequent enrichment and identification of the cleavage sites.
- An optimized reverse transcriptase (RT) template is used to incorporate PEAC-seq label sequences, which were further used to represent and enrich the local sequences of the editing sites from the genome, including both on-target and off-target sites.
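As a non-limiting illustration of how an inserted label can serve as a handle for identifying cleavage sites, the sketch below scans sequencing reads for a label in either orientation and keeps the flanking genomic sequence for later alignment. The label string, the reads, and the function names are hypothetical placeholders chosen for this example; they are not SEQ ID NO: 1 and not the analysis pipeline of the disclosure.

```python
# Hypothetical 21-nt label used only for illustration; it is NOT SEQ ID NO: 1.
LABEL = "ACGTTGCAAGCTTCGATCGGA"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_flanks(read: str, label: str = LABEL):
    """Return (left_flank, right_flank, orientation) if the label occurs
    in the read in either orientation, otherwise None."""
    for orientation, probe in (("+", label), ("-", revcomp(label))):
        idx = read.find(probe)
        if idx != -1:
            return read[:idx], read[idx + len(probe):], orientation
    return None


# Toy reads: the first carries the label, the second does not.
reads = ["TTTT" + LABEL + "GGGG", "ACGTACGT"]
labelled = [hit for hit in map(extract_flanks, reads) if hit is not None]
print(labelled)
```

In practice the flanks would be aligned to the reference genome to locate the on-target and off-target sites; exact string matching is used here only to keep the sketch short.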
- the PEAC-seq method in the present disclosure replaces the Cas9 nickase in the Prime Editing system with a Cas9, which creates DSBs in the genomic DNA. By creating DSBs, the newly reverse transcribed DNA sequences will be inserted into the cleavage site at a higher efficiency.
- PEAC-seq accompanies the process of CRISPR editing and label insertion, which ensures the consistency between editing events and PEAC-seq signals.
- When PEAC-seq was applied to a few promiscuous sites in both in cellulo and in vivo samples, it effectively identified off-targets, as shown by comparison with the results of GUIDE-seq, DISCOVER-seq, WGS, and Amplicon-seq.
- DNA translocations can be successfully identified. DNA translocations cannot be directly profiled by currently available methods and are typically more toxic to cells.
- PEAC-seq is an unbiased method of identifying CRISPR off-targets and off-target-related DNA translocations. Because it bypasses the addition of high-molarity exogenous dsODNs, PEAC-seq also holds immense potential to identify off-targets and translocations for in vivo CRISPR editing, which would be particularly valuable for translational studies.
- Off-target detection is crucial to biotechnological and clinical applications of the CRISPR technology. Over the past years, many designs have been applied to profile off-targets in vitro and in cellulo. These methods often involve the addition of exogenous dsODNs or chemicals, which limits their application in vivo. Besides these experimental approaches, computational algorithms that consider diverse features of the gRNA have also contributed to generating candidate off-target lists. However, how well these alternative approaches reflect the cellular context remains a concern.
- PEAC-seq uses a label sequence encoded within the CRISPR-Cas system and inserts it along with the cleavage into the cleavage sites.
- PEAC-seq has successfully identified and validated off-targets both in HEK293T and in mouse embryos.
- a Cas nuclease that is capable of creating double-strand breaks (DSB) such as Cas9
- Cas9n the Cas9 nickase
- a pegRNA comprises an insertion sequence RT template for inserting a label sequence by reverse transcription for subsequent enrichment.
- the Cas/pegRNA creates DSBs in the genome at both on-target and off-target sites, and the label sequence is introduced at the DSB sites through reverse transcription from the pegRNA and incorporated into the genome through the NHEJ pathway of DNA repair.
- the insertion sequence RT template can be reverse transcribed either in full or in part.
- the insertion sequence RT template can be designed with the following considerations: (1) avoiding RNA secondary structure within the inserted sequence (i.e., the label sequence) and between the inserted sequence and the gRNA scaffold; (2) sequence uniqueness to the host genome; (3) sufficient length for efficient annealing of PCR primers for enrichment.
- the insertion sequence RT template encodes a 21-nt sequence of SEQ ID NO: 1.
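As a non-limiting illustration, the three design considerations above can be turned into a crude screening function. Everything in the sketch is an assumption made for this example: the thresholds (`min_len`, `max_stem`), the stem-finding heuristic used as a stand-in for real RNA folding, and exact substring search used as a stand-in for genome alignment. It also does not evaluate structure formed between the label and the gRNA scaffold.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_stem(seq: str, min_loop: int = 3) -> int:
    """Length of the longest segment whose reverse complement occurs
    downstream of it (at least min_loop bases away). This is only a
    rough proxy for hairpin-forming potential."""
    best = 0
    n = len(seq)
    for i in range(n):
        # A stem of length k at i implies one of length k - 1 at i,
        # so it is enough to try to beat the current best.
        for k in range(best + 1, n - i + 1):
            if seq.find(revcomp(seq[i:i + k]), i + k + min_loop) == -1:
                break
            best = k
    return best


def check_label(label: str, genome: str, min_len: int = 18, max_stem: int = 4) -> dict:
    """Screen a candidate label against the three considerations.
    min_len and max_stem are illustrative thresholds, not values
    taken from the disclosure."""
    return {
        "long_enough": len(label) >= min_len,
        "unique_to_genome": label not in genome and revcomp(label) not in genome,
        "low_self_complementarity": longest_stem(label) <= max_stem,
    }


# Toy "genome" and candidate label, both hypothetical.
toy_genome = "TTTTGGGGCCCCAAAA" * 3
print(check_label("ACGTTGCAAGCTTCGATCGGA", toy_genome))
```

A real design workflow would replace `longest_stem` with a dedicated RNA folding tool and the substring test with a genome-wide aligner.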
- DNA translocation is also referred to as chromosome translocation, or chromosome rearrangement.
- in a translocation, a segment from one chromosome is transferred to a nonhomologous chromosome or to a new site on the same chromosome.
- Chromosomal translocations appear to arise from improper repair of DNA double-strand breaks (DSBs) , which are highly toxic lesions.
- DSBs DNA double-strand breaks
- the “guardians” of genome integrity mostly ensure reliable repair of DSBs; also, unrepaired DSBs can lead to apoptosis or senescence.
- imprecise repair of DSBs has the potential to be highly deleterious, as it can lead to genome instability, including the formation of chromosomal rearrangements.
- chromosomal translocations can arise when DNA ends from DSBs on two heterologous chromosomes are improperly joined.
- the DSB-induced DNA rearrangements, which have not been systematically evaluated by other CRISPR off-target identification techniques, would cause severe chromosomal aberrations including large fragment deletion, inversion, and translocation.
- the resulting PCR amplicons can be used as indicators of chromosome rearrangements, as they can distinguish whether an amplicon came from the joining of the expected DSB ends. It has also been noted that the occurrence of DNA translocation is independent of the frequency of DSBs at a particular site, which indicates that other factors, e.g., position or DSB context sequences, might contribute to translocation (Wei et al., Cell 2016).
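As a non-limiting illustration of how amplicons can indicate rearrangements, the sketch below classifies a junction from the genomic coordinates of the two sequences flanking an inserted label. The `End` record, the category names, and the `max_local_gap` threshold are hypothetical choices made for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class End:
    """Alignment of one flank of a labelled amplicon (hypothetical record)."""
    chrom: str
    pos: int
    strand: str  # "+" or "-"


def classify_junction(left: End, right: End, max_local_gap: int = 50) -> str:
    """Classify a junction by comparing where its two flanks align.
    max_local_gap is an illustrative cutoff for 'same DSB site'."""
    if left.chrom != right.chrom:
        return "translocation"
    if left.strand != right.strand:
        return "inversion"
    if abs(right.pos - left.pos) > max_local_gap:
        return "large deletion or intrachromosomal rearrangement"
    return "expected DSB ends"


# Toy coordinates, for illustration only.
print(classify_junction(End("chr2", 1000, "+"), End("chr2", 1003, "+")))
print(classify_junction(End("chr2", 1000, "+"), End("chr5", 88000, "+")))
```

Flanks that align next to each other at one site are consistent with simple label insertion at a DSB, whereas flanks that align to different chromosomes point to the joining of ends from two distinct DSBs.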
- both translocation profiling methods and genotoxicity assessment need to be developed for CRISPR translational applications.
- the presently disclosed methods, e.g., PEAC-seq, can detect DNA translocation in genomic DNA and address at least some of the problems in the art.
- the PEAC-seq method and DISCOVER-seq both rely on agent signals that accompany the cleavage events.
- DISCOVER-seq is not as accurate and efficient as the PEAC-seq method because it uses MRE11 ChIP-seq signals to represent the DSB events occurring in the edited cells, whereas ChIP-seq by nature captures only a snapshot of MRE11 binding and might not reveal the off-target sites that arise over the course of editing.
- PEAC-seq relies on the enrichment of an inserted PCR handle. A random sequence screen demonstrated good efficiency of long insertions. Increasing the cell population may further increase the sensitivity of PEAC-seq, as demonstrated by the two verified PEAC-seq-unique off-targets in cellulo. PEAC-seq provides a versatile tool to enhance our understanding of the occurrence of off-targets in different contexts, and is a very informative alternative to the costly WGS.
- the insertion efficiency of a PEAC-seq label sequence is important for the detection accuracy and efficiency.
- the insertion efficiency may vary across different pegRNAs and at different off-targets.
- Recent studies have reported a variety of modifications to the native PE system to increase the editing efficiency, including modifications on pegRNA, MMLV, and transient expression of a dominant negative DNA mismatch repair (MMR) protein, such as the MLH1dn protein (Nelson et al., Nat Biotechnol 2022; Zong et al., Nat Biotechnol 2022; Chen et al., Cell 2021) .
- MMR DNA mismatch repair
- incorporating epegRNA is an effective method to improve the insertion efficiency of PEAC-seq labels, which, for example, rescued two missing off-targets from EMX1 PEAC-seq (see Example 5) .
- Unprotected nuclear RNAs are susceptible to degradation from both the 5′ and 3′ termini by exonucleases.
- the 3′ extension of pegRNAs is likely to be exposed in cells and thus more susceptible to exonucleolytic degradation.
- the pegRNA comprises a structural RNA motif at its 3’ end. Specifically, addition of structured RNA motifs at the 3’ end of the pegRNA can improve pegRNA stability and minimize degradation.
- structured RNA motif and “RNA structural motif” are used interchangeably, and they refer to a piece of RNA with a defined secondary and/or tertiary structure.
- the RNA structural motif is a modified prequeosine1-1 riboswitch aptamer (evopreQ1, SEQ ID NO: 75) . (Nelson et al., Nat Biotechnol 2022; Zong et al., Nat Biotechnol, 2022) .
- the RNA structural motif is a frameshifting pseudoknot from Moloney murine leukemia virus (MMLV) (mpknot, SEQ ID NO: 503) (Chen et al., Cell 2021). In some embodiments, some unnecessary sequence can be trimmed off from the RNA structural motif to remove extraneous sequences while maintaining the pegRNA’s editing efficiency.
- MMLV Moloney murine leukemia virus
- mpknot SEQ ID NO: 503
- a viral exoribonuclease-resistant RNA (xrRNA) motif is appended to the 3’ end of the pegRNA. This modification can increase pegRNA’s resistance against degradation.
- the Xrn1-resistant RNAs (xrRNAs) are a group of conserved structures found in flaviviruses, including Dengue, Yellow fever, West Nile, and Zika. Located at the beginning of the 3’ untranslated region (3’ -UTR) of the viral genome, such structure protects the downstream viral RNA from degradation by the 5’ -3’ exoribonuclease Xrn1, resulting in the production of a non-coding sub-genomic viral RNA that functions to enhance viral pathogenicity.
- the xrRNAs adopt a characteristic knot-like structure that is thought to mechanically impede Xrn1 processing from the 5’ direction. Recent evidence demonstrated that even under bidirectional pulling forces, the xrRNA motif exhibited a remarkably high level of mechanical rigidity and resistance to unfolding. (Zhang, Guiquan, et al. "Enhancement of prime editing via xrRNA motif-joined pegRNA. " Nature communications 13.1 (2022) : 1-12. )
- the insertion efficiency of PEAC-seq may depend on the length and sequence composition of the insertion sequence (Fig. 26) .
- the RNA secondary structure of the insertion sequence and sequence uniqueness to the host genome can vary. But the present disclosure provides several considerations to be taken into account in designing insertion sequence RT template. As long as these considerations are taken into account, the insertion sequence (as well as the insertion sequence RT template) is exchangeable. (Figs. 26-27) .
- the present disclosure provides three insertion sequences, which are SEQ ID NOs: 1 and 497-498, and the insertion sequence RT templates of them are SEQ ID NOs: 504-506, respectively.
- the PBS primer binding site
- the PBS is a 13-nt sequence as in the native PE system (see Anzalone et al., Nature 2019) .
- the PBS is a 17-nt sequence. The present disclosure provides that both the 13-nt and 17-nt PBS worked well in the methods disclosed herein.
- the PBS sequences, which are designed to be complementary to the on-target sites, can have mismatches at off-target sites. Many off-targets with PBS mismatches were successfully identified by PEAC-seq, indicating that the effects of PBS mismatches on reverse transcription are complex.
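As a non-limiting illustration, PBS mismatches at a candidate off-target can be counted by comparing the on-target and off-target protospacers over the window that the PBS pairs with. The sketch assumes an SpCas9-like cut 3 nt upstream of the PAM, so that the primer is the stretch of protospacer ending at the cut, and assumes an ungapped alignment of equal-length sequences. The sequences and function names are hypothetical.

```python
def pbs_window(protospacer: str, pbs_len: int = 13, cut_offset: int = 3):
    """0-based [start, end) slice of the protospacer (PAM-strand, 5'->3')
    that anneals to the PBS, assuming the cut lies cut_offset nt
    upstream of the PAM."""
    end = len(protospacer) - cut_offset
    start = end - pbs_len
    if start < 0:
        raise ValueError("PBS is longer than the primer region")
    return start, end


def pbs_mismatches(on_target: str, off_target: str, pbs_len: int = 13, cut_offset: int = 3):
    """List (1-based position, on-target base, off-target base) for every
    mismatch that falls inside the PBS window."""
    if len(on_target) != len(off_target):
        raise ValueError("sequences must be aligned and of equal length")
    start, end = pbs_window(on_target, pbs_len, cut_offset)
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(on_target, off_target))
        if start <= i < end and a != b
    ]


# Hypothetical 20-nt protospacers differing at positions 1, 10, and 19.
on_site = "GACCTTGGAACCGGTTAACC"
off_site = "AACCTTGGATCCGGTTAATC"
print(pbs_mismatches(on_site, off_site, pbs_len=13))
print(pbs_mismatches(on_site, off_site, pbs_len=17))
```

A longer PBS covers more of the protospacer, so the same off-target can present more PBS mismatches with a 17-nt PBS than with a 13-nt PBS.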
- the present disclosure further provides that in some embodiments, including random nucleotides in the PBS region of pegRNA can improve the extension efficiency at off-targets with PBS mismatches.
- the PBS in the pegRNA comprises random nucleotides, for example, proximal to the primer extension site.
- pegRNAs whose PBS comprises random nucleotides are referred to as mut-pegRNAs.
- pegRNA designed from the on-target sequence can enable PEAC-seq tag insertion in most off-target sites, and the incorporation of mut-pegRNA may improve the insertion efficiency of PEAC-seq tags in some off-target sites with critical PBS mismatches.
- a mix of pegRNA and mut-pegRNAs may also increase the insertion efficiency of the PEAC-seq tag.
- the mix has 50% pegRNA and 50% mut-pegRNAs.
- the mix comprises more than one mut-pegRNA, such as two, three, four, or five different mut-pegRNAs.
- the mix comprises five different mut-pegRNAs, e.g., as shown in Fig. 19, with 10% of each.
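As a non-limiting worked example of the proportions above, the sketch below splits a total amount of guide into 50% pegRNA and five mut-pegRNAs at 10% each. The function name and the species labels are hypothetical, and the disclosure does not state here whether the percentages are by mass or by moles, so the total is treated as unitless.

```python
def mix_amounts(total: float, n_mut: int = 5, peg_fraction: float = 0.5) -> dict:
    """Split a total guide amount between the pegRNA and n_mut
    mut-pegRNAs that share the remainder equally."""
    if n_mut < 1 or not 0 < peg_fraction < 1:
        raise ValueError("need at least one mut-pegRNA and 0 < peg_fraction < 1")
    amounts = {"pegRNA": total * peg_fraction}
    per_mut = total * (1 - peg_fraction) / n_mut
    for i in range(1, n_mut + 1):
        amounts[f"mut-pegRNA-{i}"] = per_mut
    return amounts


for name, amount in mix_amounts(1000).items():
    print(name, amount)
```

With a total of 1000 units this gives 500 units of pegRNA and 100 units of each of the five mut-pegRNAs.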
- the kit disclosed herein comprises more than one guide RNA, polynucleotides encoding the guide RNAs, or vectors encoding the guide RNAs, wherein the guide RNAs are a mix of pegRNA and mut-pegRNA.
- the composition disclosed herein comprises more than one guide RNA, polynucleotides encoding the guide RNAs, or vectors encoding the guide RNAs, wherein the guide RNAs are a mix of pegRNA and mut-pegRNA.
- Evolving a reverse transcriptase with error-correcting activity may also improve primer extension efficiency. If a suitable enzyme can be evolved and characterized, its 3’ to 5’ exonuclease activity could correct mismatches between the PBS and off-targets.
- the insertion sequence RT template is provided separately from the gRNA.
- the insertion sequence RT template (RTT) is provided together with the primer binding sequence (PBS) as a separate RTT-PBS sequence.
- the RTT-PBS sequence comprises a MS2 hairpin.
- the RTT-PBS sequence is circular.
- the RTT-PBS sequence is linear.
- the present disclosure further provides that in some embodiments, the Cas nuclease and the reverse transcriptase are not fused.
- the Cas nuclease and reverse transcriptase are provided separately, for example, by two separate vectors.
- the separate Cas nuclease and/or reverse transcriptase are each fused with one or more tags which facilitate recruitment of the reverse transcriptase by the Cas nuclease or the pegRNA.
- the reverse transcriptase is fused with an MS2 coat protein, and the pegRNA is incorporated with multiple MS2 stem-loops.
- the reverse transcriptase is fused with a single-chain variable fragment (scFv)
- the Cas nuclease is fused with multiple copies of GCN4 peptide (this particular multi-peptide tag is called SunTag) .
- scFv single-chain variable fragment
- SunTag this particular multi-peptide tag
- the Cas nuclease is provided in two or more parts.
- the Cas nuclease is split into two parts and delivered by two separate vectors.
- each of the two parts of the Cas nuclease is connected with a trans splicing intein.
- the reverse transcriptase is a modified Moloney–murine leukemia virus reverse transcriptase (M-MLV RT) .
- M-MLV RT is composed of fingers, palm, thumb, and connection domains, each having a unique role in nucleotide incorporation during DNA synthesis.
- the M-MLV RT also comprises an RNase H domain that functions as a processive endonuclease cleaving the RNA strand in RNA–DNA heteroduplexes.
- the reverse transcriptase is a M-MLV RT with decreased or disrupted RNase H activity.
- the RNase H activity of the M-MLV RT is decreased or disrupted by one or more point mutations within the RNase H domain of the M-MLV RT, for example an Asp524Asn substitution.
- the whole RNase H domain is deleted from the M-MLV RT.
- the whole RNase H domain and the connection domain that is linked to the RNase H domain are deleted from the M-MLV RT.
- Some viral proteins can facilitate reverse transcription, such as the nucleocapsid (NC) protein that has nucleic acid chaperone activity affecting a variety of RT-related functions.
- the reverse transcriptase is a M-MLV RT which is fused with an NC protein.
- the NC protein is fused at the C terminus of the M-MLV RT.
- the NC protein is fused between the Cas nuclease and the M-MLV RT.
- the present disclosure provides that other optimization and modification to the Prime Editor system can also be similarly applied to the PEAC-seq methods disclosed herein.
- the PEAC-seq methods disclosed herein adopt the Prime Editor system, or a modified version of the Prime Editor system, to report CRISPR off-targets in cellulo and in vivo, and Cas-dependent DNA rearrangement.
- PEAC-seq further diversifies the CRISPR off-target identification toolbox and provides a reliable solution to directly identify off-targets for in vivo editing and recognize DNA rearrangements, which would strengthen our ability to assess genotoxicity in clinics.
- the present disclosure provides a complex comprising a Cas nuclease, a reverse transcriptase (RT) , and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the spacer, the scaffold, the insertion sequence RT template, and the PBS are arranged from 5’ to 3’ in the guide RNA.
- the present disclosure provides a complex comprising a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS) , wherein the PBS is located downstream to the 3’ end of the insertion sequence RT template, and wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the RTT-PBS sequence is a linear sequence or a circularized sequence.
- the RTT-PBS sequence further comprises an MS2 hairpin
- the reverse transcriptase is fused with an MS2 coat protein.
- MS2 is a 19-nucleotide viral (bacteriophage) RNA sequence present at the ribosomal binding site of the MS2 replicase mRNA, which folds into a hairpin loop structure. This hairpin loop is recognized with high specificity and affinity by the MS2 bacteriophage capsid RNA-binding protein (MS2 RBP; Kd of 3-300 × 10⁻⁹ M, depending on the stem-loop sequence and the MS2 RBP variant).
- expressing the MS2 RBP as a chimeric protein containing a peptide tag facilitates the isolation of the [MS2 RNA/MS2 protein] complex, together with other molecules present in the complex.
- MS2 has been used widely to tag RNA transcribed in vitro and in vivo for other applications.
- the double-strand break created by the Cas nuclease has blunt ends. In some embodiments, the double-strand break created by the Cas nuclease has sticky ends.
- DNA ends refer to the properties of the ends of linear DNA molecules, which are described as “sticky” or “blunt” based on the shape of the complementary strands at the terminus. In sticky ends, one strand is longer than the other (typically by at least a few nucleotides) , such that the longer strand has bases which are left unpaired. In blunt ends, both strands are of equal length, i.e., they end at the same base position, leaving no unpaired bases on either strand.
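As a non-limiting illustration of the blunt versus sticky distinction, the sketch below classifies a double-strand break from the cut position on each strand. Both positions are expressed on the top-strand coordinate axis as the number of base pairs to the left of the cut; this convention and the function name are assumptions made for the example.

```python
def classify_ends(top_cut: int, bottom_cut: int) -> str:
    """Classify a DSB from the cut position on each strand, both given
    as the number of base pairs to the left of the cut on the
    top-strand (5'->3', left-to-right) coordinate axis."""
    offset = top_cut - bottom_cut
    if offset == 0:
        return "blunt"
    # If the top strand is cut further right, the top strand of the left
    # fragment extends past the bottom strand with a free 3' end.
    kind = "3'" if offset > 0 else "5'"
    return f"sticky, {abs(offset)}-nt {kind} overhang"


print(classify_ends(17, 17))  # both strands cut at the same position
print(classify_ends(1, 5))    # staggered cut, as made by EcoRI at G^AATTC
```

The second call mirrors the familiar EcoRI case, in which the staggered cut leaves a 4-nt 5’ overhang.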
- the Cas nuclease is selected from Cas9, its variants, and mutants of any one of the variants.
- CRISPR clustered, regularly interspaced, short palindromic repeats
- Cas CRISPR-associated systems
- the present disclosure involves a Cas nuclease or a variant or a mutant of any of the variants thereof.
- All variants and mutants of Cas9 can be used in a method, composition, or kit disclosed herein, including but not limited to a wild-type Cas9 or a Cas9 nickase (Cas9n) .
- the Cas9 nuclease used herein can either be wild type or be genetically modified.
- the Cas9 nucleases to be used herein can be selected from SpCas9 (Cas9 isolated from Streptococcus pyogenes), SaCas9 (Cas9 isolated from Staphylococcus aureus), StCas9 (Cas9 isolated from Streptococcus thermophilus), NmCas9 (Cas9 isolated from Neisseria meningitidis), FnCas9 (Cas9 isolated from Francisella novicida), CjCas9 (Cas9 isolated from Campylobacter jejuni), ScCas9 (Cas9 isolated from Streptococcus canis), and any variants and mutant forms of the Cas9 listed above, such as high-fidelity Cas9 (Kleinstiver et al., Nature. 2016 Jan 28) and enhanced SpCas9 (Slaymaker et al., Science. 2016 Jan 01).
- the reverse transcriptase is selected from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, variants thereof, and mutants of any of the variants.
- the present disclosure involves a reverse transcriptase or a variant or a mutant of any of the variants thereof, which can be provided as a fusion protein with a Cas nuclease, or provided in trans.
- Reverse transcriptase, also known as RNA-dependent DNA polymerase, is a DNA polymerase enzyme that transcribes single-stranded RNA into DNA.
- Reverse transcriptases are found in many eukaryotic and prokaryotic systems, such as telomerase, retrotransposons, and retrons, and are abundant in the genomes of plants and animals. Any of the wild type, variant, and mutant forms of reverse transcriptase which are known in the art or which can be made using methods known in the art are contemplated herein.
- the reverse transcriptases that can be used herein include, but are not limited to, Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, and their variants or mutants of any of the variants.
- M-MLV RT has decreased or disrupted RNase H activity.
- the M-MLV RT is fused with a viral nucleocapsid (NC) protein.
- NC viral nucleocapsid
- the reverse transcriptase is fused directly to the Cas nuclease. In some embodiments, the reverse transcriptase is connected to the Cas nuclease with a linker. It would be understood that a person skilled in the art is able to select conditions (e.g., optimal temperature, pH, reaction time, and/or concentration) suitable for a reverse transcriptase to form the insertion double strand DNA and the like.
- the Cas nuclease and the reverse transcriptase are not fused or linked.
- the Cas nuclease and the reverse transcriptase are formed as a fusion protein, or operably connected by a linker.
- a fusion protein can be made from a fusion gene, e.g., created by joining parts of two different genes.
- the fusion protein is encoded by a sequence of SEQ ID NO: 208.
- the Cas nuclease is provided in two parts.
- an intein-mediated split-Cas9 is used in the complex disclosed herein and methods disclosed herein.
- a bi-lobed shaped structure of Cas9 has recently been discovered. The two lobes consist of a recognition lobe (REC) and a nuclease lobe (NUC) . In between, there is a positively charged groove where the negatively charged nucleic acids of the holo-form reside.
- REC recognition lobe
- NUC nuclease lobe
- Structural studies render the rational engineering of Cas9 possible, either to equip it with new functionalities or to change its characteristics.
- PE2 can be divided into two parts in the middle of the SpCas9 nickase and then reconstituted into intact functional PE2 if trans splicing inteins are placed at the location of the split.
- the components of this split-intein PE2 can be delivered into cells in vivo using dual AAV vectors to mediate PE events.
- a guide RNA is a short synthetic RNA composed of a scaffold sequence necessary for Cas-binding and a user-defined nucleotide spacer that defines the gene target to be modified.
- the strand of genomic DNA that is bound by the spacer is typically referred to as the complementary strand.
- the other strand of DNA is typically referred to as the non-complementary strand.
- the guide RNA used herein is made up of two RNA molecules which are a crRNA and a tracrRNA, wherein the crRNA is customized to bind a target gene, and the tracrRNA serves as a binding scaffold for a Cas enzyme.
- the guide RNA used herein is a single guide RNA (sgRNA) , wherein the single RNA molecule comprises a custom- designed crRNA sequence fused to a scaffold tracrRNA sequence.
- a single guide RNA is used to increase the editing efficiency.
- the guide RNA further comprises an extension arm to its 3’ end.
- the extension arm provides a DNA synthesis template sequence that encodes a single strand DNA flap that is to be inserted into a Cas cleavage site.
- at the 3’ end of the extension arm is a primer binding site (PBS) that binds to the non-complementary strand of the target gene and serves as a primer for the reverse transcriptase.
- PBS primer binding site
- the insertion sequence RT template and PBS are split off the guide RNA, and are provided separately as an RTT-PBS sequence.
- the DNA synthesis template sequence for the reverse transcriptase is referred to as an insertion sequence RT template in the present disclosure.
- the guide RNA comprises a spacer, a scaffold, and an insertion sequence reverse transcriptase (RT) template.
- the spacer, the scaffold, the insertion sequence RT template, and the PBS are arranged from 5’ to 3’ in the guide RNA.
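As a non-limiting illustration of the 5’ to 3’ arrangement, the sketch below assembles a guide RNA from its four parts and derives the PBS and the insertion sequence RT template from DNA inputs. It assumes an SpCas9-like cut 3 nt upstream of the PAM, a protospacer given on the PAM-containing strand, and a label written as it should read on the extended DNA strand. The sequences, the scaffold placeholder, and the function names are hypothetical and are not the sequences of the disclosure.

```python
_DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def rna_revcomp(dna: str) -> str:
    """RNA reverse complement of a DNA string."""
    return dna.translate(_DNA_TO_RNA_COMPLEMENT)[::-1]


def reverse_transcribe(rna: str) -> str:
    """DNA strand synthesized from an RNA template (its reverse complement)."""
    return rna.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


def design_guide(protospacer: str, scaffold: str, label: str,
                 pbs_len: int = 13, cut_offset: int = 3) -> dict:
    """Assemble spacer, scaffold, RT template, and PBS in 5'->3' order."""
    cut = len(protospacer) - cut_offset
    if pbs_len > cut:
        raise ValueError("PBS is longer than the primer region")
    primer = protospacer[cut - pbs_len:cut]   # DNA 3' end exposed at the cut
    parts = {
        "spacer": protospacer.replace("T", "U"),
        "scaffold": scaffold,
        "rt_template": rna_revcomp(label),    # copied into the label DNA
        "pbs": rna_revcomp(primer),           # anneals to the primer
    }
    parts["full"] = (parts["spacer"] + parts["scaffold"]
                     + parts["rt_template"] + parts["pbs"])
    return parts


# Hypothetical inputs; "[scaffold]" stands in for a real scaffold sequence.
guide = design_guide("GACCTTGGAACCGGTTAACC", "[scaffold]", "ACGTTGCAAGCTTCGATCGGA")
print(guide["pbs"], guide["rt_template"])
print(reverse_transcribe(guide["rt_template"]))  # recovers the label
```

Because the RT template sits 5’ of the PBS, extension from the annealed primer runs from the PBS into the template, so the label is appended directly to the 3’ end exposed at the cleavage site.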
- the insertion sequence RT template is about 10 to 30 nucleotides long. In some embodiments, the insertion sequence RT template comprises a nucleotide sequence of any length, e.g., from about 10 nt to 30 nt. The insertion sequence can be of any length, including but not limited to 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length.
- the insertion sequence RT template comprises the nucleotide sequence of SEQ ID NO: 504.
- the present disclosure provides two alternative insertion sequences, which are SEQ ID NOs: 505 and 506. (Fig. 27) .
- the insertion sequence RT template encodes one or more tags suitable for hybrid capture.
- Hybrid capture is a method used in target DNA enrichment, where a “bait” molecule is used to select target regions from DNA libraries.
- the hybrid capture methods that can be used herein include, but are not limited to, biotinylated oligonucleotide baits.
- the guide RNA comprises an RNA structural motif at the 3’ end.
- the RNA structural motif is a modified prequeosine1-1 riboswitch aptamer (evopreQ1) .
- the RNA structural motif is a frameshifting pseudoknot from Moloney Murine Leukemia Virus (MMLV) .
- the PBS comprises random nucleotides.
- the PBS is a short sequence complementary to the strand of the target gene other than the one targeted by the spacer. PBS binds to the target site and serves as the point of initiation for reverse transcription.
- A random nucleotide refers to a nucleotide in a PBS sequence that is not complementary to the target gene.
- the pegRNA comprises 1 random nucleotide. In some embodiments, the pegRNA comprises 2, 3, or 4 random nucleotides. In some embodiments, the random nucleotide is proximal to the reverse transcription initiation site. In some embodiments, the random nucleotide is the nucleotide next to the insertion sequence RT template.
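As a non-limiting illustration, PBS variants carrying non-complementary nucleotides next to the insertion sequence RT template can be enumerated as below. Because the RT template lies 5’ of the PBS in the guide RNA, the sketch substitutes positions at the 5’ end of the PBS. The input PBS and the function name are hypothetical, and the variants produced are not necessarily those shown in Fig. 19.

```python
from itertools import product


def mut_pbs_variants(pbs: str, n_random: int = 1) -> list:
    """Replace the first n_random nucleotides of an RNA PBS (the end
    adjacent to the RT template) with every combination of bases that
    differs from the original at each substituted position."""
    if not 1 <= n_random <= len(pbs):
        raise ValueError("n_random must be between 1 and the PBS length")
    choices = [[b for b in "ACGU" if b != pbs[i]] for i in range(n_random)]
    return ["".join(combo) + pbs[n_random:] for combo in product(*choices)]


# Hypothetical 13-nt PBS.
for variant in mut_pbs_variants("UAACCGGUUCCAA", n_random=1):
    print(variant)
```

One substituted position yields three variants and two positions yield nine, so the number of candidate mut-pegRNAs grows quickly with the number of random nucleotides.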
- the guide RNA comprises a viral exoribonuclease-resistant RNA (xrRNA) motif at its 3’ end.
- xrRNA viral exoribonuclease-resistant RNA
- the xrRNA motif is derived from a flavivirus.
- the flavivirus is Dengue, Yellow fever, West Nile, or Zika.
- the present disclosure provides a polynucleotide encoding the Cas nuclease, the reverse transcriptase, the spacer, the scaffold, the insertion sequence RT template, and the PBS in any one of the complexes disclosed herein.
- the present disclosure provides a polynucleotide encoding a guide RNA comprising an insertion sequence RT template, wherein the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- the present disclosure provides a polynucleotide encoding an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS) , wherein the insertion sequence RT template comprises a nucleotide sequence of any one of SEQ ID NOs: 504-506.
- polynucleotides disclosed herein can be obtained by methods known in the art.
- the polynucleotide can be obtained from cloned DNA (e.g., from a DNA library) , by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA or fragments thereof, purified from the desired cell.
- any method known to those skilled in the art for identification of nucleic acids that encode desired genes can be used. Any method available in the art can be used to obtain a full length (i.e., encompassing the entire coding region) cDNA or genomic DNA encoding a desired protein, such as from a cell or tissue source.
- Modified or variant polynucleotides can be engineered from a wildtype polynucleotide using standard recombinant DNA methods.
- Polynucleotides can be cloned or isolated using any available methods known in the art for cloning and isolating nucleic acid molecules. Such methods include PCR amplification of nucleic acids and screening of libraries, including nucleic acid hybridization screening, antibody-based screening, and activity-based screening.
- Methods for amplification of polynucleotides can be used to isolate polynucleotides encoding a desired protein, including for example, polymerase chain reaction (PCR) methods.
- PCR can be carried out using any known methods or procedures in the art. Exemplary methods include use of a Perkin-Elmer Cetus thermal cycler and Taq polymerase (Gene Amp) .
- a nucleic acid containing gene of interest can be used as a source material from which a desired polypeptide-encoding nucleic acid molecule can be amplified.
- DNA and mRNA preparations, cell extracts, tissue extracts from an appropriate source (e.g., testis, prostate, breast), fluid samples (e.g., blood, serum, saliva), and samples from healthy and/or diseased subjects can be used in amplification methods.
- the source can be from any eukaryotic species including, but not limited to, vertebrate, mammalian, human, porcine, bovine, feline, avian, equine, canine, and other primate sources.
- Nucleic acid libraries also can be used as a source material. Primers can be designed to amplify a desired polynucleotide.
- primers can be designed based on expressed sequences from which a desired polynucleotide is generated. Primers can be designed based on back-translation of a polypeptide amino acid sequence. If desired, degenerate primers can be used for amplification. Oligonucleotide primers that hybridize to sequences at the 3’ and 5’ termini of the desired sequence can be used as primers to amplify by PCR from a nucleic acid sample. Primers can be used to amplify the entire full-length polynucleotide, or a truncated sequence thereof. Nucleic acid molecules generated by amplification can be sequenced and confirmed to encode a desired polypeptide.
- the present disclosure provides a vector comprising the polynucleotide disclosed herein.
- the present disclosure provides a vector comprising a first polynucleotide encoding a Cas nuclease and a reverse transcriptase, and a second polynucleotide encoding a guide RNA comprising a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS) , wherein the Cas nuclease is capable of creating DNA double-strand break (DSB) .
- the second polynucleotide is a polynucleotide encoding a guide RNA comprising an insertion sequence RT template, wherein the insertion sequence RT template comprises a nucleotide sequence of SEQ ID NO: 504, or any one of SEQ ID NOs: 505-506.
- the Cas nuclease is selected from Cas9, its variants and mutants of any of the variants.
- the reverse transcriptase is selected from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, variants thereof, and mutants of any of the variants.
- any methods known in the art for the insertion of DNA fragments into a vector can be used to construct expression vectors comprising a polynucleotide disclosed herein. These methods can include in vitro recombinant DNA and synthetic techniques and in vivo (genetic) recombination.
- the polynucleotide disclosed herein can be operably linked to control sequences in the expression vector (s) to ensure protein expression.
- control sequences may include, but are not limited to, leader or signal sequences, promoters (e.g., naturally associated or heterologous promoters) , ribosomal binding sites, enhancer or activator elements, translational start and termination sequences, and transcription start and termination sequences, and are chosen to be compatible with the host cell chosen to express the proteins.
- the promoters may be either naturally occurring promoters, hybrid promoters that combine elements of more than one promoter, or synthetic promoters.
- An expression construct may be present in a cell on an episome, such as a plasmid, or the expression construct may be inserted in a chromosome such as in a gene locus.
- the expression vector includes a selectable marker gene to allow the selection of transformed host cells.
- the vector is an expression vector comprising a nucleotide sequence encoding a variant polypeptide operably linked to at least one regulatory control sequence. Regulatory control sequences for use herein include promoters, enhancers, and other expression control elements.
- the expression vector is designed for the choice of the host cell to be transformed, the particular variant polypeptide desired to be expressed, the vector's copy number, the ability to control that copy number, and/or the expression of any other protein encoded by the vector, such as antibiotic markers.
- the vector can include, but is not limited to, viral vectors and plasmid DNA.
- Viral vectors can include, but are not limited to, adenoviral vectors, lentiviral vectors, retroviral vectors, and adeno-associated viral vectors.
- expression vectors contain selection markers such as ampicillin-resistance, hygromycin-resistance, tetracycline resistance, kanamycin resistance, or neomycin resistance to permit detection of those cells transformed with the desired DNA sequences.
- Suitable vectors, promoter, and enhancer elements are known in the art; many are commercially available for generating subject recombinant constructs.
- the vector is a polycistronic vector.
- the vector is a bicistronic vector or a tricistronic vector.
- Bicistronic or polycistronic expression vectors may include (1) multiple promoters fused to each of the open reading frames; (2) insertion of splicing signals between genes; (3) fusion of genes whose expressions are driven by a single promoter; and (4) insertion of proteolytic cleavage sites between genes (self-cleavage peptide) or insertion of internal ribosomal entry sites (IRESs) between genes.
- A polycistronic vector is used to co-express multiple genes in the same cell.
- Two strategies are most commonly used to construct a multicistronic vector.
- an Internal Ribosome Entry Site (IRES) element is typically used for bi-cistronic vectors.
- The IRES element, acting as another ribosome recruitment site, allows initiation of translation from an internal region of the mRNA. Thus, two proteins are translated from one mRNA.
- IRES elements are quite large (usually 500-600 bp) (Pelletier et al., 1988; Jang et al., 1988).
- The engineered CD47 proteins disclosed herein have a smaller size compared to the wild-type full-length human CD47, and thus can be used with an IRES element in multicistronic vectors having limited packaging capacity.
- The present disclosure provides a kit comprising one or more polynucleotide sequences encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- The present disclosure provides a kit comprising one or more polynucleotide sequences encoding a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an insertion sequence RT template, and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- The present disclosure provides a kit comprising one or more vectors encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, and an RTT-PBS sequence which comprises an insertion sequence RT template and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- The present disclosure provides a kit comprising one or more vectors encoding a Cas nuclease, a reverse transcriptase, a guide RNA which comprises a spacer and a scaffold, and an insertion sequence RT template, and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- the Cas nuclease is selected from Cas9, its variants and mutants of any of the variants.
- The reverse transcriptase is selected from Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase, variants thereof, and mutants of any of the variants.
- the insertion sequence RT template comprises a nucleotide sequence of SEQ ID NO: 504, or any one of SEQ ID NOs: 505-506.
- The present disclosure provides a composition comprising one or more polynucleotide sequences encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- The present disclosure provides a composition comprising one or more vectors encoding a Cas nuclease, a reverse transcriptase, and a guide RNA which comprises a spacer, a scaffold, an insertion sequence RT template, and a primer binding site (PBS), wherein the Cas nuclease is capable of creating a DNA double-strand break (DSB).
- the present disclosure provides a method for labeling Cas nuclease cleavage sites in genomic DNA, comprising contacting the genomic DNA with the complex disclosed herein, wherein the genomic DNA is cleaved at one or more cleavage sites, and one or more sequences that are reverse transcribed from the insertion sequence RT template in part or in whole are inserted into the one or more cleavage sites, and wherein the one or more sequences inserted into the one or more cleavage sites are labels.
- the cleavage site is on-target or off-target.
- When a Cas nuclease binds to a genetic locus whose sequence is exactly the same as the target sequence, the cleavage site created there is an on-target cleavage site. Otherwise, the cleavage site is an off-target site.
- The present disclosure provides a method for detecting Cas9 cleavage sites and/or detecting DNA translocation in genomic DNA, comprising (a) labeling the genomic DNA with the method disclosed herein, (b) targeting and amplifying the labeled sites on the genomic DNA to obtain amplicons, (c) sequencing the amplicons to obtain sequencing results, and (d) analyzing the sequencing results to identify Cas cleavage sites and/or DNA translocations at the Cas cleavage sites.
- the one or more amplicons each comprise a portion of genomic DNA that is immediately upstream or downstream to the one or more labels.
- The method is used to identify Cas nuclease off-target sites by comparing the Cas cleavage sites identified by the method disclosed herein with a target sequence; a cleavage site that is not identical to the target sequence is an off-target site. It would be understood that, based on the method disclosed herein, those of ordinary skill in the art are able to locate the cleavage sites on the genome with readily available tools such as Burrows-Wheeler Aligner (BWA).
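The comparison between an identified cleavage site and the target sequence can be sketched as follows. This is a minimal illustration, not the disclosed pipeline: the function names and the equal-length, ungapped comparison are assumptions, and the sequences in the test are made up.

```python
def count_mismatches(site_seq: str, target_seq: str) -> int:
    """Count positions where the site differs from the target (ungapped, same length)."""
    if len(site_seq) != len(target_seq):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    return sum(a != b for a, b in zip(site_seq.upper(), target_seq.upper()))


def classify_site(site_seq: str, target_seq: str) -> str:
    """Label a cleavage site as on-target (identical to the target) or off-target."""
    return "on-target" if count_mismatches(site_seq, target_seq) == 0 else "off-target"
```

The mismatch count is kept as a separate function because it is also the quantity later used to compare shared and method-unique off-target sites.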
- the genomic DNA is processed by Tn5 tagmentation before amplification.
- Tn5 tagmentation uses a hyperactive variant of the Tn5 transposase that mediates the fragmentation of double-stranded DNA and ligates synthetic oligonucleotides at both ends (Adey et al. 2010).
- The wild-type Tn5 transposon is a composite transposon in which two near-identical insertion sequences (IS50L and IS50R) flank three antibiotic resistance genes (Reznikoff 2008).
- Each IS50 contains two inverted 19-bp end sequences (ESs), an outside end (OE) and an inside end (IE).
- Any Tn5 tagmentation platform or kit, and variants thereof, can be used in the present disclosure, such as Nextera DNA kits and on-bead tagmentation.
- the genomic DNA is processed by Tn5 tagmentation before amplification, wherein sequencing adapters that include unique molecular identifiers (UMI) are embedded in the Tn5 transposases.
- UMI is a type of molecular barcoding that provides error correction and increased accuracy in sequencing data analysis.
- the molecular barcodes are short sequences used to uniquely tag each molecule in a sample library.
- the UMI-included adapters are embedded into Tn5 so that dsDNA fragments after tagmentation are tagged with these UMI-included adapters, which can be used to eliminate PCR duplicates from the sequencing data.
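A minimal sketch of how UMI-tagged fragments can be used to eliminate PCR duplicates is shown below. It assumes each aligned read has already been reduced to a chromosome, position, strand, and UMI string; the `Read` type and the function name are hypothetical and not part of the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Read(NamedTuple):
    chrom: str
    pos: int
    strand: str
    umi: str


def collapse_pcr_duplicates(reads: Iterable[Read]) -> Dict[Tuple[str, int, str, str], int]:
    """Group reads by (chrom, pos, strand, UMI); each key is one original molecule.

    The value is the number of reads (PCR copies) observed for that molecule.
    """
    molecules: Dict[Tuple[str, int, str, str], int] = defaultdict(int)
    for r in reads:
        molecules[(r.chrom, r.pos, r.strand, r.umi)] += 1
    return dict(molecules)
```

Counting unique keys rather than raw reads gives a molecule-level count per site. This sketch requires exact UMI matches; tolerating sequencing errors within the UMI would need an additional clustering step.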
- The genomic DNA that comprises the insertion sequence, or a portion of the insertion sequence, is targeted and enriched by a method selected from PCR and a hybrid capture-based target enrichment method.
- Hybrid capture-based target enrichment methods that can be used herein include, but are not limited to, methods using biotinylated oligonucleotide baits.
- In polymerase chain reaction (PCR), a set of flanking primers anneals at the outer regions of the DNA sequence of interest, and therefore unwanted DNA is not amplified.
- Another available group of methods for target enrichment is hybrid capture-based methods.
- One commonly used hybridization capture tag uses a biotinylated oligonucleotide bait. Any methods that can effectively enrich a targeted portion of the genomic DNA can be used herein.
- the enrichment is performed by two rounds of PCR, wherein in one reaction the insertion sequence is used as the forward primer binding site and in the other reaction the insertion sequence is used as the reverse primer binding site.
- The 3' ends of the primers that bind to the insertion sequence are at least 2 bp away from the insertion boundary so that the extension sequence information can be used to filter out random priming reads (see Fig. 1B). If the primer correctly binds to the insertion sequence, there would be at least 2 bp at the beginning of the extension sequence that are complementary to the insertion sequence.
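The primer-extension filter described above can be sketched as a simple string check. This is an illustration under stated assumptions: the read, primer, and insertion sequence are all given on the same strand, the read starts with the primer, and the function name and `min_extension` parameter are hypothetical.

```python
def has_correct_extension(read: str, primer: str, insertion: str, min_extension: int = 2) -> bool:
    """Return True if the bases right after the primer continue the insertion sequence.

    A read from random priming would start with the primer but is unlikely to be
    followed by the next bases of the insertion sequence.
    """
    start = insertion.find(primer)
    if start < 0:
        raise ValueError("primer must lie within the insertion sequence")
    # Bases of the insertion sequence between the primer's 3' end and the boundary.
    expected = insertion[start + len(primer):]
    if len(expected) < min_extension:
        raise ValueError("primer 3' end is too close to the insertion boundary")
    if not read.startswith(primer):
        return False
    observed = read[len(primer):len(primer) + min_extension]
    return observed == expected[:min_extension]
```

Only the first `min_extension` bases are compared, because the inserted label may be partial and the remainder of the read is genomic sequence.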
- the insertion boundary described herein is the first and last base pair of the insertion sequence.
- The method is used to identify DNA translocation in genomic DNA, wherein detection of signals located in the genomic region upstream of the forward primer binding site indicates a DNA translocation.
- the methods disclosed herein can be used in vitro, in cellulo, or in vivo.
- the present disclosure provides a method for determining the relative specificity of a plurality of guide RNAs comprising (a) identifying the off-target sites for Cas cleavage using each of the guide RNAs with a method disclosed herein, and (b) determining the relative specificity of the guide RNAs based on the total number of off-target sites identified for each of the guide RNAs, wherein a guide RNA having fewer off-target sites is more specific than a guide RNA having more off-target sites.
- the present disclosure provides a method for determining the relative specificity of a plurality of Cas nuclease variants and mutants comprising (a) identifying the off-target cleavage site for each of the Cas nuclease variants and mutants with a method disclosed herein, and (b) determining the relative specificity of the Cas nuclease variants and mutants based on the total number of off-target sites identified for each of the Cas nuclease variants and mutants, wherein a Cas nuclease variant or mutant having fewer off-target sites is more specific than a Cas nuclease variant or mutant having more off-target sites.
- The present disclosure provides a method for determining the relative genotoxicity of a plurality of guide RNAs comprising (a) identifying the off-target cleavage sites and DNA translocations for each of the guide RNAs with a method disclosed herein, and (b) determining the relative genotoxicity of the guide RNAs based on the total number of off-target sites and DNA translocations identified for each of the guide RNAs, wherein a guide RNA having fewer off-target sites and fewer DNA translocations is less genotoxic than a guide RNA having more off-target sites and DNA translocations.
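The ranking step of the comparisons above can be sketched as a sort on per-guide counts. The function names are hypothetical, and summing off-target sites and translocations with equal weight is an assumption; the text only specifies that the total number is used.

```python
from typing import Dict, List, Tuple


def rank_by_specificity(off_target_counts: Dict[str, int]) -> List[str]:
    """Order guides (or nuclease variants) from most to least specific.

    Fewer identified off-target sites means more specific; ties break by name.
    """
    return sorted(off_target_counts, key=lambda name: (off_target_counts[name], name))


def rank_by_genotoxicity(events: Dict[str, Tuple[int, int]]) -> List[str]:
    """Order guides from lowest to highest combined event count.

    Each value is (number of off-target sites, number of translocations).
    """
    return sorted(events, key=lambda name: (sum(events[name]), name))
```

The same `rank_by_specificity` function applies to Cas nuclease variants and mutants, since only the keys change.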
- sequencing includes any method of determining the sequence of a nucleic acid. Any method of sequencing can be used in the present disclosure, including chain terminator (Sanger) sequencing and dye terminator sequencing. In preferred embodiments, Next Generation Sequencing (NGS) is used. NGS is a high-throughput sequencing technology that performs thousands or millions of sequencing reactions in parallel. Although different NGS platforms use varying assay chemistries, they all generate sequence data from a large number of sequencing reactions run simultaneously on a large number of templates. Typically, the sequence data is collected using a scanner, and then assembled and analyzed bioinformatically. Thus, the sequencing reactions are performed, read, assembled, and analyzed in parallel.
- Some NGS methods require template amplification and some do not.
- Amplification-requiring methods include pyrosequencing, the Solexa/Illumina platform, and the Supported Oligonucleotide Ligation and Detection (SOLiD) platform.
- Methods that do not require amplification include single-molecule sequencing methods, nanopore sequencing, HeliScope, real-time sequencing by synthesis, single molecule real time (SMRT) DNA sequencing methods using zero-mode waveguides (ZMWs) and others.
- hybridization-based sequence methods or other high-throughput methods can also be used, e.g., microarray analysis, NANOSTRING, ILLUMINA, or other sequencing platforms.
- The methods described herein can be used in any cell that is capable of repairing a DSB in genomic DNA and synthesizing a new strand of DNA based on a template.
- The two major DSB repair pathways in eukaryotic cells are homologous recombination and non-homologous end joining (NHEJ).
- the methods can be performed in cells capable of any of the repair pathways.
- The Prime Editor system was adapted by replacing the Cas9 nickase with wild-type Cas9.
- The RT template of the pegRNA was modified.
- The RT template used here is SEQ ID NO: 9.
- The Cas9 and pegRNA were assembled into a single vector as the PEAC-seq backbone.
- The spacer sequences targeting VEGFA, EMX1, RNF2, and FANCF were cloned into the PEAC-seq backbone individually.
- HEK293T cells were seeded in a 12-well plate and grown to ~80% confluency. Each well was transfected with 3 µg of plasmids using Lipofectamine 3000. The transfected cells were collected after 48 hours.
- A cell sorter (SONY MA900) was used to sort about 100,000 GFP-positive cells. About 500 ng of extracted gDNA was digested with NotI and then cleaned up with 0.5x AMPure XP beads to remove carryover plasmids. The gDNA fragments were retained on the AMPure XP beads, and on-bead Tn5 digestion was performed at 55 °C for one hour, inserting adaptors at the ends of the fragments. The Tn5 was expressed and embedded with the adaptors in-house. At the end of the Tn5 digestion, 6 µL of 0.2% SDS was added to terminate the reaction. The products were purified and size-selected with 1.5x AMPure XP beads and eluted in 50 µL of H2O.
- the 21bp insertion sequence was used to enrich the editing sites (both on-target and off-target) in the NGS library preparation.
- In the 1st round of the nested PCR, two separate reactions were performed. Each reaction used 20 µL of template in a total volume of 50 µL for ~30 cycles.
- 2.5 µL of the 1st-round product was used as the template for the 2nd-round amplification in a total volume of 50 µL for 17 cycles, and Illumina adaptors were added.
- The amplicons were purified with AMPure XP beads using 0.6x + 0.25x double size selection.
- The library was sequenced on the Illumina NovaSeq platform as paired-end 150 bp.
- The oligos and vectors are summarized in Fig. 22.
- The PEAC-seq data was analyzed using a modified pipeline from GUIDE-seq (Tsai et al., Nat Biotechnol 2015). First, adapters were trimmed using cutadapt (Martin M., EMBnet.journal 2011), and reads without the appropriate adapter were removed. Then the reads were mapped to the human or mouse genome (hg38, mm10) using bwa. Reads that mapped to the same location and shared the same UMI were considered PCR duplicates and merged in the following analysis. To fit the target identification pipeline from GUIDE-seq, the read names in the bam files were modified, and the bam files from the forward and backward PCR were labeled and merged.
- The read counts from the GUIDE-seq output file were normalized to reads per million, and the number of reads with correct primer extension was calculated.
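The reads-per-million normalization mentioned above amounts to scaling each site's read count by the library size. In the sketch below, the function name is hypothetical, and using the total number of reads in the library as the denominator is an assumption; the source does not state which total is used.

```python
from typing import Dict


def reads_per_million(site_reads: Dict[str, int], total_reads: int) -> Dict[str, float]:
    """Scale per-site read counts to reads per million (RPM) of the library."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {site: count * 1_000_000 / total_reads for site, count in site_reads.items()}
```

Normalizing this way makes read counts comparable between libraries sequenced to different depths.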
- Two nested PCR primers upstream of the gRNA were designed.
- The site-specific nested PCR primers served as forward primers, and the downstream Tn5 primer served as the reverse primer.
- The nested primers were used sequentially to amplify the sequences adjacent to translocated DSBs.
- About 300 ng of PEAC-seq gDNA was fragmented by Tn5, purified with 1.5x AMPure XP beads, and eluted with 23 µL of H2O.
- About 20 µL of purified DNA was used as the template for the 1st round of PCR for 20 cycles.
- 2.5 µL of product from the 1st PCR was used as the template for another 20 cycles in the 2nd round of the nested PCR.
- Another 20-cycle PCR was conducted to add the sequencing adaptors.
- The amplicons were purified by 0.6x then 0.25x double size selection with beads.
- The library was sequenced on the Illumina NovaSeq platform as paired-end 150 bp.
- The oligos and vectors are summarized in Fig. 23.
- A 21-nt cytosine-depleted sequence was designed as an insertion sequence RT template.
- PEAC-seq was then conducted in HEK293T cells at six sites (VEGFA TS1, VEGFA TS2, VEGFA TS3, EMX1, FANCF, and RNF2) that have been tested in multiple studies (Kim et al., Nat Methods 2015; Kim et al., Genome Res 2018; Tsai et al., Nat Methods 2017; Cameron et al., Nat Methods 2017; Tsai et al., Nat Biotechnol 2015).
- A modified GUIDE-seq analysis pipeline was used to rank and filter the identified editing sites. Based on an analysis of the off-target sites generated from different primer sets for PEAC-seq tag enrichment, the F1 and R2 primers were chosen as the enrichment primers in the following analysis (Figs. 9-14, 20).
- Amplicon-seq was then conducted to verify those off-targets that were identified only by GUIDE-seq or only by PEAC-seq at the VEGFA TS1, FANCF, and EMX1 sites (Tsai et al., Nat Biotechnol 2015) (Table 2).
- At the VEGFA TS1 site, Amplicon-seq confirmed the two PEAC-seq-unique off-targets, demonstrating good sensitivity of PEAC-seq.
- The PEAC score, calculated from the sequencing reads of PEAC-seq, quantitatively represents the enrichment of the PEAC-seq tag at the edited sites.
- The off-target sites identified by both PEAC-seq and GUIDE-seq show higher PEAC scores than PEAC-seq-unique off-targets (Fig. 2A).
- The numbers of sequencing reads surrounding the off-targets were highly correlated at the fourteen shared sites (Fig. 2B), suggesting consistency between the two methods in detecting high-confidence off-targets.
- The on-target site, shared off-target sites, and PEAC-seq-unique off-target sites show similar tracks (Fig. 2C).
- The shared off-target sites contained fewer mismatches than off-target sites unique to one of the methods (Fig. 2D), which is expected, as the number of mismatches closely relates to the occurrence of off-target editing.
- The forward primer (F1) and downstream Tn5 primer would amplify regions downstream, but not upstream, of the PEAC-seq label (Fig. 3A).
- Unexpected signals located at the upstream genomic region of the F1-Tn5 amplicons were observed (Figs. 3A, 15). These signals might come from the joining of DSB ends from another genome breaking site.
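One way to flag such unexpected upstream signals is to compare each mapped read position with the label position, taking the primer orientation into account. This is only a sketch: the function name, the coordinate convention (positions on the same chromosome, `primer_strand` giving the direction in which F1 extends), and the returned labels are all assumptions rather than the disclosed analysis.

```python
def classify_signal(read_pos: int, label_pos: int, primer_strand: str = "+") -> str:
    """Classify a read as downstream (expected) or upstream (candidate rearrangement).

    For a '+' primer, extension proceeds toward increasing coordinates, so reads at
    or beyond the label position are expected; for a '-' primer the logic is mirrored.
    """
    if primer_strand not in ("+", "-"):
        raise ValueError("primer_strand must be '+' or '-'")
    offset = read_pos - label_pos
    if primer_strand == "-":
        offset = -offset
    return "downstream" if offset >= 0 else "upstream"
```

Reads classified as `upstream` are not proof of a translocation on their own; they mark candidate sites for follow-up, such as amplification with primers placed further upstream.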
- PEAC-seq generates DSBs with three different ends, including one upstream end appended with a complete or partial PEAC-seq tag, one upstream end without PEAC-seq tag, and one downstream end.
- DSB ends from different breaking points might join together and cause DNA rearrangements.
- The upstream end with the PEAC-seq label from a distal Donor Site may join to the upstream end of a Receiver Site, but the direction of the PEAC-seq tag is reversed relative to the Receiver Site (Fig. 3B, model (v)).
- This joining generates signals upstream of the PEAC-seq label of the Receiver Site, which won't be amplified by the F1 and Tn5 primers (Fig. 3A).
- The directional insertion sequence in PEAC-seq allows identification of aberrant end joining between different DSB sites.
- Primers (Nest-F) located upstream of the F1 primer were designed and paired with the downstream Tn5 primer to identify the sequences of the unknown Donor sites (Fig. 3C).
- A successful amplification bridging the Donor and the Receiver sites does not require the existence of the PEAC-seq label insertion (Fig. 3B, models (iii) and (iv)), which allows comprehensive estimation of the various rearrangement patterns between the Donor and the Receiver sites.
- PEAC-seq uses the templated information on the pegRNA to insert label sequences and does not rely on exogenous labels. This straightforward procedure allowed us to investigate its application in vivo.
- Mouse embryos were edited at the pronuclear stage by injecting in vitro transcribed Cas9-MMLV mRNA and pegRNAs targeting PCSK9 and PNPLA3. Embryos were collected around E14.5 to E21, and the PEAC-seq off-target lists were generated for these two sites (Fig. 4).
- One PCSK9 on-target and one off-target were identified from the two embryos, both of which have been previously reported by DISCOVER-seq (Figs. 4B-4D, 21).
- Amplicon-seq verified the edits at the PEAC-seq off-targets and confirmed non-edits at the other reported off-targets.
- The small number of PCSK9 off-targets in our study might be related to the short editing time window resulting from mRNA injection in embryos, compared to adenovirus delivery in the liver (Wienert et al., Science 2019).
- PEAC-seq was also conducted at another in vivo CRISPR therapy target, PNPLA3. Three editing sites, including the on-target site, were identified by PEAC-seq from two embryos (Fig. 16, Fig. 21).
- Both the pegRNA and the mRNA of Cas9-MMLV were prepared by in vitro transcription.
- the DNA template of pegRNA was amplified from the plasmids “pcsk9-sgRNA” and “mPnpla-sgRNA” by primers T7F and T7R.
- The PCR products were gel purified using the MinElute Gel Extraction Kit (QIAGEN #28606) and were used as the template for in vitro transcription with the HiScribe T7 Quick High Yield RNA Synthesis Kit (NEB #E2050S).
- The pCMV-Cas9-PE2 plasmid was linearized by MssI (Thermo #FD1344). According to the manufacturer's instructions, 1 µg of linearized product was used as a template to generate Cas9-PE mRNA by in vitro transcription with the HiScribe T7 ARCA mRNA Kit (NEB #E2060S).
- C57BL/6 and ICR mice were purchased and housed in the Laboratory Animal Resource Center (LARC) at Westlake University.
- The LARC is a certified pathogen-free and environmentally controlled facility (21 ± 2 °C, 55 ± 15% humidity, and 12:12-h light:dark cycle).
- The C57BL/6 mice were used for embryo collection, and ICR females were used as recipients. All animal experiments were conducted under the protocol approved by the animal care and ethical committee of Westlake University.
- Embryos were then flushed several times to rinse off the hyaluronidase and cumulus cells. Afterward, embryos were transferred into a dish with prewarmed KSOM medium (Millipore #MR-106-D) covered by mineral oil followed by three additional washes.
- The mixture of Cas9-PE2 mRNA (100 ng/µL) and pegRNA (50 ng/µL) was injected into the cytoplasm of the zygote in M2 medium.
- The injection was conducted using a microinjector (NARISHIGE #IM-400B) with constant flow settings.
- The injected embryos were cultured in KSOM medium with amino acids in a cell culture incubator at 37 °C with 5% CO2, then were transplanted into oviducts of pseudopregnant ICR females at 0.5 dpc. Pups were sacrificed at E19.5 to E21, and organs were collected, dissected, and snap-frozen in liquid nitrogen. Samples were stored at -80 °C until further analysis.
- the gDNA from organs was extracted using TIANamp Genomic DNA Kit (TIANGEN #DP304-03) according to the manufacturer’s instructions. Nested PCR was applied to amplify the targeting regions and attach the Illumina adaptors to amplicons.
- The in vivo PEAC-seq library was constructed by Tn5 fragmentation, as described for the cell line data in the previous section.
- PEAC-seq was modified to use epegRNA (engineered pegRNA, incorporating the 3' RNA structural motif evopreQ1) and to include transient expression of MLH1dn with Cas9-MMLV.
- Three modified versions of PEAC-seq (epegRNA, hMLH1, and epegRNA plus MLH1dn) were developed, and their performance in identifying off-targets at the EMX1 and VEGFA TS2 sites was benchmarked (Fig. 5A).
- The truncated MMLV was not included, as it is reported to be effective in plants but not in mammalian cells (Zong et al., Nat Biotechnol 2022).
- PEAC-seq label insertion was the main focus because its efficiency is critical to the overall performance of PEAC-seq.
- Incorporating epegRNA appears to be the most effective way to increase the number of PEAC-seq tag insertions at different cutoffs (Fig. 5B).
- The epegRNA version of PEAC-seq was named ePEAC-seq.
- ePEAC-seq successfully identified the two missed off-targets of EMX1 (Figs. 5C-5D), emphasizing its higher sensitivity compared with PEAC-seq.
- ePEAC-seq also called more off-target sites shared with GUIDE-seq than PEAC-seq did (Figs. 10A, 17). It is not surprising that the transient expression of MLH1dn didn't improve the performance, as MLH1dn is a dominant-negative MMR protein, and MMR resolves DNA heteroduplexes by selectively replacing nicked DNA strands (Chen et al., Cell 2021).
- the repair pathway activated by PEAC-seq is probably different, as in some embodiments, the wild-type Cas9 replaced the Cas9 nickase in the native PE system.
- Deeptools ‘computeMatrix’ (command: --referencePoint center --afterRegionStartLength 5000 --beforeRegionStartLength 5000 -p 15 --binSize 500) and ‘plotHeatmap’ functions (Ramirez et al., Nucleic Acids Res 2014) were used to visualize the genomic co-localizations between all in vitro PEAC-seq off-target sites and epigenetic signals.
- DSB hotspots were identified from the dsODN-only control (no Cas9/gRNA) from the GUIDE-seq performed in 293T cells. Control genomic regions, which were equally sized regions placed randomly across the genome, were generated with an in-house Perl script.
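The in-house Perl script is not given in the source. The following Python sketch shows one plausible way to generate equally sized control regions placed randomly across a genome. The function name, the BED-style 0-based half-open coordinates, the fixed seed, and the weighting of chromosomes by the number of valid start positions are all assumptions.

```python
import random
from typing import Dict, List, Tuple


def random_control_regions(
    chrom_sizes: Dict[str, int], region_size: int, n: int, seed: int = 0
) -> List[Tuple[str, int, int]]:
    """Draw n regions of fixed size uniformly over all valid start positions."""
    rng = random.Random(seed)
    # Keep only chromosomes long enough to hold one full region.
    eligible = {c: s for c, s in chrom_sizes.items() if s >= region_size}
    if not eligible:
        raise ValueError("no chromosome is long enough for the requested region size")
    chroms = list(eligible)
    # Weight each chromosome by its number of valid start positions.
    weights = [eligible[c] - region_size + 1 for c in chroms]
    regions = []
    for _ in range(n):
        chrom = rng.choices(chroms, weights=weights, k=1)[0]
        start = rng.randrange(eligible[chrom] - region_size + 1)
        regions.append((chrom, start, start + region_size))
    return regions
```

The resulting tuples can be written out as a BED file and used as the control region set for the heatmap comparison. This sketch does not exclude assembly gaps or blacklisted regions, which a production script might do.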
- Deeptools ‘computeMatrix’ and ‘plotHeatmap’ functions were used to plot the heatmap of the genomic co-localizations between the PEAC-seq translocation sites or control genomic regions.
Abstract
The present disclosure relates to complexes, polynucleotides, vectors, kits, and methods for detecting CRISPR/Cas nuclease cleavage sites and DNA translocations at the cleavage sites in a genome.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/137789 WO2024119461A1 (fr) | 2022-12-09 | 2022-12-09 | Compositions and methods for detecting CRISPR/Cas nuclease target cleavage sites and DNA translocation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/137789 WO2024119461A1 (fr) | 2022-12-09 | 2022-12-09 | Compositions and methods for detecting CRISPR/Cas nuclease target cleavage sites and DNA translocation |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024119461A1 true WO2024119461A1 (fr) | 2024-06-13 |
Family
ID=84901443
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/137789 WO2024119461A1 (fr) | Compositions and methods for detecting CRISPR/Cas nuclease target cleavage sites and DNA translocation | 2022-12-09 | 2022-12-09 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024119461A1 (fr) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007140506A1 (fr) * | 2006-06-02 | 2007-12-13 | Human Genetic Signatures Pty Ltd | Modified microbial nucleic acid for the detection and analysis of microorganisms |
WO2023060539A1 (fr) * | 2021-10-15 | 2023-04-20 | Westlake University | Compositions and methods for detecting CRISPR/Cas nuclease target cleavage sites and DNA translocation |
-
2022
- 2022-12-09 WO PCT/CN2022/137789 patent/WO2024119461A1/fr unknown
Non-Patent Citations (57)
Title |
---|
AKCAKAYA, P. ET AL.: "In vivo CRISPR editing with no detectable genome-wide off-target mutations", NATURE, vol. 561, 2018, pages 416 - 419, XP036902697, DOI: 10.1038/s41586-018-0500-9 |
ALANIS-LOBATO ET AL., PROC. NATL. ACAD. SCI. USA, 2021 |
ALANIS-LOBATO, G. ET AL.: "Frequent loss of heterozygosity in CRISPR-Cas9-edited early human embryos", PROC NATL ACAD SCI U S A, 2021, pages 118 |
ALT, F.W.ZHANG, Y.MENG, F.L.GUO, C.SCHWER, B.: "Mechanisms of programmed DNA lesions and genomic instability in the immune system", CELL, vol. 152, 2013, pages 417 - 429 |
ANDERSON KEITH R ET AL: "CRISPR off-target analysis in genetically engineered rats and mice", NATURE METHODS, NATURE PUBLISHING GROUP US, NEW YORK, vol. 15, no. 7, 21 May 2018 (2018-05-21), pages 512 - 514, XP036538714, ISSN: 1548-7091, [retrieved on 20180521], DOI: 10.1038/S41592-018-0011-5 * |
ANDERSON, K.R. ET AL.: "CRISPR off-target analysis in genetically engineered rats and mice", NAT METHODS, vol. 15, 2018, pages 512 - 514, XP036542157, DOI: 10.1038/s41592-018-0011-5 |
ANZALONE ANDREW V. ET AL: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, vol. 576, no. 7785, 21 October 2019 (2019-10-21), London, pages 149 - 157, XP055980447, ISSN: 0028-0836, Retrieved from the Internet <URL:https://www.nature.com/articles/s41586-019-1711-4> DOI: 10.1038/s41586-019-1711-4 * |
ANZALONE, A.V. ET AL.: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, vol. 576, 2019, pages 149 - 157, XP055899878, DOI: 10.1038/s41586-019-1711-4 |
BOIX ET AL., NATURE, 2021 |
BOIX, C.A., JAMES, B.T., PARK, Y.P., MEULEMAN, W., KELLIS, M.: "Regulatory genomic circuitry of human disease loci by integrative epigenomics", NATURE, vol. 590, 2021, pages 300 - 307, XP037365134, DOI: 10.1038/s41586-020-03145-z |
BOTHMER, A. ET AL.: "Detection and Modulation of DNA Translocations During Multi-Gene Genome Editing in T Cells", CRISPR J, vol. 3, 2020, pages 177 - 187 |
CAMERON, P. ET AL.: "Site-seq: Mapping the genomic landscape of CRISPR-Cas9 cleavage", NAT METHODS, vol. 14, 2017, pages 600 - 606 |
CHEN, P.J. ET AL.: "Enhanced prime editing systems by manipulating cellular determinants of editing outcomes", CELL, vol. 184, 2021, pages 5635 - 5652 |
CHIARLE, R. ET AL.: "Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells", CELL, vol. 147, 2011, pages 107 - 119, XP028304221, DOI: 10.1016/j.cell.2011.07.049 |
CHOI JUNHONG ET AL: "Precise genomic deletions using paired prime editing", NATURE BIOTECHNOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 40, no. 2, 14 October 2021 (2021-10-14), pages 218 - 226, XP037691460, ISSN: 1087-0156, [retrieved on 20211014], DOI: 10.1038/S41587-021-01025-Z * |
DATABASE GenBank [online] 20 November 2022 (2022-11-20), WELLCOME SANGER TREE OF LIFE PROGRAMME: "Hofmannophila pseudospretella genome assembly, chromosome: 11", XP093045215, retrieved from https://www.ncbi.nlm.nih.gov/nuccore/OX376322.1 Database accession no. OX376322.1 * |
DATABASE GenBank [online] 9 June 2022 (2022-06-09), WELLCOME SANGER TREE OF LIFE PROGRAMME: "Piscicola geometra assembly chromosome: 11", XP093045210, retrieved from https://www.ncbi.nlm.nih.gov/nuccore/OX030965.1 Database accession no. OX030965 * |
ELLEFSON ET AL., SCIENCE, 2016 |
ELLEFSON, J.W. ET AL.: "Synthetic evolutionary origin of a proofreading reverse transcriptase", SCIENCE, vol. 352, 2016, pages 1590 - 1593, XP055787498, DOI: 10.1126/science.aaf5409 |
GIANNOUKOS, G. ET AL.: "UDiTaS, a genome editing detection method for indels and genome rearrangements", BMC GENOMICS, vol. 19, 2018, pages 212 |
GRUNEWALD, JULIAN ET AL.: "Engineered CRISPR prime editors with compact, untethered reverse transcriptases", NATURE BIOTECHNOLOGY, 2022, pages 1 - 7 |
HU, J. ET AL.: "Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing", NAT PROTOC, vol. 11, 2016, pages 853 - 871, XP055668234, DOI: 10.1038/nprot.2016.043 |
KIM, D. ET AL.: "Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells", NAT METHODS, vol. 12, 2015, pages 237 - 243, XP055554961, DOI: 10.1038/nmeth.3284 |
KIM, D., KIM, J.S.: "DIG-seq: a genome-wide CRISPR off-target profiling method using chromatin DNA", GENOME RES, vol. 28, 2018, pages 1894 - 1900 |
KLEINSTIVER ET AL., NATURE, 28 January 2016 (2016-01-28) |
LIANG, G. ET AL.: "Frequent gene conversion in human embryos induced by double strand breaks", BIORXIV, 2020 |
LIU ET AL., CELL, 2017 |
LIU PENGPENG ET AL: "Improved prime editors enable pathogenic allele correction and cancer modelling in adult mice", NATURE COMMUNICATIONS, vol. 12, no. 1, 9 April 2021 (2021-04-09), XP055980471, Retrieved from the Internet <URL:http://www.nature.com/articles/s41467-021-22295-w> DOI: 10.1038/s41467-021-22295-w * |
LIU, BIN ET AL.: "A split prime editor with untethered reverse transcriptase and circular RNA template", NATURE BIOTECHNOLOGY, 2022, pages 1 - 6 |
LIU, PENGPENG ET AL.: "Improved prime editors enable pathogenic allele correction and cancer modelling in adult mice", NATURE COMMUNICATIONS, vol. 12, no. 1, 2021, pages 1 - 13, XP055980471, DOI: 10.1038/s41467-021-22295-w |
LIU, X. ET AL.: "CRISPR-Cas9-mediated multiplex gene editing in CAR-T cells", CELL RES, vol. 27, 2017, pages 154 - 157, XP055555205, DOI: 10.1038/cr.2016.142 |
MACLEAN ET AL., NATURE REV. MICROBIOL., vol. 7, 2009, pages 287 - 296 |
MARTIN, M.: "Cutadapt removes adapter sequences from high-throughput sequencing reads", EMBNET.JOURNAL, vol. 17, 2011, pages 10 - 12 |
MUSUNURU, K. ET AL.: "In vivo CRISPR base editing of PCSK9 durably lowers cholesterol in primates", NATURE, vol. 593, 2021, pages 429 - 434, XP037513148, DOI: 10.1038/s41586-021-03534-y |
NELSON, J.W. ET AL.: "Engineered pegRNAs improve prime editing efficiency", NAT BIOTECHNOL, vol. 40, 2022, pages 402 - 410, XP037720612, DOI: 10.1038/s41587-021-01039-7 |
NEWBY, G.A. ET AL.: "Base editing of haematopoietic stem cells rescues sickle cell disease in mice", NATURE, 2021 |
RAMIREZ, F., DUNDAR, F., DIEHL, S., GRUNING, B.A., MANKE, T.: "deepTools: a flexible platform for exploring deep-sequencing data", NUCLEIC ACIDS RES, vol. 42, 2014, pages W187 - W191 |
REN, J. ET AL.: "Multiplex Genome Editing to Generate Universal CAR T Cells Resistant to PD1 Inhibition", CLIN CANCER RES, vol. 23, 2017, pages 2255 - 2266, XP055565027, DOI: 10.1158/1078-0432.CCR-16-1300 |
SHENGDAR Q TSAI ET AL: "CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR–Cas9 nuclease off-targets", NATURE METHODS, vol. 14, no. 6, 1 January 2017 (2017-01-01), New York, pages 607 - 614, XP055424040, ISSN: 1548-7091, DOI: 10.1038/nmeth.4278 * |
SLAYMAKER ET AL., SCIENCE, 1 January 2016 (2016-01-01) |
TRUONG, DONG-JIUNN JEFFERY ET AL.: "Development of an intein-mediated split-Cas9 system for gene therapy", NUCLEIC ACIDS RESEARCH, vol. 43, no. 13, 2015, pages 6450 - 6458, XP055791410, DOI: 10.1093/nar/gkv601 |
TSAI ET AL., NAT BIOTECHNOL, 2015 |
TSAI SHENGDAR Q ET AL: "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases", NATURE BIOTECHNOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 33, no. 2, 16 December 2014 (2014-12-16), pages 187 - 197, XP037614260, ISSN: 1087-0156, [retrieved on 20141216], DOI: 10.1038/NBT.3117 * |
TSAI, S.Q. ET AL.: "CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets", NAT METHODS, vol. 14, 2017, pages 607 - 614, XP055424040, DOI: 10.1038/nmeth.4278 |
TSAI, S.Q. ET AL.: "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases", NAT BIOTECHNOL, vol. 33, 2015, pages 187 - 197, XP055555627, DOI: 10.1038/nbt.3117 |
VOELKERDING ET AL., CLINICAL CHEM., vol. 55, 2009, pages 641 - 658 |
WEI, P.C. ET AL.: "Long Neural Genes Harbor Recurrent DNA Break Clusters in Neural Stem/Progenitor Cells", CELL, vol. 164, 2016, pages 644 - 655, XP029416800, DOI: 10.1016/j.cell.2015.12.039 |
WIENERT, B. ET AL.: "Unbiased detection of CRISPR off-targets in vivo using DISCOVER-Seq", SCIENCE, vol. 364, 2019, pages 286 - 289, XP055787709, DOI: 10.1126/science.aav9023 |
YAN, W.X. ET AL.: "BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks", NAT COMMUN, vol. 8, 2017, pages 15058, XP055485619, DOI: 10.1038/ncomms15058 |
YIN, J. ET AL.: "Optimizing genome editing strategy by primer-extension-mediated sequencing", CELL DISCOV, vol. 5, 2019, pages 18, XP055773402, DOI: 10.1038/s41421-019-0088-8 |
YU ZHENXING ET AL: "PEAC-seq adopts Prime Editor to detect CRISPR off-target and DNA translocation", NATURE COMMUNICATIONS, vol. 13, no. 1, 12 December 2022 (2022-12-12), XP093044844, Retrieved from the Internet <URL:https://www.nature.com/articles/s41467-022-35086-8> DOI: 10.1038/s41467-022-35086-8 * |
ZHANG, GUIQUAN ET AL.: "Enhancement of prime editing via xrRNA motif joined pegRNA", NATURE COMMUNICATIONS, vol. 13, no. 1, 2022, pages 1 - 12 |
ZHANG, GUIQUAN ET AL.: "Enhancement of prime editing via xrRNA motif-joined pegRNA", NATURE COMMUNICATIONS, vol. 13, no. 1, 2022, pages 1 - 12 |
ZONG YUAN ET AL: "An engineered prime editor with enhanced editing efficiency in plants", NATURE BIOTECHNOLOGY, vol. 40, no. 9, 24 March 2022 (2022-03-24), New York, pages 1394 - 1402, XP093045317, ISSN: 1087-0156, Retrieved from the Internet <URL:https://www.nature.com/articles/s41587-022-01254-w> DOI: 10.1038/s41587-022-01254-w * |
ZONG, Y. ET AL.: "An engineered prime editor with enhanced editing efficiency in plants", NAT BIOTECHNOL, vol. 40, 2022, pages 1394 - 1402 |
ZONG, YUAN ET AL.: "An engineered prime editor with enhanced editing efficiency in plants", NATURE BIOTECHNOLOGY, 2022, pages 1 - 9 |
ZUCCARO, M.V. ET AL.: "Allele-Specific Chromosome Removal after Cas9 Cleavage in Human Embryos", CELL, vol. 183, 2020, pages 1650 - 1664 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107586835B | Method for constructing a next-generation sequencing library based on single-stranded adapters, and application thereof | 
JP7229923B2 | Methods for assessing nuclease cleavage | 
ES2955957T3 | CRISPR hybrid DNA/RNA polynucleotides and methods of use | 
CN110734908A | Method for constructing a high-throughput sequencing library and kit for library construction | 
JP7426370B2 | Preparative electrophoresis method for targeted purification of genomic DNA fragments | 
JP2018532419A | CRISPR-Cas sgRNA library | 
JP2020501554A | Method for increasing the throughput of single-molecule sequencing by concatenating short DNA fragments | 
JP2016538001A | Somatic haploid human cell line | 
TW201321518A | Library preparation method for trace nucleic acid samples and application thereof | 
EP1969146A1 | Methods for nucleic acid mapping and identification of fine-structural variations in nucleic acids, and uses thereof | 
CN109880851B | Screening reporter vector and screening method for enriching cells with CRISPR/Cas9-mediated homologous recombination repair | 
EP3730616A1 | Split single-base gene editing systems and application thereof | 
WO2015144045A1 | Plasmid library comprising two random markers and use thereof in high-throughput sequencing | 
EP4159853A1 | Genome editing system and method | 
US11519026B2 (en) | Methods for removal of adaptor dimers from nucleic acid sequencing preparations | |
WO2019173248A1 | Engineered nucleic acid-targeting nucleic acids | 
Glick et al. | Medical biotechnology | |
CN116716298A | Prime editing system and method for site-directed modification of a target gene sequence | 
US11661624B2 (en) | Methods of identifying and characterizing gene editing variations in nucleic acids | |
WO2023060539A1 | Compositions and methods for detecting target cleavage sites of CRISPR/Cas nucleases and DNA translocation | 
WO2023016021A1 | Base editing tool and construction method thereof | 
WO2024119461A1 | Compositions and methods for detecting target cleavage sites of CRISPR/Cas nucleases and DNA translocation | 
JP2024509047A | CRISPR-associated transposon systems and methods of use thereof | 
US20240279728A1 (en) | Detecting a dinucleotide sequence in a target polynucleotide | |
US20240287609A1 (en) | Compositions and methods for large-scale in vivo genetic screening |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22840008; Country of ref document: EP; Kind code of ref document: A1 |