WO2022271725A1 - Detection of CRISPR genome modification on a cell-by-cell basis
- Publication number
- WO2022271725A1 (application PCT/US2022/034376)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cells
- sequencing
- sets
- genetically modified
- basis
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1065—Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/02—Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
Definitions
- Various molecular tools could be designed to produce a desired genomic modification. Different molecular tools that can produce a desired genetic modification may exhibit different efficacy of achieving the desired genetic modification. Moreover, different molecular tools may exhibit different effects on the expression of the non-target genes and on the global gene expression in the genetically modified cell.
- An assay is provided that can be used to confirm that a molecular tool, such as CRISPR-Cas9, designed to produce a genetic modification actually produces the desired genetic modification, particularly on a cell-by-cell basis.
- An assay is also provided for screening molecular tools that are designed for a genetic modification, particularly, on a cell-by-cell basis, to identify the molecular tool that produces a desired genomic modification. Further, an assay is provided for screening the effects of a molecular tool that is designed to generate a genetic modification on the global gene expression profile of the genetically modified cells.
- A kit is also provided that can be used for performing the methods disclosed herein.
- FIG. 1 Schematic representation of an exemplary embodiment of the disclosure.
- (1) CRISPR-based genome editing to introduce changes into a gene’s sequence; (2) long-read sequencing to characterize the CRISPR-based alterations based on changes in the mRNA sequence; (3) cDNA barcoding to determine which cell or cell population has the CRISPR edit; (4) linkage of the CRISPR edit observed in the long-read sequence data with the short-read sequencing data from the same cell or set of cells carrying the aforementioned CRISPR edit.
- Long-read sequencing encompasses read lengths greater than 500-600 bases.
- Short-read sequences are defined as read lengths less than 500-600 bases.
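Steps (3) and (4) above can be illustrated with a minimal Python sketch, assuming each read has already been assigned a cell barcode. The 500-base cutoff, the record layouts, and the function names are hypothetical choices made for illustration; the text itself gives only a 500-600 base range and does not prescribe an implementation.

```python
from collections import defaultdict

# Assumed cutoff: the text gives a 500-600 base range; 500 is an arbitrary pick.
LONG_READ_MIN_LENGTH = 500


def classify_read(length, cutoff=LONG_READ_MIN_LENGTH):
    """Label a read as 'long' or 'short' by its length in bases."""
    return "long" if length > cutoff else "short"


def link_edits_to_expression(long_read_calls, short_read_counts):
    """Join per-cell edit calls (long reads) to per-cell gene counts (short reads).

    long_read_calls: iterable of (cell_barcode, edit_detected) pairs
    short_read_counts: iterable of (cell_barcode, gene, count) triples
    Returns {cell_barcode: {"edited": bool, "counts": {gene: count}}}
    for barcodes present in both data sets.
    """
    edited = defaultdict(bool)
    for barcode, edit_detected in long_read_calls:
        # A cell is called edited if any of its long reads carries the edit.
        edited[barcode] = edited[barcode] or edit_detected

    counts = defaultdict(dict)
    for barcode, gene, count in short_read_counts:
        counts[barcode][gene] = counts[barcode].get(gene, 0) + count

    # The shared cell barcode is the only link between the two data sets.
    return {
        bc: {"edited": edited[bc], "counts": counts[bc]}
        for bc in edited
        if bc in counts
    }
```

The join is keyed only on the cell barcode, which mirrors the role the cDNA barcode plays in step (3); cells seen in only one of the two data sets are dropped in this sketch.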
- FIGS. 2A-2E (A) Overview of single-cell short/long-read integration strategy.
- FIGS. 3A-3C (A) Overview of single-cell CRISPR screen integrated with long-read sequencing. (B) Boxplot showing the ratio of short PTPRC transcript isoforms (RO and RB) for cells with guide RNAs targeting the indicated genes. P values are calculated in comparison with the nontarget cells. Genes with fewer than 3 cells carrying target guide RNAs are not shown. (C) Heatmap showing the proportion of each transcript isoform (x-axis) within each cell (y-axis), with clustering based on transcript isoform proportion, for cells having the indicated guide RNA sequence.
- FIGS. 4A-4C (A) Overview of how splicing factors affect alternative splicing. (B) Quantification of short transcript isoforms per target gene. For each gene (x-axis), cells with guide RNAs targeting the gene were grouped, and the ratio of transcript isoforms RO and RB among all PTPRC isoforms is shown as a box plot. (C)
- FIG. 5 illustrates a process of using a base editor to introduce engineered gene mutations into single cells.
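The per-cell quantity plotted in FIG. 3B can be sketched as follows, as an illustration only. The input layout, the function names, and any isoform label other than RO and RB are assumptions; the `min_cells=3` filter follows the legend's statement that genes with fewer than 3 cells are not shown, applied here to cells with at least one PTPRC read.

```python
from collections import defaultdict

# Short PTPRC isoforms named in the figure legend.
SHORT_ISOFORMS = {"RO", "RB"}


def short_isoform_ratio(isoform_counts):
    """Fraction of a cell's PTPRC reads that come from the short isoforms."""
    total = sum(isoform_counts.values())
    if total == 0:
        return None  # no PTPRC reads observed in this cell
    short = sum(n for iso, n in isoform_counts.items() if iso in SHORT_ISOFORMS)
    return short / total


def ratios_by_target(cells, min_cells=3):
    """Group per-cell ratios by guide target gene; drop targets with too few cells.

    cells: iterable of (target_gene, {isoform_name: read_count}) pairs
    """
    grouped = defaultdict(list)
    for target, counts in cells:
        ratio = short_isoform_ratio(counts)
        if ratio is not None:
            grouped[target].append(ratio)
    return {t: r for t, r in grouped.items() if len(r) >= min_cells}
```

Each list returned by `ratios_by_target` corresponds to one box in a boxplot of this kind; the statistical comparison against nontarget cells is not covered by this sketch.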
- FIG. 6 illustrates a multiplexed sequencing approach to identify mutations from single cell RNA-seq.
- SPEX refers to single cell prime extension (SPEX).
- FIG. 7 illustrates single-cell level detection of CRISPR induced TP53 mutations and their effect on single cell expression.
- The terms “polynucleotide” and “nucleic acid,” used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
- By “hybridizable” or “complementary” or “substantially complementary” it is meant that a nucleic acid (e.g. RNA, DNA) comprises a sequence of nucleotides that enables it to non-covalently bind, i.e. form Watson-Crick base pairs and/or G/U base pairs, “anneal”, or “hybridize,” to another nucleic acid in a sequence-specific, antiparallel, manner (i.e., a nucleic acid specifically binds to a complementary nucleic acid) under the appropriate in vitro and/or in vivo conditions of temperature and solution ionic strength.
- Standard Watson-Crick base-pairing includes: adenine (A) pairing with thymidine (T), adenine (A) pairing with uracil (U), and guanine (G) pairing with cytosine (C) [DNA, RNA]
- guanine (G) can also base pair with uracil (U).
- G/U base-pairing is at least partially responsible for the degeneracy (i.e., redundancy) of the genetic code in the context of tRNA anti-codon base-pairing with codons in mRNA.
- a guanine (G) e.g., of dsRNA duplex of a guide RNA molecule; of a guide RNA base pairing with a target nucleic acid, etc.
- U uracil
- A an adenine
- when a G/U base-pair can be made at a given nucleotide position of a dsRNA duplex of a guide RNA molecule, the position is not considered to be non-complementary, but is instead considered to be complementary.
- sequence of a polynucleotide need not be 100% complementary to that of its target nucleic acid to be specifically hybridizable or hybridizable. Moreover, a polynucleotide may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a bulge, a loop structure or hairpin structure, etc.).
- a polynucleotide can comprise 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, 99.5% or more, or 100% sequence complementarity to a target region within the target nucleic acid sequence to which it will hybridize.
- an antisense nucleic acid in which 18 of 20 nucleotides of the antisense compound are complementary to a target region, and would therefore specifically hybridize would represent 90 percent complementarity.
- the remaining noncomplementary nucleotides may be clustered or interspersed with complementary nucleotides and need not be contiguous to each other or to complementary nucleotides.
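- The 18-of-20 example above can be checked with a short calculation. The following is a minimal sketch, not the BLAST-based determination mentioned in this disclosure: it assumes two equal-length sequences compared without gaps in antiparallel orientation, and the function name `percent_complementarity` and the `allow_gu` option are hypothetical names introduced only for illustration.

```python
# Illustrative sketch only: ungapped, equal-length comparison (not BLAST).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"),
                ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # G/U pairs, counted only when allow_gu=True


def percent_complementarity(probe, target, allow_gu=False):
    """Percent of probe positions that pair with the target.

    Both sequences are written 5'->3'; pairing is antiparallel, so the
    first base of the probe faces the last base of the target.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    pairs = zip(probe.upper(), reversed(target.upper()))
    paired = sum((p, t) in allowed for p, t in pairs)
    return 100.0 * paired / len(probe)


# Hypothetical 20-nt target and a probe with two mismatched positions.
target = "ATGCATGCATGCATGCATGC"
probe = "AAATGCATGCATGCATGCAT"
print(percent_complementarity(probe, target))  # 90.0
```

- With `allow_gu=True`, a G facing a U is counted as complementary, which mirrors the treatment of G/U base pairs described above.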
- Percent complementarity between particular stretches of nucleic acid sequences within nucleic acids can be determined using any convenient method. Example methods include BLAST programs (basic local alignment search tools) and PowerBLAST programs (Altschul et al., J. Mol.
- Binding refers to a non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid; between a modified CRISPR/Cas effector polypeptide/guide RNA complex and a target nucleic acid; and the like).
- While in a state of non-covalent interaction, the macromolecules are said to be “associated” or “interacting” or “binding” (e.g., when a molecule X is said to interact with a molecule Y, it is meant the molecule X binds to molecule Y in a non-covalent manner). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), but some portions of a binding interaction may be sequence-specific.
- Binding interactions are generally characterized by a dissociation constant (KD) of less than 10⁻⁶ M, less than 10⁻⁷ M, less than 10⁻⁸ M, less than 10⁻⁹ M, less than 10⁻¹⁰ M, less than 10⁻¹¹ M, less than 10⁻¹² M, less than 10⁻¹³ M, less than 10⁻¹⁴ M, or less than 10⁻¹⁵ M.
- Affinity refers to the strength of binding, increased binding affinity being correlated with a lower KD.
- a “cell” as used herein, denotes an in vivo or in vitro eukaryotic cell or a cell line.
- a “binding site for a guide-RNA” as used herein is a polynucleotide (e.g., DNA such as genomic DNA) that includes a site ("target site” or "target sequence") targeted by a modified CRISPR/Cas effector polypeptide.
- the target sequence is the sequence to which the guide sequence of a guide nucleic acid (e.g., guide RNA; e.g., a dual guide RNA or a single-molecule guide RNA) will hybridize.
- the target site (or target sequence) 5'-GAGCAUAUC-3' within a target nucleic acid is targeted by (or is bound by, or hybridizes with, or is complementary to) the sequence 5’- -3’.
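- The sequence that hybridizes with a given target site can be derived by reverse-complementing the target. The following is a minimal sketch that assumes an RNA alphabet and standard Watson-Crick pairing only (no G/U pairs); the function name `rna_reverse_complement` is a hypothetical helper introduced for illustration.

```python
# Illustrative sketch: antiparallel Watson-Crick complement of an RNA sequence.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(seq):
    """Return the 5'->3' sequence that pairs with `seq` (given 5'->3')."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse to restore 5'->3' orientation.
    return seq.translate(RNA_COMPLEMENT)[::-1]


print(rna_reverse_complement("GAGCAUAUC"))  # GAUAUGCUC
```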
- Suitable hybridization conditions include physiological conditions normally present in a cell.
- the strand of the target nucleic acid that is complementary to and hybridizes with the guide RNA is referred to as the “complementary strand” or “target strand”; while the strand of the target nucleic acid that is complementary to the “target strand” (and is therefore not complementary to the guide RNA) is referred to as the “non-target strand” or “non complementary strand.”
- long-read sequencing refers to sequencing read lengths greater than 500 bases, particularly, longer than 600 bases.
- short read sequencing refers to sequencing read lengths less than 600 bases, particularly, less than 500 bases.
- the terms “may,” “optional,” “optionally,” or “may optionally” mean that the subsequently described circumstance may or may not occur, so that the description includes instances where the circumstance occurs and instances where it does not.
- long read sequencing such as single molecule real time (SMRT) sequencing or nanopore sequencing
- a cell’s long read sequencing can be combined with the cell’s short read transcriptome information (FIG. 2A).
- an assay is provided herein that allows one to evaluate cells, on the basis of single cells or sets of cells, that are genetically modified, for example, via CRISPR-mediated genetic edit.
- the assay disclosed herein allows: (1) confirming the genomic modification, for example, CRISPR edit, based on the target gene’s mRNA; (2) assigning a desired genetic modification, for example, CRISPR-based genomic edit, to an individual cell or set of cells; and (3) determining the effects of a genetic modification on cellular phenotypes such as global gene or protein expression.
- Certain non-limiting examples of such molecular tools include: 1) incorporation of a genetic material into a targeted site in the genome, for example, via homologous recombination; 2) random incorporation of genetic material into a target chromosome; 3) introduction of random mutations in a target genetic material, for example, via exposure to mutagens.
- More recent tools for introducing genetic modifications in a target genome include programmable nuclease-based genome editing.
- Programmable nucleases such as zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR)-Cas-associated nucleases provide targeted gene editing platforms.
- Nuclease-based genetic modification involves targeted alterations in genomic regions based on nuclease-induced double-stranded breaks (DSBs) at a specific desired locus in the target genome.
- DSBs lead to the production of damaged DNA and stimulation of the cell’s DNA repair mechanisms, such as homology-directed repair (HDR) and nonhomologous end-joining (NHEJ).
- CRISPR-based genome editing is used to introduce genetic modifications.
- the assay involves: CRISPR-based genome editing to introduce changes into a gene’s sequence and long-read sequencing to characterize the CRISPR-based alterations based on the changes in the mRNA sequence.
- the long-read sequence can involve cDNA barcoding to determine which cell or set of cells has the desired CRISPR edit.
- the CRISPR edit observed in the long-read sequence data can be linked with the short-read sequencing from the same cell or set of cells to determine the effects of the genetic modification on the cell, for example, on global gene expression.
- the CRISPR system suitable for use in the methods of the present disclosure can be CRISPR-Cas9.
- a guide nucleic acid suitable for inclusion in a CRISPR-system used in the present disclosure can include: i) a first segment (referred to herein as a “targeting segment”); and ii) a second segment (referred to herein as a “protein-binding segment”).
- a “segment” is a region of a molecule, e.g., a contiguous stretch of nucleotides in a nucleic acid molecule.
- a segment can also be a section of a complex such that a segment may comprise regions of more than one molecule.
- the “targeting segment” is also referred to herein as a “variable region” of a guide RNA.
- the “protein-binding segment” is also referred to herein as a “constant region” of a guide RNA.
- the first segment (targeting segment) of a guide RNA includes a nucleotide sequence (a guide sequence) that is complementary to (and therefore hybridizes with) a specific sequence (a target site) within a target nucleic acid (e.g., a target ssRNA, a target ssDNA, the complementary strand of a double stranded target DNA, etc.).
- the protein-binding segment (or “protein-binding sequence”) interacts with, for example, binds to, a CRISPR/Cas effector polypeptide.
- the protein-binding segment of a guide RNA includes two complementary stretches of nucleotides that hybridize to one another to form a double stranded RNA duplex (dsRNA duplex).
- Site-specific binding and/or cleavage of a target nucleic acid can occur at locations (e.g., target sequence of a target locus) determined by base-pairing complementarity between the guide RNA (the guide sequence of the guide RNA) and the target nucleic acid.
- a guide RNA and a CRISPR/Cas effector polypeptide form a complex (e.g., bind via non-covalent interactions).
- the guide RNA provides target specificity to the complex by including a targeting segment, which includes a guide sequence (a nucleotide sequence that is complementary to a sequence of a target nucleic acid).
- the CRISPR/Cas effector polypeptide of the complex provides the site-specific activity (e.g., cleavage activity or an activity provided by the CRISPR/Cas effector polypeptide when the CRISPR/Cas effector polypeptide is a CRISPR/Cas effector polypeptide fusion polypeptide, i.e., has a fusion partner).
- the CRISPR/Cas effector polypeptide is guided to a target nucleic acid sequence (e.g. a target sequence in a chromosomal nucleic acid, e.g., a chromosome; a target sequence in an extrachromosomal nucleic acid, e.g. an episomal nucleic acid, a minicircle, an ssRNA, an ssDNA, etc.; a target sequence in a mitochondrial nucleic acid; a target sequence in a chloroplast nucleic acid; a target sequence in a plasmid; a target sequence in a viral nucleic acid; etc.) by virtue of its association with the guide RNA.
- the “guide sequence” also referred to as the “targeting sequence” of a guide RNA can be modified so that the guide RNA can target a CRISPR/Cas effector polypeptide to any desired sequence of any desired target nucleic acid, with the exception that the protospacer adjacent motif (PAM) sequence can be considered.
- a guide RNA can have a targeting segment with a sequence (a guide sequence) that has complementarity with (e.g., can hybridize to) a sequence in a nucleic acid in a eukaryotic cell, e.g., a viral nucleic acid, a eukaryotic nucleic acid (e.g., a eukaryotic chromosome, chromosomal sequence, a eukaryotic RNA, etc.), and the like.
- a guide RNA includes two separate nucleic acid molecules: an “activator” and a “targeter” and is referred to herein as a “dual guide RNA,” a “double-molecule guide RNA,” a “two-molecule guide RNA,” or a “dgRNA.”
- the activator and targeter are covalently linked to one another (e.g., via intervening nucleotides) and the guide RNA is referred to as a “single guide RNA,” a “Cas9 single guide RNA,” a “single-molecule Cas9 guide RNA,” or a “one-molecule Cas9 guide RNA,” or simply “sgRNA.”
- the target DNA can be a genomic nucleic acid, a mitochondrial nucleic acid; a chloroplast nucleic acid; a plasmid; or a viral nucleic acid.
- the target DNA can be isolated from a cell or can be within an intact cell.
- the RNA-guided endonuclease, for example, Cas9, and the guide RNA are transfected into the cells to contact the RNA-guided endonuclease with the genomic DNA of the cell.
- An example of such transfection method is disclosed in the “Materials and Methods” below.
- CRISPR is used to target expressed genes.
- the guide RNAs can be designed to target (1) exon-intron junctions, (2) exon sequences, or (3) regulatory sequences. Once the guide RNAs are selected, they can be synthesized for singleplex or multiplex transduction.
- CRISPR-mediated genetic editing can involve different cell delivery systems, including: (1) plasmid transfection; (2) viral transduction that stably integrates the gRNA sequence into the genome; (3) a gRNA (singleplex or multiplex) that is directly associated with Cas9 for ribonucleoprotein (RNP) delivery.
- Any of these Cas9-guide RNA delivery systems can be used to genetically modify cells grown in tissue culture or directly on tissue using RNP delivery.
- CRISPR/Cas9 and other enzymes in the class introduce double-stranded DNA breaks (DSBs); this genomic alteration leads to insertions and deletions (indels).
- Base editors introduce point mutations without a DNA double-strand break (DSB) or a requirement for template donor DNA (Gaudelli, 2017; Komor, 2016; Nishida, 2016; Kim, 2019).
- CBEs cytosine base editors
- ABEs adenine base editors
- CBEs were developed by combining APOBEC1 enzymes, which remove an amine group from cytosine, with catalytically dead Cas9 (dCas9) or Cas9 nickase (nCas9) (Komor, 2016).
- ABEs involve fusing an adenine deaminase to the Cas9 variant. Because an adenine deaminase accepts single-stranded DNA as a substrate, researchers created new ssDNA-targetable enzymes with engineered adenine deaminases (Gaudelli, 2017; Kim, 2019).
- Base editors allow specific point mutations to be engineered into the genome, and the method allows their detection at single-cell resolution. It does this by using base editor technology to introduce the mutation, followed by single-cell long-read sequencing to determine which cells have the mutation. Single cells undergo targeted sequencing of the engineered mutation, that is, targeted sequencing of the specific gene subjected to base editing. This involves using a special primer that provides multiplexed amplification of the cDNA targets that were engineered. Then, the targeted products undergo long-read sequencing and the mutation is identified.
- The effect of a point mutation on the cell’s function can be determined by integrating the long-read sequencing, which identifies the cells with the point mutation, and the short-read sequencing, which identifies changes in that specific cell’s gene expression. By combining the long- and short-read cell barcodes, one obtains single-cell sequence data for both complete cDNAs and gene expression. The process of using a base editor to introduce engineered gene mutations into single cells is illustrated in FIG. 5.
- the first step is the binding of base editor-gRNA complex to its target DNA.
- upon base pairing of the gRNA molecule with the complementary target DNA strand, approximately 20 nt of single-stranded DNA is displaced.
- the deaminase enzyme edits the target DNA bases within this ssDNA (i.e., R-loop).
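- The outcome of cytosine base editing on the displaced strand can be pictured with a toy model. The following is a simplified sketch, not a model of any particular editor: it assumes that every cytosine inside an editing window of the roughly 20-nt displaced strand is read out as thymine after editing, and the window coordinates (`window=(4, 8)`), the function name `apply_cbe`, and the example sequence are all hypothetical choices made for illustration.

```python
# Toy model of a cytosine base editor (CBE) acting on the displaced strand.
# Assumption: all C bases inside the window are converted (C -> T readout).
def apply_cbe(protospacer, window=(4, 8)):
    """Return the protospacer with C->T changes inside a 1-based inclusive window."""
    start, end = window
    if not (1 <= start <= end <= len(protospacer)):
        raise ValueError("window must lie within the protospacer")
    bases = list(protospacer.upper())
    for i in range(start - 1, end):  # convert to 0-based indices
        if bases[i] == "C":
            bases[i] = "T"
    return "".join(bases)


original = "GACCCATGCATGCATGCATG"  # hypothetical 20-nt protospacer
edited = apply_cbe(original)
print(original)
print(edited)
```

- In this example only the cytosines at positions 4 and 5 fall inside the assumed window, so the cytosines at positions 3 and 9 are left unchanged.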
- Base editors work efficiently in human cells, with efficiencies comparable to those of Cas9 (Kim, 2019). Therefore, base editors are an adaptable tool for introducing various genetic substitution mutations in the genome. Using a specially designed gRNA that acts as a repair template, prime editors introduce the mutation.
- An assay disclosed herein can be applied across a range of cell numbers.
- a multiplexed transduction with a guide RNA library can be conducted.
- for analyzing sets of cells and assigning genetic modifications, e.g., CRISPR edits, a multiplexed transduction can be conducted on sets of cells that are grouped into different partitions, separate wells, or separate plates.
- the cells can be grown, harvested as a single cell suspension and cDNA can be prepared.
- a cell indexing barcode can be incorporated into the cDNAs at the 5’ or 3’ end such that one can assign a set of cells to a given guide RNA library.
- the barcode can be used on different numbers of cells, ranging from one cell to a group of cells.
- cells can be grown in partitions, wells, or plates.
- each set of cells can be transduced with a CRISPR-Cas9 involving a multiplexed pool of guide RNAs.
- Intact cDNA can be prepared for sequencing without any fragmentation. Avoiding fragmentation retains the full length of the cDNA as an extended molecule.
- targeted sequencing can be performed on the gene or sets of genes that were targeted for modification.
- the targeted sequence library preparation can involve: (1) PCR amplification, (2) selective hybridization with a bait oligonucleotide or (3) single primer extension of the target gene or cDNA from the target gene.
- the sequencing library preparation can be performed with a full-length cDNA without any fragmentation.
- long-read sequencing can be conducted.
- the Oxford Nanopore™ or Pacific Biosciences™ sequencing methods, which generate long-read sequences, can be used.
- the cell indexing barcode can be first identified from the long-read sequence to determine which cells were exposed to a given guide RNA. Then, how the target cDNA sequence was changed could be determined.
- the long read sequence can cover the entire mRNA sequence and, therefore, a specific genetic modification can be found at any location in the transcript and still be linked to the cell index barcode at the 5’ or 3’ end.
- the cell indexing barcode from the long read with the CRISPR genotype can be matched with the same cell indexing barcode from short read data (RNA-Seq or antibody barcodes). Linking these two barcodes enables the CRISPR genotype to be assigned to a given molecular phenotype for a specific population of cells.
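- The barcode matching described above amounts to a join on the cell indexing barcode. The following is a minimal sketch under stated assumptions: genotype calls from long reads and expression values from short reads are assumed to be already available as dictionaries keyed by barcode, and the function name `link_by_barcode`, the genotype labels, and the example values are hypothetical.

```python
# Illustrative join of long-read genotypes and short-read phenotypes by barcode.
def link_by_barcode(long_read_genotypes, short_read_expression):
    """Pair each barcode's CRISPR genotype with its expression profile.

    Barcodes seen in only one of the two data sets are left out.
    """
    linked = {}
    for barcode, genotype in long_read_genotypes.items():
        if barcode in short_read_expression:
            linked[barcode] = {
                "genotype": genotype,
                "expression": short_read_expression[barcode],
            }
    return linked


# Hypothetical inputs: barcode -> genotype call, barcode -> {gene: read count}.
genotypes = {"BC1": "edited", "BC2": "unedited", "BC3": "edited"}
expression = {"BC1": {"PTPRC": 5}, "BC2": {"PTPRC": 9}}

for barcode, record in link_by_barcode(genotypes, expression).items():
    print(barcode, record["genotype"], record["expression"])
```

- Barcode BC3 has a genotype but no short-read data in this example, so it does not appear in the linked result.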
- certain embodiments of the invention disclose how single cell long read analysis and genetic modifications, e.g., CRISPR edits, can be used to directly confirm the genetic modifications and used in cellular engineering applications.
- Certain embodiments of the disclosure provide a method for analyzing cells, comprising:
- step (c) on the basis of single cells or sets of cells, comparing the identified modification in the target gene with the modification expected in step (a).
- Any target gene can be genetically modified. Also, any portion of the target gene can be modified, which includes: exon-intron junction, protein-coding sequence of a gene, promoter of a gene, or 3’ untranslated region of a gene.
- the term “on the basis of a single cell” as used herein indicates that the analysis is made on a cell-by-cell basis. For example, the coding sequence of the mRNA encoded by the target gene is identified in individual cells from the population of genetically modified cells. Similarly, the identified target modification in individual cells is compared to the modification expected in step (a).
- the term “on the basis of sets of cells” as used herein indicates that the analysis is made on different sets of cells. For example, the coding sequence of the mRNA encoded by the target gene is identified in different sets of cells, particularly, wherein different sets of cells could be descendants from different cells from the population of genetically modified cells. Similarly, the identified target modification in sets of cells is compared to the modification expected in step (a).
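- The per-cell or per-set comparison of an identified modification with the expected modification can be sketched as a simple classification. The sketch below assumes that a consensus coding sequence has already been derived for each cell or set of cells and uses exact string comparison, which is a simplification; the function name `classify_cells`, the category labels, and the example sequences are hypothetical.

```python
# Illustrative classification of cells by comparing observed and expected sequences.
def classify_cells(observed_by_cell, expected_seq, reference_seq):
    """Label each cell (or set of cells) by its observed coding sequence."""
    calls = {}
    for cell, seq in observed_by_cell.items():
        if seq == expected_seq:
            calls[cell] = "expected_edit"       # matches the designed modification
        elif seq == reference_seq:
            calls[cell] = "unedited"            # matches the unmodified sequence
        else:
            calls[cell] = "other_modification"  # e.g., an unintended change
    return calls


reference = "ATGGCCAAGTGA"  # hypothetical unmodified coding sequence
expected = "ATGGCTAAGTGA"   # hypothetical designed edit (C -> T at one position)
observed = {
    "cell_1": "ATGGCTAAGTGA",
    "cell_2": "ATGGCCAAGTGA",
    "cell_3": "ATGGAAGTGA",
}
print(classify_cells(observed, expected, reference))
```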
- sequencing the mRNA encoded by the target gene from the genetically modified cells is performed on the basis of single cells. Also, in some cases, sequencing the mRNA encoded by the target gene from the genetically modified cells is performed on the basis of sets of cells.
- step (b) can comprise: (i) separating single cells or sets of cells from the genetically modified cells,
- step (ii) reverse transcribing the mRNAs encoded by the target genes from the single cells or sets of cells separated in step (i) to produce cDNAs, wherein the primer for the reverse transcription of the mRNAs comprises a unique barcode on the basis of single cells or sets of cells,
- 100 unique barcodes can be incorporated in 100 reverse transcription primers, each of which contains: 1) a sequence that binds to the mRNA encoded by the target gene and 2) a unique barcode.
- the reverse transcription primer can contain a primer binding site that could be used to subsequently amplify the cDNA.
- the sequence that binds to the mRNA encoded by the target gene can be the same or different in the different reverse transcription primers.
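- The primer layout described above (a primer binding site, a unique barcode, and a sequence that binds the target mRNA) can be sketched as follows. This is an illustration only: the barcodes are simply enumerated in order, whereas a practical design would likely also screen barcodes for properties such as homopolymers or similarity to one another. The function name `make_rt_primers`, the 8-nt barcode length, and both placeholder sequences are assumptions.

```python
import itertools


# Illustrative construction of barcoded reverse transcription primers.
def make_rt_primers(n, target_binding_seq, pcr_handle="", barcode_len=8):
    """Return {barcode: primer}, primer = handle + barcode + target-binding sequence (5'->3')."""
    if n > 4 ** barcode_len:
        raise ValueError("barcode length too short for the requested number of primers")
    all_barcodes = ("".join(p) for p in itertools.product("ACGT", repeat=barcode_len))
    barcodes = itertools.islice(all_barcodes, n)
    return {bc: pcr_handle + bc + target_binding_seq for bc in barcodes}


primers = make_rt_primers(
    100,
    target_binding_seq="CATGCATGCATGCATGCA",  # placeholder gene-specific sequence
    pcr_handle="GGGAAACCCTTTGGG",             # placeholder primer binding site
)
print(len(primers))
```

- Each of the 100 primers shares the same handle and target-binding sequence and differs only in its barcode, so a cDNA can later be attributed to the cell or set of cells whose primer produced it.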
- the cDNAs are attached to the beads and the method comprises in step (iii), pooling the beads.
- the reverse transcription primers could be attached to the beads, thereby attaching the amplified cDNA to the beads.
- the step of separating single cells or sets of cells from the genetically modified cells comprises separating the single cells or sets of cells in individual wells of a multi-well plate or separating the cells in individual droplets in an emulsion.
- mRNA can be isolated from the single cells.
- the step of producing the cDNA can be performed in the droplet.
- single cells cultured in individual wells can be grown into multiple cells, i.e., to produce sets of cells that descend from the single cells.
- mRNA can be isolated from the sets of cells and treated according to the methods disclosed herein.
- the cDNAs produced from single cells or sets of cells are sequenced.
- substantial entirety of the mRNAs encoded by the target gene is sequenced.
- the term “substantial entirety of an mRNA” includes sequences from the first exon to the last exon of a transcript with the possible exception of the sequences at the 5’ end of the first exon and the 3’ end of the last exon.
- the sequences at the 5’ end of the first exon and the 3’ end of the last exon could be used for primer binding and, therefore, mutations in these sequences may not be detected.
- the method disclosed herein comprises PCR amplifying the cDNAs containing the unique barcodes.
- the reverse transcription primer can contain a primer binding site that could be used to subsequently amplify the cDNA. Accordingly, one primer of the primer pair that amplifies a cDNA can bind to the primer binding site introduced into the cDNA via the reverse transcription primer. In certain such cases, the other primer can bind to the sequence at the 3’ end of the cDNA.
- primer pairs can be designed that specifically bind to the sequences at the 5’ and the 3’ ends of the cDNA. Such sequences can be designed based on the sequence of the target gene. Therefore, in some cases, the primer pair for amplifying the cDNAs comprises: a first primer that hybridizes with the sequence at the 5’ end of the mRNA encoded by the target gene and a second primer that hybridizes with the sequence at the 3’ end of the mRNA encoded by the target gene, and wherein one or both of the primers in the primer pair comprise a unique barcode.
- the method disclosed herein involves amplifying the cDNA produced from a single cell or a set of cells using a primer pair. The amplification product so produced contains the barcode introduced into the cDNA thereby indicating the source cell or group of cells of the cDNA.
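- Attributing each amplification product to its source cell or group of cells can be sketched as a lookup on the barcode at one end of the read. The sketch below assumes a fixed-length barcode at the 5’ or 3’ end, exact barcode matches, and reads that are all in the same orientation; a practical pipeline for long reads would likely need to tolerate sequencing errors in the barcode. The function name `assign_reads_to_cells` and the example barcodes are hypothetical.

```python
# Illustrative demultiplexing of reads by a fixed-length end barcode.
def assign_reads_to_cells(reads, barcode_to_cell, barcode_len, end="5p"):
    """Group reads by source cell; reads with unknown barcodes go under None."""
    if end not in ("5p", "3p"):
        raise ValueError("end must be '5p' or '3p'")
    if barcode_len <= 0:
        raise ValueError("barcode_len must be positive")
    by_cell = {}
    for read in reads:
        barcode = read[:barcode_len] if end == "5p" else read[-barcode_len:]
        cell = barcode_to_cell.get(barcode)  # None when the barcode is not recognized
        by_cell.setdefault(cell, []).append(read)
    return by_cell


barcode_to_cell = {"AAAA": "cell_1", "CCCC": "cell_2"}  # hypothetical 4-nt barcodes
reads = ["AAAAGATTACA", "CCCCGATTACA", "AAAATTTT", "GGGGACGT"]
for cell, cell_reads in assign_reads_to_cells(reads, barcode_to_cell, 4).items():
    print(cell, cell_reads)
```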
- the amplified cDNA can be sequenced, particularly, using long-read sequencing.
- the long-read sequencing comprises single molecule real time (SMRT) sequencing or nanopore sequencing.
- SMRT sequencing can be circular consensus sequencing or continuous long read sequencing.
- in SMRT sequencing, an amplicon is ligated to hairpin adapters to form a circular molecule, called a SMRTbell.
- the SMRTbell is bound by a DNA polymerase and loaded onto a SMRT Cell for sequencing.
- a SMRT Cell can contain up to 8 million zero-mode waveguides (ZMWs). ZMWs are chambers of picolitre volumes. Light penetrates the lower 20-30 nm of SMRT Cells. The SMRTbell template and polymerase become immobilized on the bottom of the chamber.
- dNTPs deoxynucleoside triphosphates
- in nanopore sequencing, a long DNA strand is tagged with sequencing adapters preloaded with a motor protein on one or both ends.
- the DNA is combined with tethering proteins and loaded onto the flow cell for sequencing.
- the flow cell contains protein nanopores embedded in a synthetic membrane.
- the tethering proteins bring the molecules to be sequenced towards the nanopores and as the motor protein unwinds the DNA, an electric current is applied, which drives the negatively charged DNA through the pore.
- the DNA is sequenced as it passes through the pore and causes characteristic changes in the current.
- Long-read sequencing can sequence at least about 500 or at least about 600 bases. Particularly, long-read sequencing sequences at least 800, at least 1000, at least 1200, at least 1400, at least 1600, at least 1800, at least 2000, at least 2500, or at least 3,000 bases of the amplified products. Thus, the long-read sequence can be used to sequence a target mRNA of at least 500 to at least 3,000 bases in length.
- the method comprises further sequencing the transcriptomes of the genetically modified cells on the basis of single cells or sets of cells.
- the method comprises conducting short-read sequencing of the transcriptome on the basis of single cells or sets of cells.
- mRNA is isolated from the single cells or sets of cells and analyzed via transcriptome analysis by short-read sequencing.
- sequencing the transcriptomes comprises:
- step (ii) reverse transcribing the transcriptomes from the genetically modified cells or sets of cells separated in step (i) to produce cDNAs of the transcriptomes, wherein the primers used for the reverse transcription comprise a unique barcode on the basis of single cells or sets of cells,
- step (iv) amplifying the cDNAs and sequencing the amplification products of the cDNAs of the transcriptomes, and (v) depending on the unique barcodes in the amplification products produced in step (iv), quantifying the transcriptomes from the genetically modified cells on the basis of single cells or sets of cells.
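- Quantifying the transcriptomes by barcode can be sketched as counting reads per barcode and gene. The sketch below assumes that each short read has already been assigned a cell barcode and aligned to a gene, and it counts raw reads without any correction for amplification duplicates; the function name `count_expression` and the example records are hypothetical.

```python
from collections import Counter, defaultdict


# Illustrative per-barcode expression counts from (barcode, gene) read assignments.
def count_expression(records):
    """records: iterable of (cell_barcode, gene) pairs, one pair per aligned read."""
    counts = defaultdict(Counter)
    for barcode, gene in records:
        counts[barcode][gene] += 1
    return {barcode: dict(gene_counts) for barcode, gene_counts in counts.items()}


records = [
    ("BC1", "PTPRC"), ("BC1", "PTPRC"), ("BC1", "TP53"),
    ("BC2", "PTPRC"),
]
print(count_expression(records))
```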
- the reverse transcribing of the transcriptomes can be performed using primers comprising: 1) random nucleotide sequences, for example, random hexamers, or 2) oligo-dT sequence.
- the primers can have a unique barcode on the basis of single cells or sets of cells.
- the same barcode can be used in long-read sequencing of mRNA sequencing of the target gene from a cell or set of cells and short-read sequencing of the transcriptome of the cell or the set of cells.
- the same barcode could be used to attribute a cDNA sequence and the transcriptome sequence to a cell or a set of cells.
- Transcriptome from the single cells or sets of cells can be sequenced using short-read sequencing.
- the short-read sequencing comprises paired-end sequencing. Certain details of short-read sequencing are also described by Logsdon et al. (2020).
- the disclosure provides a method of determining efficacy of a molecular tool for editing the target gene.
- Certain such methods comprise methods of analyzing cells as disclosed herein and further comparing, on the basis of single cells or sets of cells, the observed modification to the target gene with the modification expected in the target gene. Based on the number of single cells or the number of sets of cells that exhibit the desired modification as compared to the total number of genetically modified cells or sets of cells, the efficacy of the molecular tool for producing the genetic modification can be determined.
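- The efficacy calculation described above is a simple proportion. The following is a minimal sketch, assuming that each cell or set of cells has already been labeled according to whether it exhibits the desired modification; the function name `editing_efficacy` and the label strings are hypothetical.

```python
# Illustrative efficacy: fraction of analyzed cells (or sets) with the desired edit.
def editing_efficacy(calls, desired="expected_edit"):
    """calls: {cell_or_set_id: label}; returns a fraction between 0 and 1."""
    if not calls:
        raise ValueError("no cells or sets of cells were analyzed")
    hits = sum(1 for label in calls.values() if label == desired)
    return hits / len(calls)


calls = {
    "cell_1": "expected_edit",
    "cell_2": "unedited",
    "cell_3": "expected_edit",
    "cell_4": "expected_edit",
}
print(editing_efficacy(calls))  # 0.75
```

- The same calculation can be repeated for each candidate molecular tool, so that tools can be ranked by the fraction of cells carrying the desired modification.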
- the methods disclosed herein involve editing one or more target genes. Certain such methods comprise methods of analyzing cells as disclosed herein and further comparing:
- the mRNA encoded by one or more additional target genes can be analyzed on the basis of single cells or sets of cells.
- the same barcode can be incorporated in the cDNAs produced from one cell or one set of cells.
- multiple reverse transcription primers can be designed, each primer directed at producing cDNA from a different target mRNA but all reverse transcription primers having one barcode.
- all cDNAs from a single cell or a set of cells contain the same barcode.
- step (b) can be performed by:
- step (ii) reverse transcribing the mRNAs encoded by the one or more additional target genes from the cells or sets of cells separated in step (i) to produce cDNAs for the one or more additional target genes, wherein the primer for the reverse transcription of the mRNAs comprises a unique barcode on the basis of single cells or sets of cells,
- the methods described above for sequencing cDNAs for a target gene can be similarly applied for sequencing cDNAs for the one or more additional target genes.
- the mRNA for the one or more additional target genes can be sequenced using long-read sequencing. Certain details of long-read sequencing are described above and such are also applicable to sequencing one or more additional target genes.
- kits having one or more components and/or reagents and/or devices, where applicable, for practicing one or more of the above-described methods.
- the subject kits may vary greatly. Kits of interest include those having one or more reagents mentioned herein, and associated devices where applicable, with respect to the steps of:
- step (c) on the basis of single cells or sets of cells, comparing the identified modification in the target gene with the modification expected in step (a).
- Kits may include certain combinations of components in a single reaction vessel. Kits may include different components in different vessels.
- a kit comprises: reagents for genetic modifications in a target gene, such as CRISPR-Cas9 and gRNA; transfection reagents, cells or cell lines, media for culturing the cells, reverse transcription primers, primer pairs for amplification of cDNAs, reagents for sequencing, etc.
- The methods described in this disclosure find use in a variety of applications. Applications of interest include, but are not limited to, research applications and therapeutic applications. Methods of the invention find use in any convenient application where identifying effects of genetic modifications, e.g., CRISPR-mediated genomic editing, is desired.
- The method finds particular use in analyzing and/or engineering therapeutic cells, e.g., genetically engineered cells that are destined for therapeutic use, such as stem cells or immune cells.
- The method may be used to analyze knockouts and/or modifications in T cells or natural killer (NK) cells.
- The method may be used to analyze therapeutic cells that have been modified by CRISPR editing to be allogeneic.
- The method may be used to analyze immune cells that have a knockout in an immune checkpoint inhibitor such as PD1, CTLA-4, TIM-3, VISTA, LAG-3, IDO or KIR, etc., that have a knockout in an endogenous receptor such as a knockout in TRAC or TRBC, etc., or that have a CRISPR-mediated edit that modifies the expression of a cytokine or other inflammatory molecule or a component of a signal transduction pathway, etc.
- The cells being analyzed may be primary immune cells, or they may be expanded primary immune cells.
- HEK293T cells and Cas9-stable HEK293T cells were maintained in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% Fetal Bovine Serum (FBS).
- DMEM Dulbecco's Modified Eagle Medium
- FBS Fetal Bovine Serum
- Jurkat ATCC TIB-152
- RPMI Roswell Park Memorial Institute 1640 Medium supplemented with 10%
- The oligonucleotide pool for guide RNA library cloning was synthesized. Amplified guide RNA cassettes were cloned into a plasmid for expression.
- 2.0 × 10^6 HEK293T cells were plated 24 h prior to transfection.
- Cells were transfected with the lentiviral sgRNA library (2000 ng), psPAX2 (1500 ng, Addgene plasmid #12260) and pMD2.G (500 ng, Addgene plasmid #12259) using a lipofection reagent.
- The viral supernatant was collected 48 h after transfection, filtered through a 0.45 µm filter, and used.
- The lentiviral supernatant and 8 µg of polybrene were added and the mixture was centrifuged at 800 g for 30 minutes at 32 °C. After that, cell pellets were resuspended in fresh media and plated in a 6-well plate. After 72 hours, transduced cells were selected with puromycin.
- Short-read transcripts: Basecalling for 5’ gene expression libraries was performed, followed by alignment to the reference genome GRCh38 and transcript quantification. In preparation for integrated analysis, the cell transcript data matrix was processed by removing cells with fewer than 100 or more than 8000 genes and cells with more than 30% mitochondrial genes. Additionally, any genes present in 3 or fewer cells were removed. Dimension reduction was performed using principal component analysis and UMAP with 30 principal components, and clustering used a resolution of 0.8.
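The quality-control thresholds stated above can be sketched in plain Python. This is an illustration only, not the pipeline used in the disclosure. It assumes the matrix is held as a nested dictionary, that the mitochondrial percentage means the share of a cell's counts coming from genes whose symbol starts with "MT-", and that cells are filtered before genes; all three are assumptions, and the function name is hypothetical.

```python
def filter_matrix(counts, min_genes=100, max_genes=8000,
                  max_mito_frac=0.30, min_cells=4):
    """counts: {cell_barcode: {gene: count}}. Returns a filtered copy.

    min_cells=4 removes any gene present in 3 or fewer cells.
    """
    kept = {}
    for cell, genes in counts.items():
        detected = {g: c for g, c in genes.items() if c > 0}
        n_genes = len(detected)
        total = sum(detected.values())
        # Assumption: mitochondrial genes are those named "MT-..."
        mito = sum(c for g, c in detected.items() if g.startswith("MT-"))
        if n_genes < min_genes or n_genes > max_genes:
            continue
        if total and mito / total > max_mito_frac:
            continue
        kept[cell] = detected

    # Count, over the retained cells, how many cells express each gene
    cells_per_gene = {}
    for genes in kept.values():
        for g in genes:
            cells_per_gene[g] = cells_per_gene.get(g, 0) + 1

    return {
        cell: {g: c for g, c in genes.items() if cells_per_gene[g] >= min_cells}
        for cell, genes in kept.items()
    }


# Toy data with deliberately small thresholds so the effect is visible
toy = {
    "c1": {"A": 5, "B": 5, "MT-X": 1},       # kept
    "c2": {"A": 1},                          # too few genes
    "c3": {"A": 1, "MT-X": 9},               # too much mitochondrial signal
    "c4": {"A": 2, "B": 1, "C": 1, "D": 1},  # too many genes
    "c5": {"A": 3, "C": 3},                  # kept
}
filtered = filter_matrix(toy, min_genes=2, max_genes=3,
                         max_mito_frac=0.30, min_cells=2)
```

With the defaults, the function applies the thresholds given in the text (100 to 8000 genes, 30% mitochondrial, genes in more than 3 cells).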
- The putative long-read barcode is identified by evaluating the soft-clipped portions of the aligned long reads, which are extracted using a custom Python script.
- A second custom Python script used a machine-learning approach to identify the barcode.
- The list of valid short-read barcodes was vectorized using a k-mer length of 8 to create a reference list.
- The 5’ soft-clipped region of each read was then vectorized in the same way and compared to the created reference using a cosine similarity metric.
- The 3’ soft-clipped region of each read was evaluated by matching the reverse-complement of the soft-clipped sequence to the reference list.
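The k-mer vectorization and cosine-similarity comparison described above can be sketched as follows. This is not the custom script referred to in the text. The barcode sequences, the `min_score` cutoff, and all function names are hypothetical, and the sketch scores every whitelist entry exhaustively rather than using any particular machine-learning library.

```python
from collections import Counter
from math import sqrt

K = 8  # k-mer length stated in the text


def kmer_vector(seq, k=K):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def best_barcode(soft_clip, whitelist, min_score=0.5):
    """Return (barcode, score) for the most similar whitelist entry,
    or (None, score) if the best score is below the assumed cutoff."""
    query = kmer_vector(soft_clip)
    score, barcode = max((cosine(query, kmer_vector(bc)), bc) for bc in whitelist)
    return (barcode, score) if score >= min_score else (None, score)


# Hypothetical whitelist and a 5' soft clip containing the first barcode
whitelist = ["ACGTACGGTCAGTTCA", "TTGGCCAATCGGATCA"]
clip_5p = "CTACACGACGCT" + whitelist[0] + "TTT"
hit_5p, score_5p = best_barcode(clip_5p, whitelist)

# A 3' soft clip is reverse-complemented before matching
clip_3p = reverse_complement(clip_5p)
hit_3p, _ = best_barcode(reverse_complement(clip_3p), whitelist)
```

Because the comparison is based on shared 8-mers rather than exact string equality, flanking adaptor sequence in the soft clip lowers the score but does not prevent a match.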
- The sgRNA was in vitro transcribed by T7 RNA polymerase. Templates for the sgRNA were generated by extension of two complementary oligonucleotides. Transcribed RNA was purified by column purification. Purified RNA was quantified by fluorimetry.
- Example 1. Identifying CRISPR edits using long reads that cover mRNA: full-length cDNA from individual cells
- Each cell indexing barcode represents one cell.
- the cell indexing barcode consists of a DNA sequence that is specifically added to the cDNA extracted from an individual cell or set of cells (FIG. 1). As noted, each cell indexing barcode represents only one cell. In the case of populations of cells, the cell indexing barcode represents a group of cells (two or more).
- the RACK1 transcript was amplified using two primers that included sequence from the 5’ adaptor for the sequencing library and the last exon of RACK1.
- The amplified full-length cDNA underwent nanopore sequencing (Oxford Nanopore), which generates long reads. We performed base calling and aligned the long reads to the reference genome, GRCh38.
- The full-length cDNAs had cell indexing barcodes at the 5’ end; as noted, this barcode enables the assignment of a long read covering a transcript to its cell or cells of origin.
- A portion of each long-read sequence did not align to the human genome.
- This non-aligning sequence represents the cell indexing barcode, which is not found in the human genome sequence.
- The soft-clipped sequence and the whitelist of barcodes are vectorized using 8-mers, i.e., sequences of eight bases. The frequency of these short sequence tracts (i.e., k-mers) was determined from a whitelist of predesignated barcodes representing the ground truth.
- the long sequence read covers the entire mRNA sequence. Therefore a specific genotype edit can be identified at any location in the transcript and still be linked to the cell index barcode at the 5’ or 3’ end.
- Using long-read sequencing of target cDNAs, we characterized the sequence of the mRNA in each individual cell. This analysis identified the different RACK1 transcript isoforms, and the cell indexing barcodes identified the individual cell from which each mRNA originated. Then, we aggregated reads per cell indexing barcode and used hierarchical clustering to determine the distribution of different cDNA sequences, representing the full-length mRNA, among cell subpopulations.
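The aggregation of reads per cell indexing barcode can be illustrated with a short sketch. It assumes each long read has already been reduced to a (barcode, isoform label) pair; the labels and function names are hypothetical. The clustering step itself is not shown; the per-cell fractions computed here are the kind of feature vector that a hierarchical clustering routine could take as input.

```python
from collections import Counter, defaultdict


def isoform_counts_per_cell(reads):
    """reads: iterable of (cell_barcode, isoform_label) pairs."""
    per_cell = defaultdict(Counter)
    for barcode, isoform in reads:
        per_cell[barcode][isoform] += 1
    return per_cell


def isoform_fractions(per_cell):
    """Convert per-cell isoform counts to per-cell isoform fractions."""
    return {
        barcode: {iso: n / sum(counts.values()) for iso, n in counts.items()}
        for barcode, counts in per_cell.items()
    }


# Hypothetical reads from two cells
reads = [
    ("BC1", "full_length"),
    ("BC1", "exon_skipped"),
    ("BC1", "full_length"),
    ("BC2", "exon_skipped"),
]
counts = isoform_counts_per_cell(reads)
fractions = isoform_fractions(counts)
```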
- the long read sequence covers the entire mRNA sequence and, therefore, a specific genotype edit can be found at any location in the transcript and still be linked to the cell index barcode at the 5’ or 3’ end. From the reads that covered the entire RACK1 cDNA, we determined the structure of the transcript and the composition of the cell indexing barcode.
- The different CRISPR-generated RACK1 isoforms change the gene expression for a given cell or set of cells.
- This sequence linkage can use any type of single cell library process where one matches the cell indexing barcode sequences between the two different sequence data sets (single cell long read and single cell short read) that come from the same cell population.
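Matching cell indexing barcodes between the two data sets amounts to a join on the barcode key. A minimal sketch, assuming the long-read genotype calls and the short-read expression profiles are each held in a dictionary keyed by the same barcode strings; the data and names are hypothetical.

```python
def link_by_barcode(long_read_calls, short_read_profiles):
    """Pair the long-read result and the short-read profile of every cell
    whose barcode appears in both data sets."""
    shared = long_read_calls.keys() & short_read_profiles.keys()
    return {
        barcode: (long_read_calls[barcode], short_read_profiles[barcode])
        for barcode in sorted(shared)
    }


# Hypothetical inputs: isoform call per cell and gene counts per cell
long_calls = {"BC1": "exon_skipped", "BC2": "full_length", "BC9": "full_length"}
short_profiles = {"BC1": {"RACK1": 12}, "BC2": {"RACK1": 30}, "BC7": {"RACK1": 4}}
linked = link_by_barcode(long_calls, short_profiles)
```

Barcodes seen in only one data set (BC9 and BC7 here) are dropped, since no linkage can be made for them.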
- PTPRC is a transmembrane phosphatase; alternative splicing of its pre-mRNA is critical for changing T cell regulatory states.
- PTPRC has five highly expressed isoforms. These include two short isoforms with a substantial degree of exon loss and longer isoforms in which the majority of exons from the variable region are retained.
- CD4-CD8- double-negative T cells and NK precursor cells preferentially express longer isoforms such as RABC and RBC; when activated, T cells and NK cells preferentially express shorter isoforms such as RO and RB.
- PTPRC transcript isoform structure.
- a guide RNA lentiviral library targeting 16 splicing factors (two guide RNAs per gene) and five non-targeting guide RNAs as negative controls (Table 1).
- HNRNPLL and SRSF5 induce exon skipping of PTPRC, whereas PCBP2 and HNRNPD inhibit exon skipping.
- HNRNPLL and SRSF5 knock-outs inhibited PTPRC exon skipping, and their isoform expression patterns were significantly different (data not shown). The ratio of RBC and RABC isoforms was higher in the knock-outs than in non-targeted cells.
- HNRNPLL gene for a single knock-out experiment.
- RNP Cas9 ribonucleoprotein
- HNRNPLL RNP-treated cells: Most of the stimulated wild-type cells had RO and RB transcript isoforms. However, the stimulated HNRNPLL RNP-treated cells had fewer RO and RB transcript isoforms (10.32-fold, P < 1.0e-5, FIGS. 3B and 3C).
- In addition to PTPRC, we analyzed the impact of splicing factors on myosin light chain 6 (MYL6) transcript isoforms. Exon 6 skipping of MYL6 is known to be regulated by various splicing factors.
- Table 4 List of oligonucleotides for gRNA capture. Table 4. Mutation rate detected from long-read sequencing for each gRNA target.
- T <-> C and A <-> G transitions were identified that are suited to ABE and CBE enzymes.
- The C to T transition is one of the most frequent mutations in the human genome. For example, among the 10 most frequently reported TP53 mutations in the COSMIC database, 9 were transitions. Among this set, CBEs can be used to engineer eight, while ABEs can engineer one. Using TP53 as an example, nine out of the ten mutations were therefore viable candidates for base editors.
- the spCas9 base editor requires ‘NGG’ - this sequence is referred to as the protospacer adjacent motif (PAM).
- PAM protospacer adjacent motif
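The ‘NGG’ requirement can be made concrete with a short scan for candidate target sites. This sketch makes several assumptions: it only examines the strand it is given (the opposite strand would be handled by passing the reverse complement), it uses a 20-nt protospacer length, and it does not model base-editor activity windows. The function name and example sequence are hypothetical.

```python
import re


def find_ngg_protospacers(seq, length=20):
    """Return (start, protospacer, pam) for every site on the given strand
    where a `length`-nt protospacer is immediately followed by an NGG PAM."""
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % length
    # The lookahead lets overlapping candidate sites all be reported
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, seq)]


# Hypothetical sequence with a single NGG-adjacent site starting at index 2
example = "TT" + "ACGT" * 5 + "AGG" + "CT"
sites = find_ngg_protospacers(example)
```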
- Oligonucleotides were synthesized with the gRNA sequence and subcloned into plasmid or lentiviral vectors. For fewer than 100 gRNAs, we will order single oligonucleotides. For larger sets, we will order oligonucleotides from array synthesis and include primer sequences to enable rapid subcloning into plasmid or lentiviral vectors.
- Base editors were applied to introduce mutations into a cell line. Ten gRNAs that target various TP53 mutations were designed. Using electroporation, a multiplexed plasmid pool with all gRNAs and CRISPR base editors was delivered into the colon cancer cell line HCT 116. The cells underwent single-cell cDNA generation and were then sequenced with both short- and long-read platforms.
- A targeted multiplexed enrichment was done using a method based on single-primer extension.
- primers were designed for the target transcripts and then synthesized.
- a linear single primer extension from the cDNA library increased the yield of the target while minimizing the generation of off-target sequences.
- The target product undergoes second-strand DNA synthesis using DNA polymerase.
- The product is loaded onto a single-molecule sequencer (Oxford Nanopore or Pacific Biosciences) and the target reads are analyzed. The mutation is identified within single cells using the cell barcode information.
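Identifying the mutation within single cells from barcoded reads can be sketched as a per-barcode majority vote at the position of interest. This is an illustrative simplification: it assumes every read has already been projected onto transcript coordinates without insertions or deletions, and the `min_reads` cutoff, function name, and sequences are hypothetical.

```python
from collections import Counter, defaultdict


def call_base_per_cell(reads, position, min_reads=2):
    """reads: iterable of (cell_barcode, aligned_sequence) pairs.
    Returns the most frequent base at `position` for every cell that has
    at least `min_reads` reads covering that position."""
    bases = defaultdict(Counter)
    for barcode, seq in reads:
        if position < len(seq):
            bases[barcode][seq[position]] += 1
    return {
        barcode: counter.most_common(1)[0][0]
        for barcode, counter in bases.items()
        if sum(counter.values()) >= min_reads
    }


# Hypothetical reads; position 2 is G in the reference and T when edited
reads = [
    ("BC1", "ACGT"), ("BC1", "ACTT"), ("BC1", "ACTT"),
    ("BC2", "ACGT"), ("BC2", "ACGT"),
    ("BC3", "ACTT"),  # single read, below the assumed coverage cutoff
]
calls = call_base_per_cell(reads, position=2)
```

Requiring more than one supporting read per cell is one simple way to limit the effect of single-read sequencing errors; the appropriate cutoff would depend on the platform and depth.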
- the short read sequencing provided the gene expression profile for each cell.
- Fig. 7 shows how long- and short-read single-cell sequencing can be integrated to match the gene expression profiles from single cells with the mutation.
- the cells with a TP53 mutation showed distinct transcriptional patterns compared to the cells with wildtype TP53. This proof-of-concept study demonstrated that this technology provides high-throughput engineering and analysis of various cancer mutations into single cells.
Abstract
The present disclosure relates to a method of analyzing cells, comprising the following steps: editing a target gene in a population of cells to produce genetically modified cells; on the basis of single cells or sets of cells, sequencing substantially the entire coding sequence of the mRNA encoded by the target gene from the genetically modified cells or sets of cells to identify a modification in the target gene; and, on the basis of single cells or sets of cells, comparing the identified modification in the target gene with the expected genetic modification. The disclosure also relates to determining the efficacy of a method for genetically modifying a target gene. The disclosure also relates to a method for determining the effects of a genetic modification on global gene expression. Kits for carrying out these methods are also provided.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163214680P | 2021-06-24 | 2021-06-24 | |
US63/214,680 | 2021-06-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022271725A1 | 2022-12-29 |
Family
ID=84544898
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2022/034376 WO2022271725A1 (fr) | 2021-06-24 | 2022-06-21 | Detection of CRISPR genome modification on a cell-by-cell basis |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022271725A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024092151A1 (fr) * | 2022-10-27 | 2024-05-02 | The Board Of Trustees Of The Leland Stanford Junior University | Direct measurement of engineered cancer mutations and their transcriptional phenotypes in single cells |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020056451A1 (fr) * | 2018-09-21 | 2020-03-26 | Garvan Institute Of Medical Research | Phenotypic and molecular characterization of single cells |
WO2020168075A1 (fr) * | 2019-02-13 | 2020-08-20 | Beam Therapeutics Inc. | Splice acceptor site disruption of a disease-associated gene using adenosine deaminase base editors, including for the treatment of genetic disease |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22829170 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22829170 Country of ref document: EP Kind code of ref document: A1 |