WO2021082830A1 - Method for targeted modification of plant genome sequences - Google Patents
Method for targeted modification of plant genome sequences
- Publication number
- WO2021082830A1 (PCT/CN2020/117736)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- pegrna
- target
- plant
- reverse transcriptase
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8213—Targeted insertion of genes into the plant genome by homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1276—RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/686—Polymerase chain reaction [PCR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
Definitions
- The invention relates to the field of plant genetic engineering. Specifically, the present invention relates to a method for targeted modification of plant genome sequences. More specifically, the present invention relates to a method for modifying a specific sequence in the plant genome into a target sequence of interest by means of a nuclease-reverse transcriptase fusion protein guided by a guide RNA, and to the genetically modified plants produced by the method and their progeny.
- the single-base editing system can achieve efficient conversion of cytosine to thymine (C→T) and conversion of adenine to guanine (A→G) at the target site.
- C→T cytosine to thymine
- A→G adenine to guanine
- this method has limited types of base conversion, and it cannot achieve precise insertion or deletion of fragments. Therefore, there is still a need in the art for efficient methods that can achieve precise targeted modification of plant genome sequences.
- the present invention includes a new type of plant DNA precision editing system, which consists of a Cas nuclease with target-strand nicking activity (Cas9-H840A) fused to a reverse transcriptase, and a pegRNA (prime editing gRNA) that carries at its 3' end a repair template (RT template) and a binding region for the free single strand (PBS).
- This system uses the PBS to bind the free single-stranded DNA produced by a Cas nickase such as Cas9-H840A, and reverse-transcribes a single-stranded DNA sequence according to the given RT template. After cellular repair, any change in the DNA sequence downstream of position -3 of the PAM sequence can be realized in the genome.
- a new nicking sgRNA can additionally be provided; it creates a nick on the non-target strand of the pegRNA, which helps to promote cellular repair according to the donor template. Experimental results show that the system effectively induces precise modification of target sites in plants.
- FIG. 1 Schematic diagram of the principle of the present invention
- FIG. 2 The working diagram of three different types of PPE (plant prime editor) systems.
- the system that does not provide an additional nicking sgRNA is named PPE2; the system that provides an additional nicking sgRNA, which helps to cut the strand opposite the pegRNA, is named PPE3; and when the PAM sequence of the nicking sgRNA that cuts the opposite strand is located within the spacer sequence of the pegRNA, the system is named PPE3b.
- FIG. 3 Schematic diagram of PPE construct and pegRNA construct.
- Figure 4 The working principle of the BFP-to-GFP reporter system for detecting precise editing in plant protoplasts.
- FIG. 5 Flow cytometer measurement of the fluorescence intensity of the PPE system.
- CK is the protoplast control without plasmid transformation
- PBE is the BE3 single-base editing reporter system
- PPE3b(ΔM-MLV) refers to the control group without M-MLV reverse transcriptase.
- FIG. 6 Flow cytometry measuring the efficiency of the PPE system.
- CK is the protoplast control without plasmid transformation
- PBE is the BE3 single-base editing reporter system
- PPE3b(ΔM-MLV) refers to the control group without M-MLV reverse transcriptase.
- Figure 7 Editing of PPE system in rice endogenous targets.
- FIG. 8 Editing of PPE system's endogenous targets in wheat.
- Figure 9 By-products and their proportions produced by the PPE system.
- Figure 10 Editing of PPE-CaMV system in plant endogenous targets.
- Figure 11 Schematic diagram of pegRNA processed by ribozymes initiated by type II promoters.
- Figure 12 Editing of PPE-R system in plant endogenous genes.
- Figure 14 The effect of different PBS lengths on the PPE system.
- Figure 15 The impact of different RT template lengths on the PPE system.
- Figure 16 The influence of different RT template lengths on the precise editing ratio of the PPE system.
- Figure 17 The effect of different nicking sgRNA positions on the PPE system.
- FIG. 18 The PPE system implements different types of mutations in plant endogenous genes.
- Figure 19 The PPE system realizes the insertion of fragments of different lengths in plant endogenous genes.
- FIG. 20 The PPE system realizes the deletion of fragments of different lengths in plant endogenous genes.
- Figure 21 Schematic diagram of PPE construct used for Agrobacterium infection in rice.
- Figure 22 Rice mutants obtained using the PPE system and their sequencing results. The arrow indicates the location of the target mutation.
- FIG. 23 Monoclonal sequencing results of T0-9 mutant plants.
- Figure 24 Comparison of the effect on editing efficiency of PBS lengths chosen according to different Tm values, using published data for three target sites and newly obtained data for ten new target sites in rice protoplasts.
- Figure 25 Normalization of prime editing frequency at different PBS melting temperatures. The highest editing frequency obtained at each target is normalized to 1, and the frequencies obtained at other PBS Tm values are adjusted accordingly.
- Figure 26 Schematic diagram of prime editing using single-pegRNA and dual-pegRNA strategies.
- (a) Use only NGG-pegRNA for editing (editing the forward DNA strand).
- (b) Use only CCN-pegRNA for editing (editing reverse DNA strands).
- (c) Editing using the dual-pegRNA strategy. The dual pegRNAs create edits in both DNA strands at the same time.
- Figure 27 Comparison of editing efficiency induced by NGG-pegRNA, CCN-pegRNA and dual-pegRNA strategies at 15 target sites.
- Figure 28 Product purity of NGG-pegRNA, CCN-pegRNA and dual-pegRNA editing at 15 endogenous sites in rice protoplasts.
- Figure 29 The percentage of rice genome bases that can theoretically be targeted by single-pegRNA and dual-pegRNA prime editing.
- the term “and/or” encompasses all combinations of items connected by the term, and should be treated as if each combination has been individually listed herein.
- “A and/or B” encompasses “A”, “A and B”, and “B”.
- “A, B, and/or C” encompasses "A”, “B”, “C”, “A and B”, “A and C”, “B and C”, and "A and B and C”.
- the protein or nucleic acid may be composed of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still has the activity described in the present invention.
- methionine encoded by the start codon at the N-terminus of the polypeptide will be retained under certain actual conditions (for example, when expressed in a specific expression system), but does not substantially affect the function of the polypeptide.
- Gene as used herein not only covers chromosomal DNA present in the nucleus, but also includes organelle DNA present in subcellular components of the cell (such as mitochondria, plastids).
- Genetically modified plant means a plant that contains an exogenous polynucleotide or a modified gene or expression control sequence in its genome.
- exogenous polynucleotides can be stably integrated into the genome of plants and inherited for successive generations.
- the exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
- the modified gene or expression control sequence includes one or more deoxynucleotide substitutions, deletions and additions in the plant genome.
- Exogenous, in terms of sequence, means a sequence from a foreign species, or, if from the same species, a sequence that has undergone significant changes in composition and/or locus from its natural form through deliberate human intervention.
- nucleic acid sequence and related terms are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
- Nucleotides are referred to by their single-letter names as follows: “A” is adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” is cytidine or deoxycytidine, “G” is guanosine or deoxyguanosine, “U” is uridine, “T” is deoxythymidine, “R” is a purine (A or G), “Y” is a pyrimidine (C or T), “K” is G or T, “H” is A or C or T, “D” is A, T or G, “I” is inosine, and “N” is any nucleotide.
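The single-letter codes listed above can be used to test whether a DNA sequence fits a degenerate pattern such as a PAM. The following is a minimal sketch, assuming DNA input only; the table contains just the codes named in the text (inosine is omitted), and the names `CODES` and `matches` are illustrative, not taken from the patent.

```python
# Ambiguity codes as listed in the text above (DNA alphabet only; "I" is omitted).
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "D": "AGT",
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if each base of `sequence` is allowed by the code at the same position of `pattern`."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in CODES[code]
        for code, base in zip(pattern.upper(), sequence.upper())
    )
```

For instance, `matches("NGG", "AGG")` checks a three-base window against the 5'-NGG-3' pattern.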
- Polypeptide “peptide”, and “protein” are used interchangeably in the present invention and refer to a polymer of amino acid residues.
- the term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally-occurring amino acids, as well as to naturally-occurring amino acid polymers.
- the terms "polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
- expression construct refers to a vector suitable for expression of a nucleotide sequence of interest in an organism, such as a recombinant vector.
- “Expression” refers to the production of a functional product.
- the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
- the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be RNA (such as mRNA) that can be translated, for example, RNA generated by in vitro transcription.
- the "expression construct" of the present invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from those normally occurring in nature.
- regulatory sequence and “regulatory element” are used interchangeably and refer to nucleotide sequences that are located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and that affect the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
- Promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
- a promoter is a promoter capable of controlling gene transcription in a cell, regardless of whether it is derived from the cell.
- the promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
- tissue-specific promoter and “tissue-preferred promoter” are used interchangeably, and refer to a promoter that is mainly, but not necessarily exclusively, expressed in a tissue or organ, and that can also be expressed in a specific cell or cell type.
- developmentally regulated promoter refers to a promoter whose activity is determined by developmental events.
- inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
- promoters include, but are not limited to, polymerase (pol) I, pol II, or pol III promoters.
- pol I promoter examples include the chicken RNA pol I promoter.
- pol II promoters include, but are not limited to, cytomegalovirus immediate early (CMV) promoter, Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and simian virus 40 (SV40) immediate early promoter.
- pol III promoters include U6 and H1 promoters.
- An inducible promoter such as a metallothionein promoter can be used.
- promoters include T7 phage promoter, T3 phage promoter, β-galactosidase promoter, and Sp6 phage promoter.
- the promoter may be cauliflower mosaic virus 35S promoter, maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, maize U3 promoter, rice actin promoter.
- operably linked refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that the transcription of the nucleotide sequence is controlled and regulated by the transcription control element.
- "Introducing" nucleic acid molecules (such as plasmids, linear nucleic acid fragments, RNA, etc.) or proteins into an organism refers to transforming the cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
- the "transformation” used in the present invention includes stable transformation and transient transformation.
- “Stable transformation” refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in the stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
- Transient transformation refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of foreign genes. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
- Trait refers to the physiological, morphological, biochemical or physical characteristics of cells or organisms.
- “Agronomic traits” especially refer to the measurable index parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, plant total nitrogen content, fruit nitrogen content, seed nitrogen content, plant vegetative tissue nitrogen content, plant total free amino acid content, fruit free amino acid content, seed free amino acid content, plant vegetative tissue free amino acid content, plant total protein content, fruit protein content, seed protein content, plant vegetative tissue protein content, herbicide resistance, drought resistance, nitrogen absorption, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.
- the present invention relates to a genome editing system for targeted modification of the genomic DNA sequence of an organism, which comprises:
- fusion protein and/or an expression construct containing a nucleotide sequence encoding the fusion protein, wherein the fusion protein comprises CRISPR nickase and reverse transcriptase; and/or
- the at least one pegRNA includes, in the 5' to 3' direction, a guide sequence, a scaffold sequence, a reverse transcription (RT) template sequence, and a primer binding site (PBS) sequence.
- RT reverse transcription
- PBS primer binding site
- the at least one pegRNA can form a complex with the fusion protein and target the fusion protein to a target sequence in the genome, resulting in a nick in the target sequence.
- the organism is a plant.
- gene editing system refers to a combination of components required for genome editing of the genome in a cell.
- Each component of the system such as fusion protein, gRNA, etc., can exist independently of each other, or can exist in any combination as a composition.
- target sequence refers to a sequence of about 20 nucleotides in length in the genome, characterized by a 5' or 3' flanking PAM (protospacer adjacent motif) sequence.
- PAM protospacer adjacent motif
- the target sequence is immediately adjacent to the PAM at its 3' end, such as 5'-NGG-3'. Based on the presence of a PAM, those skilled in the art can easily determine the target sequences in the genome that can be used for targeting. Depending on the location of the PAM, the target sequence can be located on either strand of the genomic DNA molecule.
- the target sequence is preferably 20 nucleotides.
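The PAM-based identification of candidate target sequences described above can be illustrated with a short scan of both strands. This is a minimal sketch, assuming a 20-nucleotide target immediately followed by a 5'-NGG-3' PAM; the function names and the returned tuple layout are illustrative and not part of the patent.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (non-ACGT symbols are left as-is)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(genome: str, target_len: int = 20):
    """Yield (strand, start, target, pam) for every target with a 3' NGG PAM.

    `start` is the 0-based index of the target on the scanned strand
    ("+" is the given sequence, "-" is its reverse complement).
    """
    for strand, seq in (("+", genome.upper()), ("-", reverse_complement(genome))):
        for i in range(len(seq) - target_len - 2):
            pam = seq[i + target_len : i + target_len + 3]
            if pam[1:] == "GG":  # first PAM base is N (any nucleotide)
                yield strand, i, seq[i : i + target_len], pam
```

Scanning the reverse complement is what allows a site that appears as 5'-CCN-3' on the given strand to be reported as an NGG site on the opposite strand.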
- the CRISPR nickase in the fusion protein can form a nick in the target sequence in the genomic DNA.
- the CRISPR nickase is a Cas9 nickase.
- the Cas9 nickase is derived from SpCas9 of S. pyogenes, and at least includes the amino acid substitution H840A relative to wild-type SpCas9.
- An exemplary wild-type SpCas9 includes the amino acid sequence shown in SEQ ID NO:1.
- the Cas9 nickase comprises the amino acid sequence shown in SEQ ID NO: 2.
- the Cas9 nickase in the fusion protein can form a nick between the -3 nucleotide and the -4 nucleotide relative to the PAM of the target sequence (the first nucleotide at the 5' end of the PAM sequence being +1).
- the Cas9 nickase is a Cas9 nickase variant capable of recognizing altered PAM sequences.
- the Cas9 nickase is a Cas9 variant that recognizes the PAM sequence 5'-NG-3'.
- the Cas9 nickase variant that recognizes the PAM sequence 5'-NG-3' contains the following amino acid substitutions relative to wild-type Cas9: H840A, R1335V, L1111R, D1135V, G1218R, E1219F, A1322R, T1337R, where the amino acid numbering refers to SEQ ID NO: 1.
- the nick formed by the Cas9 nickase of the present invention can cause the target sequence to form a free single strand with a 3' end (3' free single strand) and a free single strand with a 5' end (5' free single strand).
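The nick position and the two resulting free strands can be made concrete with a small helper. This is a sketch under one stated assumption: positions are counted without a position 0, so that -1 is the target nucleotide immediately 5' of the PAM and the nick between -4 and -3 leaves three target nucleotides on the PAM side. The function name is illustrative.

```python
def split_at_nick(target: str, pam_side_len: int = 3):
    """Split a target sequence (5'->3', PAM immediately 3' of it) at the nick.

    Returns (three_prime_free, five_prime_free):
    - three_prime_free ends at position -4 and carries the free 3' end;
    - five_prime_free starts at position -3 and carries the free 5' end.
    Assumes there is no position 0, so -1 is the base adjacent to the PAM.
    """
    cut = len(target) - pam_side_len
    if cut < 0:
        raise ValueError("target is shorter than the PAM-side segment")
    return target[:cut], target[cut:]
```

With a 20-nucleotide target, the first element has 17 nucleotides; its 3' end is the part that later pairs with the primer binding sequence.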
- the reverse transcriptase in the fusion protein of the present invention can be derived from different sources.
- the reverse transcriptase is a reverse transcriptase derived from a virus.
- the reverse transcriptase is M-MLV reverse transcriptase or a functional variant thereof.
- An exemplary wild-type M-MLV reverse transcriptase sequence is shown in SEQ ID NO: 3.
- the reverse transcriptase is an enhanced M-MLV reverse transcriptase, for example, the amino acid sequence of the enhanced M-MLV reverse transcriptase is shown in SEQ ID NO: 4.
- the reverse transcriptase is CaMV-RT from Cauliflower mosaic virus (CaMV), and its amino acid sequence is shown in SEQ ID NO: 5.
- the reverse transcriptase is a reverse transcriptase derived from bacteria, such as retron-RT from Escherichia coli, and its amino acid sequence is shown in SEQ ID NO: 6.
- the CRISPR nickase and the reverse transcriptase in the fusion protein are connected by a linker.
- linkers can be non-functional amino acid sequences of 1-50 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 20-25, 25-50) or more amino acids in length, without secondary or higher-order structure.
- the linker may be a flexible linker, such as GGGGS, GS, GAP, (GGGGS)x3, GGS and (GGS)x7, etc.
- it may be the linker shown in SEQ ID NO: 7.
- the CRISPR nickase in the fusion protein is fused to the N-terminus of the reverse transcriptase directly or through a linker. In some embodiments, the CRISPR nickase in the fusion protein is fused to the C-terminus of the reverse transcriptase directly or through a linker.
- the fusion protein of the present invention may also include a nuclear localization sequence (NLS).
- NLS nuclear localization sequence
- one or more NLSs in the fusion protein should be strong enough to drive accumulation of the fusion protein in the nucleus of the cell in an amount sufficient to achieve its base editing function.
- the strength of nuclear localization activity is determined by the number and location of NLS in the fusion protein, one or more specific NLS used, or a combination of these factors.
- the guide sequence (also called seed sequence or spacer sequence) in at least one pegRNA of the present invention is set to have sufficient sequence identity (preferably 100% identity) with the target sequence, so that it can bind to the complementary strand of the target sequence through base pairing and achieve sequence-specific targeting.
- scaffold sequences of gRNA suitable for genome editing based on CRISPR nuclease are known in the art, and these can be used in the pegRNA of the present invention.
- the scaffold sequence of the gRNA is shown in SEQ ID NO: 8.
- the primer binding sequence is set to be complementary to at least a part of the target sequence (preferably fully paired with at least a part of the target sequence).
- the primer binding sequence is complementary to at least a part of the 3' free single strand produced by the nick in the DNA strand where the target sequence is located, in particular to the nucleotide sequence at the 3' end of the 3' free single strand (preferably fully paired).
- when the 3' free single strand binds to the primer binding sequence through base pairing, the 3' free single strand can serve as a primer, with the reverse transcription (RT) template sequence immediately adjacent to the primer binding sequence serving as the template; reverse transcription is then performed by the reverse transcriptase in the fusion protein, extending a DNA sequence corresponding to the reverse transcription (RT) template sequence.
- the length of the primer binding sequence depends on the length of the free single strand formed in the target sequence by the CRISPR nickase used; however, it should have the minimum length needed to ensure specific binding.
- the primer binding sequence may be 4-20 nucleotides in length, for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides.
- the primer binding sequence is set to have a Tm (melting temperature) of no more than about 52°C. In some embodiments, the Tm (melting temperature) of the primer binding sequence is about 18°C-52°C, preferably about 24°C-36°C, more preferably about 28°C-32°C, more preferably about 30°C.
- the method of calculating the Tm of a nucleic acid sequence is well known in the art, for example, it can be calculated using the Oligo Analysis Tool online analysis tool.
- the appropriate Tm can be obtained by selecting the appropriate length of the PBS.
- a PBS sequence with a suitable Tm can be obtained by selecting a suitable target sequence.
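Choosing a PBS length by melting temperature can be sketched as a search over the allowed lengths. The example below is an illustration only: it substitutes the simple Wallace rule (Tm = 2·(A+T) + 4·(G+C), a rough estimate for short oligonucleotides) for the Oligo Analysis Tool mentioned in the text, so its numbers should not be expected to match that tool. The function names and default parameters are assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in °C by the Wallace rule (an assumption, not the patent's calculator)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def choose_pbs(free_3prime_strand: str, target_tm: float = 30.0,
               min_len: int = 4, max_len: int = 20, max_tm: float = 52.0):
    """Return (pbs, tm) for the PBS length whose estimated Tm is closest to target_tm.

    The PBS is the reverse complement of the 3' end of the 3' free single strand.
    Ties are resolved in favour of the shorter PBS.
    """
    candidates = []
    for n in range(min_len, min(max_len, len(free_3prime_strand)) + 1):
        pbs = reverse_complement(free_3prime_strand[-n:])
        tm = wallace_tm(pbs)
        if tm <= max_tm:
            candidates.append((abs(tm - target_tm), n, pbs, tm))
    if not candidates:
        raise ValueError("no PBS length satisfies the Tm limit")
    _, _, pbs, tm = min(candidates)
    return pbs, tm
```

The defaults mirror the values stated above (4-20 nucleotides, Tm not exceeding about 52°C, about 30°C preferred); a nearest-neighbour Tm model could be swapped in for `wallace_tm` without changing the search.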
- the RT template sequence can be any sequence.
- the sequence information can be integrated into the DNA strand where the target sequence is located (that is, the strand containing the PAM of the target sequence); the cell's DNA repair function then forms a DNA double strand containing the sequence information of the RT template.
- the RT template sequence contains the desired modification.
- the desired modification includes substitutions, deletions, and/or additions of one or more nucleotides.
- the modification includes one or more substitutions selected from: C to T substitution, C to G substitution, C to A substitution, G to T substitution, G to C substitution, G to A substitution, A to T substitution, A to G substitution, A to C substitution, T to C substitution, T to G substitution, T to A substitution; and/or includes the deletion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide deletions; and/or includes the insertion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide insertions.
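Substitutions, deletions and insertions can all be expressed as "remove some bases at a position, then add some bases there". The helper below is a minimal sketch of that idea for describing the desired edited sequence; the function name and its parameters are illustrative only.

```python
def apply_modification(seq: str, pos: int, delete: int = 0, insert: str = "") -> str:
    """Return `seq` with `delete` bases removed at 0-based `pos` and `insert` added there.

    - substitution: delete == len(insert)
    - deletion:     insert == ""
    - insertion:    delete == 0
    """
    if delete < 0 or not 0 <= pos <= len(seq) or pos + delete > len(seq):
        raise ValueError("modification lies outside the sequence")
    return seq[:pos] + insert + seq[pos + delete:]
```

For example, a C to T substitution at index 1 of `ACGT` is `apply_modification("ACGT", 1, delete=1, insert="T")`.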
- the RT template sequence is set to correspond to the sequence downstream of the nick of the target sequence (for example, complementary to at least a part of the sequence downstream of the nick of the target sequence), and contains the desired modification.
- the desired modification includes substitutions, deletions and/or additions of one or more nucleotides.
- the modification includes one or more substitutions selected from: C to T substitution, C to G substitution, C to A substitution, G to T substitution, G to C substitution, G to A substitution, A to T substitution, A to G substitution, A to C substitution, T to C substitution, T to G substitution, T to A substitution; and/or includes the deletion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide deletions; and/or includes the insertion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide insertions.
- the RT template sequence may be about 1-300 or more nucleotides in length, for example, 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, about 100, about 125, about 150, about 175, about 200, about 225, about 250, about 275, about 300 or more nucleotides.
- the RT template sequence is 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 nucleotides in length.
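The 5' to 3' layout of a pegRNA (guide, scaffold, RT template, PBS) can be sketched by deriving the RT template and PBS from the PAM-containing strand. This is an illustration under stated assumptions: sequences are written in the DNA alphabet, `upstream` is the PAM-strand sequence ending at the nick (the 3' free single strand), `edited_downstream` is the desired PAM-strand sequence starting right after the nick, and the scaffold passed in is a hypothetical placeholder rather than SEQ ID NO: 8.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_pegrna(spacer: str, scaffold: str, upstream: str,
                 edited_downstream: str, pbs_len: int = 12, rt_len: int = 13) -> dict:
    """Assemble a pegRNA-encoding sequence: spacer + scaffold + RT template + PBS.

    - PBS pairs with the last `pbs_len` bases of the 3' free single strand.
    - RT template is complementary to the first `rt_len` bases of the desired
      (edited) sequence downstream of the nick, so reverse transcription
      writes that edited sequence onto the primer.
    """
    if pbs_len > len(upstream) or rt_len > len(edited_downstream):
        raise ValueError("requested PBS or RT template is longer than the input")
    pbs = reverse_complement(upstream[-pbs_len:])
    rt_template = reverse_complement(edited_downstream[:rt_len])
    return {
        "rt_template": rt_template,
        "pbs": pbs,
        "pegrna": spacer.upper() + scaffold.upper() + rt_template + pbs,
    }
```

Because the RT template sits 5' of the PBS in the pegRNA, the combined 3' extension is simply the reverse complement of the primer region followed by the edited downstream sequence.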
- the plant genome editing system further includes a nicking gRNA (a gRNA for generating an additional nick) and/or an expression construct containing a nucleotide sequence encoding the nicking gRNA, and the nicking gRNA includes a guide sequence and a scaffold sequence.
- the nicking gRNA does not include a reverse transcription (RT) template sequence or a primer binding site (PBS) sequence.
- the guide sequence (also called seed sequence or spacer sequence) in the nicking gRNA of the present invention is set to have sufficient sequence identity (preferably 100% identity) with the nicking target sequence in the genome, so that the fusion protein of the present invention can be targeted to the nicking target sequence, resulting in a nick in the nicking target sequence; the nicking target sequence and the target sequence targeted by the pegRNA (pegRNA target sequence) are located on opposite strands of the genomic DNA.
- the nick formed by the nicking gRNA and the nick formed by the pegRNA are about 1 to about 300 or more nucleotides apart, such as 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, about 100, about 125, about 150, about 175, about 200, about 225, about 250, about 275, about 300 or more nucleotides.
- the nick formed by the nicking gRNA is located upstream or downstream of the nick formed by the pegRNA (upstream or downstream being defined relative to the DNA strand where the pegRNA target sequence is located).
- the guide sequence in the nicking gRNA has sufficient sequence identity (preferably 100% identity) with the opposite strand of the pegRNA target sequence as modified by the editing event, so that the nicking gRNA only targets the nicking target sequence that is generated after the pegRNA-induced targeted modification of the target sequence.
- the PAM of the nick target sequence is located within the complement of the pegRNA target sequence.
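The idea that a nicking gRNA should recognize only the edited allele can be checked with a simple string comparison. The sketch below is a simplification: it tests exact occurrence of the guide on either strand of a short locus and ignores PAM requirements and mismatch tolerance. The function names are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def occurs(guide: str, locus: str) -> bool:
    """True if `guide` matches `locus` exactly on either strand."""
    guide, locus = guide.upper(), locus.upper()
    return guide in locus or guide in reverse_complement(locus)


def is_edit_specific(nick_guide: str, original: str, edited: str) -> bool:
    """True if the nicking guide matches the edited locus but not the original one."""
    return occurs(nick_guide, edited) and not occurs(nick_guide, original)
```

A guide that spans the edited position on the opposite strand passes this check, while a guide lying entirely in unchanged sequence does not.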
- the sequence of the pegRNA and/or nicking gRNA can be precisely processed using a self-processing system.
- the 5' end of the pegRNA and/or nicking gRNA is connected to the 3' end of a first ribozyme, which is designed to cleave the fusion at the 5' end of the pegRNA and/or nicking gRNA; and/or the 3' end of the pegRNA and/or nicking gRNA is connected to the 5' end of a second ribozyme, which is designed to cleave the fusion at the 3' end of the pegRNA and/or nicking gRNA.
- the design of the first or second ribozyme is within the abilities of those skilled in the art. For example, see Gao et al., JIPB, Apr, 2014; Vol 56, Issue 4, 343-349.
- for a method of precisely processing gRNA, refer to, for example, WO 2018/149418.
- the genome editing system comprises at least one pair of pegRNA and/or an expression construct containing a nucleotide sequence encoding the at least one pair of pegRNA.
- the two pegRNAs in the pegRNA pair are configured to target different target sequences on the same strand of genomic DNA.
- the two pegRNAs in the pegRNA pair are configured to target target sequences on different strands of genomic DNA.
- the PAM of the target sequence of one pegRNA in the pegRNA pair is located on the sense strand, and the PAM of the other pegRNA is located on the antisense strand.
- the nicks induced by the two pegRNAs are located on both sides of the site to be modified.
- the pegRNA-induced nick for the sense strand is located upstream (5' direction) of the site to be modified, and the pegRNA-induced nick for the antisense strand is located downstream (3' direction) of the site to be modified. The upstream or downstream is relative to the sense strand.
- the induced nicks of the two pegRNAs are about 1 to about 300 or more nucleotides apart, for example, 1-15 nucleotides apart.
- the two pegRNAs in the pegRNA pair are configured to introduce the same desired modification.
- one type of pegRNA is configured to introduce A to G substitutions in the sense strand
- the other type of pegRNA is configured to introduce T to C substitutions at corresponding positions on the antisense strand.
- one pegRNA is set to introduce a two-nucleotide deletion in the sense strand
- the other pegRNA is set to also introduce a two-nucleotide deletion in the corresponding position of the antisense strand.
- Other types of modification can be deduced by analogy.
- the pegRNA targeting two different strands can achieve the same desired modification by designing an appropriate RT template sequence.
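The correspondence between an edit on the sense strand and the matching edit on the antisense strand follows from reverse complementation. The following sketch maps a sense-strand change to its antisense-strand description; the function name and the coordinate convention (0-based indices, each strand read 5' to 3') are assumptions for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mirror_edit(seq_len: int, pos: int, ref: str, alt: str):
    """Describe a sense-strand edit on the antisense strand.

    `pos` is the 0-based start of `ref` on the sense strand. The result is
    (antisense_pos, antisense_ref, antisense_alt), with antisense_pos counted
    0-based from the 5' end of the antisense strand.
    """
    antisense_pos = seq_len - pos - len(ref)
    return antisense_pos, reverse_complement(ref), reverse_complement(alt)
```

An A to G substitution on the sense strand maps to a T to C substitution at the corresponding antisense position, and a two-nucleotide deletion maps to a two-nucleotide deletion, in line with the examples above.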
- the nucleotide sequence encoding the fusion protein is codon-optimized for the plant species whose genome is to be modified.
- Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the natural sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is used more frequently or most frequently in the genes of the host cell, while maintaining the natural amino acid sequence.
- Different species display a particular preference for certain codons of a specific amino acid. Codon preference (the difference in codon usage between organisms) is often related to the translation efficiency of messenger RNA (mRNA), and the translation efficiency is considered to depend on the nature of the codon being translated and the availability of specific transfer RNA (tRNA) molecules.
- mRNA messenger RNA
- tRNA transfer RNA
- Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
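Codon optimization as described above amounts to swapping codons for synonymous ones while leaving the protein unchanged. The sketch below uses the standard genetic code and a caller-supplied map of preferred codons; the preferred codons in the usage example are hypothetical placeholders, not values from any real codon usage table.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon for its amino acid.

    `preferred` maps a one-letter amino acid to a codon; amino acids that are
    not listed keep their original codon.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)
```

A call such as `codon_optimize("ATGTTAAAATAA", {"L": "CTG", "K": "AAG"})` changes the leucine and lysine codons only, and `translate` can be used to confirm that the encoded amino acid sequence is unchanged.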
- the fusion protein of the present invention is encoded by the nucleotide sequence shown in any one of SEQ ID NO: 9-11 or comprises the amino acid sequence shown in any one of SEQ ID NO: 12-14.
- Plants that can be genome modified by the genome editing system of the present invention include monocotyledonous plants and dicotyledonous plants.
- the plants are crop plants, including but not limited to wheat, rice, corn, soybean, sunflower, sorghum, rape, and alfalfa.
- the present invention provides a method for determining the PBS sequence in pegRNA of the genome editing system of the present invention, the method comprising:
- a PBS sequence with a Tm not exceeding 52°C for example, a Tm of about 18°C-52°C, preferably about 24°C-36°C, more preferably about 28°C-32°C, and more preferably about 30°C.
- the present invention provides a method for producing a genetically modified plant, comprising introducing the genome editing system of the present invention into at least one of the plants, thereby causing a modification in the genome of the at least one plant.
- the modification includes substitution, deletion and/or addition of one or more nucleotides.
- the modification includes one or more substitutions selected from: C to T substitution, C to G substitution, C to A substitution, G to T substitution, G to C substitution, G to A substitution, A to T substitution, A to G substitution, A to C substitution, T to C substitution, T to G substitution, T to A substitution; and/or includes the deletion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide deletions; and/or includes the insertion of one or more nucleotides, such as 1 to about 100 or more, for example 1, 2, 3, 4, 5, about 10, about 20, about 30, about 40, about 50, about 75, or about 100 nucleotide insertions.
- the method further includes screening for plants having the desired modification from the at least one plant.
- the genome editing system can be introduced into plants by various methods well known to those skilled in the art.
- Methods that can be used to introduce the genome editing system of the present invention into plants include, but are not limited to: gene gun bombardment, PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method, and the ovary injection method.
- the genome editing system is introduced into the plant by transient transformation.
- the genome modification can be achieved simply by introducing the fusion protein and gRNA into plant cells or producing them there, and the modification can be stably inherited, without the need to stably transform the plant with exogenous polynucleotides encoding the components of the genome editing system. This avoids the potential off-target effects of a stably present (continuously produced) genome editing system and also avoids the integration of foreign nucleotide sequences into the plant genome, thereby providing higher biosafety.
- the introduction is performed in the absence of selective pressure, so as to avoid the integration of foreign nucleotide sequences in the plant genome.
- the introduction includes transforming the genome editing system of the present invention into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant.
- the regeneration is performed in the absence of selective pressure, that is, no selective agent for the selective gene carried on the expression vector is used during the tissue culture process.
- omitting the selection agent can improve the regeneration efficiency of plants and yield modified plants that do not contain exogenous nucleotide sequences.
- the genome editing system of the present invention can be transformed into specific parts of an intact plant, such as leaves, stem tips, pollen tubes, young ears or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to regenerate from tissue culture.
- the protein expressed in vitro and/or the RNA molecule transcribed in vitro (for example, the expression construct is an RNA molecule transcribed in vitro) is directly transformed into the plant.
- the protein and/or RNA molecule can realize genome editing in plant cells and then be degraded by the cell, avoiding the integration of foreign nucleotide sequences in the plant genome.
- genetic modification and breeding of plants using the method of the present invention can obtain plants whose genomes have no exogenous polynucleotide integration, that is, transgene-free modified plants.
- the method further includes culturing the plant cell, tissue or whole plant into which the genome editing system has been introduced at an elevated temperature, for example, 37°C.
- the modified genomic region is related to plant traits such as agronomic traits, whereby the modification causes the plant to have an altered (preferably improved) trait, for example an agronomic trait, relative to a wild-type plant.
- the method further includes the step of screening for plants with desired modifications and/or desired traits such as agronomic traits.
- the method further includes obtaining progeny of the genetically modified plant.
- the genetically modified plant or its progeny have desired modifications and/or desired traits such as agronomic traits.
- the present invention also provides a genetically modified plant or its progeny or part thereof, wherein the plant is obtained by the above-mentioned method of the present invention.
- the genetically modified plant or progeny or part thereof is non-transgenic.
- the genetically modified plant or its progeny have desired genetic modification and/or desired traits such as agronomic traits.
- the present invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the above-mentioned method of the present invention with a second plant that does not contain the modification, thereby introducing the modification into the second plant.
- the genetically modified first plant has desired traits such as agronomic traits.
- the nCas9(H840A)-M-MLV construct, the nCas9(H840A)-CaMV construct and the nCas9(H840A)-retron construct were constructed by Suzhou Jinweizhi Company.
- compared with the wild-type M-MLV reverse transcriptase, the M-MLV used in this example has 5 amino acid mutations.
- M-MLV, RT-CaMV and RT-retron are all codon-optimized for monocotyledons.
- the pegRNA fragments were cloned into a vector driven by the OsU3 promoter using the Gibson method to obtain an OsU3-pegRNA construct suitable for rice.
- the pegRNA fragments were cloned into a vector driven by the TaU6 promoter using the Gibson method to obtain a TaU6-pegRNA construct suitable for wheat.
- the Gibson method was used to clone pegRNA fragments (including the RT and PBS sequences) with ribozymes at both the 5' and 3' ends into a vector driven by the maize Ubiquitin-1 (Ubi-1) promoter to obtain the Ubi-pegRNA-R construct.
- the nicking gRNA was cloned with T4 ligase into a vector driven by the TaU3 promoter to obtain the TaU3-nick vector.
- the PAM sequence is shown in bold.
- the protoplasts used in the present invention are from rice Zhonghua 11 variety and Kenong 199 wheat variety.
- the seeds are rinsed with 75% ethanol for 1 minute, treated with 4% sodium hypochlorite for 30 minutes, and washed with sterile water more than 5 times. They are then cultured on M6 medium for 3-4 weeks at 26°C, protected from light.
- Wheat seeds are sown in pots in a culture room and grown for about 1-2 weeks (about 10 days) at a temperature of 25±2°C, a light intensity of 1000 lx, and a photoperiod of 14-16 h/d.
- the FACSAria III (BD Biosciences) instrument is used for flow cytometry analysis of protoplasts. The specific steps are as follows:
- the 20 µL amplification system contains 4 µL 5× FastPfu buffer, 1.6 µL dNTPs (2.5 mM), 0.4 µL forward primer (10 µM), 0.4 µL reverse primer (10 µM), 0.4 µL FastPfu polymerase (2.5 U/µL), and 2 µL DNA template (approximately 60 ng).
- Amplification conditions: pre-denaturation at 95°C for 5 min; 35 cycles of denaturation at 95°C for 30 s, annealing at 50-64°C for 30 s and extension at 72°C for 30 s; final extension at 72°C for 5 min; storage at 12°C.
- the above amplification product is diluted 10-fold, and 1 µL is used as the template for the second round of PCR amplification, in which the amplification primers are sequencing primers containing a barcode.
- the 50 µL amplification system contains 10 µL 5× FastPfu buffer, 4 µL dNTPs (2.5 mM), 1 µL forward primer (10 µM), 1 µL reverse primer (10 µM), 1 µL FastPfu polymerase (2.5 U/µL), and 1 µL DNA template.
- the amplification conditions are as above, with 35 amplification cycles.
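The two reaction mixes above are related by a simple scaling, which a short sketch can make explicit. It assumes the remainder of each reaction is made up with water, which the text does not state, and the dictionary and function names are hypothetical.

```python
# Component volumes (in microlitres) of the 20 uL first-round reaction,
# as listed in the text.
FIRST_ROUND_UL = {
    "5x FastPfu buffer": 4.0,
    "dNTPs (2.5 mM)": 1.6,
    "forward primer (10 uM)": 0.4,
    "reverse primer (10 uM)": 0.4,
    "FastPfu polymerase (2.5 U/uL)": 0.4,
    "DNA template": 2.0,
}


def water_volume(components: dict, total_ul: float) -> float:
    """Volume left for water (assumed diluent) after all listed components."""
    used = sum(components.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return round(total_ul - used, 2)


def scale(components: dict, factor: float) -> dict:
    """Scale every component volume by the same factor."""
    return {name: round(vol * factor, 2) for name, vol in components.items()}


if __name__ == "__main__":
    print("water in 20 uL reaction:", water_volume(FIRST_ROUND_UL, 20.0))
    second_round = scale(FIRST_ROUND_UL, 50.0 / 20.0)
    # The text uses 1 uL of 10-fold diluted first-round product as template.
    second_round["DNA template"] = 1.0
    print(second_round)
    print("water in 50 uL reaction:", water_volume(second_round, 50.0))
```

Scaling the 20 µL recipe by 2.5 reproduces the reagent volumes of the 50 µL recipe; only the template volume differs between the two rounds.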
- PCR products were separated by 2% agarose gel electrophoresis, the target fragments were recovered by gel extraction with the AxyPrep DNA Gel Extraction kit, and the recovered products were quantified with a NanoDrop ultra-micro spectrophotometer; 100 ng of each recovered product was pooled and sent to Shenggong Bioengineering Co., Ltd. for amplicon sequencing library construction and amplicon sequencing analysis.
- the raw data are split according to the sequencing primers, and the WT is used as a control to compare and analyze the editing types and editing efficiencies of the products at the different gene target sites across the three replicate experiments.
- the Cas9 (H840A) nickase-reverse transcriptase fusions (PPEs, plant prime editors) can be used to precisely modify the target sequence (Figures 1-2).
- the nCas9(H840A)-M-MLV construct (PPE-M-MLV), the nCas9(H840A)-CaMV construct (PPE-CaMV), the nCas9(H840A)-retron construct (PPE-retron), the OsU3/TaU6 promoter-driven pegRNA constructs carrying the target guide RNA and the RT and PBS sequences, and the TaU3 promoter-driven nicking gRNA construct, which can produce a nick on the non-target strand, were constructed (Figure 3).
- 10 endogenous sites in rice were targeted: OsCDC48-T1, OsCDC48-T2, OsCDC48-T3, OsALS-T1, OsALS-T2, OsDEP1, OsEPSPS-T1, OsEPSPS-T2, OsLDAMR and OsGAPDH
- as well as 7 endogenous sites in wheat: TaUbi10-T1, TaUbi10-T2, TaGW2, TaGASR7, TaLOX2, TaMLO and TaDME
- the performance of the PPE system with ribozyme-processed pegRNA was also tested.
- a Ubi-1-driven, ribozyme-processed pegRNA construct carrying the target guide RNA and the RT and PBS sequences was constructed to replace the OsU3-driven pegRNA construct in the original system, and the resulting system was named PPE-R, where R stands for ribozyme (Figure 11).
- the results at endogenous targets show that the ribozyme processing strategy can also achieve precise changes to endogenous sequences.
- PPE-R is improved compared with PPE at some sites, with an efficiency of up to 9.7% (Figure 12). This result indicates that pegRNA processed by ribozymes and pegRNA expressed from type II promoters are both suitable for the PPE system.
- the protoplasts were cultured at 37°C to test whether this could improve the editing efficiency.
- Two rice endogenous sites (OsCDC48-T2 and OsALS-T2) were selected for testing. After the transformed protoplasts were cultured overnight at 26°C, they were incubated at 37°C for 8 hours and then returned to 26°C to continue the culture. The editing efficiency of the treated group was compared with that of the untreated group. The results show that treatment at 37°C can significantly improve the editing efficiency of PPE systems (including PPE2, PPE3 and PPE3b), with an average increase of 1.6-fold (from 3.9% to 6.3%) and a maximum increase of 2.9-fold (Figure 13).
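The reported average increase can be checked with a one-line calculation. The sketch below only reproduces the arithmetic behind the two averages quoted in the text (3.9% and 6.3%); the function name is hypothetical.

```python
def fold_change(before_pct: float, after_pct: float) -> float:
    """Ratio of editing efficiency after treatment to efficiency before."""
    if before_pct <= 0:
        raise ValueError("baseline efficiency must be positive")
    return after_pct / before_pct


if __name__ == "__main__":
    # Average efficiencies without and with the 37 degrees C treatment.
    print(round(fold_change(3.9, 6.3), 1))  # 1.6
```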
- Example 4 Testing the influence of different PBS and RT template lengths and nicking gRNA positions on the PPE system
- the effects of different PBS and RT template lengths and nicking gRNA positions on the PPE system were tested.
- the results using OsCDC48-T1 as the test site showed that the different PBS lengths (6-16 nt) and RT template lengths (7-23 nt) tested can produce targeted sequence modification at the specific site (Figures 14-15).
- the efficiency at the OsCDC48-T1 site is 3.4% to 15.3%, and the efficiency at the OsCDC48-T2 site is 0.9% to 8.1%.
- the efficiency at the OsALS-T2 site is 1.1% to 10.5%.
- Example 5 The PPE system achieves multiple types of precise modification at endogenous sites
- the maximum efficiency of targeted base insertion can reach 3.0%, with the longest insertion reaching 15 nt (Figure 19); the maximum efficiency of targeted base deletion can reach 19.2%, with the longest deletion reaching 40 nt (Figure 20). The PPE system can therefore efficiently insert and delete small fragments and achieve multiple types of targeted modification at endogenous sites.
- Example 6 Obtaining targeted-edited plants with the PPE system
- Example 7 The Tm of the PBS affects the efficiency of the PPE system
- Example 8 Using a dual-pegRNA strategy to significantly improve the efficiency of the PPE system
- NGG-pegRNA and CCN-pegRNA are two different pegRNAs (Figure 26).
- Fifteen target sites were selected from ten rice genes, and a pair of pegRNAs was designed for each target (Table 3). Then, the editing activities of NGG-pegRNA alone, CCN-pegRNA alone and dual-pegRNA were compared at the same position.
- the PAM for each target site is shown in bold, and PBS is underlined.
- the dual-pegRNA strategy has the highest activity at most target sites (13 out of 15). The pegRNAs generated C-to-A, G-to-A, G-to-T, A-to-G, T-to-A, C-to-G and CT-to-AG point mutations, 1 bp (T) or 2 bp (AT) deletions, and a 1 bp (A) insertion, with a maximum editing efficiency of 24.5% (Figure 27).
- the editing efficiency of dual-pegRNA across all tested sites is about 4.2 times higher than that of NGG-pegRNA alone (up to 27.9 times, at OsNRT1.1B (A insertion)), and on average 1.8 times higher than that of CCN-pegRNA alone (up to 7.2 times, at OsALS (A-to-G)). In addition, the proportion of by-products with dual-pegRNA is not higher than with a single pegRNA (Figure 28).
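The geometry of an NGG-pegRNA/CCN-pegRNA pair can be sketched as a search over the sense strand. This is a minimal illustration under stated assumptions rather than the design procedure used in the examples: it assumes an SpCas9-type NGG PAM with the nick 3 nt 5' of the PAM on the PAM-containing strand, a 20 nt spacer, and 0-based coordinates on the sense strand; the function name and the toy sequence are hypothetical.

```python
import re


def find_pegrna_pairs(sense: str, edit_pos: int, spacer_len: int = 20):
    """List candidate (NGG, CCN) PAM pairs flanking edit_pos.

    Each result is (ngg_index, ngg_nick, ccn_index, ccn_nick), where a nick
    value k means the nick lies between sense positions k-1 and k.
    """
    sense = sense.upper()

    # NGG on the sense strand: spacer lies 5' of the PAM, nick is upstream.
    ngg_sites = []
    for m in re.finditer(r"(?=[ACGT]GG)", sense):
        p = m.start()
        nick = p - 3
        if p >= spacer_len and nick <= edit_pos:
            ngg_sites.append((p, nick))

    # CCN on the sense strand is an NGG PAM on the antisense strand:
    # spacer lies 3' of the CCN in sense coordinates, nick is downstream.
    ccn_sites = []
    for m in re.finditer(r"(?=CC[ACGT])", sense):
        q = m.start()
        nick = q + 6
        if q + 3 + spacer_len <= len(sense) and nick > edit_pos:
            ccn_sites.append((q, nick))

    return [(p, n1, q, n2) for p, n1 in ngg_sites for q, n2 in ccn_sites]


if __name__ == "__main__":
    toy = "AT" * 10 + "TGG" + "ATA" + "CCA" + "TA" * 10
    print(find_pegrna_pairs(toy, edit_pos=24))
```

In the toy sequence the two nicks fall at positions 17 and 32, on either side of the edit site; a real design would additionally choose RT template and PBS sequences for each pegRNA.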
Claims (19)
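- A plant genome editing system for targeted modification of a plant genome, comprising: i) a fusion protein and/or an expression construct containing a nucleotide sequence encoding the fusion protein, wherein the fusion protein comprises a CRISPR nickase and a reverse transcriptase; and/or ii) at least one pegRNA and/or an expression construct containing a nucleotide sequence encoding the at least one pegRNA, wherein the at least one pegRNA comprises, in the 5' to 3' direction, a guide sequence, a scaffold sequence, a reverse transcription (RT) template sequence and a primer binding site (PBS) sequence, wherein the at least one gRNA is capable of forming a complex with the fusion protein and targeting the fusion protein to a target sequence in the genome, resulting in a nick within the target sequence.
- The system of claim 1, wherein the CRISPR nickase is a Cas9 nickase, for example comprising the amino acid sequence shown in SEQ ID NO: 2.
- The system of claim 1 or 2, wherein the reverse transcriptase is an M-MLV reverse transcriptase, preferably an enhanced M-MLV reverse transcriptase having the amino acid sequence shown in SEQ ID NO: 4, or the reverse transcriptase is the CaMV reverse transcriptase shown in SEQ ID NO: 5 or the retron reverse transcriptase shown in SEQ ID NO: 6.
- The system of any one of claims 1-3, wherein the guide sequence in the pegRNA is configured to have sufficient sequence identity with the target sequence so that it can bind to the complementary strand of the target sequence by base pairing, achieving sequence-specific targeting.
- The system of any one of claims 1-4, wherein the scaffold sequence of the pegRNA comprises the sequence set forth in SEQ ID NO: 8.
- The system of any one of claims 1-5, wherein the primer binding sequence is configured to be complementary to at least a portion of the target sequence; preferably, the primer binding sequence is complementary to at least a portion of the 3' free single strand resulting from the nick, in particular to the nucleotide sequence at the 3' end of the 3' free single strand.
- The system of any one of claims 1-6, wherein the Tm (melting temperature) of the primer binding sequence is about 18°C-52°C, preferably about 24°C-36°C, more preferably about 28°C-32°C, more preferably about 30°C.
- The system of any one of claims 1-7, wherein the RT template sequence is configured to correspond to the sequence downstream of the nick and to contain the desired modification, the modification comprising substitution, deletion and/or addition of one or more nucleotides.
- The system of any one of claims 1-8, further comprising a nicking gRNA and/or an expression construct containing a nucleotide sequence encoding the nicking gRNA, wherein the nicking gRNA comprises a guide sequence and a scaffold sequence, the guide sequence being configured to have sufficient sequence identity with a target sequence in the genome so as to target the fusion protein to that target sequence and result in a nick within it, wherein the target sequence of the nicking gRNA and the target sequence of the pegRNA are located on opposite strands of the genomic DNA, and the nick induced by the nicking gRNA and the nick induced by the pegRNA are about 1 to about 300 nucleotides apart.
- The system of any one of claims 1-8, comprising at least one pair of pegRNAs and/or an expression construct containing a nucleotide sequence encoding the at least one pair of pegRNAs.
- The system of claim 10, wherein the two pegRNAs of the pegRNA pair are configured to target different target sequences on the same strand of the genomic DNA, or the two pegRNAs of the pegRNA pair are configured to target target sequences on different strands of the genomic DNA.
- The system of claim 10 or 11, wherein the PAM of the target sequence of one pegRNA of the pegRNA pair is located on the sense strand, and the PAM of the other pegRNA is located on the antisense strand.
- The system of any one of claims 10-12, wherein the nicks induced by the two pegRNAs are located on either side of the site to be modified.
- The system of claim 13, wherein the nick induced by the pegRNA directed against the sense strand is located upstream (5' direction) of the site to be modified, and the nick induced by the pegRNA directed against the antisense strand is located downstream (3' direction) of the site to be modified.
- The system of claim 14, wherein the nicks induced by the two pegRNAs are about 1 to about 300 or more nucleotides apart, for example 1-15 nucleotides apart.
- The system of any one of claims 10-15, wherein the two pegRNAs of the pegRNA pair are configured to introduce the same desired modification.
- A method for producing a genetically modified plant, comprising introducing the genome editing system of any one of claims 1-16 into at least one said plant, thereby resulting in a modification in the genome of the at least one plant, for example the modification comprising substitution, deletion and/or addition of one or more nucleotides.
- The method of claim 17, wherein the introducing comprises transforming the genome editing system of any one of claims 1-16 into an isolated plant cell or tissue and then regenerating the transformed plant cell or tissue into an intact plant; or the introducing comprises transforming the genome editing system of any one of claims 1-16 into a specific part of an intact plant, such as a leaf, stem tip, pollen tube, young ear or hypocotyl.
- The method of claim 18, further comprising culturing the plant cell, tissue or intact plant into which the genome editing system has been introduced at an elevated temperature, for example the elevated temperature is 37°C.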
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
BR112022008468A BR112022008468A2 (pt) | 2019-11-01 | 2020-09-25 | Method for targeted modification of a plant genome sequence |
EP20882981.2A EP4053284A4 (en) | 2019-11-01 | 2020-09-25 | METHOD FOR TARGETED MODIFICATION OF PLANT GENOME SEQUENCE |
US17/773,426 US20230075587A1 (en) | 2019-11-01 | 2020-09-25 | Method for targeted modification of sequence of plant genome |
CN202080077133.2A CN114945671A (zh) | 2019-11-01 | 2020-09-25 | Method for targeted modification of plant genome sequence |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911062005.6 | 2019-11-01 | ||
CN201911062005 | 2019-11-01 | ||
CN202010036374.4 | 2020-01-14 | ||
CN202010036374 | 2020-01-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021082830A1 (zh) | 2021-05-06 |
Family
ID=75715770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/117736 WO2021082830A1 (zh) | Method for targeted modification of plant genome sequence | 2019-11-01 | 2020-09-25 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230075587A1 (zh) |
EP (1) | EP4053284A4 (zh) |
CN (1) | CN114945671A (zh) |
BR (1) | BR112022008468A2 (zh) |
WO (1) | WO2021082830A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022242660A1 (en) * | 2021-05-17 | 2022-11-24 | Wuhan University | System and methods for insertion and editing of large nucleic acid fragments |
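WO2023030534A1 (zh) * | 2021-09-06 | 2023-03-09 | 苏州齐禾生科生物科技有限公司 | Improved prime editing system |
WO2023227050A1 (zh) * | 2022-05-25 | 2023-11-30 | Institute Of Genetics And Developmental Biology, Chinese Academy Of Sciences | Method for site-directed insertion of an exogenous sequence into a genome |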
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11866728B2 (en) | 2022-01-21 | 2024-01-09 | Renagade Therapeutics Management Inc. | Engineered retrons and methods of use |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018149418A1 (en) | 2017-02-20 | 2018-08-23 | Institute Of Genetics And Developmental Biology, Chinese Academy Of Sciences | Genome editing system and method |
CN111378051A (zh) * | 2020-03-25 | 2020-07-07 | Beijing Academy of Agriculture and Forestry Sciences | Pe-p2 prime editing system and its application in genome base editing |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014186686A2 (en) * | 2013-05-17 | 2014-11-20 | Two Blades Foundation | Targeted mutagenesis and genome engineering in plants using rna-guided cas nucleases |
WO2016080795A1 (ko) * | 2014-11-19 | 2016-05-26 | 기초과학연구원 | 두 개의 벡터로부터 발현된 cas9 단백질을 이용한 유전자 발현 조절 방법 |
EP3942040A1 (en) * | 2019-03-19 | 2022-01-26 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
WO2021072328A1 (en) * | 2019-10-10 | 2021-04-15 | The Broad Institute, Inc. | Methods and compositions for prime editing rna |
CN115279898A (zh) * | 2019-10-23 | 2022-11-01 | 成对植物服务股份有限公司 | 用于植物中rna模板化编辑的组合物和方法 |
-
2020
- 2020-09-25 EP EP20882981.2A patent/EP4053284A4/en active Pending
- 2020-09-25 CN CN202080077133.2A patent/CN114945671A/zh active Pending
- 2020-09-25 WO PCT/CN2020/117736 patent/WO2021082830A1/zh unknown
- 2020-09-25 BR BR112022008468A patent/BR112022008468A2/pt unknown
- 2020-09-25 US US17/773,426 patent/US20230075587A1/en active Pending
Non-Patent Citations (6)
Title |
---|
ANZALONE, AV ET AL.: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, vol. 576, no. 7785, 21 October 2019 (2019-10-21), XP036953141, ISSN: 0028-0836, DOI: 10.1038/s41586-019-1711-4 * |
GAO ET AL., JIPB, vol. 56, April 2014 (2014-04-01), pages 343 - 349 |
LIN, QP ET AL.: "Prime genome editing in rice and wheat", NATURE BIOTECHNOLOGY, vol. 38, no. 5, 16 March 2020 (2020-03-16), XP037113496, ISSN: 1087-0156, DOI: 10.1038/s41587-020-0455-x * |
NAKAMURA, Y ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292 |
SAMBROOK, J.FRITSCH, E.F.MANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS |
See also references of EP4053284A4 |
Also Published As
Publication number | Publication date |
---|---|
EP4053284A1 (en) | 2022-09-07 |
CN114945671A (zh) | 2022-08-26 |
BR112022008468A2 (pt) | 2022-07-19 |
US20230075587A1 (en) | 2023-03-09 |
EP4053284A4 (en) | 2024-03-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20882981 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112022008468 Country of ref document: BR |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2020882981 Country of ref document: EP Effective date: 20220601 |
|
ENP | Entry into the national phase |
Ref document number: 112022008468 Country of ref document: BR Kind code of ref document: A2 Effective date: 20220502 |